insertion_scan
Insert sequences from insertion_pool at every position along the background
sequence (or within a named region). Unlike replacement_scan(),
no background bases are removed, so output sequences are longer than the input.
Set replace=True to overwrite a window of ins_length bases (the width of
the insert) at each position instead of inserting without deletion. The
output length then equals the background length, which makes the call
equivalent to replacement_scan().
import poolparty as pp
pp.init()
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| | | (required) | The background Pool to scan. Can also be a plain sequence string. |
| insertion_pool | | (required) | Pool or sequence string whose content is inserted at each scanned position. An L-mer has L + 1 valid insertion sites (before each base and after the last). |
| positions | | | Explicit list of insertion positions. |
| region | | | Restrict insertions to a named region or |
| | | | When |
| style | | | Named display style applied to inserted bases. |
| | | | Prefix for auto-generated sequence names. |
| | | | |
| num_states | | | Number of output states. |
| | | | Enumeration order when combined with other pools. |
Note
Only the most commonly used parameters are shown above. For the full
parameter list, see insertion_scan() in the
API Reference.
Examples
Single-base insertions at every position
An 8-mer has 9 insertion sites. 9 sites × 4 bases = 36 sequences, each of length 9.
wt = pp.from_seq("ACGTACGT")
bases = pp.from_seqs(["A", "C", "G", "T"], mode="sequential")
scan = wt.insertion_scan(insertion_pool=bases, mode="sequential", style="red")
scan.print_library()
AACGTACGT
ACAGTACGT
ACGATACGT
ACGTAACGT ... (36 total)
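The site-and-base arithmetic above can be checked without poolparty. The following is a minimal pure-Python sketch, not the library's implementation: single_base_insertions is a hypothetical helper, and the order in which it emits sequences is an assumption that may differ from what print_library() shows.

```python
def single_base_insertions(background, alphabet="ACGT"):
    """Yield every sequence made by inserting one base at one site."""
    # An L-mer has L + 1 insertion sites: before each base and after the last.
    for site in range(len(background) + 1):
        for base in alphabet:
            yield background[:site] + base + background[site:]


variants = list(single_base_insertions("ACGTACGT"))
print(len(variants))               # number of (site, base) combinations
print({len(v) for v in variants})  # set of output lengths
```

The count refers to (site, base) combinations. Some combinations yield the same string, for example inserting A immediately before or immediately after an existing A.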
All-dinucleotide insertions
Use from_iupac("NN") to enumerate all 16 dinucleotide inserts.
9 sites × 16 inserts = 144 sequences, each of length 10.
wt = pp.from_seq("ACGTACGT")
nn = pp.from_iupac("NN", mode="sequential")
scan = wt.insertion_scan(insertion_pool=nn, mode="sequential", style="red")
scan.print_library()
AAACGTACGT
ACAAGTACGT
ACGAATACGT
ACGTAAACGT ... (144 total)
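To see where the 16 inserts and 144 sequences come from, the expansion of an IUPAC pattern can be sketched with itertools.product. This is an illustration in plain Python rather than poolparty's own code; expand_iupac and the small IUPAC table are hypothetical names introduced here, and the table covers only the codes used on this page.

```python
from itertools import product

# Subset of the standard IUPAC nucleotide codes; extend as needed.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}


def expand_iupac(pattern):
    """Return every concrete sequence matched by an IUPAC pattern."""
    return ["".join(p) for p in product(*(IUPAC[ch] for ch in pattern))]


inserts = expand_iupac("NN")
background = "ACGTACGT"
variants = [
    background[:site] + ins + background[site:]
    for site in range(len(background) + 1)  # 9 sites for an 8-mer
    for ins in inserts
]
print(len(inserts), len(variants))
```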
Insert-and-replace mode (replace=True)
replace=True replaces a window equal in width to the insert (here 2
bases) at each position. For an 8-mer with a 2-base insert: 8 − 2 + 1 = 7
valid positions; output length stays 8. This is equivalent to calling
replacement_scan().
wt = pp.from_seq("ACGTACGT")
bases = pp.from_seqs(["AA", "CC", "GG", "TT"], mode="sequential")
scan = wt.insertion_scan(
    insertion_pool=bases, replace=True, mode="sequential", style="red"
)
scan.print_library()
AAATACGT
ACAAACGT
ACGAACGT
ACGTAAGT ... (28 total)
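The window arithmetic for replace=True can be reproduced in a few lines of plain Python. This is a sketch under the assumption that each insert overwrites exactly len(insert) background bases; replacement_windows is a hypothetical helper, not a poolparty function, and its output order is not meant to match print_library().

```python
def replacement_windows(background, inserts):
    """Yield sequences with one window overwritten by each insert."""
    for ins in inserts:
        width = len(ins)
        # A window of this width fits at len(background) - width + 1 starts.
        for start in range(len(background) - width + 1):
            yield background[:start] + ins + background[start + width:]


variants = list(replacement_windows("ACGTACGT", ["AA", "CC", "GG", "TT"]))
print(len(variants))               # 7 window positions x 4 inserts
print({len(v) for v in variants})  # output length equals background length
```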
Insertion scan within a named region
Restrict insertion sites to the cre region. The 8-base region has 9
valid insertion sites; flanks are never altered.
wt = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
bases = pp.from_seqs(["A", "C", "G", "T"], mode="sequential")
scan = wt.insertion_scan(
    insertion_pool=bases, region="cre", mode="sequential", style="red"
)
scan.print_library()
AAAA<cre>AATCGATCG</cre>TTTT
AAAA<cre>ATACGATCG</cre>TTTT
AAAA<cre>ATCAGATCG</cre>TTTT
AAAA<cre>ATCGAATCG</cre>TTTT ... (36 total)
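The effect of restricting the scan to a region can be sketched in plain Python by splitting the tagged string around the region contents. This is only an illustration of the idea, assuming the simple tag syntax shown above; insert_in_region is a hypothetical helper and says nothing about how poolparty tracks regions internally.

```python
import re


def insert_in_region(tagged, region, alphabet="ACGT"):
    """Yield variants with one base inserted inside <region>...</region>."""
    name = re.escape(region)
    match = re.search(rf"<{name}>(.*?)</{name}>", tagged)
    if match is None:
        raise ValueError(f"region {region!r} not found")
    inner = match.group(1)
    # Everything outside the region contents, tags included, is left as-is.
    left, right = tagged[:match.start(1)], tagged[match.end(1):]
    for site in range(len(inner) + 1):
        for base in alphabet:
            yield left + inner[:site] + base + inner[site:] + right


variants = list(insert_in_region("AAAA<cre>ATCGATCG</cre>TTTT", "cre"))
print(len(variants))
print(variants[0])
```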
Explicit position list
Limit the scan to specific insertion sites.
wt = pp.from_seq("ACGTACGT")
bases = pp.from_seqs(["A", "C", "G", "T"], mode="sequential")
scan = wt.insertion_scan(
    insertion_pool=bases, positions=[0, 4, 8], mode="sequential", style="red"
)
scan.print_library()
ACGTAACGT
ACGTACGTA
CACGTACGT
ACGTCACGT ... (12 total)
Random motif insertion (mode="random")
mode='random' draws insertion positions stochastically. Here a degenerate
6-base IUPAC motif (R = A|G, Y = C|T) is inserted at random
positions along a 12-mer.
wt = pp.from_seq("ACGTACGTACGT")
motif = pp.from_iupac("RRYYYY")
scan = wt.insertion_scan(
    insertion_pool=motif, mode="random", num_states=5, style="red"
)
scan.print_library()
ACGTACGTACGTGATCTT
ACGACCTTGTACGTACGT
ACGAACTTTTACGTACGT
ACGTACAATTCCGTACGT
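What a random motif insertion does can be approximated in plain Python. The sketch below assumes the motif instance and the insertion site are drawn independently and uniformly, which may not match poolparty's actual sampling; random_motif_insertion and the IUPAC table are hypothetical names introduced for illustration.

```python
import random

# Subset of the standard IUPAC nucleotide codes; extend as needed.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}


def random_motif_insertion(background, motif, rng):
    """Insert one random instance of an IUPAC motif at a random site."""
    instance = "".join(rng.choice(IUPAC[ch]) for ch in motif)
    site = rng.randrange(len(background) + 1)  # any of the L + 1 sites
    return background[:site] + instance + background[site:]


rng = random.Random(0)  # fixed seed only so the sketch is reproducible
samples = [
    random_motif_insertion("ACGTACGTACGT", "RRYYYY", rng) for _ in range(5)
]
for s in samples:
    print(s)
```

Each sample is 18 bases long: the 12-base background plus one 6-base motif instance.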
See insertion_scan().