insertion_multiscan
Insert sequences at multiple positions simultaneously, lengthening the output sequence by the total length of the inserted content. Insertion sites are guaranteed to be non-overlapping; they are drawn at random with mode="random" or stepped through with mode="sequential".
import poolparty as pp
pp.init()
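The operation itself can be pictured with a small pure-Python sketch. It does not call poolparty; the helper names insert_at_sites and random_multi_insert are hypothetical, and the sketch assumes that sites are distinct 0-based positions in the original sequence, with position len(seq) meaning "append at the end".

```python
import random


def insert_at_sites(seq, sites):
    """Insert strings at positions given in original-sequence coordinates.

    sites is a list of (position, insert) pairs with distinct positions.
    """
    pieces = []
    prev = 0
    for pos, ins in sorted(sites):
        pieces.append(seq[prev:pos])  # untouched stretch before this site
        pieces.append(ins)            # inserted content
        prev = pos
    pieces.append(seq[prev:])         # remainder after the last site
    return "".join(pieces)


def random_multi_insert(seq, inserts, rng):
    """Pick one distinct random site per insert and apply all of them."""
    # Assumption: any of the len(seq) + 1 gaps is a valid insertion site.
    positions = sorted(rng.sample(range(len(seq) + 1), len(inserts)))
    return insert_at_sites(seq, list(zip(positions, inserts)))


rng = random.Random(0)  # seeded only to make the sketch repeatable
result = random_multi_insert("ATCGATCGATCG", ["A", "G"], rng)
print(result, len(result))  # length is 12 + 2 = 14
```

Because every site is expressed in original coordinates, the output length is always the input length plus the summed insert lengths, whichever sites are drawn.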
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| | | (required) | Input pool or sequence string. |
| num_insertions | | (required) | Number of simultaneous non-overlapping insertion sites per draw. |
| insertion_pools | | (required) | Pool(s) supplying inserted content. A single pool is reused at every site; a list assigns one pool per site. |
| positions | | | Allowed position sets for each insertion site. |
| region | | | Named region or interval to restrict insertions to. |
| | | | Names for each insertion window. |
| | | | If |
| style | | | Display style for inserted content. |
| | | | |
| min_spacing | | | Minimum gap (in bases) between insertion sites. |
| max_spacing | | | Maximum gap (in bases) between insertion sites. |
| | | | Prefix for the operation node name in the pool graph. |
| | | | |
| num_states | | | Number of states. |
| | | | Iteration priority for downstream multi-pool iteration. |
| | | | Design card columns to include in library output. |
Note
Only the most commonly used parameters are shown above. For the full
parameter list, see insertion_multiscan() in the
API Reference.
Examples
Two simultaneous single-base insertions
Insert a single random base at each of two randomly chosen, non-overlapping positions.
With mode="random", each print_library() call draws one stochastic
outcome (num_states=1 per preview).
wt = pp.from_seq("ATCGATCGATCG")
insert = pp.from_iupac("N")  # any single base
scan = wt.insertion_multiscan(num_insertions=2,
                              insertion_pools=insert, mode="random",
                              style="red")
scan.print_library()
Two simultaneous 2-mer insertions
from_iupac("NN") supplies a pool covering all 16 dinucleotides; one of them
is inserted at each of the two chosen positions.
wt = pp.from_seq("ATCGATCGATCG")
insert = pp.from_iupac("NN")  # all 16 dinucleotides
scan = wt.insertion_multiscan(num_insertions=2,
                              insertion_pools=insert, mode="random",
                              style="red")
scan.print_library()
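To see where the figure of 16 comes from, the IUPAC code N stands for any of A, C, G, or T, so "NN" expands to 4 × 4 sequences. The following standalone sketch lists them with the standard library and does not use poolparty; the remark about two sites is an assumption about independent draws, not a documented guarantee.

```python
from itertools import product

BASES = "ACGT"  # the four bases covered by the IUPAC code N

# Every ordered pair of bases is one dinucleotide.
dinucleotides = ["".join(pair) for pair in product(BASES, repeat=2)]

print(len(dinucleotides))  # 16
print(dinucleotides[:4])   # ['AA', 'AC', 'AG', 'AT']

# Assumption: if the two sites draw their content independently,
# there are 16 * 16 possible content combinations per pair of sites.
content_combinations = len(dinucleotides) ** 2
print(content_combinations)  # 256
```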
Multiscan insertion within a named region
Restrict both insertion sites to within the cre region; flanking bases
are never modified.
wt = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
insert = pp.from_iupac("N")
scan = wt.insertion_multiscan(num_insertions=2,
                              insertion_pools=insert,
                              region="cre", mode="random",
                              style="red")
scan.print_library()
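The effect of a region restriction can be illustrated without poolparty. The sketch below parses the tagged string with a regular expression and only ever edits the region body; split_region is a hypothetical helper, and treating the region's two boundaries as valid insertion sites is an assumption rather than documented behaviour.

```python
import random
import re


def split_region(tagged, name):
    """Return (left flank, region body, right flank) for <name>...</name>."""
    tag = re.escape(name)
    match = re.match(rf"^(.*?)<{tag}>(.*?)</{tag}>(.*)$", tagged)
    if match is None:
        raise ValueError(f"region {name!r} not found")
    return match.groups()


left, body, right = split_region("AAAA<cre>ATCGATCG</cre>TTTT", "cre")

rng = random.Random(0)  # seeded only to make the sketch repeatable
# Assumption: sites may fall anywhere from the region start to its end.
first, second = sorted(rng.sample(range(len(body) + 1), 2))
base_1, base_2 = rng.choice("ACGT"), rng.choice("ACGT")

new_body = body[:first] + base_1 + body[first:second] + base_2 + body[second:]
result = left + new_body + right
print(result)  # flanks AAAA and TTTT are carried over unchanged
```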
Spacing constraints (min_spacing, max_spacing)
min_spacing and max_spacing set the smallest and largest allowed gap, in bases, between insertion sites.
Here two insertions of the 6-base motif GATTAC must be 4–8 bases apart on a 24-mer.
wt = pp.from_seq("ATCGATCGATCGATCGATCGATCG")
motif = pp.from_seq("GATTAC")
scan = wt.insertion_multiscan(num_insertions=2,
                              insertion_pools=motif,
                              min_spacing=4, max_spacing=8,
                              mode="sequential", style="red")
scan.print_library()
GATTACATCGAGATTACTCGATCGATCGATCGATCG
GATTACATCGATGATTACCGATCGATCGATCGATCG
GATTACATCGATCGATTACGATCGATCGATCGATCG
GATTACATCGATCGGATTACATCGATCGATCGATCG ... (95 total)
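One way to arrive at the total of 95 is a brute-force count. The sketch below is independent of poolparty and rests on two assumptions: a 24-mer offers 25 insertion sites (positions 0 through 24), and the gap between two sites is the difference of their positions in the original sequence.

```python
from itertools import combinations

seq_len = 24
min_spacing, max_spacing = 4, 8

# Assumption: sites are positions 0..seq_len in original coordinates,
# and the gap between two sites is simply second - first.
site_pairs = [
    (first, second)
    for first, second in combinations(range(seq_len + 1), 2)
    if min_spacing <= second - first <= max_spacing
]

print(len(site_pairs))  # 95
```

Equivalently, each gap d from 4 to 8 contributes 25 - d pairs, and 21 + 20 + 19 + 18 + 17 = 95.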
PPM-based insertion pool (from_motif)
Use from_motif() to supply a position-probability matrix (PPM)
as the inserted content. Each draw samples a 6-mer from the PPM,
so the inserted sequences vary according to the motif's per-position base probabilities.
import pandas as pd
# One row per motif position, one column per base; each row sums to 1.
ppm = pd.DataFrame({
    "A": [0.8, 0.1, 0.5, 0.1, 0.7, 0.1],
    "C": [0.1, 0.7, 0.2, 0.1, 0.1, 0.1],
    "G": [0.05, 0.1, 0.2, 0.1, 0.1, 0.7],
    "T": [0.05, 0.1, 0.1, 0.7, 0.1, 0.1],
})
wt = pp.from_seq("ATCGATCGATCGATCGATCGATCG")
motif = pp.from_motif(ppm)
scan = wt.insertion_multiscan(num_insertions=2,
                              insertion_pools=motif, mode="random",
                              num_states=5, style="red")
scan.print_library()
CCATATATCGATCGATCGATCGATCGATAGGCATCG
ATCGATCGATCGACCGCAGTCGATCGATCAAATAGG
CCATATATCGATCGATCGATCGAACATAGTCGATCG
ATCGATCGATACATAGCGATCGATCGACATACATCG
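Sampling from a PPM can be sketched in plain Python. This is an illustration of the general technique, not poolparty's implementation: each position is drawn independently with random.choices, weighted by that position's base probabilities. The function name sample_kmer is hypothetical.

```python
import math
import random

ppm = {
    "A": [0.8, 0.1, 0.5, 0.1, 0.7, 0.1],
    "C": [0.1, 0.7, 0.2, 0.1, 0.1, 0.1],
    "G": [0.05, 0.1, 0.2, 0.1, 0.1, 0.7],
    "T": [0.05, 0.1, 0.1, 0.7, 0.1, 0.1],
}
bases = list(ppm)
length = len(ppm["A"])

# Sanity check: the probabilities at every position sum to 1.
for i in range(length):
    assert math.isclose(sum(ppm[b][i] for b in bases), 1.0)

# The most probable base at each position gives the consensus motif.
consensus = "".join(max(bases, key=lambda b: ppm[b][i]) for i in range(length))
print(consensus)  # ACATAG


def sample_kmer(rng):
    """Draw one k-mer, sampling each position independently from the PPM."""
    return "".join(
        rng.choices(bases, weights=[ppm[b][i] for b in bases])[0]
        for i in range(length)
    )


rng = random.Random(0)  # seeded only to make the sketch repeatable
samples = [sample_kmer(rng) for _ in range(5)]
print(samples)
```

Most samples stay close to the consensus ACATAG, which also appears, exactly or with one mismatch, in the inserted segments of the library rows above.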
Explicit position sets (positions)
Specify the allowed insertion sites for each window, using a distinct pool for
each site. Positions are 0-based and refer to the original, pre-insertion sequence.
Below, the first insertion (GGG) can occur at position 0, 4,
or 8 and the second (AAA) at position 10 or 14.
wt = pp.from_seq("ATCGATCGATCGATCG")
pools = [pp.from_seq("GGG"), pp.from_seq("AAA")]
scan = wt.insertion_multiscan(num_insertions=2,
                              insertion_pools=pools,
                              positions=[[0, 4, 8], [10, 14]],
                              mode="sequential", style="red")
scan.print_library()
GGGATCGATCGATCGATAAACG
ATCGGGGATCGATAAACGATCG
ATCGGGGATCGATCGATAAACG
ATCGATCGGGGATAAACGATCG
ATCGATCGGGGATCGATAAACG
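The rows above can be reproduced by hand with a short pure-Python sketch that splices each insert at its position in the original sequence. It does not use poolparty, the helper splice is hypothetical, and the enumeration order is simply that of itertools.product, which is not confirmed to match the library's order. Crossing three first-site positions with two second-site positions gives six combinations, and the five rows shown above are among them.

```python
from itertools import product

wt = "ATCGATCGATCGATCG"
inserts = ["GGG", "AAA"]
positions = [[0, 4, 8], [10, 14]]


def splice(seq, sites):
    """Insert strings at positions given in original-sequence coordinates."""
    pieces, prev = [], 0
    for pos, ins in sorted(sites):
        pieces.append(seq[prev:pos])  # untouched stretch before this site
        pieces.append(ins)            # inserted content
        prev = pos
    pieces.append(seq[prev:])
    return "".join(pieces)


# One position per window, crossed over all windows.
library = [
    splice(wt, list(zip(combo, inserts)))
    for combo in product(*positions)
]

for row in library:
    print(row)
```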