insertion_scan

Insert sequences from insertion_pool at every position along the background sequence (or within a named region). By default, unlike replacement_scan(), no background bases are removed, so each output sequence is longer than the input by the insert length. With replace=True, a window of ins_length bases (the insert length) is overwritten at each position instead of inserting without deletion, so the output length equals the background length. In that mode the call is equivalent to replacement_scan().

import poolparty as pp
pp.init()
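The length and count arithmetic described above can be checked without the library. The following is a minimal sketch in plain Python. scan_shape is a hypothetical helper written for illustration, not a poolparty function, and it assumes that every insert has the same length and that no positions or region filter is applied.

```python
def scan_shape(bg_len, ins_len, n_inserts, replace=False):
    """Return (n_positions, out_len, n_states) for a full scan.

    Hypothetical helper for illustration only; not part of poolparty.
    Assumes all inserts share one length and no positions/region filter.
    """
    if replace:
        # A window of ins_len bases is overwritten, so it must fit inside.
        n_positions = bg_len - ins_len + 1
        out_len = bg_len
    else:
        # Insertion sites: before each base and after the last one.
        n_positions = bg_len + 1
        out_len = bg_len + ins_len
    return n_positions, out_len, n_positions * n_inserts


print(scan_shape(8, 1, 4))                # single-base insertions into an 8-mer
print(scan_shape(8, 2, 16))               # dinucleotide insertions into an 8-mer
print(scan_shape(8, 2, 4, replace=True))  # replace mode, 2-base inserts
```

The three calls correspond to the 8-mer cases in the Examples section: 36, 144, and 28 states respectively.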

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| pool | Pool \| str | (required) | The background Pool to scan. Can also be a plain sequence string. |
| insertion_pool | Pool | (required) | Pool whose sequences are inserted at each scanned position. An L-mer has L + 1 valid insertion sites (before each base and after the last). |
| positions | list[int] \| None | None | Explicit list of insertion positions. None = all valid positions. |
| region | str \| None | None | Name of a tagged region to restrict insertions to. Flanking sequences are never modified. |
| replace | bool | False | When True, a window of ins_length bases is replaced at each position (equivalent to replacement_scan()). Valid positions = background length − insert length + 1; output length = background length. |
| style | str \| None | None | Named display style applied to inserted bases. |
| prefix | str \| None | None | Prefix for auto-generated sequence names. |
| mode | str | 'random' | 'sequential' iterates positions then inserts in order; 'random' shuffles the (position × insert) product. |
| num_states | int \| None | None | Fix the total number of output states. |
| iter_order | int \| None | None | Iteration priority for downstream multi-pool iteration. |


Note

Only the most commonly used parameters are shown above. For the full parameter list, see insertion_scan() in the API Reference.

Examples

Single-base insertions at every position

An 8-mer has 9 insertion sites. 9 sites × 4 bases = 36 sequences, each of length 9.

import poolparty as pp
pp.init()
wt    = pp.from_seq("ACGTACGT")
bases = pp.from_seqs(["A", "C", "G", "T"], mode="sequential")
scan  = wt.insertion_scan(insertion_pool=bases, mode="sequential", style="red")
scan.print_library()
scan: seq_length=9, num_states=36
AACGTACGT
AACGTACGT
ACAGTACGT
ACGATACGT
ACGTAACGT
... (36 total)
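The first two sequences in the listing are identical because inserting A before or after the leading A yields the same string. The sketch below, in plain Python and not the poolparty implementation, enumerates the same 36 states and counts how many distinct sequences they contain. The iteration order (position varying fastest) is an assumption read off the listing above.

```python
background = "ACGTACGT"
inserts = ["A", "C", "G", "T"]

# Enumerate states with position varying fastest, mirroring the listing above.
states = [
    background[:pos] + ins + background[pos:]
    for ins in inserts
    for pos in range(len(background) + 1)
]

print(len(states))       # number of states (position x insert pairs)
print(len(set(states)))  # number of distinct sequences among them
print(states[:5])
```

Each base occurs twice in this background, so each insert collapses two pairs of adjacent sites: the scan has 36 states but only 28 distinct sequences. Deduplicate downstream if unique sequences are required.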

All-dinucleotide insertions

Use from_iupac("NN") to enumerate all 16 dinucleotide inserts. 9 sites × 16 inserts = 144 sequences, each of length 10.

import poolparty as pp
pp.init()
wt   = pp.from_seq("ACGTACGT")
nn   = pp.from_iupac("NN", mode="sequential")
scan = wt.insertion_scan(insertion_pool=nn, mode="sequential", style="red")
scan.print_library()
scan: seq_length=10, num_states=144
AAACGTACGT
AAACGTACGT
ACAAGTACGT
ACGAATACGT
ACGTAAACGT
... (144 total)
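To see where the 16 inserts come from, here is a minimal plain-Python sketch of IUPAC expansion. expand_iupac and the IUPAC table are illustrative names, not poolparty APIs, and the enumeration order is an assumption (the listing above only confirms that AA comes first).

```python
from itertools import product

# Subset of the standard IUPAC nucleotide codes used on this page.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "N": "ACGT",
}


def expand_iupac(pattern):
    """Yield every concrete sequence matching an IUPAC pattern."""
    for combo in product(*(IUPAC[ch] for ch in pattern)):
        yield "".join(combo)


nn = list(expand_iupac("NN"))
print(len(nn), nn[:4])
print(len(nn) * (len("ACGTACGT") + 1))  # inserts x insertion sites
```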

Insert-and-replace mode (replace=True)

replace=True replaces a window equal in width to the insert (here 2 bases) at each position. For an 8-mer with a 2-base insert there are 8 − 2 + 1 = 7 valid positions, so 7 positions × 4 inserts = 28 sequences, each of length 8. This is equivalent to calling replacement_scan().

import poolparty as pp
pp.init()
wt    = pp.from_seq("ACGTACGT")
bases = pp.from_seqs(["AA", "CC", "GG", "TT"], mode="sequential")
scan  = wt.insertion_scan(insertion_pool=bases, replace=True, mode="sequential",
                          style="red")
scan.print_library()
scan: seq_length=8, num_states=28
AAGTACGT
AAATACGT
ACAAACGT
ACGAACGT
ACGTAAGT
... (28 total)
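The window replacement itself is a single slicing operation. The sketch below is plain Python under the assumption that positions vary fastest, as in the listing above; replace_window is a hypothetical helper, not a poolparty function.

```python
background = "ACGTACGT"
inserts = ["AA", "CC", "GG", "TT"]


def replace_window(seq, ins, pos):
    """Overwrite len(ins) bases of seq starting at pos."""
    return seq[:pos] + ins + seq[pos + len(ins):]


# The window must fit inside the background: L - ins_length + 1 start sites.
n_positions = len(background) - len(inserts[0]) + 1
states = [
    replace_window(background, ins, pos)
    for ins in inserts
    for pos in range(n_positions)
]

print(n_positions, len(states))
print(states[:5])
```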

Insertion scan within a named region

Restrict insertion sites to the cre region. The 8-base region has 9 valid insertion sites, so 9 sites × 4 bases = 36 sequences, each of length 17 (4 + 9 + 4). Flanks are never altered.

import poolparty as pp
pp.init()
wt    = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
bases = pp.from_seqs(["A", "C", "G", "T"], mode="sequential")
scan  = wt.insertion_scan(insertion_pool=bases, region="cre", mode="sequential",
                          style="red")
scan.print_library()
scan: seq_length=17, num_states=36
AAAA<cre>AATCGATCG</cre>TTTT
AAAA<cre>AATCGATCG</cre>TTTT
AAAA<cre>ATACGATCG</cre>TTTT
AAAA<cre>ATCAGATCG</cre>TTTT
AAAA<cre>ATCGAATCG</cre>TTTT
... (36 total)
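The region restriction can be pictured as splitting the tagged string into left flank, region body, and right flank, then scanning only the body. This is a plain-Python sketch, not how poolparty parses regions internally; it assumes exactly one cre region and the same position-fastest order as the listing above.

```python
import re

tagged = "AAAA<cre>ATCGATCG</cre>TTTT"
inserts = ["A", "C", "G", "T"]

# Split into left flank, region body, right flank (assumes one <cre> region).
match = re.fullmatch(r"(.*)<cre>(.*)</cre>(.*)", tagged)
left, body, right = match.groups()

# Insert only inside the region body; flanks are copied through unchanged.
states = [
    f"{left}<cre>{body[:pos]}{ins}{body[pos:]}</cre>{right}"
    for ins in inserts
    for pos in range(len(body) + 1)
]

print(len(states))
print(states[:3])
```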

Explicit position list

Limit the scan to specific insertion sites. Here 3 positions × 4 bases = 12 sequences, each of length 9.

import poolparty as pp
pp.init()
wt    = pp.from_seq("ACGTACGT")
bases = pp.from_seqs(["A", "C", "G", "T"], mode="sequential")
scan  = wt.insertion_scan(insertion_pool=bases, positions=[0, 4, 8],
                          mode="sequential", style="red")
scan.print_library()
scan: seq_length=9, num_states=12
AACGTACGT
ACGTAACGT
ACGTACGTA
CACGTACGT
ACGTCACGT
... (12 total)
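The listing shows that, in sequential mode, all listed positions are visited for the first insert before moving to the next insert. A minimal plain-Python sketch of that ordering follows. insert_at is a hypothetical stand-in for the scan, and the range check is an assumption about sensible behavior, not documented poolparty behavior.

```python
def insert_at(background, inserts, positions):
    """Insert each sequence at each listed site; positions vary fastest.

    Hypothetical stand-in for the scan, for illustration only.
    """
    max_pos = len(background)  # the site after the last base
    for pos in positions:
        if not 0 <= pos <= max_pos:
            raise ValueError(f"position {pos} outside 0..{max_pos}")
    return [
        background[:pos] + ins + background[pos:]
        for ins in inserts
        for pos in positions
    ]


states = insert_at("ACGTACGT", ["A", "C", "G", "T"], positions=[0, 4, 8])
print(len(states))
print(states[:5])
```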

Random motif insertion (mode="random")

mode='random' draws (position, insert) combinations stochastically, and num_states=5 fixes the number of output sequences. Here a degenerate 6-base IUPAC motif (R = A|G, Y = C|T) is inserted at random positions along a 12-mer, giving sequences of length 18.

import poolparty as pp
pp.init()
wt    = pp.from_seq("ACGTACGTACGT")
motif = pp.from_iupac("RRYYYY")
scan  = wt.insertion_scan(insertion_pool=motif, mode="random", num_states=5,
                          style="red")
scan.print_library()
scan: seq_length=18, num_states=5
ACGTACGTACGTGACCCT
ACGTACGTACGTGATCTT
ACGACCTTGTACGTACGT
ACGAACTTTTACGTACGT
ACGTACAATTCCGTACGT
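For intuition about the sample space, the sketch below builds the full (position × insert) product in plain Python and draws five entries from it. It is not the poolparty sampler: the use of random.Random with a fixed seed and sampling without replacement are both assumptions, so the sequences it prints will not match the listing above.

```python
import random
from itertools import product

IUPAC = {"R": "AG", "Y": "CT"}
background = "ACGTACGTACGT"

# Expand the degenerate motif into its concrete variants.
motifs = ["".join(p) for p in product(*(IUPAC[c] for c in "RRYYYY"))]

# Full (position x insert) product: every site paired with every variant.
pairs = list(product(range(len(background) + 1), motifs))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
drawn = [
    background[:pos] + motif + background[pos:]
    for pos, motif in rng.sample(pairs, k=5)
]

print(len(motifs), len(pairs))
for seq in drawn:
    print(seq)
```

With 13 insertion sites and 64 motif variants the product has 832 entries, so five draws cover only a small fraction of it.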

See insertion_scan().