Sequence Regions

You often want to perform different operations on different parts of a sequence. Regions let you mark specific segments with XML-style tags so that operations can target them by name.

import poolparty as pp
pp.init()

Tag syntax

PoolParty supports two forms of region tag:

Opening/closing pairs enclose a segment of the sequence:

wt = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
wt.print_library()
pool[0]: seq_length=16, num_states=1
AAAA<cre>ATCGATCG</cre>TTTT

Self-closing tags mark a zero-length insertion point:

wt = pp.from_seq("ACGT<ins/>ACGT")
wt.print_library()
pool[0]: seq_length=8, num_states=1
ACGT<ins/>ACGT

Tags can be written inline when creating a pool with from_seq or from_seqs, or added programmatically with insert_tags or annotate_region.
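
To make the two tag forms concrete, the following plain-Python sketch strips tags from a tagged string. It illustrates why seq_length in the outputs above counts only bases and not tag characters. It does not call PoolParty: the regular expression and the helper name strip_tags are assumptions made for illustration, not the library's own parser.

```python
import re

# Hypothetical pattern (illustration only): matches <name>, </name>, and <name/>.
TAG_PATTERN = re.compile(r"</?\w+/?>")

def strip_tags(tagged_seq: str) -> str:
    """Return the sequence with every region tag removed."""
    return TAG_PATTERN.sub("", tagged_seq)

paired = strip_tags("AAAA<cre>ATCGATCG</cre>TTTT")
self_closing = strip_tags("ACGT<ins/>ACGT")

print(len(paired))        # 16, matching seq_length=16 above
print(len(self_closing))  # 8, matching seq_length=8 above
```

The self-closing tag contributes no bases at all, which is what makes it a zero-length insertion point.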


Targeting operations with region=

Many operations accept a region parameter naming a tagged region. The operation is then applied only inside that region, and the flanking sequence is left unchanged:

wt      = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
mutants = wt.mutagenize(num_mutations=1, region="cre", mode="sequential")
mutants.print_library(num_seqs=4)
pool[1]: seq_length=16, num_states=24
AAAA<cre>CTCGATCG</cre>TTTT
AAAA<cre>GTCGATCG</cre>TTTT
AAAA<cre>TTCGATCG</cre>TTTT
AAAA<cre>AACGATCG</cre>TTTT
... (24 total)

Only the 8 bases inside <cre> are mutated; the flanking AAAA and TTTT remain intact. See Region Operations for the full list of region-aware operations.
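
The count of 24 states follows from 8 positions times 3 alternative bases per position. The sketch below reproduces that enumeration in plain Python, without PoolParty. The function name single_mutants and the iteration order (positions left to right, bases in A, C, G, T order) are assumptions; that order happens to reproduce the four sequences printed above, but it is not confirmed to be how mutagenize works internally.

```python
WT_REGION = "ATCGATCG"  # content of <cre> in the example above
ALPHABET = "ACGT"       # assumed base order, for illustration

def single_mutants(region: str) -> list[str]:
    """Enumerate every single-base substitution, position by position."""
    mutants = []
    for pos, wt_base in enumerate(region):
        for base in ALPHABET:
            if base != wt_base:  # skip the wild-type base
                mutants.append(region[:pos] + base + region[pos + 1:])
    return mutants

mutants = single_mutants(WT_REGION)
print(len(mutants))  # 24
for m in mutants[:4]:
    # Flanks are re-attached untouched, as with region="cre"
    print(f"AAAA<cre>{m}</cre>TTTT")
```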


Persistence through the DAG

Region tags persist through the DAG and remain valid even when upstream operations change the content within a region. This means multiple operations can target the same region in series:

wt      = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
mutants = wt.mutagenize(num_mutations=1, region="cre", mode="sequential")
dels    = mutants.deletion_scan(deletion_length=3, region="cre", mode="sequential")
dels.print_library(num_seqs=4)
pool[4]: seq_length=16, num_states=144
AAAA<cre>---GATCG</cre>TTTT
AAAA<cre>C---ATCG</cre>TTTT
AAAA<cre>CT---TCG</cre>TTTT
AAAA<cre>CTC---CG</cre>TTTT
... (144 total)

Here mutagenize produces 24 single-point mutants of the cre region (8 positions × 3 alternative bases), and deletion_scan then slides a 3-bp deletion across the same region (8 − 3 + 1 = 6 positions per mutant), giving 24 × 6 = 144 total sequences. The cre tag is valid at both steps.
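
The arithmetic can be checked with a short plain-Python sketch that chains the two steps by hand. It does not use PoolParty; the helper names, the base order, and the choice to iterate deletions fastest within each mutant are assumptions that merely reproduce the output shown above.

```python
def single_mutants(region: str, alphabet: str = "ACGT") -> list[str]:
    """Every single-base substitution of `region`, position by position."""
    return [
        region[:pos] + base + region[pos + 1:]
        for pos, wt_base in enumerate(region)
        for base in alphabet
        if base != wt_base
    ]

def deletion_scan(region: str, deletion_length: int) -> list[str]:
    """Slide a run of '-' characters of the given length across `region`."""
    n_positions = len(region) - deletion_length + 1  # 8 - 3 + 1 = 6
    return [
        region[:start] + "-" * deletion_length + region[start + deletion_length:]
        for start in range(n_positions)
    ]

# Apply the deletion scan to every mutant of the cre content.
library = [d for m in single_mutants("ATCGATCG") for d in deletion_scan(m, 3)]

print(len(library))  # 144
print(library[:4])   # ['---GATCG', 'C---ATCG', 'CT---TCG', 'CTC---CG']
```

Note that 144 counts states, not distinct strings: when a deletion covers the mutated base, different mutants yield the same sequence.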


Inspecting regions

Every pool tracks which regions are present in its sequences via the pool.regions property:

wt = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT<ins/>GGGG")
wt.regions
{Region(name='cre', seq_length=8), Region(name='ins', seq_length=0)}

Each Region object records the region’s name and the length of its content (0 for self-closing tags). See Region in the API Reference for full details.
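
As a rough illustration of what a region's name and content length mean, the plain-Python sketch below recovers both from a tagged string. It is not PoolParty's implementation: the regular expressions and the helper name region_lengths are hypothetical, and the sketch assumes regions are not nested.

```python
import re

# Hypothetical patterns (illustration only); assumes no nested regions.
PAIRED = re.compile(r"<(\w+)>(.*?)</\1>")  # <name>content</name>
SELF_CLOSING = re.compile(r"<(\w+)/>")     # <name/>

def region_lengths(tagged_seq: str) -> dict[str, int]:
    """Map each region name to the length of its content."""
    lengths = {name: len(content) for name, content in PAIRED.findall(tagged_seq)}
    # Self-closing tags enclose nothing, so their content length is 0.
    lengths.update({name: 0 for name in SELF_CLOSING.findall(tagged_seq)})
    return lengths

result = region_lengths("AAAA<cre>ATCGATCG</cre>TTTT<ins/>GGGG")
print(result)  # {'cre': 8, 'ins': 0}
```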