Sequence Regions
You often want to perform different operations on different parts of a sequence. Regions let you mark specific segments with XML-style tags so that operations can target them by name.
import poolparty as pp
pp.init()
Tag syntax
PoolParty supports two forms of region tag:
Opening/closing pairs enclose a segment of the sequence:
wt = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
wt.print_library()
Self-closing tags mark a zero-length insertion point:
wt = pp.from_seq("ACGT<ins/>ACGT")
wt.print_library()
Tags can be written inline when creating a pool with from_seq or
from_seqs, or added programmatically with insert_tags
or annotate_region.
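To make the two tag forms concrete, here is a minimal pure-Python sketch of adding a tag by coordinates. The helper `wrap_region` and its parameters are hypothetical and exist only for illustration; this is not the signature of PoolParty's insert_tags or annotate_region, which may work differently.

```python
def wrap_region(seq: str, name: str, start: int, stop: int) -> str:
    """Hypothetical helper: tag seq[start:stop] with an XML-style region tag.

    A non-empty span gets an opening/closing pair; a zero-length span
    (start == stop) gets a self-closing tag marking an insertion point.
    """
    if not 0 <= start <= stop <= len(seq):
        raise ValueError("span must satisfy 0 <= start <= stop <= len(seq)")
    if start == stop:
        return f"{seq[:start]}<{name}/>{seq[stop:]}"
    return f"{seq[:start]}<{name}>{seq[start:stop]}</{name}>{seq[stop:]}"


# Enclose bases 4..11 in a <cre> pair, and mark position 4 with <ins/>.
tagged = wrap_region("AAAAATCGATCGTTTT", "cre", 4, 12)
marked = wrap_region("ACGTACGT", "ins", 4, 4)
print(tagged)  # AAAA<cre>ATCGATCG</cre>TTTT
print(marked)  # ACGT<ins/>ACGT
```

The resulting strings have the same shape as the inline examples passed to from_seq above.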
Targeting operations with region=
Many operations accept a region parameter that restricts the operation to
the region with that name. Sequence outside the region is left unchanged:
wt = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
mutants = wt.mutagenize(num_mutations=1, region="cre", mode="sequential")
mutants.print_library(num_seqs=4)
AAAA<cre>GTCGATCG</cre>TTTT
AAAA<cre>TTCGATCG</cre>TTTT
AAAA<cre>AACGATCG</cre>TTTT ... (24 total)
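Only the 8 bases inside <cre> are mutated; the flanking AAAA and
TTTT remain intact. Each of the 8 positions can change to any of 3 other
bases, which gives the 8 × 3 = 24 single-point mutants reported above. See
Region Operations for the operations that create and manage region tags.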
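The effect of restricting a mutation scan to one region can be sketched in plain Python. This is an illustrative approximation, not PoolParty's implementation: the function name `single_mutants_in_region`, the regex-based tag lookup, and the enumeration order are all assumptions made for the example.

```python
import re

BASES = "ACGT"


def single_mutants_in_region(tagged: str, name: str) -> list[str]:
    """Return every single-base substitution inside <name>...</name>.

    Hypothetical helper: text outside the tagged region is copied unchanged.
    """
    pattern = re.compile(rf"<{re.escape(name)}>([ACGT]*)</{re.escape(name)}>")
    match = pattern.search(tagged)
    if match is None:
        raise ValueError(f"no <{name}> region found")
    content = match.group(1)
    start, stop = match.span(1)  # coordinates of the region content only
    mutants = []
    for i, ref in enumerate(content):
        for alt in BASES:
            if alt != ref:
                mutated = content[:i] + alt + content[i + 1:]
                mutants.append(tagged[:start] + mutated + tagged[stop:])
    return mutants


mutants = single_mutants_in_region("AAAA<cre>ATCGATCG</cre>TTTT", "cre")
print(len(mutants))  # 24
```

Every returned string still begins with AAAA<cre> and ends with </cre>TTTT, which is the behavior the region parameter is described as providing.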
Persistence through the DAG
Region tags persist through the DAG and remain valid even when upstream operations change the content within a region. This means multiple operations can target the same region in series:
wt = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
mutants = wt.mutagenize(num_mutations=1, region="cre", mode="sequential")
dels = mutants.deletion_scan(deletion_length=3, region="cre", mode="sequential")
dels.print_library(num_seqs=4)
AAAA<cre>C---ATCG</cre>TTTT
AAAA<cre>CT---TCG</cre>TTTT
AAAA<cre>CTC---CG</cre>TTTT ... (144 total)
Here mutagenize produces 24 single-point mutants of the cre region,
and deletion_scan then slides a 3-bp deletion across the same 8-bp region
(8 - 3 + 1 = 6 positions per mutant), giving 24 × 6 = 144 total sequences.
The cre tag is valid at both steps.
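The count can be checked with a small pure-Python sketch that chains the two steps on the region content alone. The helpers `point_mutants` and `deletion_scan` below are hypothetical stand-ins written for this example, not PoolParty functions; using `-` for deleted bases follows the output shown above.

```python
BASES = "ACGT"


def point_mutants(content: str) -> list[str]:
    """All single-base substitutions of content (3 per position)."""
    return [
        content[:i] + alt + content[i + 1:]
        for i, ref in enumerate(content)
        for alt in BASES
        if alt != ref
    ]


def deletion_scan(content: str, deletion_length: int) -> list[str]:
    """Slide a deletion window across content, marking deleted bases with '-'."""
    n_positions = len(content) - deletion_length + 1
    return [
        content[:i] + "-" * deletion_length + content[i + deletion_length:]
        for i in range(n_positions)
    ]


cre = "ATCGATCG"
# Second step is applied to every output of the first step.
library = [d for m in point_mutants(cre) for d in deletion_scan(m, 3)]
print(len(point_mutants(cre)))     # 24
print(len(deletion_scan(cre, 3)))  # 6
print(len(library))                # 144
```

The sketch only counts sequences; it does not check whether all 144 are distinct.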
Inspecting regions
Every pool tracks which regions are present in its sequences via the
pool.regions property:
wt = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT<ins/>GGGG")
wt.regions
{Region(name='cre', seq_length=8), Region(name='ins', seq_length=0)}
Each Region object records the region’s name and the
length of its content (0 for self-closing tags). See
Region in the API Reference for full details.
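As a rough illustration of what such a record holds, the following sketch derives name and content length directly from a tagged string. `RegionInfo` and `find_regions` are hypothetical names for this example and are not PoolParty's Region class or its parsing logic; the regex also assumes regions are not nested.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionInfo:
    """Hypothetical stand-in for a region record: a name and a content length."""
    name: str
    seq_length: int


# Either <name>content</name> (no nested tags) or a self-closing <name/>.
TAG = re.compile(r"<(\w+)>([^<]*)</\1>|<(\w+)/>")


def find_regions(tagged: str) -> set[RegionInfo]:
    """Collect one RegionInfo per tag found in the string."""
    regions = set()
    for m in TAG.finditer(tagged):
        if m.group(1) is not None:
            regions.add(RegionInfo(m.group(1), len(m.group(2))))
        else:
            regions.add(RegionInfo(m.group(3), 0))  # self-closing: zero length
    return regions


regions = find_regions("AAAA<cre>ATCGATCG</cre>TTTT<ins/>GGGG")
for region in sorted(regions, key=lambda r: r.name):
    print(region)
```

A frozen dataclass is used so the records are hashable and can be stored in a set, matching the set-like display of pool.regions above.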