Sequence Regions
================
You often want to perform different operations on different parts of a
sequence. Regions let you mark specific segments with XML-style tags so that
operations can target them by name.

.. code-block:: python

    import poolparty as pp
    pp.init()


----

Tag syntax
----------

PoolParty supports two forms of region tag:

**Opening/closing pairs** enclose a segment of the sequence:


.. code-block:: python

    wt = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
    wt.print_library()

.. code-block:: text

    AAAA<cre>ATCGATCG</cre>TTTT


**Self-closing tags** mark a zero-length insertion point:

.. code-block:: python

    wt = pp.from_seq("ACGT<ins/>ACGT")
    wt.print_library()

.. code-block:: text

    ACGT<ins/>ACGT

Tags can be written inline when creating a pool with ``from_seq`` or
``from_seqs``, or added programmatically with :doc:`operations/insert_tags`
or :doc:`operations/annotate_region`.
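
To make the two tag forms concrete, here is a minimal pure-Python sketch of how a tagged string can be split into a bare sequence plus region coordinates. It is an illustration only and not PoolParty's parser: ``parse_regions`` and ``TaggedRegion`` are hypothetical names, and the sketch assumes simple, well-formed tags.

```python
import re
from typing import NamedTuple


class TaggedRegion(NamedTuple):
    """Hypothetical helper for this sketch, not a PoolParty class."""
    name: str
    start: int    # offset into the tag-free sequence
    length: int   # 0 for a self-closing tag


# Matches an opening tag <name>, a closing tag </name>, or a self-closing tag <name/>.
_TAG = re.compile(r"<(/?)(\w+)(/?)>")


def parse_regions(tagged: str) -> tuple[str, list[TaggedRegion]]:
    """Strip tags and return the bare sequence plus the regions found."""
    bare = []       # sequence chunks with tags removed
    open_at = {}    # region name -> offset where its opening tag appeared
    regions = []
    pos = 0         # read position in the tagged string
    n_bases = 0     # number of bases emitted so far
    for m in _TAG.finditer(tagged):
        chunk = tagged[pos:m.start()]
        bare.append(chunk)
        n_bases += len(chunk)
        pos = m.end()
        closing, name, self_closing = m.groups()
        if self_closing:
            regions.append(TaggedRegion(name, n_bases, 0))
        elif closing:
            start = open_at.pop(name)
            regions.append(TaggedRegion(name, start, n_bases - start))
        else:
            open_at[name] = n_bases
    bare.append(tagged[pos:])
    if open_at:
        raise ValueError(f"unclosed region tag(s): {sorted(open_at)}")
    return "".join(bare), regions


print(parse_regions("AAAA<cre>ATCGATCG</cre>TTTT"))
print(parse_regions("ACGT<ins/>ACGT"))
```

In this sketch an opening/closing pair yields a region whose length equals the enclosed content, while a self-closing tag yields a zero-length region at the insertion point.
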

----

Targeting operations with ``region=``
-------------------------------------

Many operations accept a ``region`` parameter that restricts the operation to
the tagged region. Flanking sequences are left unchanged:


.. code-block:: python

    wt = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
    mutants = wt.mutagenize(num_mutations=1, region="cre", mode="sequential")
    mutants.print_library(num_seqs=4)

.. code-block:: text

    AAAA<cre>CTCGATCG</cre>TTTT
    AAAA<cre>GTCGATCG</cre>TTTT
    AAAA<cre>TTCGATCG</cre>TTTT
    AAAA<cre>AACGATCG</cre>TTTT
    ... (24 total)


Only the 8 bases inside ``<cre>`` are mutated (8 positions × 3 alternative
bases = 24 variants); the flanking ``AAAA`` and ``TTTT`` remain intact. See
:doc:`operations/region_operations` for the full list of region-aware
operations.

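
The count of 24 can be checked with a small pure-Python sketch that enumerates single-base substitutions inside a region only. This is an illustration under stated assumptions, not PoolParty's implementation: ``single_mutants_in_region`` is a hypothetical helper, and the region is passed as explicit ``start`` and ``length`` offsets rather than by tag name.

```python
def single_mutants_in_region(seq, start, length, alphabet="ACGT"):
    """Yield every single-base substitution inside seq[start:start + length].

    Hypothetical helper for illustration; bases outside the region are
    copied through unchanged.
    """
    for i in range(start, start + length):
        for base in alphabet:
            if base != seq[i]:
                yield seq[:i] + base + seq[i + 1:]


seq = "AAAAATCGATCGTTTT"           # the cre region occupies offsets 4..11
mutants = list(single_mutants_in_region(seq, start=4, length=8))

print(len(mutants))                 # 8 positions x 3 alternative bases
for m in mutants[:4]:
    print(m[:4], m[4:12], m[12:])   # left flank, region, right flank
```

Iterating positions in order and trying alternative bases alphabetically gives the same first four region variants as the listing above, and every variant keeps the ``AAAA`` and ``TTTT`` flanks.
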

----

Persistence through the DAG
---------------------------

Region tags persist through the DAG and remain valid even when upstream
operations change the content within a region. This means multiple operations
can target the same region in series:


.. code-block:: python

    wt = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
    mutants = wt.mutagenize(num_mutations=1, region="cre", mode="sequential")
    dels = mutants.deletion_scan(deletion_length=3, region="cre", mode="sequential")
    dels.print_library(num_seqs=4)

.. code-block:: text

    AAAA<cre>---GATCG</cre>TTTT
    AAAA<cre>C---ATCG</cre>TTTT
    AAAA<cre>CT---TCG</cre>TTTT
    AAAA<cre>CTC---CG</cre>TTTT
    ... (144 total)

Here ``mutagenize`` produces 24 single-point mutants of the ``cre`` region,
and ``deletion_scan`` then slides a 3-bp deletion across the same region (6
positions per mutant), giving 24 × 6 = 144 total sequences. The ``cre`` tag
is valid at both steps.
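
The 24 × 6 arithmetic can be reproduced with a pure-Python sketch that chains the two steps on the region content alone. Both helpers are hypothetical stand-ins written for this example, not PoolParty functions, and the ``-`` gap character is taken from the listing above.

```python
def single_mutants(region, alphabet="ACGT"):
    """All single-base substitutions of `region`, position by position."""
    return [
        region[:i] + base + region[i + 1:]
        for i in range(len(region))
        for base in alphabet
        if base != region[i]
    ]


def deletion_scan(region, deletion_length, gap="-"):
    """Slide a fixed-length deletion across `region`, marking removed bases with `gap`."""
    n_positions = len(region) - deletion_length + 1
    return [
        region[:i] + gap * deletion_length + region[i + deletion_length:]
        for i in range(n_positions)
    ]


region = "ATCGATCG"
mutants = single_mutants(region)
# Apply the deletion scan to every mutant, keeping mutant order.
dels = [d for m in mutants for d in deletion_scan(m, deletion_length=3)]

print(len(mutants), len(dels))
print(dels[:4])
```

In this sketch the total of 144 counts mutant/deletion combinations; some of the resulting strings coincide, because a deletion can cover the mutated base.
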

----

Inspecting regions
------------------

Every pool tracks which regions are present in its sequences via the
``pool.regions`` property:


.. code-block:: python

    wt = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT<ins/>GGGG")
    wt.regions

.. code-block:: text

    {Region(name='cre', seq_length=8), Region(name='ins', seq_length=0)}

Each :class:`~poolparty.Region` object records the region's name and the
length of its content (``0`` for self-closing tags). See
:class:`~poolparty.Region` in the :doc:`api` for full details.