insertion_multiscan
===================

Insert sequences at multiple positions simultaneously, lengthening the output
sequence by the total inserted content. Insertion sites are chosen randomly and
are guaranteed to be non-overlapping.

.. code-block:: python

    import poolparty as pp
    pp.init()

----

Parameters
----------

.. list-table::
   :widths: auto
   :header-rows: 1

   * - Parameter
     - Type
     - Default
     - Description
   * - ``pool``
     - ``Pool | str``
     - *(required)*
     - Input pool or sequence string.
   * - ``num_insertions``
     - ``int``
     - *(required)*
     - Number of simultaneous non-overlapping insertion sites per draw.
   * - ``insertion_pools``
     - ``Pool | list[Pool]``
     - *(required)*
     - Pool(s) supplying inserted content. A single pool is reused at every
       site; a list assigns one pool per site.
   * - ``positions``
     - ``list | None``
     - ``None``
     - Allowed position sets for each insertion site. ``None`` allows any
       valid non-overlapping arrangement.
   * - ``region``
     - ``str | list | None``
     - ``None``
     - Named region or interval to restrict insertions to.
   * - ``names``
     - ``list[str] | None``
     - ``None``
     - Names for each insertion window.
   * - ``replace``
     - ``bool``
     - ``False``
     - If ``True``, replace the bases at each site instead of inserting
       between them (same behaviour as ``replacement_multiscan``).
   * - ``style``
     - ``str | None``
     - ``None``
     - Display style for inserted content.
   * - ``insertion_mode``
     - ``str``
     - ``"ordered"``
     - ``"ordered"`` preserves the left-to-right order of positions;
       ``"unordered"`` allows any permutation.
   * - ``min_spacing``
     - ``int | None``
     - ``None``
     - Minimum gap (in bases) between insertion sites.
   * - ``max_spacing``
     - ``int | None``
     - ``None``
     - Maximum gap (in bases) between insertion sites.
   * - ``prefix``
     - ``str | None``
     - ``None``
     - Prefix for the operation node name in the pool graph.
   * - ``mode``
     - ``str``
     - ``"random"``
     - ``"random"`` or ``"sequential"``.
   * - ``num_states``
     - ``int | None``
     - ``None``
     - Number of states. ``None`` lets PoolParty choose automatically.
   * - ``iter_order``
     - ``float | None``
     - ``None``
     - Iteration priority for downstream multi-pool iteration.
   * - ``cards``
     - ``dict | list | None``
     - ``None``
     - Design card columns to include in library output.

----

.. note::

   Only the most commonly used parameters are shown above. For the full
   parameter list, see :func:`~poolparty.insertion_multiscan` in the API
   Reference.

Examples
--------

Two simultaneous single-base insertions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Insert a single random base at each of two independently chosen positions.
``mode="random"`` makes each ``print_library()`` draw one stochastic outcome
(``num_states=1`` per preview).

.. code-block:: python

    wt = pp.from_seq("ATCGATCGATCG")
    insert = pp.from_iupac("N")  # any single base
    scan = wt.insertion_multiscan(num_insertions=2, insertion_pools=insert,
                                  mode="random", style="red")
    scan.print_library()

.. raw:: html

    <pre>scan: seq_length=14, num_states=1
    ATCGATGCGATCTG</pre>

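The length arithmetic can be checked with a plain-Python sketch that does not
use PoolParty. ``insert_at`` is a hypothetical helper written for this
illustration, and positions 6 and 11 are picked by hand to reproduce the
preview above; a real draw chooses them at random.

.. code-block:: python

    def insert_at(seq, insertions):
        """Insert strings into seq at positions given in original coordinates.

        insertions is a list of (position, content) pairs.
        """
        # Apply right to left so earlier positions are not shifted.
        for pos, content in sorted(insertions, reverse=True):
            seq = seq[:pos] + content + seq[pos:]
        return seq

    wt = "ATCGATCGATCG"
    result = insert_at(wt, [(6, "G"), (11, "T")])
    print(result, len(result))  # ATCGATGCGATCTG 14

Two single-base insertions lengthen the 12-mer to 14 bases, matching
``seq_length=14`` in the preview.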
Two simultaneous 2-mer insertions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use ``from_iupac("NN")`` to draw the content for each of the two chosen
positions from the 16 possible dinucleotides. With ``mode="random"``, each
preview shows one sampled outcome rather than the full enumeration.

.. code-block:: python

    wt = pp.from_seq("ATCGATCGATCG")
    insert = pp.from_iupac("NN")  # any of the 16 dinucleotides
    scan = wt.insertion_multiscan(num_insertions=2, insertion_pools=insert,
                                  mode="random", style="red")
    scan.print_library()

.. raw:: html

    <pre>scan: seq_length=16, num_states=1
    ATCGATGACGATCTTG</pre>

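The count of 16 follows from the IUPAC code ``N`` standing for any of the four
bases. A short plain-Python check, independent of PoolParty (the variable
names are illustrative only):

.. code-block:: python

    from itertools import product

    # N expands to A, C, G, or T, so "NN" covers every ordered pair of bases.
    dinucleotides = ["".join(pair) for pair in product("ACGT", repeat=2)]
    print(len(dinucleotides))  # 16
    print(dinucleotides[:4])   # ['AA', 'AC', 'AG', 'AT']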
Multiscan insertion within a named region
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Restrict both insertion sites to the ``cre`` region; bases outside the region
are never modified.

.. code-block:: python

    # The <cre>...</cre> tags mark the named region in the input sequence.
    wt = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
    insert = pp.from_iupac("N")
    scan = wt.insertion_multiscan(num_insertions=2, insertion_pools=insert,
                                  region="cre", mode="random", style="red")
    scan.print_library()

.. raw:: html

    <pre>scan: seq_length=18, num_states=1
    AAAA&lt;cre&gt;ATCGGATCGT&lt;/cre&gt;TTTT</pre>

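The effect of the region restriction can be sketched in plain Python, assuming
the region is marked with XML-style tags as in the preview above.
``insert_in_region`` is a hypothetical helper for illustration and is not part
of PoolParty; the region-relative positions 4 and 8 are picked by hand to
reproduce the preview.

.. code-block:: python

    import re

    def insert_in_region(tagged, region, insertions):
        """Insert content at region-relative positions, leaving flanks untouched."""
        name = re.escape(region)
        match = re.search(rf"<{name}>(.*?)</{name}>", tagged)
        if match is None:
            raise ValueError(f"region {region!r} not found")
        inner = match.group(1)
        # Apply right to left so earlier positions are not shifted.
        for pos, content in sorted(insertions, reverse=True):
            if not 0 <= pos <= len(inner):
                raise ValueError(f"position {pos} is outside the region")
            inner = inner[:pos] + content + inner[pos:]
        return tagged[:match.start(1)] + inner + tagged[match.end(1):]

    tagged = "AAAA<cre>ATCGATCG</cre>TTTT"
    result = insert_in_region(tagged, "cre", [(4, "G"), (8, "T")])
    print(result)  # AAAA<cre>ATCGGATCGT</cre>TTTT

Only the text between the tags changes; the ``AAAA`` and ``TTTT`` flanks are
copied through unchanged.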
Spacing constraints (min_spacing, max_spacing)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``min_spacing`` and ``max_spacing`` control the gap between insertion sites.
Here two 6-base motif insertions must be 4–8 bases apart on a 24-mer.

.. code-block:: python

    wt = pp.from_seq("ATCGATCGATCGATCGATCGATCG")
    motif = pp.from_seq("GATTAC")
    scan = wt.insertion_multiscan(num_insertions=2, insertion_pools=motif,
                                  min_spacing=4, max_spacing=8,
                                  mode="sequential", style="red")
    scan.print_library()

.. raw:: html

    <pre>scan: seq_length=36, num_states=95
    GATTACATCGGATTACATCGATCGATCGATCGATCG
    GATTACATCGAGATTACTCGATCGATCGATCGATCG
    GATTACATCGATGATTACCGATCGATCGATCGATCG
    GATTACATCGATCGATTACGATCGATCGATCGATCG
    GATTACATCGATCGGATTACATCGATCGATCGATCG
    ... (95 total)</pre>

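The state count can be reproduced with a plain-Python sketch under one reading
of the spacing rule that is consistent with the preview: sites run from 0 to
the sequence length inclusive, and the gap is the difference between the two
site positions in wild-type coordinates. This reading is inferred from the
output above, not taken from PoolParty's implementation, and ``spaced_pairs``
is a hypothetical helper.

.. code-block:: python

    def spaced_pairs(seq_len, min_spacing, max_spacing):
        """List ordered site pairs (p1, p2) whose gap lies within the bounds."""
        return [
            (p1, p2)
            for p1 in range(seq_len + 1)  # sites 0..seq_len inclusive
            for p2 in range(p1 + min_spacing,
                            min(p1 + max_spacing, seq_len) + 1)
        ]

    pairs = spaced_pairs(24, 4, 8)
    print(len(pairs))  # 95
    print(pairs[:5])   # [(0, 4), (0, 5), (0, 6), (0, 7), (0, 8)]

The first five pairs correspond to the first five sequences in the preview,
where the second ``GATTAC`` moves one base to the right on each line.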
PPM-based insertion pool (from_motif)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use :func:`~poolparty.from_motif` to supply a position-probability matrix
(PPM) as the inserted content. Each draw samples 6-mers from the PPM, so the
inserted content varies from draw to draw according to the matrix
probabilities.

.. code-block:: python

    import pandas as pd

    # One row per motif position; each row's probabilities sum to 1.
    ppm = pd.DataFrame({
        "A": [0.8, 0.1, 0.5, 0.1, 0.7, 0.1],
        "C": [0.1, 0.7, 0.2, 0.1, 0.1, 0.1],
        "G": [0.05, 0.1, 0.2, 0.1, 0.1, 0.7],
        "T": [0.05, 0.1, 0.1, 0.7, 0.1, 0.1],
    })
    wt = pp.from_seq("ATCGATCGATCGATCGATCGATCG")
    motif = pp.from_motif(ppm)
    scan = wt.insertion_multiscan(num_insertions=2, insertion_pools=motif,
                                  mode="random", num_states=5, style="red")
    scan.print_library()

.. raw:: html

    <pre>scan: seq_length=36, num_states=5
    ATCGATCGATCGCCGTAGATCGATCGATACCCAGCG
    CCATATATCGATCGATCGATCGATCGATAGGCATCG
    ATCGATCGATCGACCGCAGTCGATCGATCAAATAGG
    CCATATATCGATCGATCGATCGAACATAGTCGATCG
    ATCGATCGATACATAGCGATCGATCGACATACATCG</pre>

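Sampling from a PPM amounts to one weighted draw per motif position. The
sketch below illustrates that idea with the standard library only; it is not
how :func:`~poolparty.from_motif` is implemented, and ``sample_kmer`` is a
hypothetical helper. The matrix is the same one used in the example above,
written as a plain dictionary.

.. code-block:: python

    import random

    ppm = {
        "A": [0.8, 0.1, 0.5, 0.1, 0.7, 0.1],
        "C": [0.1, 0.7, 0.2, 0.1, 0.1, 0.1],
        "G": [0.05, 0.1, 0.2, 0.1, 0.1, 0.7],
        "T": [0.05, 0.1, 0.1, 0.7, 0.1, 0.1],
    }

    def sample_kmer(ppm, rng):
        """Draw one base per position, weighted by that position's probabilities."""
        bases = list(ppm)
        length = len(ppm[bases[0]])
        return "".join(
            rng.choices(bases, weights=[ppm[b][i] for b in bases])[0]
            for i in range(length)
        )

    rng = random.Random(0)  # seeded only to make the sketch repeatable
    samples = [sample_kmer(ppm, rng) for _ in range(5)]
    print(samples)

Each sampled string has six bases, one per matrix row, with high-probability
bases such as ``A`` at the first position appearing most often.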
Explicit position sets (positions)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Specify allowed insertion sites for each window, using a distinct pool for
each site. Below, the first insertion (``GGG``) can occur at position 0, 4, or
8 and the second (``AAA``) at position 10 or 14.

.. code-block:: python

    wt = pp.from_seq("ATCGATCGATCGATCG")
    pools = [pp.from_seq("GGG"), pp.from_seq("AAA")]
    scan = wt.insertion_multiscan(num_insertions=2, insertion_pools=pools,
                                  positions=[[0, 4, 8], [10, 14]],
                                  mode="sequential", style="red")
    scan.print_library()

.. raw:: html

    <pre>scan: seq_length=22, num_states=6
    GGGATCGATCGATAAACGATCG
    GGGATCGATCGATCGATAAACG
    ATCGGGGATCGATAAACGATCG
    ATCGGGGATCGATCGATAAACG
    ATCGATCGGGGATAAACGATCG
    ATCGATCGGGGATCGATAAACG</pre>

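The six states are the Cartesian product of the two position sets (3 × 2). A
plain-Python sketch, independent of PoolParty, reproduces the sequences in the
preview above. It assumes positions are given in wild-type coordinates and
relies on the two sets not overlapping, so no ordering or overlap checks are
included.

.. code-block:: python

    from itertools import product

    wt = "ATCGATCGATCGATCG"
    contents = ["GGG", "AAA"]
    position_sets = [[0, 4, 8], [10, 14]]

    library = []
    for combo in product(*position_sets):
        seq = wt
        # Apply right to left so earlier positions are not shifted.
        for pos, content in sorted(zip(combo, contents), reverse=True):
            seq = seq[:pos] + content + seq[pos:]
        library.append(seq)

    print(len(library))  # 6
    for seq in library:
        print(seq)
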
See :func:`~poolparty.insertion_multiscan`.