shuffle_scan
============

Slide a window of fixed length across the sequence (or a named region) and, at
each position, shuffle the bases within that window. Bases outside the window
are unchanged. Use ``mode='sequential'`` to enumerate every window start as its
own state; ``mode='random'`` (the default) samples window positions according
to ``num_states`` (a single draw when ``num_states`` is left at its default).
Pair with ``shuffles_per_position`` to list several independent shuffles per
draw.

.. code-block:: python

    import poolparty as pp
    pp.init()

----

Parameters
----------

.. list-table::
   :header-rows: 1
   :widths: auto

   * - Parameter
     - Type
     - Default
     - Description
   * - ``pool``
     - ``Pool | str``
     - *(required)*
     - The Pool to scan. Can also be a plain sequence string.
   * - ``shuffle_length``
     - ``int``
     - *(required)*
     - Width of the shuffle window in bases. A sequence of length *L* produces
       *L* - ``shuffle_length`` + 1 window positions.
   * - ``positions``
     - ``list[int] | None``
     - ``None``
     - Explicit list of window start positions. ``None`` uses all valid
       positions.
   * - ``region``
     - ``str | None``
     - ``None``
     - Name of a tagged region to restrict the scan to. Flanks are never
       modified.
   * - ``shuffle_type``
     - ``str``
     - ``"mono"``
     - ``"mono"`` shuffles individual bases; ``"dinuc"`` preserves
       dinucleotide frequencies.
   * - ``shuffles_per_position``
     - ``int``
     - ``1``
     - Number of independent shuffles generated per window position. Values
       > 1 multiply the library size by that factor.
   * - ``prefix``
     - ``str | None``
     - ``None``
     - Prefix for auto-generated sequence names.
   * - ``mode``
     - ``str``
     - ``'random'``
     - ``'sequential'`` iterates window positions left-to-right; ``'random'``
       samples window positions at random.
   * - ``num_states``
     - ``int | None``
     - ``None``
     - Fix the total number of output states. In the examples below, ``None``
       yields a single state in ``'random'`` mode and one state per window
       position in ``'sequential'`` mode.
   * - ``style``
     - ``str | None``
     - ``None``
     - Named display style applied to the shuffled window.
   * - ``iter_order``
     - ``int | None``
     - ``None``
     - Controls which axis varies fastest when ``shuffles_per_position > 1``.

.. note::

   With ``shuffle_type="dinuc"``, the **first and last bases of each window
   are always fixed**. This is a mathematical constraint of the Euler-path
   algorithm used to preserve dinucleotide frequencies.

----

.. note::

   Only the most commonly used parameters are shown above. For the full
   parameter list, see :func:`~poolparty.shuffle_scan` in the
   :doc:`API Reference `.

Examples
--------

3-base shuffle window across an 8-mer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Six starts are valid for a length-3 window on an 8-mer. With ``mode='random'``
and default ``num_states``, the pool has a single state, so ``print_library()``
shows one shuffled draw (reproducible after ``pp.init()`` with default library
generation seeding).

.. code-block:: python

    import poolparty as pp
    pp.init()

    wt = pp.from_seq("ACGTACGT")
    scan = wt.shuffle_scan(shuffle_length=3, mode="random", style="red")
    scan.print_library()

.. raw:: html

    <pre>scan: seq_length=8, num_states=1
    ACGCATGT</pre>

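
The window arithmetic behind this example can be sketched in plain Python,
without poolparty. The helpers ``window_starts`` and ``shuffle_window`` are
hypothetical names used only for illustration; they are not part of the
poolparty API, and the seeded ``random.Random`` here will not reproduce
poolparty's own draws.

.. code-block:: python

    import random


    def window_starts(seq_length, shuffle_length):
        """Return every valid 0-based window start (L - shuffle_length + 1 of them)."""
        return list(range(seq_length - shuffle_length + 1))


    def shuffle_window(seq, start, shuffle_length, rng):
        """Shuffle the bases inside one window; bases outside it are untouched."""
        window = list(seq[start:start + shuffle_length])
        rng.shuffle(window)
        return seq[:start] + "".join(window) + seq[start + shuffle_length:]


    seq = "ACGTACGT"
    starts = window_starts(len(seq), 3)
    print(len(starts))  # 6 valid starts for a length-3 window on an 8-mer

    rng = random.Random(0)      # illustrative seed, unrelated to pp.init()
    start = rng.choice(starts)  # analogue of mode='random': draw one window start
    print(shuffle_window(seq, start, 3, rng))

Looping over ``starts`` instead of drawing a single one gives the analogue of
``mode='sequential'``.
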

Multiple shuffles per position (shuffles_per_position)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``shuffles_per_position=3`` attaches three independent shuffles to the drawn
window; the preview lists each of the three pool states.

.. code-block:: python

    import poolparty as pp
    pp.init()

    wt = pp.from_seq("ACGTACGT")
    scan = wt.shuffle_scan(shuffle_length=3, shuffles_per_position=3, mode="random", style="red")
    scan.print_library()

.. raw:: html

    <pre>scan: seq_length=8, num_states=3
    ACGCATGT
    ACGTACGT
    ACGTCAGT</pre>

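
The second row of this preview is identical to the input sequence. That is
what one would expect from independent draws: a short window has only a few
distinct orderings, and one of them is the original. The sketch below counts
them with the standard library; ``distinct_arrangements`` is a hypothetical
helper, not a poolparty function.

.. code-block:: python

    from itertools import permutations


    def distinct_arrangements(window):
        """Return the sorted distinct orderings of the bases in a window."""
        return sorted({"".join(p) for p in permutations(window)})


    # Three different bases give 3! = 6 orderings, one of which is the
    # original, so an independent shuffle can leave the window unchanged.
    print(distinct_arrangements("TAC"))

    # Repeated bases shrink the set further: only 3 orderings here.
    print(distinct_arrangements("AAC"))
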

Explicit position list
~~~~~~~~~~~~~~~~~~~~~~~

Pass ``positions=[0, 3, 6]`` so the shuffle window may start only at those
indices. With ``mode='random'`` and default ``num_states``, one of those starts
is drawn per state (here the preview is a single row).

.. code-block:: python

    import poolparty as pp
    pp.init()

    wt = pp.from_seq("ACGTACGT")
    scan = wt.shuffle_scan(shuffle_length=2, positions=[0, 3, 6], mode="random", style="red")
    scan.print_library()

.. raw:: html

    <pre>scan: seq_length=8, num_states=1
    ACGTACGT</pre>


Shuffle scan within a named region
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Restrict the scan to the ``cre`` region; flanking sequences are never shuffled.
Literal tags appear in the printed sequence; below they are escaped for HTML.

.. code-block:: python

    import poolparty as pp
    pp.init()

    # The region is marked with literal <cre>...</cre> tags in the input.
    wt = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
    scan = wt.shuffle_scan(shuffle_length=3, region="cre", mode="random", style="red")
    scan.print_library()

.. raw:: html

    <pre>scan: seq_length=16, num_states=1
    AAAA&lt;cre&gt;ATCTAGCG&lt;/cre&gt;TTTT</pre>

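
How a region restricts the candidate window starts can be sketched as follows,
assuming the region is marked with literal ``<cre>...</cre>`` tags as in the
printed output above and that the sequence contains a single tag pair.
``region_window_starts`` is a hypothetical helper written for this
illustration; it does not describe how poolparty parses regions internally.

.. code-block:: python

    import re


    def region_window_starts(tagged_seq, region, shuffle_length):
        """Return (plain sequence, valid window starts) for one tagged region.

        Starts are 0-based indices into the plain sequence with tags removed.
        Assumes exactly one <region>...</region> pair and no other tags.
        """
        pattern = re.compile(r"<{0}>(.*?)</{0}>".format(re.escape(region)))
        match = pattern.search(tagged_seq)
        if match is None:
            raise ValueError("region %r not found" % region)
        region_start = match.start()  # everything before the tag is plain bases
        region_len = len(match.group(1))
        plain = tagged_seq[:match.start()] + match.group(1) + tagged_seq[match.end():]
        last_start = region_start + region_len - shuffle_length
        return plain, list(range(region_start, last_start + 1))


    plain, starts = region_window_starts("AAAA<cre>ATCGATCG</cre>TTTT", "cre", 3)
    print(plain)   # AAAAATCGATCGTTTT
    print(starts)  # [4, 5, 6, 7, 8, 9]

Every window that starts at one of these indices lies entirely inside the
8-base region, so the ``AAAA`` and ``TTTT`` flanks are never touched.
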

Sequential mode: all window positions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``mode='sequential'`` enumerates every window start position left-to-right,
producing one shuffled state per position. With a 3-base window on an 8-mer,
there are 6 positions.

.. code-block:: python

    import poolparty as pp
    pp.init()

    wt = pp.from_seq("ACGTACGT")
    scan = wt.shuffle_scan(shuffle_length=3, mode="sequential", style="red")
    scan.print_library()

.. raw:: html

    <pre>scan: seq_length=8, num_states=6
    GCATACGT
    ACGTACGT
    ACGATCGT
    ACGCATGT
    ACGTACGT
    ACGTATGC</pre>

See :func:`~poolparty.shuffle_scan`.
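
The note on ``shuffle_type="dinuc"`` in the Parameters section says that the
first and last bases of each window stay fixed. A brute-force check on a short
window illustrates why: each base's count equals the number of dinucleotides
ending in it, plus one if it is the first base, so keeping both the base counts
and the dinucleotide counts pins down the first base (and, symmetrically, the
last). The helpers below are hypothetical and enumerate all orderings; this is
only a sanity check, not the Euler-path algorithm and not poolparty's
implementation.

.. code-block:: python

    from collections import Counter
    from itertools import permutations


    def dinuc_counts(seq):
        """Count overlapping dinucleotides in seq."""
        return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


    def dinuc_preserving_orderings(window):
        """Brute force: every reordering of window with identical dinucleotide counts."""
        target = dinuc_counts(window)
        candidates = {"".join(p) for p in permutations(window)}
        return sorted(s for s in candidates if dinuc_counts(s) == target)


    window = "ACGCACG"
    orderings = dinuc_preserving_orderings(window)
    print(orderings)  # ['ACACGCG', 'ACGCACG']

    # Every surviving ordering starts and ends with the same bases as the input.
    print({s[0] for s in orderings}, {s[-1] for s in orderings})

Brute force is only practical for very short windows, since the number of
candidate orderings grows factorially with the window length.
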