shuffle_scan

Slide a window of fixed length across the sequence (or a named region) and, at each position, shuffle the bases within that window. Bases outside the window are unchanged. Use mode='sequential' to enumerate every window start as its own state; mode='random' (the default) instead draws window positions at random, one per state, with the number of states set by num_states (default 1, a single draw). Set shuffles_per_position to generate several independent shuffles per window position.

import poolparty as pp
pp.init()
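The window shuffle described above can be pictured with a few lines of plain Python. This is only a sketch of the idea, not poolparty's implementation: shuffle_window is a hypothetical helper, and the seeded random.Random instance stands in for whatever random source the library actually uses.

```python
import random

def shuffle_window(seq, start, length, rng):
    """Return seq with the bases in [start, start + length) shuffled.

    Hypothetical helper for illustration; not part of poolparty.
    """
    window = list(seq[start:start + length])
    rng.shuffle(window)  # in-place shuffle of the window bases only
    return seq[:start] + "".join(window) + seq[start + length:]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
wt = "ACGTACGT"
variant = shuffle_window(wt, start=3, length=3, rng=rng)
print(variant)
```

Only the three bases starting at index 3 can move; everything before and after the window is copied through unchanged. A shuffle may also return the original order, so a variant identical to the input is a legitimate outcome.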

Parameters

Parameter              Type               Default     Description
pool                   Pool | str         (required)  The Pool to scan. Can also be a plain sequence string.
shuffle_length         int                (required)  Width of the shuffle window in bases. A sequence of length L produces L - shuffle_length + 1 window positions.
positions              list[int] | None   None        Explicit list of window start positions. None = all valid positions.
region                 str | list | None  None        Restrict the scan to a named region or [start, stop] interval. Flanks are never modified.
shuffle_type           str                "mono"      "mono" shuffles individual bases; "dinuc" preserves dinucleotide frequencies.
shuffles_per_position  int                1           Number of independent shuffles generated per window position. Values > 1 multiply the library size by that factor.
prefix                 str | None         None        Prefix for auto-generated sequence names.
mode                   str                'random'    'sequential' iterates window positions left-to-right; 'random' samples window positions at random.
num_states             int | None         None        Number of output states. None auto-computes in sequential mode or defaults to 1 in random mode.
style                  str | None         None        Named display style applied to the shuffled window.
iter_order             int | None         None        Enumeration order when combined with other pools.
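The position count in the shuffle_length row, L - shuffle_length + 1, is easy to check by hand. The sketch below assumes 0-based window starts (consistent with positions=[0, 3, 6] in the examples). It also assumes that a sequential scan emits one state per (position, shuffle) pair, which is an inference from the shuffles_per_position description rather than a documented formula. window_starts is a hypothetical helper.

```python
def window_starts(seq_length, shuffle_length):
    """All valid 0-based window starts: 0 .. seq_length - shuffle_length."""
    if shuffle_length > seq_length:
        return []  # the window does not fit anywhere
    return list(range(seq_length - shuffle_length + 1))

starts = window_starts(seq_length=8, shuffle_length=3)
print(starts)  # [0, 1, 2, 3, 4, 5]

# Assumed library size for a sequential scan with several shuffles per window.
shuffles_per_position = 3
expected_states = len(starts) * shuffles_per_position
print(expected_states)  # 18
```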

Note

With shuffle_type="dinuc", the first and last bases of each window are always fixed — this is a mathematical constraint of the Euler-path algorithm used to preserve dinucleotide frequencies.
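The fixed-endpoint property can be checked by brute force on a short window. The sketch below does not use the Euler-path algorithm itself. It enumerates every rearrangement of an 8-base window, keeps those with exactly the same dinucleotide counts, and prints them. The window "AGACATAG" and the helper dinuc_counts are illustrative choices, not part of poolparty.

```python
from collections import Counter
from itertools import permutations

def dinuc_counts(seq):
    """Count overlapping dinucleotides in seq."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))

window = "AGACATAG"
target = dinuc_counts(window)

# Distinct rearrangements of the window (duplicates collapse in the set).
candidates = {"".join(p) for p in permutations(window)}

# Keep only rearrangements whose dinucleotide counts match the original.
variants = sorted(s for s in candidates if dinuc_counts(s) == target)

for v in variants:
    print(v)
```

Every surviving variant starts with the same base and ends with the same base as the original window, which is why very short windows leave little or no room for a dinucleotide shuffle to change anything.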


Note

Only the most commonly used parameters are shown above. For the full parameter list, see shuffle_scan() in the API Reference.

Examples

3-base shuffle window across an 8-mer

Six starts are valid for a length-3 window on an 8-mer (8 - 3 + 1 = 6). With mode='random' and the default num_states, the pool has a single state, so print_library() shows one shuffled draw. The draw is reproducible after pp.init() with the default seeding for library generation.

wt   = pp.from_seq("ACGTACGT")
scan = wt.shuffle_scan(shuffle_length=3, mode="random", style="red")
scan.print_library()
scan: seq_length=8, num_states=1
ACGCATGT

Multiple shuffles per position (shuffles_per_position)

shuffles_per_position=3 attaches three independent shuffles to the drawn window; the preview lists each of the three pool states.

wt   = pp.from_seq("ACGTACGT")
scan = wt.shuffle_scan(shuffle_length=3, shuffles_per_position=3, mode="random",
                       style="red")
scan.print_library()
scan: seq_length=8, num_states=3
ACGCATGT
ACGTACGT
ACGTCAGT

Explicit position list

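Pass positions=[0, 3, 6] so the shuffle window may start only at those indices. With mode='random' and default num_states, one of those starts is drawn per state (here the preview is a single row). In this draw the shuffled 2-base window happens to keep its original order, so the row is identical to the input sequence.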

wt   = pp.from_seq("ACGTACGT")
scan = wt.shuffle_scan(shuffle_length=2, positions=[0, 3, 6], mode="random",
                       style="red")
scan.print_library()
scan: seq_length=8, num_states=1
ACGTACGT
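As a rough picture of how an explicit position list interacts with random mode, the sketch below checks that each requested start leaves room for the whole window and then draws one start. How poolparty itself treats out-of-range positions is not stated here, so the filtering step is an assumption, and valid_starts is a hypothetical helper.

```python
import random

def valid_starts(positions, seq_length, shuffle_length):
    """Keep only starts whose window fits inside the sequence."""
    last_start = seq_length - shuffle_length
    return [p for p in positions if 0 <= p <= last_start]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
candidates = valid_starts([0, 3, 6], seq_length=8, shuffle_length=2)
start = rng.choice(candidates)  # one start drawn per state in random mode

print(candidates)  # [0, 3, 6]
print(start)
```

With a 2-base window on an 8-mer the last valid start is 6, so all three requested positions survive the check.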

Shuffle scan within a named region

Restrict the scan to the cre region; flanking sequences are never shuffled. The region tags appear literally in the printed sequence and are not counted in seq_length.

wt   = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
scan = wt.shuffle_scan(shuffle_length=3, region="cre", mode="random",
                       style="red")
scan.print_library()
scan: seq_length=16, num_states=1
AAAA<cre>ATCGATCG</cre>TTTT
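The region restriction can be sketched in plain Python by splitting the tagged string into left flank, region body, and right flank, then shuffling a window inside the body only. The regular-expression parsing and the helper shuffle_in_region are illustrative assumptions; poolparty's own region handling may differ.

```python
import random
import re

def shuffle_in_region(tagged, region, shuffle_length, rng):
    """Shuffle one randomly placed window inside <region>...</region>."""
    name = re.escape(region)
    match = re.match(rf"^(.*)<{name}>(.*)</{name}>(.*)$", tagged)
    if match is None:
        raise ValueError(f"region {region!r} not found")
    left, body, right = match.groups()

    # Window starts are counted in region coordinates, not full-sequence ones.
    start = rng.randrange(len(body) - shuffle_length + 1)
    window = list(body[start:start + shuffle_length])
    rng.shuffle(window)
    body = body[:start] + "".join(window) + body[start + shuffle_length:]
    return f"{left}<{region}>{body}</{region}>{right}"

rng = random.Random(0)  # fixed seed so the sketch is repeatable
result = shuffle_in_region("AAAA<cre>ATCGATCG</cre>TTTT", "cre", 3, rng)
print(result)
```

Because the flanks are copied through untouched, the window can never straddle a region boundary.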

Sequential mode — all window positions

mode='sequential' enumerates every window start position left-to-right, producing one shuffled state per position. With a 3-base window on an 8-mer, there are 6 positions.

wt   = pp.from_seq("ACGTACGT")
scan = wt.shuffle_scan(shuffle_length=3, mode="sequential", style="red")
scan.print_library()
scan: seq_length=8, num_states=6
GCATACGT
ACGTACGT
ACGATCGT
ACGCATGT
ACGTACGT
ACGTATGC
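The sequential enumeration can be mimicked in plain Python: visit every window start from left to right and emit one shuffled copy per start. This is a sketch under the same assumptions as the description above (0-based starts, one state per position). sequential_scan is a hypothetical helper, and the rows it prints will generally differ from the library output shown here because the random source is different.

```python
import random

def sequential_scan(seq, shuffle_length, rng):
    """Return one shuffled variant per window start, left to right."""
    states = []
    for start in range(len(seq) - shuffle_length + 1):
        window = list(seq[start:start + shuffle_length])
        rng.shuffle(window)  # only the current window is rearranged
        states.append(
            seq[:start] + "".join(window) + seq[start + shuffle_length:]
        )
    return states

wt = "ACGTACGT"
states = sequential_scan(wt, shuffle_length=3, rng=random.Random(0))
for row in states:
    print(row)
```

Rows identical to the input, as in the second and fifth rows of the library output above, simply mean that the shuffle for that window returned the original order.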

See shuffle_scan().