subseq_scan

Slide a window across the sequence and extract the subsequence at each position. Unlike deletion_scan (which removes the window) or replacement_scan (which replaces it), subseq_scan returns only the window content, producing a pool of short subsequences that tile across the input.

import poolparty as pp
pp.init()

Parameters

Parameter      Type                Default     Description
pool           Pool | str          (required)  Input pool or sequence string.
subseq_length  int                 (required)  Length of the subsequence window to extract at each position.
positions      list[int] | None    None        Explicit window start positions. None uses all valid positions.
region         str | list | None   None        Restrict the scan to a named region or [start, stop] interval.
prefix         str | None          None        Prefix for the operation node name in the pool graph.
mode           str                 "random"    "sequential" iterates positions left-to-right; "random" samples one position per draw.
num_states     int | None          None        Override the automatically-computed state count.
iter_order     float | None        None        Iteration priority for downstream multi-pool iteration.
cards          dict | list | None  None        Design card columns to include in library output. Available keys: "position_index", "start", "end", "name", "region_seq".


Note

Only the most commonly used parameters are shown above. For the full parameter list, see subseq_scan() in the API Reference.

Examples

Extract all 4-mers from an 8-mer

A window of length 4 over an 8-base sequence yields 8 - 4 + 1 = 5 subsequences, one for each valid start position (0 through 4).

import poolparty as pp
pp.init()
pool    = pp.from_seq("ACGTACGT")
submers = pool.subseq_scan(subseq_length=4, mode="sequential")
submers.print_library()
submers: seq_length=4, num_states=5
ACGT
CGTA
GTAC
TACG
ACGT
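
The window count and the sequences above can be checked without poolparty. The following is a minimal plain-Python sketch of a sliding window; sliding_windows is a hypothetical helper written for illustration, not part of the poolparty API and not a description of how subseq_scan is implemented.

```python
def sliding_windows(seq: str, k: int) -> list[str]:
    """Return every length-k window of seq, ordered left to right.

    Hypothetical helper for illustration only; not a poolparty function.
    """
    if k <= 0 or k > len(seq):
        raise ValueError("k must be between 1 and len(seq)")
    # Valid start positions run from 0 to len(seq) - k inclusive,
    # which gives len(seq) - k + 1 windows in total.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


windows = sliding_windows("ACGTACGT", 4)
print(len(windows))  # 5
print(windows)       # ['ACGT', 'CGTA', 'GTAC', 'TACG', 'ACGT']
```

The first and last windows are identical because the input repeats with period 4; the scan keeps one entry per position and does not deduplicate by content.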

Extract at specific positions

Supply positions to extract windows only at the chosen start sites. Positions are 0-based, as the output below shows: position 3 yields TAC.

import poolparty as pp
pp.init()
pool    = pp.from_seq("ACGTACGT")
submers = pool.subseq_scan(subseq_length=3, positions=[0, 3, 5],
                           mode="sequential")
submers.print_library()
submers: seq_length=3, num_states=3
ACG
TAC
CGT
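
To see how explicit start positions map to the three subsequences above, here is a small plain-Python sketch. windows_at is a hypothetical helper, and its bounds check is an assumption of this sketch; how poolparty itself handles out-of-range positions is not stated here.

```python
def windows_at(seq: str, k: int, positions: list[int]) -> list[str]:
    """Extract a length-k window at each 0-based start position.

    Hypothetical helper for illustration only; not a poolparty function.
    """
    result = []
    for start in positions:
        # Assumption of this sketch: reject windows that would run past
        # either end of the sequence instead of silently truncating them.
        if start < 0 or start + k > len(seq):
            raise ValueError(f"window at {start} does not fit in the sequence")
        result.append(seq[start:start + k])
    return result


print(windows_at("ACGTACGT", 3, [0, 3, 5]))  # ['ACG', 'TAC', 'CGT']
```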

Tile within a named region

Restrict the scan to a tagged region; only bases inside the region are considered.

import poolparty as pp
pp.init()
pool    = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
submers = pool.subseq_scan(subseq_length=4, region="cre",
                           mode="sequential")
submers.print_library()
submers: seq_length=4, num_states=5
ATCG
TCGA
CGAT
GATC
ATCG
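
The effect of restricting the scan to a region can be sketched in plain Python. This example assumes the <name>...</name> tag syntax shown in the input string above and parses it with a regular expression; region_windows is a hypothetical helper, and poolparty's own region handling may work differently.

```python
import re


def region_windows(tagged: str, region: str, k: int) -> list[str]:
    """Return every length-k window inside the named tagged region.

    Hypothetical helper for illustration only; not a poolparty function.
    """
    name = re.escape(region)
    # Assumption of this sketch: the region is marked as <name>...</name>.
    match = re.search(rf"<{name}>(.*?)</{name}>", tagged)
    if match is None:
        raise ValueError(f"region {region!r} not found")
    inner = match.group(1)
    # Flanking bases outside the tags never enter a window.
    return [inner[i:i + k] for i in range(len(inner) - k + 1)]


print(region_windows("AAAA<cre>ATCGATCG</cre>TTTT", "cre", 4))
# ['ATCG', 'TCGA', 'CGAT', 'GATC', 'ATCG']
```

The flanking AAAA and TTTT bases do not appear in any window, which matches the five subsequences listed above.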

Random subsequence sampling (mode="random")

mode='random' draws window positions stochastically. Use num_states to control how many subsequences are sampled.

import poolparty as pp
pp.init()
pool    = pp.from_seq("ACGTACGTACGT")
submers = pool.subseq_scan(subseq_length=4, mode="random", num_states=5)
submers.print_library()
submers: seq_length=4, num_states=5
ACGT
GTAC
ACGT
CGTA
CGTA
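
Random position sampling can be illustrated with the standard library. This sketch assumes each draw picks one start position uniformly and independently (so repeats are possible); the sampling scheme poolparty actually uses, and how it is seeded, are not specified here. sample_windows and its seed argument are hypothetical.

```python
import random


def sample_windows(seq: str, k: int, num_draws: int, seed=None) -> list[str]:
    """Draw num_draws length-k windows at random start positions.

    Hypothetical helper for illustration only; not a poolparty function.
    """
    rng = random.Random(seed)  # local generator, so the seed is reproducible
    last_start = len(seq) - k
    # Assumption of this sketch: uniform, independent draws with replacement.
    starts = [rng.randint(0, last_start) for _ in range(num_draws)]
    return [seq[s:s + k] for s in starts]


draws = sample_windows("ACGTACGTACGT", 4, num_draws=5, seed=0)
print(draws)
```

Because the input repeats with period 4, different start positions can yield the same subsequence, so duplicate entries in a sampled library do not necessarily mean the same position was drawn twice.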

See subseq_scan().