subseq_scan
Slide a window across the sequence and extract the subsequence at each
position. Unlike deletion_scan (which removes the window) or
replacement_scan (which replaces it), subseq_scan returns only the
window content, producing a pool of short subsequences that tile across the
input.
import poolparty as pp
pp.init()
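The contrast between the three scans can be illustrated for a single window position in plain Python. This is only a sketch of the idea and does not call poolparty: `window_views` is a hypothetical helper, and the `N` filler stands in for whatever replacement_scan would actually insert, which is not specified here.

```python
# Plain-Python illustration; this helper is hypothetical and not part of poolparty.
def window_views(seq: str, start: int, length: int, filler: str = "N"):
    """Return (deleted, replaced, extracted) for one window of `seq`."""
    end = start + length
    deleted = seq[:start] + seq[end:]                      # window removed
    replaced = seq[:start] + filler * length + seq[end:]   # window overwritten (assumed filler)
    extracted = seq[start:end]                             # window content only
    return deleted, replaced, extracted


deleted, replaced, extracted = window_views("ACGTACGT", start=2, length=4)
print(deleted)    # ACGT
print(replaced)   # ACNNNNGT
print(extracted)  # GTAC
```

Only the third value corresponds to what subseq_scan keeps: the window itself, without the flanking sequence.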
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| | | (required) | Input pool or sequence string. |
| | | (required) | Length of the subsequence window to extract at each position. |
| | | | Explicit window start positions. |
| | | | Restrict the scan to a named region or |
| | | | Prefix for the operation node name in the pool graph. |
| | | | |
| | | | Override the automatically-computed state count. |
| | | | Enumeration order when combined with other pools. |
| | | | Design card columns to include in library output. Available keys: |
Note
Only the most commonly used parameters are shown above. For the full
parameter list, see subseq_scan() in the
API Reference.
Examples
Extract all 4-mers from an 8-mer
A window of length 4 over an 8-base sequence yields 8 - 4 + 1 = 5 subsequences.
pool = pp.from_seq("ACGTACGT")
submers = pool.subseq_scan(subseq_length=4, mode="sequential")
submers.print_library()
ACGT
CGTA
GTAC
TACG
ACGT
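The window count follows from the number of valid start positions, len(seq) - k + 1. A minimal plain-Python sketch of that tiling is given here for reference; `tile_windows` is a hypothetical helper used for illustration and is not how poolparty is implemented.

```python
def tile_windows(seq: str, k: int) -> list[str]:
    """All length-k windows of seq, left to right (hypothetical helper)."""
    if k <= 0 or k > len(seq):
        return []  # no valid window fits
    # Valid start positions run from 0 to len(seq) - k inclusive.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


windows = tile_windows("ACGTACGT", 4)
print(len(windows))  # 5
print(windows)       # ['ACGT', 'CGTA', 'GTAC', 'TACG', 'ACGT']
```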
Extract at specific positions
Supply positions to extract from chosen sites only.
pool = pp.from_seq("ACGTACGT")
submers = pool.subseq_scan(subseq_length=3, positions=[0, 3, 5],
                           mode="sequential")
submers.print_library()
ACG
TAC
CGT
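Extraction at explicit positions amounts to one slice per start index. The sketch below shows this in plain Python; `windows_at` is a hypothetical helper, and raising an error for a window that runs past the end of the sequence is an assumption, since how poolparty handles out-of-range positions is not stated here.

```python
def windows_at(seq: str, k: int, positions: list[int]) -> list[str]:
    """Length-k windows of seq starting at each given position (hypothetical helper)."""
    out = []
    for p in positions:
        # Assumed behaviour: reject windows that do not fit inside the sequence.
        if p < 0 or p + k > len(seq):
            raise ValueError(f"window at position {p} does not fit in the sequence")
        out.append(seq[p:p + k])
    return out


print(windows_at("ACGTACGT", 3, [0, 3, 5]))  # ['ACG', 'TAC', 'CGT']
```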
Tile within a named region
Restrict the scan to a tagged region; only bases inside the region are considered.
pool = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
submers = pool.subseq_scan(subseq_length=4, region="cre",
                           mode="sequential")
submers.print_library()
ATCG
TCGA
CGAT
GATC
ATCG
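To make the region restriction concrete, the following plain-Python sketch pulls out the tagged span and tiles only inside it. The regular-expression parsing is an illustrative stand-in for the tag syntax shown in the example, not poolparty's own parser, and `tile_region` is a hypothetical helper.

```python
import re


def tile_region(tagged: str, name: str, k: int) -> list[str]:
    """Tile length-k windows inside <name>...</name> only (hypothetical helper)."""
    pattern = rf"<{re.escape(name)}>(.*?)</{re.escape(name)}>"
    match = re.search(pattern, tagged)
    if match is None:
        raise ValueError(f"region {name!r} not found")
    inner = match.group(1)  # bases between the opening and closing tags
    return [inner[i:i + k] for i in range(len(inner) - k + 1)]


print(tile_region("AAAA<cre>ATCGATCG</cre>TTTT", "cre", 4))
# ['ATCG', 'TCGA', 'CGAT', 'GATC', 'ATCG']
```

The flanking AAAA and TTTT never appear in any window because the tiling runs over the 8 bases inside the region, giving 8 - 4 + 1 = 5 windows.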
Random subsequence sampling (mode="random")
mode='random' draws window positions stochastically. Use num_states
to control how many subsequences are sampled.
pool = pp.from_seq("ACGTACGTACGT")
submers = pool.subseq_scan(subseq_length=4, mode="random", num_states=5)
submers.print_library()
GTAC
ACGT
CGTA
CGTA
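Random window sampling can be sketched in plain Python as drawing start positions uniformly and slicing at each one. This is an illustration under assumptions: `sample_windows` is a hypothetical helper, it samples with replacement, and it takes an explicit seed. Whether poolparty samples with or without replacement, and how it is seeded, is not stated here.

```python
import random


def sample_windows(seq: str, k: int, num_states: int, seed: int = 0) -> list[str]:
    """Draw num_states random length-k windows from seq (hypothetical helper)."""
    rng = random.Random(seed)       # local generator so results are reproducible
    last_start = len(seq) - k       # largest valid start position
    # Assumption: start positions are drawn uniformly, with replacement.
    starts = [rng.randint(0, last_start) for _ in range(num_states)]
    return [seq[s:s + k] for s in starts]


subs = sample_windows("ACGTACGTACGT", 4, num_states=5)
print(len(subs))  # 5
```

Because the input is periodic, the same subsequence can appear more than once even when the start positions differ: CGTA, for example, occurs at positions 1 and 5.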
See subseq_scan().