deletion_scan

Slide a fixed-length deletion window across the sequence (or a named region) and, at each window position, delete the bases it covers. By default, deleted positions are filled with the gap character '-' so that all output sequences keep the input length and remain alignment-compatible. Pass deletion_marker=None to remove the bases outright and produce shorter sequences instead.

import poolparty as pp
pp.init()
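
The windowing logic described above can be sketched in plain Python. This is only an illustration of the behaviour, not poolparty's implementation: the function name deletion_scan_sketch is hypothetical, and it works on bare strings rather than Pool objects.

```python
def deletion_scan_sketch(seq, deletion_length, deletion_marker="-"):
    """Illustrative only: slide a deletion window over a plain string."""
    if not 1 <= deletion_length <= len(seq):
        raise ValueError("deletion_length must be between 1 and len(seq)")
    variants = []
    # Valid window starts run from 0 to len(seq) - deletion_length inclusive,
    # which gives len(seq) - deletion_length + 1 variants.
    for start in range(len(seq) - deletion_length + 1):
        end = start + deletion_length
        # Fill the window with the marker, or drop the bases when it is None.
        fill = "" if deletion_marker is None else deletion_marker * deletion_length
        variants.append(seq[:start] + fill + seq[end:])
    return variants


marked = deletion_scan_sketch("ACGTACGT", 2)
removed = deletion_scan_sketch("ACGTACGT", 2, deletion_marker=None)
print(marked[0], removed[0])  # --GTACGT GTACGT
```

With a marker, every variant keeps the input length; with deletion_marker=None, every variant is deletion_length bases shorter.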

Parameters

Parameter        Type              Default     Description
pool             Pool | str        (required)  The Pool to scan. Can also be a plain sequence string.
deletion_length  int               (required)  Width of the deletion window in bases. A sequence of length L produces L - deletion_length + 1 variants.
deletion_marker  str | None        '-'         Character used to fill deleted positions. Pass None to remove deleted bases entirely (output sequences are shorter than the input).
region           str | None        None        Name of a tagged region to restrict the scan to. Flanking sequences are never modified.
positions        list[int] | None  None        Explicit list of window start positions. None = all valid positions.
mode             str               'random'    'sequential' iterates left-to-right; 'random' shuffles.
num_states       int | None        None        Fix the total number of output states.
style            str | None        None        Named display style applied to the deletion marker. Only takes effect when deletion_marker is not None.
iter_order       int | None        None        Dimension-name ordering for downstream multi-pool iteration.
prefix           str | None        None        Prefix for auto-generated sequence names.


Note

Only the most commonly used parameters are shown above. For the full parameter list, see deletion_scan() in the API Reference.

Examples

Single-base deletion with default marker

Delete one base at each of the 8 positions in an 8-mer; deleted positions are marked with -.

wt   = pp.from_seq("ACGTACGT")
dels = wt.deletion_scan(deletion_length=1, mode="sequential", style="grey")
dels.print_library()
dels: seq_length=8, num_states=8
-CGTACGT
A-GTACGT
AC-TACGT
ACG-ACGT
ACGT-CGT
ACGTA-GT
ACGTAC-T
ACGTACG-

True deletion (deletion_marker=None)

deletion_marker=None removes the bases entirely; output sequences are shorter than the input.

wt   = pp.from_seq("ACGTACGT")
dels = wt.deletion_scan(deletion_length=2, deletion_marker=None, mode="sequential")
dels.print_library()
dels: seq_length=6, num_states=7
GTACGT
ATACGT
ACACGT
ACGCGT
ACGTGT
ACGTAT
ACGTAC

2-base window deletion

Delete two consecutive bases at each position. An 8-mer yields 7 variants.

wt   = pp.from_seq("ACGTACGT")
dels = wt.deletion_scan(deletion_length=2, mode="sequential", style="grey")
dels.print_library()
dels: seq_length=8, num_states=7
--GTACGT
A--TACGT
AC--ACGT
ACG--CGT
ACGT--GT
ACGTA--T
ACGTAC--

Deletion scan within a named region

Restrict the scan to the cre region; the AAAA and TTTT flanks are always returned unchanged.

wt   = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
dels = wt.deletion_scan(deletion_length=2, region="cre", mode="sequential",
                        style="grey")
dels.print_library()
dels: seq_length=16, num_states=7
AAAA<cre>--CGATCG</cre>TTTT
AAAA<cre>A--GATCG</cre>TTTT
AAAA<cre>AT--ATCG</cre>TTTT
AAAA<cre>ATC--TCG</cre>TTTT
AAAA<cre>ATCG--CG</cre>TTTT
AAAA<cre>ATCGA--G</cre>TTTT
AAAA<cre>ATCGAT--</cre>TTTT
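
The region-restricted behaviour can be approximated with a small standalone sketch. It assumes the inline <name>...</name> tag syntax shown in the example and uses a regular expression to locate the region; region_deletion_scan_sketch is a hypothetical helper, not part of poolparty.

```python
import re


def region_deletion_scan_sketch(tagged_seq, region, deletion_length, marker="-"):
    """Illustrative only: scan inside <region>...</region>, leaving flanks intact."""
    name = re.escape(region)
    match = re.search(rf"<{name}>(.*?)</{name}>", tagged_seq)
    if match is None:
        raise ValueError(f"region {region!r} not found")
    inner = match.group(1)
    if not 1 <= deletion_length <= len(inner):
        raise ValueError("deletion_length must fit inside the region")
    prefix = tagged_seq[:match.start(1)]  # left flank plus the opening tag
    suffix = tagged_seq[match.end(1):]    # closing tag plus the right flank
    variants = []
    for start in range(len(inner) - deletion_length + 1):
        end = start + deletion_length
        edited = inner[:start] + marker * deletion_length + inner[end:]
        variants.append(prefix + edited + suffix)
    return variants


for variant in region_deletion_scan_sketch("AAAA<cre>ATCGATCG</cre>TTTT", "cre", 2):
    print(variant)
```

Because the window only moves over the 8 bases inside the tags, the variant count depends on the region length (8 - 2 + 1 = 7), not on the full 16-base sequence.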

Scan only specific positions

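Supply an explicit positions list to delete at chosen sites only. As the output shows, positions are 0-based window starts: positions=[1, 3, 5] deletes the 2nd, 4th, and 6th bases.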

wt   = pp.from_seq("ACGTACGT")
dels = wt.deletion_scan(deletion_length=1, positions=[1, 3, 5], mode="sequential",
                        style="grey")
dels.print_library()
dels: seq_length=8, num_states=3
A-GTACGT
ACG-ACGT
ACGTA-GT
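
A plain-Python sketch of position-restricted deletion, treating each entry as a 0-based window start as the output above suggests. The helper name is hypothetical, and rejecting out-of-range starts with an error is an assumption; the library may handle them differently.

```python
def deletion_at_positions_sketch(seq, deletion_length, positions=None, marker="-"):
    """Illustrative only: delete at the given 0-based window starts."""
    last_start = len(seq) - deletion_length
    if positions is None:
        # None means every valid window start, scanned left to right.
        positions = range(last_start + 1)
    variants = []
    for start in positions:
        # Assumption: a window that would run past the sequence is an error.
        if not 0 <= start <= last_start:
            raise ValueError(f"window start {start} does not fit in the sequence")
        end = start + deletion_length
        variants.append(seq[:start] + marker * deletion_length + seq[end:])
    return variants


print(deletion_at_positions_sketch("ACGTACGT", 1, positions=[1, 3, 5]))
# ['A-GTACGT', 'ACG-ACGT', 'ACGTA-GT']
```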

Random deletion positions (mode="random")

mode='random' draws deletion positions stochastically. Here a 3-base deletion window samples 5 random positions along a 12-mer. The same position can be drawn more than once, as the repeated rows in the output show.

wt   = pp.from_seq("ACGTACGTACGT")
dels = wt.deletion_scan(deletion_length=3, mode="random", num_states=5,
                        style="grey")
dels.print_library()
dels: seq_length=12, num_states=5
ACGTA---ACGT
ACGTAC---CGT
ACGTA---ACGT
A---ACGTACGT
A---ACGTACGT
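
Random mode can be imitated with the standard library's random module. The sketch below assumes window starts are drawn independently with replacement, which is consistent with the repeated rows in the output above but is not confirmed by the library documentation. The function name and the seed parameter are hypothetical and exist only to make the illustration reproducible.

```python
import random


def random_deletion_scan_sketch(seq, deletion_length, num_states, seed=None, marker="-"):
    """Illustrative only: draw num_states window starts at random."""
    n_starts = len(seq) - deletion_length + 1
    if n_starts < 1:
        raise ValueError("deletion_length must not exceed len(seq)")
    rng = random.Random(seed)  # local generator, so global state is untouched
    # Assumption: sampling with replacement, so starts may repeat.
    starts = rng.choices(range(n_starts), k=num_states)
    return [
        seq[:s] + marker * deletion_length + seq[s + deletion_length:]
        for s in starts
    ]


for variant in random_deletion_scan_sketch("ACGTACGTACGT", 3, num_states=5, seed=0):
    print(variant)
```

Each printed line is 12 characters long and contains one contiguous run of three markers; which positions appear depends on the seed.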

See deletion_scan().