clear_gaps
Remove all gap and non-molecular characters (-, ., spaces, and any
other characters outside the DNA alphabet) from sequences. XML region tags are
preserved intact; only characters between tags are filtered. Because the output
length varies with the number of gaps removed, the resulting pool does not
carry a fixed seq_length.
import poolparty as pp
pp.init()
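The filtering rule described above can be sketched in plain Python, without poolparty. This is an illustration only, not the library's implementation. The helper name `strip_gaps_sketch` is hypothetical, and two assumptions are made: the DNA alphabet is taken to be A, C, G, T in either case, and region tags are matched with a simple `<...>` pattern.

```python
import re

# Assumption: the "DNA alphabet" is A, C, G, T in upper or lower case.
DNA_CHARS = set("ACGTacgt")

# Assumption: a region tag is any <...> span, e.g. <name>, </name>, <name/>.
TAG_PATTERN = re.compile(r"(<[^<>]+>)")


def strip_gaps_sketch(seq: str) -> str:
    """Drop non-DNA characters between tags, leaving the tags untouched.

    Hypothetical helper for illustration; not part of poolparty.
    """
    # The capture group makes re.split keep the tags in the result list.
    pieces = TAG_PATTERN.split(seq)
    cleaned = []
    for piece in pieces:
        if TAG_PATTERN.fullmatch(piece):
            cleaned.append(piece)  # tag: copy verbatim
        else:
            # sequence text: keep only characters in the assumed alphabet
            cleaned.append("".join(ch for ch in piece if ch in DNA_CHARS))
    return "".join(cleaned)


examples = ["AT--CG", "A.T C-G-", "AT<region>C--G</region>--AT"]
for raw in examples:
    out = strip_gaps_sketch(raw)
    print(f"{raw!r} -> {out!r} (length {len(raw)} -> {len(out)})")
```

The printed lengths differ from input to input, which is why a pool produced this way cannot promise a single fixed sequence length.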
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| | | (required) | The Pool (or plain sequence string) to clear gaps from. |
| | | | Restrict gap removal to a named region or |
| | | | When |
| | | | Dimension-name ordering for downstream multi-pool iteration. |
| | | | Prefix for auto-generated sequence names. |
Note
Only the most commonly used parameters are shown above. For the full
parameter list, see clear_gaps() in the
API Reference.
Examples
Remove gap markers from a deletion_scan result
A deletion_scan replaces deleted bases with - markers. Pipe the result
through clear_gaps to produce gapless sequences of varying length.
wt = pp.from_seq("ATCGATCG")
dels = pp.deletion_scan(wt, deletion_length=2, mode="sequential")
clean = pp.clear_gaps(dels)
clean.print_library()
AGATCG
ATATCG
ATCTCG
ATCGCG
ATCGAG
ATCGAT
Clear gaps from a manually gapped sequence
Strip dash characters from a sequence that was constructed with explicit alignment gaps.
wt = pp.from_seq("AT--CG--AT")
clean = pp.clear_gaps(wt)
clean.print_library()
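As a rough cross-check that does not depend on poolparty, the same stripping can be approximated with `str.translate`. This sketch assumes only the three gap characters named in the description (`-`, `.`, and space) are present; it is a stand-in for the behavior, not the library call.

```python
# Translation table that deletes '-', '.', and ' ' (plain-Python stand-in).
GAP_CHARS = str.maketrans("", "", "-. ")

gapped = "AT--CG--AT"
gapless = gapped.translate(GAP_CHARS)

print(gapless)                    # ATCGAT
print(len(gapped), len(gapless))  # 10 6
```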
Chain clear_gaps with another operation
Remove gaps first, then apply rc to produce gapless reverse-complement
sequences ready for downstream analysis.
wt = pp.from_seq("AT--CG")
clean = pp.clear_gaps(wt)
rev = pp.rc(clean)
rev.print_library()
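To reason about what this chain should yield, the two steps can be mimicked in plain Python. The helpers `remove_gaps` and `reverse_complement` below are hypothetical stand-ins, not poolparty functions, and they assume an A/C/G/T alphabet with `-`, `.`, and space as the only gap characters.

```python
# Plain-Python stand-ins for the two steps; not poolparty functions.
GAP_CHARS = str.maketrans("", "", "-. ")
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def remove_gaps(seq: str) -> str:
    """Delete '-', '.', and ' ' from the sequence."""
    return seq.translate(GAP_CHARS)


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


gapped = "AT--CG"
gapless = remove_gaps(gapped)
print(gapless)                      # ATCG
print(reverse_complement(gapless))  # CGAT
```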
For the full parameter list, see clear_gaps() in the API Reference.