clear_annotation

Strip all XML region tags and non-molecular characters from sequences, then uppercase the result. This produces clean, undecorated sequences suitable for export (e.g., writing to FASTA) or for operations that do not accept tagged input. When region= is specified, only the content inside that region is cleaned; if the region is a marker name, its enclosing tags are kept or stripped from the output according to the remove_tags setting.
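To make the transformation concrete, here is a minimal pure-Python sketch of the cleaning rule described above. It is not the library's implementation: the helper name strip_annotation is hypothetical, and the sketch assumes that "molecular" characters are ASCII letters and that tags look like <name> and </name>.

```python
import re

# Assumption: tags are simple XML-style markers such as <cre> or </cre>.
TAG_PATTERN = re.compile(r"</?[A-Za-z_][\w.-]*\s*/?>")
# Assumption: only ASCII letters count as molecular characters.
NON_MOLECULAR = re.compile(r"[^A-Za-z]")


def strip_annotation(seq: str) -> str:
    """Remove XML-style tags and non-letter characters, then uppercase."""
    without_tags = TAG_PATTERN.sub("", seq)
    return NON_MOLECULAR.sub("", without_tags).upper()


print(strip_annotation("AAAA<cre>atcg</cre>TT-TT"))  # AAAAATCGTTTT
```

The tags are removed first so that letters inside a tag name (such as "cre") never leak into the cleaned sequence.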

import poolparty as pp
pp.init()

Parameters

Parameter	Type	Default	Description
`pool`	`Pool \| str`	(required)	Parent pool or sequence to transform.
`region`	`str \| list \| None`	`None`	Region to transform: marker name, `[start, stop]`, or `None` for the full sequence.
`remove_tags`	`bool \| None`	`None`	If `True` and `region` is a marker name, strip marker tags from the output.
`iter_order`	`float \| None`	`None`	Iteration order priority for the operation.
`prefix`	`str \| None`	`None`	Prefix for sequence names in the resulting pool.

Note

Only the most commonly used parameters are shown above. For the full parameter list, see clear_annotation() in the API Reference.
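The interaction between region and remove_tags can be pictured with a small pure-Python sketch. It only mirrors the behavior described on this page and does not call poolparty; clean_region and _clean are hypothetical helpers, the cleaning rule (drop tags and non-letters, uppercase) is an assumption, and the default of remove_tags=True is chosen for the sketch rather than taken from the library.

```python
import re

ANY_TAG = re.compile(r"</?[A-Za-z_][\w.-]*\s*/?>")


def _clean(text: str) -> str:
    """Drop tags and non-letters, then uppercase (assumed cleaning rule)."""
    return re.sub(r"[^A-Za-z]", "", ANY_TAG.sub("", text)).upper()


def clean_region(seq: str, region: str, remove_tags: bool = True) -> str:
    """Clean only the content of <region>...</region>; leave the rest as-is."""
    name = re.escape(region)
    pattern = re.compile(rf"<{name}>(.*?)</{name}>", re.DOTALL)

    def replace(match: re.Match) -> str:
        inner = _clean(match.group(1))
        # Keep or drop the enclosing marker tags, depending on remove_tags.
        return inner if remove_tags else f"<{region}>{inner}</{region}>"

    return pattern.sub(replace, seq)


seq = "aaaa<cre>at<sub>c</sub>g</cre>tttt"
print(clean_region(seq, "cre", remove_tags=True))   # aaaaATCGtttt
print(clean_region(seq, "cre", remove_tags=False))  # aaaa<cre>ATCG</cre>tttt
```

In this sketch the flanking sequence is left untouched, which is why it stays lowercase in both outputs.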

Examples

Strip tags from a region-tagged sequence

Remove the cre open/close tags and uppercase every base, producing a plain sequence with no markup.

wt    = pp.from_seq("AAAA<cre>ATCG</cre>TTTT")
plain = pp.clear_annotation(wt)
plain.print_library()

plain: seq_length=None, num_states=1 AAAAATCGTTTT

Strip tags from the result of a replacement_scan

The output of a replacement_scan within a tagged region carries nested region tags. Pipe it through clear_annotation to yield bare sequences ready for counting or export.

wt   = pp.from_seq("AAAA<cre>ATCG</cre>TTTT")
alt  = pp.from_seqs(["A", "C", "G", "T"], mode="sequential")
scan = wt.replacement_scan(replacement_pool=alt, region="cre",
                           mode="sequential")
bare = pp.clear_annotation(scan)
bare.print_library()

bare: seq_length=None, num_states=16 AAAAATCGTTTT
AAAAAACGTTTT
AAAAATAGTTTT
AAAAATCATTTT
AAAACTCGTTTT
... (16 total)
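The count of 16 follows from 4 replacement bases times 4 positions in the cre region. The pure-Python sketch below reproduces that arithmetic without calling poolparty; the loop order (replacement base outermost, position innermost) is inferred from the listing above and is an assumption about the library, not a documented guarantee.

```python
from itertools import product

prefix, region, suffix = "AAAA", "ATCG", "TTTT"
replacements = ["A", "C", "G", "T"]

variants = []
# Outer loop over replacement bases, inner loop over positions
# (ordering inferred from the printed library, not from library source).
for base, pos in product(replacements, range(len(region))):
    mutated = region[:pos] + base + region[pos + 1:]
    variants.append(prefix + mutated + suffix)

print(len(variants))       # 16
print(len(set(variants)))  # 13
```

Only 13 of the 16 sequences are distinct, because replacing a base with itself regenerates the wild type at each of the four positions. This matters if the bare sequences are later counted or deduplicated.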

Clear annotation before saving to FASTA

Chain clear_annotation after mutagenize so that downstream steps, such as writing to FASTA, see plain uppercase sequences with no markup characters.

wt      = pp.from_seq("AAAA<cre>ATCG</cre>TTTT")
mutants = pp.mutagenize(wt, num_mutations=1, region="cre", mode="random")
clean   = pp.clear_annotation(mutants)
clean.print_library()

clean: seq_length=None, num_states=1 AAAAATGGTTTT
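Once sequences are clean, writing them to FASTA is straightforward. The sketch below uses only the Python standard library; write_fasta is a hypothetical helper, and the records list is a hand-written stand-in for cleaned output, since how names and sequences are retrieved from a pool is not covered on this page.

```python
import io
from typing import Iterable, TextIO, Tuple


def write_fasta(records: Iterable[Tuple[str, str]], handle: TextIO,
                width: int = 60) -> None:
    """Write (name, sequence) pairs as FASTA, wrapping lines at `width`."""
    for name, seq in records:
        handle.write(f">{name}\n")
        for start in range(0, len(seq), width):
            handle.write(seq[start:start + width] + "\n")


# Hypothetical stand-in for cleaned sequences (names are made up).
records = [("clean_0", "AAAAATGGTTTT")]
buffer = io.StringIO()
write_fasta(records, buffer)
print(buffer.getvalue())
```

Passing an open file object instead of the StringIO buffer writes the same text to disk.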

See clear_annotation().