annotate_orf

Register a region as an OrfRegion so that downstream operations such as mutagenize_orf, stylize_orf, and translate can look up the reading frame automatically. If the region does not yet exist, extent= sets its boundaries. Optionally apply codon-based styling at the same time via style_codons= or style_frames=.

import poolparty as pp
pp.init()

Parameters

Parameter     Type                    Default     Description
------------  ----------------------  ----------  -----------
pool          Pool                    (required)  The Pool to annotate.
region_name   str                     (required)  Name for the ORF region.
extent        tuple[int, int] | None  None        (start, stop) half-open interval defining the region boundaries. If None, the region must already exist as a tagged region in the sequence.
frame         int                     1           Reading frame (1, 2, or 3). Determines which codon grid is used by downstream ORF operations.
style         str | None              None        A single display style applied uniformly to the ORF region.
style_codons  list[str] | None        None        List of style names cycled across whole codons within the ORF.
style_frames  list[str] | None        None        List of style names (length a multiple of 3) applied per base position within each codon.
iter_order    float | None            None        Dimension-name ordering for downstream multi-pool iteration.
prefix        str | None              None        Prefix for the operation node name in the pool graph.


Note

Only the most commonly used parameters are shown above. For the full parameter list, see annotate_orf() in the API Reference.
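To make the frame parameter concrete, the following sketch splits a region into codons for each reading frame. It is plain Python and does not call poolparty. The helper codon_grid is hypothetical, and mapping frame 1, 2, 3 to offsets 0, 1, 2 from the region start is an assumption about the convention, not something the table above confirms.

```python
def codon_grid(region_seq, frame=1):
    """Split region_seq into complete codons for a 1-based reading frame.

    Assumption: frame 1 starts at the first base of the region, frame 2 at
    the second, frame 3 at the third. Incomplete trailing codons are dropped.
    """
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2, or 3")
    offset = frame - 1
    last_start = len(region_seq) - 2  # a codon needs three bases
    return [region_seq[i:i + 3] for i in range(offset, last_start, 3)]


frame1 = codon_grid("ATGAAATTTGGGCCC", frame=1)
frame2 = codon_grid("ATGAAATTTGGGCCC", frame=2)
print(frame1)
print(frame2)
```

Under this assumption, changing frame shifts every codon boundary, which is why downstream ORF operations need the frame recorded alongside the region.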

Examples

Annotate a pre-tagged ORF

When the region already exists as XML tags in the sequence, just pass the region name. annotate_orf registers it as an OrfRegion (with reading frame) without changing the sequence.

wt  = pp.from_seq("<gene>ATGAAATTTGGGCCC</gene>")
orf = pp.annotate_orf(wt, "gene")
orf.print_library()
orf: seq_length=15, num_states=1
<gene>ATGAAATTTGGGCCC</gene>
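As an illustration of what "already exists as XML tags" means in terms of coordinates, this sketch recovers a tagged region's bounds from the tagged string with a regular expression. It is not how poolparty parses tags internally; find_tagged_region is a hypothetical helper written only for this example.

```python
import re


def find_tagged_region(tagged_seq, name):
    """Return (start, stop, content) for <name>...</name> in tagged_seq.

    Coordinates are 0-based, half-open, and count bases only (tags excluded).
    """
    tag = re.escape(name)
    match = re.search(rf"<{tag}>(.*?)</{tag}>", tagged_seq)
    if match is None:
        raise ValueError(f"no region named {name!r} in sequence")
    # Strip any other tags so that only bases are counted.
    prefix = re.sub(r"<[^>]+>", "", tagged_seq[:match.start()])
    content = re.sub(r"<[^>]+>", "", match.group(1))
    start = len(prefix)
    return start, start + len(content), content


region = find_tagged_region("<gene>ATGAAATTTGGGCCC</gene>", "gene")
print(region)
```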

Define region boundaries with extent

Use extent=(start, stop) to tag the 0-based, half-open interval [4, 19), that is, positions 4 through 18, as the ORF, without needing XML tags in the original sequence.

seq = pp.from_seq("TATAATGAAATTTGGGCCCTAA")
orf = pp.annotate_orf(seq, "gene", extent=(4, 19))
orf.print_library()
orf: seq_length=22, num_states=1
TATA<gene>ATGAAATTTGGGCCC</gene>TAA
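The half-open convention matches ordinary Python slicing, which the short check below demonstrates. It uses only built-in string operations and rebuilds the tagged string by hand for comparison with the output above. It does not call poolparty.

```python
seq = "TATAATGAAATTTGGGCCCTAA"
start, stop = 4, 19  # same values as extent=(4, 19)

# Half-open: index 4 is included, index 19 is excluded.
region = seq[start:stop]

# Rebuild the tagged form by hand to compare with the printed library.
tagged = seq[:start] + "<gene>" + region + "</gene>" + seq[stop:]

print(region)
print(tagged)
```

The region is 15 bases long (stop minus start), which is a whole number of codons.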

Style the ORF with codon colouring (style_codons)

Pass style_codons= to apply alternating codon colours at annotation time, making the reading frame immediately visible.

seq    = pp.from_seq("TATAATGAAATTTGGGCCCTAA")
styled = pp.annotate_orf(seq, "gene", extent=(4, 19),
                         style_codons=["blue", "red"])
styled.print_library()
styled: seq_length=22, num_states=1
TATA<gene>ATGAAATTTGGGCCC</gene>TAA
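Since the colours do not survive in plain-text output, the following sketch shows which style each codon or base would receive if styles are cycled as the parameter descriptions suggest. It is a plain-Python illustration, not poolparty's rendering code. style_by_codon and style_by_frame are hypothetical helpers, and "green" is a placeholder style name that may not be valid in the library.

```python
from itertools import cycle


def style_by_codon(orf_seq, styles):
    """Pair each whole codon with a style, cycling through styles."""
    codons = [orf_seq[i:i + 3] for i in range(0, len(orf_seq) - 2, 3)]
    return list(zip(codons, cycle(styles)))


def style_by_frame(orf_seq, styles):
    """Pair each base with a style chosen by its position in the cycle."""
    if len(styles) % 3 != 0:
        raise ValueError("styles must have a length that is a multiple of 3")
    return [(base, styles[i % len(styles)]) for i, base in enumerate(orf_seq)]


per_codon = style_by_codon("ATGAAATTTGGGCCC", ["blue", "red"])
per_base = style_by_frame("ATGAAA", ["blue", "red", "green"])
print(per_codon)
print(per_base)
```

With two codon styles the colours alternate codon by codon; with a three-entry per-base list each codon position (first, second, third base) always gets the same style.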

Chain with mutagenize_orf

After annotating the ORF, mutagenize_orf can look up the frame automatically. Here every single-codon missense variant is enumerated with the mutated codon highlighted in red.

seq  = pp.from_seq("TATAATGAAATTTGGGCCCTAA")
orf  = pp.annotate_orf(seq, "gene", extent=(4, 19))
muts = pp.mutagenize_orf(orf, region="gene", num_mutations=1,
                         style="red", mode="sequential")
muts.print_library()
muts: seq_length=22, num_states=95
TATA<gene>TTCAAATTTGGGCCC</gene>TAA
TATA<gene>CTGAAATTTGGGCCC</gene>TAA
TATA<gene>ATCAAATTTGGGCCC</gene>TAA
TATA<gene>GTGAAATTTGGGCCC</gene>TAA
TATA<gene>AGCAAATTTGGGCCC</gene>TAA
... (95 total)
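One way to account for the 95 states is to count, for each of the 5 codons, the 19 amino acids that differ from the wild type (stops excluded). The sketch below does that count with the standard genetic code. It is an independent sanity check in plain Python, assuming one variant per alternative amino acid, and it does not reproduce how mutagenize_orf selects codons.

```python
from itertools import product

# Standard genetic code in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_missense_variants(orf_seq):
    """Count single-codon substitutions that change the amino acid.

    Assumption: one variant per alternative amino acid, stop codons excluded.
    """
    non_stop = set(AMINO_ACIDS) - {"*"}
    total = 0
    for i in range(0, len(orf_seq) - 2, 3):
        wild_type = CODON_TABLE[orf_seq[i:i + 3]]
        total += len(non_stop - {wild_type})
    return total


n_variants = count_missense_variants("ATGAAATTTGGGCCC")
print(n_variants)
```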

See annotate_orf().