API Reference

This page provides complete API documentation for all public classes and functions in PoolParty, automatically generated from source code docstrings.

Core Classes

Pool

The main class for building and manipulating sequence pools.

class poolparty.Pool(operation, name=None, state=None, iter_order=None, regions=None)[source]

Bases: CommonOpsMixin, ScanOpsMixin, GenericFixedOpsMixin, StateOpsMixin, RegionOpsMixin

Base pool class - a node in the computation DAG.

Pool provides generic operations that work on any sequence type. For DNA-specific operations, use DnaPool. For protein-specific operations, use ProteinPool.

__init__(operation, name=None, state=None, iter_order=None, regions=None)[source]

Initialize Pool and build its state.

property iter_order: Real

Iteration order for this pool.

property name: str

Name of this pool.

property num_states: int

Number of states for this pool.

property parents: list

Get parent pools from the operation.

property seq_length: int | None

Sequence length (None if variable).

property regions: set[Region]

Set of Region objects present in this pool’s sequences.

has_region(name)[source]

Check if a region with the given name is present in this pool.

Return type:

bool

add_region(region)[source]

Add a region to this pool’s region set.

Return type:

None

__add__(other)[source]

Stack two pools (union of states via sum_counters).

Return type:

Self

__mul__(n)[source]

Repeat this pool n times (repeat states).

Return type:

Self

__rmul__(n)[source]

Repeat this pool n times (repeat states).

Return type:

Self

__getitem__(key)[source]

Slice this pool’s states (not sequences).

Return type:

Self

named(name)[source]

Set the name of this pool, return self for chaining.

Return type:

Self

copy(name=None)[source]

Create a copy of this pool with a copied operation.

The copied operation references the same parent_pools, so the copy represents a parallel branch in the computation graph that shares the same upstream DAG.

Must be called within an active Party context.

Parameters:

name (Optional[str]) – Optional name for the copied pool. If None, uses self.name + ‘.copy’ as the default.

Return type:

Self

Returns:

A new Pool backed by a copied Operation.

deepcopy(name=None)[source]

Create a deep copy of this pool, recursively copying the entire upstream DAG.

Unlike copy(), this creates independent copies of all upstream pools and operations, resulting in a fully independent computation DAG.

Must be called within an active Party context.

Parameters:

name (Optional[str]) – Optional name for the copied pool. If None, uses self.name + ‘.deepcopy’ as the default.

Return type:

Self

Returns:

A new Pool backed by a recursively copied Operation.
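
The difference between copy() and deepcopy() can be illustrated with a small stand-alone sketch. It does not use PoolParty; the Node class and its attributes are hypothetical stand-ins for a pool or operation in the DAG, and a real implementation would likely memoize shared ancestors.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    """Hypothetical stand-in for a pool/operation node in a DAG."""

    name: str
    parents: list[Node] = field(default_factory=list)

    def copy(self, name: str | None = None) -> Node:
        # Shallow: the new node references the very same parent objects.
        return Node(name or self.name + ".copy", list(self.parents))

    def deepcopy(self, name: str | None = None) -> Node:
        # Deep: every upstream node is copied recursively.
        new_parents = [p.deepcopy() for p in self.parents]
        return Node(name or self.name + ".deepcopy", new_parents)


root = Node("root")
child = Node("child", [root])

shallow = child.copy()
deep = child.deepcopy()

print(shallow.parents[0] is root)  # True: shares the upstream DAG
print(deep.parents[0] is root)     # False: independent upstream copy
```

In this sketch, the shallow copy is a parallel branch on the same upstream graph, while the deep copy shares no nodes with the original.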

generate_library(num_cycles=1, num_seqs=None, seed=None, init_state=None, seqs_only=False, _include_inline_styles=False, discard_null_seqs=False, max_iterations=None, min_acceptance_rate=None, attempts_per_rate_assessment=100)[source]

Generate sequences from a pool.

Returns:

A DataFrame with name, seq, plus any requested design card columns, or a list of sequences if seqs_only=True. Entries are None for null rows when discard_null_seqs=False.

Return type:

Union[DataFrame, list[str | None]]

Note

Design card columns are opt-in via the cards parameter on individual operations. Default output contains only ‘name’ and ‘seq’ columns.

print_library(num_seqs=None, num_cycles=None, show_header=True, show_state=False, show_name=True, show_seq=True, pad_names=True, seed=None, discard_null_seqs=False, max_iterations=None, min_acceptance_rate=None, attempts_per_rate_assessment=100)[source]

Print preview sequences from this pool; returns self for chaining.

Parameters:
  • num_seqs (Optional[Integral]) – Number of sequences to generate.

  • num_cycles (Optional[Integral]) – Number of complete iterations through all states.

  • show_header (bool) – Whether to show the pool header line.

  • show_state (bool) – Whether to show the state column. Requires the pool to have been built with design cards that produce a state column; silently ignored otherwise.

  • show_name (bool) – Whether to show the name column.

  • show_seq (bool) – Whether to show the seq column.

  • pad_names (bool) – Whether to pad names to align sequences.

  • seed (Optional[Integral]) – Random seed for reproducibility.

  • discard_null_seqs (bool) – If True, only show valid (non-null) sequences.

  • max_iterations (Optional[int]) – Maximum iterations before stopping.

  • min_acceptance_rate (Optional[float]) – Minimum fraction of sequences that must pass.

  • attempts_per_rate_assessment (int) – Iterations between acceptance rate checks.

Return type:

Self

print_dag(style='clean', show_pools=True)[source]

Print the ASCII tree visualization rooted at this pool.

Return type:

Self

Party

Context manager for PoolParty sessions.

class poolparty.Party(genetic_code='standard')[source]

Bases: object

Context manager for building and executing sequence libraries.

__init__(genetic_code='standard')[source]

property state_manager: Manager

Access the statetracker Manager for debugging state iteration.

property codon_table: CodonTable

Access the CodonTable for ORF operations.

property suppress_styles: bool

Return True if inline styles are suppressed.

property suppress_cards: bool

Return True if design cards are suppressed.

set_genetic_code(genetic_code)[source]

Set or change the genetic code used for ORF operations.

Return type:

None

get_effective_seq_length(seq)[source]

Get effective sequence length (DNA characters only, excluding markers).

Return type:

int

get_length_without_tags(seq)[source]

Get sequence length excluding only region tags (includes all chars).

Return type:

int

get_molecular_positions(seq)[source]

Get raw string positions of valid DNA characters, excluding marker interiors.

Return type:

list[int]

__enter__()[source]

Enter the Party context, saving any previous active party.

Return type:

Party

__exit__(exc_type, exc_val, exc_tb)[source]

Exit the Party context, restoring the previous party.

Return type:

None
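
The save-and-restore behavior described for __enter__ and __exit__ is a common pattern for nestable contexts. The sketch below shows the pattern in isolation; Session, get_active, and the module-level _active slot are hypothetical names, not PoolParty internals.

```python
_active = None  # module-level slot holding the currently active context


class Session:
    """Hypothetical stand-in for a context object such as Party."""

    def __init__(self, label):
        self.label = label
        self._previous = None

    def __enter__(self):
        global _active
        self._previous = _active  # remember whichever context was active before
        _active = self
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        global _active
        _active = self._previous  # restore the outer context
        return None               # do not swallow exceptions


def get_active():
    """Return the active Session, or None outside any context."""
    return _active


with Session("outer") as outer:
    with Session("inner") as inner:
        assert get_active() is inner
    assert get_active() is outer  # the inner exit restored the outer context
assert get_active() is None
```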

get_pool_by_id(id_)[source]

Get a pool by its ID.

Return type:

Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool]

get_pool_by_name(name)[source]

Get a pool by its name.

Return type:

Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool]

get_op_by_id(id_)[source]

Get an operation by its ID.

Return type:

poolparty.operation.Operation

get_op_by_name(name)[source]

Get an operation by its name.

Return type:

poolparty.operation.Operation

register_region(name, seq_length)[source]

Register a region with this party.

If a region with the same name already exists:

  • If it has the same seq_length, return the existing region.

  • If it has a different seq_length, raise ValueError.

Parameters:
  • name (str) – The region name.

  • seq_length (Optional[int]) – The expected content length (None for variable, 0 for zero-length).

Returns:

The registered region (existing or newly created).

Return type:

Region

Raises:

ValueError – If a region with the same name but different seq_length exists.
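
The registration rules above amount to an idempotent lookup keyed by name. A minimal sketch of that logic follows, assuming a plain dictionary as the registry; RegistrySketch and RegionRecord are hypothetical names used only for illustration.

```python
from collections import namedtuple

# Hypothetical record type; the real Region class carries more behavior.
RegionRecord = namedtuple("RegionRecord", ["name", "seq_length"])


class RegistrySketch:
    def __init__(self):
        self._regions = {}

    def register_region(self, name, seq_length):
        existing = self._regions.get(name)
        if existing is None:
            # First registration under this name: create and store it.
            region = RegionRecord(name, seq_length)
            self._regions[name] = region
            return region
        if existing.seq_length == seq_length:
            # Same name and same length: hand back the existing object.
            return existing
        raise ValueError(
            f"region {name!r} already registered with seq_length="
            f"{existing.seq_length}, not {seq_length}"
        )


registry = RegistrySketch()
first = registry.register_region("barcode", 10)
again = registry.register_region("barcode", 10)
print(first is again)  # True

try:
    registry.register_region("barcode", 12)
except ValueError as err:
    print("rejected:", err)
```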

register_orf_region(name, seq_length, frame=1)[source]

Register an ORF region with this party.

If a region with the same name already exists:

  • If it’s an OrfRegion with same seq_length and frame, return it.

  • Otherwise raise ValueError.

Parameters:
  • name (str) – The region name.

  • seq_length (Optional[int]) – The expected content length (None for variable, 0 for zero-length).

  • frame (int) – Reading frame (+1, +2, +3, -1, -2, -3). Default +1.

Returns:

The registered ORF region.

Return type:

OrfRegion

upgrade_to_orf_region(name, frame=1)[source]

Upgrade an existing plain Region to an OrfRegion.

Only valid if the existing region is a plain Region (not already an OrfRegion).

Parameters:
  • name (str) – The name of the existing region to upgrade.

  • frame (int) – Reading frame for the ORF (+1, +2, +3, -1, -2, -3). Default +1.

Returns:

The upgraded ORF region.

Return type:

OrfRegion

Raises:

ValueError – If region doesn’t exist or is already an OrfRegion.

get_region_by_id(id_)[source]

Get a region by its ID.

Return type:

Region

get_region_by_name(name)[source]

Get a region by its name.

Return type:

Region

get_region(name)[source]

Get a registered region by name. Alias for get_region_by_name.

Return type:

Region

has_region(name)[source]

Check if a region with the given name is registered.

Return type:

bool

clear_pools()[source]

Clear all pools, operations, and regions without resetting configuration or genetic code.

Unlike init(), this preserves:

  • Configuration settings (_config)

  • Genetic code settings (_codon_table)

Return type:

None

output(pool, name=None)[source]

Mark a pool as an output of this library.

Return type:

None

print_graph(style='clean')[source]

Print an ASCII tree visualization of the Pool-Operation computation graph.

Shows pools (places) with parentheses and operations (transitions) with brackets, similar to a Petri net diagram. Root pools (not consumed by other operations) are printed first, with their upstream DAGs.

Parameters:

style (str) –

Display style - 'clean' (default), 'minimal', or 'repr'.

  • 'clean': Shows names with key attributes (e.g., (name) pool: n=num_states, [name] op: factory_name, mode, n=num_states).

  • 'minimal': Shows just names (e.g., (name), [name]).

  • 'repr': Shows full repr() of each object.

Return type:

None

Operation

Abstract base class for all pool operations.

class poolparty.Operation(parent_pools, num_states=1, mode='fixed', seq_length=None, name=None, iter_order=None, prefix=None, region=None, remove_tags=None, _natural_num_states=None, cards=None)[source]

Bases: object

Base class for all operations.

design_card_keys: Sequence[str] = []

max_num_sequential_states: int = 1000000

factory_name: str = 'op'

classmethod validate_num_states(num_states, mode)[source]

Validate num_states against max_num_sequential_states.

Return type:

int | float

__init__(parent_pools, num_states=1, mode='fixed', seq_length=None, name=None, iter_order=None, prefix=None, region=None, remove_tags=None, _natural_num_states=None, cards=None)[source]

Initialize Operation.

property iter_order: Real

Iteration order for this operation.

property seq_length: int | None

Sequence length produced by this operation (None if variable).

property natural_num_states: int | None

Natural number of states (computed from operation, before user override).

property action_uniquely_determined_by_state: bool

True if same state value always produces the same output.

property has_cards: bool

True if this operation has any cards requested.

property uses_custom_column_names: bool

True if this operation uses dict-style custom column names.

property id: int

Unique ID for this operation.

property name: str

Name of this operation.

build_pool_counter(parent_pools)[source]

Build the output Pool’s state from parent pool states.

Return type:

State

compute(parents, rng=None)[source]

Compute output Seq and design card with automatic region handling.

This is the public entry point for operations. It handles region extraction/reassembly automatically, then delegates to _compute_core().

Parameters:
  • parents (list[Seq]) – Input Seq objects from parent pools.

  • rng (Generator | None) – Random number generator (for random mode operations).

Return type:

tuple[Seq, dict]

Returns:

Output Seq (with string and style) and design card dict.

If region is specified, compute():

  1. Extracts region from parents[0] as a Seq.

  2. Calls _compute_core with modified parent list.

  3. Reassembles prefix + result + suffix using Seq.join.

  4. Removes region tags if remove_tags=True and region is a region name.
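
The extract, compute, and reassemble steps can be sketched on plain strings. This is an illustration under stated assumptions, not the library's code: it assumes regions are delimited by XML-style tags such as <name>…</name>, and apply_to_region is a hypothetical helper that ignores styles and design cards.

```python
import re


def apply_to_region(seq, region, core, remove_tags=False):
    """Apply `core` to the content of <region>...</region> inside `seq`."""
    tag = re.escape(region)
    match = re.search(rf"<{tag}>(.*?)</{tag}>", seq, re.DOTALL)
    if match is None:
        raise ValueError(f"region {region!r} not found")
    # Step 1: split the sequence into prefix, region content, and suffix.
    prefix, suffix = seq[: match.start()], seq[match.end():]
    # Step 2: run the core transformation on the region content only.
    result = core(match.group(1))
    # Step 4 (optional): keep the tags unless the caller asked to drop them.
    if not remove_tags:
        result = f"<{region}>{result}</{region}>"
    # Step 3: reassemble prefix + result + suffix.
    return prefix + result + suffix


print(apply_to_region("AA<bc>acgt</bc>TT", "bc", str.upper))
# AA<bc>ACGT</bc>TT
print(apply_to_region("AA<bc>acgt</bc>TT", "bc", str.upper, remove_tags=True))
# AAACGTTT
```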

compute_name_contributions(global_state=None, max_global_state=None)[source]

Compute this operation’s contributions to the final sequence name.

Returns list of name elements in the order they should appear. Default: [prefix_state.value] when active, [] otherwise. For stateless random operations, uses global_state if provided.

Parameters:
  • global_state (Optional[int]) – The global row index, used for stateless random operations.

  • max_global_state (Optional[int]) – The maximum global state that will be used, for zero-padding.

Returns:

List of name elements, or empty list if no contribution.

Return type:

list[str]

copy(name=None)[source]

Create a copy of this operation with a new ID.

The copy references the same parent_pools but has its own Counter. Must be called within an active Party context.

Parameters:

name (Optional[str]) – Optional name for the copied operation. If None, uses self.name + ‘.copy’ as the default.

Return type:

Operation

Returns:

A new Operation of the same type with the same parameters.

deepcopy(name=None)[source]

Create a deep copy of this operation, recursively copying all parent pools.

Unlike copy(), this creates independent copies of all upstream pools, resulting in a fully independent computation DAG.

Must be called within an active Party context.

Parameters:

name (Optional[str]) – Optional name for the copied operation. If None, uses self.name + ‘.deepcopy’ as the default.

Return type:

Operation

Returns:

A new Operation with recursively copied parent pools.

print_dag(style='clean')[source]

Print the ASCII tree visualization rooted at this operation.

Parameters:

style (str) – Display style - ‘clean’ (default), ‘minimal’, or ‘repr’.

Return type:

None

Region

Represents a tagged region within a sequence.

class poolparty.Region(name, seq_length, _id=-1)[source]

Bases: object

Represents a registered region in a poolparty Party.

Regions identify sections of sequences for later modification. Each region has a name and a seq_length that specifies the expected length of content within the region tags.

name

The region name (used in XML tags like <name>…</name>).

Type:

str

seq_length

The expected length of content within the region:

  • None: Variable-length region (content length not fixed).

  • 0: Zero-length region (insertion point, <name/>).

  • >0: Fixed-length region (content must be this length).

Type:

Optional[int]

_id

Unique identifier assigned by the Party upon registration.

Type:

int

name: str

seq_length: int | None

__post_init__()[source]

Validate region attributes.

property is_variable_length: bool

True if this region has variable length (seq_length is None).

property is_zero_length: bool

True if this region is a zero-length insertion point.

__hash__()[source]

Hash based on name (regions with same name should be the same).

__eq__(other)[source]

Equality based on name.

__init__(name, seq_length, _id=-1)
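
The name-based hashing and equality described above can be sketched with a dataclass. RegionSketch is a hypothetical stand-in; in particular, the negative-length check in __post_init__ is an assumption about what "validate region attributes" might involve.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # eq=False so the hand-written __eq__/__hash__ are kept
class RegionSketch:
    name: str
    seq_length: Optional[int]
    _id: int = -1

    def __post_init__(self):
        # Assumed validation: a length is either None or a non-negative int.
        if self.seq_length is not None and self.seq_length < 0:
            raise ValueError("seq_length must be None or >= 0")

    @property
    def is_variable_length(self) -> bool:
        return self.seq_length is None

    @property
    def is_zero_length(self) -> bool:
        return self.seq_length == 0

    def __hash__(self) -> int:
        return hash(self.name)

    def __eq__(self, other) -> bool:
        return isinstance(other, RegionSketch) and self.name == other.name


a = RegionSketch("barcode", 10)
b = RegionSketch("barcode", None)
print(a == b)                                  # True: equality uses the name only
print(len({a, b}))                             # 1: both hash to the same set entry
print(RegionSketch("site", 0).is_zero_length)  # True
```

Because identity depends only on the name, a set of such objects holds at most one entry per region name.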

Initialization Functions

poolparty.init(genetic_code='standard', log_level=None)[source]

Initialize (or reset) the default Party, clearing all registered pools/operations/regions.

Parameters:
  • genetic_code (Union[str, dict]) – Genetic code to use for ORF operations.

  • log_level (Optional[str]) – If provided, configure logging at this level (“DEBUG”, “INFO”, “WARNING”, “ERROR”).

Return type:

Party

poolparty.get_active_party()[source]

Get the currently active Party context, or None if not in a context.

Return type:

Optional[Party]

poolparty.clear_pools()[source]

Clear all pools, operations, and regions from the active Party without resetting configuration or genetic code.

Return type:

None

poolparty.configure_logging(level='WARNING', format='%(levelname)s - %(name)s - %(message)s', handler=None)[source]

Configure logging for poolparty and statetracker.

Parameters:
  • level (str) – Logging level (“DEBUG”, “INFO”, “WARNING”, “ERROR”, “CRITICAL”).

  • format (str) – Log message format string.

  • handler (Optional[Handler]) – Custom handler (defaults to StreamHandler if None).

Return type:

None
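
A function with this signature can be built on the standard logging module. The sketch below is one plausible shape rather than PoolParty's implementation; the logger names "poolparty" and "statetracker" and the logger_names parameter are assumptions.

```python
import logging


def configure_logging_sketch(
    level="WARNING",
    format="%(levelname)s - %(name)s - %(message)s",
    handler=None,
    logger_names=("poolparty", "statetracker"),  # assumed logger names
):
    # Fall back to a stream handler when the caller supplies none.
    handler = handler if handler is not None else logging.StreamHandler()
    handler.setFormatter(logging.Formatter(format))
    for name in logger_names:
        logger = logging.getLogger(name)
        logger.setLevel(getattr(logging, level.upper()))
        logger.addHandler(handler)


configure_logging_sketch(level="DEBUG")
logging.getLogger("poolparty").debug("logging configured")
```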

poolparty.toggle_styles(on=True)[source]

Toggle inline styling on/off for the active Party.

When off (on=False), Seq.style will be None to avoid style overhead. When on (on=True), normal style tracking is restored.

Return type:

None

poolparty.toggle_cards(on=True)[source]

Toggle design card computation on/off for the active Party.

When off (on=False), operations skip building design card data and columns. Inline styles are unaffected (controlled by toggle_styles).

Return type:

None

Base Operations

Functions for creating and transforming sequence pools.

Sequence Creation

poolparty.from_seq(seq, pool=None, region=None, remove_tags=None, style=None, iter_order=None, prefix=None, _factory_name=None)[source]

Create a Pool containing a single, fixed sequence.

If pool and region are provided, the sequence replaces the region content in pool. Otherwise, creates a standalone pool with the sequence.

Parameters:
  • seq (str) – The sequence to include in the pool (or to insert at region).

  • pool (Union[Pool, str, None]) – Pool or sequence. If provided with region, seq replaces the region.

  • region (str | Sequence[Integral] | None) – Region to replace in pool. Can be marker name (str) or [start, stop].

  • remove_tags (Optional[bool]) – If True and region is a marker name, remove marker tags from output.

  • style (Optional[str]) – Style to apply to the sequence (e.g., ‘red’, ‘blue bold’).

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • prefix (Optional[str]) – Prefix for auto-generated sequence names.

Returns:

A Pool object yielding the provided sequence (or pool with the region replaced).

Return type:

DnaPool

poolparty.from_seqs(seqs, pool=None, region=None, style=None, seq_names=None, prefix=None, mode='random', num_states=None, iter_order=None, cards=None, _factory_name=None)[source]

Create a Pool containing the specified sequences.

Parameters:
  • seqs (Sequence[str]) – Sequence of string sequences to include in the pool.

  • pool (Union[Pool, str, None]) – Background pool or sequence. If provided with region, selected sequence replaces the region content.

  • region (str | Sequence[Integral] | None) – Region to replace in pool. Can be a marker name or [start, stop] interval. Required if pool is provided.

  • seq_names (Optional[Sequence[str]]) – Explicit names for each sequence. If provided, these are used directly.

  • prefix (Optional[str]) – Prefix for auto-generated names (e.g., ‘seq_’ produces ‘seq_0’, ‘seq_1’, …). Cannot be used together with seq_names.

  • mode (Literal['random', 'sequential', 'fixed']) – Sequence selection mode: ‘sequential’ or ‘random’.

  • num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).

  • style (Optional[str]) – Style to apply to output sequences (e.g., ‘red’, ‘blue bold’).

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'seq_name', 'seq_index'.

Returns:

A Pool object yielding the provided sequences using the specified selection mode.

Return type:

DnaPool

Raises:
  • TypeError – If seqs is a bare string instead of a list of strings.

  • ValueError – If pool is provided without region.
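
One reading of "cycling if greater, clipping if less" for num_states in sequential mode is modulo indexing over the supplied sequences. The sketch below illustrates that reading together with prefix-based naming; sequential_states and auto_names are hypothetical helpers, and the library's actual behavior may differ in detail.

```python
def sequential_states(items, num_states=None):
    """Return the item visited at each state index in sequential mode."""
    natural = len(items)  # the "computed count" when num_states is not given
    n = natural if num_states is None else num_states
    return [items[i % natural] for i in range(n)]


def auto_names(count, prefix):
    """Names such as seq_0, seq_1, ... for a prefix of 'seq_'."""
    return [f"{prefix}{i}" for i in range(count)]


seqs = ["AAA", "CCC", "GGG"]
print(sequential_states(seqs))     # ['AAA', 'CCC', 'GGG']
print(sequential_states(seqs, 5))  # ['AAA', 'CCC', 'GGG', 'AAA', 'CCC']
print(sequential_states(seqs, 2))  # ['AAA', 'CCC']
print(auto_names(3, "seq_"))       # ['seq_0', 'seq_1', 'seq_2']
```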

poolparty.from_fasta(fasta_path, coordinates, pool=None, region=None, remove_tags=None, iter_order=None, prefix=None, style=None, cards=None)[source]

Extract genomic region(s) from a FASTA file and create a Pool.

Parameters:
  • fasta_path (str) – Path to the FASTA file (will be indexed with pyfaidx).

  • coordinates (Union[tuple[str, int, int, Literal['+', '-']], Sequence[tuple[str, int, int, Literal['+', '-']]]]) – Single coordinate as (chrom, start, stop, strand) or list of such tuples. Coordinates are 0-based [start, stop). If strand=’-’, sequence is reverse complemented. For circular genomes, start > stop indicates wrap-around.

  • pool (Union[Pool, str, None]) – Background pool or sequence. If provided with region, extracted sequence(s) replace the region content.

  • region (str | Sequence[Integral] | None) – Region to replace in pool. Can be a marker name or [start, stop] interval. Required if pool is provided.

  • remove_tags (Optional[bool]) – If True and region is a marker name, remove marker tags from the output. Only relevant in single-coordinate mode (has no effect in batch mode).

  • iter_order (Optional[Real]) – Iteration order priority for the Operation (batch mode only).

  • prefix (Optional[str]) – Prefix for sequence names. Names are “{prefix}_{chrom}:{start}-{stop}({strand})” or “{chrom}:{start}-{stop}({strand})” if no prefix.

  • style (Optional[str]) – Style to apply to extracted sequences (e.g., ‘red’, ‘blue bold’).

  • cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys (batch mode only): 'seq_name', 'seq_index'. Ignored in single-coordinate mode.

Returns:

A Pool yielding the extracted genomic sequence(s).

Return type:

DnaPool
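
The coordinate conventions above (0-based half-open intervals, reverse complement on the minus strand, wrap-around when start > stop, and the name format) can be checked on an in-memory sequence. This sketch uses a dictionary in place of an indexed FASTA file; extract and reverse_complement are hypothetical helpers.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def extract(genome, chrom, start, stop, strand, prefix=None):
    """Return (name, sequence) for the 0-based half-open interval [start, stop)."""
    chrom_seq = genome[chrom]
    if start <= stop:
        seq = chrom_seq[start:stop]
    else:
        # start > stop: wrap around the origin of a circular sequence.
        seq = chrom_seq[start:] + chrom_seq[:stop]
    if strand == "-":
        seq = reverse_complement(seq)
    name = f"{chrom}:{start}-{stop}({strand})"
    if prefix:
        name = f"{prefix}_{name}"
    return name, seq


genome = {"chr1": "AACCGGTTAC"}  # stand-in for a FASTA file on disk
print(extract(genome, "chr1", 2, 6, "+"))           # ('chr1:2-6(+)', 'CCGG')
print(extract(genome, "chr1", 0, 4, "-", "probe"))  # ('probe_chr1:0-4(-)', 'GGTT')
print(extract(genome, "chr1", 8, 2, "+"))           # ('chr1:8-2(+)', 'ACAA')
```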

poolparty.from_iupac(iupac_seq, pool=None, region=None, prefix=None, mode='random', num_states=None, iter_order=None, style=None, cards=None)[source]

Create a Pool that generates DNA sequences from IUPAC notation.

Parameters:
  • iupac_seq (str) – IUPAC sequence string (e.g., ‘RN’ for purine + any base). Valid characters: A, C, G, T, U, R, Y, S, W, K, M, B, D, H, V, N.

  • pool (Union[Pool, str, None]) – Background pool or sequence. If provided with region, generated sequence replaces the region content.

  • region (str | Sequence[Integral] | None) – Region to replace in pool. Can be a marker name or [start, stop] interval. Required if pool is provided.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • mode (Literal['random', 'sequential', 'fixed']) – Sequence selection mode: ‘sequential’ or ‘random’.

  • num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • style (Optional[str]) – Style to apply to generated sequences (e.g., ‘red’, ‘blue bold’).

  • cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'iupac_state'.

Returns:

A Pool yielding DNA sequences from the IUPAC pattern.

Return type:

DnaPool

Raises:

ValueError – If pool is provided without region.
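
To make the IUPAC notation concrete, the sketch below expands a pattern into every sequence it matches using the standard ambiguity codes. It is independent of PoolParty; expand_iupac is a hypothetical helper, and mapping U to T is an assumption about how RNA input would be treated.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "U": "T",  # assumption: treat uracil as thymine
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_iupac(iupac_seq):
    """List every concrete DNA sequence matched by an IUPAC pattern."""
    choices = [IUPAC[ch] for ch in iupac_seq.upper()]
    return ["".join(p) for p in product(*choices)]


seqs = expand_iupac("RN")
print(len(seqs))  # 8
print(seqs)       # ['AA', 'AC', 'AG', 'AT', 'GA', 'GC', 'GG', 'GT']
```

The number of sequences is the product of the per-position choice counts, here 2 × 4 = 8.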

poolparty.from_motif(prob_df, pool=None, region=None, prefix=None, mode='random', num_states=None, iter_order=None, style=None, cards=None)[source]

Create a Pool that samples sequences from a position probability matrix.

Parameters:
  • prob_df (DataFrame) – DataFrame with probability values for each position. Columns should be alphabet characters (e.g., ‘A’, ‘C’, ‘G’, ‘T’). Rows represent positions. Values are probabilities (auto-normalized).

  • pool (Union[Pool, str, None]) – Background pool or sequence. If provided with region, generated sequence replaces the region content.

  • region (str | Sequence[Integral] | None) – Region to replace in pool. Can be a marker name or [start, stop] interval. Required if pool is provided.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • mode (Literal['random', 'sequential', 'fixed']) – Sequence selection mode: ‘random’.

  • num_states (Optional[Integral]) – Number of states for random mode. If None, defaults to 1 (pure random sampling).

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • style (Optional[str]) – Style to apply to generated sequences (e.g., ‘red’, ‘blue bold’).

  • cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'prob_state'.

Returns:

A Pool yielding sequences sampled from the probability matrix.

Return type:

DnaPool

Raises:

ValueError – If pool is provided without region.
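
Sampling from a position probability matrix means drawing one character per row according to that row's weights. The sketch below uses a list of dictionaries in place of the DataFrame so that it needs only the standard library; sample_from_ppm is a hypothetical helper.

```python
import random


def sample_from_ppm(rows, rng):
    """Draw one sequence; each row maps characters to non-negative weights."""
    chars = []
    for weights in rows:
        alphabet = list(weights)
        # random.choices accepts unnormalized weights, so rows need not sum to 1.
        pick = rng.choices(alphabet, weights=[weights[c] for c in alphabet])[0]
        chars.append(pick)
    return "".join(chars)


ppm = [
    {"A": 1, "C": 0, "G": 0, "T": 0},  # position 0: always A
    {"A": 0, "C": 2, "G": 2, "T": 0},  # position 1: C or G with equal weight
    {"A": 0, "C": 0, "G": 0, "T": 5},  # position 2: always T
]
rng = random.Random(0)  # seeded for reproducibility
samples = [sample_from_ppm(ppm, rng) for _ in range(5)]
print(samples)
```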

poolparty.get_kmers(length, pool=None, region=None, style=None, case='upper', prefix=None, mode='random', num_states=None, iter_order=None, cards=None)[source]

Create a Pool that generates DNA k-mers (all possible sequences of length k).

Must be called within a Party context.

Parameters:
  • pool (Union[Pool, str, None]) – Pool or sequence. If provided with region, generated k-mer replaces the region content.

  • region (str | Sequence[Integral] | None) – Region to replace in pool. Can be a marker name or [start, stop] interval. Required if pool is provided.

  • length (Integral) – Length of k-mers to generate.

  • case (Literal['lower', 'upper']) – Case of output k-mers: ‘upper’ for uppercase, ‘lower’ for lowercase.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • mode (Literal['random', 'sequential', 'fixed']) – Sequence selection mode: ‘sequential’ or ‘random’.

  • num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).

  • style (Optional[str]) – Style to apply to generated k-mers (e.g., ‘red’, ‘blue bold’).

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'kmer_index', 'kmer'.

Returns:

A Pool whose states yield DNA k-mers of the specified length.

Return type:

DnaPool
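
There are 4^k DNA k-mers of length k, so a k-mer can be recovered from an integer index by reading the index as a base-4 number. The sketch below shows that mapping; kmer_from_index is a hypothetical helper, and the lexicographic A, C, G, T ordering is an assumption about how a kmer_index might relate to a kmer.

```python
def kmer_from_index(index, length, alphabet="ACGT"):
    """Decode `index` as a base-len(alphabet) number with `length` digits."""
    base = len(alphabet)
    if not 0 <= index < base ** length:
        raise IndexError(f"index {index} out of range for length {length}")
    chars = []
    for _ in range(length):
        index, digit = divmod(index, base)  # peel off the least significant digit
        chars.append(alphabet[digit])
    return "".join(reversed(chars))  # most significant digit first


print(4 ** 3)                  # 64 possible 3-mers
print(kmer_from_index(0, 3))   # AAA
print(kmer_from_index(6, 3))   # ACG
print(kmer_from_index(63, 3))  # TTT
```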

Sequence Transformation

poolparty.mutagenize(pool, region=None, num_mutations=None, mutation_rate=None, allowed_chars=None, style=None, prefix=None, mode='random', num_states=None, iter_order=None, _remove_tags=False, cards=None, _factory_name='mutagenize')[source]

Create a Pool that applies mutations to a sequence.

Parameters:
  • pool (Union[Pool, str]) – Parent pool or sequence string to mutate.

  • region (str | Sequence[Integral] | None) – Region to mutagenize. Can be a marker name (str), explicit interval [start, stop], or None to mutagenize entire sequence. Positions are region-relative.

  • num_mutations (Optional[Integral]) – Fixed number of mutations to apply (mutually exclusive with mutation_rate).

  • mutation_rate (Optional[Real]) – Probability of mutation at each position (mutually exclusive with num_mutations).

  • allowed_chars (Optional[str]) – IUPAC string of same length as sequence, specifying allowed bases at each position. Each character is an IUPAC code (A, C, G, T, R, Y, S, W, K, M, B, D, H, V, N). Positions where only the wild-type is allowed are treated as non-mutable.

  • style (Optional[str]) – Style to apply to mutated positions (e.g., ‘red’, ‘blue bold’).

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • mode (Literal['random', 'sequential', 'fixed']) – Selection mode: ‘random’ or ‘sequential’. Sequential only available with num_mutations.

  • num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'positions', 'wt_chars', 'mut_chars'.

Returns:

A Pool that generates mutated sequences.

Return type:

Pool
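
As an illustration of the num_mutations path in random mode, the sketch below applies a fixed number of substitutions and records the information named by the 'positions', 'wt_chars', and 'mut_chars' card keys. It is a stand-alone approximation; mutate is a hypothetical helper that ignores regions, allowed_chars, and styling.

```python
import random


def mutate(seq, num_mutations, rng, alphabet="ACGT"):
    """Substitute exactly `num_mutations` distinct positions of `seq`."""
    positions = sorted(rng.sample(range(len(seq)), num_mutations))
    chars = list(seq)
    wt_chars, mut_chars = [], []
    for pos in positions:
        wt = chars[pos]
        # Choose a replacement that differs from the wild-type character.
        mut = rng.choice([c for c in alphabet if c != wt])
        chars[pos] = mut
        wt_chars.append(wt)
        mut_chars.append(mut)
    card = {"positions": positions, "wt_chars": wt_chars, "mut_chars": mut_chars}
    return "".join(chars), card


original = "ACGTACGT"
mutated, card = mutate(original, 2, random.Random(1))
print(mutated)
print(card)
```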

poolparty.shuffle_seq(pool, region=None, shuffle_type='mono', prefix=None, mode='random', num_states=None, iter_order=None, _remove_tags=False, style=None, cards=None, _factory_name=None)[source]

Create a Pool that shuffles characters within a specified region.

Parameters:
  • pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – Parent pool or sequence to shuffle.

  • region (str | Sequence[Integral] | None) – Region to shuffle. Can be a marker name (str), explicit interval [start, stop], or None to shuffle entire sequence.

  • shuffle_type (Literal['mono', 'dinuc']) –

    Type of shuffle to perform:

    • "mono": random permutation preserving mononucleotide composition.

    • "dinuc": Euler-path shuffle preserving dinucleotide frequencies. The first and last characters are always fixed (mathematical constraint of the Euler path algorithm).

  • mode (Literal['random', 'sequential', 'fixed']) – Shuffle mode: ‘random’. Sequential is not supported.

  • num_states (Optional[Integral]) – Number of states for random mode. If None, defaults to 1 (pure random sampling).

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • style (Optional[str]) – Style to apply to shuffled characters (e.g., ‘red’, ‘blue bold’).

  • cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'permutation'.

Returns:

A Pool that yields shuffled sequences.

Return type:

Pool
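
The "mono" shuffle is a random permutation of the characters, which leaves the mononucleotide composition unchanged. A minimal sketch follows, returning the permutation alongside the result; mono_shuffle is a hypothetical helper, and the Euler-path "dinuc" shuffle is not covered here.

```python
import random
from collections import Counter


def mono_shuffle(seq, rng):
    """Return (shuffled sequence, permutation of original indices)."""
    perm = list(range(len(seq)))
    rng.shuffle(perm)  # in-place random permutation of the index list
    return "".join(seq[i] for i in perm), perm


original = "AACCGGTT"
shuffled, perm = mono_shuffle(original, random.Random(42))
print(shuffled, perm)
print(Counter(shuffled) == Counter(original))  # True: composition is preserved
```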

poolparty.recombine(pool=None, region=None, sources=(), num_breakpoints=1, positions=None, mode='random', num_states=None, prefix=None, styles=None, style_by='order', iter_order=None, cards=None, _factory_name='recombine')[source]

Create a Pool that recombines segments from multiple source pools at breakpoints.

Parameters:
  • pool (Union[Pool, str, None]) – Parent pool for region-based recombination. If provided with region, the recombined sequences replace the region content.

  • region (str | Sequence[Integral] | None) – Region in pool where recombined sequences will be inserted. Region content is discarded (not used as a source pool).

  • sources (Sequence[Union[Pool, str]]) – Source pools for recombination. All must have the same seq_length.

  • num_breakpoints (Integral) – Number of recombination breakpoints. Must be <= seq_length - 1.

  • positions (Optional[Sequence[Integral]]) – Valid breakpoint positions. If None, defaults to range(seq_length - 1). Position i means “breakpoint after index i”.

  • mode (Literal['random', 'sequential', 'fixed']) – Selection mode: ‘random’ (random breakpoints and pool assignments) or ‘sequential’ (enumerate all combinations).

  • num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • styles (Optional[list[str]]) –

    List of styles to apply to segments. Both modes accept any non-empty list and cycle. Use empty string ‘’ for segments that shouldn’t have additional styling. Styles overlay on top of inherited source pool styles.

    • If style_by=’order’: cycles through styles for segments by position (e.g., with 2 styles and 5 segments: style[0], style[1], style[0], style[1], style[0]).

    • If style_by=’source’: cycles through styles based on source pool index (e.g., with 2 styles and 3 sources: source[0]->style[0], source[1]->style[1], source[2]->style[0]).

  • style_by (Literal['source', 'order']) –

    Determines how styles are assigned to segments:

    • 'order': styles[i % len(styles)] applied to segment i (cycles by position).

    • 'source': styles[j % len(styles)] applied to segments from sources[j] (cycles by source index).

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'breakpoints', 'pool_assignments'.

Returns:

A Pool that generates recombined sequences.

Return type:

Pool
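
The breakpoint convention ("position i means breakpoint after index i") can be made concrete with a string-only sketch. recombine_at is a hypothetical helper that takes explicit breakpoints and source assignments, the two quantities named by the 'breakpoints' and 'pool_assignments' card keys; it does no random selection or styling.

```python
def recombine_at(sources, breakpoints, assignments):
    """Cut after each breakpoint index; segment k is copied from sources[assignments[k]]."""
    length = len(sources[0])
    if any(len(s) != length for s in sources):
        raise ValueError("all sources must have the same length")
    if len(assignments) != len(breakpoints) + 1:
        raise ValueError("need exactly one source assignment per segment")
    # A breakpoint after index b places a segment boundary at offset b + 1.
    edges = [0] + [b + 1 for b in sorted(breakpoints)] + [length]
    parts = [
        sources[src][lo:hi]
        for src, lo, hi in zip(assignments, edges, edges[1:])
    ]
    return "".join(parts)


print(recombine_at(["AAAAAA", "CCCCCC"], [1], [0, 1]))        # AACCCC
print(recombine_at(["AAAAAA", "CCCCCC"], [1, 3], [0, 1, 0]))  # AACCAA
```

With n breakpoints there are n + 1 segments, which is why num_breakpoints cannot exceed seq_length - 1.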

poolparty.join(pools, spacer_str='', iter_order=None, prefix=None, style=None, _factory_name=None)[source]

Concatenate multiple Pools or string sequences into a single Pool.

Parameters:
  • pools (Sequence[Union[TypeVar(T, bound= Pool), str]]) – List of Pool objects and/or strings to be joined in order. Any provided string is automatically converted to a constant Pool.

  • spacer_str (str) – String to insert between joined sequences.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • style (Optional[str]) – Style to apply to the resulting concatenated sequences (e.g., ‘red’, ‘blue bold’).

Returns:

A Pool whose states yield joined sequences from the specified inputs.

Return type:

TypeVar(T, bound= Pool)
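
As a rough mental model of spacer_str, joining behaves like Python's str.join applied to one sequence from each input. The helper below is a hypothetical plain-Python stand-in, not the PoolParty implementation, and it only handles constant strings.

```python
def join_sequences(parts, spacer_str=""):
    """Concatenate sequence strings in order, separated by spacer_str."""
    return spacer_str.join(parts)


joined = join_sequences(["ACGT", "TTAA", "GCGC"], spacer_str="NN")
print(joined)  # ACGTNNTTAANNGCGC
```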

Fixed Operations

Operations that transform sequences without changing pool size.

poolparty.rc(pool, region=None, remove_tags=None, iter_order=None, prefix=None, style=None)[source]

Create a Pool containing the reverse complement of sequences from the input pool.

Note: Region tags are not preserved in the output. If you need to preserve regions, use extract_region with rc=True instead.

Parameters:
  • pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – Parent pool or sequence to reverse complement.

  • region (str | Sequence[Integral] | None) – Region to apply transformation to. Can be marker name (str), [start, stop], or None.

  • remove_tags (Optional[bool]) – If True and region is a marker name, remove marker tags from output.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • style (Optional[str]) – Style to apply to the resulting sequences (e.g., ‘red’, ‘blue bold’).

Returns:

A Pool containing reverse-complemented sequences.

Return type:

Pool
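
For reference, the reverse-complement transform itself can be sketched in plain Python as follows. This is an illustrative helper for unambiguous DNA bases only (reverse_complement is a hypothetical name); it ignores region tags and IUPAC ambiguity codes, which the library may treat differently.

```python
# Complement table for unambiguous DNA bases, preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGCCC"))  # GGGCAT
```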

poolparty.upper(pool, region=None, remove_tags=None, iter_order=None, prefix=None, style=None)[source]

Create a Pool containing uppercase sequences from the input pool.

Preserves XML marker tags exactly as they appear (only transforms non-marker characters).

Parameters:
  • pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – Parent pool or sequence to convert to uppercase.

  • region (str | Sequence[Integral] | None) – Region to apply transformation to. Can be marker name (str), [start, stop], or None.

  • remove_tags (Optional[bool]) – If True and region is a marker name, remove marker tags from output.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • style (Optional[str]) – Style to apply to the resulting sequences (e.g., ‘red’, ‘blue bold’).

Returns:

A Pool containing uppercase sequences.

Return type:

Pool

poolparty.lower(pool, region=None, remove_tags=None, iter_order=None, prefix=None, style=None)[source]

Create a Pool containing lowercase sequences from the input pool.

Preserves XML marker tags exactly as they appear (only transforms non-marker characters).

Parameters:
  • pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – Parent pool or sequence to convert to lowercase.

  • region (str | Sequence[Integral] | None) – Region to apply transformation to. Can be marker name (str), [start, stop], or None.

  • remove_tags (Optional[bool]) – If True and region is a marker name, remove marker tags from output.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • style (Optional[str]) – Style to apply to the resulting sequences (e.g., ‘red’, ‘blue bold’).

Returns:

A Pool containing lowercase sequences.

Return type:

Pool

poolparty.swapcase(pool, region=None, remove_tags=None, iter_order=None, prefix=None, style=None, _factory_name=None)[source]

Create a Pool containing case-swapped sequences from the input pool.

Preserves XML marker tags exactly as they appear (only transforms non-marker characters).

Parameters:
  • pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – Parent pool or sequence to swap case.

  • region (str | Sequence[Integral] | None) – Region to apply transformation to. Can be marker name (str), [start, stop], or None.

  • remove_tags (Optional[bool]) – If True and region is a marker name, remove marker tags from output.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • style (Optional[str]) – Style to apply to the resulting sequences (e.g., ‘red’, ‘blue bold’).

Returns:

A Pool containing case-swapped sequences.

Return type:

Pool
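
The tag-preserving behaviour shared by upper, lower, and swapcase can be sketched in plain Python: split the sequence on XML-style tags and transform only the pieces in between. The regular expression and the helper name transform_outside_tags are assumptions made for this illustration, not PoolParty internals.

```python
import re

# Matches <name>, </name>, and <name/> style tags (assumed tag syntax).
TAG_RE = re.compile(r"(</?[A-Za-z_]\w*\s*/?>)")


def transform_outside_tags(seq, fn):
    """Apply fn to every non-tag piece of seq, leaving tags untouched."""
    parts = TAG_RE.split(seq)
    return "".join(p if TAG_RE.fullmatch(p) else fn(p) for p in parts)


print(transform_outside_tags("aaa<orf>atgccc</orf>ttt", str.upper))
# AAA<orf>ATGCCC</orf>TTT
print(transform_outside_tags("AcG<bc/>t", str.swapcase))
# aCg<bc/>T
```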

poolparty.slice_seq(pool, region=None, start=None, stop=None, step=None, keep_context=False, iter_order=None, prefix=None, style=None)[source]

Create a Pool containing sliced sequences from the input pool.

Extracts a subsequence based on region and/or Python-style slice parameters.

Parameters:
  • pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – The Pool (or sequence string) whose sequences will be sliced.

  • region (str | Sequence[Integral] | None) –

    Region to slice from. Can be:

    • str: Name of an annotated region (e.g., ‘orf’).

    • Sequence[int]: [start, stop] interval in the sequence.

    • None: Use the full sequence.

    If only region is specified (no start/stop/step), returns just that region.

  • start (Optional[Integral]) – Start position for slicing (0-indexed, Python-style). Applied after region extraction if region is specified.

  • stop (Optional[Integral]) – Stop position for slicing (exclusive, Python-style). Applied after region extraction if region is specified.

  • step (Optional[Integral]) – Step for slicing (Python-style). Applied after region extraction if region is specified.

  • keep_context (bool) – If True, reassemble the sliced content back into the original sequence context (prefix + sliced_content + suffix). If False (default), return only the sliced content.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • prefix (Optional[str]) – Prefix for sequence naming.

  • style (Optional[str]) – Style to apply to the resulting sliced sequences (e.g., ‘red’, ‘blue bold’).

Returns:

A Pool containing sliced sequences.

Return type:

Pool

Examples

>>> with pp.Party():
...     # Slice positions 2 through 5 (stop=6 is exclusive)
...     pool = pp.from_seq('ACGTACGT')
...     sliced = pp.slice_seq(pool, start=2, stop=6)
...     # Result: 'GTAC'
...
...     # Extract just a named region
...     pool = pp.from_seq('AAA<orf>ATGCCC</orf>TTT')
...     orf = pp.slice_seq(pool, region='orf')
...     # Result: 'ATGCCC'
...
...     # Slice within a named region
...     pool = pp.from_seq('AAA<orf>ATGCCC</orf>TTT')
...     sliced = pp.slice_seq(pool, region='orf', start=0, stop=3)
...     # Result: 'ATG'
...
...     # Slice with step (every other character)
...     pool = pp.from_seq('ABCDEFGH')
...     sliced = pp.slice_seq(pool, step=2)
...     # Result: 'ACEG'
...
...     # Use as a method on Pool objects
...     pool = pp.from_seq('ACGTACGT')
...     sliced = pool.slice_seq(start=0, stop=4)
...     # Result: 'ACGT'
...
...     # Keep context - reassemble into original sequence
...     pool = pp.from_seq('AAA<orf>ATGCCC</orf>TTT')
...     sliced = pp.slice_seq(pool, region='orf', start=0, stop=3, keep_context=True)
...     # Result: 'AAAATGTTT' (prefix + sliced region + suffix)
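
The interaction between region, start/stop/step, and keep_context can also be pictured without the library. The sketch below is a plain-Python approximation (sketch_slice_region is a hypothetical name) that handles a single named region and reproduces the results listed in the examples above.

```python
import re


def sketch_slice_region(seq, region, start=None, stop=None, step=None,
                        keep_context=False):
    """Slice the content of <region>...</region> with Python slice rules."""
    name = re.escape(region)
    match = re.search(rf"<{name}>(.*?)</{name}>", seq)
    if match is None:
        raise ValueError(f"region {region!r} not found")
    sliced = match.group(1)[slice(start, stop, step)]
    if keep_context:
        # prefix + sliced content + suffix, with the region tags dropped
        return seq[:match.start()] + sliced + seq[match.end():]
    return sliced


seq = "AAA<orf>ATGCCC</orf>TTT"
print(sketch_slice_region(seq, "orf"))                      # ATGCCC
print(sketch_slice_region(seq, "orf", start=0, stop=3))     # ATG
print(sketch_slice_region(seq, "orf", start=0, stop=3,
                          keep_context=True))               # AAAATGTTT
```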
poolparty.clear_gaps(pool, region=None, remove_tags=None, iter_order=None, prefix=None)[source]

Create a Pool with all gap/non-molecular characters removed from sequences.

This removes everything that is NOT a valid molecular character (DNA or protein), including gaps ‘-’, dots ‘.’, spaces ‘ ’, and any other non-molecular characters.

Marker tags are preserved intact.

Parameters:
  • pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – Parent pool or sequence to filter.

  • region (str | Sequence[Integral] | None) – Region to apply transformation to. Can be marker name (str), [start, stop], or None.

  • remove_tags (Optional[bool]) – If True and region is a marker name, remove marker tags from output.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

Returns:

A Pool containing only molecular alphabet characters (markers preserved). Always has seq_length=None because output length depends on how many non-molecular characters each sequence contains.

Return type:

Pool

poolparty.clear_annotation(pool, region=None, remove_tags=None, iter_order=None, prefix=None)[source]

Create a Pool with all annotations cleared and sequences uppercased.

Removes all XML marker tags and non-molecular characters, then uppercases the result. When a region is specified, only transforms content within that region (nested markers and non-molecular chars inside are cleared).

Parameters:
  • pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – Parent pool or sequence to transform.

  • region (str | Sequence[Integral] | None) – Region to apply transformation to. Can be marker name (str), [start, stop], or None.

  • remove_tags (Optional[bool]) – If True and region is a marker name, remove marker tags from output.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

Returns:

A Pool with cleared annotations and uppercase sequences. Always has seq_length=None because output length depends on how many tags and non-molecular characters each sequence contains.

Return type:

Pool
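
The difference between clear_gaps and clear_annotation, and the reason both report seq_length=None, can be sketched in plain Python. The helpers below are hypothetical and simplify "molecular character" to any ASCII letter, which is an assumption; the library's alphabet check may be stricter.

```python
import re

# Matches <name>, </name>, and <name/> style tags (assumed tag syntax).
TAG_RE = re.compile(r"(</?[A-Za-z_]\w*\s*/?>)")
NON_LETTER_RE = re.compile(r"[^A-Za-z]")


def sketch_clear_gaps(seq):
    """Drop non-letter characters outside tags; keep tags and case."""
    parts = TAG_RE.split(seq)
    return "".join(
        p if TAG_RE.fullmatch(p) else NON_LETTER_RE.sub("", p) for p in parts
    )


def sketch_clear_annotation(seq):
    """Drop tags and non-letter characters, then uppercase."""
    return NON_LETTER_RE.sub("", TAG_RE.sub("", seq)).upper()


raw = "ac-g.t <orf>AT-G</orf>"
print(sketch_clear_gaps(raw))        # acgt<orf>ATG</orf>
print(sketch_clear_annotation(raw))  # ACGTATG
```

The output length depends on how many characters were dropped from each input, so it cannot be known ahead of time.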

poolparty.stylize(pool, region=None, *, style, which='contents', regex=None, iter_order=None, prefix=None)[source]

Apply inline styling to sequences without modifying them.

Styles are attached directly to sequences as they flow through the pool chain.

Parameters:
  • pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – Parent pool or sequence to style.

  • region (str | Sequence[Integral] | None) – Region to restrict styling. Can be marker name or [start, stop]. If None, styles the entire sequence.

  • style (str) – Style spec string (e.g., ‘red bold’, ‘lower cyan’). Can include ‘upper’/‘lower’ for case transforms.

  • which (Literal['all', 'upper', 'lower', 'gap', 'tags', 'contents']) – Pattern selector: ‘all’, ‘upper’, ‘lower’, ‘gap’, ‘tags’, ‘contents’.

  • regex (Optional[str]) – Custom regex pattern. If specified, overrides which.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

Returns:

A Pool with inline styling attached to sequences.

Return type:

Pool

Scan Operations

Tiled mutagenesis operations that scan across sequence positions.

poolparty.insertion_scan(pool, insertion_pool, positions=None, region=None, replace=False, style=None, prefix=None, prefix_position=None, prefix_insert=None, mode='random', num_states=None, iter_order=None, cards=None, _factory_name='insertion_scan')[source]

Insert or replace a sequence at specified scanning positions.

Parameters:
  • pool (Union[Pool, str]) – Parent pool or sequence string.

  • insertion_pool (Union[Pool, str]) – The pool or sequence string to be inserted.

  • positions (Sequence[Integral] | slice | None) – Positions for insertion/replacement (0-based). If None, all valid positions.

  • region (str | Sequence[Integral] | None) – Region to constrain the scan to. Can be a marker name or [start, stop] interval.

  • replace (bool) – If False, insert at position (output length = background length + insert length). If True, replace content at position (output length = background length).

  • style (Optional[str]) – Style to apply to inserted content (e.g., ‘red’, ‘blue bold’).

  • prefix (Optional[str]) – Prefix for cartesian product index (e.g., ‘ins_’ produces ‘ins_0’, ‘ins_1’, …).

  • prefix_position (Optional[str]) – Prefix for position index (e.g., ‘pos_’ produces ‘pos_0’, ‘pos_1’, …).

  • prefix_insert (Optional[str]) – Prefix for insert index (e.g., ‘ins_’ produces ‘ins_0’, ‘ins_1’, …).

  • mode (Literal['random', 'sequential', 'fixed']) – Selection mode: ‘random’ or ‘sequential’.

  • num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'position_index', 'start', 'end', 'name', 'region_seq'.

Returns:

A Pool yielding sequences with the insert placed at selected position(s).

Return type:

Pool
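
The effect of replace on output length can be illustrated with a plain-Python enumeration over a single background string. sketch_insertion_scan is a hypothetical helper, and the choice of default positions (every slot where the insert fits) is an assumption about what "all valid positions" means.

```python
def sketch_insertion_scan(background, insert, positions=None, replace=False):
    """Return one variant per scan position.

    replace=False: the insert is spliced in at the position.
    replace=True: the insert overwrites len(insert) background characters.
    """
    if positions is None:
        last = len(background) - len(insert) if replace else len(background)
        positions = range(last + 1)
    variants = []
    for pos in positions:
        end = pos + len(insert) if replace else pos
        variants.append(background[:pos] + insert + background[end:])
    return variants


inserted = sketch_insertion_scan("AAAA", "GG")
replaced = sketch_insertion_scan("AAAA", "GG", replace=True)
print(inserted)  # ['GGAAAA', 'AGGAAA', 'AAGGAA', 'AAAGGA', 'AAAAGG']
print(replaced)  # ['GGAA', 'AGGA', 'AAGG']
```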

poolparty.deletion_scan(pool, deletion_length, deletion_marker='-', positions=None, region=None, prefix=None, mode='random', num_states=None, style=None, iter_order=None, cards=None, _factory_name='deletion_scan')[source]

Scan a pool for all possible single deletions of a fixed length.

Parameters:
  • pool (Union[Pool, str]) – Parent pool or sequence string.

  • deletion_length (Integral) – Number of characters to delete at each valid position.

  • deletion_marker (Optional[str]) – Character to insert at the deletion site. If None, segment is removed.

  • positions (Sequence[Integral] | slice | None) – Positions to consider for the start of the deletion (0-based, relative to region).

  • region (str | Sequence[Integral] | None) – Region to constrain the scan to. Can be a marker name or [start, stop] interval.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • mode (Literal['random', 'sequential', 'fixed']) – Selection mode: ‘random’ or ‘sequential’.

  • num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).

  • style (Optional[str]) – Style to apply to deletion gap characters (e.g., ‘gray’, ‘red bold’).

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'position_index', 'start', 'end', 'name', 'region_seq'.

Returns:

A Pool yielding sequences where a segment of the specified length is removed from the source at each allowed position, optionally with a marker inserted.

Return type:

Pool
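
The sequential-mode behaviour of num_states (cycling when larger than the computed count, clipping when smaller) can be sketched as below. This is plain Python, not PoolParty code: deletion_variants and take_states are hypothetical names, and repeating the marker once per deleted character is an assumption about how deletion_marker is applied.

```python
from itertools import cycle, islice


def deletion_variants(seq, deletion_length, deletion_marker="-"):
    """Enumerate every single deletion of a fixed length, left to right."""
    # Assumption: the marker is repeated once per deleted character.
    fill = "" if deletion_marker is None else deletion_marker * deletion_length
    return [
        seq[:start] + fill + seq[start + deletion_length:]
        for start in range(len(seq) - deletion_length + 1)
    ]


def take_states(variants, num_states=None):
    """Clip or cycle the enumerated variants to num_states entries."""
    if num_states is None:
        return list(variants)
    return list(islice(cycle(variants), num_states))


variants = deletion_variants("ACGTA", 2)
print(variants)                  # ['--GTA', 'A--TA', 'AC--A', 'ACG--']
print(take_states(variants, 2))  # ['--GTA', 'A--TA']
print(take_states(variants, 6))  # wraps around after the fourth variant
```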

poolparty.replacement_scan(pool, replacement_pool, positions=None, region=None, style=None, prefix=None, prefix_position=None, prefix_insert=None, mode='random', num_states=None, iter_order=None, cards=None, _factory_name='replacement_scan')[source]

Replace a segment with an insert at the specified scanning positions.

Equivalent to insertion_scan(..., replace=True). See insertion_scan() for full parameter documentation.

Return type:

Pool

poolparty.shuffle_scan(pool, shuffle_length, positions=None, region=None, shuffle_type='mono', shuffles_per_position=1, prefix=None, prefix_position=None, prefix_shuffle=None, mode='random', num_states=None, style=None, iter_order=None, cards=None, _factory_name='shuffle_scan')[source]

Shuffle characters within a window at specified scanning positions.

Parameters:
  • pool (Union[Pool, str]) – Parent pool or sequence string.

  • shuffle_length (Integral) – Length of the region to shuffle at each position.

  • positions (Sequence[Integral] | slice | None) – Positions to consider for the start of the shuffle region (0-based).

  • region (str | Sequence[Integral] | None) – Region to constrain the scan to. Can be a marker name or [start, stop] interval.

  • shuffle_type (Literal['mono', 'dinuc']) –

    Type of shuffle to perform:

    • "mono": random permutation preserving mononucleotide composition.

    • "dinuc": Euler-path shuffle preserving dinucleotide frequencies. The first and last characters of each window are fixed.

  • shuffles_per_position (Integral) – Number of shuffles to perform at each position.

  • prefix (Optional[str]) – Prefix for cartesian product index (e.g., ‘shuf’ produces ‘shuf_0’, ‘shuf_1’, …).

  • prefix_position (Optional[str]) – Prefix for position index (e.g., ‘pos’ produces ‘pos_0’, ‘pos_1’, …).

  • prefix_shuffle (Optional[str]) – Prefix for shuffle variant index (e.g., ‘var’ produces ‘var_0’, ‘var_1’, …).

  • mode (Literal['random', 'sequential', 'fixed']) – Selection mode: ‘random’ or ‘sequential’.

  • num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).

  • style (Optional[str]) – Style to apply to shuffled characters (e.g., ‘purple’, ‘red bold’).

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • cards (Optional[tuple[None | list[str] | dict[str, str], None | list[str] | dict[str, str]]]) – Design card keys as a 2-tuple (scan_cards, shuffle_cards). Scan keys: 'position_index', 'start', 'end', 'name', 'region_seq'. Shuffle keys: 'permutation'.

Returns:

A Pool yielding sequences where a region of the specified length is shuffled at each allowed position.

Return type:

Pool
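
A "mono" shuffle at one scan position can be sketched with the standard library: permute the characters inside the window and leave the flanks untouched. shuffle_window is a hypothetical helper and the seeded random.Random instance is only there to make the sketch repeatable; the "dinuc" variant is not covered here.

```python
import random


def shuffle_window(seq, start, shuffle_length, rng):
    """Randomly permute seq[start:start + shuffle_length], keeping the rest."""
    window = list(seq[start:start + shuffle_length])
    rng.shuffle(window)  # preserves character composition by construction
    return seq[:start] + "".join(window) + seq[start + shuffle_length:]


rng = random.Random(42)
original = "AAACGTACTTT"
shuffled = shuffle_window(original, start=3, shuffle_length=5, rng=rng)
print(shuffled)
```

Only the five window characters may move; the composition of the window and the flanking sequence are unchanged.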

poolparty.mutagenize_scan(pool, mutagenize_length, num_mutations=None, mutation_rate=None, positions=None, region=None, prefix=None, mode='random', num_states=None, style=None, iter_order=None, cards=None, _factory_name='mutagenize_scan')[source]

Apply mutagenesis within a window at specified scanning positions.

Parameters:
  • pool (Union[Pool, str]) – Parent pool or sequence string.

  • mutagenize_length (Integral) – Length of the region to mutagenize at each position.

  • num_mutations (Optional[Integral]) – Fixed number of mutations to apply (mutually exclusive with mutation_rate).

  • mutation_rate (Optional[Real]) – Probability of mutation at each position (mutually exclusive with num_mutations).

  • positions (Sequence[Integral] | slice | None) – Positions to consider for the start of the mutagenize region (0-based). If None, all valid positions are used.

  • region (str | Sequence[Integral] | None) – Region to constrain the scan to. Can be a marker name or [start, stop] interval. If specified, positions are relative to the region start.

  • prefix (Union[str, Sequence[str], None]) – Prefix for sequence names. If a 2-tuple, first element is for scanning positions, second for mutagenization.

  • mode (Union[Literal['random', 'sequential', 'fixed'], tuple[Literal['random', 'sequential', 'fixed'], Literal['random', 'sequential', 'fixed']]]) – Selection mode: ‘random’ or ‘sequential’. A scalar value is broadcast to both scan and mutagenize sub-operations. If a 2-tuple, first element is for scanning positions, second for mutagenization.

  • num_states (Union[Integral, Sequence[Optional[Integral]], None]) – Number of states. A scalar value is broadcast to both sub-operations. If a 2-tuple, first element is for scanning positions, second for mutagenization. For each element: None means auto-compute in sequential mode (enumerate all variants) or 1 in random mode (pure random sampling). Example: num_states=(3, None) with mode=("random", "sequential") picks 3 random scan positions and enumerates all mutation variants at each.

  • style (Optional[str]) – Style to apply to mutated characters (e.g., ‘red’, ‘blue bold’).

  • iter_order (Union[Real, Sequence[Real], None]) – Iteration order priority for the Operation. If a 2-tuple, first element is for scanning positions, second for mutagenization.

  • cards (Optional[tuple[None | list[str] | dict[str, str], None | list[str] | dict[str, str]]]) – Design card keys as a 2-tuple (scan_cards, mutagenize_cards). Scan keys: 'position_index', 'start', 'end', 'name', 'region_seq'. Mutagenize keys: 'positions', 'wt_chars', 'mut_chars'.

Returns:

A Pool yielding sequences where a region of the specified length is mutagenized at each allowed position.

Return type:

Pool
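
The num_states=(3, None) with mode=("random", "sequential") example can be pictured as a two-stage process: sample three window positions, then enumerate every mutant within each window. The sketch below is plain Python under several assumptions: a DNA alphabet, num_mutations=1, and sampling positions without replacement. The function names are hypothetical.

```python
import random

DNA = "ACGT"


def single_mutants(seq, start, length):
    """Enumerate all single-base substitutions inside the window."""
    variants = []
    for i in range(start, start + length):
        for base in DNA:
            if base != seq[i]:
                variants.append(seq[:i] + base + seq[i + 1:])
    return variants


def scan_then_mutagenize(seq, length, num_positions, rng):
    """Pick random window starts, then enumerate mutants at each one."""
    valid_starts = range(len(seq) - length + 1)
    starts = sorted(rng.sample(valid_starts, num_positions))
    return {start: single_mutants(seq, start, length) for start in starts}


wild_type = "ACGTACGTAC"
result = scan_then_mutagenize(wild_type, length=2, num_positions=3,
                              rng=random.Random(0))
for start, variants in result.items():
    print(start, len(variants))  # each 2-base window has 2 * 3 = 6 mutants
```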

poolparty.subseq_scan(pool, subseq_length, positions=None, region=None, prefix=None, mode='random', num_states=None, iter_order=None, cards=None, _factory_name='subseq_scan')[source]

Extract subsequences of a specified length at scanning positions.

Slides a window of subseq_length characters across each sequence and returns the window content at each valid position.

Parameters:
  • pool (Union[Pool, str]) – Parent pool or sequence string.

  • subseq_length (Integral) – Length of subsequence to extract at each position.

  • positions (Sequence[Integral] | slice | None) – Positions to consider for the start of extraction (0-based). If None, all valid positions are used.

  • region (str | Sequence[Integral] | None) – Region to constrain the scan to. Can be a marker name or [start, stop] interval. If specified, positions are relative to the region start.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • mode (Literal['random', 'sequential', 'fixed']) – Selection mode: ‘random’ or ‘sequential’.

  • num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'position_index', 'start', 'end', 'name', 'region_seq'.

Returns:

A Pool yielding subsequences extracted at each allowed position.

Return type:

Pool
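
In sequential mode with default positions, the scan amounts to a sliding window. A minimal plain-Python equivalent over a single string might look like this (sliding_subseqs is a hypothetical name, not part of PoolParty):

```python
def sliding_subseqs(seq, subseq_length):
    """Return every contiguous subsequence of the given length."""
    return [
        seq[start:start + subseq_length]
        for start in range(len(seq) - subseq_length + 1)
    ]


windows = sliding_subseqs("ACGTAC", 4)
print(windows)  # ['ACGT', 'CGTA', 'GTAC']
```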

Region Operations

Operations for working with tagged sequence regions.

poolparty.insert_tags(pool, region_name, start, stop=None, iter_order=None, prefix=None)[source]

Insert XML-style region tags at a fixed position in sequences.

Parameters:
  • pool (Pool or str) – Input Pool or sequence string to add tags to.

  • region_name (str) – Name for the region (e.g., ‘region’, ‘orf’, ‘insert’).

  • start (int) – Start position (0-based) for the region.

  • stop (Optional[int]) – End position (exclusive). If None, creates a zero-length region at start.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

Returns:

A Pool yielding sequences with the region tags inserted.

Return type:

Pool

Examples

>>> with pp.Party():
...     bg = pp.from_seq('ACGTACGT')
...     # Region tags encompassing positions 2 through 4 (stop=5 is exclusive)
...     marked = pp.insert_tags(bg, 'region', start=2, stop=5)
...     # Result: 'AC<region>GTA</region>CGT'
...
...     # Zero-length region at position 4
...     marked = pp.insert_tags(bg, 'ins', start=4)
...     # Result: 'ACGT<ins/>ACGT'
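
The tag placement rules (stop exclusive, self-closing tag when stop is None) can be reproduced on a plain string. sketch_insert_tags is a hypothetical stand-in that assumes the input contains no existing tags.

```python
def sketch_insert_tags(seq, region_name, start, stop=None):
    """Wrap seq[start:stop] in region tags, or insert a zero-length tag."""
    if stop is None:
        return f"{seq[:start]}<{region_name}/>{seq[start:]}"
    return (
        f"{seq[:start]}<{region_name}>{seq[start:stop]}"
        f"</{region_name}>{seq[stop:]}"
    )


print(sketch_insert_tags("ACGTACGT", "region", start=2, stop=5))
# AC<region>GTA</region>CGT
print(sketch_insert_tags("ACGTACGT", "ins", start=4))
# ACGT<ins/>ACGT
```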
poolparty.remove_tags(pool, region_name, keep_content=True, iter_order=None, prefix=None)[source]

Remove region tags from sequences.

Parameters:
  • pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – Input Pool or sequence string containing the region.

  • region_name (str) – Name of the region to remove.

  • keep_content (bool) – If True, keep the content inside the region (just remove tags). If False, remove both the region tags and their content.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

Returns:

A Pool yielding sequences with the region tags removed.

Return type:

Pool

Examples

>>> with pp.Party():
...     bg = pp.from_seq('ACGT<region>TTAA</region>GCGC')
...
...     # Keep content (just remove tags)
...     result = pp.remove_tags(bg, 'region', keep_content=True)
...     # Result: 'ACGTTTAAGCGC'
...
...     # Remove content too
...     result = pp.remove_tags(bg, 'region', keep_content=False)
...     # Result: 'ACGTGCGC'
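
The two keep_content behaviours can be sketched with a regular expression over a plain string. sketch_remove_tags is a hypothetical helper; it assumes the <name>...</name> and <name/> tag forms used in the examples above.

```python
import re


def sketch_remove_tags(seq, region_name, keep_content=True):
    """Strip the named region's tags, optionally with their content."""
    name = re.escape(region_name)
    pattern = re.compile(rf"<{name}>(.*?)</{name}>|<{name}/>")

    def replacement(match):
        # group(1) is None for a zero-length <name/> tag.
        return (match.group(1) or "") if keep_content else ""

    return pattern.sub(replacement, seq)


seq = "ACGT<region>TTAA</region>GCGC"
print(sketch_remove_tags(seq, "region", keep_content=True))   # ACGTTTAAGCGC
print(sketch_remove_tags(seq, "region", keep_content=False))  # ACGTGCGC
```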
poolparty.extract_region(pool, region_name, rc=False, iter_order=None, prefix=None)[source]

Extract content from a named region as a new Pool.

Creates a Pool that yields the content inside the specified region.

Parameters:
  • pool (Pool or str) – Input Pool or sequence string containing the region.

  • region_name (str) – Name of the region to extract content from.

  • rc (bool) – If True, reverse-complement the extracted content.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

Returns:

A Pool yielding the content inside the region.

Return type:

Pool

Examples

>>> with pp.Party():
...     bg = pp.from_seq('ACGT<region>TTAA</region>GCGC')
...     content = pp.extract_region(bg, 'region')
...     # content yields: 'TTAA'
...
...     # With rc=True, content is reverse-complemented
...     content_rc = pp.extract_region(bg, 'region', rc=True)
...     # content_rc yields: 'TTAA' (TTAA is palindromic, so it is its own reverse complement)
poolparty.replace_region(pool, content_pool, region_name, rc=False, sync=True, keep_tags=True, iter_order=None, prefix=None, _factory_name=None, _style=None)[source]

Replace a region with content from another Pool.

The region (including its tags and any content) is replaced with sequences from content_pool.

Parameters:
  • pool (Pool or str) – Background Pool or sequence string containing the region.

  • content_pool (Pool or str) – Pool or sequence string to insert at the region position.

  • region_name (str) – Name of the region to replace.

  • rc (bool) – If True, reverse-complement the content before insertion.

  • sync (bool) – If True, synchronize pool and content_pool so they iterate in lock-step (1:1 pairing) instead of a Cartesian product.

  • keep_tags (bool) – If True, preserve the region’s XML tags around the new content. The region remains tracked in the resulting pool.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

Returns:

A Pool yielding pool sequences with the region replaced by content_pool sequences.

Return type:

Pool

Examples

>>> with pp.Party():
...     # Replace region with content from another pool
...     bg = pp.from_seq('ACGT<insert/>TTTT')
...     inserts = pp.from_seqs(['AAA', 'GGG'], mode='sequential')
...     result = pp.replace_region(bg, inserts, 'insert')
...     # Result yields: 'ACGTAAATTTT', 'ACGTGGGTTTT'
...
...     # With sync=True for 1:1 pairing
...     bg = pp.from_seqs(['ACGT<bc/>TTTT', 'CCCC<bc/>GGGG'], mode='sequential')
...     barcodes = pp.get_barcodes(num_barcodes=2, length=4, seed=42)
...     result = pp.replace_region(bg, barcodes, 'bc', sync=True)
...     # Each background gets a unique barcode (no Cartesian product)
...
...     # With keep_tags=True to preserve region tracking
...     result = pp.replace_region(bg, barcodes, 'bc', keep_tags=True)
...     # Region tags remain: 'ACGT<bc>XXXX</bc>TTTT'
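
The difference between lock-step pairing (sync=True) and a Cartesian product can be seen by contrasting zip with itertools.product on plain lists. This sketch stands outside PoolParty: fill_region is a hypothetical helper, the barcode strings are made up, and only zero-length <name/> tags are handled.

```python
from itertools import product


def fill_region(background, content, region_name, keep_tags=False):
    """Replace a zero-length <name/> tag with content."""
    filled = content
    if keep_tags:
        filled = f"<{region_name}>{content}</{region_name}>"
    return background.replace(f"<{region_name}/>", filled)


backgrounds = ["ACGT<bc/>TTTT", "CCCC<bc/>GGGG"]
barcodes = ["AAAA", "GGGG"]  # made-up barcodes for illustration

# Lock-step: background i is paired with barcode i.
synced = [fill_region(b, c, "bc") for b, c in zip(backgrounds, barcodes)]
# Cartesian product: every background with every barcode.
crossed = [fill_region(b, c, "bc") for b, c in product(backgrounds, barcodes)]

print(synced)        # ['ACGTAAAATTTT', 'CCCCGGGGGGGG']
print(len(crossed))  # 4
print(fill_region(backgrounds[0], barcodes[0], "bc", keep_tags=True))
# ACGT<bc>AAAA</bc>TTTT
```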
poolparty.apply_at_region(pool, region_name, transform_fn, rc=False, remove_tags=True, iter_order=None, prefix=None)[source]

Apply a transformation to the content of a region.

This is a high-level convenience function that:

1. Extracts content from the named region (reverse-complementing if rc=True).

2. Applies transform_fn to create a transformed content Pool.

3. Replaces the region with the transformed content (reverse-complementing back if rc=True).

Parameters:
  • pool (Pool or str) – Input Pool or sequence string containing the region.

  • region_name (str) – Name of the region whose content to transform.

  • transform_fn (Callable) – Function that takes a Pool and returns a transformed Pool. Examples: pp.rc, pp.shuffle_seq, lambda p: pp.mutagenize(p, …)

  • rc (bool) – If True, reverse-complement content before transform and reverse-complement result back before insertion.

  • remove_tags (bool) – If True, region tags are removed from the result. If False, region tags are preserved around the transformed content.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

Returns:

A Pool with the region content transformed.

Return type:

Pool

Examples

>>> with pp.Party():
...     # Reverse complement a region (tags removed)
...     bg = pp.from_seq('ACGT<orf>ATGCCC</orf>TTTT')
...     result = pp.apply_at_region(bg, 'orf', pp.rc)
...     # Result: 'ACGTGGGCATTTTT'
...
...     # Keep tags around transformed content
...     bg = pp.from_seq('AAA<region>ACGT</region>TTT')
...     result = pp.apply_at_region(
...         bg, 'region',
...         lambda p: pp.mutagenize(p, num_mutations=1),
...         remove_tags=False,
...     )
...     # Example result: 'AAA<region>ACCT</region>TTT' (tags preserved; the mutated base varies)

Notes

If rc=True, the transform_fn receives reverse-complemented content, and the result is reverse-complemented back before insertion.
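
The extract, transform, replace sequence and the rc round trip can be sketched on plain strings. Here transform_fn takes and returns a string rather than a Pool, which is a simplification, and sketch_apply_at_region and revcomp are hypothetical helpers.

```python
import re

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Reverse complement for unambiguous DNA, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]


def sketch_apply_at_region(seq, region_name, transform_fn, rc=False,
                           remove_tags=True):
    name = re.escape(region_name)
    match = re.search(rf"<{name}>(.*?)</{name}>", seq)
    if match is None:
        raise ValueError(f"region {region_name!r} not found")
    content = match.group(1)            # 1. extract
    if rc:
        content = revcomp(content)
    content = transform_fn(content)     # 2. transform
    if rc:
        content = revcomp(content)      # flip back before re-insertion
    if not remove_tags:
        content = f"<{region_name}>{content}</{region_name}>"
    return seq[:match.start()] + content + seq[match.end():]  # 3. replace


def lowercase_first_three(s):
    return s[:3].lower() + s[3:]


bg = "ACGT<orf>ATGCCC</orf>TTTT"
print(sketch_apply_at_region(bg, "orf", revcomp))
# ACGTGGGCATTTTT
print(sketch_apply_at_region(bg, "orf", lowercase_first_three))
# ACGTatgCCCTTTT
print(sketch_apply_at_region(bg, "orf", lowercase_first_three, rc=True))
# ACGTATGcccTTTT
```

With rc=True the transform sees the opposite strand, so "the first three bases" it lowercases end up as the last three bases of the region in the output.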

poolparty.region_scan(pool, tag_name='region', positions=None, region=None, remove_tags=None, region_length=0, prefix=None, mode='random', num_states=None, iter_order=None, cards=None, _factory_name=None)[source]

Insert XML-style region tags at scanning positions in a sequence.

Parameters:
  • pool (Pool or str) – Input Pool or sequence string to insert tags into.

  • tag_name (str) – Name for the XML tag to insert.

  • positions (Sequence[Integral] | slice | None) – Valid insertion positions (0-based). If None, all positions are valid.

  • region (str | Sequence[Integral] | None) – Region to constrain the scan to. Can be region name (str) or [start, stop].

  • remove_tags (Optional[bool]) – If True and region is a region name, remove tags from output.

  • region_length (int) – Length of sequence to encompass. 0 creates zero-length regions (<name/>), >0 creates region tags (<name>BASES</name>).

  • mode (Literal['random', 'sequential', 'fixed']) – Position selection mode: ‘random’ or ‘sequential’.

  • _factory_name (Optional[str]) – Sets the default name of the resulting operation.

Returns:

A Pool yielding sequences with the region tags inserted at selected positions.

Return type:

Pool

poolparty.region_multiscan(pool, tag_names, num_insertions, positions=None, region=None, remove_tags=None, region_length=0, insertion_mode='ordered', min_spacing=None, max_spacing=None, prefix=None, mode='random', num_states=None, iter_order=None, cards=None, _factory_name=None)[source]

Insert multiple XML-style region tags into a sequence.

Parameters:
  • pool (Pool or str) – Input Pool or sequence string to insert tags into.

  • tag_names (Sequence[str] or str) – Tag name(s) to insert. If a single string, used for all insertions.

  • num_insertions (int) – Number of region tags to insert.

  • positions (Sequence[Integral] | Sequence[Sequence[Integral]] | slice | None) – Valid insertion positions (0-based, nontag-relative). Flat list/slice/None for shared positions; list-of-lists for per-insert positions (one per insert).

  • region (str | Sequence[Integral] | None) – Region to constrain the scan to. Can be region name (str) or [start, stop].

  • region_length (int | Sequence[int]) – Length of sequence to encompass per region. Single int for uniform length, or a sequence of ints for per-region lengths (one per insertion).

  • insertion_mode (Literal['ordered', 'unordered']) –

    How to assign tags to positions:

    • ‘ordered’: tag_names[i] goes to the i-th selected position (left to right).

    • ‘unordered’: all valid assignments of tags to positions are enumerated.

  • min_spacing (Optional[int]) – Minimum gap between end of one region and start of next. If None, defaults to 0 (regions may touch but not overlap).

  • max_spacing (Optional[int]) – Maximum gap between adjacent regions. None = unbounded.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • mode (Literal['random', 'sequential', 'fixed']) – Position selection mode: ‘random’ or ‘sequential’.

  • num_states (Optional[Integral]) – Number of states. If None, auto-determined for sequential mode.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

Returns:

A Pool yielding sequences with multiple region tags inserted.

Return type:

Pool
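
The difference between the two insertion_mode values can be illustrated without PoolParty. The sketch below is a plain-Python approximation, not the library's implementation: assign_tags is a hypothetical helper, and it assumes distinct tag names and ignores region_length and spacing constraints.

```python
from itertools import combinations, permutations


def assign_tags(tag_names, positions, insertion_mode="ordered"):
    """Enumerate (position, tag) assignments for len(tag_names) insertions.

    Hypothetical helper for illustration only; not part of PoolParty.
    """
    results = []
    # Choose which positions receive a tag, always listed left to right.
    for chosen in combinations(sorted(positions), len(tag_names)):
        if insertion_mode == "ordered":
            # tag_names[i] goes to the i-th selected position.
            orders = [tuple(tag_names)]
        elif insertion_mode == "unordered":
            # Every assignment of tags to the selected positions.
            orders = permutations(tag_names)
        else:
            raise ValueError(f"unknown insertion_mode: {insertion_mode!r}")
        for order in orders:
            results.append(tuple(zip(chosen, order)))
    return results


ordered = assign_tags(["a", "b"], [0, 4, 8], "ordered")
unordered = assign_tags(["a", "b"], [0, 4, 8], "unordered")
print(len(ordered), len(unordered))  # 3 6
```

With two tags and three candidate positions, 'ordered' gives one assignment per position pair, while 'unordered' doubles the count by also swapping the tags.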

Multiscan Operations

Multi-region scanning operations.

poolparty.deletion_multiscan(pool, deletion_length, num_deletions, deletion_marker='-', positions=None, region=None, names=None, min_spacing=None, max_spacing=None, prefix=None, mode='random', num_states=None, style=None, iter_order=None, cards=None, _factory_name='deletion_multiscan')[source]

Delete segments at multiple positions simultaneously.

Parameters:
  • pool (Union[Pool, str]) – Parent pool or sequence string.

  • deletion_length (Integral) – Number of characters to delete at each position.

  • num_deletions (Integral) – Number of simultaneous deletions to make.

  • deletion_marker (Optional[str]) – Character to insert at each deletion site. If None, deleted segments are removed with no marker.

  • positions (Sequence[Integral] | Sequence[Sequence[Integral]] | slice | None) – Valid positions for deletion starts (0-based). Can be a flat list (shared across all deletions) or a list of per-deletion sublists. If None, all valid positions are used.

  • region (str | Sequence[Integral] | None) – Region to constrain the scan to. Can be a marker name or [start, stop] interval.

  • names (Optional[Sequence[str]]) – Custom names for the deletion regions. If None, auto-generated (_del_0, _del_1, …).

  • min_spacing (Optional[Integral]) – Minimum gap between end of one deletion and start of next.

  • max_spacing (Optional[Integral]) – Maximum gap between adjacent deletions. None = unbounded.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • mode (Literal['random', 'sequential', 'fixed']) – Selection mode: ‘random’ or ‘sequential’.

  • num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).

  • style (Optional[str]) – Style to apply to deletion marker characters (e.g., ‘gray’, ‘red bold’).

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'combination_index', 'starts', 'ends', 'names', 'region_seqs'.

Returns:

A Pool yielding sequences with multiple segments deleted simultaneously.

Return type:

Pool
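
To make the spacing parameters concrete, the following sketch enumerates the deletion start positions that satisfy min_spacing and max_spacing, where the gap is measured from the end of one deletion to the start of the next, as described above. It is an approximation in plain Python: valid_start_sets is a hypothetical helper, and treating min_spacing=None as 0 is an assumption.

```python
from itertools import combinations


def valid_start_sets(seq_len, deletion_length, num_deletions,
                     min_spacing=None, max_spacing=None):
    """List tuples of deletion starts that respect the spacing limits.

    Hypothetical helper for illustration only; not part of PoolParty.
    """
    min_gap = 0 if min_spacing is None else min_spacing  # assumed default
    starts = range(seq_len - deletion_length + 1)
    kept = []
    for combo in combinations(starts, num_deletions):
        # Gap = start of the next deletion minus end of the previous one.
        gaps = [b - (a + deletion_length) for a, b in zip(combo, combo[1:])]
        if any(g < min_gap for g in gaps):
            continue
        if max_spacing is not None and any(g > max_spacing for g in gaps):
            continue
        kept.append(combo)
    return kept


print(len(valid_start_sets(8, 2, 2)))                                # 15
print(len(valid_start_sets(8, 2, 2, min_spacing=1)))                 # 10
print(len(valid_start_sets(8, 2, 2, min_spacing=1, max_spacing=1)))  # 4
```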

poolparty.insertion_multiscan(pool, num_insertions, insertion_pools, positions=None, region=None, names=None, replace=False, style=None, insertion_mode='ordered', min_spacing=None, max_spacing=None, prefix=None, mode='random', num_states=None, iter_order=None, cards=None, _factory_name=None)[source]

Insert or replace sequences at multiple positions simultaneously.

Parameters:
  • pool (Union[Pool, str]) – Parent pool or sequence string.

  • num_insertions (Integral) – Number of simultaneous insertions/replacements to make.

  • insertion_pools (Union[Pool, Sequence[Pool]]) – Pool(s) providing content. If a single Pool is provided, it will be deepcopied num_insertions - 1 times. If a Sequence of Pools is provided, its length must equal num_insertions.

  • positions (Sequence[Integral] | Sequence[Sequence[Integral]] | slice | None) – Valid positions (0-based). Can be a flat list (shared across all insertions) or a list of per-insertion sublists. If None, all valid positions are used.

  • region (str | Sequence[Integral] | None) – Region to constrain the scan to. Can be a marker name or [start, stop] interval.

  • names (Optional[Sequence[str]]) – Custom names for the insertion regions. If None, auto-generated (_ins_0, _ins_1, …).

  • replace (bool) – If True, replace existing content at each position (region_length = pool seq_length). If False, insert at zero-width positions.

  • style (Optional[str]) – Style to apply to inserted/replaced content (e.g., ‘red’, ‘blue bold’).

  • insertion_mode (Literal['ordered', 'unordered']) – How to assign pools to positions. 'ordered' preserves position order; 'unordered' uses all permutations.

  • min_spacing (Optional[Integral]) – Minimum gap between adjacent positions.

  • max_spacing (Optional[Integral]) – Maximum gap between adjacent positions. None = unbounded.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • mode (Literal['random', 'sequential', 'fixed']) – Selection mode: ‘random’ or ‘sequential’.

  • num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'combination_index', 'starts', 'ends', 'names', 'region_seqs'.

Returns:

A Pool yielding sequences with multiple insertions or replacements.

Return type:

Pool
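
The num_states behaviour in sequential mode ("cycling if greater, clipping if less") can be pictured with a short sketch. This is one plausible reading, assuming that cycling means wrapping around to the first state; sequential_states is a hypothetical helper, not a PoolParty function.

```python
def sequential_states(computed_states, num_states=None):
    """Return the states visited in sequential mode.

    Hypothetical helper for illustration only; not part of PoolParty.
    """
    if num_states is None:
        # No override: use the computed state count as is.
        return list(computed_states)
    n = len(computed_states)
    # Fewer requested states clips the list; more wraps around (assumed).
    return [computed_states[i % n] for i in range(num_states)]


states = ["s0", "s1", "s2"]
print(sequential_states(states, 2))  # ['s0', 's1']
print(sequential_states(states, 5))  # ['s0', 's1', 's2', 's0', 's1']
```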

poolparty.replacement_multiscan(pool, num_replacements, replacement_pools, positions=None, region=None, names=None, style=None, insertion_mode='ordered', min_spacing=None, max_spacing=None, prefix=None, mode='random', num_states=None, iter_order=None, cards=None, _factory_name='replacement_multiscan')[source]

Replace segments at multiple positions simultaneously.

Equivalent to insertion_multiscan(..., replace=True). See insertion_multiscan() for full parameter documentation.

Return type:

Pool

State Operations

Operations that manipulate the state space of pools.

poolparty.stack(pools, prefix=None, iter_order=None, cards=None)[source]

Create a pool by stacking multiple input pools state-wise (disjoint union).

Parameters:
  • pools (Sequence[TypeVar(T, bound= Pool)]) – Sequence of Pool objects to stack into a single Pool.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'active_parent'.

Returns:

A Pool whose states are the disjoint union of all input pools’ states. Each state produces the sequence from the corresponding input pool.

Return type:

TypeVar(T, bound= Pool)
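
A disjoint union of states can be modelled as a mapping from a global state index to a (parent, local state) pair. The sketch below assumes the parents are laid out in input order; stack_state is a hypothetical helper and does not reflect how PoolParty stores state internally.

```python
def stack_state(parent_sizes, state):
    """Map a stacked state index to (active_parent, parent_state).

    Hypothetical helper for illustration only; not part of PoolParty.
    """
    for parent_index, size in enumerate(parent_sizes):
        if state < size:
            return parent_index, state
        state -= size  # skip past this parent's block of states
    raise IndexError("state out of range")


# Two parents with 2 and 3 states give 5 stacked states.
print([stack_state([2, 3], s) for s in range(5)])
# [(0, 0), (0, 1), (1, 0), (1, 1), (1, 2)]
```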

poolparty.repeat(pool, times, prefix=None, iter_order=None, cards=None)[source]

Repeat a pool’s states a specified number of times.

Parameters:
  • pool (TypeVar(T, bound= Pool)) – The Pool whose states are to be repeated.

  • times (Integral) – The number of times to repeat the pool’s states. Must be >= 1.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'repeat_index'.

Returns:

A new Pool with num_states equal to times × the input pool’s num_states.

Return type:

TypeVar(T, bound= Pool)

Raises:

ValueError – If times is less than 1.
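
As a rough model of repetition, each repeated state can be identified by a repeat index and a parent state. The sketch assumes the repeats are laid out one full pass after another, which the documentation does not state explicitly; repeat_state is a hypothetical helper.

```python
def repeat_state(parent_num_states, times, state):
    """Map a repeated state index to (repeat_index, parent_state).

    Hypothetical helper for illustration only; not part of PoolParty.
    """
    if times < 1:
        raise ValueError("times must be >= 1")
    if not 0 <= state < parent_num_states * times:
        raise IndexError("state out of range")
    # Assumed layout: pass 0 over all parent states, then pass 1, and so on.
    return divmod(state, parent_num_states)


print([repeat_state(3, 2, s) for s in range(6)])
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
```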

poolparty.state_slice(pool, key, prefix=None, iter_order=None)[source]

Create a Pool containing a slice of states from the input Pool.

Parameters:
  • pool (TypeVar(T, bound= Pool)) – The Pool whose states will be sliced.

  • key (Union[Integral, slice]) – Integer index or slice specifying which states to include from the input Pool.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

Returns:

A Pool containing states selected by applying the provided index or slice to the input Pool’s state space.

Return type:

TypeVar(T, bound= Pool)
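
Slicing states (rather than sequence characters) behaves much like indexing a range of state numbers. The sketch below assumes standard Python index and slice semantics, including negative indices; slice_states is a hypothetical helper, not a PoolParty function.

```python
def slice_states(num_states, key):
    """Return the parent state indices selected by an int or slice key.

    Hypothetical helper for illustration only; not part of PoolParty.
    """
    states = range(num_states)
    if isinstance(key, slice):
        return list(states[key])
    return [states[key]]  # an integer key selects a single state


print(slice_states(10, slice(2, 8, 3)))  # [2, 5]
print(slice_states(10, -1))              # [9]
```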

poolparty.state_shuffle(pool, seed=None, permutation=None, prefix=None, iter_order=None)[source]

Create a Pool with randomly permuted states from the input Pool.

Parameters:
  • pool (TypeVar(T, bound= Pool)) – The Pool whose states will be shuffled.

  • seed (Optional[Integral]) – Random seed for deterministic shuffling. If None, a random seed is generated.

  • permutation (Optional[Sequence[Integral]]) – Custom permutation to use. If provided, seed must not be specified.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

Returns:

A Pool containing the same states as the input but in a randomly permuted order.

Return type:

TypeVar(T, bound= Pool)
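
The relationship between seed and permutation can be sketched in plain Python. This is an approximation under stated assumptions: shuffle_order is a hypothetical helper, it uses the standard library's random.Random rather than whatever generator PoolParty uses, and the validity check on permutation is an assumption.

```python
import random


def shuffle_order(num_states, seed=None, permutation=None):
    """Return the order in which parent states are visited.

    Hypothetical helper for illustration only; not part of PoolParty.
    """
    if permutation is not None:
        if seed is not None:
            raise ValueError("seed must not be given with permutation")
        if sorted(permutation) != list(range(num_states)):
            raise ValueError("permutation must reorder 0..num_states-1")
        return list(permutation)
    order = list(range(num_states))
    random.Random(seed).shuffle(order)  # same seed gives the same order
    return order


assert shuffle_order(5, seed=0) == shuffle_order(5, seed=0)
print(shuffle_order(3, permutation=[2, 0, 1]))  # [2, 0, 1]
```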

poolparty.sample(pool, num_seqs=None, seq_states=None, seed=None, with_replacement=True, prefix=None, iter_order=None)[source]

Sample states from a pool.

Parameters:
  • pool (TypeVar(T, bound= Pool)) – The Pool to sample states from.

  • num_seqs (Optional[Integral]) – Number of states to sample randomly. Mutually exclusive with seq_states.

  • seq_states (Optional[Sequence[Integral]]) – Explicit list of state indices to select. Mutually exclusive with num_seqs.

  • seed (Optional[Integral]) – Random seed for deterministic sampling. Only used with num_seqs.

  • with_replacement (bool) – Whether to sample with replacement. If False, num_seqs must be <= pool.num_states.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

Returns:

A Pool containing the sampled states from the input Pool.

Return type:

TypeVar(T, bound= Pool)

Raises:

ValueError – If both num_seqs and seq_states are provided, if neither is provided, or if with_replacement is False and num_seqs exceeds the pool’s state count.
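
The argument rules above can be summarised in a small sketch. It mirrors the documented checks only; sample_states is a hypothetical helper, and the use of the standard library's random.Random is an assumption about how sampling could be done, not a description of PoolParty's internals.

```python
import random


def sample_states(num_states, num_seqs=None, seq_states=None,
                  seed=None, with_replacement=True):
    """Pick state indices following the documented argument rules.

    Hypothetical helper for illustration only; not part of PoolParty.
    """
    # Exactly one of num_seqs and seq_states must be given.
    if (num_seqs is None) == (seq_states is None):
        raise ValueError("provide exactly one of num_seqs or seq_states")
    if seq_states is not None:
        return list(seq_states)  # explicit selection, seed is unused
    rng = random.Random(seed)
    if with_replacement:
        return [rng.randrange(num_states) for _ in range(num_seqs)]
    if num_seqs > num_states:
        raise ValueError("num_seqs exceeds the number of states")
    return rng.sample(range(num_states), num_seqs)


picked = sample_states(10, num_seqs=4, seed=1, with_replacement=False)
print(len(picked), len(set(picked)))  # 4 4
```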

poolparty.sync(pools)[source]

Synchronize multiple pools to iterate in lockstep (in-place).

Parameters:

pools (Sequence[Pool]) – Sequence of Pool objects to synchronize. All pools must have the same number of states.

Returns:

Pools are modified in-place; no new Pool is returned.

Return type:

None

Raises:

ValueError – If the input sequence is empty, if the pools have differing numbers of states, or if any pool is an ancestor of another (circular constraint).
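
Lockstep iteration means that state i of every synchronized pool is used together. The sketch below models pools as plain lists of sequences, which is a simplification; lockstep is a hypothetical helper, and the ancestor check mentioned above is not modelled.

```python
def lockstep(*pools):
    """Pair up the i-th state of every pool.

    Hypothetical helper for illustration only; pools are plain lists here.
    """
    if not pools:
        raise ValueError("at least one pool is required")
    if len({len(p) for p in pools}) != 1:
        raise ValueError("all pools must have the same number of states")
    return list(zip(*pools))


print(lockstep(["A", "C"], ["G", "T"]))  # [('A', 'G'), ('C', 'T')]
```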

ORF Operations

Codon-aware operations for protein-coding sequences.

poolparty.mutagenize_orf(pool, region=None, *, num_mutations=None, mutation_rate=None, mutation_type='missense_only_first', codon_positions=None, style=None, frame=None, prefix=None, mode='random', num_states=None, iter_order=None, cards=None)[source]

Apply codon-level mutations to an ORF sequence. Requires an active Party context.

Parameters:
  • pool (Union[Pool, str]) – Parent pool or sequence string to mutate.

  • region (str | Sequence[Integral] | None) – Region to mutate. Can be marker name (e.g., “orf”) or [start, stop]. If None, mutates the entire sequence.

  • num_mutations (Optional[Integral]) – Fixed number of codon mutations (mutually exclusive with mutation_rate).

  • mutation_rate (Optional[Real]) – Per-codon mutation probability (mutually exclusive with num_mutations).

  • mutation_type (str) – Type of mutation: ‘any_codon’, ‘nonsynonymous_first’, ‘nonsynonymous_random’, ‘missense_only_first’, ‘missense_only_random’, ‘synonymous’, ‘nonsense’.

  • codon_positions (Union[Sequence[Integral], slice, None]) – Eligible codon indices: None (all), list of indices, or slice.

  • style (Optional[str]) – Style to apply to mutated codon positions (e.g., ‘red’, ‘bold’).

  • frame (Optional[int]) – Reading frame and orientation. Valid values: +1, +2, +3, -1, -2, -3. Positive values indicate left-to-right orientation (5’->3’), negative values indicate right-to-left orientation (3’->5’). The absolute value indicates the frame of the boundary base (1-indexed). If None and region is a named OrfRegion, uses the OrfRegion’s frame.

  • prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.

  • mode (Literal['random', 'sequential', 'fixed']) – Selection mode: ‘random’ or ‘sequential’. Sequential requires num_mutations (not mutation_rate) and a uniform mutation_type (‘any_codon’, ‘nonsynonymous_first’, ‘missense_only_first’, or ‘nonsense’).

  • num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).

  • iter_order (Optional[Real]) – Iteration order priority for the Operation.

  • cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'codon_positions', 'wt_codons', 'mut_codons', 'wt_aas', 'mut_aas'.

Returns:

A Pool that generates codon-mutated sequences.

Return type:

Pool

Raises:

ValueError – If frame is None and region is a named plain Region (not OrfRegion), if mutation_rate is used with sequential mode, if mutation_type is non-uniform with sequential mode, or if num_mutations exceeds eligible codons.
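
The mutation_type names refer to standard categories of codon change. As a point of reference, the sketch below classifies a single codon substitution using the standard genetic code. It does not call PoolParty: CODON_TABLE and classify are local, hypothetical names, and the sketch does not attempt to explain the ‘_first’ and ‘_random’ suffixes.

```python
# Standard genetic code, with codons ordered by base T, C, A, G.
BASES_TCAG = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES_TCAG)
    for j, b in enumerate(BASES_TCAG)
    for k, c in enumerate(BASES_TCAG)
}


def classify(wt_codon, mut_codon):
    """Classify a codon substitution by its effect on the amino acid.

    Hypothetical helper for illustration only; not part of PoolParty.
    """
    wt_aa, mut_aa = CODON_TABLE[wt_codon], CODON_TABLE[mut_codon]
    if wt_aa == mut_aa:
        return "synonymous"
    if mut_aa == "*":
        return "nonsense"  # introduces a stop codon
    return "missense"      # changes one amino acid to another


print(classify("AAA", "AAG"))  # synonymous (K -> K)
print(classify("AAA", "GAA"))  # missense (K -> E)
print(classify("AAA", "TAA"))  # nonsense (K -> stop)
```

In these terms, a nonsynonymous change is either missense or nonsense.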

Library Generation

poolparty.generate_library(pool, num_cycles=1, num_seqs=None, seed=None, init_state=None, seqs_only=False, _include_inline_styles=False, discard_null_seqs=False, max_iterations=None, min_acceptance_rate=None, attempts_per_rate_assessment=100)[source]

Generate sequences from a pool.

Parameters:
  • pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool]) – The pool to generate sequences from.

  • num_cycles (Integral) – Number of complete iterations through all states.

  • num_seqs (Optional[Integral]) – Number of sequences to generate.

  • seed (Optional[Integral]) – Random seed for reproducibility.

  • init_state (Optional[int]) – Initial state to start generation from.

  • seqs_only (bool) – If True, return list of sequences instead of DataFrame.

  • discard_null_seqs (bool) – If True, discard sequences that fail filters (null sequences). With num_seqs, keeps sampling until N valid sequences are collected. With num_cycles, enumerates all states and returns only the valid ones (output may have fewer than num_cycles * num_states rows).

  • max_iterations (Optional[int]) – Maximum iterations before stopping. Default: state space size for sequential mode, or num_seqs * 100 for random mode.

  • min_acceptance_rate (Optional[float]) – Minimum fraction of sequences that must pass filters. If actual rate falls below this, generation stops with a warning.

  • attempts_per_rate_assessment (int) – Iterations between acceptance rate checks.

Returns:

A DataFrame with columns name, seq, plus any requested design card columns, or a list of sequences if seqs_only=True. Entries are None for null rows when discard_null_seqs=False.

Return type:

Union[DataFrame, list[str | None]]

Note

Design card columns are opt-in via the cards parameter on individual operations. Default output contains only ‘name’ and ‘seq’ columns.
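
The interaction between num_seqs, discard_null_seqs and max_iterations amounts to a bounded rejection loop. The sketch below is a simplified model, not PoolParty's implementation: collect_valid and draw are hypothetical names, null sequences are represented as None, and the default of num_seqs * 100 is taken from the max_iterations description for random mode.

```python
def collect_valid(draw, num_seqs, max_iterations=None):
    """Keep drawing until num_seqs non-null sequences are collected.

    Hypothetical helper for illustration only; not part of PoolParty.
    """
    if max_iterations is None:
        max_iterations = num_seqs * 100  # documented random-mode default
    kept = []
    for _ in range(max_iterations):
        seq = draw()
        if seq is None:
            continue  # null sequence: failed a filter, so discard it
        kept.append(seq)
        if len(kept) == num_seqs:
            break
    return kept


candidates = iter([None, "ACGT", None, "TTGA", "CCCC"])
print(collect_valid(lambda: next(candidates), 2))  # ['ACGT', 'TTGA']
```

If the iteration limit is reached first, the sketch returns fewer than num_seqs sequences.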

Utility Functions

poolparty.print_named_colors()[source]

Print all named colors (CSS + basic ANSI) each styled in that color.

Return type:

None

Constants

DNA Constants

poolparty.BASES

Standard DNA bases: ['A', 'C', 'G', 'T']

poolparty.COMPLEMENT

Complement mapping for DNA bases.

poolparty.IUPAC_TO_DNA

Mapping from IUPAC ambiguity codes to DNA bases.

poolparty.VALID_CHARS

Set of valid characters in DNA sequences.

poolparty.IGNORE_CHARS

Characters to ignore in DNA sequences (gaps, annotations).
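
The sketch below shows the kind of lookups that a complement mapping and an IUPAC mapping support. It defines its own local dictionaries, COMPLEMENT_SKETCH and IUPAC_SKETCH, as stand-ins; the exact types and contents of poolparty.COMPLEMENT and poolparty.IUPAC_TO_DNA are not shown on this page, so treat the structures here as assumptions.

```python
from itertools import product

# Local stand-ins; not the PoolParty constants themselves.
COMPLEMENT_SKETCH = {"A": "T", "C": "G", "G": "C", "T": "A"}
IUPAC_SKETCH = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def reverse_complement(seq):
    """Complement each base, then reverse the sequence."""
    return "".join(COMPLEMENT_SKETCH[base] for base in reversed(seq))


def expand_iupac(iupac_seq):
    """List every DNA sequence matched by an IUPAC pattern."""
    choices = [IUPAC_SKETCH[code] for code in iupac_seq]
    return ["".join(bases) for bases in product(*choices)]


print(reverse_complement("AAGC"))  # GCTT
print(expand_iupac("AR"))          # ['AA', 'AG']
```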

Operation Classes

These are the underlying operation classes. Most users will use the convenience functions above instead of instantiating these directly.

Base Operation Classes

class poolparty.FromSeqsOp(seqs, parent_pool=None, region=None, style=None, seq_names=None, prefix=None, mode='random', num_states=None, name=None, iter_order=None, cards=None, _factory_name=None)[source]

Bases: Operation

Create a pool from a list of sequences.

design_card_keys: Sequence[str] = ['seq_name', 'seq_index']
__init__(seqs, parent_pool=None, region=None, style=None, seq_names=None, prefix=None, mode='random', num_states=None, name=None, iter_order=None, cards=None, _factory_name=None)[source]

Initialize FromSeqsOp.

factory_name: str = 'from_seqs'
compute_name_contributions(global_state=None, max_global_state=None)[source]

Compute name contributions - explicit seq_names or prefix pattern.

Return type:

list[str]

class poolparty.FromIupacOp(iupac_seq, parent_pool=None, region=None, prefix=None, mode='random', num_states=None, name=None, iter_order=None, style=None, cards=None)[source]

Bases: Operation

Generate DNA sequences from IUPAC notation.

factory_name: str = 'from_iupac'
design_card_keys: Sequence[str] = ['iupac_state']
__init__(iupac_seq, parent_pool=None, region=None, prefix=None, mode='random', num_states=None, name=None, iter_order=None, style=None, cards=None)[source]

Initialize FromIupacOp.

class poolparty.FromMotifOp(prob_df, parent_pool=None, region=None, prefix=None, mode='random', num_states=None, name=None, iter_order=None, style=None, cards=None)[source]

Bases: Operation

Sample sequences from a position probability matrix.

factory_name: str = 'from_motif'
design_card_keys: Sequence[str] = ['prob_state']
__init__(prob_df, parent_pool=None, region=None, prefix=None, mode='random', num_states=None, name=None, iter_order=None, style=None, cards=None)[source]

Initialize FromMotifOp.

class poolparty.GetKmersOp(length, pool=None, region=None, style=None, case='upper', prefix=None, mode='random', num_states=None, name=None, iter_order=None, cards=None)[source]

Bases: Operation

Generate DNA k-mers.

factory_name: str = 'get_kmers'
design_card_keys: Sequence[str] = ['kmer_index', 'kmer']
__init__(length, pool=None, region=None, style=None, case='upper', prefix=None, mode='random', num_states=None, name=None, iter_order=None, cards=None)[source]

Initialize GetKmersOp.

class poolparty.MutagenizeOp(pool, num_mutations=None, mutation_rate=None, allowed_chars=None, region=None, style=None, prefix=None, mode='random', num_states=None, name=None, iter_order=None, _remove_tags=False, cards=None, _factory_name='mutagenize')[source]

Bases: Operation

Apply mutations to a parent sequence or a specified region within it.

Supports two mutation modes:

  • num_mutations: Apply exactly this many mutations to each sequence.

  • mutation_rate: Apply a random number of mutations based on a binomial distribution.

Exactly one of num_mutations or mutation_rate must be provided. Sequential mode is only available when num_mutations is specified.
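
The two modes differ in how the mutated positions are chosen. The following sketch is a plain-Python approximation: pick_positions is a hypothetical helper, and drawing each position independently with probability mutation_rate is one standard way to obtain a binomially distributed mutation count, not necessarily the method used by MutagenizeOp.

```python
import random


def pick_positions(seq_len, rng, num_mutations=None, mutation_rate=None):
    """Choose which positions to mutate under either mode.

    Hypothetical helper for illustration only; not part of PoolParty.
    """
    # Exactly one of the two arguments must be provided.
    if (num_mutations is None) == (mutation_rate is None):
        raise ValueError("provide exactly one of num_mutations or mutation_rate")
    if num_mutations is not None:
        # Fixed count: sample that many distinct positions.
        return sorted(rng.sample(range(seq_len), num_mutations))
    # Rate: each position mutates independently, so the count is binomial.
    return [i for i in range(seq_len) if rng.random() < mutation_rate]


rng = random.Random(0)
print(len(pick_positions(10, rng, num_mutations=3)))  # 3
print(pick_positions(10, rng, mutation_rate=0.0))     # []
```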

design_card_keys: Sequence[str] = ['positions', 'wt_chars', 'mut_chars']
__init__(pool, num_mutations=None, mutation_rate=None, allowed_chars=None, region=None, style=None, prefix=None, mode='random', num_states=None, name=None, iter_order=None, _remove_tags=False, cards=None, _factory_name='mutagenize')[source]

Initialize MutagenizeOp.

factory_name: str = 'mutagenize'

class poolparty.SeqShuffleOp(parent_pool, region=None, shuffle_type='mono', spacer_str='', prefix=None, mode='random', num_states=None, name=None, iter_order=None, _remove_tags=False, style=None, cards=None, _factory_name=None)[source]

Bases: Operation

Randomly shuffle characters within a region of the parent sequence.

design_card_keys: Sequence[str] = ['permutation']
__init__(parent_pool, region=None, shuffle_type='mono', spacer_str='', prefix=None, mode='random', num_states=None, name=None, iter_order=None, _remove_tags=False, style=None, cards=None, _factory_name=None)[source]

Initialize SeqShuffleOp.

factory_name: str = 'shuffle_seq'

class poolparty.RecombineOp(parent_pool, sources, num_breakpoints=1, positions=None, region=None, styles=None, style_by='order', prefix=None, mode='random', num_states=None, name=None, iter_order=None, cards=None, _factory_name='recombine')[source]

Bases: Operation

Recombine segments from multiple source pools at specified breakpoints.

In sequential mode, enumerates all breakpoint positions × pool assignment combinations. In random mode, randomly selects breakpoints and pool assignments.
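
The sequential state space described above can be approximated as the product of breakpoint choices and source assignments. The sketch makes several assumptions that this page does not confirm: sources are equal-length strings, any source may be assigned to any segment (including the same source on both sides of a breakpoint), and recombination_states and recombine are hypothetical helpers.

```python
from itertools import combinations, product


def recombination_states(positions, num_sources, num_breakpoints=1):
    """Enumerate (breakpoints, source assignment) pairs.

    Hypothetical helper for illustration only; not part of PoolParty.
    """
    states = []
    for breakpoints in combinations(sorted(positions), num_breakpoints):
        # One source index per segment; k breakpoints give k + 1 segments.
        for assignment in product(range(num_sources), repeat=num_breakpoints + 1):
            states.append((breakpoints, assignment))
    return states


def recombine(sources, breakpoints, assignment):
    """Build one recombinant by taking each segment from its assigned source."""
    edges = [0, *breakpoints, len(sources[0])]
    return "".join(
        sources[src][start:stop]
        for src, start, stop in zip(assignment, edges, edges[1:])
    )


print(len(recombination_states([2, 4], num_sources=2)))  # 8
print(recombine(["AAAAAA", "CCCCCC"], (2,), (0, 1)))     # AACCCC
```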

design_card_keys: Sequence[str] = ['breakpoints', 'pool_assignments']
__init__(parent_pool, sources, num_breakpoints=1, positions=None, region=None, styles=None, style_by='order', prefix=None, mode='random', num_states=None, name=None, iter_order=None, cards=None, _factory_name='recombine')[source]

Initialize RecombineOp.

factory_name: str = 'recombine'

Fixed Operation Classes

class poolparty.FixedOp(parent_pools, seq_from_seqs_fn, seq_length_from_pool_lengths_fn, region=None, remove_tags=None, spacer_str='', name=None, iter_order=None, prefix=None, _factory_name=None, _pass_through_styles=True, _style_combiner_fn=None)[source]

Bases: Operation

Fixed operation that applies a user-defined function to parent sequences.

design_card_keys: Sequence[str] = []
__init__(parent_pools, seq_from_seqs_fn, seq_length_from_pool_lengths_fn, region=None, remove_tags=None, spacer_str='', name=None, iter_order=None, prefix=None, _factory_name=None, _pass_through_styles=True, _style_combiner_fn=None)[source]

Initialize FixedOp.

factory_name: str = 'fixed'

class poolparty.StylizeOp(pool, style, region=None, which='contents', regex=None, name=None, iter_order=None, prefix=None)[source]

Bases: Operation

Apply inline styling to sequences without modification.

factory_name: str = 'stylize'
design_card_keys: list[str] = []
__init__(pool, style, region=None, which='contents', regex=None, name=None, iter_order=None, prefix=None)[source]

Initialize StylizeOp.

State Operation Classes

class poolparty.StackOp(parent_pools, prefix=None, name=None, iter_order=None, cards=None)[source]

Bases: Operation

Stack multiple pools sequentially (disjoint union).

factory_name: str = 'stack'
design_card_keys: Sequence[str] = ['active_parent']
__init__(parent_pools, prefix=None, name=None, iter_order=None, cards=None)[source]

Initialize StackOp.

build_pool_counter(parent_pools)[source]

Build pool state using st.stack (disjoint union).

Return type:

State

class poolparty.RepeatOp(pool, times, prefix=None, name=None, iter_order=None, cards=None)[source]

Bases: Operation

Repeat a pool’s states n times.

factory_name: str = 'repeat'
design_card_keys: Sequence[str] = ['repeat_index']
__init__(pool, times, prefix=None, name=None, iter_order=None, cards=None)[source]

Initialize RepeatOp.

class poolparty.StateSliceOp(parent_pool, start, stop, step, prefix=None, name=None, iter_order=None)[source]

Bases: Operation

Slice a pool’s states to select a subset.

factory_name: str = 'state_slice'
design_card_keys: Sequence[str] = []
__init__(parent_pool, start, stop, step, prefix=None, name=None, iter_order=None)[source]

Initialize StateSliceOp.

build_pool_counter(parent_pools)[source]

Build pool counter using st.slice.

Return type:

State

class poolparty.StateShuffleOp(parent_pool, seed=None, permutation=None, prefix=None, name=None, iter_order=None)[source]

Bases: Operation

Randomly permute a pool’s states.

factory_name: str = 'state_shuffle'
design_card_keys: Sequence[str] = []
__init__(parent_pool, seed=None, permutation=None, prefix=None, name=None, iter_order=None)[source]

Initialize StateShuffleOp.

build_pool_counter(parent_pools)[source]

Build pool counter using st.shuffle.

Return type:

State

class poolparty.SampleOp(parent_pool, num_seqs=None, seq_states=None, seed=None, with_replacement=True, prefix=None, name=None, iter_order=None)[source]

Bases: Operation

Sample states from a pool.

factory_name: str = 'sample'
design_card_keys: Sequence[str] = []
__init__(parent_pool, num_seqs=None, seq_states=None, seed=None, with_replacement=True, prefix=None, name=None, iter_order=None)[source]

Initialize SampleOp.

build_pool_counter(parent_pools)[source]

Build pool counter using st.sample.

Return type:

State

ORF Operation Classes

class poolparty.MutagenizeOrfOp(parent_pool, region=None, num_mutations=None, mutation_rate=None, mutation_type='missense_only_first', codon_positions=None, style=None, frame=1, prefix=None, mode='random', num_states=None, name=None, iter_order=None, cards=None)[source]

Bases: Operation

Apply codon-level mutations to an ORF sequence.

factory_name: str = 'mutagenize_orf'
design_card_keys: Sequence[str] = ['codon_positions', 'wt_codons', 'mut_codons', 'wt_aas', 'mut_aas']
__init__(parent_pool, region=None, num_mutations=None, mutation_rate=None, mutation_type='missense_only_first', codon_positions=None, style=None, frame=1, prefix=None, mode='random', num_states=None, name=None, iter_order=None, cards=None)[source]

Initialize MutagenizeOrfOp.