API Reference
This page provides complete API documentation for all public classes and functions in PoolParty, automatically generated from source code docstrings.
Core Classes
Pool
The main class for building and manipulating sequence pools.
- class poolparty.Pool(operation, name=None, state=None, iter_order=None, regions=None)[source]
Bases: CommonOpsMixin, ScanOpsMixin, GenericFixedOpsMixin, StateOpsMixin, RegionOpsMixin
Base pool class: a node in the computation DAG.
Pool provides generic operations that work on any sequence type. For DNA-specific operations, use DnaPool. For protein-specific operations, use ProteinPool.
- __init__(operation, name=None, state=None, iter_order=None, regions=None)[source]
Initialize Pool and build its state.
- has_region(name)[source]
Check if a region with the given name is present in this pool.
- Return type:
- copy(name=None)[source]
Create a copy of this pool with a copied operation.
The copied operation references the same parent_pools, so the copy represents a parallel branch in the computation graph that shares the same upstream DAG.
Must be called within an active Party context.
- deepcopy(name=None)[source]
Create a deep copy of this pool, recursively copying the entire upstream DAG.
Unlike copy(), this creates independent copies of all upstream pools and operations, resulting in a fully independent computation DAG.
Must be called within an active Party context.
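The difference between copy() and deepcopy() described above can be sketched with a toy graph. This is a minimal illustration in plain Python, not PoolParty code: the Node class and both helper functions are hypothetical stand-ins for pools and their parent links.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for a pool in a computation DAG."""
    name: str
    parents: list = field(default_factory=list)


def shallow_copy(node):
    # New node, same parent objects: a parallel branch on the shared upstream DAG.
    return Node(node.name + "_copy", list(node.parents))


def deep_copy(node, memo=None):
    # New node whose ancestors are all copied too; memo keeps shared ancestors shared.
    memo = {} if memo is None else memo
    if id(node) not in memo:
        parents = [deep_copy(parent, memo) for parent in node.parents]
        memo[id(node)] = Node(node.name + "_copy", parents)
    return memo[id(node)]


root = Node("root")
child = Node("child", [root])
shallow = shallow_copy(child)
deep = deep_copy(child)
```

Here shallow.parents[0] is the original root object, while deep.parents[0] is a separate node, which mirrors the "shared upstream DAG" versus "fully independent computation DAG" wording.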
- generate_library(num_cycles=1, num_seqs=None, seed=None, init_state=None, seqs_only=False, _include_inline_styles=False, discard_null_seqs=False, max_iterations=None, min_acceptance_rate=None, attempts_per_rate_assessment=100)[source]
Generate sequences from a pool.
Args:
- Returns:
name, seq, plus any requested design card columns. Or list of sequences if seqs_only=True. Entries are None for null rows when discard_null_seqs=False.
- Return type:
Note
Design card columns are opt-in via the cards parameter on individual operations. Default output contains only ‘name’ and ‘seq’ columns.
- print_library(num_seqs=None, num_cycles=None, show_header=True, show_state=False, show_name=True, show_seq=True, pad_names=True, seed=None, discard_null_seqs=False, max_iterations=None, min_acceptance_rate=None, attempts_per_rate_assessment=100)[source]
Print preview sequences from this pool; returns self for chaining.
- Parameters:
num_seqs (Optional[Integral]) – Number of sequences to generate.
num_cycles (Optional[Integral]) – Number of complete iterations through all states.
show_header (bool) – Whether to show the pool header line.
show_state (bool) – Whether to show the state column. Requires the pool to have been built with design cards that produce a state column; silently ignored otherwise.
show_name (bool) – Whether to show the name column.
show_seq (bool) – Whether to show the seq column.
pad_names (bool) – Whether to pad names to align sequences.
seed (Optional[Integral]) – Random seed for reproducibility.
discard_null_seqs (bool) – If True, only show valid (non-null) sequences.
max_iterations (Optional[int]) – Maximum iterations before stopping.
min_acceptance_rate (Optional[float]) – Minimum fraction of sequences that must pass.
attempts_per_rate_assessment (int) – Iterations between acceptance rate checks.
- Return type:
Self
Party
Context manager for PoolParty sessions.
- class poolparty.Party(genetic_code='standard')[source]
Bases: object
Context manager for building and executing sequence libraries.
- property codon_table: CodonTable
Access the CodonTable for ORF operations.
- set_genetic_code(genetic_code)[source]
Set or change the genetic code used for ORF operations.
- Return type:
- get_effective_seq_length(seq)[source]
Get effective sequence length (DNA characters only, excluding markers).
- Return type:
- get_length_without_tags(seq)[source]
Get sequence length excluding only region tags (includes all chars).
- Return type:
- get_molecular_positions(seq)[source]
Get raw string positions of valid DNA characters, excluding marker interiors.
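The three length and position helpers above differ only in what they count. The sketch below is a plain-Python illustration, not the library's implementation. It assumes tags look like `<name>`, `</name>`, or `<name/>`, that "valid DNA characters" means A, C, G, T in either case, and that "marker interiors" refers to the text of the tags themselves; all three are assumptions.

```python
import re

# Assumed tag grammar: <name>, </name>, or <name/>.
TAG = re.compile(r"</?\w+\s*/?>")
DNA = set("ACGTacgt")


def length_without_tags(seq):
    # Counts every character except the tag text itself.
    return len(TAG.sub("", seq))


def effective_length(seq):
    # Counts only DNA characters outside tags (gaps and other symbols are skipped).
    return sum(ch in DNA for ch in TAG.sub("", seq))


def molecular_positions(seq):
    # Raw string indices of DNA characters, skipping letters that belong to tags.
    in_tag = set()
    for match in TAG.finditer(seq):
        in_tag.update(range(match.start(), match.end()))
    return [i for i, ch in enumerate(seq) if ch in DNA and i not in in_tag]


example = "AA<tag>AT-G</tag>T"
```

The tag name "tag" is chosen on purpose: its letters t, a, and g would be miscounted as bases if tag text were not excluded.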
- __exit__(exc_type, exc_val, exc_tb)[source]
Exit the Party context, restoring the previous party.
- Return type:
- get_pool_by_id(id_)[source]
Get a pool by its ID.
- Return type:
Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool]
- get_pool_by_name(name)[source]
Get a pool by its name.
- Return type:
Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool]
- get_op_by_name(name)[source]
Get an operation by its name.
- Return type:
poolparty.operation.Operation
- register_region(name, seq_length)[source]
Register a region with this party.
If a region with the same name already exists:
- If it has the same seq_length, return the existing region.
- If it has a different seq_length, raise ValueError.
- Parameters:
- Returns:
The registered region (existing or newly created).
- Return type:
- Raises:
ValueError – If a region with the same name but different seq_length exists.
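The register-or-return rule for register_region can be sketched as follows. RegionRegistry is a hypothetical class written for illustration and is not the Party implementation; regions are represented as plain dicts here.

```python
class RegionRegistry:
    """Hypothetical registry illustrating the rule; not PoolParty's Party class."""

    def __init__(self):
        self._regions = {}

    def register_region(self, name, seq_length):
        existing = self._regions.get(name)
        if existing is None:
            # First registration under this name: create and store the region.
            existing = self._regions[name] = {"name": name, "seq_length": seq_length}
        elif existing["seq_length"] != seq_length:
            # Same name, conflicting length: refuse rather than silently overwrite.
            raise ValueError(
                f"region {name!r} already registered with "
                f"seq_length={existing['seq_length']}"
            )
        return existing


registry = RegionRegistry()
first = registry.register_region("orf", 6)
again = registry.register_region("orf", 6)  # same length: existing region returned
```

Registering "orf" a third time with a different seq_length would raise ValueError, so a name maps to a single length for the lifetime of the registry.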
- register_orf_region(name, seq_length, frame=1)[source]
Register an ORF region with this party.
If a region with the same name already exists:
- If it’s an OrfRegion with same seq_length and frame, return it.
- Otherwise raise ValueError.
- upgrade_to_orf_region(name, frame=1)[source]
Upgrade an existing plain Region to an OrfRegion.
Only valid if the existing region is a plain Region (not already an OrfRegion).
- Parameters:
- Returns:
The upgraded ORF region.
- Return type:
OrfRegion
- Raises:
ValueError – If the region doesn’t exist or is already an OrfRegion.
- get_region(name)[source]
Get a registered region by name. Alias for get_region_by_name.
- Return type:
- clear_pools()[source]
Clear all pools, operations, and regions without resetting configuration or genetic code.
Unlike init(), this preserves:
- Configuration settings (_config)
- Genetic code settings (_codon_table)
- Return type:
- print_graph(style='clean')[source]
Print an ASCII tree visualization of the Pool-Operation computation graph.
Shows pools (places) with parentheses and operations (transitions) with brackets, similar to a Petri net diagram. Root pools (not consumed by other operations) are printed first, with their upstream DAGs.
- Parameters:
style (str) – Display style: 'clean' (default), 'minimal', or 'repr'.
'clean': Shows names with key attributes (e.g., (name) pool: n=num_states, [name] op: factory_name, mode, n=num_states).
'minimal': Shows just names (e.g., (name), [name]).
'repr': Shows full repr() of each object.
- Return type:
Operation
Abstract base class for all pool operations.
- class poolparty.Operation(parent_pools, num_states=1, mode='fixed', seq_length=None, name=None, iter_order=None, prefix=None, region=None, remove_tags=None, _natural_num_states=None, cards=None)[source]
Bases: object
Base class for all operations.
- classmethod validate_num_states(num_states, mode)[source]
Validate num_states against max_num_sequential_states.
- __init__(parent_pools, num_states=1, mode='fixed', seq_length=None, name=None, iter_order=None, prefix=None, region=None, remove_tags=None, _natural_num_states=None, cards=None)[source]
Initialize Operation.
- property natural_num_states: int | None
Natural number of states (computed from operation, before user override).
- property action_uniquely_determined_by_state: bool
True if same state value always produces the same output.
- property uses_custom_column_names: bool
True if this operation uses dict-style custom column names.
- build_pool_counter(parent_pools)[source]
Build the output Pool’s state from parent pool states.
- Return type:
- compute(parents, rng=None)[source]
Compute output Seq and design card with automatic region handling.
This is the public entry point for operations. It handles region extraction/reassembly automatically, then delegates to _compute_core().
- Parameters:
- Return type:
- Returns:
tuple[Seq, dict] – Output Seq (with string and style) and design card dict.
If region is specified:
1. Extracts region from parents[0] as a Seq
2. Calls _compute_core with modified parent list
3. Reassembles prefix + result + suffix using Seq.join
4. Removes region tags if remove_tags=True and region is a region name
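The four region-handling steps above amount to a split, transform, reassemble pattern. The sketch below shows that pattern on plain strings. It is an illustration under assumptions, not the compute() implementation: apply_to_region is a hypothetical helper, tags are assumed to be written as `<name>...</name>`, and real Seq objects also carry style information that is ignored here.

```python
def apply_to_region(seq, region, core, remove_tags=False):
    """Apply `core` to the content of a named region and reassemble the sequence."""
    open_tag, close_tag = f"<{region}>", f"</{region}>"
    start = seq.index(open_tag)
    end = seq.index(close_tag)

    # Step 1: extract the region content, keeping the flanking context.
    prefix = seq[:start]
    content = seq[start + len(open_tag):end]
    suffix = seq[end + len(close_tag):]

    # Step 2: run the core operation on the region content only.
    result = core(content)

    # Steps 3 and 4: reassemble, dropping the tags only if requested.
    if not remove_tags:
        result = open_tag + result + close_tag
    return prefix + result + suffix


kept = apply_to_region("AAA<orf>atg</orf>TTT", "orf", str.upper)
stripped = apply_to_region("AAA<orf>atg</orf>TTT", "orf", str.upper, remove_tags=True)
```

With this input, kept is 'AAA<orf>ATG</orf>TTT' and stripped is 'AAAATGTTT'; the flanking AAA and TTT are never passed to the core function.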
- compute_name_contributions(global_state=None, max_global_state=None)[source]
Compute this operation’s contributions to the final sequence name.
Returns list of name elements in the order they should appear. Default: [prefix_state.value] when active, [] otherwise. For stateless random operations, uses global_state if provided.
- copy(name=None)[source]
Create a copy of this operation with a new ID.
The copy references the same parent_pools but has its own Counter. Must be called within an active Party context.
Region
Represents a tagged region within a sequence.
- class poolparty.Region(name, seq_length, _id=-1)[source]
Bases: object
Represents a registered region in a poolparty Party.
Regions identify sections of sequences for later modification. Each region has a name and a seq_length that specifies the expected length of content within the region tags.
- seq_length
The expected length of content within the region:
- None: Variable-length region (content length not fixed)
- 0: Zero-length region (insertion point, <name/>)
- >0: Fixed-length region (content must be this length)
- Type:
Optional[int]
- __init__(name, seq_length, _id=-1)
Initialization Functions
- poolparty.init(genetic_code='standard', log_level=None)[source]
Initialize (or reset) the default Party, clearing all registered pools/operations/regions.
- poolparty.get_active_party()[source]
Get the currently active Party context, or None if not in a context.
- poolparty.clear_pools()[source]
Clear all pools, operations, and regions from the active Party without resetting configuration or genetic code.
- Return type:
- poolparty.configure_logging(level='WARNING', format='%(levelname)s - %(name)s - %(message)s', handler=None)[source]
Configure logging for poolparty and statetracker.
Base Operations
Functions for creating and transforming sequence pools.
Sequence Creation
- poolparty.from_seq(seq, pool=None, region=None, remove_tags=None, style=None, iter_order=None, prefix=None, _factory_name=None)[source]
Create a Pool containing a single, fixed sequence.
If pool and region are provided, the sequence replaces the region content in pool. Otherwise, creates a standalone pool with the sequence.
- Parameters:
seq (str) – The sequence to include in the pool (or to insert at region).
pool (Union[Pool, str, None]) – Pool or sequence. If provided with region, seq replaces the region.
region (str | Sequence[Integral] | None) – Region to replace in pool. Can be marker name (str) or [start, stop].
remove_tags (Optional[bool]) – If True and region is a marker name, remove marker tags from output.
style (Optional[str]) – Style to apply to the sequence (e.g., ‘red’, ‘blue bold’).
iter_order (Optional[Real]) – Iteration order priority for the Operation.
prefix (Optional[str]) – Prefix for auto-generated sequence names.
- Returns:
A Pool object yielding the provided sequence (or pool with its region replaced).
- Return type:
DnaPool
- poolparty.from_seqs(seqs, pool=None, region=None, style=None, seq_names=None, prefix=None, mode='random', num_states=None, iter_order=None, cards=None, _factory_name=None)[source]
Create a Pool containing the specified sequences.
- Parameters:
seqs (Sequence[str]) – Sequence of string sequences to include in the pool.
pool (Union[Pool, str, None]) – Background pool or sequence. If provided with region, selected sequence replaces the region content.
region (str | Sequence[Integral] | None) – Region to replace in pool. Can be a marker name or [start, stop] interval. Required if pool is provided.
seq_names (Optional[Sequence[str]]) – Explicit names for each sequence. If provided, these are used directly.
prefix (Optional[str]) – Prefix for auto-generated names (e.g., ‘seq_’ produces ‘seq_0’, ‘seq_1’, …). Cannot be used together with seq_names.
mode (Literal['random', 'sequential', 'fixed']) – Sequence selection mode: ‘sequential’ or ‘random’.
num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).
style (Optional[str]) – Style to apply to output sequences (e.g., ‘red’, ‘blue bold’).
iter_order (Optional[Real]) – Iteration order priority for the Operation.
cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'seq_name', 'seq_index'.
- Returns:
A Pool object yielding the provided sequences using the specified selection mode.
- Return type:
DnaPool
- Raises:
TypeError – If seqs is a bare string instead of a list of strings.
ValueError – If pool is provided without region.
- poolparty.from_fasta(fasta_path, coordinates, pool=None, region=None, remove_tags=None, iter_order=None, prefix=None, style=None, cards=None)[source]
Extract genomic region(s) from a FASTA file and create a Pool.
- Parameters:
fasta_path (str) – Path to the FASTA file (will be indexed with pyfaidx).
coordinates (Union[tuple[str, int, int, Literal['+', '-']], Sequence[tuple[str, int, int, Literal['+', '-']]]]) – Single coordinate as (chrom, start, stop, strand) or list of such tuples. Coordinates are 0-based [start, stop). If strand=’-’, sequence is reverse complemented. For circular genomes, start > stop indicates wrap-around.
pool (Union[Pool, str, None]) – Background pool or sequence. If provided with region, extracted sequence(s) replace the region content.
region (str | Sequence[Integral] | None) – Region to replace in pool. Can be a marker name or [start, stop] interval. Required if pool is provided.
remove_tags (Optional[bool]) – If True and region is a marker name, remove marker tags from the output. Only relevant in single-coordinate mode (has no effect in batch mode).
iter_order (Optional[Real]) – Iteration order priority for the Operation (batch mode only).
prefix (Optional[str]) – Prefix for sequence names. Names are “{prefix}_{chrom}:{start}-{stop}({strand})” or “{chrom}:{start}-{stop}({strand})” if no prefix.
style (Optional[str]) – Style to apply to extracted sequences (e.g., ‘red’, ‘blue bold’).
cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys (batch mode only): 'seq_name', 'seq_index'. Ignored in single-coordinate mode.
- Returns:
A Pool yielding the extracted genomic sequence(s).
- Return type:
DnaPool
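The coordinate convention documented for from_fasta (0-based, half-open, reverse complement on the minus strand, wrap-around when start > stop) can be sketched on an in-memory string. This is a plain-Python illustration and does not read FASTA files or call pyfaidx. The extract and coord_name helpers and the explicit circular flag are hypothetical; how the library decides that a genome is circular is not stated above.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract(chrom_seq, start, stop, strand="+", circular=False):
    """Return chrom_seq[start:stop) using 0-based, half-open coordinates."""
    if start <= stop:
        sub = chrom_seq[start:stop]
    elif circular:
        # Wrap-around: read to the end of the sequence, then continue from position 0.
        sub = chrom_seq[start:] + chrom_seq[:stop]
    else:
        raise ValueError("start > stop is only meaningful for a circular sequence")
    if strand == "-":
        # Minus strand: complement each base, then reverse.
        sub = sub.translate(_COMPLEMENT)[::-1]
    return sub


def coord_name(chrom, start, stop, strand, prefix=None):
    # Mirrors the documented name pattern, with or without a prefix.
    base = f"{chrom}:{start}-{stop}({strand})"
    return f"{prefix}_{base}" if prefix else base


genome = "AACCGGTT"
forward = extract(genome, 2, 6)                  # 'CCGG'
reverse = extract(genome, 0, 3, "-")             # 'GTT'
wrapped = extract(genome, 6, 2, circular=True)   # 'TTAA'
```

Because stop is exclusive, the length of a non-wrapping extract is always stop - start.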
- poolparty.from_iupac(iupac_seq, pool=None, region=None, prefix=None, mode='random', num_states=None, iter_order=None, style=None, cards=None)[source]
Create a Pool that generates DNA sequences from IUPAC notation.
- Parameters:
iupac_seq (str) – IUPAC sequence string (e.g., ‘RN’ for purine + any base). Valid characters: A, C, G, T, U, R, Y, S, W, K, M, B, D, H, V, N.
pool (Union[Pool, str, None]) – Background pool or sequence. If provided with region, generated sequence replaces the region content.
region (str | Sequence[Integral] | None) – Region to replace in pool. Can be a marker name or [start, stop] interval. Required if pool is provided.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
mode (Literal['random', 'sequential', 'fixed']) – Sequence selection mode: ‘sequential’ or ‘random’.
num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).
iter_order (Optional[Real]) – Iteration order priority for the Operation.
style (Optional[str]) – Style to apply to generated sequences (e.g., ‘red’, ‘blue bold’).
cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'iupac_state'.
- Returns:
A Pool yielding DNA sequences from the IUPAC pattern.
- Return type:
DnaPool
- Raises:
ValueError – If pool is provided without region.
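To make the ‘RN’ example concrete, the sketch below enumerates every concrete sequence an IUPAC pattern stands for, using the standard IUPAC nucleotide code table. It is plain Python, not the from_iupac implementation. The enumeration order and the choice to expand U to T are assumptions of this sketch.

```python
from itertools import product

# Standard IUPAC nucleotide codes; mapping U to T is an assumption of this sketch.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_iupac(pattern):
    """List every concrete DNA sequence matching an IUPAC pattern."""
    choices = (IUPAC[code] for code in pattern.upper())
    return ["".join(bases) for bases in product(*choices)]


variants = expand_iupac("RN")
```

‘RN’ expands to 2 x 4 = 8 sequences: AA, AC, AG, AT, GA, GC, GG, GT. The count grows as the product of the per-position choices, so long degenerate patterns are usually sampled rather than enumerated.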
- poolparty.from_motif(prob_df, pool=None, region=None, prefix=None, mode='random', num_states=None, iter_order=None, style=None, cards=None)[source]
Create a Pool that samples sequences from a position probability matrix.
- Parameters:
prob_df (DataFrame) – DataFrame with probability values for each position. Columns should be alphabet characters (e.g., ‘A’, ‘C’, ‘G’, ‘T’). Rows represent positions. Values are probabilities (auto-normalized).
pool (Union[Pool, str, None]) – Background pool or sequence. If provided with region, generated sequence replaces the region content.
region (str | Sequence[Integral] | None) – Region to replace in pool. Can be a marker name or [start, stop] interval. Required if pool is provided.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
mode (Literal['random', 'sequential', 'fixed']) – Sequence selection mode: ‘random’.
num_states (Optional[Integral]) – Number of states for random mode. If None, defaults to 1 (pure random sampling).
iter_order (Optional[Real]) – Iteration order priority for the Operation.
style (Optional[str]) – Style to apply to generated sequences (e.g., ‘red’, ‘blue bold’).
cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'prob_state'.
- Returns:
A Pool yielding sequences sampled from the probability matrix.
- Return type:
DnaPool
- Raises:
ValueError – If pool is provided without region.
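Sampling from a position probability matrix with auto-normalization can be sketched without pandas. The version below uses a list of per-position weight rows in place of the DataFrame, which is a simplification for illustration; sample_from_matrix is a hypothetical helper and not the from_motif implementation.

```python
import random


def sample_from_matrix(rows, alphabet="ACGT", rng=None):
    """Draw one sequence; each row holds weights for one position, in alphabet order."""
    rng = rng or random.Random()
    sampled = []
    for weights in rows:
        total = sum(weights)
        # Auto-normalize so each row sums to 1 before sampling.
        probs = [weight / total for weight in weights]
        sampled.append(rng.choices(alphabet, weights=probs, k=1)[0])
    return "".join(sampled)


matrix = [
    [2, 0, 0, 0],   # position 0: always A (weights need not sum to 1)
    [0, 0, 0, 5],   # position 1: always T
    [1, 1, 1, 1],   # position 2: uniform over A, C, G, T
]
seq = sample_from_matrix(matrix, rng=random.Random(0))
```

The first two positions are deterministic because all of their weight sits on one base; only the third position varies between draws.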
- poolparty.get_kmers(length, pool=None, region=None, style=None, case='upper', prefix=None, mode='random', num_states=None, iter_order=None, cards=None)[source]
Create a Pool that generates DNA k-mers (all possible sequences of length k).
Must be called within a Party context.
- Parameters:
pool (Union[Pool, str, None]) – Pool or sequence. If provided with region, generated k-mer replaces the region content.
region (str | Sequence[Integral] | None) – Region to replace in pool. Can be a marker name or [start, stop] interval. Required if pool is provided.
length (Integral) – Length of k-mers to generate.
case (Literal['lower', 'upper']) – Case of output k-mers: ‘upper’ for uppercase, ‘lower’ for lowercase.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
mode (Literal['random', 'sequential', 'fixed']) – Sequence selection mode: ‘sequential’ or ‘random’.
num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).
style (Optional[str]) – Style to apply to generated k-mers (e.g., ‘red’, ‘blue bold’).
iter_order (Optional[Real]) – Iteration order priority for the Operation.
cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'kmer_index', 'kmer'.
- Returns:
A Pool whose states yield DNA k-mers of the specified length.
- Return type:
DnaPool
- Raises:
RuntimeError – If called outside of a Party context.
ValueError – If pool is provided without region.
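The k-mer space and the "cycling if greater, clipping if less" behaviour of num_states in sequential mode can be sketched as follows. This is plain Python for illustration, not the get_kmers implementation; the lexicographic ordering and both helper names are assumptions.

```python
from itertools import product


def kmers(length, case="upper"):
    """All 4**length DNA k-mers, in lexicographic order (ordering is an assumption)."""
    alphabet = "ACGT" if case == "upper" else "acgt"
    return ["".join(bases) for bases in product(alphabet, repeat=length)]


def sequential_states(items, num_states=None):
    """Visit items in order; cycle if num_states is larger, clip if smaller."""
    count = len(items) if num_states is None else num_states
    return [items[i % len(items)] for i in range(count)]


dimers = kmers(2)                               # 16 sequences: AA, AC, AG, ...
cycled = sequential_states(kmers(1), 6)         # ['A', 'C', 'G', 'T', 'A', 'C']
clipped = sequential_states(kmers(1), 2)        # ['A', 'C']
```

Since the natural state count is 4**length, it grows quickly with k; an explicit num_states is one way to bound the number of sequential states.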
Sequence Transformation
- poolparty.mutagenize(pool, region=None, num_mutations=None, mutation_rate=None, allowed_chars=None, style=None, prefix=None, mode='random', num_states=None, iter_order=None, _remove_tags=False, cards=None, _factory_name='mutagenize')[source]
Create a Pool that applies mutations to a sequence.
- Parameters:
pool (Union[Pool, str]) – Parent pool or sequence string to mutate.
region (str | Sequence[Integral] | None) – Region to mutagenize. Can be a marker name (str), explicit interval [start, stop], or None to mutagenize entire sequence. Positions are region-relative.
num_mutations (Optional[Integral]) – Fixed number of mutations to apply (mutually exclusive with mutation_rate).
mutation_rate (Optional[Real]) – Probability of mutation at each position (mutually exclusive with num_mutations).
allowed_chars (Optional[str]) – IUPAC string of same length as sequence, specifying allowed bases at each position. Each character is an IUPAC code (A, C, G, T, R, Y, S, W, K, M, B, D, H, V, N). Positions where only the wild-type is allowed are treated as non-mutable.
style (Optional[str]) – Style to apply to mutated positions (e.g., ‘red’, ‘blue bold’).
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
mode (Literal['random', 'sequential', 'fixed']) – Selection mode: ‘random’ or ‘sequential’. Sequential only available with num_mutations.
num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).
iter_order (Optional[Real]) – Iteration order priority for the Operation.
cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'positions', 'wt_chars', 'mut_chars'.
- Returns:
A Pool that generates mutated sequences.
- Return type:
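The num_mutations mode can be pictured as choosing distinct positions and replacing each with a different base. The sketch below is a plain-Python illustration on uppercase DNA, not the mutagenize implementation. The mutate and count_variants helpers are hypothetical, and the variant count shown is simply the combinatorial size of the space, which may or may not match the library's computed state count.

```python
import random
from math import comb


def mutate(seq, num_mutations, rng=None, alphabet="ACGT"):
    """Substitute exactly num_mutations distinct positions with a non-wild-type base."""
    rng = rng or random.Random()
    positions = sorted(rng.sample(range(len(seq)), num_mutations))
    chars = list(seq)
    for pos in positions:
        # Exclude the wild-type base so every chosen position really changes.
        chars[pos] = rng.choice([base for base in alphabet if base != seq[pos]])
    return "".join(chars), positions


def count_variants(seq_length, num_mutations, alphabet_size=4):
    # Choose the positions, then one of (alphabet_size - 1) substitutions at each.
    return comb(seq_length, num_mutations) * (alphabet_size - 1) ** num_mutations


wild_type = "ACGTACGT"
mutant, positions = mutate(wild_type, 2, random.Random(1))
double_mutants = count_variants(len(wild_type), 2)   # 28 * 9 = 252
```

The positions list plays the role of the 'positions' design card key, and the characters at those indices in wild_type and mutant correspond to 'wt_chars' and 'mut_chars'.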
- poolparty.shuffle_seq(pool, region=None, shuffle_type='mono', prefix=None, mode='random', num_states=None, iter_order=None, _remove_tags=False, style=None, cards=None, _factory_name=None)[source]
Create a Pool that shuffles characters within a specified region.
- Parameters:
pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – Parent pool or sequence to shuffle.
region (str | Sequence[Integral] | None) – Region to shuffle. Can be a marker name (str), explicit interval [start, stop], or None to shuffle entire sequence.
shuffle_type (Literal['mono', 'dinuc']) – Type of shuffle to perform:
"mono": random permutation preserving mononucleotide composition.
"dinuc": Euler-path shuffle preserving dinucleotide frequencies. The first and last characters are always fixed (mathematical constraint of the Euler path algorithm).
mode (Literal['random', 'sequential', 'fixed']) – Shuffle mode: ‘random’. Sequential is not supported.
num_states (Optional[Integral]) – Number of states for random mode. If None, defaults to 1 (pure random sampling).
iter_order (Optional[Real]) – Iteration order priority for the Operation.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
style (Optional[str]) – Style to apply to shuffled characters (e.g., ‘red’, ‘blue bold’).
cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'permutation'.
- Returns:
A Pool that yields shuffled sequences.
- Return type:
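The two shuffle types differ in which statistic they hold fixed. The sketch below implements only the "mono" case and adds a helper that counts dinucleotides, so the invariant of the "dinuc" case can be checked on a hand-picked pair of sequences. It is plain Python for illustration; it does not implement the Euler-path algorithm, and both helper names are hypothetical.

```python
import random
from collections import Counter


def mono_shuffle(seq, rng=None):
    """Random permutation of the characters; base composition is unchanged."""
    rng = rng or random.Random()
    chars = list(seq)
    rng.shuffle(chars)
    return "".join(chars)


def dinuc_counts(seq):
    """Count overlapping dinucleotides, the statistic a 'dinuc' shuffle preserves."""
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


original = "ACAGA"
shuffled = mono_shuffle(original, random.Random(0))

# Hand-picked rearrangement with the same dinucleotides (AC, CA, AG, GA)
# and the same first and last character as the original.
dinuc_equivalent = "AGACA"
```

A mono shuffle always preserves Counter(original) but usually changes dinuc_counts(original). The hand-picked pair shows the stricter invariant, including the fixed first and last characters noted for "dinuc".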
- poolparty.recombine(pool=None, region=None, sources=(), num_breakpoints=1, positions=None, mode='random', num_states=None, prefix=None, styles=None, style_by='order', iter_order=None, cards=None, _factory_name='recombine')[source]
Create a Pool that recombines segments from multiple source pools at breakpoints.
- Parameters:
pool (Union[Pool, str, None]) – Parent pool for region-based recombination. If provided with region, the recombined sequences replace the region content.
region (str | Sequence[Integral] | None) – Region in pool where recombined sequences will be inserted. Region content is discarded (not used as a source pool).
sources (Sequence[Union[Pool, str]]) – Source pools for recombination. All must have the same seq_length.
num_breakpoints (Integral) – Number of recombination breakpoints. Must be <= seq_length - 1.
positions (Optional[Sequence[Integral]]) – Valid breakpoint positions. If None, defaults to range(seq_length - 1). Position i means “breakpoint after index i”.
mode (Literal['random', 'sequential', 'fixed']) – Selection mode: ‘random’ (random breakpoints and pool assignments) or ‘sequential’ (enumerate all combinations).
num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
styles (Optional[list[str]]) – List of styles to apply to segments. Both modes accept any non-empty list and cycle. Use empty string ‘’ for segments that shouldn’t have additional styling. Styles overlay on top of inherited source pool styles.
If style_by=’order’: cycles through styles for segments by position (e.g., with 2 styles and 5 segments: style[0], style[1], style[0], style[1], style[0]).
If style_by=’source’: cycles through styles based on source pool index (e.g., with 2 styles and 3 sources: source[0]->style[0], source[1]->style[1], source[2]->style[0]).
style_by (Literal['source', 'order']) – Determines how styles are assigned to segments:
'order': styles[i % len(styles)] applied to segment i (cycles by position).
'source': styles[j % len(styles)] applied to segments from sources[j] (cycles by source index).
iter_order (Optional[Real]) – Iteration order priority for the Operation.
cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'breakpoints', 'pool_assignments'.
- Returns:
A Pool that generates recombined sequences.
- Return type:
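The breakpoint convention ("position i means breakpoint after index i") and the two style_by rules can be sketched on plain strings. This is an illustration, not the recombine implementation: both helpers are hypothetical, the breakpoints and source assignments are passed in explicitly instead of being sampled, and styles are represented as plain labels.

```python
def recombine_strings(sources, breakpoints, assignments):
    """Stitch segments together; assignments[k] is the source index of segment k."""
    length = len(sources[0])
    if any(len(source) != length for source in sources):
        raise ValueError("all sources must have the same length")
    # A breakpoint after index i means the next segment starts at i + 1.
    cuts = [0] + [b + 1 for b in sorted(breakpoints)] + [length]
    return "".join(
        sources[src][start:stop]
        for src, start, stop in zip(assignments, cuts, cuts[1:])
    )


def segment_styles(styles, assignments, style_by="order"):
    """Pick a style per segment by segment position or by source index."""
    keys = range(len(assignments)) if style_by == "order" else assignments
    return [styles[key % len(styles)] for key in keys]


chimera = recombine_strings(["AAAAAA", "CCCCCC"], breakpoints=[2], assignments=[0, 1])
by_order = segment_styles(["red", "blue"], [0, 1, 0, 1, 0], style_by="order")
by_source = segment_styles(["red", "blue"], [0, 1, 2], style_by="source")
```

Here chimera is 'AAACCC': one breakpoint after index 2 gives two segments of three bases. The two style lists reproduce the cycling examples given for styles above.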
- poolparty.join(pools, spacer_str='', iter_order=None, prefix=None, style=None, _factory_name=None)[source]
Concatenate multiple Pools or string sequences into a single Pool.
- Parameters:
pools (Sequence[Union[TypeVar(T, bound=Pool), str]]) – List of Pool objects and/or strings to be joined in order. Any provided string is automatically converted to a constant Pool.
spacer_str (str) – String to insert between joined sequences.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
style (Optional[str]) – Style to apply to the resulting concatenated sequences (e.g., ‘red’, ‘blue bold’).
- Returns:
A Pool whose states yield joined sequences from the specified inputs.
- Return type:
Fixed Operations
Operations that transform sequences without changing pool size.
- poolparty.rc(pool, region=None, remove_tags=None, iter_order=None, prefix=None, style=None)[source]
Create a Pool containing the reverse complement of sequences from the input pool.
Note: Region tags are not preserved in the output. If you need to preserve regions, use extract_region with rc=True instead.
- Parameters:
pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – Parent pool or sequence to reverse complement.
region (str | Sequence[Integral] | None) – Region to apply transformation to. Can be marker name (str), [start, stop], or None.
remove_tags (Optional[bool]) – If True and region is a marker name, remove marker tags from output.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
style (Optional[str]) – Style to apply to the resulting sequences (e.g., ‘red’, ‘blue bold’).
- Returns:
A Pool containing reverse-complemented sequences.
- Return type:
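For reference, reverse complementation itself is a two-step string operation: complement every base, then reverse. The sketch below works on tag-free strings and uses the standard IUPAC complement pairs. It is plain Python, not the rc implementation; whether the library complements ambiguity codes in the same way is an assumption.

```python
# Standard complement pairs, including IUPAC ambiguity codes, in both cases.
_BASES = "ACGTRYSWKMBDHVN"
_COMPS = "TGCAYRSWMKVHDBN"
_COMPLEMENT = str.maketrans(_BASES + _BASES.lower(), _COMPS + _COMPS.lower())


def reverse_complement(seq):
    """Complement each base, then reverse the string; case is preserved."""
    return seq.translate(_COMPLEMENT)[::-1]


simple = reverse_complement("ATGC")      # 'GCAT'
ambiguous = reverse_complement("AARN")   # 'NYTT'
```

Applying the function twice returns the original sequence, which is a convenient sanity check for any complement table.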
- poolparty.upper(pool, region=None, remove_tags=None, iter_order=None, prefix=None, style=None)[source]
Create a Pool containing uppercase sequences from the input pool.
Preserves XML marker tags exactly as they appear (only transforms non-marker characters).
- Parameters:
pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – Parent pool or sequence to convert to uppercase.
region (str | Sequence[Integral] | None) – Region to apply transformation to. Can be marker name (str), [start, stop], or None.
remove_tags (Optional[bool]) – If True and region is a marker name, remove marker tags from output.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
style (Optional[str]) – Style to apply to the resulting sequences (e.g., ‘red’, ‘blue bold’).
- Returns:
A Pool containing uppercase sequences.
- Return type:
- poolparty.lower(pool, region=None, remove_tags=None, iter_order=None, prefix=None, style=None)[source]
Create a Pool containing lowercase sequences from the input pool.
Preserves XML marker tags exactly as they appear (only transforms non-marker characters).
- Parameters:
pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – Parent pool or sequence to convert to lowercase.
region (str | Sequence[Integral] | None) – Region to apply transformation to. Can be marker name (str), [start, stop], or None.
remove_tags (Optional[bool]) – If True and region is a marker name, remove marker tags from output.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
style (Optional[str]) – Style to apply to the resulting sequences (e.g., ‘red’, ‘blue bold’).
- Returns:
A Pool containing lowercase sequences.
- Return type:
- poolparty.swapcase(pool, region=None, remove_tags=None, iter_order=None, prefix=None, style=None, _factory_name=None)[source]
Create a Pool containing case-swapped sequences from the input pool.
Preserves XML marker tags exactly as they appear (only transforms non-marker characters).
- Parameters:
pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – Parent pool or sequence to swap case.
region (str | Sequence[Integral] | None) – Region to apply transformation to. Can be marker name (str), [start, stop], or None.
remove_tags (Optional[bool]) – If True and region is a marker name, remove marker tags from output.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
style (Optional[str]) – Style to apply to the resulting sequences (e.g., ‘red’, ‘blue bold’).
- Returns:
A Pool containing case-swapped sequences.
- Return type:
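upper, lower, and swapcase all state that marker tags are preserved exactly and only non-marker characters are transformed. One way to get that behaviour is to split the string on tags and transform only the pieces in between. The sketch below is an illustration, not the library's code; the tag grammar and the helper name are assumptions.

```python
import re

# Assumed tag grammar: <name>, </name>, or <name/>. The capturing group makes
# re.split keep the tags in its output, at the odd indices.
_TAG = re.compile(r"(</?\w+\s*/?>)")


def transform_outside_tags(seq, func):
    """Apply func to sequence characters only, leaving tag text untouched."""
    parts = _TAG.split(seq)
    return "".join(part if i % 2 else func(part) for i, part in enumerate(parts))


uppered = transform_outside_tags("aa<orf>atg</orf>tt", str.upper)
swapped = transform_outside_tags("aA<orf>Gc</orf>", str.swapcase)
```

uppered is 'AA<orf>ATG</orf>TT' and swapped is 'Aa<orf>gC</orf>'. A naive str.upper over the whole string would also rename the tag to <ORF>, which is why tags are handled separately.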
- poolparty.slice_seq(pool, region=None, start=None, stop=None, step=None, keep_context=False, iter_order=None, prefix=None, style=None)[source]
Create a Pool containing sliced sequences from the input pool.
Extracts a subsequence based on region and/or Python-style slice parameters.
- Parameters:
pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – The Pool (or sequence string) whose sequences will be sliced.
region (str | Sequence[Integral] | None) – Region to slice from. Can be:
- str: Name of an annotated region (e.g., ‘orf’)
- Sequence[int]: [start, stop] interval in the sequence
- None: Use the full sequence
If only region is specified (no start/stop/step), returns just that region.
start (Optional[Integral]) – Start position for slicing (0-indexed, Python-style). Applied after region extraction if region is specified.
stop (Optional[Integral]) – Stop position for slicing (exclusive, Python-style). Applied after region extraction if region is specified.
step (Optional[Integral]) – Step for slicing (Python-style). Applied after region extraction if region is specified.
keep_context (bool) – If True, reassemble the sliced content back into the original sequence context (prefix + sliced_content + suffix). If False (default), return only the sliced content.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
style (Optional[str]) – Style to apply to the resulting sliced sequences (e.g., ‘red’, ‘blue bold’).
- Returns:
A Pool containing sliced sequences.
- Return type:
Examples
>>> import poolparty as pp
>>> with pp.Party():
...     # Slice positions 2-6 (stop is exclusive) from the full sequence
...     pool = pp.from_seq('ACGTACGT')
...     sliced = pp.slice_seq(pool, start=2, stop=6)
...     # Result: 'GTAC'
...
...     # Extract just a named region
...     pool = pp.from_seq('AAA<orf>ATGCCC</orf>TTT')
...     orf = pp.slice_seq(pool, region='orf')
...     # Result: 'ATGCCC'
...
...     # Slice within a named region
...     pool = pp.from_seq('AAA<orf>ATGCCC</orf>TTT')
...     sliced = pp.slice_seq(pool, region='orf', start=0, stop=3)
...     # Result: 'ATG'
...
...     # Slice with step (every other character)
...     pool = pp.from_seq('ABCDEFGH')
...     sliced = pp.slice_seq(pool, step=2)
...     # Result: 'ACEG'
...
...     # Use as a method on Pool objects
...     pool = pp.from_seq('ACGTACGT')
...     sliced = pool.slice_seq(start=0, stop=4)
...     # Result: 'ACGT'
...
...     # Keep context: reassemble into the original sequence
...     pool = pp.from_seq('AAA<orf>ATGCCC</orf>TTT')
...     sliced = pp.slice_seq(pool, region='orf', start=0, stop=3, keep_context=True)
...     # Result: 'AAAATGTTT' (prefix + sliced region + suffix)
- poolparty.clear_gaps(pool, region=None, remove_tags=None, iter_order=None, prefix=None)[source]
Create a Pool with all gap/non-molecular characters removed from sequences.
This removes everything that is NOT a valid molecular character (DNA or protein), including gaps ‘-’, dots ‘.’, spaces ‘ ’, and any other non-molecular characters.
Marker tags are preserved intact.
- Parameters:
pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – Parent pool or sequence to filter.
region (str | Sequence[Integral] | None) – Region to apply transformation to. Can be marker name (str), [start, stop], or None.
remove_tags (Optional[bool]) – If True and region is a marker name, remove marker tags from output.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
- Returns:
A Pool containing only molecular alphabet characters (markers preserved). Always has seq_length=None because output length depends on how many non-molecular characters each sequence contains.
- Return type:
- poolparty.clear_annotation(pool, region=None, remove_tags=None, iter_order=None, prefix=None)[source]
Create a Pool with all annotations cleared and sequences uppercased.
Removes all XML marker tags and non-molecular characters, then uppercases the result. When a region is specified, only transforms content within that region (nested markers and non-molecular chars inside are cleared).
- Parameters:
pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – Parent pool or sequence to transform.
region (str | Sequence[Integral] | None) – Region to apply transformation to. Can be marker name (str), [start, stop], or None.
remove_tags (Optional[bool]) – If True and region is a marker name, remove marker tags from output.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
- Returns:
A Pool with cleared annotations and uppercase sequences. Always has seq_length=None because output length depends on how many tags and non-molecular characters each sequence contains.
- Return type:
- poolparty.stylize(pool, region=None, *, style, which='contents', regex=None, iter_order=None, prefix=None)[source]
Apply inline styling to sequences without modifying them.
Styles are attached directly to sequences as they flow through the pool chain.
- Parameters:
pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – Parent pool or sequence to style.
region (str | Sequence[Integral] | None) – Region to restrict styling. Can be marker name or [start, stop]. If None, styles the entire sequence.
style (str) – Style spec string (e.g., ‘red bold’, ‘lower cyan’). Can include ‘upper’/‘lower’ for case transforms.
which (Literal['all', 'upper', 'lower', 'gap', 'tags', 'contents']) – Pattern selector: ‘all’, ‘upper’, ‘lower’, ‘gap’, ‘tags’, ‘contents’.
regex (Optional[str]) – Custom regex pattern. If specified, overrides which.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
- Returns:
A Pool with inline styling attached to sequences.
- Return type:
Scan Operations
Tiled mutagenesis operations that scan across sequence positions.
- poolparty.insertion_scan(pool, insertion_pool, positions=None, region=None, replace=False, style=None, prefix=None, prefix_position=None, prefix_insert=None, mode='random', num_states=None, iter_order=None, cards=None, _factory_name='insertion_scan')[source]
Insert or replace a sequence at specified scanning positions.
- Parameters:
insertion_pool (Union[Pool, str]) – The pool or sequence string to be inserted.
positions (Sequence[Integral] | slice | None) – Positions for insertion/replacement (0-based). If None, all valid positions are used.
region (str | Sequence[Integral] | None) – Region to constrain the scan to. Can be a marker name or [start, stop] interval.
replace (bool) – If False, insert at position (output length = bg + ins). If True, replace content at position (output length = bg).
style (Optional[str]) – Style to apply to inserted content (e.g., ‘red’, ‘blue bold’).
prefix (Optional[str]) – Prefix for cartesian product index (e.g., ‘ins_’ produces ‘ins_0’, ‘ins_1’, …).
prefix_position (Optional[str]) – Prefix for position index (e.g., ‘pos_’ produces ‘pos_0’, ‘pos_1’, …).
prefix_insert (Optional[str]) – Prefix for insert index (e.g., ‘ins_’ produces ‘ins_0’, ‘ins_1’, …).
mode (Literal['random', 'sequential', 'fixed']) – Selection mode: ‘random’ or ‘sequential’.
num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).
iter_order (Optional[Real]) – Iteration order priority for the Operation.
cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'position_index', 'start', 'end', 'name', 'region_seq'.
- Returns:
A Pool yielding sequences with the insert placed at selected position(s).
- Return type:
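The difference between insert mode (output length = bg + ins) and replace mode (output length = bg) is easy to see on plain strings. The following is a sketch under stated assumptions, not the library's code: the helper scan_insert is hypothetical, insert mode is assumed to allow every gap from 0 to len(bg), and replace mode is assumed to allow only windows that fit entirely inside the background.

```python
def scan_insert(background, insert, positions=None, replace=False):
    """Hypothetical model: enumerate insert/replace variants on a plain string."""
    if replace:
        # Assumption: the replaced window must fit inside the background.
        valid = range(len(background) - len(insert) + 1)
    else:
        # Assumption: every gap, including both ends, is a valid insertion point.
        valid = range(len(background) + 1)
    if positions is not None:
        valid = [p for p in positions if p in valid]
    variants = []
    for pos in valid:
        end = pos + len(insert) if replace else pos
        variants.append(background[:pos] + insert + background[end:])
    return variants


inserted = scan_insert("ACGT", "NN")                 # each variant has length 6
replaced = scan_insert("ACGT", "NN", replace=True)   # each variant has length 4
print(inserted)
print(replaced)
```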
- poolparty.deletion_scan(pool, deletion_length, deletion_marker='-', positions=None, region=None, prefix=None, mode='random', num_states=None, style=None, iter_order=None, cards=None, _factory_name='deletion_scan')[source]
Scan a pool for all possible single deletions of a fixed length.
- Parameters:
deletion_length (Integral) – Number of characters to delete at each valid position.
deletion_marker (Optional[str]) – Character to insert at the deletion site. If None, segment is removed.
positions (Sequence[Integral] | slice | None) – Positions to consider for the start of the deletion (0-based, relative to region).
region (str | Sequence[Integral] | None) – Region to constrain the scan to. Can be a marker name or [start, stop] interval.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
mode (Literal['random', 'sequential', 'fixed']) – Selection mode: ‘random’ or ‘sequential’.
num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).
style (Optional[str]) – Style to apply to deletion gap characters (e.g., ‘gray’, ‘red bold’).
iter_order (Optional[Real]) – Iteration order priority for the Operation.
cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'position_index', 'start', 'end', 'name', 'region_seq'.
- Returns:
A Pool yielding sequences where a segment of the specified length is removed from the source at each allowed position, optionally with a marker inserted.
- Return type:
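As a rough illustration of the scan and of how num_states interacts with sequential mode, the sketch below enumerates deletions on a plain string and then cycles or clips the resulting list. Both helpers are hypothetical. In particular, repeating the marker once per deleted character is an assumption made here for illustration; the documentation above does not state how many marker characters are written.

```python
def deletion_variants(seq, deletion_length, deletion_marker="-"):
    """Hypothetical model: one variant per valid deletion start."""
    variants = []
    for start in range(len(seq) - deletion_length + 1):
        # Assumption: the marker is repeated once per deleted character.
        gap = "" if deletion_marker is None else deletion_marker * deletion_length
        variants.append(seq[:start] + gap + seq[start + deletion_length:])
    return variants


def sequential_states(variants, num_states=None):
    """Cycle if num_states exceeds the computed count, clip if it is smaller."""
    count = len(variants) if num_states is None else num_states
    return [variants[i % len(variants)] for i in range(count)]


marked = deletion_variants("ACGTAC", 2)
removed = deletion_variants("ACGTAC", 2, deletion_marker=None)
print(marked)
print(removed)
print(sequential_states(marked, num_states=7))
```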
- poolparty.replacement_scan(pool, replacement_pool, positions=None, region=None, style=None, prefix=None, prefix_position=None, prefix_insert=None, mode='random', num_states=None, iter_order=None, cards=None, _factory_name='replacement_scan')[source]
Replace a segment with insert at specified scanning positions.
Equivalent to insertion_scan(..., replace=True). See insertion_scan() for full parameter documentation.
- Return type:
- poolparty.shuffle_scan(pool, shuffle_length, positions=None, region=None, shuffle_type='mono', shuffles_per_position=1, prefix=None, prefix_position=None, prefix_shuffle=None, mode='random', num_states=None, style=None, iter_order=None, cards=None, _factory_name='shuffle_scan')[source]
Shuffle characters within a window at specified scanning positions.
- Parameters:
shuffle_length (Integral) – Length of the region to shuffle at each position.
positions (Sequence[Integral] | slice | None) – Positions to consider for the start of the shuffle region (0-based).
region (str | Sequence[Integral] | None) – Region to constrain the scan to. Can be a marker name or [start, stop] interval.
shuffle_type (Literal['mono', 'dinuc']) – Type of shuffle to perform:
- "mono": random permutation preserving mononucleotide composition.
- "dinuc": Euler-path shuffle preserving dinucleotide frequencies. The first and last characters of each window are fixed.
shuffles_per_position (Integral) – Number of shuffles to perform at each position.
prefix (Optional[str]) – Prefix for cartesian product index (e.g., ‘shuf’ produces ‘shuf_0’, ‘shuf_1’, …).
prefix_position (Optional[str]) – Prefix for position index (e.g., ‘pos’ produces ‘pos_0’, ‘pos_1’, …).
prefix_shuffle (Optional[str]) – Prefix for shuffle variant index (e.g., ‘var’ produces ‘var_0’, ‘var_1’, …).
mode (Literal['random', 'sequential', 'fixed']) – Selection mode: ‘random’ or ‘sequential’.
num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).
style (Optional[str]) – Style to apply to shuffled characters (e.g., ‘purple’, ‘red bold’).
iter_order (Optional[Real]) – Iteration order priority for the Operation.
cards (Optional[tuple[None | list[str] | dict[str, str], None | list[str] | dict[str, str]]]) – Design card keys as a 2-tuple (scan_cards, shuffle_cards). Scan keys: 'position_index', 'start', 'end', 'name', 'region_seq'. Shuffle keys: 'permutation'.
- Returns:
A Pool yielding sequences where a region of the specified length is shuffled at each allowed position.
- Return type:
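A "mono" shuffle permutes the characters inside the window and leaves everything outside it untouched, so the window's character composition is preserved. The sketch below shows that property on a plain string; shuffle_window is a hypothetical helper built on Python's random module, not the library's shuffling code, and the dinucleotide-preserving variant is not modeled.

```python
import random


def shuffle_window(seq, start, shuffle_length, rng):
    """Hypothetical model: permute one window, keep the flanks unchanged."""
    window = list(seq[start:start + shuffle_length])
    rng.shuffle(window)  # random permutation of the window's characters
    return seq[:start] + "".join(window) + seq[start + shuffle_length:]


rng = random.Random(0)  # seeded so the run is reproducible
seq = "AACCGGTT"
window_length = 4
variants = [
    shuffle_window(seq, start, window_length, rng)
    for start in range(len(seq) - window_length + 1)
]
for start, variant in enumerate(variants):
    print(start, variant)
```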
- poolparty.mutagenize_scan(pool, mutagenize_length, num_mutations=None, mutation_rate=None, positions=None, region=None, prefix=None, mode='random', num_states=None, style=None, iter_order=None, cards=None, _factory_name='mutagenize_scan')[source]
Apply mutagenesis within a window at specified scanning positions.
- Parameters:
mutagenize_length (Integral) – Length of the region to mutagenize at each position.
num_mutations (Optional[Integral]) – Fixed number of mutations to apply (mutually exclusive with mutation_rate).
mutation_rate (Optional[Real]) – Probability of mutation at each position (mutually exclusive with num_mutations).
positions (Sequence[Integral] | slice | None) – Positions to consider for the start of the mutagenize region (0-based). If None, all valid positions are used.
region (str | Sequence[Integral] | None) – Region to constrain the scan to. Can be a marker name or [start, stop] interval. If specified, positions are relative to the region start.
prefix (Union[str, Sequence[str], None]) – Prefix for sequence names. If a 2-tuple, first element is for scanning positions, second for mutagenization.
mode (Union[Literal['random', 'sequential', 'fixed'], tuple[Literal['random', 'sequential', 'fixed'], Literal['random', 'sequential', 'fixed']]]) – Selection mode: ‘random’ or ‘sequential’. A scalar value is broadcast to both scan and mutagenize sub-operations. If a 2-tuple, first element is for scanning positions, second for mutagenization.
num_states (Union[Integral, Sequence[Optional[Integral]], None]) – Number of states. A scalar value is broadcast to both sub-operations. If a 2-tuple, first element is for scanning positions, second for mutagenization. For each element: None means auto-compute in sequential mode (enumerate all variants) or 1 in random mode (pure random sampling). Example: num_states=(3, None) with mode=("random", "sequential") picks 3 random scan positions and enumerates all mutation variants at each.
style (Optional[str]) – Style to apply to mutated characters (e.g., ‘red’, ‘blue bold’).
iter_order (Union[Real, Sequence[Real], None]) – Iteration order priority for the Operation. If a 2-tuple, first element is for scanning positions, second for mutagenization.
cards (Optional[tuple[None | list[str] | dict[str, str], None | list[str] | dict[str, str]]]) – Design card keys as a 2-tuple (scan_cards, mutagenize_cards). Scan keys: 'position_index', 'start', 'end', 'name', 'region_seq'. Mutagenize keys: 'positions', 'wt_chars', 'mut_chars'.
- Returns:
A Pool yielding sequences where a region of the specified length is mutagenized at each allowed position.
- Return type:
- poolparty.subseq_scan(pool, subseq_length, positions=None, region=None, prefix=None, mode='random', num_states=None, iter_order=None, cards=None, _factory_name='subseq_scan')[source]
Extract subsequences of a specified length at scanning positions.
Scans a region across the pool and extracts the region content, returning subsequences at each valid position.
- Parameters:
subseq_length (Integral) – Length of subsequence to extract at each position.
positions (Sequence[Integral] | slice | None) – Positions to consider for the start of extraction (0-based). If None, all valid positions are used.
region (str | Sequence[Integral] | None) – Region to constrain the scan to. Can be a marker name or [start, stop] interval. If specified, positions are relative to the region start.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
mode (Literal['random', 'sequential', 'fixed']) – Selection mode: ‘random’ or ‘sequential’.
num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).
iter_order (Optional[Real]) – Iteration order priority for the Operation.
cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'position_index', 'start', 'end', 'name', 'region_seq'.
- Returns:
A Pool yielding subsequences extracted at each allowed position.
- Return type:
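Conceptually this is a sliding window over the sequence. A minimal plain-string sketch follows, assuming that a "valid position" is any start where the full window fits; subseq_windows is a hypothetical helper and only list-style positions are handled.

```python
def subseq_windows(seq, subseq_length, positions=None):
    """Hypothetical model: (start, subsequence) pairs for each valid window."""
    # Assumption: a start is valid when the whole window fits in the sequence.
    valid = range(len(seq) - subseq_length + 1)
    if positions is not None:
        valid = [p for p in positions if p in valid]
    return [(p, seq[p:p + subseq_length]) for p in valid]


windows = subseq_windows("ACGTAC", 4)
print(windows)
```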
Region Operations
Operations for working with tagged sequence regions.
- poolparty.insert_tags(pool, region_name, start, stop=None, iter_order=None, prefix=None)[source]
Insert XML-style region tags at a fixed position in sequences.
- Parameters:
pool (Pool or str) – Input Pool or sequence string to add tags to.
region_name (str) – Name for the region (e.g., ‘region’, ‘orf’, ‘insert’).
start (int) – Start position (0-based) for the region.
stop (Optional[int]) – End position (exclusive). If None, creates a zero-length region at start.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
- Returns:
A Pool yielding sequences with the region tags inserted.
- Return type:
Examples
>>> import poolparty as pp
>>> with pp.Party():
...     bg = pp.from_seq('ACGTACGT')
...     # Region tags spanning positions 2 through 4 (stop is exclusive)
...     marked = pp.insert_tags(bg, 'region', start=2, stop=5)
...     # Result: 'AC<region>GTA</region>CGT'
...
...     # Zero-length region at position 4
...     marked = pp.insert_tags(bg, 'ins', start=4)
...     # Result: 'ACGT<ins/>ACGT'
- poolparty.remove_tags(pool, region_name, keep_content=True, iter_order=None, prefix=None)[source]
Remove region tags from sequences.
- Parameters:
pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool, str]) – Input Pool or sequence string containing the region.
region_name (str) – Name of the region to remove.
keep_content (bool) – If True, keep the content inside the region (just remove tags). If False, remove both the region tags and their content.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
- Returns:
A Pool yielding sequences with the region tags removed.
- Return type:
Examples
>>> import poolparty as pp
>>> with pp.Party():
...     bg = pp.from_seq('ACGT<region>TTAA</region>GCGC')
...
...     # Keep content (just remove tags)
...     result = pp.remove_tags(bg, 'region', keep_content=True)
...     # Result: 'ACGTTTAAGCGC'
...
...     # Remove content too
...     result = pp.remove_tags(bg, 'region', keep_content=False)
...     # Result: 'ACGTGCGC'
- poolparty.extract_region(pool, region_name, rc=False, iter_order=None, prefix=None)[source]
Extract content from a named region as a new Pool.
Creates a Pool that yields the content inside the specified region.
- Parameters:
pool (Pool or str) – Input Pool or sequence string containing the region.
region_name (str) – Name of the region to extract content from.
rc (bool) – If True, reverse-complement the extracted content.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
- Returns:
A Pool yielding the content inside the region.
- Return type:
Examples
>>> import poolparty as pp
>>> with pp.Party():
...     bg = pp.from_seq('ACGT<region>TTAA</region>GCGC')
...     content = pp.extract_region(bg, 'region')
...     # content yields: 'TTAA'
...
...     # With rc=True, content is reverse-complemented
...     content_rc = pp.extract_region(bg, 'region', rc=True)
...     # content_rc yields: 'TTAA' (TTAA is its own reverse complement)
- poolparty.replace_region(pool, content_pool, region_name, rc=False, sync=True, keep_tags=True, iter_order=None, prefix=None, _factory_name=None, _style=None)[source]
Replace a region with content from another Pool.
The region (including its tags and any content) is replaced with sequences from content_pool.
- Parameters:
pool (Pool or str) – Background Pool or sequence string containing the region.
content_pool (Pool or str) – Pool or sequence string to insert at the region position.
region_name (str) – Name of the region to replace.
rc (bool) – If True, reverse-complement the content before insertion.
sync (bool) – If True, synchronize pool and content_pool so they iterate in lock-step (1:1 pairing) instead of a Cartesian product.
keep_tags (bool) – If True, preserve the region’s XML tags around the new content. The region remains tracked in the resulting pool.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
- Returns:
A Pool yielding pool sequences with the region replaced by content_pool sequences.
- Return type:
Examples
>>> import poolparty as pp
>>> with pp.Party():
...     # Replace region with content from another pool
...     bg = pp.from_seq('ACGT<insert/>TTTT')
...     inserts = pp.from_seqs(['AAA', 'GGG'], mode='sequential')
...     result = pp.replace_region(bg, inserts, 'insert')
...     # Result yields: 'ACGTAAATTTT', 'ACGTGGGTTTT'
...
...     # With sync=True for 1:1 pairing
...     bg = pp.from_seqs(['ACGT<bc/>TTTT', 'CCCC<bc/>GGGG'], mode='sequential')
...     barcodes = pp.get_barcodes(num_barcodes=2, length=4, seed=42)
...     result = pp.replace_region(bg, barcodes, 'bc', sync=True)
...     # Each background gets a unique barcode (no Cartesian product)
...
...     # With keep_tags=True to preserve region tracking
...     result = pp.replace_region(bg, barcodes, 'bc', keep_tags=True)
...     # Region tags remain: 'ACGT<bc>XXXX</bc>TTTT'
- poolparty.apply_at_region(pool, region_name, transform_fn, rc=False, remove_tags=True, iter_order=None, prefix=None)[source]
Apply a transformation to the content of a region.
This is a high-level convenience function that:
1. Extracts content from the named region (reverse-complementing if rc=True)
2. Applies transform_fn to create a transformed content Pool
3. Replaces the region with the transformed content (reverse-complementing back if rc=True)
- Parameters:
pool (Pool or str) – Input Pool or sequence string containing the region.
region_name (str) – Name of the region whose content to transform.
transform_fn (Callable) – Function that takes a Pool and returns a transformed Pool. Examples: pp.rc, pp.shuffle_seq, lambda p: pp.mutagenize(p, …)
rc (bool) – If True, reverse-complement content before transform and reverse-complement result back before insertion.
remove_tags (bool) – If True, region tags are removed from the result. If False, region tags are preserved around the transformed content.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
- Returns:
A Pool with the region content transformed.
- Return type:
Examples
>>> import poolparty as pp
>>> with pp.Party():
...     # Reverse complement a region (tags removed)
...     bg = pp.from_seq('ACGT<orf>ATGCCC</orf>TTTT')
...     result = pp.apply_at_region(bg, 'orf', pp.rc)
...     # Result: 'ACGTGGGCATTTTT'
...
...     # Keep tags around transformed content
...     bg = pp.from_seq('AAA<region>ACGT</region>TTT')
...     result = pp.apply_at_region(
...         bg, 'region',
...         lambda p: pp.mutagenize(p, num_mutations=1),
...         remove_tags=False,
...     )
...     # Possible result: 'AAA<region>ACCT</region>TTT' (tags preserved)
Notes
If rc=True, the transform_fn receives reverse-complemented content, and the result is reverse-complemented back before insertion.
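The extract, transform, and replace steps, including the reverse-complement round trip, can be sketched on plain strings. This is an illustrative model only: apply_at_region_sketch and revcomp are hypothetical helpers, the transform here takes and returns a string rather than a Pool, and only a single non-nested region tag is handled.

```python
import re

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(dna):
    """Reverse complement of a DNA string (case is preserved per base)."""
    return dna.translate(_COMPLEMENT)[::-1]


def apply_at_region_sketch(seq, region_name, transform, rc=False,
                           remove_tags=True):
    """Hypothetical model of extract -> transform -> replace on a string."""
    name = re.escape(region_name)
    match = re.search(rf"<{name}>(.*?)</{name}>", seq)
    if match is None:
        raise ValueError(f"region {region_name!r} not found")
    content = match.group(1)
    if rc:
        content = revcomp(content)   # transform sees the opposite strand
    content = transform(content)
    if rc:
        content = revcomp(content)   # flip back before re-insertion
    if not remove_tags:
        content = f"<{region_name}>{content}</{region_name}>"
    return seq[:match.start()] + content + seq[match.end():]


def lower_first(s):
    """Toy transform: lowercase the first character it is given."""
    return s[:1].lower() + s[1:]


bg = "ACGT<orf>ATGCCC</orf>TTTT"
print(apply_at_region_sketch(bg, "orf", revcomp))
print(apply_at_region_sketch(bg, "orf", lower_first, rc=True, remove_tags=False))
```

With rc=True, the toy transform marks the first base of the reverse strand, which shows up as the last base of the region once the result is flipped back.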
- poolparty.region_scan(pool, tag_name='region', positions=None, region=None, remove_tags=None, region_length=0, prefix=None, mode='random', num_states=None, iter_order=None, cards=None, _factory_name=None)[source]
Insert XML-style region tags at scanning positions in a sequence.
- Parameters:
pool (Pool or str) – Input Pool or sequence string to insert tags into.
tag_name (str) – Name for the XML tag to insert.
positions (Sequence[Integral] | slice | None) – Valid insertion positions (0-based). If None, all positions are valid.
region (str | Sequence[Integral] | None) – Region to constrain the scan to. Can be region name (str) or [start, stop].
remove_tags (Optional[bool]) – If True and region is a region name, remove tags from output.
region_length (int) – Length of sequence to encompass. 0 creates zero-length regions (<name/>), >0 creates region tags (<name>BASES</name>).
mode (Literal['random', 'sequential', 'fixed']) – Position selection mode: ‘random’ or ‘sequential’.
_factory_name (Optional[str]) – Sets default name of the resulting operation.
- Returns:
A Pool yielding sequences with the region tags inserted at selected positions.
- Return type:
- poolparty.region_multiscan(pool, tag_names, num_insertions, positions=None, region=None, remove_tags=None, region_length=0, insertion_mode='ordered', min_spacing=None, max_spacing=None, prefix=None, mode='random', num_states=None, iter_order=None, cards=None, _factory_name=None)[source]
Insert multiple XML-style region tags into a sequence.
- Parameters:
pool (Pool or str) – Input Pool or sequence string to insert tags into.
tag_names (Sequence[str] or str) – Tag name(s) to insert. If a single string, used for all insertions.
num_insertions (int) – Number of region tags to insert.
positions (Sequence[Integral] | Sequence[Sequence[Integral]] | slice | None) – Valid insertion positions (0-based, nontag-relative). Flat list/slice/None for shared positions; list-of-lists for per-insert positions (one per insert).
region (str | Sequence[Integral] | None) – Region to constrain the scan to. Can be region name (str) or [start, stop].
region_length (int | Sequence[int]) – Length of sequence to encompass per region. Single int for uniform length, or a sequence of ints for per-region lengths (one per insertion).
insertion_mode (Literal['ordered', 'unordered']) – How to assign tags to positions:
- ‘ordered’: tag_names[i] goes to the i-th selected position (left to right)
- ‘unordered’: all valid assignments of tags to positions are enumerated
min_spacing (Optional[int]) – Minimum gap between end of one region and start of next. Default: 0 (non-overlapping, touching OK).
max_spacing (Optional[int]) – Maximum gap between adjacent regions. None = unbounded.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
mode (Literal['random', 'sequential', 'fixed']) – Position selection mode: ‘random’ or ‘sequential’.
num_states (Optional[Integral]) – Number of states. If None, auto-determined for sequential mode.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
- Returns:
A Pool yielding sequences with multiple region tags inserted.
- Return type:
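The min_spacing and max_spacing constraints act as a filter on combinations of start positions, where the gap is measured from the end of one region to the start of the next. The sketch below enumerates the placements that survive that filter for uniform region lengths. It is a simplified model, not the library's enumeration: spaced_placements is a hypothetical helper, and it uses distinct start positions only, so two zero-length tags at the same position are not modeled.

```python
from itertools import combinations


def spaced_placements(seq_length, num_insertions, region_length=0,
                      min_spacing=0, max_spacing=None):
    """Hypothetical model: start-position tuples that satisfy the spacing rules."""
    starts = range(seq_length - region_length + 1)
    placements = []
    for combo in combinations(starts, num_insertions):
        # Gap = start of the next region minus end of the previous one.
        gaps = [b - (a + region_length) for a, b in zip(combo, combo[1:])]
        if all(
            g >= min_spacing and (max_spacing is None or g <= max_spacing)
            for g in gaps
        ):
            placements.append(combo)
    return placements


print(spaced_placements(6, 2, region_length=2))                 # touching allowed
print(spaced_placements(6, 2, region_length=2, min_spacing=1))  # at least 1 apart
print(spaced_placements(6, 2, region_length=2, max_spacing=0))  # must touch
```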
Multiscan Operations
Multi-region scanning operations.
- poolparty.deletion_multiscan(pool, deletion_length, num_deletions, deletion_marker='-', positions=None, region=None, names=None, min_spacing=None, max_spacing=None, prefix=None, mode='random', num_states=None, style=None, iter_order=None, cards=None, _factory_name='deletion_multiscan')[source]
Delete segments at multiple positions simultaneously.
- Parameters:
deletion_length (Integral) – Number of characters to delete at each position.
num_deletions (Integral) – Number of simultaneous deletions to make.
deletion_marker (Optional[str]) – Character to insert at each deletion site. If None, deleted segments are removed with no marker.
positions (Sequence[Integral] | Sequence[Sequence[Integral]] | slice | None) – Valid positions for deletion starts (0-based). Can be a flat list (shared across all deletions) or a list of per-deletion sublists. If None, all valid positions are used.
region (str | Sequence[Integral] | None) – Region to constrain the scan to. Can be a marker name or [start, stop] interval.
names (Optional[Sequence[str]]) – Custom names for the deletion regions. If None, auto-generated (_del_0, _del_1, …).
min_spacing (Optional[Integral]) – Minimum gap between end of one deletion and start of next.
max_spacing (Optional[Integral]) – Maximum gap between adjacent deletions. None = unbounded.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
mode (Literal['random', 'sequential', 'fixed']) – Selection mode: ‘random’ or ‘sequential’.
num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).
style (Optional[str]) – Style to apply to deletion marker characters (e.g., ‘gray’, ‘red bold’).
iter_order (Optional[Real]) – Iteration order priority for the Operation.
cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'combination_index', 'starts', 'ends', 'names', 'region_seqs'.
- Returns:
A Pool yielding sequences with multiple segments deleted simultaneously.
- Return type:
- poolparty.insertion_multiscan(pool, num_insertions, insertion_pools, positions=None, region=None, names=None, replace=False, style=None, insertion_mode='ordered', min_spacing=None, max_spacing=None, prefix=None, mode='random', num_states=None, iter_order=None, cards=None, _factory_name=None)[source]
Insert or replace sequences at multiple positions simultaneously.
- Parameters:
num_insertions (Integral) – Number of simultaneous insertions/replacements to make.
insertion_pools (Union[Pool, Sequence[Pool]]) – Pool(s) providing content. If a single Pool is provided, it will be deepcopied num_insertions - 1 times. If a Sequence of Pools is provided, its length must equal num_insertions.
positions (Sequence[Integral] | Sequence[Sequence[Integral]] | slice | None) – Valid positions (0-based). Can be a flat list (shared across all insertions) or a list of per-insertion sublists. If None, all valid positions are used.
region (str | Sequence[Integral] | None) – Region to constrain the scan to. Can be a marker name or [start, stop] interval.
names (Optional[Sequence[str]]) – Custom names for the insertion regions. If None, auto-generated (_ins_0, _ins_1, …).
replace (bool) – If True, replace existing content at each position (region_length = pool seq_length). If False, insert at zero-width positions.
style (Optional[str]) – Style to apply to inserted/replaced content (e.g., ‘red’, ‘blue bold’).
insertion_mode (Literal['ordered', 'unordered']) – How to assign pools to positions. 'ordered' preserves position order; 'unordered' uses all permutations.
min_spacing (Optional[Integral]) – Minimum gap between adjacent positions.
max_spacing (Optional[Integral]) – Maximum gap between adjacent positions. None = unbounded.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
mode (Literal['random', 'sequential', 'fixed']) – Selection mode: ‘random’ or ‘sequential’.
num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).
iter_order (Optional[Real]) – Iteration order priority for the Operation.
cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'combination_index', 'starts', 'ends', 'names', 'region_seqs'.
- Returns:
A Pool yielding sequences with multiple insertions or replacements.
- Return type:
- poolparty.replacement_multiscan(pool, num_replacements, replacement_pools, positions=None, region=None, names=None, style=None, insertion_mode='ordered', min_spacing=None, max_spacing=None, prefix=None, mode='random', num_states=None, iter_order=None, cards=None, _factory_name='replacement_multiscan')[source]
Replace segments at multiple positions simultaneously.
Equivalent to insertion_multiscan(..., replace=True). See insertion_multiscan() for full parameter documentation.
- Return type:
State Operations
Operations that manipulate the state space of pools.
- poolparty.stack(pools, prefix=None, iter_order=None, cards=None)[source]
Create a pool by stacking multiple input pools state-wise (disjoint union).
- Parameters:
pools (Sequence[TypeVar(T, bound=Pool)]) – Sequence of Pool objects to stack into a single Pool.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'active_parent'.
- Returns:
A Pool whose states are the disjoint union of all input pools’ states. Each state produces the sequence from the corresponding input pool.
- Return type:
- poolparty.repeat(pool, times, prefix=None, iter_order=None, cards=None)[source]
Repeat a pool’s states a specified number of times.
- Parameters:
pool (TypeVar(T, bound=Pool)) – The Pool whose states are to be repeated.
times (Integral) – The number of times to repeat the pool’s states. Must be >= 1.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'repeat_index'.
- Returns:
A new Pool with times as many states as the input pool.
- Return type:
- Raises:
ValueError – If times is less than 1.
- poolparty.state_slice(pool, key, prefix=None, iter_order=None)[source]
Create a Pool containing a slice of states from the input Pool.
- Parameters:
pool (TypeVar(T, bound=Pool)) – The Pool whose states will be sliced.
key (Union[Integral, slice]) – Integer index or slice specifying which states to include from the input Pool.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
- Returns:
A Pool containing states selected by applying the provided index or slice to the input Pool’s state space.
- Return type:
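If a pool's state space is pictured as an ordered list of sequences, stack, repeat, and state_slice correspond to concatenation, repetition, and slicing of that list. The sketch below uses plain Python lists as stand-ins for pools. The helper names are hypothetical, and the ordering produced by repeat_states (whole-pool repeats) is an assumption, since the documentation above only specifies the resulting state count.

```python
def stack_states(pools):
    """Disjoint union: every state of every input pool, in input order."""
    return [state for pool in pools for state in pool]


def repeat_states(pool, times):
    """Repeat a pool's states; ordering here is an illustrative assumption."""
    if times < 1:
        raise ValueError("times must be >= 1")
    return list(pool) * times


pool_a = ["AAA", "CCC"]
pool_b = ["GGG"]

stacked = stack_states([pool_a, pool_b])   # 2 + 1 = 3 states
repeated = repeat_states(pool_a, 3)        # 2 * 3 = 6 states
sliced = stacked[1:]                       # analogous to a slice key

print(stacked, repeated, sliced)
```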
- poolparty.state_shuffle(pool, seed=None, permutation=None, prefix=None, iter_order=None)[source]
Create a Pool with randomly permuted states from the input Pool.
- Parameters:
pool (TypeVar(T, bound=Pool)) – The Pool whose states will be shuffled.
seed (Optional[Integral]) – Random seed for deterministic shuffling. If None, a random seed is generated.
permutation (Optional[Sequence[Integral]]) – Custom permutation to use. If provided, seed must not be specified.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
- Returns:
A Pool containing the same states as the input but in a randomly permuted order.
- Return type:
- poolparty.sample(pool, num_seqs=None, seq_states=None, seed=None, with_replacement=True, prefix=None, iter_order=None)[source]
Sample states from a pool.
- Parameters:
pool (TypeVar(T, bound=Pool)) – The Pool to sample states from.
num_seqs (Optional[Integral]) – Number of states to sample randomly. Mutually exclusive with seq_states.
seq_states (Optional[Sequence[Integral]]) – Explicit list of state indices to select. Mutually exclusive with num_seqs.
seed (Optional[Integral]) – Random seed for deterministic sampling. Only used with num_seqs.
with_replacement (bool) – Whether to sample with replacement. If False, num_seqs must be <= pool.num_states.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
iter_order (Optional[Real]) – Iteration order priority for the Operation.
- Returns:
A Pool containing the sampled states from the input Pool.
- Return type:
- Raises:
ValueError – If both num_seqs and seq_states are provided, or if neither is provided. If with_replacement is False and num_seqs exceeds the pool’s state count.
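The argument rules for sample can be made concrete in a standalone sketch. It operates on state indices only and does not use PoolParty; the function name sample_states and the error messages are hypothetical.

```python
import random


def sample_states(num_states, num_seqs=None, seq_states=None, seed=None,
                  with_replacement=True):
    """Pick state indices either randomly (num_seqs) or explicitly (seq_states)."""
    # Exactly one of the two selection arguments must be supplied.
    if (num_seqs is None) == (seq_states is None):
        raise ValueError("provide exactly one of num_seqs or seq_states")
    if seq_states is not None:
        return list(seq_states)
    rng = random.Random(seed)
    if with_replacement:
        return rng.choices(range(num_states), k=num_seqs)
    if num_seqs > num_states:
        raise ValueError("num_seqs exceeds the number of states")
    return rng.sample(range(num_states), k=num_seqs)


if __name__ == "__main__":
    print(sample_states(4, num_seqs=6, seed=0))
    print(sample_states(4, num_seqs=4, seed=0, with_replacement=False))
```

Sampling with replacement can return more indices than there are states, which is why the size check applies only when with_replacement is False.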
- poolparty.sync(pools)[source]
Synchronize multiple pools to iterate in lockstep (in-place).
- Parameters:
pools (Sequence[Pool]) – Sequence of Pool objects to synchronize. All pools must have the same number of states.
- Returns:
Pools are modified in-place; no new Pool is returned.
- Return type:
- Raises:
ValueError – If the input sequence is empty, if the pools have differing numbers of states, or if any pool is an ancestor of another (circular constraint).
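The three error conditions listed for sync can be illustrated on a toy graph. This sketch does not use PoolParty: Node is a hypothetical stand-in for a pool with a state count and parent links, and check_syncable only mirrors the documented checks, not the library's code.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for a pool: a state count plus upstream parents."""
    num_states: int
    parents: list = field(default_factory=list)


def ancestor_ids(node):
    """Collect the ids of every node reachable through parent links."""
    seen, stack = set(), list(node.parents)
    while stack:
        current = stack.pop()
        if id(current) not in seen:
            seen.add(id(current))
            stack.extend(current.parents)
    return seen


def check_syncable(pools):
    if not pools:
        raise ValueError("no pools given")
    if len({p.num_states for p in pools}) != 1:
        raise ValueError("pools have differing numbers of states")
    for p in pools:
        upstream = ancestor_ids(p)
        if any(id(q) in upstream for q in pools if q is not p):
            raise ValueError("a pool is an ancestor of another pool")


if __name__ == "__main__":
    a, b = Node(4), Node(4)
    check_syncable([a, b])  # passes silently
```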
ORF Operations
Codon-aware operations for protein-coding sequences.
- poolparty.mutagenize_orf(pool, region=None, *, num_mutations=None, mutation_rate=None, mutation_type='missense_only_first', codon_positions=None, style=None, frame=None, prefix=None, mode='random', num_states=None, iter_order=None, cards=None)[source]
Apply codon-level mutations to an ORF sequence. Requires active Party context.
- Parameters:
pool (Union[Pool, str]) – Parent pool or sequence string to mutate.
region (str | Sequence[Integral] | None) – Region to mutate. Can be marker name (e.g., “orf”) or [start, stop]. If None, mutates the entire sequence.
num_mutations (Optional[Integral]) – Fixed number of codon mutations (mutually exclusive with mutation_rate).
mutation_rate (Optional[Real]) – Per-codon mutation probability (mutually exclusive with num_mutations).
mutation_type (str) – Type of mutation: ‘any_codon’, ‘nonsynonymous_first’, ‘nonsynonymous_random’, ‘missense_only_first’, ‘missense_only_random’, ‘synonymous’, ‘nonsense’.
codon_positions (Union[Sequence[Integral], slice, None]) – Eligible codon indices: None (all), list of indices, or slice.
style (Optional[str]) – Style to apply to mutated codon positions (e.g., ‘red’, ‘bold’).
frame (Optional[int]) – Reading frame and orientation. Valid values: +1, +2, +3, -1, -2, -3. Positive values indicate left-to-right orientation (5’->3’), negative values indicate right-to-left orientation (3’->5’). The absolute value indicates the frame of the boundary base (1-indexed). If None and region is a named OrfRegion, uses the OrfRegion’s frame.
prefix (Optional[str]) – Prefix for sequence names in the resulting Pool.
mode (Literal['random', 'sequential', 'fixed']) – Selection mode: ‘random’ or ‘sequential’. Sequential requires num_mutations (not mutation_rate) and a uniform mutation_type (‘any_codon’, ‘nonsynonymous_first’, ‘missense_only_first’, or ‘nonsense’).
num_states (Optional[Integral]) – Number of states. In sequential mode, overrides the computed count (cycling if greater, clipping if less). In random mode, if None defaults to 1 (pure random sampling).
iter_order (Optional[Real]) – Iteration order priority for the Operation.
cards (None | list[str] | dict[str, str]) – Design card keys to include. Available keys: 'codon_positions', 'wt_codons', 'mut_codons', 'wt_aas', 'mut_aas'.
- Returns:
A Pool that generates codon-mutated sequences.
- Return type:
- Raises:
ValueError – If frame is None and region is a named plain Region (not OrfRegion), if mutation_rate is used with sequential mode, if mutation_type is non-uniform with sequential mode, or if num_mutations exceeds eligible codons.
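The synonymous, missense, and nonsense categories named in mutation_type follow from the standard genetic code. The sketch below classifies a single codon substitution in plain Python; it does not use PoolParty, the helper names are hypothetical, and it says nothing about how the library orders or selects replacement codons (for example what the ‘_first’ variants choose).

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' marks a stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}


def classify(wt_codon, mut_codon):
    """Label a codon substitution by its effect on the encoded amino acid."""
    if wt_codon == mut_codon:
        return "identical"
    wt_aa, mut_aa = CODON_TABLE[wt_codon], CODON_TABLE[mut_codon]
    if mut_aa == wt_aa:
        return "synonymous"
    if mut_aa == "*":
        return "nonsense"
    return "missense"


def missense_codons(wt_codon):
    """All codons that change the amino acid without introducing a stop."""
    return [c for c in CODON_TABLE if classify(wt_codon, c) == "missense"]


if __name__ == "__main__":
    for wt, mut in [("CTG", "CTA"), ("TGG", "TGA"), ("ATG", "CTG")]:
        print(wt, "->", mut, classify(wt, mut))
```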
Library Generation
- poolparty.generate_library(pool, num_cycles=1, num_seqs=None, seed=None, init_state=None, seqs_only=False, _include_inline_styles=False, discard_null_seqs=False, max_iterations=None, min_acceptance_rate=None, attempts_per_rate_assessment=100)[source]
Generate sequences from a pool.
- Parameters:
pool (Union[poolparty.pool.Pool, poolparty.dna_pool.DnaPool, poolparty.protein_pool.ProteinPool]) – The pool to generate sequences from.
num_cycles (Integral) – Number of complete iterations through all states.
num_seqs (Optional[Integral]) – Number of sequences to generate.
seed (Optional[Integral]) – Random seed for reproducibility.
init_state (Optional[int]) – Initial state to start generation from.
seqs_only (bool) – If True, return list of sequences instead of DataFrame.
discard_null_seqs (bool) – If True, discard sequences that fail filters (null sequences). With num_seqs, keeps sampling until N valid sequences are collected. With num_cycles, enumerates all states and returns only the valid ones (output may have fewer than num_cycles * num_states rows).
max_iterations (Optional[int]) – Maximum iterations before stopping. Default: state space size for sequential mode, or num_seqs * 100 for random mode.
min_acceptance_rate (Optional[float]) – Minimum fraction of sequences that must pass filters. If actual rate falls below this, generation stops with a warning.
attempts_per_rate_assessment (int) – Iterations between acceptance rate checks.
- Returns:
A DataFrame with columns name, seq, plus any requested design card columns, or a list of sequences if seqs_only=True. Entries are None for null rows when discard_null_seqs=False.
- Return type:
Note
Design card columns are opt-in via the cards parameter on individual operations. Default output contains only ‘name’ and ‘seq’ columns.
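The interaction of discard_null_seqs, max_iterations, min_acceptance_rate, and attempts_per_rate_assessment amounts to rejection sampling with stopping rules. The following is a rough standalone sketch of that loop for the num_seqs case; collect_valid and sample_fn are hypothetical names, None stands for a null sequence, and the library's actual control flow may differ.

```python
import warnings


def collect_valid(sample_fn, num_seqs, max_iterations=None,
                  min_acceptance_rate=None, attempts_per_rate_assessment=100):
    """Call sample_fn until num_seqs non-null results are collected."""
    if max_iterations is None:
        # Mirrors the documented random-mode default of num_seqs * 100.
        max_iterations = num_seqs * 100
    accepted, attempts = [], 0
    while len(accepted) < num_seqs and attempts < max_iterations:
        seq = sample_fn()
        attempts += 1
        if seq is not None:
            accepted.append(seq)
        # Check the running acceptance rate every few attempts.
        if (min_acceptance_rate is not None
                and attempts % attempts_per_rate_assessment == 0
                and len(accepted) / attempts < min_acceptance_rate):
            warnings.warn("acceptance rate fell below min_acceptance_rate")
            break
    return accepted


if __name__ == "__main__":
    import itertools
    counter = itertools.count()
    # Toy generator: every second draw is a null sequence.
    print(collect_valid(lambda: "ACGT" if next(counter) % 2 == 0 else None, 3))
```

The result can be shorter than num_seqs when either stopping rule fires, which matches the note that output may contain fewer rows than requested.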
Utility Functions
Constants
DNA Constants
- poolparty.BASES
Standard DNA bases:
['A', 'C', 'G', 'T']
- poolparty.COMPLEMENT
Complement mapping for DNA bases.
- poolparty.IUPAC_TO_DNA
Mapping from IUPAC ambiguity codes to DNA bases.
- poolparty.VALID_CHARS
Set of valid characters in DNA sequences.
- poolparty.IGNORE_CHARS
Characters to ignore in DNA sequences (gaps, annotations).
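Mappings like these are typically used for reverse-complementing and for expanding ambiguity codes. The sketch below defines its own stand-in dictionaries (DEMO_COMPLEMENT and DEMO_IUPAC_TO_DNA, both hypothetical) rather than importing the constants above, since their exact contents and value types are not shown here.

```python
from itertools import product

# Local stand-ins; the PoolParty constants may differ in contents or types.
DEMO_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
DEMO_IUPAC_TO_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def reverse_complement(seq):
    """Complement each base and reverse the order."""
    return "".join(DEMO_COMPLEMENT[base] for base in reversed(seq))


def expand_iupac(iupac_seq):
    """List every concrete DNA sequence matched by an IUPAC pattern."""
    choices = (DEMO_IUPAC_TO_DNA[code] for code in iupac_seq)
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    print(reverse_complement("AACG"))
    print(expand_iupac("AY"))
```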
Operation Classes
These are the underlying operation classes. Most users will use the convenience functions above instead of instantiating these directly.
Base Operation Classes
- class poolparty.FromSeqsOp(seqs, parent_pool=None, region=None, style=None, seq_names=None, prefix=None, mode='random', num_states=None, name=None, iter_order=None, cards=None, _factory_name=None)[source]
Bases: Operation
Create a pool from a list of sequences.
- class poolparty.FromIupacOp(iupac_seq, parent_pool=None, region=None, prefix=None, mode='random', num_states=None, name=None, iter_order=None, style=None, cards=None)[source]
Bases: Operation
Generate DNA sequences from IUPAC notation.
- class poolparty.FromMotifOp(prob_df, parent_pool=None, region=None, prefix=None, mode='random', num_states=None, name=None, iter_order=None, style=None, cards=None)[source]
Bases: Operation
Sample sequences from a position probability matrix.
- class poolparty.GetKmersOp(length, pool=None, region=None, style=None, case='upper', prefix=None, mode='random', num_states=None, name=None, iter_order=None, cards=None)[source]
Bases: Operation
Generate DNA k-mers.
- class poolparty.MutagenizeOp(pool, num_mutations=None, mutation_rate=None, allowed_chars=None, region=None, style=None, prefix=None, mode='random', num_states=None, name=None, iter_order=None, _remove_tags=False, cards=None, _factory_name='mutagenize')[source]
Bases: Operation
Apply mutations to a parent sequence or a specified region within it.
Supports two mutation modes:
- num_mutations: Apply exactly this many mutations to each sequence
- mutation_rate: Apply a random number of mutations based on a binomial distribution
Exactly one of num_mutations or mutation_rate must be provided. Sequential mode is only available when num_mutations is specified.
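The difference between the two modes can be sketched in plain Python without PoolParty. The function names are hypothetical, and the alphabet default is an assumption. A fixed count yields a finite set of variants that can be enumerated, which is one plausible reason sequential mode is tied to num_mutations.

```python
import math
import random


def mutate_fixed(seq, num_mutations, rng, alphabet="ACGT"):
    """Substitute exactly num_mutations distinct positions."""
    chars = list(seq)
    for i in rng.sample(range(len(seq)), k=num_mutations):
        chars[i] = rng.choice([c for c in alphabet if c != chars[i]])
    return "".join(chars)


def mutate_rate(seq, mutation_rate, rng, alphabet="ACGT"):
    """Substitute each position independently, so the count is binomial."""
    chars = list(seq)
    for i, original in enumerate(chars):
        if rng.random() < mutation_rate:
            chars[i] = rng.choice([c for c in alphabet if c != original])
    return "".join(chars)


def num_variants(length, num_mutations, alphabet_size=4):
    """Count sequences that differ from the parent at exactly num_mutations sites."""
    return math.comb(length, num_mutations) * (alphabet_size - 1) ** num_mutations


if __name__ == "__main__":
    rng = random.Random(0)
    print(mutate_fixed("ACGTACGT", 2, rng))
    print(mutate_rate("ACGTACGT", 0.25, rng))
    print(num_variants(8, 2))
```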
- class poolparty.SeqShuffleOp(parent_pool, region=None, shuffle_type='mono', spacer_str='', prefix=None, mode='random', num_states=None, name=None, iter_order=None, _remove_tags=False, style=None, cards=None, _factory_name=None)[source]
Bases: Operation
Randomly shuffle characters within a region of the parent sequence.
- class poolparty.RecombineOp(parent_pool, sources, num_breakpoints=1, positions=None, region=None, styles=None, style_by='order', prefix=None, mode='random', num_states=None, name=None, iter_order=None, cards=None, _factory_name='recombine')[source]
Bases: Operation
Recombine segments from multiple source pools at specified breakpoints.
In sequential mode, enumerates all breakpoint positions × pool assignment combinations. In random mode, randomly selects breakpoints and pool assignments.
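The product of breakpoint positions and source assignments can be written out for a small case. This is an illustrative sketch only: it works on plain strings of equal length, the name enumerate_recombinants is hypothetical, and it assumes every segment may come from any source (including the same source on both sides of a breakpoint), which may not match the library's enumeration.

```python
from itertools import combinations, product


def enumerate_recombinants(sources, num_breakpoints=1):
    """List (breakpoints, assignment, sequence) for every combination."""
    length = len(sources[0])
    if any(len(s) != length for s in sources):
        raise ValueError("all sources must have the same length")
    results = []
    # Breakpoints fall strictly inside the sequence, in increasing order.
    for cuts in combinations(range(1, length), num_breakpoints):
        bounds = [0, *cuts, length]
        # Each of the num_breakpoints + 1 segments is drawn from some source.
        for assignment in product(range(len(sources)), repeat=num_breakpoints + 1):
            seq = "".join(
                sources[src][start:stop]
                for src, start, stop in zip(assignment, bounds, bounds[1:])
            )
            results.append((cuts, assignment, seq))
    return results


if __name__ == "__main__":
    for cuts, assignment, seq in enumerate_recombinants(["AAAA", "CCCC"]):
        print(cuts, assignment, seq)
```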
Fixed Operation Classes
- class poolparty.FixedOp(parent_pools, seq_from_seqs_fn, seq_length_from_pool_lengths_fn, region=None, remove_tags=None, spacer_str='', name=None, iter_order=None, prefix=None, _factory_name=None, _pass_through_styles=True, _style_combiner_fn=None)[source]
Bases: Operation
Fixed operation that applies a user-defined function to parent sequences.
State Operation Classes
- class poolparty.StackOp(parent_pools, prefix=None, name=None, iter_order=None, cards=None)[source]
Bases: Operation
Stack multiple pools sequentially (disjoint union).
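A disjoint union of state spaces means each combined state index belongs to exactly one parent. The sketch below shows one way to map a combined index back to a parent and a local state; locate_state is a hypothetical helper, and it assumes the parents are laid out back to back in the order given.

```python
from bisect import bisect_right
from itertools import accumulate


def locate_state(state, parent_sizes):
    """Map a combined state index to (parent index, local state index)."""
    offsets = list(accumulate(parent_sizes))  # cumulative end of each parent
    if not 0 <= state < offsets[-1]:
        raise IndexError("state out of range")
    parent = bisect_right(offsets, state)
    start = offsets[parent - 1] if parent else 0
    return parent, state - start


if __name__ == "__main__":
    for s in range(9):
        print(s, locate_state(s, [3, 2, 4]))
```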
- class poolparty.RepeatOp(pool, times, prefix=None, name=None, iter_order=None, cards=None)[source]
Bases: Operation
Repeat a pool’s states n times.
- class poolparty.StateSliceOp(parent_pool, start, stop, step, prefix=None, name=None, iter_order=None)[source]
Bases: Operation
Slice a pool’s states to select a subset.
- class poolparty.StateShuffleOp(parent_pool, seed=None, permutation=None, prefix=None, name=None, iter_order=None)[source]
Bases: Operation
Randomly permute a pool’s states.
- class poolparty.SampleOp(parent_pool, num_seqs=None, seq_states=None, seed=None, with_replacement=True, prefix=None, name=None, iter_order=None)[source]
Bases: Operation
Sample states from a pool.
ORF Operation Classes
- class poolparty.MutagenizeOrfOp(parent_pool, region=None, num_mutations=None, mutation_rate=None, mutation_type='missense_only_first', codon_positions=None, style=None, frame=1, prefix=None, mode='random', num_states=None, name=None, iter_order=None, cards=None)[source]
Bases: Operation
Apply codon-level mutations to an ORF sequence.