PoolParty Documentation
PoolParty is a Python package for designing complex oligonucleotide sequence libraries. It provides a declarative, composable interface for generating DNA libraries used in MPRA (massively parallel reporter assays), deep mutational scanning, in silico analysis of genomic DNNs, and other high-throughput experiments.
Why PoolParty?
Designing DNA libraries often involves combining multiple types of sequence modifications — mutations, insertions, deletions, shuffles — across multiple regions with mixed coverage requirements. PoolParty lets you:
Compose operations: Chain operations like .mutagenize(), .deletion_scan(), and .insertion_scan() to build complex libraries
Tag regions: Use XML-like syntax to mark and manipulate specific regions of sequences
Use lazy evaluation: Sequences are generated on-demand, enabling libraries with billions of potential variants
Track provenance: Each sequence comes with a structured record of how it was built — ready for filtering, grouping, and modeling
Style output: Visual annotations highlight sequence modifications and regions for quick auditing
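The lazy-evaluation point above can be illustrated without PoolParty at all. The following is a minimal standard-library sketch, not PoolParty's implementation: the helper name `iter_kmers` is hypothetical, and it only shows how a generator can expose a space of roughly a trillion sequences while producing just the few that are requested.

```python
from itertools import islice, product


def iter_kmers(k, alphabet="ACGT"):
    """Yield every k-mer over `alphabet`, one at a time, in lexicographic order.

    Hypothetical helper for illustration; not part of PoolParty.
    """
    for letters in product(alphabet, repeat=k):
        yield "".join(letters)


k = 20
total = 4 ** k  # size of the full space; it is never held in memory

# Only the first three sequences are actually constructed.
first_three = list(islice(iter_kmers(k), 3))

print(total)
print(first_three)
```

Because the generator builds each sequence only when asked, memory use depends on how many sequences are consumed, not on the size of the space.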
import poolparty as pp
# Initialize PoolParty
pp.init()
# Create a template with tagged regions
template = pp.from_seq("ACGT<cre>GGAAAGCGGGCAGTGAGC</cre>TTTT<bc/>GGGG")
# Generate single-nucleotide mutations in the CRE region
mutant_library = template.mutagenize(
    region="cre",
    num_mutations=1,
    mode="sequential",
)
# Generate the library as a DataFrame
df = mutant_library.generate_library()
print(f"Generated {len(df)} sequences")
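As a rough sanity check on what a single-mutation scan of the tagged region involves, here is a pure-Python sketch that does not call PoolParty. It assumes tags of the form `<name>...</name>` and that every position in the region is substituted with each of the three alternative bases; the function `single_mutants` is hypothetical, and PoolParty's own ordering and output format may differ.

```python
import re

TEMPLATE = "ACGT<cre>GGAAAGCGGGCAGTGAGC</cre>TTTT<bc/>GGGG"


def single_mutants(template, name, alphabet="ACGT"):
    """Yield the template with each base inside <name>...</name> substituted once.

    Hypothetical helper for illustration; tags are left in place.
    """
    tag = re.escape(name)
    match = re.search(rf"<{tag}>([{alphabet}]*)</{tag}>", template)
    if match is None:
        raise ValueError(f"region {name!r} not found in template")
    start, end = match.span(1)  # span of the region contents, excluding tags
    for pos in range(start, end):
        for base in alphabet:
            if base != template[pos]:
                yield template[:pos] + base + template[pos + 1:]


mutants = list(single_mutants(TEMPLATE, "cre"))
print(len(mutants))  # 18 positions x 3 alternative bases = 54
print(mutants[0])
```

Under these assumptions the 18-nucleotide region yields 54 distinct single-nucleotide variants.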
Operations
- Source
Create sequence pools from sequences, FASTA files, IUPAC codes, motifs, k-mer enumeration, and constrained barcodes.
- Transformation
Apply nucleotide and codon-level mutagenesis, shuffling, and recombination. Codon-aware operations preserve reading frames for protein-coding sequences.
- Scanning
Perform positional scanning with insertion, deletion, replacement, and mutagenesis scans across sequence regions.
- Region
Tag regions with XML-like syntax, extract or replace tagged regions, and target operations to specific sequence regions.
- Composition & Control
Combine pools with stack and join. Slice, shuffle, sample, repeat, filter, and synchronize library states.
- Export
Generate libraries as DataFrames, CSV, or FASTA files.
Installation
Install from PyPI:
pip install poolparty
Or install from source:
git clone https://github.com/jbkinney/poolparty-statetracker.git
cd poolparty-statetracker/poolparty
pip install -e .
Quick Example
Stack different variant types into a single barcoded library:
import poolparty as pp
pp.init()
# Create a template with tagged regions
template = pp.from_seq("ACGT<cre>GGAAAGCGGGCAGTGAGC</cre>TTTT<bc/>GGGG")
# Create different variant pools
mutations = template.mutagenize(region="cre", num_mutations=1)
deletions = template.deletion_scan(region="cre", deletion_length=5)
# Combine into one library
combined = pp.stack([mutations, deletions])
# Add barcodes to all variants
barcoded = combined.insert_kmers(region="bc", length=10)
# Generate final library
df = barcoded.generate_library()
print(f"Generated {len(df)} sequences")
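To get a feel for how large the stacked library might be, the sketch below reproduces a deletion scan in plain Python. It assumes the scan slides a 5-nucleotide window across the region one position at a time; `deletion_windows` is a hypothetical helper, and how PoolParty counts variants after barcodes are inserted is not stated here.

```python
REGION = "GGAAAGCGGGCAGTGAGC"  # contents of the <cre> tag in the template above


def deletion_windows(seq, length):
    """Yield `seq` with each contiguous window of `length` bases removed.

    Hypothetical helper for illustration; not part of PoolParty.
    """
    for start in range(len(seq) - length + 1):
        yield seq[:start] + seq[start + length:]


deletions = list(deletion_windows(REGION, 5))
n_single = len(REGION) * 3  # single-nucleotide substitutions, 3 per position

print(len(deletions))             # 18 - 5 + 1 = 14 window positions
print(n_single + len(deletions))  # 54 + 14 = 68 variants if simply stacked
```

Windows that fall in repetitive stretches can produce identical sequences, so the number of unique deletion variants may be lower than the number of window positions.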
See Also
StateTracker: Composable states for combinatorial enumeration (used internally by PoolParty)