PoolParty Documentation
=======================

**PoolParty** is a Python package for designing complex oligonucleotide sequence libraries. It provides a declarative, composable interface for generating DNA libraries used in MPRA (massively parallel reporter assays), deep mutational scanning, in silico analysis of genomic DNNs, and other high-throughput experiments.

Why PoolParty?
--------------

Designing DNA libraries often involves combining multiple types of sequence modifications (mutations, insertions, deletions, shuffles) across multiple regions with mixed coverage requirements. PoolParty lets you:

- **Compose operations**: Chain operations like ``.mutagenize()``, ``.deletion_scan()``, and ``.insertion_scan()`` to build complex libraries
- **Tag regions**: Use XML-like syntax to mark and manipulate specific regions of sequences
- **Use lazy evaluation**: Sequences are generated on-demand, enabling libraries with billions of potential variants
- **Track provenance**: Each sequence comes with a structured record of how it was built, ready for filtering, grouping, and modeling
- **Style output**: Visual annotations highlight sequence modifications and regions for quick auditing

.. code-block:: python

   import poolparty as pp

   # Initialize PoolParty
   pp.init()

   # Create a template with tagged regions
   template = pp.from_seq("ACGTGGAAAGCGGGCAGTGAGCTTTTGGGG")

   # Generate single-nucleotide mutations in the CRE region
   mutant_library = template.mutagenize(
       region="cre",
       num_mutations=1,
       mode="sequential"
   )

   # Generate the library as a DataFrame
   df = mutant_library.generate_library()
   print(f"Generated {len(df)} sequences")

Operations
----------

**Source**
   Create sequence pools from sequences, FASTA files, IUPAC codes, motifs, k-mer enumeration, and constrained barcodes.

**Transformation**
   Apply nucleotide and codon-level mutagenesis, shuffling, and recombination. Codon-aware operations preserve reading frames for protein-coding sequences.

**Scanning**
   Perform positional scanning with insertion, deletion, replacement, and mutagenesis scans across sequence regions.

**Region**
   Tag regions with XML-like syntax, extract or replace tagged regions, and target operations to specific sequence regions.

**Composition & Control**
   Combine pools with stack and join. Slice, shuffle, sample, repeat, filter, and synchronize library states.

**Export**
   Generate libraries as DataFrames, CSV, or FASTA files.

Installation
------------

Install from PyPI:

.. code-block:: bash

   pip install poolparty

Or install from source:

.. code-block:: bash

   git clone https://github.com/jbkinney/poolparty-statetracker.git
   cd poolparty-statetracker/poolparty
   pip install -e .

Quick Example
-------------

Stack different variant types into a single barcoded library:

.. code-block:: python

   import poolparty as pp

   pp.init()

   # Create a template with tagged regions
   template = pp.from_seq("ACGTGGAAAGCGGGCAGTGAGCTTTTGGGG")

   # Create different variant pools
   mutations = template.mutagenize(region="cre", num_mutations=1)
   deletions = template.deletion_scan(region="cre", deletion_length=5)

   # Combine into one library
   combined = pp.stack([mutations, deletions])

   # Add barcodes to all variants
   barcoded = combined.insert_kmers(region="bc", length=10)

   # Generate final library
   df = barcoded.generate_library()
   print(f"Generated {len(df)} sequences")

Contents
--------

.. toctree::
   :maxdepth: 2
   :caption: User Guide

   quickstart
   tutorials/index
   pool
   Sequence Regions
   Sequence Metadata
   operations/index

.. toctree::
   :maxdepth: 2
   :caption: Reference

   api

Indices and Tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

See Also
--------

- `StateTracker `_: Composable states for combinatorial enumeration (used internally by PoolParty)