Quickstart Guide
This guide introduces the core concepts of PoolParty through practical examples.
Installation
pip install poolparty
Basic Concepts
PoolParty uses Pools to represent collections of DNA sequences. Pools are:
Lazy: Sequences are generated on-demand, not stored in memory
Composable: Pools can be combined using operations like join, +, and *
Stateful: Each pool tracks its position in a combinatorial space via StateTracker
Getting Started
First, import PoolParty and initialize a session:
import poolparty as pp
# Initialize PoolParty (creates a default Party context)
pp.init()
Creating Pools
From a Single Sequence
Create a pool containing a single sequence:
# Create a pool from a single sequence
wt = pp.from_seq("ATCGATCGATCG")
# Generate and display
df = wt.generate_library()
print(df[["seq"]])
From Multiple Sequences
Create a pool that selects from multiple sequences:
# Create a pool from multiple sequences
variants = pp.from_seqs(["AAAA", "CCCC", "GGGG", "TTTT"])
df = variants.generate_library()
print(df[["seq"]])
K-mer Pools
Generate all k-mers of a given length. The mode="sequential" argument
tells PoolParty to enumerate every k-mer rather than sampling randomly
(see Operation Modes).
kmers = pp.get_kmers(length=3, mode="sequential") # all 64 3-mers
df = kmers.generate_library()
print(f"Generated {len(df)} sequences")
print(df[["seq"]].head(10))
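For intuition about what sequential mode enumerates, the same set of 3-mers can be listed in plain Python. This sketch does not use PoolParty: the helper name all_kmers is made up for illustration, and the order in which PoolParty itself emits k-mers may differ.

```python
from itertools import product

def all_kmers(length, alphabet="ACGT"):
    """Yield every k-mer over the alphabet lazily, in lexicographic order."""
    for letters in product(alphabet, repeat=length):
        yield "".join(letters)

kmers_3 = list(all_kmers(3))
print(len(kmers_3))  # 4 ** 3 = 64
print(kmers_3[:4])   # ['AAA', 'AAC', 'AAG', 'AAT']
```

Because all_kmers is a generator, nothing is materialised until list() consumes it, which mirrors the lazy behaviour described under Basic Concepts.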
Combining Pools
Concatenation with join
Join pools to create composite sequences:
# Create components
pp.init() # Reset to fresh state
promoter = pp.from_seq("ATCG")
barcode = pp.get_kmers(length=4, mode="sequential") # all 256 4-mers
# Join them together
library = pp.join([promoter, barcode])
df = library.generate_library()
print(f"Generated {len(df)} sequences")
print(df[["seq"]].head(5))
Using the + Operator
Pools can also be concatenated with +:
pp.init()
left = pp.from_seq("AAA")
middle = pp.from_seqs(["G", "C"])
right = pp.from_seq("TTT")
combined = left + middle + right
df = combined.generate_library()
print(df[["seq"]])
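The space of sequences that such a concatenation can produce is the Cartesian product of its parts. A minimal plain-Python sketch of that idea follows; it is not PoolParty code, and whether PoolParty enumerates or samples this space depends on each pool's mode.

```python
from itertools import product

# Each component is the list of sequences one pool could contribute.
left = ["AAA"]
middle = ["G", "C"]
right = ["TTT"]

# One output per combination of choices: 1 * 2 * 1 = 2 sequences.
combined = ["".join(parts) for parts in product(left, middle, right)]
print(combined)  # ['AAAGTTT', 'AAACTTT']
```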
Mutagenesis
Random Mutations
Apply random mutations to a sequence. Operations can be called as methods on
a Pool — wt.mutagenize(...) is equivalent to pp.mutagenize(wt, ...).
pp.init()
# Start with a wild-type sequence
wt = pp.from_seq("ATCGATCGATCG")
# Create single-mutation variants
mutants = wt.mutagenize(num_mutations=1)
df = mutants.generate_library()
print(f"Generated {len(df)} single mutants")
print(df[["seq"]].head(10))
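To get a feel for the size of the single-mutant space, the sketch below enumerates every single-base substitution of the wild-type sequence in plain Python. It assumes that a mutation means a substitution to a different base; the helper single_substitutions is hypothetical, and how many of these variants mutagenize actually returns depends on its mode and sampling, which this guide does not specify.

```python
def single_substitutions(seq, alphabet="ACGT"):
    """Return every sequence that differs from seq at exactly one position."""
    variants = []
    for i, original in enumerate(seq):
        for base in alphabet:
            if base != original:
                variants.append(seq[:i] + base + seq[i + 1:])
    return variants

space = single_substitutions("ATCGATCGATCG")
print(len(space))  # 12 positions * 3 alternative bases = 36
print(space[:3])   # ['CTCGATCGATCG', 'GTCGATCGATCG', 'TTCGATCGATCG']
```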
Scan Operations
Scan operations tile across sequence positions.
Replacement Scan
Replace each position with alternative bases:
pp.init()
wt = pp.from_seq("ATCG")
alt = pp.from_seqs(["A", "C", "G", "T"], mode="sequential")
# Replace each position with all 4 bases
scan = wt.replacement_scan(replacement_pool=alt, mode="sequential")
df = scan.generate_library()
print(df[["name", "seq"]])
Deletion Scan
Systematically delete portions of a sequence:
pp.init()
wt = pp.from_seq("ATCGATCG")
# Delete 2-nt windows across the sequence
deletions = wt.deletion_scan(deletion_length=2, mode="sequential")
df = deletions.generate_library()
print(df[["name", "seq"]])
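The idea of tiling a fixed-length deletion across a sequence can be sketched in plain Python as follows. This is an illustration only: the helper deletion_windows is hypothetical, it removes the deleted bases outright, and PoolParty may represent deletions differently (for example with a marker character) or use a different step size.

```python
def deletion_windows(seq, deletion_length):
    """Return (start, shortened) pairs, one per window position."""
    last_start = len(seq) - deletion_length
    return [
        (start, seq[:start] + seq[start + deletion_length:])
        for start in range(last_start + 1)
    ]

windows = deletion_windows("ATCGATCG", 2)
print(len(windows))  # 8 - 2 + 1 = 7 window positions
for start, shortened in windows:
    print(start, shortened)
```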
Working with Regions
PoolParty supports XML-like region tagging for targeting specific parts of sequences. See Sequence Regions for a full explanation of tag syntax and region behaviour.
Tagging Regions
pp.init()
# Define a sequence with a tagged region
seq = "AAAA<cre>ATCGATCG</cre>TTTT"
wt = pp.from_seq(seq)
# Apply mutations only to the CRE region
mutants = wt.mutagenize(num_mutations=1, region="cre")
df = mutants.generate_library()
print(f"Generated {len(df)} CRE mutants")
print(df[["seq"]].head(5))
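As a rough picture of what restricting an operation to a region means, the sketch below splits the tagged string into the part before the region, the region itself, and the part after it. It is not PoolParty's parser: the helper split_region is hypothetical and assumes a single, non-nested tag pair.

```python
import re

def split_region(tagged, name):
    """Split a tagged sequence into (prefix, region body, suffix).

    Assumes a single, non-nested <name>...</name> pair and no other tags.
    """
    pattern = re.compile(rf"<{re.escape(name)}>(.*?)</{re.escape(name)}>")
    match = pattern.search(tagged)
    if match is None:
        raise ValueError(f"no region named {name!r}")
    return tagged[:match.start()], match.group(1), tagged[match.end():]

prefix, body, suffix = split_region("AAAA<cre>ATCGATCG</cre>TTTT", "cre")
plain = prefix + body + suffix
start, end = len(prefix), len(prefix) + len(body)
print(plain)       # AAAAATCGATCGTTTT
print(start, end)  # 4 12
```

Under these assumptions, a mutation limited to the cre region can only change positions 4 to 11 of the untagged sequence; the flanking AAAA and TTTT stay fixed.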
Generating Libraries
The generate_library() method produces a pandas DataFrame with sequence information:
pp.init()
# Create a simple library
promoter = pp.from_seq("<promoter>ATCG</promoter>")
barcode = pp.from_iupac("MMM", mode="sequential") # M = A or C
library = pp.join([promoter, barcode])
# Generate with full metadata
df = library.generate_library()
print("Columns available:")
print(df.columns.tolist())
print()
print(df[["name", "seq"]].head())
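The barcode in this example relies on IUPAC ambiguity codes, where M stands for A or C. The plain-Python sketch below expands such a code into the concrete sequences it represents; the helper expand_iupac is hypothetical and only illustrates the expansion, not how from_iupac is implemented.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_iupac(code):
    """Return every concrete sequence matched by an IUPAC string."""
    return ["".join(p) for p in product(*(IUPAC[c] for c in code))]

barcodes = expand_iupac("MMM")
print(len(barcodes))  # 2 ** 3 = 8
print(barcodes)       # ['AAA', 'AAC', 'ACA', 'ACC', 'CAA', 'CAC', 'CCA', 'CCC']
```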
Initialisation and Context Management
pp.init() — persistent context
pp.init() creates a long-lived Party context that stays active for the
rest of the session. This is the recommended approach for notebooks and
interactive scripts.
import poolparty as pp
pp.init()
pool = pp.from_seq("ACGT")
df = pool.generate_library()
Note
If you need a clean slate — for example, at the top of a new notebook cell
block or after an experiment — call pp.init() again. This tears down the
previous context and starts fresh: all prior pools and operations are
discarded.
pp.init() accepts:
genetic_code (str | dict, default "standard"): genetic code for ORF operations.
log_level (str | None, default None): if set, configures logging ("DEBUG", "INFO", "WARNING", etc.).
with pp.Party() — scoped context
For isolation — running independent experiments, writing reusable functions,
or testing — use with pp.Party(). The context is cleaned up automatically
when the block exits.
with pp.Party() as party:
    pool = pp.from_seq("ACGT")
    df = pool.generate_library()
# context closed
Contexts nest automatically:
with pp.Party() as outer:
    wt = pp.from_seq("ACGT")
    with pp.Party() as inner:
        other = pp.from_seq("TTTT")  # inner is active
    # outer is active again; wt is still usable
| Scenario | pp.init() | with pp.Party() |
|---|---|---|
| Interactive notebook or REPL | Recommended | |
| Multiple independent experiments | | Recommended |
| Inside a reusable function | | Recommended |
| Quick reset (discard all pools) | Call pp.init() again | Start a new with pp.Party() block |
Configuration
These functions apply to whichever Party is currently active:
| Function | Description |
|---|---|
| | Discard all pools and operations without resetting configuration. |
| pp.toggle_styles() | Enable or disable inline sequence styling. |
| pp.toggle_cards() | Enable or disable design card computation. |
| | Use text-based progress bars instead of notebook widgets. |
| | Set the logging level for |
| | Change the genetic code (affects ORF operations). |
# Disable cards and styles for a performance-sensitive run
pp.init()
pp.toggle_cards(on=False)
pp.toggle_styles(on=False)
pool = pp.from_iupac("NNNNNNNN", mode="sequential")
df = pool.to_df(num_cycles=1) # no card columns, no style overhead
Next Steps
Browse the Operations for the full list of composable operations
See Pools for Pool properties and export methods (to_df, to_file)
Check out StateTracker for understanding the underlying state algebra