materialize

Eagerly generate sequences from a pool and cache them in a new, standalone pool whose state space is exactly the set of stored sequences. The resulting pool has no parent references (severed DAG), so it can be used as a cheap starting point for any number of independent downstream pipelines.
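
To make the "severed DAG" idea concrete, here is a minimal plain-Python sketch. It does not use poolparty: `ToyPool`, `toy_mutagenize`, and `toy_materialize` are hypothetical names invented for illustration, and the real library may work quite differently internally. The sketch only shows the general pattern of drawing eagerly, storing the results, and returning a node that no longer points at its source.

```python
import random


class ToyPool:
    """Hypothetical stand-in for a pool node; not poolparty's Pool class."""

    def __init__(self, draw, parents=(), states=None):
        self.draw = draw                # callable: rng -> sequence string
        self.parents = tuple(parents)   # upstream nodes in the toy DAG
        self.states = states            # stored sequences, if materialized


def toy_mutagenize(parent, alphabet="ACGT"):
    """Lazy node: every draw re-runs the parent, then mutates one base."""
    def draw(rng):
        seq = parent.draw(rng)
        pos = rng.randrange(len(seq))
        new_base = rng.choice([b for b in alphabet if b != seq[pos]])
        return seq[:pos] + new_base + seq[pos + 1:]
    return ToyPool(draw, parents=[parent])


def toy_materialize(pool, num_seqs, seed=None):
    """Eagerly draw num_seqs sequences and return a parentless pool over them."""
    rng = random.Random(seed)
    stored = [pool.draw(rng) for _ in range(num_seqs)]
    # The new node closes over `stored` only; it keeps no reference to `pool`.
    return ToyPool(lambda r: r.choice(stored), parents=(), states=stored)


wt = ToyPool(lambda rng: "ATCGATCG")
mutants = toy_mutagenize(wt)
cached = toy_materialize(mutants, num_seqs=5, seed=42)
```

In this toy model, drawing from `cached` never calls back into `mutants` or `wt`, which is the property that makes a materialized pool a cheap starting point for several independent pipelines.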

import poolparty as pp
pp.init()

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `pool` | `Pool` | (required) | Input pool to materialize. |
| `num_seqs` | `int \| None` | `None` | Number of sequences to generate and cache. Provide either `num_seqs` or `num_cycles`. |
| `num_cycles` | `int \| None` | `None` | Number of complete cycles through the state space. |
| `seed` | `int \| None` | `None` | Random seed for reproducible generation. |
| `discard_null_seqs` | `bool` | `True` | If `True`, skip filtered-out (`NullSeq`) sequences. |
| `max_iterations` | `int \| None` | `None` | Maximum iterations before stopping (useful with filters that reject most draws). |
| `min_acceptance_rate` | `float \| None` | `None` | If the acceptance rate drops below this threshold, generation stops early. |
| `attempts_per_rate_assessment` | `int` | `100` | Number of draws between acceptance-rate checks. |
| `name` | `str \| None` | `None` | Name for the materialized pool. |
| `prefix` | `str \| None` | `None` | Prefix for the operation node name in the pool graph. |
| `cards` | `dict \| list \| None` | `None` | Design card columns to include in library output. |
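
The three stopping controls (`max_iterations`, `min_acceptance_rate`, `attempts_per_rate_assessment`) are easiest to understand as parts of a rejection loop. The following plain-Python sketch shows one way they could interact. It is an assumption, not poolparty's code: `collect_accepted` and `draw_8mer` are hypothetical helpers, and the choice to measure the cumulative acceptance rate (rather than a per-window rate) is a guess made for illustration.

```python
import random


def collect_accepted(draw, accept, num_seqs, seed=None, max_iterations=None,
                     min_acceptance_rate=None, attempts_per_rate_assessment=100):
    """Toy rejection loop; illustrates the parameters, not the real implementation."""
    rng = random.Random(seed)
    kept, attempts = [], 0
    while len(kept) < num_seqs:
        if max_iterations is not None and attempts >= max_iterations:
            break  # hard cap on the total number of draws
        seq = draw(rng)
        attempts += 1
        if accept(seq):
            kept.append(seq)
        # Assumed behavior: check the cumulative acceptance rate every N attempts.
        if (min_acceptance_rate is not None
                and attempts % attempts_per_rate_assessment == 0
                and len(kept) / attempts < min_acceptance_rate):
            break  # the filter rejects too much; give up early
    return kept, attempts


def draw_8mer(rng):
    """Draw a random 8-base DNA sequence."""
    return "".join(rng.choice("ACGT") for _ in range(8))


# A filter that rejects everything: the rate check ends the loop at 100 draws.
kept_rate, attempts_rate = collect_accepted(
    draw_8mer, lambda s: False, num_seqs=10, seed=0, min_acceptance_rate=0.05)

# The same filter with only a hard cap: the loop ends at 30 draws.
kept_cap, attempts_cap = collect_accepted(
    draw_8mer, lambda s: False, num_seqs=10, seed=0, max_iterations=30)
```

Without either guard, a filter that never accepts would keep this loop running forever, which is why at least one of the two limits is worth setting when a filter sits upstream.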

Note

Only the most commonly used parameters are shown above. For the full parameter list, see materialize() in the API Reference.

Examples

Materialize before applying downstream scans

Pre-compute an expensive mutagenize result once and reuse it across multiple scan operations without re-running the mutation logic each time.

wt      = pp.from_seq("ATCGATCG")
mutants = pp.mutagenize(wt, num_mutations=1)

# Freeze 20 mutants into a standalone pool
cached  = pp.materialize(mutants, num_seqs=20, seed=42)

# Build two independent downstream pipelines on the same cached pool
scan_a  = pp.deletion_scan(cached, deletion_length=2)
scan_b  = pp.mutagenize(cached, num_mutations=1)

df_a    = pp.generate_library(scan_a, num_seqs=6)
df_b    = pp.generate_library(scan_b, num_seqs=6)

cached.print_library()

cached: seq_length=8, num_states=20
ATCGAACG
ACCGATCG
ATCGATCT
ATCGACCG
ATCGAGCG
... (20 total)

Reproducible caching with `seed`

Pass `seed` so that re-running the same script produces the same materialized pool every time.

wt     = pp.from_seq("ATCGATCG")
pool   = pp.mutagenize(wt, num_mutations=1, mode="random")
cached = pp.materialize(pool, num_seqs=5, seed=0)
cached.print_library()

cached: seq_length=8, num_states=5
ATCGGTCG
ATCGAACG
ATCGCTCG
GTCGATCG
ACCGATCG
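
The reproducibility guarantee rests on a general property of seeded pseudo-random generators, which can be checked without poolparty. The sketch below is plain Python and `sample_single_mutants` is a hypothetical helper, so the sequences it produces will not match the library output shown above. It only demonstrates that two runs with the same seed agree.

```python
import random


def sample_single_mutants(wt, num_seqs, seed=None):
    """Draw num_seqs single-base mutants of wt using a private, seeded RNG."""
    rng = random.Random(seed)  # local generator; global random state is untouched
    mutants = []
    for _ in range(num_seqs):
        pos = rng.randrange(len(wt))
        base = rng.choice([b for b in "ACGT" if b != wt[pos]])
        mutants.append(wt[:pos] + base + wt[pos + 1:])
    return mutants


run_1 = sample_single_mutants("ATCGATCG", num_seqs=5, seed=0)
run_2 = sample_single_mutants("ATCGATCG", num_seqs=5, seed=0)
assert run_1 == run_2  # same seed, same sequences, same order
```

Using a dedicated `random.Random(seed)` instance rather than seeding the module-level generator keeps the result independent of any other random calls made elsewhere in the script.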

Materialize then apply a deletion scan

Because `materialize` returns a standalone pool with no upstream parents, chaining a deletion scan onto it is as efficient as starting from a plain `from_seqs` pool.

wt      = pp.from_seq("ATCGATCG")
mutants = pp.mutagenize(wt, num_mutations=1)

# Snapshot 8 mutants once; downstream operations reuse them without re-running mutagenize
cached  = pp.materialize(mutants, num_seqs=8, seed=1)

# Systematically delete 2-base windows across every cached sequence
scan    = pp.deletion_scan(cached, deletion_length=2)
df      = pp.generate_library(scan, num_seqs=8)

cached.print_library()

cached: seq_length=8, num_states=8
ATCGGTCG
ATGGATCG
ATCGATCA
ATCGATCT
ATCGATCC
... (8 total)
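
For intuition about what a 2-base deletion scan enumerates over each cached sequence, the following plain-Python sketch lists every contiguous deletion window. It is an illustration only: `deletion_windows` is a hypothetical helper, and it assumes deleted bases are simply removed, which may differ from how `deletion_scan` represents deletions in its output.

```python
def deletion_windows(seq, deletion_length):
    """Return every variant of seq with one contiguous window removed."""
    last_start = len(seq) - deletion_length
    return [seq[:i] + seq[i + deletion_length:] for i in range(last_start + 1)]


cached_seqs = ["ATCGGTCG", "ATGGATCG", "ATCGATCA"]  # first three sequences above
variants = {seq: deletion_windows(seq, deletion_length=2) for seq in cached_seqs}
```

Under these assumptions, each 8-base sequence has 7 possible window positions, so the scan's state space grows with the number of cached sequences times the number of positions.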

See materialize() in the API Reference.