materialize
Eagerly generate sequences from a pool and cache them in a new, standalone pool whose state space is exactly the set of stored sequences. The resulting pool holds no references to its parents (its link to the upstream DAG is severed), so it can serve as a cheap starting point for any number of independent downstream pipelines.
import poolparty as pp
pp.init()
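The eager caching described above can be pictured with a small pure-Python toy. This is only a sketch of the idea, not poolparty's implementation: LazyNode, CachedNode, and toy_materialize are hypothetical names invented for illustration, and the single-substitution step merely stands in for an upstream operation.

```python
import random


class LazyNode:
    """Toy lazy node: recomputes its sequence from its parent on every draw."""

    def __init__(self, parent=None, seq=None):
        self.parent = parent
        self.seq = seq

    def draw(self, rng):
        if self.parent is None:
            return self.seq
        s = self.parent.draw(rng)  # walks up the graph each time
        pos = rng.randrange(len(s))
        base = rng.choice([b for b in "ACGT" if b != s[pos]])
        return s[:pos] + base + s[pos + 1:]


class CachedNode:
    """Toy standalone node: its state space is exactly the stored strings."""

    parent = None  # nothing upstream is referenced

    def __init__(self, seqs):
        self.seqs = list(seqs)

    def draw(self, rng):
        return rng.choice(self.seqs)


def toy_materialize(node, num_seqs, seed):
    # Draw everything now and keep only the resulting strings.
    rng = random.Random(seed)
    return CachedNode(node.draw(rng) for _ in range(num_seqs))


wt = LazyNode(seq="ATCGATCG")
mutants = LazyNode(parent=wt)
cached = toy_materialize(mutants, num_seqs=20, seed=42)
```

After the call, cached holds 20 plain strings and no link back to mutants or wt, so any number of downstream steps can read from it without triggering the upstream draw again.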
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| | | (required) | Input pool to materialize. |
| num_seqs | | | Number of sequences to generate and cache. Provide either |
| | | | Number of complete cycles through the state space. |
| seed | | | Random seed for reproducible generation. |
| | | | If |
| | | | Maximum iterations before stopping (useful with filters that reject most draws). |
| | | | If the acceptance rate drops below this threshold, generation stops early. |
| | | | Number of draws between acceptance-rate checks. |
| | | | Name for the materialized pool. |
| | | | Prefix for the operation node name in the pool graph. |
| | | | Design card columns to include in library output. |
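The three stopping-related parameters (an iteration cap, an acceptance-rate threshold, and a check interval) interact roughly as in the sketch below. It is a generic rejection loop written under assumptions, not poolparty's code: the function name and the argument names max_iter, min_rate, and check_every are hypothetical stand-ins, and the exact moment at which the library checks the rate may differ.

```python
import random


def generate_with_early_stop(draw, accept, num_seqs, max_iter,
                             min_rate, check_every, seed=0):
    """Collect accepted draws; stop early if too few draws are accepted.

    All argument names are hypothetical stand-ins for illustration.
    """
    rng = random.Random(seed)
    kept = []
    attempts = 0
    while attempts < max_iter and len(kept) < num_seqs:
        attempts += 1
        candidate = draw(rng)
        if accept(candidate):
            kept.append(candidate)
        # Every check_every draws, compare accepted/attempted to the threshold.
        if attempts % check_every == 0 and len(kept) / attempts < min_rate:
            break
    return kept, attempts


def random_8mer(rng):
    return "".join(rng.choice("ACGT") for _ in range(8))


# A filter that accepts everything finishes as soon as num_seqs is reached.
all_kept, all_attempts = generate_with_early_stop(
    random_8mer, lambda s: True, num_seqs=5,
    max_iter=1000, min_rate=0.1, check_every=50)

# A filter that rejects everything is abandoned at the first rate check.
none_kept, none_attempts = generate_with_early_stop(
    random_8mer, lambda s: False, num_seqs=5,
    max_iter=1000, min_rate=0.1, check_every=50)
```

In this sketch the iteration cap is the last line of defence, while the rate check lets a hopeless filter fail after one check interval instead of running to the cap.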
Note
Only the most commonly used parameters are shown above. For the full
parameter list, see materialize() in the
API Reference.
Examples
Materialize before applying downstream scans
Pre-compute an expensive mutagenize result once and reuse it across multiple downstream operations, so the mutation logic is not re-run each time.
wt = pp.from_seq("ATCGATCG")
mutants = pp.mutagenize(wt, num_mutations=1)
# Freeze 20 mutants into a standalone pool
cached = pp.materialize(mutants, num_seqs=20, seed=42)
# Apply different downstream scans to the same cached pool
scan_a = pp.deletion_scan(cached, deletion_length=2)
scan_b = pp.mutagenize(cached, num_mutations=1)
df_a = pp.generate_library(scan_a, num_seqs=6)
df_b = pp.generate_library(scan_b, num_seqs=6)
cached.print_library()
ACCGATCG
ATCGATCT
ATCGACCG
ATCGAGCG
... (20 total)
Reproducible caching with seed
Pass seed= so that re-running the same script produces the same
materialized pool every time.
wt = pp.from_seq("ATCGATCG")
pool = pp.mutagenize(wt, num_mutations=1, mode="random")
cached = pp.materialize(pool, num_seqs=5, seed=0)
cached.print_library()
ATCGAACG
ATCGCTCG
GTCGATCG
ACCGATCG
... (5 total)
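Why a fixed seed gives the same cached pool can be shown with Python's standard random module. The helper single_mutants below is a hypothetical illustration, not a poolparty function, and it makes no claim about which generator the library uses internally.

```python
import random

BASES = "ACGT"


def single_mutants(template, num_seqs, seed):
    """Return num_seqs single-substitution variants of template."""
    rng = random.Random(seed)  # private generator; global state is untouched
    out = []
    for _ in range(num_seqs):
        pos = rng.randrange(len(template))
        new_base = rng.choice([b for b in BASES if b != template[pos]])
        out.append(template[:pos] + new_base + template[pos + 1:])
    return out


first = single_mutants("ATCGATCG", num_seqs=5, seed=0)
second = single_mutants("ATCGATCG", num_seqs=5, seed=0)
```

Here first and second are identical lists because both runs start from the same generator state; a different seed will generally produce a different list.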
Materialize then apply a deletion scan
Because materialize returns a standalone pool with no upstream parents,
chaining a deletion scan is just as efficient as starting from a plain
from_seqs pool.
wt = pp.from_seq("ATCGATCG")
mutants = pp.mutagenize(wt, num_mutations=1)
# Snapshot 8 mutants once; downstream operations reuse them without re-running mutagenize
cached = pp.materialize(mutants, num_seqs=8, seed=1)
# Systematically delete 2-base windows across every cached sequence
scan = pp.deletion_scan(cached, deletion_length=2)
df = pp.generate_library(scan, num_seqs=8)
cached.print_library()
ATGGATCG
ATCGATCA
ATCGATCT
ATCGATCC
... (8 total)
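What "delete 2-base windows across every cached sequence" means can be sketched in plain Python. The helper deletion_windows is hypothetical and simply drops the window; how pp.deletion_scan represents the deleted region (for example with gap characters or markers) is not covered here and may differ.

```python
def deletion_windows(seq, deletion_length):
    """Return every variant of seq with one contiguous window removed."""
    return [
        seq[:start] + seq[start + deletion_length:]
        for start in range(len(seq) - deletion_length + 1)
    ]


# The four cached sequences printed above, stored as plain strings.
cached_seqs = ["ATGGATCG", "ATCGATCA", "ATCGATCT", "ATCGATCC"]

# One list of deletion variants per cached sequence.
scan = {seq: deletion_windows(seq, 2) for seq in cached_seqs}
```

For an 8-base sequence and a 2-base window there are 7 start positions, so each cached sequence yields 7 variants of length 6 in this sketch.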
See materialize() in the API Reference.