materialize
Eagerly generate sequences from a pool and cache them in a new, standalone pool whose state space is exactly the set of stored sequences. The resulting pool is independent of its parent pools, so it can be used as a cheap starting point for any number of independent downstream pipelines.
import poolparty as pp
pp.init()
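To make the idea of eager generation concrete, here is a minimal plain-Python sketch. It does not use poolparty and is not how the library is implemented; `lazy_single_mutants` and `materialize_sketch` are hypothetical names introduced only for illustration. A lazy source is drained once into an immutable tuple, and everything downstream reads from that frozen snapshot instead of re-running the source.

```python
import random


def lazy_single_mutants(seq, rng):
    """Hypothetical lazy source: yields random single-base substitutions forever."""
    while True:
        pos = rng.randrange(len(seq))
        alt = rng.choice([b for b in "ACGT" if b != seq[pos]])
        yield seq[:pos] + alt + seq[pos + 1:]


def materialize_sketch(source, num_seqs):
    """Eagerly draw num_seqs items and freeze them; the source is not needed afterwards."""
    return tuple(next(source) for _ in range(num_seqs))


rng = random.Random(42)
cached = materialize_sketch(lazy_single_mutants("ATCGATCG", rng), num_seqs=20)

# Two independent downstream steps read the same frozen snapshot.
deletions = [s[:2] + s[4:] for s in cached]  # drop two bases from each cached sequence
first_bases = [s[0] for s in cached]         # unrelated second pipeline
```

Because `cached` is a tuple, neither downstream step can alter it or trigger new draws from the source, which is the property the description above relies on.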
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| | | (required) | Input pool to materialize. |
| num_seqs | | | Number of sequences to generate and cache. Provide either |
| | | | Number of complete cycles through the state space. |
| seed | | | Random seed for reproducible generation. |
| | | | If |
| | | | Maximum iterations before stopping (useful with filters that reject most draws). |
| | | | If the acceptance rate drops below this threshold, generation stops early. |
| | | | Number of draws between acceptance-rate checks. |
| | | | Name for the materialized pool. |
| | | | Prefix for the operation node name in the pool graph. |
| | | | Design card columns to include in library output. |
Note
Only the most commonly used parameters are shown above. For the full
parameter list, see materialize() in the
API Reference.
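The iteration cap and the acceptance-rate check described in the table can be pictured as a rejection-sampling loop. The sketch below is plain Python under stated assumptions: the argument names `max_iter`, `min_rate`, and `check_every` are hypothetical stand-ins, not poolparty's parameter names, and the library's actual stopping logic may differ.

```python
import random


def draw_with_early_stop(draw, accept, num_seqs, max_iter, min_rate, check_every):
    """Collect up to num_seqs accepted draws, giving up early if acceptance is too rare."""
    kept = []
    draws = 0
    while draws < max_iter and len(kept) < num_seqs:
        candidate = draw()
        draws += 1
        if accept(candidate):
            kept.append(candidate)
        # Every check_every draws, compare the running acceptance rate to the threshold.
        if draws % check_every == 0 and len(kept) / draws < min_rate:
            break
    return kept, draws


rng = random.Random(0)


def random_8mer():
    return "".join(rng.choice("ACGT") for _ in range(8))


def gc_rich(seq):
    return seq.count("G") + seq.count("C") >= 4


# A permissive predicate fills the quota quickly.
kept, draws = draw_with_early_stop(random_8mer, gc_rich, 5, 1000, 0.05, 50)

# A predicate that rejects everything trips the acceptance-rate check at the first checkpoint.
none_kept, wasted = draw_with_early_stop(random_8mer, lambda s: False, 5, 1000, 0.05, 50)
```

With the always-rejecting predicate the loop stops after the first 50 draws rather than running to the 1000-draw cap, which is the behavior the threshold and check-interval parameters are meant to provide.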
Examples
Materialize before applying downstream scans
Pre-compute an expensive mutagenize result once and reuse it across multiple scan operations without re-running the mutation logic each time.
wt = pp.from_seq("ATCGATCG")
mutants = pp.mutagenize(wt, num_mutations=1)
# Freeze 20 mutants into a standalone pool
cached = pp.materialize(mutants, num_seqs=20, seed=42)
# Apply different downstream scans to the same cached pool
scan_a = pp.deletion_scan(cached, deletion_length=2)
scan_b = pp.mutagenize(cached, num_mutations=1)
df_a = pp.generate_library(scan_a, num_seqs=6)
df_b = pp.generate_library(scan_b, num_seqs=6)
cached.print_library()
ACCGATCG
ATCGATCT
ATCGACCG
ATCGAGCG
... (20 total)
Reproducible caching with seed
Pass seed= so that re-running the same script produces the same
materialized pool every time.
wt = pp.from_seq("ATCGATCG")
pool = pp.mutagenize(wt, num_mutations=1, mode="random")
cached = pp.materialize(pool, num_seqs=5, seed=0)
cached.print_library()
ATCGAACG
ATCGCTCG
GTCGATCG
ACCGATCG
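The reproducibility guarantee rests on a general property of seeded pseudo-random generators, which can be shown without poolparty. In this sketch `sample_single_mutants` is a hypothetical helper, and Python's `random.Random` stands in for whatever generator the library uses internally (an assumption).

```python
import random


def sample_single_mutants(seq, num_seqs, seed):
    """Draw num_seqs random single-base substitutions using a private seeded generator."""
    rng = random.Random(seed)  # private generator, so global random state is untouched
    out = []
    for _ in range(num_seqs):
        pos = rng.randrange(len(seq))
        alt = rng.choice([b for b in "ACGT" if b != seq[pos]])
        out.append(seq[:pos] + alt + seq[pos + 1:])
    return out


run_1 = sample_single_mutants("ATCGATCG", 5, seed=0)
run_2 = sample_single_mutants("ATCGATCG", 5, seed=0)
same = run_1 == run_2  # identical seed and inputs give an identical sequence list
```

Using a dedicated generator per call, rather than the module-level one, keeps the result independent of any other random draws made elsewhere in the script.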
Materialize after filtering
Combine filter with materialize to lock in the accepted sequences.
The materialized pool contains only the sequences that passed the predicate,
with NullSeq entries already discarded.
wt = pp.from_seq("ATCGATCG")
mutants = pp.mutagenize(wt, num_mutations=1, mode="random", num_states=20)
passed = pp.filter(mutants, lambda s: s.count("G") + s.count("C") >= 4)
cached = pp.materialize(passed, num_seqs=5, seed=0, discard_null_seqs=True)
cached.print_library()
ATCGAACG
ATCGCTCG
GTCGATCG
ACCGATCG
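The interaction between filtering and discarding null entries can be sketched in plain Python. Here `None` stands in for NullSeq, and `filter_sketch` and `materialize_sketch` are hypothetical helpers; this illustrates the idea only and is not poolparty's implementation.

```python
def all_single_mutants(seq):
    """Enumerate every single-base substitution of seq."""
    return [
        seq[:i] + b + seq[i + 1:]
        for i in range(len(seq))
        for b in "ACGT"
        if b != seq[i]
    ]


def filter_sketch(seqs, predicate):
    """Keep each slot, but replace rejected sequences with None (stand-in for NullSeq)."""
    return [s if predicate(s) else None for s in seqs]


def materialize_sketch(seqs, discard_null=True):
    """Freeze the sequences, optionally dropping the null placeholders."""
    if discard_null:
        return [s for s in seqs if s is not None]
    return list(seqs)


def gc_at_least_4(seq):
    return seq.count("G") + seq.count("C") >= 4


mutants = all_single_mutants("ATCGATCG")        # 24 single-base substitutions
filtered = filter_sketch(mutants, gc_at_least_4)  # 8 of them become None
cached = materialize_sketch(filtered)             # 16 sequences remain
```

For this wild type, the 8 rejected mutants are exactly those that replace a G or C with an A or T, dropping the G+C count from 4 to 3.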
See materialize() in the API Reference.