score
Evaluate a user-supplied function on each sequence and record the result
as a design card column. The sequence passes through unchanged — score
is a passthrough operation that adds metadata without altering content.
The function receives the clean (tag-stripped) sequence string, or the
clean content of a named region when region is specified.
Compatible with built-in utilities such as pp.calc_gc,
pp.calc_dust, and pp.calc_complexity.
import poolparty as pp
pp.init()
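To make "clean (tag-stripped)" concrete, here is a minimal plain-Python sketch of what a scoring function would see. The helper `clean_sequence` and the regular expression are illustrative assumptions, not PoolParty's implementation; the tag syntax is taken from the region example further down this page.

```python
import re

# Matches opening and closing region tags such as <cre> and </cre>.
# Assumption: tag names are simple identifiers without attributes.
TAG_PATTERN = re.compile(r"</?\w+>")


def clean_sequence(tagged: str) -> str:
    """Return the sequence with all region tags removed (hypothetical helper)."""
    return TAG_PATTERN.sub("", tagged)


tagged = "AAAA<cre>ATCGATCG</cre>TTTT"
print(clean_sequence(tagged))  # AAAAATCGATCGTTTT
```

Under this assumption, a scoring function never needs to handle tag markup itself; it can treat its argument as a plain string of bases.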
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| | | (required) | The Pool or sequence string to score. |
| | | (required) | Scoring function |
| card_key | | | Design card column name under which the result is stored. |
| region | | | Region to score. A named tag (str), |
| | | | Prefix for auto-generated sequence names. |
| cards | | | Design card keys to include. The available key is the value of |
Note
Only the most commonly used parameters are shown above. For the full
parameter list, see score() in the
API Reference.
Examples
GC content with pp.calc_gc
Record the GC fraction of every sequence using the built-in utility.
Design cards are opt-in — pass cards={"gc": "gc"} (a dict mapping
the card key to the desired column name) to include the column in the
output. Using a list ["gc"] also works but prefixes the column name
with the operation id (e.g. op[12]:score.gc); the dict form avoids
this.
wt = pp.from_iupac("NNNN", mode="sequential")
scored = pp.score(wt, pp.calc_gc, card_key="gc", cards={"gc": "gc"})
scored.print_library()
# scored.generate_library() adds a "gc" column per sequence
AAAA
AAAC
AAAG
AAAT
AACA
... (256 total)
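The difference between the dict and list forms of cards can be pictured with a small plain-Python sketch. The function `resolve_columns` and its `op_label` argument are hypothetical and only mirror the naming behavior described above; they are not part of PoolParty.

```python
def resolve_columns(cards, op_label):
    """Map design card keys to output column names.

    Hypothetical helper: a dict supplies explicit column names, while a
    list gets the operation label prepended to each key.
    """
    if isinstance(cards, dict):
        return dict(cards)
    return {key: f"{op_label}.{key}" for key in cards}


print(resolve_columns({"gc": "gc"}, "op[12]:score"))  # {'gc': 'gc'}
print(resolve_columns(["gc"], "op[12]:score"))        # {'gc': 'op[12]:score.gc'}
```

The dict form is the safer choice when downstream code refers to columns by name, since the operation id in the prefixed form can change as the pipeline changes.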
Custom scoring function
Any callable that accepts the sequence string works. Here a lambda counts A and T bases as a simple measure of AT richness.
wt = pp.from_seqs(["AAAA", "GCGC", "ATCG"], mode="sequential")
scored = pp.score(wt, lambda s: s.count("A") + s.count("T"),
                  card_key="at_count", cards=["at_count"])
scored.print_library()
AAAA
GCGC
ATCG
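A named function works just as well as a lambda and is easier to test on its own. The sketch below defines a hypothetical scorer, `max_homopolymer_run`, in plain Python; the name and the metric are illustrative choices, and the commented `pp.score` call simply reuses the argument pattern from the example above.

```python
from itertools import groupby


def max_homopolymer_run(seq: str) -> int:
    """Length of the longest run of one repeated base (0 for an empty string)."""
    return max((sum(1 for _ in run) for _, run in groupby(seq)), default=0)


for seq in ["AAAA", "GCGC", "ATCG"]:
    print(seq, max_homopolymer_run(seq))
# AAAA 4
# GCGC 1
# ATCG 1

# With PoolParty initialized, the function could be passed like the lambda:
# scored = pp.score(wt, max_homopolymer_run, card_key="homopolymer",
#                   cards={"homopolymer": "homopolymer"})
```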
Built-in scoring functions
PoolParty includes several sequence property functions that work directly
with score:
- pp.calc_gc: GC fraction
- pp.calc_complexity: linguistic complexity (0–1)
- pp.calc_dust: DUST low-complexity score (lower = more complex)
wt = pp.from_iupac("NNNNNNNN", mode="sequential", num_states=5)
scored = pp.score(wt, pp.calc_complexity, card_key="complexity",
                  cards={"complexity": "complexity"})
scored.print_library()
AAAAAAAA
AAAAAAAC
AAAAAAAG
AAAAAAAT
AAAAAACA
Score only a named region
region restricts scoring to the tagged segment; the full sequence
still passes through unchanged. With mutagenize(..., mode="random"),
set num_states if you want more than one independent draw (the
default is a single random mutant).
wt = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
muts = pp.mutagenize(wt, num_mutations=1, region="cre",
                     mode="random", num_states=5)
scored = pp.score(muts, pp.calc_gc, region="cre", card_key="cre_gc",
                  cards=["cre_gc"])
scored.print_library()
AAAA<cre>ATCGAACG</cre>TTTT
AAAA<cre>ATCGCTCG</cre>TTTT
AAAA<cre>GTCGATCG</cre>TTTT
AAAA<cre>ACCGATCG</cre>TTTT
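To see why region changes the score, the following plain-Python sketch compares a GC fraction computed on the tagged segment with one computed on the whole sequence. Both helpers are hypothetical: `region_content` is an assumed stand-in for how a region's clean content might be extracted, and `gc_fraction` is an illustrative stand-in rather than pp.calc_gc itself.

```python
import re


def region_content(tagged: str, region: str) -> str:
    """Return the tag-stripped text inside <region>...</region> (hypothetical)."""
    name = re.escape(region)
    match = re.search(rf"<{name}>(.*?)</{name}>", tagged)
    if match is None:
        raise ValueError(f"region {region!r} not found")
    # Drop any nested tags so only plain bases remain.
    return re.sub(r"</?\w+>", "", match.group(1))


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases; returns 0.0 for an empty string."""
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


tagged = "AAAA<cre>ATCGAACG</cre>TTTT"
whole = re.sub(r"</?\w+>", "", tagged)

print(gc_fraction(region_content(tagged, "cre")))  # 0.5
print(gc_fraction(whole))                          # 0.25
```

The flanking A and T bases dilute the whole-sequence value, which is why restricting the score to the mutated region gives a more informative column here.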
Multiple scores in a pipeline
Chain two score calls to record multiple metrics in one library.
wt = pp.from_iupac("NNNNNNNN", mode="sequential", num_states=10)
scored = pp.score(wt, pp.calc_gc, card_key="gc", cards={"gc": "gc"})
scored = pp.score(scored, pp.calc_complexity, card_key="complexity", cards=["complexity"])
df = scored.generate_library()
print(df.to_string())
# df has both "gc" and "op[...]:score.complexity" columns
| name | seq | gc | op[2]:score.complexity |
|---|---|---|---|
| None | AAAAAAAA | 0.000 | 0.186508 |
| None | AAAAAAAC | 0.125 | 0.373016 |
| None | AAAAAAAG | 0.125 | 0.373016 |
| None | AAAAAAAT | 0.000 | 0.373016 |
| None | AAAAAACA | 0.125 | 0.476190 |
| None | AAAAAACC | 0.250 | 0.476190 |
| None | AAAAAACG | 0.250 | 0.559524 |
| None | AAAAAACT | 0.125 | 0.559524 |
| None | AAAAAAGA | 0.125 | 0.476190 |
| None | AAAAAAGC | 0.250 | 0.559524 |
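As a rough plain-Python analogue of chaining score calls, the sketch below applies several scoring functions to each sequence and collects one column per function. The names `score_rows`, `gc_fraction`, and `at_count` are hypothetical, and this is not how PoolParty stores design cards; it only illustrates the one-column-per-metric outcome shown in the table above.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases; an illustrative stand-in for a GC scorer."""
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def at_count(seq: str) -> int:
    """Number of A and T bases."""
    return seq.count("A") + seq.count("T")


def score_rows(seqs, scorers):
    """Build one row per sequence with one column per scoring function."""
    rows = []
    for seq in seqs:
        row = {"seq": seq}
        for column, fn in scorers.items():
            row[column] = fn(seq)
        rows.append(row)
    return rows


rows = score_rows(
    ["AAAAAAAA", "AAAAAAAC", "AAAAAACC"],
    {"gc": gc_fraction, "at_count": at_count},
)
for row in rows:
    print(row)
# {'seq': 'AAAAAAAA', 'gc': 0.0, 'at_count': 8}
# {'seq': 'AAAAAAAC', 'gc': 0.125, 'at_count': 7}
# {'seq': 'AAAAAACC', 'gc': 0.25, 'at_count': 6}
```

The gc values agree with the gc column in the table for the same three sequences.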
See score().