score

Evaluate a user-supplied function on each sequence and record the result as a design card column. score is a passthrough operation: the sequence is left unchanged and only metadata is added. The function receives the clean (tag-stripped) sequence string or, when region is specified, the clean content of that region.

score is compatible with built-in utilities such as pp.calc_gc, pp.calc_dust, and pp.calc_complexity.

import poolparty as pp
pp.init()

Parameters

Parameter   Type                Default     Description
pool        Pool | str          (required)  The Pool or sequence string to score.
fn          callable            (required)  Scoring function (str) -> any. Receives a clean (tag-free) sequence string and returns any scalar value to record.
card_key    str                 'score'     Design card column name under which the result is stored.
region      str | list | None   None        Region to score. A named tag (str), [start, stop] interval, or None to score the full sequence.
prefix      str | None          None        Prefix for auto-generated sequence names.
cards       list | dict | None  None        Design card keys to include. The available key is the value of card_key (default 'score').


Note

Only the most commonly used parameters are shown above. For the full parameter list, see score() in the API Reference.
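Because fn is called with a single string, a scoring function that needs extra arguments has to be wrapped first. The sketch below uses functools.partial from the standard library for this; motif_count, count_gatc, and the card key "gatc_count" are hypothetical names chosen for illustration, and the commented call only mirrors the signature in the table above.

```python
from functools import partial

def motif_count(seq: str, motif: str) -> int:
    """Count non-overlapping occurrences of motif in seq."""
    return seq.count(motif)

# Bind the extra argument so the result has the (str) -> any shape.
count_gatc = partial(motif_count, motif="GATC")

print(count_gatc("AAGATCTTGATC"))  # 2

# Hypothetical use with score (not run here):
# scored = pp.score(pool, count_gatc, card_key="gatc_count",
#                   cards={"gatc_count": "gatc_count"})
```

A lambda such as lambda s: motif_count(s, "GATC") would serve the same purpose; partial simply keeps the bound argument visible when debugging.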

Examples

GC content with pp.calc_gc

Record the GC fraction of every sequence using the built-in utility. Design cards are opt-in: pass cards={"gc": "gc"}, a dict mapping the card key to the desired column name, to include the column in the output. The list form cards=["gc"] also works, but it prefixes the column name with the operation id (e.g. op[12]:score.gc); the dict form avoids this.

wt     = pp.from_iupac("NNNN", mode="sequential")
scored = pp.score(wt, pp.calc_gc, card_key="gc", cards={"gc": "gc"})
scored.print_library()
# scored.generate_library() adds a "gc" column per sequence

scored: seq_length=4, num_states=256
AAAA
AAAC
AAAG
AAAT
AACA
... (256 total)
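If a library was generated with the list form, the prefixed column names can be normalized afterwards. The following is a minimal sketch in plain Python, assuming the prefix always has the op[N]:score. shape quoted above; strip_op_prefix is a hypothetical helper, not part of PoolParty.

```python
import re

# Matches the prefix described above, e.g. "op[12]:score.gc" -> "gc".
_OP_PREFIX = re.compile(r"^op\[\d+\]:score\.")

def strip_op_prefix(column: str) -> str:
    """Drop a leading 'op[N]:score.' from a column name, if present."""
    return _OP_PREFIX.sub("", column)

columns = ["name", "seq", "op[12]:score.gc"]
clean_columns = [strip_op_prefix(c) for c in columns]
print(clean_columns)  # ['name', 'seq', 'gc']

# With a pandas DataFrame, the same function can be passed to rename:
# df = df.rename(columns=strip_op_prefix)
```

Passing the dict form of cards up front remains the simpler option; this helper is only useful for output that already carries the prefix.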

Custom scoring function

Any callable that accepts a string works. Here a lambda counts A and T bases as a simple measure of AT richness.

wt     = pp.from_seqs(["AAAA", "GCGC", "ATCG"], mode="sequential")
scored = pp.score(wt, lambda s: s.count("A") + s.count("T"),
                 card_key="at_count", cards=["at_count"])
scored.print_library()

scored: seq_length=4, num_states=3
AAAA
GCGC
ATCG
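A named function can replace the lambda when the logic is longer or worth testing on its own. The sketch below is plain Python with no PoolParty dependency, so the per-sequence values (which print_library does not display) can be checked directly. at_count and longest_homopolymer are hypothetical helper names, and "max_run" is an illustrative card key.

```python
from itertools import groupby

def at_count(seq: str) -> int:
    """Count A and T bases (same logic as the lambda above)."""
    return seq.count("A") + seq.count("T")

def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of a single repeated base."""
    return max((len(list(run)) for _, run in groupby(seq)), default=0)

seqs = ["AAAA", "GCGC", "ATCG"]
at_counts = [at_count(s) for s in seqs]
runs = [longest_homopolymer(s) for s in seqs]
print(at_counts)  # [4, 0, 2]
print(runs)       # [4, 1, 1]

# Hypothetical use with score, mirroring the call above (not run here):
# scored = pp.score(wt, longest_homopolymer, card_key="max_run",
#                   cards=["max_run"])
```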

Built-in scoring functions

PoolParty includes several sequence property functions that work directly with score:

  • pp.calc_gc — GC fraction

  • pp.calc_complexity — linguistic complexity (0–1)

  • pp.calc_dust — DUST low-complexity score (lower = more complex)

wt     = pp.from_iupac("NNNNNNNN", mode="sequential", num_states=5)
scored = pp.score(wt, pp.calc_complexity, card_key="complexity",
                 cards={"complexity": "complexity"})
scored.print_library()

scored: seq_length=8, num_states=5
AAAAAAAA
AAAAAAAC
AAAAAAAG
AAAAAAAT
AAAAAACA

Score only a named region

region restricts scoring to the tagged segment; the full sequence still passes through unchanged. With mutagenize(..., mode="random"), set num_states if you want more than one independent draw (the default is a single random mutant).

wt     = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
muts   = pp.mutagenize(wt, num_mutations=1, region="cre",
                      mode="random", num_states=5)
scored = pp.score(muts, pp.calc_gc, region="cre", card_key="cre_gc",
                 cards=["cre_gc"])
scored.print_library()

scored: seq_length=16, num_states=5
AAAA<cre>ATCGGTCG</cre>TTTT
AAAA<cre>ATCGAACG</cre>TTTT
AAAA<cre>ATCGCTCG</cre>TTTT
AAAA<cre>GTCGATCG</cre>TTTT
AAAA<cre>ACCGATCG</cre>TTTT
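To make the difference between region scoring and full-sequence scoring concrete, the sketch below mimics the idea in plain Python. It is an illustration only and does not reflect how PoolParty parses tags internally: strip_tags, region_content, and gc_fraction are hypothetical helpers, and gc_fraction assumes GC fraction means (G + C) / length.

```python
import re

def strip_tags(seq: str) -> str:
    """Remove <name> and </name> markers, leaving only the bases."""
    return re.sub(r"</?[^<>]+>", "", seq)

def region_content(seq: str, name: str) -> str:
    """Return the tag-free content between <name> and </name>."""
    tag = re.escape(name)
    match = re.search(rf"<{tag}>(.*?)</{tag}>", seq)
    if match is None:
        raise ValueError(f"region {name!r} not found")
    return strip_tags(match.group(1))

def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases; 0.0 for an empty string."""
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

tagged = "AAAA<cre>ATCGGTCG</cre>TTTT"
region_gc = gc_fraction(region_content(tagged, "cre"))  # scores ATCGGTCG
full_gc = gc_fraction(strip_tags(tagged))               # scores all 16 bases
print(region_gc)  # 0.625
print(full_gc)    # 0.3125
```

The flanking A and T bases dilute the full-sequence value, which is why restricting the score to the mutagenized region gives a more sensitive readout.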

Multiple scores in a pipeline

Chain two score calls to record multiple metrics in one library.

wt     = pp.from_iupac("NNNNNNNN", mode="sequential", num_states=10)
scored = pp.score(wt,     pp.calc_gc,        card_key="gc",         cards={"gc": "gc"})
scored = pp.score(scored, pp.calc_complexity, card_key="complexity", cards=["complexity"])
df     = scored.generate_library()
print(df.to_string())
# df has both "gc" and "op[...]:score.complexity" columns

name  seq       gc     op[2]:score.complexity
None  AAAAAAAA  0.000  0.186508
None  AAAAAAAC  0.125  0.373016
None  AAAAAAAG  0.125  0.373016
None  AAAAAAAT  0.000  0.373016
None  AAAAAACA  0.125  0.476190
None  AAAAAACC  0.250  0.476190
None  AAAAAACG  0.250  0.559524
None  AAAAAACT  0.125  0.559524
None  AAAAAAGA  0.125  0.476190
None  AAAAAAGC  0.250  0.559524
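The ten complexity values printed above are reproduced by a simple k-mer ratio: for k = 1, 2, 3, divide the number of distinct k-mers observed by the maximum possible, min(4^k, L - k + 1), then average the three ratios. This formula is inferred only from the printed values and is an assumption, not a statement about how pp.calc_complexity is implemented; it may differ for longer sequences or other settings. kmer_complexity and max_k are hypothetical names.

```python
def kmer_complexity(seq: str, max_k: int = 3) -> float:
    """Mean over k = 1..max_k of distinct k-mers / maximum possible k-mers.

    Assumed formula, inferred from the example output only.
    """
    ratios = []
    for k in range(1, max_k + 1):
        windows = len(seq) - k + 1
        if windows <= 0:
            break
        observed = len({seq[i:i + k] for i in range(windows)})
        possible = min(4 ** k, windows)
        ratios.append(observed / possible)
    return sum(ratios) / len(ratios) if ratios else 0.0

for s in ["AAAAAAAA", "AAAAAAAC", "AAAAAACA", "AAAAAACG"]:
    print(s, round(kmer_complexity(s), 6))
# AAAAAAAA 0.186508
# AAAAAAAC 0.373016
# AAAAAACA 0.47619
# AAAAAACG 0.559524
```

Under this reading, a homopolymer scores lowest because it contributes a single distinct k-mer at every k, and each new base raises all three ratios.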

See score() in the API Reference.