score
=====

Evaluate a user-supplied function on each sequence and record the result as a design card column. The sequence passes through unchanged: ``score`` is a passthrough operation that adds metadata without altering content.

The function receives the clean (tag-stripped) sequence string, or the clean content of a named region when ``region`` is specified. It is compatible with built-in utilities such as ``pp.calc_gc``, ``pp.calc_dust``, and ``pp.calc_complexity``.

.. code-block:: python

    import poolparty as pp

    pp.init()

----

Parameters
----------

.. list-table::
   :header-rows: 1
   :widths: auto

   * - Parameter
     - Type
     - Default
     - Description
   * - ``pool``
     - ``Pool | str``
     - *(required)*
     - The Pool or sequence string to score.
   * - ``fn``
     - ``callable``
     - *(required)*
     - Scoring function ``(str) -> any``. Receives a clean (tag-free) sequence string and returns any scalar value to record.
   * - ``card_key``
     - ``str``
     - ``'score'``
     - Design card column name under which the result is stored.
   * - ``region``
     - ``str | list | None``
     - ``None``
     - Region to score. A named tag (str), ``[start, stop]`` interval, or ``None`` to score the full sequence.
   * - ``prefix``
     - ``str | None``
     - ``None``
     - Prefix for auto-generated sequence names.
   * - ``cards``
     - ``list | dict | None``
     - ``None``
     - Design card keys to include. The available key is the value of ``card_key`` (default ``'score'``).

----

.. note::

   Only the most commonly used parameters are shown above. For the full parameter list, see :func:`~poolparty.score` in the :doc:`API Reference `.

Examples
--------

GC content with ``pp.calc_gc``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Record the GC fraction of every sequence using the built-in utility. Design cards are opt-in: pass ``cards={"gc": "gc"}`` (a dict mapping the card key to the desired column name) to include the column in the output. Using a list ``["gc"]`` also works but prefixes the column name with the operation id (e.g. ``op[12]:score.gc``); the dict form avoids this.

.. code-block:: python

    wt = pp.from_iupac("NNNN", mode="sequential")
    scored = pp.score(wt, pp.calc_gc, card_key="gc", cards={"gc": "gc"})
    scored.print_library()
    # scored.generate_library() adds a "gc" column per sequence

.. code-block:: text

    scored: seq_length=4, num_states=256
    AAAA
    AAAC
    AAAG
    AAAT
    AACA
    ... (256 total)

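To make the ``fn`` contract concrete, the sketch below shows a plain-Python function with the ``(str) -> scalar`` shape described in the parameter table. The name ``gc_fraction`` is hypothetical, and this is not claimed to be how ``pp.calc_gc`` is implemented; it only illustrates what a GC-style scorer receives and returns.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G/C bases in a tag-free sequence string.

    Hypothetical stand-in for illustration; not PoolParty's own code.
    """
    if not seq:
        # Avoid division by zero on an empty sequence.
        return 0.0
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Apply the scorer to the first few sequences from the example above.
for s in ["AAAA", "AAAC", "AAAG", "AAAT", "AACA"]:
    print(s, gc_fraction(s))
```

A function of this shape could presumably be passed as ``fn`` in place of ``pp.calc_gc``, since ``score`` accepts any callable that takes a clean sequence string.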
Custom scoring function
~~~~~~~~~~~~~~~~~~~~~~~~

Any callable works. Here a lambda counts A and T bases as a simple measure of AT richness.

.. code-block:: python

    wt = pp.from_seqs(["AAAA", "GCGC", "ATCG"], mode="sequential")
    scored = pp.score(
        wt,
        lambda s: s.count("A") + s.count("T"),
        card_key="at_count",
        cards=["at_count"],
    )
    scored.print_library()

.. code-block:: text

    scored: seq_length=4, num_states=3
    AAAA
    GCGC
    ATCG

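When a scorer needs a parameter, a small factory that returns a one-argument callable keeps it compatible with the ``(str) -> any`` contract. The sketch below is one way to do that in plain Python; ``make_motif_counter`` is a hypothetical helper, not part of PoolParty.

```python
from typing import Callable


def make_motif_counter(motif: str) -> Callable[[str], int]:
    """Build a scorer that counts occurrences of `motif`, overlaps included.

    Hypothetical helper for illustration only.
    """
    if not motif:
        raise ValueError("motif must be non-empty")

    def count(seq: str) -> int:
        # Check every start position so overlapping matches are counted.
        last_start = len(seq) - len(motif)
        return sum(seq.startswith(motif, i) for i in range(last_start + 1))

    return count


count_cg = make_motif_counter("CG")
print([count_cg(s) for s in ["AAAA", "GCGC", "ATCG"]])
```

The returned ``count_cg`` takes a single string, so it could be supplied wherever the lambda is used above.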
Built-in scoring functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~

PoolParty includes several sequence property functions that work directly with ``score``:

- ``pp.calc_gc``: GC fraction
- ``pp.calc_complexity``: linguistic complexity (0–1)
- ``pp.calc_dust``: DUST low-complexity score (lower = more complex)

.. code-block:: python

    wt = pp.from_iupac("NNNNNNNN", mode="sequential", num_states=5)
    scored = pp.score(
        wt,
        pp.calc_complexity,
        card_key="complexity",
        cards={"complexity": "complexity"},
    )
    scored.print_library()

.. code-block:: text

    scored: seq_length=8, num_states=5
    AAAAAAAA
    AAAAAAAC
    AAAAAAAG
    AAAAAAAT
    AAAAAACA

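For intuition about what a linguistic-complexity score measures, the sketch below averages, over k-mer sizes 1 to 3, the ratio of distinct k-mers observed to the maximum possible for that length. This definition, the name ``kmer_complexity``, and the defaults ``max_k=3`` and ``alphabet_size=4`` are assumptions. The result happens to agree with the complexity values listed in the pipeline example on this page (for instance 0.186508 for ``AAAAAAAA``), but that is an observation, not a statement of how ``pp.calc_complexity`` is implemented.

```python
def kmer_complexity(seq: str, max_k: int = 3, alphabet_size: int = 4) -> float:
    """Mean over k = 1..max_k of (distinct k-mers seen) / (max possible).

    Assumed definition for illustration; PoolParty's own formula may differ.
    """
    if not seq:
        return 0.0
    ratios = []
    for k in range(1, min(max_k, len(seq)) + 1):
        windows = len(seq) - k + 1
        observed = len({seq[i:i + k] for i in range(windows)})
        # A sequence cannot contain more distinct k-mers than it has windows,
        # nor more than the alphabet allows.
        possible = min(alphabet_size ** k, windows)
        ratios.append(observed / possible)
    return sum(ratios) / len(ratios)


print(round(kmer_complexity("AAAAAAAA"), 6))  # 0.186508
print(round(kmer_complexity("AAAAAACG"), 6))  # 0.559524
```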
Score only a named region
~~~~~~~~~~~~~~~~~~~~~~~~~~

``region`` restricts scoring to the tagged segment; the full sequence still passes through unchanged. The input sequence marks the region with ``<cre>...</cre>`` tags. With ``mutagenize(..., mode="random")``, set ``num_states`` if you want more than one independent draw (the default is a single random mutant).

.. code-block:: python

    # The <cre> tags define the named region used by mutagenize and score.
    wt = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
    muts = pp.mutagenize(wt, num_mutations=1, region="cre", mode="random", num_states=5)
    scored = pp.score(muts, pp.calc_gc, region="cre", card_key="cre_gc", cards=["cre_gc"])
    scored.print_library()

.. code-block:: text

    scored: seq_length=16, num_states=5
    AAAA<cre>ATCGGTCG</cre>TTTT
    AAAA<cre>ATCGAACG</cre>TTTT
    AAAA<cre>ATCGCTCG</cre>TTTT
    AAAA<cre>GTCGATCG</cre>TTTT
    AAAA<cre>ACCGATCG</cre>TTTT

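The following sketch illustrates what "clean content of a named region" means for the sequences printed above: take the text between the opening and closing tag, drop any nested tags, and score only that substring. The helper ``region_content`` is hypothetical, and the ``<name>...</name>`` syntax is inferred from the printed output; how PoolParty parses regions internally is not shown here.

```python
import re


def region_content(tagged: str, name: str) -> str:
    """Return the tag-free content of the region `name` in a tagged sequence.

    Hypothetical helper; assumes simple <name>...</name> tags without attributes.
    """
    pattern = rf"<{re.escape(name)}>(.*?)</{re.escape(name)}>"
    match = re.search(pattern, tagged, flags=re.DOTALL)
    if match is None:
        raise ValueError(f"region {name!r} not found")
    # Strip any nested tags so only sequence characters remain.
    return re.sub(r"</?[^<>]+>", "", match.group(1))


def gc_fraction(seq: str) -> float:
    """GC fraction of a plain sequence string (illustrative stand-in)."""
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


for tagged in ["AAAA<cre>ATCGGTCG</cre>TTTT", "AAAA<cre>ATCGAACG</cre>TTTT"]:
    inner = region_content(tagged, "cre")
    print(inner, gc_fraction(inner))
```

Scoring the region alone gives a different value than scoring the whole sequence, because the flanking ``AAAA`` and ``TTTT`` bases are excluded from the calculation.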
Multiple scores in a pipeline
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Chain two ``score`` calls to record multiple metrics in one library. The first call uses the dict form of ``cards``, so its column is named ``gc``; the second uses the list form, so its column name carries the operation-id prefix.

.. code-block:: python

    wt = pp.from_iupac("NNNNNNNN", mode="sequential", num_states=10)
    scored = pp.score(wt, pp.calc_gc, card_key="gc", cards={"gc": "gc"})
    scored = pp.score(scored, pp.calc_complexity, card_key="complexity", cards=["complexity"])
    df = scored.generate_library()
    print(df.to_string())
    # df has both "gc" and "op[...]:score.complexity" columns

.. code-block:: text

    name  seq       gc     op[2]:score.complexity
    None  AAAAAAAA  0.000  0.186508
    None  AAAAAAAC  0.125  0.373016
    None  AAAAAAAG  0.125  0.373016
    None  AAAAAAAT  0.000  0.373016
    None  AAAAAACA  0.125  0.476190
    None  AAAAAACC  0.250  0.476190
    None  AAAAAACG  0.250  0.559524
    None  AAAAAACT  0.125  0.559524
    None  AAAAAAGA  0.125  0.476190
    None  AAAAAAGC  0.250  0.559524

See :func:`~poolparty.score`.