score
=====

Evaluate a user-supplied function on each sequence and record the result
as a design card column. ``score`` is a passthrough operation: the sequence
is emitted unchanged and only metadata is added.

The function receives the clean (tag-stripped) sequence string, or the
clean content of a named region when ``region`` is specified. Built-in
utilities such as ``pp.calc_gc``, ``pp.calc_dust``, and
``pp.calc_complexity`` can be passed directly as the scoring function.

.. code-block:: python

    import poolparty as pp
    pp.init()

----

Parameters
----------

.. list-table::
   :header-rows: 1
   :widths: auto

   * - Parameter
     - Type
     - Default
     - Description
   * - ``pool``
     - ``Pool | str``
     - *(required)*
     - The Pool or sequence string to score.
   * - ``fn``
     - ``callable``
     - *(required)*
     - Scoring function ``(str) -> any``. Receives a clean (tag-free)
       sequence string and returns any scalar value to record.
   * - ``card_key``
     - ``str``
     - ``'score'``
     - Design card column name under which the result is stored.
   * - ``region``
     - ``str | list | None``
     - ``None``
     - Region to score. A named tag (str), ``[start, stop]`` interval, or
       ``None`` to score the full sequence.
   * - ``prefix``
     - ``str | None``
     - ``None``
     - Prefix for auto-generated sequence names.
   * - ``cards``
     - ``list | dict | None``
     - ``None``
     - Design card keys to include. The available key is the value of
       ``card_key`` (default ``'score'``).

----

.. note::

   Only the most commonly used parameters are shown above. For the full
   parameter list, see :func:`~poolparty.score` in the
   :doc:`API Reference `.

Examples
--------

Custom scoring function
~~~~~~~~~~~~~~~~~~~~~~~~

The scoring function takes a sequence string and returns any scalar value.
Define it as a regular function so the pattern is explicit.

.. code-block:: python

    def gc_fraction(seq):
        # Fraction of bases in ``seq`` that are G or C.
        return (seq.count("G") + seq.count("C")) / len(seq)

    pool = pp.from_seqs(["AAAA", "ACGT", "GCGC", "CCCC", "ATAT"],
                        mode="sequential")
    scored = pp.score(pool, gc_fraction, card_key="gc",
                      cards={"gc": "gc"})
    df = scored.generate_library()

.. list-table::
   :header-rows: 1
   :widths: auto

   * - name
     - seq
     - gc
   * - None
     - AAAA
     - 0.0
   * - None
     - ACGT
     - 0.5
   * - None
     - GCGC
     - 1.0
   * - None
     - CCCC
     - 1.0
   * - None
     - ATAT
     - 0.0

The ``cards`` parameter controls how the card column is named in the
output. A dict ``{"gc": "gc"}`` maps the card key directly to the column
name. A list ``["gc"]`` also works but prefixes the column with the
operation id (e.g., ``op[1]:score.gc``); use the dict form to keep column
names clean.
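
Because the scoring function is ordinary Python, it can handle edge cases
itself. The sketch below is a hypothetical variant of ``gc_fraction`` (the
name ``safe_gc_fraction`` is illustrative and not part of PoolParty). It
assumes that empty or lowercase strings might reach the scorer, and returns
``0.0`` for an empty string instead of raising ``ZeroDivisionError``.

.. code-block:: python

    def safe_gc_fraction(seq):
        """GC fraction of ``seq``; 0.0 for an empty string (illustrative helper)."""
        if not seq:
            return 0.0
        upper = seq.upper()  # treat lowercase bases like uppercase ones
        return (upper.count("G") + upper.count("C")) / len(upper)

    print(safe_gc_fraction("acgt"))
    print(safe_gc_fraction(""))

It should be usable wherever ``gc_fraction`` is used above, for example
``pp.score(pool, safe_gc_fraction, card_key="gc", cards={"gc": "gc"})``.
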

Built-in scoring functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~

PoolParty includes several scoring functions that match the same
``(str) -> scalar`` pattern:

- ``pp.calc_gc``: GC fraction
- ``pp.calc_complexity``: linguistic complexity (0–1)
- ``pp.calc_dust``: DUST low-complexity score (lower = more complex)

.. code-block:: python

    pool = pp.from_seqs(["AAAA", "ACGT", "GCGC", "CCCC", "ATAT"],
                        mode="sequential")
    scored = pp.score(pool, pp.calc_gc, card_key="gc",
                      cards={"gc": "gc"})
    df = scored.generate_library()

.. list-table::
   :header-rows: 1
   :widths: auto

   * - name
     - seq
     - gc
   * - None
     - AAAA
     - 0.0
   * - None
     - ACGT
     - 0.5
   * - None
     - GCGC
     - 1.0
   * - None
     - CCCC
     - 1.0
   * - None
     - ATAT
     - 0.0

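
A scorer that needs extra arguments can be adapted to the single-argument
``(str) -> scalar`` form with ``functools.partial`` from the standard
library. The following is a minimal sketch; ``count_motif`` and the ``"CG"``
motif are hypothetical and only illustrate the pattern.

.. code-block:: python

    from functools import partial

    def count_motif(seq, motif):
        """Count possibly overlapping occurrences of ``motif`` in ``seq``."""
        if not motif:
            return 0
        return sum(
            seq.startswith(motif, i) for i in range(len(seq) - len(motif) + 1)
        )

    # Bind the extra argument so the result takes only the sequence string.
    count_cg = partial(count_motif, motif="CG")

    print(count_cg("ACGCGT"))

``count_cg`` could then be passed as ``fn``, for example
``pp.score(pool, count_cg, card_key="cg_count", cards={"cg_count": "cg_count"})``.
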

Score only a named region
~~~~~~~~~~~~~~~~~~~~~~~~~~

``region`` restricts scoring to the tagged segment; the full sequence
passes through unchanged.

.. code-block:: python

    # The <cre>...</cre> tag marks the region that is mutagenized and scored.
    wt = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
    muts = pp.mutagenize(wt, num_mutations=1, region="cre",
                         mode="random", num_states=5)
    scored = pp.score(muts, pp.calc_gc, region="cre", card_key="cre_gc",
                      cards={"cre_gc": "cre_gc"})
    df = scored.generate_library()

.. list-table::
   :header-rows: 1
   :widths: auto

   * - name
     - seq
     - cre_gc
   * - None
     - ``AAAA<cre>ATCGGTCG</cre>TTTT``
     - 0.625
   * - None
     - ``AAAA<cre>ATCGAACG</cre>TTTT``
     - 0.500
   * - None
     - ``AAAA<cre>ATCGCTCG</cre>TTTT``
     - 0.625
   * - None
     - ``AAAA<cre>GTCGATCG</cre>TTTT``
     - 0.625
   * - None
     - ``AAAA<cre>ACCGATCG</cre>TTTT``
     - 0.625

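
To make the region semantics concrete, the pure-Python sketch below mimics
what a region scorer sees: the content between ``<cre>`` and ``</cre>``,
with the tags themselves removed. ``region_content`` is a hypothetical
helper written for illustration; it is not PoolParty's implementation and
assumes simple, non-nested ``<name>...</name>`` tags.

.. code-block:: python

    import re

    def region_content(tagged_seq, name):
        """Return the text between <name> and </name>, with inner tags removed."""
        tag = re.escape(name)
        match = re.search(rf"<{tag}>(.*?)</{tag}>", tagged_seq)
        if match is None:
            raise ValueError(f"region {name!r} not found")
        # Drop any remaining tags so only sequence characters are scored.
        return re.sub(r"<[^>]*>", "", match.group(1))

    def gc_fraction(seq):
        return (seq.count("G") + seq.count("C")) / len(seq)

    content = region_content("AAAA<cre>ATCGGTCG</cre>TTTT", "cre")
    print(content, gc_fraction(content))

For the first sequence in the table above, the extracted content is
``ATCGGTCG`` and its GC fraction is 0.625, matching the recorded ``cre_gc``.
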

Multiple scores in a pipeline
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Chain two ``score`` calls to record multiple metrics in one library.

.. code-block:: python

    pool = pp.from_seqs(["AAAA", "ACGT", "GCGC", "CCCC", "ATAT"],
                        mode="sequential")
    # Each call adds one card column; the sequences pass through unchanged.
    scored = pp.score(pool, pp.calc_gc, card_key="gc",
                      cards={"gc": "gc"})
    scored = pp.score(scored, pp.calc_complexity, card_key="complexity",
                      cards={"complexity": "complexity"})
    df = scored.generate_library()

.. list-table::
   :header-rows: 1
   :widths: auto

   * - name
     - seq
     - gc
     - complexity
   * - None
     - AAAA
     - 0.00
     - 0.36
   * - None
     - ACGT
     - 0.50
     - 1.00
   * - None
     - GCGC
     - 1.00
     - 0.72
   * - None
     - CCCC
     - 1.00
     - 0.36
   * - None
     - ATAT
     - 0.00
     - 0.72

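
When many metrics are recorded, the repeated calls can be folded into a
loop. The helper below is a sketch that assumes ``pp.score`` accepts the
keyword arguments listed under Parameters; ``score_all`` is a hypothetical
name, and the scoring operation is passed in as ``score_op`` so the helper
does not depend on any particular import.

.. code-block:: python

    def score_all(pool, scorers, score_op):
        """Apply ``score_op`` once per (card_key, fn) pair, threading the pool through.

        ``scorers`` maps a card key to a (str) -> scalar function. Dict order is
        preserved, so columns are added in the order given.
        """
        for key, fn in scorers.items():
            # Use the dict form of ``cards`` to keep the column name clean.
            pool = score_op(pool, fn, card_key=key, cards={key: key})
        return pool

With PoolParty this would be called as
``score_all(pool, {"gc": pp.calc_gc, "complexity": pp.calc_complexity}, pp.score)``,
which should be equivalent to the two chained calls above.
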
See :func:`~poolparty.score`.