mutagenize
==========

Introduce point mutations into every sequence in a pool. Exactly one of
``num_mutations`` or ``mutation_rate`` must be supplied. Pass ``region`` to
restrict mutagenesis to a named tagged segment; use ``allowed_chars`` to limit
which substitutions are permitted.

.. code-block:: python

    import poolparty as pp
    pp.init()

----

Parameters
----------

.. list-table::
   :header-rows: 1
   :widths: auto

   * - Parameter
     - Type
     - Default
     - Description
   * - ``pool``
     - ``Pool | str``
     - *(required)*
     - The Pool to mutagenize. Can also be a plain sequence string.
   * - ``num_mutations``
     - ``int | None``
     - ``None``
     - Fixed number of point mutations per draw. Mutually exclusive with
       ``mutation_rate``.
   * - ``mutation_rate``
     - ``float | None``
     - ``None``
     - Per-base probability of mutation. Each base is mutated independently
       with this probability. Mutually exclusive with ``num_mutations``.
   * - ``region``
     - ``str | None``
     - ``None``
     - Region to restrict mutations to: a tag name (``str``), an explicit
       ``[start, stop]`` interval, or ``None`` for the full sequence.
   * - ``allowed_chars``
     - ``str | None``
     - ``None``
     - IUPAC string of the same length as the sequence specifying the allowed
       bases at each position. Only positions with more than one allowed base
       are mutable.
   * - ``style``
     - ``str | None``
     - ``None``
     - Named display style applied to mutated bases.
   * - ``prefix``
     - ``str | None``
     - ``None``
     - Prefix for auto-generated sequence names.
   * - ``mode``
     - ``str``
     - ``'random'``
     - ``'sequential'`` enumerates mutation variants in order (requires
       ``num_mutations``); ``'random'`` samples each draw independently.
   * - ``num_states``
     - ``int | None``
     - ``None``
     - Number of output states. ``None`` auto-computes in sequential mode or
       defaults to 1 in random mode.
   * - ``iter_order``
     - ``int | None``
     - ``None``
     - Enumeration order when combined with other pools.

----

.. note::

   Only the most commonly used parameters are shown above.
   For the full parameter list, see :func:`~poolparty.mutagenize` in the
   :doc:`API Reference `.

Examples
--------

Single random mutation (num_mutations=1)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Each draw returns one sequence with a single substitution at a randomly chosen
position.

.. code-block:: python

    wt = pp.from_seq("ATCGATCG")
    mutants = wt.mutagenize(num_mutations=1, mode="random", style="red")
    mutants.print_library()

.. code-block:: text

    mutants: seq_length=8, num_states=1
    ATCGGTCG

Multiple independent mutants with ``num_states``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Pass ``num_states`` to draw multiple independent single-mutant sequences in one
``generate_library`` call.

.. code-block:: python

    wt = pp.from_seq("ATCGATCG")
    mutants = wt.mutagenize(num_mutations=1, num_states=5, mode="random", style="red")
    mutants.print_library()

.. code-block:: text

    mutants: seq_length=8, num_states=5
    ATCGGTCG
    ATCGAACG
    ATCGCTCG
    GTCGATCG
    ACCGATCG

Per-base mutation rate (mutation_rate=0.1)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``mutation_rate`` applies an independent per-position probability; the number
of substitutions per draw follows a Binomial distribution and may be zero.

.. code-block:: python

    wt = pp.from_seq("ATCGATCG")
    mutants = wt.mutagenize(mutation_rate=0.1, mode="random", style="red")
    mutants.print_library()

.. code-block:: text

    mutants: seq_length=8, num_states=1
    ATCGTTAG

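
To make the Binomial claim concrete, the following standard-library sketch computes the distribution of the substitution count for an 8-base sequence at ``mutation_rate=0.1``. It does not call poolparty; the helper name ``mutation_count_pmf`` is hypothetical, and the calculation assumes that all 8 positions are mutable and that every mutation event changes the base.

```python
from math import comb


def mutation_count_pmf(seq_length: int, rate: float) -> list[float]:
    """Return P(k substitutions) for k = 0..seq_length.

    Assumes each position mutates independently with probability `rate`
    (hypothetical helper, not part of poolparty).
    """
    return [
        comb(seq_length, k) * rate**k * (1 - rate) ** (seq_length - k)
        for k in range(seq_length + 1)
    ]


pmf = mutation_count_pmf(seq_length=8, rate=0.1)
expected = sum(k * p for k, p in enumerate(pmf))

print(f"P(no mutation) = {pmf[0]:.3f}")   # 0.9 ** 8, about 0.430
print(f"P(exactly one) = {pmf[1]:.3f}")   # 8 * 0.1 * 0.9 ** 7, about 0.383
print(f"expected count = {expected:.2f}")  # 8 * 0.1 = 0.80
```

Under these assumptions, roughly 43% of draws from a short sequence like this one would be identical to the wild type, which is worth keeping in mind when choosing between ``mutation_rate`` and a fixed ``num_mutations``.
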
Mutate only within a named region
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``region`` confines all mutations to the tagged segment; flanks are returned
unchanged. With ``mode='sequential'``, every single-base variant within the
region is enumerated.

.. code-block:: python

    # The <cre>...</cre> tag marks the region that region="cre" refers to.
    wt = pp.from_seq("AAAA<cre>ATCGATCG</cre>TTTT")
    mutants = wt.mutagenize(num_mutations=1, region="cre", mode="sequential", style="red")
    mutants.print_library()

.. code-block:: text

    mutants: seq_length=16, num_states=24
    AAAA<cre>CTCGATCG</cre>TTTT
    AAAA<cre>GTCGATCG</cre>TTTT
    AAAA<cre>TTCGATCG</cre>TTTT
    AAAA<cre>AACGATCG</cre>TTTT
    AAAA<cre>ACCGATCG</cre>TTTT
    ... (24 total)

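
The count of 24 follows from 8 region positions times 3 alternative bases each. As an illustration only, the sketch below reproduces that enumeration in plain Python, assuming the inline ``<cre>...</cre>`` tag notation shown in the output. The function ``enumerate_region_variants`` is a hypothetical helper, not part of poolparty, and the ordering (position by position, bases in ``ACGT`` order) is an assumption that happens to agree with the first rows listed above.

```python
import re

BASES = "ACGT"


def enumerate_region_variants(tagged: str, tag: str) -> list[str]:
    """List every single-base substitution inside <tag>...</tag>.

    Flanking sequence outside the tag is copied through unchanged.
    Hypothetical helper for illustration; not poolparty's implementation.
    """
    name = re.escape(tag)
    match = re.fullmatch(rf"(.*)<{name}>(.*)</{name}>(.*)", tagged)
    if match is None:
        raise ValueError(f"no <{tag}> region found in {tagged!r}")
    left, region, right = match.groups()

    variants = []
    for i, wt_base in enumerate(region):
        for base in BASES:
            if base != wt_base:  # skip the wild-type base at this position
                mutated = region[:i] + base + region[i + 1:]
                variants.append(f"{left}<{tag}>{mutated}</{tag}>{right}")
    return variants


variants = enumerate_region_variants("AAAA<cre>ATCGATCG</cre>TTTT", "cre")
print(len(variants))  # 8 positions x 3 alternative bases = 24
print(variants[0])    # AAAA<cre>CTCGATCG</cre>TTTT
```
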
Restrict substitutions with ``allowed_chars``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``allowed_chars="SSSSSSSS"`` (S = {G,C}) restricts mutations to G↔C swaps at
every position; no A or T substitutions are made. ``mode='sequential'``
enumerates every allowed swap.

.. code-block:: python

    wt = pp.from_seq("GCGCGCGC")
    mutants = wt.mutagenize(num_mutations=1, allowed_chars="SSSSSSSS", mode="sequential", style="red")
    mutants.print_library()

.. code-block:: text

    mutants: seq_length=8, num_states=8
    CCGCGCGC
    GGGCGCGC
    GCCCGCGC
    GCGGGCGC
    GCGCCCGC
    GCGCGGGC
    GCGCGCCC
    GCGCGCGG

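
The effect of an IUPAC mask can be sketched without poolparty. The example below expands each IUPAC code to its base set and lists the single-base substitutions that the mask permits; ``allowed_single_mutants`` is a hypothetical helper written for this illustration, and it assumes the wild-type base is simply excluded from each position's allowed set.

```python
# Standard IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def allowed_single_mutants(seq: str, allowed_chars: str) -> list[str]:
    """List single-base variants of `seq` permitted by an IUPAC mask.

    Hypothetical helper for illustration; not poolparty's implementation.
    """
    if len(seq) != len(allowed_chars):
        raise ValueError("allowed_chars must be the same length as the sequence")
    variants = []
    for i, (wt_base, code) in enumerate(zip(seq, allowed_chars)):
        for base in IUPAC[code]:
            if base != wt_base:  # only genuine substitutions count
                variants.append(seq[:i] + base + seq[i + 1:])
    return variants


swaps = allowed_single_mutants("GCGCGCGC", "SSSSSSSS")
print(len(swaps))  # 8: one G<->C swap per position
print(swaps[0])    # CCGCGCGC

# A position whose mask allows only the wild-type base cannot mutate.
print(allowed_single_mutants("GA", "SA"))  # ['CA']
```
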
Sequential enumeration (mode="sequential")
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``mode='sequential'`` with ``num_mutations=1`` enumerates every single-point
variant in deterministic order, covering all positions and non-wild-type bases.

.. code-block:: python

    wt = pp.from_seq("ACGT")
    mutants = wt.mutagenize(num_mutations=1, mode="sequential", style="red")
    mutants.print_library()

.. code-block:: text

    mutants: seq_length=4, num_states=12
    CCGT
    GCGT
    TCGT
    AAGT
    AGGT
    ... (12 total)

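
The state count of 12 is 4 positions times 3 non-wild-type bases. A minimal sketch of the same enumeration in plain Python follows; ``enumerate_mutants`` is a hypothetical helper, and both its ordering and its treatment of ``num_mutations > 1`` (distinct positions, each changed to a non-wild-type base) are assumptions rather than documented poolparty behavior.

```python
from itertools import combinations, product
from math import comb

BASES = "ACGT"


def enumerate_mutants(seq: str, num_mutations: int) -> list[str]:
    """Enumerate variants with exactly `num_mutations` substituted positions.

    Hypothetical helper for illustration; not poolparty's implementation.
    """
    variants = []
    for positions in combinations(range(len(seq)), num_mutations):
        # At each chosen position, every base except the wild type is allowed.
        choices = [[b for b in BASES if b != seq[p]] for p in positions]
        for replacement in product(*choices):
            chars = list(seq)
            for p, base in zip(positions, replacement):
                chars[p] = base
            variants.append("".join(chars))
    return variants


singles = enumerate_mutants("ACGT", 1)
print(len(singles))  # 4 positions x 3 bases = 12
print(singles[:5])   # ['CCGT', 'GCGT', 'TCGT', 'AAGT', 'AGGT']

doubles = enumerate_mutants("ACGT", 2)
print(len(doubles) == comb(4, 2) * 3**2)  # True: C(4, 2) * 3^2 = 54
```

Under these assumptions the number of variants grows as C(L, k) * 3^k for sequence length L and k mutations, so exhaustive enumeration is practical mainly for small k or short regions.
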
See :func:`~poolparty.mutagenize`.