Library Size
============

Every pool has a ``num_states`` property: the number of distinct sequences it
can produce. Each operation has an **internal state** (see :doc:`modes`), and the
number of values that state can take is determined by the operation's mode and
parameters. When operations are chained, the output pool's ``num_states`` is the
product of the internal state counts along the chain. Other composition patterns
(stacking, synchronisation) follow different rules, described below.

The examples on this page assume poolparty has been imported and initialised:

.. code-block:: python

    import poolparty as pp
    pp.init()
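
The chaining rule is plain multiplication, so it can be checked without
poolparty. The sketch below is only an arithmetic model:
``chained_num_states`` is a hypothetical helper, not a poolparty function, and
the counts 3, 9, and 1 are illustrative.

.. code-block:: python

    from math import prod

    def chained_num_states(internal_state_counts):
        """Multiply the internal state counts of every step in a chain."""
        return prod(internal_state_counts)

    # A 3-state source, a 9-state operation, then a fixed (1-state) operation.
    print(chained_num_states([3, 9, 1]))  # 27

An empty chain gives 1 because ``math.prod`` returns its start value of 1 for
an empty iterable.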

----

Composition rules
-----------------

Three rules determine how ``pool.num_states`` changes as operations are
applied.

.. rubric:: Multiply (Cartesian product)

Chaining an operation pairs every state of the input pool with every internal
state of the operation, producing all combinations. The resulting
``num_states`` is the input pool's ``num_states`` multiplied by
``operation.num_states``.

.. code-block:: python

    seqs    = pp.from_seqs(["AAA", "CCC", "GGG"], mode="sequential")  # seqs (pool): 3 states
    mutants = pp.mutagenize(seqs, num_mutations=1, mode="sequential")  # mutagenize (op): 9 internal states
    print(mutants.num_states)   # 27  (3 × 9)
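
To see where 3 × 9 comes from, the following plain-Python sketch enumerates
single-substitution mutants directly. It does not call poolparty:
``single_mutants`` is a hypothetical helper, and the sketch assumes a
four-letter DNA alphabet and that one mutation means one substitution. The order
in which poolparty enumerates states may differ.

.. code-block:: python

    ALPHABET = "ACGT"

    def single_mutants(seq):
        """Yield every sequence that differs from seq at exactly one position."""
        for pos, ref in enumerate(seq):
            for alt in ALPHABET:
                if alt != ref:
                    yield seq[:pos] + alt + seq[pos + 1:]

    seqs = ["AAA", "CCC", "GGG"]

    # 3 positions x 3 alternative bases = 9 mutants per input sequence.
    per_seq = [len(list(single_mutants(s))) for s in seqs]

    # Pairing every input sequence with every mutation gives 3 x 9 outcomes.
    all_mutants = [m for s in seqs for m in single_mutants(s)]

    print(per_seq, len(all_mutants))  # [9, 9, 9] 27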

.. rubric:: Add (disjoint union)

``stack`` (or the ``+`` operator) places its input pools side by side. Sequences
from each branch appear in the output but are not combined with each other,
so the resulting ``num_states`` is the sum of the inputs' ``num_states``.

.. code-block:: python

    a = pp.from_seqs(["AAA", "CCC", "GGG"], mode="sequential")          # a (pool): 3 states
    b = pp.from_seqs(["TTT", "ATA", "GAG", "CTC"], mode="sequential")  # b (pool): 4 states
    combined = pp.stack([a, b])
    print(combined.num_states)  # 7  (3 + 4)
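
The difference between adding and multiplying can be illustrated with ordinary
lists. This is a sketch of the counting only, not of how ``stack`` is
implemented: list concatenation stands in for stacking, and
``itertools.product`` stands in for a Cartesian combination.

.. code-block:: python

    from itertools import product

    a = ["AAA", "CCC", "GGG"]
    b = ["TTT", "ATA", "GAG", "CTC"]

    # Disjoint union: each sequence appears on its own, so the counts add.
    stacked = a + b

    # Cartesian product, shown for contrast: every a is paired with every b.
    paired = list(product(a, b))

    print(len(stacked), len(paired))  # 7 12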

.. rubric:: No change (×1)

Fixed-mode operations transform each input sequence in exactly one
deterministic way, so ``operation.num_states`` is 1 and the pool's
``num_states`` is unchanged.

.. code-block:: python

    seqs    = pp.from_seqs(["ACG", "TGA", "CCC"], mode="sequential")  # seqs (pool): 3 states
    flipped = pp.rc(seqs)                                              # rc (op): 1 internal state
    print(flipped.num_states)   # 3  (3 × 1)

----

Per-category behaviour
----------------------

.. list-table::
   :header-rows: 1
   :widths: 15 15 70

   * - Category
     - Effect
     - Operation(s)
   * - Source
     - sets initial size
     - ``from_seq``, ``from_seqs``, ``from_fasta``, ``from_iupac``, ``from_motif``, ``get_kmers``, ``get_barcodes``
   * - Mutagenesis
     - multiplies
     - ``mutagenize``, ``shuffle_seq``, ``recombine``, ``flip``
   * - Scanning
     - multiplies
     - ``deletion_scan``, ``insertion_scan``, ``replacement_scan``, ``shuffle_scan``, ``mutagenize_scan``, ``subseq_scan``, and multi-window variants
   * - Regions
     - multiplies
     - ``replace_region``, ``region_scan``, ``region_multiscan``
   * - Regions
     - unchanged
     - ``insert_tags``, ``remove_tags``, ``annotate_region``, ``apply_at_region``, ``extract_region``
   * - ORF
     - multiplies
     - ``mutagenize_orf``, ``reverse_translate``
   * - ORF
     - unchanged
     - ``translate``, ``annotate_orf``, ``stylize_orf``
   * - Composition
     - multiplies
     - ``join``
   * - Composition
     - adds
     - ``stack``
   * - State
     - multiplies
     - ``repeat``
   * - State
     - reduces
     - ``sample``, ``filter``, ``slice_states``
   * - State
     - unchanged
     - ``shuffle_states``, ``sync``, ``score``, ``materialize``
   * - Utilities
     - unchanged
     - ``rc``, ``upper``, ``lower``, ``swapcase``, ``stylize``, ``clear_gaps``, ``clear_annotation``, ``slice_seq``, ``add_prefix``

----

Synchronisation
---------------

Without ``sync``, joining two independent 3-state pools produces
3 × 3 = 9 states (Cartesian product). After syncing, the pools iterate in
lockstep, producing only 3 paired states. See :doc:`sync` for details.

.. code-block:: python

    left  = pp.from_seqs(["AAA", "CCC", "GGG"], mode="sequential")
    right = pp.from_seqs(["TTT", "AAA", "CCC"], mode="sequential")

    # Without sync: 3 × 3 = 9
    print(pp.join([left, right]).num_states)   # 9

.. code-block:: python

    left  = pp.from_seqs(["AAA", "CCC", "GGG"], mode="sequential")
    right = pp.from_seqs(["TTT", "AAA", "CCC"], mode="sequential")

    # With sync: the two pools advance in lockstep, so only 3 pairs remain
    pp.sync([left, right])
    paired = pp.join([left, right])
    print(paired.num_states)                   # 3
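
The two behaviours correspond to ``itertools.product`` and ``zip`` in plain
Python. The sketch below models only the pairing; it assumes, for illustration,
that joining concatenates the paired sequences with no separator, which may not
match ``join``'s actual output format.

.. code-block:: python

    from itertools import product

    left = ["AAA", "CCC", "GGG"]
    right = ["TTT", "AAA", "CCC"]

    # Independent pools: every left state meets every right state.
    unsynced = [l + r for l, r in product(left, right)]

    # Synced pools: state i of left is paired only with state i of right.
    synced = [l + r for l, r in zip(left, right)]

    print(len(unsynced), len(synced))  # 9 3
    print(synced)                      # ['AAATTT', 'CCCAAA', 'GGGCCC']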

----

Worked example
--------------

The following pipeline combines chaining (multiply), ``stack`` (add), and a
final chained step (multiply again):

.. code-block:: python

    wt = pp.from_seq("ACGT<cre>ATCG</cre>TTTT<bc/>GGGG")        # wt.num_states: 1

    # Branch 1: sequential mutagenesis
    mutants = wt.mutagenize(region="cre", num_mutations=1,
                            mode="sequential")
    # mutagenize (op): 4 positions × 3 alt bases = 12 internal states
    # mutants.num_states: 1 × 12 = 12

    # Branch 2: deletion scan
    dels = wt.deletion_scan(region="cre", deletion_length=2,
                            mode="sequential")
    # deletion_scan (op): 4 - 2 + 1 = 3 window positions = 3 internal states
    # dels.num_states: 1 × 3 = 3

    # Stack branches (addition)
    combined = pp.stack([mutants, dels])
    # combined.num_states: 12 + 3 = 15

    # Add barcodes (Cartesian product)
    barcoded = combined.insert_kmers(region="bc", length=2,
                                     mode="sequential")
    # insert_kmers (op): 4² = 16 internal states
    # barcoded.num_states: 15 × 16 = 240

    print(barcoded.num_states)  # 240
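
The arithmetic in the comments above can be reproduced with plain integers. The
variable names below are illustrative and nothing here calls poolparty; the
sketch only re-derives the counts from the region length and the parameters
used in the example.

.. code-block:: python

    cre = "ATCG"          # contents of the <cre> region
    wt_states = 1         # a single wild-type sequence

    mutagenize_states = len(cre) * 3       # 4 positions x 3 alt bases = 12
    deletion_states = len(cre) - 2 + 1     # windows of length 2 in 4 bases = 3
    kmer_states = 4 ** 2                   # all 2-mers over ACGT = 16

    mutants = wt_states * mutagenize_states   # chaining multiplies
    dels = wt_states * deletion_states        # chaining multiplies
    combined = mutants + dels                 # stacking adds
    barcoded = combined * kmer_states         # chaining multiplies again

    print(mutants, dels, combined, barcoded)  # 12 3 15 240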

----

Practical tips
--------------

- Use the ``num_states`` parameter to cap large sequential enumerations before
  they multiply downstream.
- Use :doc:`sync` to pair pools that should iterate together instead of
  forming a Cartesian product.
- Use ``sample`` or ``slice_states`` to reduce an oversized library after
  construction.
- In random mode without the ``num_states`` parameter, each sequence gets a
  fresh random draw (×1, no multiplication). With ``num_states=N``, the
  operation contributes *N* randomly chosen designs that multiply with the
  input pool (×N).
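
The last tip amounts to the following arithmetic. ``random_op_factor`` is a
hypothetical helper written for this illustration, not part of poolparty, and
the input size of 3 is arbitrary.

.. code-block:: python

    def random_op_factor(num_states=None):
        """Multiplier contributed by a random-mode operation."""
        # No num_states: one fresh draw per input sequence (x1).
        # num_states=N: N randomly chosen designs per input sequence (xN).
        return 1 if num_states is None else num_states

    input_states = 3
    print(input_states * random_op_factor())    # 3
    print(input_states * random_op_factor(5))   # 15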