For contributors · humans & agents

Developer Guide

How the ppvm repository is organised, how to build and test it, and where to look when you want to extend it. Written so that an AI agent can read just the sections it needs.

§ 1 Orient yourself

ppvm is a Cargo workspace with a Python wrapper layered on top. The Rust crates are the source of truth; the Python package is a thin PyO3-based binding.

ppvm/
├── crates/
│   ├── ppvm-traits          # Trait system, Config bundle, Pauli alphabet, map impls
│   ├── ppvm-pauli-word      # Packed Pauli strings: PauliWord, phased, lossy, pattern
│   ├── ppvm-pauli-sum       # PauliSum engine, truncation strategy, concrete configs
│   ├── ppvm-tableau         # Stabilizer + generalized-tableau simulator
│   ├── ppvm-sym             # Symbolic (parametric) Pauli propagation
│   ├── ppvm-stim            # Stim program execution against the tableau
│   ├── stim-parser          # Standalone parser for the Stim circuit format
│   └── ppvm-python-native   # PyO3 bindings, compiled into `ppvm` as `ppvm._core`
├── ppvm-python/             # Python package `ppvm` (maturin: wrapper + `ppvm._core`)
├── docs/                    # This documentation site (Astro)
├── examples/                # Rust examples (symbolic.rs, trotter.rs)
└── AGENTS.md                # Pointer to this guide

Dependency graph. ppvm-traits is the foundation; ppvm-pauli-word builds on it, and ppvm-pauli-sum builds on both. ppvm-tableau, ppvm-sym, and ppvm-stim depend on the Pauli crates. ppvm-stim additionally depends on ppvm-tableau and stim-parser. ppvm-python-native depends on ppvm-pauli-sum and ppvm-tableau.
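The paragraph above can be written down as data, which helps when working out which crates a change might affect. This is a minimal sketch, not something shipped in the repo. The DEPS table is transcribed from the prose, "the Pauli crates" is read as ppvm-pauli-word plus ppvm-pauli-sum (an assumption), and stim-parser is assumed to have no in-workspace dependencies. The Cargo.toml files remain the source of truth.

```python
from graphlib import TopologicalSorter

# Direct in-workspace dependencies, transcribed from the paragraph above.
# Assumption: "the Pauli crates" means ppvm-pauli-word and ppvm-pauli-sum.
DEPS = {
    "ppvm-traits": set(),
    "stim-parser": set(),
    "ppvm-pauli-word": {"ppvm-traits"},
    "ppvm-pauli-sum": {"ppvm-traits", "ppvm-pauli-word"},
    "ppvm-tableau": {"ppvm-pauli-word", "ppvm-pauli-sum"},
    "ppvm-sym": {"ppvm-pauli-word", "ppvm-pauli-sum"},
    "ppvm-stim": {"ppvm-pauli-word", "ppvm-pauli-sum", "ppvm-tableau", "stim-parser"},
    "ppvm-python-native": {"ppvm-pauli-sum", "ppvm-tableau"},
}


def build_order(deps):
    """Return the crates with every dependency listed before its dependents."""
    return list(TopologicalSorter(deps).static_order())


def dependents_of(crate, deps):
    """Every crate that depends on `crate`, directly or transitively."""
    found = set()
    frontier = [crate]
    while frontier:
        current = frontier.pop()
        for name, direct in deps.items():
            if current in direct and name not in found:
                found.add(name)
                frontier.append(name)
    return found


order = build_order(DEPS)
affected_by_tableau = dependents_of("ppvm-tableau", DEPS)
```

Under these assumptions, a change in ppvm-tableau can only reach ppvm-stim and ppvm-python-native, while a change in ppvm-traits reaches every crate except stim-parser.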

§ 2 Build & test

Rust

# All Rust tests
cargo test --workspace

# A single crate
cargo test -p ppvm-tableau

# A single test by name
cargo test -p ppvm-pauli-sum -- test_ghz

# Benchmarks
cargo bench -p ppvm-tableau --bench micro
cargo bench --bench micro -- "gates/single-qubit/h"

The workspace uses Rust edition 2024. On x86 the default hasher (gxhash) needs the AES and SSE2 target features; this repo sets them for x86_64 in .cargo/config.toml, and CI picks them up from that same file. On non-x86 hosts, build with --no-default-features --features=indexmap,ahash or similar.

WebAssembly (wasm32)

The whole workspace except ppvm-python-native (a CPython extension, never a wasm target) cross-compiles to browser wasm with no extra flags:

rustup target add wasm32-unknown-unknown

# The simulators, Pauli engine, Stim parser, and top-level `ppvm` crate.
cargo build --target wasm32-unknown-unknown --workspace --exclude ppvm-python-native

The build is wasm-clean automatically. Native-only acceleration dependencies — gxhash (AES intrinsics), dashmap → rayon (OS threads), and ahash — live in [target.'cfg(not(target_arch = "wasm32"))'.dependencies] tables, so on wasm they are pruned and the matching features go inert (the code that names those crates is gated with the same not(target_arch = "wasm32")). The fx64hash configs use native-word [usize; N] storage (u64 on 64-bit, u32 on wasm) since bitvec only implements BitStore for u64 on 64-bit pointer widths. ppvm-tableau-sum's structural fingerprint falls back from gxhash to fxhash on wasm.

rand's entropy source (rand::make_rng()) has no default backend on wasm32-unknown-unknown, so the getrandom wasm_js backend (Web Crypto API) is selected in two places: a --cfg getrandom_backend="wasm_js" rustflag in .cargo/config.toml, and the wasm_js feature in ppvm-tableau's wasm-only dependency table. With both in place, the JS runtime supplies randomness. There are no OS threads on wasm: the rayon feature is unavailable, and the Stim parser runs its recursive grammar inline instead of on a dedicated stack thread. The wasm32 build CI job compiles the workspace for this target on every PR.

Python

# Requires uv (https://docs.astral.sh/uv/)
# The native module compiles automatically via maturin on first run.

uv run --project ppvm-python --group dev pytest ppvm-python/test/

# A single file
uv run --project ppvm-python --group dev pytest ppvm-python/test/test_basics.py

# A single test by name
uv run --project ppvm-python --group dev pytest ppvm-python/test/ -k test_ghz

The compiled ppvm._core is part of the ppvm wheel, so after changing Rust force a rebuild with uv sync --project ppvm-python --reinstall-package ppvm (or maturin develop -m crates/ppvm-python-native/Cargo.toml). The Python project is configured to use uv-managed Python installations, so a fresh uv sync avoids linking PyO3 builds against a system Python.

This docs site

The Astro site you're reading lives under docs/. Every build step (rustdoc-JSON extraction, griffe-based Python API extraction, notebook execution, Astro build) is wired into docs/package.json, so a fresh checkout has one command to remember:

cd docs
npm install            # one-time
npm run dev            # extract everything, then `astro dev` (port 4321)
npm run build          # extract everything, then `astro build` → dist/

npm run dev / npm run build chain the three extraction steps in order so the rendered site picks up every public API change automatically. When you're iterating on a single layer, run that step on its own and refresh the already-running astro dev:

Command	Rebuilds	When to use
`npm run extract:rust`	`src/data/rust-api.json`	You changed a public Rust item — trait, struct, doc comment — and want it surfaced on `/api/`. Needs `cargo +nightly`.
`npm run extract:python`	`src/data/python-api.json`	You changed a public Python item under `ppvm-python/src/ppvm/` and want it surfaced on `/api/`. Uses `griffe` via `uv`.
`npm run extract:notebooks`	`src/generated/notebooks/*`	You edited or added a Jupytext file under `docs/notebooks/`. Re-executes the notebooks against the current ppvm-python build and embeds outputs.
`npm run extract`	All of the above, in order.	Touched several layers at once.
`npm run astro:dev` / `npm run astro:build`	Just Astro.	You're only editing `.astro` / `.css` files and trust the existing extractor outputs — fastest loop.

docs/src/data/ and docs/src/generated/ are both .gitignore'd; the only sources of truth for those files are the extractor scripts, which CI re-runs on every build. Adding a notebook is a single drop-in: docs/notebooks/my_notebook.py (Jupytext-percent format) → npm run extract:notebooks → the Examples landing page picks it up from src/generated/notebooks/index.json.
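For reference, a Jupytext percent-format file is plain Python in which # %% starts a code cell and # %% [markdown] starts a Markdown cell whose content is written as comments. The file below is a hypothetical docs/notebooks/my_notebook.py, offered as a sketch only. It deliberately avoids the ppvm API, and whether the title and headings in the generated metadata come from the Markdown headings is an assumption.

```python
# %% [markdown]
# # My notebook
#
# A short introduction, rendered as Markdown on the Examples page.

# %%
# A code cell: anything printed here is captured as cell output.
values = [n * n for n in range(5)]
print(values)

# %% [markdown]
# ## Summing the values

# %%
total = sum(values)
print(total)
```

Because the file is ordinary Python, it can also be run directly with the interpreter as a quick check before running npm run extract:notebooks.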

2.1 Notebook execution & caching

The script behind npm run extract:notebooks lives at docs/scripts/build-notebooks.py. Per-notebook pipeline:

1. Read the Jupytext percent-format .py file and convert it to an in-memory ipynb.
2. Prepend a hidden setup cell that switches matplotlib to the IPython inline backend. Without this, plt.show() renders to a buffer that never reaches the cell output and plots are silently dropped.
3. Execute every code cell via nbclient. Text output, tracebacks, and matplotlib figures are captured inline; figures are embedded as base64 PNGs.
4. Drop the hidden setup cell, render to an HTML fragment via nbconvert's basic template (no JupyterLab chrome; the site stylesheet themes the .jp-* classes), and sanitise through bleach against an allow-list that permits data:image/png URLs but strips scripts and iframes.
5. Write docs/src/generated/notebooks/<slug>.html + <slug>.json (title, ordered headings, language, source path). The Astro routes at docs/src/pages/examples/index.astro and [slug].astro consume index.json + the per-slug fragments at build time.

Content-addressed cache. Executing every notebook from scratch on every PR is expensive — the long-running examples can dominate CI. To avoid that, every successful run also writes its output to docs/.notebook-cache/<hash>.{html,json}, keyed by

sha256(CACHE_SCHEMA_VERSION + docs/scripts/build-notebooks.py + notebook source + Cargo.lock + Cargo.toml + crates/*/Cargo.toml + ppvm-python/uv.lock)

Hashing the extractor itself means that a change to the rendering / sanitiser / matplotlib-setup logic invalidates every cached entry automatically — without that, a tweak to the bleach allow-list would silently keep serving the previous HTML for every unchanged notebook source. The CACHE_SCHEMA_VERSION constant at the top of the script is an explicit global invalidation knob for changes the hash can't see (e.g. a new field in the sidecar JSON that downstream Astro pages start depending on).
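The key construction can be pictured with the sketch below. It is an illustration under stated assumptions, not a copy of build-notebooks.py: the function name notebook_fingerprint, the placeholder version value, the ordering of inputs, and the choice to hash each relative path alongside its bytes are all hypothetical.

```python
import hashlib
from pathlib import Path

# Placeholder value; the real constant lives at the top of build-notebooks.py.
CACHE_SCHEMA_VERSION = "1"


def notebook_fingerprint(repo_root, notebook, shared_files,
                         schema_version=CACHE_SCHEMA_VERSION):
    """Hex SHA-256 over the schema version, the shared inputs, and one notebook.

    `shared_files` stands in for the extractor script, Cargo.lock, the
    Cargo.toml files, and ppvm-python/uv.lock. All paths are strings relative
    to `repo_root`, so the key does not depend on where the checkout lives.
    """
    digest = hashlib.sha256()
    digest.update(schema_version.encode())
    # Sort the shared inputs so the key does not depend on listing order.
    for rel in [*sorted(shared_files), notebook]:
        # Hash the path as well as the bytes (an illustrative choice).
        digest.update(b"\0" + rel.encode() + b"\0")
        digest.update((Path(repo_root) / rel).read_bytes())
    return digest.hexdigest()
```

With this shape, editing the extractor script or bumping the schema version changes every notebook's key, while editing one notebook changes only that notebook's key.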

On the next run the script restores from the cache when the hash matches and only re-executes notebooks whose fingerprint changed. CI persists the directory via actions/cache keyed on the same set of files (see the "Restore executed-notebook cache" step in .github/workflows/docs.yml), so a docs-only PR that touches only .astro or .css hits the cache for every notebook and the build takes seconds.

What the fingerprint deliberately does not include: Rust .rs sources and Python package sources. Hashing every workspace file would force a re-execution on any cosmetic edit, which is exactly the cost we want to avoid. The tradeoff is that a numerical change inside a Rust crate without a dependency or Cargo.toml bump won't invalidate cached notebook outputs on a docs-only PR — rely on the standard test suites (cargo test --workspace, pytest) to catch those. (A scheduled full-rebuild workflow as a second safety net would be a reasonable future addition, but none exists today; bump CACHE_SCHEMA_VERSION manually if you ever need to force a global re-execution.)

Override knobs (mostly for debugging):

PPVM_NOTEBOOK_CACHE=0 — force re-execution of every notebook regardless of cache state (use when investigating suspected numerical drift).
PPVM_NOTEBOOK_CACHE_DIR=<path> — point the cache at a non-default directory (CI uses this implicitly via the default docs/.notebook-cache; tweak only if you need to share a cache across worktrees).
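A rough sketch of how the restore-or-execute decision and the two override knobs might fit together is shown below. Only the environment variable names and the default directory come from this guide. The function names are hypothetical, the sketch handles just the .html half of each cache entry, and two behaviours are assumptions: that "0" is the only value that disables the cache, and that a forced re-execution still writes its result back.

```python
import os
from pathlib import Path

DEFAULT_CACHE_DIR = Path("docs/.notebook-cache")


def cache_enabled(env=os.environ):
    """Assumption: only the literal value "0" disables cache restores."""
    return env.get("PPVM_NOTEBOOK_CACHE", "1") != "0"


def cache_dir(env=os.environ):
    """Use PPVM_NOTEBOOK_CACHE_DIR when set, else the default directory."""
    return Path(env.get("PPVM_NOTEBOOK_CACHE_DIR", DEFAULT_CACHE_DIR))


def restore_or_execute(fingerprint, execute, env=os.environ):
    """Return (html, origin) where origin is "cache" or "executed".

    `execute` is a zero-argument callable that runs the notebook and returns
    its rendered HTML; it is only called on a cache miss or when the cache
    is disabled.
    """
    directory = cache_dir(env)
    cached = directory / f"{fingerprint}.html"
    if cache_enabled(env) and cached.is_file():
        return cached.read_text(encoding="utf-8"), "cache"
    html = execute()
    # Store the fresh result so the next run with the same key can skip execution.
    directory.mkdir(parents=True, exist_ok=True)
    cached.write_text(html, encoding="utf-8")
    return html, "executed"
```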

Where to look when you need to change this.

Adding a new notebook: drop a Jupytext file under docs/notebooks/; no extractor change is needed.
Changing how notebooks render (sanitiser allow-list, matplotlib DPI, output format): edit docs/scripts/build-notebooks.py. Every edit to this file already invalidates the cache via the fingerprint, so no version bump is needed for routine pipeline tweaks.
Changing the fingerprint inputs (e.g. another lockfile becomes relevant): edit _shared_fingerprint_files() in that same script and the hashFiles(...) argument on the cache step in .github/workflows/docs.yml. Those two lists must stay in sync (note that docs/scripts/build-notebooks.py itself appears in both); otherwise the GH Actions cache key drifts from the script's per-notebook key and you get either stale outputs or perpetual misses.
Forcing a global invalidation independent of file content (e.g. a cached-artefact schema change): bump CACHE_SCHEMA_VERSION in the script; bump the notebooks-v1- prefix in the workflow when the GH Actions cache itself needs a clean slate.
Changing the Examples landing or per-notebook page chrome: edit the two .astro files under docs/src/pages/examples/.

Local prerequisites the scripts assume: node ≥ 20 (Astro 5), uv, Rust nightly (rustup toolchain install nightly). The full layout and rationale live in docs/README.md.

Continuous integration

CI lives in .github/workflows/ci.yml and is staged so the cheap, platform-independent checks gate the expensive cross-OS ones:

pre-commit (Linux) runs the full prek hook suite: rustfmt, clippy, cargo check --workspace --all-targets, ruff, ty, hawkeye, and the file hygiene hooks. Every other job lists it under needs:, so a lint or type failure stops the run before any test minutes are spent.
rust-tests and python-tests (Linux) run cargo test --workspace and the pytest suites. The pure-Rust crates are platform-agnostic, so Linux is the only OS that runs the full test suites.
extension-cross-platform (macOS + Windows) is the only cross-OS job. It builds the PyO3 extension via maturin and runs the extension's pytest suite. It declares needs: [rust-tests, python-tests], so the macOS/Windows runners only start once Linux is fully green.

Why cross-OS is extension-only. The compiled PyO3 module is the only artifact whose build is OS-sensitive — macOS needs -undefined dynamic_lookup (added by ppvm-python-native/build.rs; maturin sets it too), Windows links python3.lib, and Linux needs neither. Building that extension with maturin also compiles ppvm-python-native and its entire dependency tree on the target OS, so a cross-platform compile regression in any crate still surfaces here — without separately running cargo build for the whole workspace three times.

No global RUSTFLAGS. gxhash's +aes,+sse2 target features are set arch-scoped in .cargo/config.toml (cfg(target_arch = "x86_64")), not as a workflow-wide RUSTFLAGS — those x86 features are invalid on the aarch64 macos-latest runner and would fail to compile there. Linux and Windows (x86_64) still pick them up from the config.

§ 3 Architecture

ppvm implements two complementary quantum simulation backends. They share a common gate / noise trait hierarchy from ppvm-traits.

3.1 Pauli propagation (`ppvm-pauli-sum`)

Tracks Pauli operator evolution through circuits in the Heisenberg picture (gates are applied in reverse circuit order). The central type is PauliSum<T: Config>, a map from Pauli strings to coefficients.

Key design patterns — respect these when editing:

Config-based generics. The Config trait bundles Storage, Coefficient, Strategy, Map, and BuildHasher choices at compile time. Implementations live in config/ (fxhash, indexmap, dashmap, gxhash). Do not introduce runtime dispatch where a Config bound would do.
Dual-map optimisation. PauliSum maintains two internal maps (main + auxiliary) and swaps between them during gate propagation to avoid repeated allocations. Any new gate that writes to a fresh map must respect this swap.
Strategy pattern. Truncation policies (CoefficientThreshold, MaxPauliWeight, MaxLossWeight, CombinedStrategy) decide when small terms are dropped. Call .truncate() to apply.
Backward propagation. Circuits run backwards. To simulate H(0); CNOT(0,1) in the Heisenberg picture, call state.cnot(0,1); state.h(0): the CNOT precedes the Hadamard in code.
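Three of the patterns above (the main/auxiliary map swap, coefficient-threshold truncation, and backward propagation) can be seen together in a toy model. This is a plain-Python sketch for intuition only: ToyPauliSum, conj_h, and conj_cnot are hypothetical names, the real PauliSum is a generic Rust type with packed storage, and only H and CNOT are modelled. The sign rules are the standard Clifford conjugation rules for Pauli strings.

```python
# Each Pauli letter as (x_bit, z_bit); Y carries both bits.
BITS = {"I": (0, 0), "X": (1, 0), "Y": (1, 1), "Z": (0, 1)}
CHARS = {bits: char for char, bits in BITS.items()}


def conj_h(label, q):
    """Conjugate a Pauli string by H on qubit q; returns (new_label, sign)."""
    x, z = BITS[label[q]]
    sign = -1 if (x and z) else 1           # H maps Y to -Y
    chars = list(label)
    chars[q] = CHARS[(z, x)]                # H swaps X and Z
    return "".join(chars), sign


def conj_cnot(label, c, t):
    """Conjugate a Pauli string by CNOT(control=c, target=t)."""
    (xc, zc), (xt, zt) = BITS[label[c]], BITS[label[t]]
    sign = -1 if (xc and zt and xt == zc) else 1
    chars = list(label)
    chars[c] = CHARS[(xc, zc ^ zt)]         # Z on the target spreads to the control
    chars[t] = CHARS[(xt ^ xc, zt)]         # X on the control spreads to the target
    return "".join(chars), sign


class ToyPauliSum:
    """Map from Pauli string to coefficient, with a main and an auxiliary dict."""

    def __init__(self, terms):
        self.main = dict(terms)
        self.aux = {}

    def _apply(self, rule, *qubits):
        self.aux.clear()                    # reuse the spare dict
        for label, coeff in self.main.items():
            new_label, sign = rule(label, *qubits)
            self.aux[new_label] = self.aux.get(new_label, 0.0) + sign * coeff
        self.main, self.aux = self.aux, self.main   # swap roles after the gate

    def h(self, q):
        self._apply(conj_h, q)

    def cnot(self, c, t):
        self._apply(conj_cnot, c, t)

    def truncate(self, threshold):
        """Coefficient-threshold policy: drop terms with |coeff| < threshold."""
        for label in [p for p, c in self.main.items() if abs(c) < threshold]:
            del self.main[label]

    def expectation_on_zero_state(self):
        """Only strings made of I and Z have a non-zero value on |0...0>."""
        return sum(c for p, c in self.main.items() if set(p) <= {"I", "Z"})


# Circuit H(0); CNOT(0,1) applied to |00>, observable ZZ + 0.5*XX + 1e-9*ZI.
state = ToyPauliSum({"ZZ": 1.0, "XX": 0.5, "ZI": 1e-9})
state.cnot(0, 1)    # the last gate of the circuit is applied first
state.h(0)
state.truncate(1e-6)
```

Applying the gates in circuit order instead (h then cnot) corresponds to a different circuit and gives different expectation values, which is the mistake the backward-propagation rule guards against.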

3.2 Generalized stabilizer tableau (`ppvm-tableau`)

Full-state simulation using the stabilizer formalism, extended to handle non-Clifford gates (T, rotations) via stabilizer rank decomposition with sparse coefficient tracking.

Tableau<T: Config> — 2n-row stabilizer / destabilizer tableau (rows 0..n = destabilizers, n..2n = stabilizers).
GeneralizedTableau<T: Config, IndexType> — extends Tableau with a sparse coefficient vector for non-Clifford state tracking. IndexType can be usize, u128, or bnum::types::U256 for large qubit counts.
SparseVector<T, I> — stores coefficients indexed by bitstrings. Indices can be large integers (U256, U512, U1024) for simulations beyond 64 qubits.
Stim compatibility. Rust-side Stim support lives in ppvm_stim (parse_extended, run_string, run_file). Python-side Stim parsing uses StimProgram.parse / StimProgram.from_file. Execute parsed programs with tab.run(prog) or sample many shots with ppvm.sample_stim / GeneralizedTableau.sample.
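The row layout described for Tableau can be illustrated with a small stand-alone model. The sketch below follows the standard stabilizer-tableau convention for the |0...0> state (destabilizer i is X on qubit i, stabilizer i is Z on qubit i). Whether ppvm initialises its tableau exactly this way is an assumption, and initial_tableau and commutes are hypothetical helper names.

```python
def initial_tableau(n):
    """2n rows of (x_bits, z_bits): rows 0..n destabilizers, n..2n stabilizers."""
    rows = []
    for i in range(n):      # destabilizer i: X on qubit i
        rows.append(([int(j == i) for j in range(n)], [0] * n))
    for i in range(n):      # stabilizer i: Z on qubit i
        rows.append(([0] * n, [int(j == i) for j in range(n)]))
    return rows


def commutes(a, b):
    """Two Pauli rows commute iff their symplectic product is 0 mod 2."""
    (ax, az), (bx, bz) = a, b
    overlap = sum(x1 * z2 + z1 * x2 for x1, z1, x2, z2 in zip(ax, az, bx, bz))
    return overlap % 2 == 0


rows = initial_tableau(3)
```

In this layout, destabilizer i anticommutes with stabilizer i and commutes with every other row, and Clifford gates preserve that pairing.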

3.3 Trait hierarchy (`ppvm-traits/src/traits/`)

Gate behaviour is defined via traits reused across both backends:

Clifford / CliffordExtensions — single- and two-qubit Clifford gates.
TGate, RotationOne, RotationTwo, U3Gate — non-Clifford gates (branching).
Measure / LossyMeasure — Z-basis measurement.
Depolarizing, PauliError, LossChannel, CorrelatedLossChannel — noise channels.

§ 4 Conventions

4.1 Commit messages

Use Conventional Commits: <type>(<scope>): <description>.

feat(tableau): add correlated loss channel
fix(pauli-sum): handle zero-norm in truncation
test(stim-parser): add fast fuzz/proptest suite
chore: restore lockfile consistency
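A subject line in this shape can be checked mechanically. The pattern below is a minimal sketch based on the Conventional Commits subject format; it is not a hook that exists in this repo, the name parse_commit_subject is hypothetical, and it accepts any lowercase type, since the set of types this project allows is not listed here.

```python
import re

COMMIT_RE = re.compile(
    r"^(?P<type>[a-z]+)"                # feat, fix, test, chore, ...
    r"(?:\((?P<scope>[a-z0-9-]+)\))?"   # optional (scope), e.g. (pauli-sum)
    r"(?P<breaking>!)?"                 # optional breaking-change marker
    r": (?P<description>\S.*)$"         # colon, one space, then the description
)


def parse_commit_subject(subject):
    """Return the parts of a Conventional Commits subject line, or None."""
    match = COMMIT_RE.match(subject)
    return match.groupdict() if match else None


parsed = parse_commit_subject("fix(pauli-sum): handle zero-norm in truncation")
```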

4.2 Code style

Run cargo fmt --all before committing Rust.
Run cargo clippy --workspace --all-targets; fix or justify all warnings.
Python is formatted with ruff format and linted with ruff check.
Public Rust items should have doc comments; cargo doc --no-deps must build cleanly because the API site is built from rustdoc JSON.
Python docstrings use Google style (griffe parses with -d google) and are rendered as Markdown via marked. Use backtick spans for cross-references — not Sphinx/RST syntax:
- ✅ `fork` or `GeneralizedTableau.sample`
- ❌ :meth:`fork`, :func:`ppvm.sample_stim` — these are never parsed and appear as literal text.
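As an illustration of the docstring rules above, here is a Google-style docstring that uses backtick spans for cross-references, followed by a small check for stray Sphinx roles. Both example_sampler and find_rst_roles are hypothetical and exist only for this sketch; the list of role names in the pattern is not exhaustive.

```python
import re


def example_sampler(shots: int) -> list[int]:
    """Return `shots` placeholder samples.

    Hypothetical function, used only to show the docstring layout. Real
    sampling goes through `GeneralizedTableau.sample` or `ppvm.sample_stim`.

    Args:
        shots: Number of samples to return. Must be non-negative.

    Returns:
        A list of length `shots`; this placeholder fills it with zeros.

    Raises:
        ValueError: If `shots` is negative.
    """
    if shots < 0:
        raise ValueError("shots must be non-negative")
    return [0] * shots


# Matches Sphinx/RST roles such as :meth:`fork`, which render as literal text.
RST_ROLE = re.compile(r":(?:meth|func|class|attr|mod|obj):`[^`]*`")


def find_rst_roles(docstring):
    """List every Sphinx-style role found in a docstring."""
    return RST_ROLE.findall(docstring or "")
```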

4.3 Tests

Add tests in the same crate as the code they cover. Prefer property tests (proptest) for parser and arithmetic changes; stim-parser already has a proptest suite worth modelling new tests on.

§ 5 Python bindings

Single mixed wheel. ppvm-python is one maturin package: it bundles the pure-Python wrapper under src/ppvm/ together with the PyO3 crate (ppvm-python-native, Rust → cdylib via PyO3 0.29), which maturin compiles and drops in as the private ppvm._core submodule. Users only ever import ppvm.

Python ≥ 3.10 required (.python-version pins 3.12 for dev). The wheel is built against PyO3's abi3-py310 stable ABI, so one cp310-abi3 wheel per platform loads on 3.10+.
uv manages the venv and deps and triggers the maturin build on uv sync.
ppvm-python/pyproject.toml sets build-backend = "maturin" with [tool.maturin] manifest-path → the crate, python-source = "src", and module-name = "ppvm._core".
Plain cargo build also links the cdylib (a build.rs in ppvm-python-native adds the macOS -undefined dynamic_lookup flag), so the Rust-only workflows work without maturin.
The native module exports 16 PauliSum variants × 2 (with/without loss) + 32 GeneralizedTableau variants (1–32 qubits) via the create_interface! / create_interface_range! macros.

When adding a new method to a Python-facing type, edit the macro invocation in ppvm-python-native so every config variant picks it up; do not hand-write methods for one variant.

§ 6 Extending ppvm

Adding a new gate

1. Decide which trait it belongs to (Clifford, RotationOne, etc.) in crates/ppvm-traits/src/traits/.
2. Implement it for PauliSum<T: Config> in crates/ppvm-pauli-sum/src/sum/.
3. Implement it for Tableau / GeneralizedTableau in crates/ppvm-tableau/src/gates/.
4. Expose it in ppvm-python-native through the relevant create_interface! macro, and wrap it in ppvm-python/src/ppvm/….
5. Add tests on both the Rust and Python sides, and a benchmark if it is on a hot path.

Adding a new noise channel

Follow the pattern of LossChannel / CorrelatedLossChannel. Implement the trait in crates/ppvm-traits/src/traits/noise.rs, then mirror it in ppvm-tableau if it is meaningful in the tableau picture.

Adding a new `Config`

Create a module under crates/ppvm-pauli-sum/src/config/, implement the Config trait (defined in ppvm-traits), and re-export it from config/mod.rs. If it should be exposed to Python, add a variant to the create_interface! macro call.

§ 7 Where to look for X

Pauli arithmetic, PauliSum: crates/ppvm-pauli-sum/src/sum/; word / phase / loss / pattern types in crates/ppvm-pauli-word/src/
Gate & noise traits: crates/ppvm-traits/src/traits/
Truncation strategies (CoefficientThreshold, MaxPauliWeight, …): crates/ppvm-pauli-sum/src/strategy.rs, crates/ppvm-traits/src/traits/strategy.rs
Config trait & implementations: trait in crates/ppvm-traits/src/config.rs; concrete bundles in crates/ppvm-pauli-sum/src/config/
Stabilizer tableau core (Tableau, GeneralizedTableau): crates/ppvm-tableau/src/data.rs, tableau_like.rs
Tableau gates: crates/ppvm-tableau/src/gates/
Stim parsing: crates/stim-parser/ (parser only) and crates/ppvm-stim/ (execution)
PyO3 bindings & macros: crates/ppvm-python-native/src/
Python wrapper & mixins: ppvm-python/src/ppvm/
Python tests: ppvm-python/test/

Found something out of date? Send a PR — this guide is the canonical source for both human and agent contributors.