Performance

valgebra compiles a schema once into a Rust validator tree and crosses into Rust exactly once per validation call. This page records how that is measured, a reproducible baseline against other validators, and the honest limits of the numbers.

A speed claim is only as good as its methodology. Every number here states the harness, the dataset, the library versions, and the machine class. Re-run the harnesses on your own hardware before relying on a ratio: absolute times move with the CPU, and the comparison points do different amounts of work.

What is measured

Two harnesses, one per side of the boundary:

  • Core micro-benchmarks (crates/valgebra-core/benches/core.rs, criterion) time the pure-Rust schema transformations — the simplifier, the index remap behind validator composition, and the recursive open/closed record transform. No Python is involved.
  • End-to-end benchmarks (benches/, pytest-benchmark) time a single boundary-crossing validation call through the public API, over synthetic shapes that each stress one cost dimension.

Run them with:

# Core micro-benchmarks (Rust):
cargo bench --bench core

# End-to-end and comparison benchmarks (Python); install the bench group first:
uv sync --group bench
# To match the published figures, build the same PGO wheel the release ships
# (needs the llvm-tools rustup component) and install it; a plain build is slower:
uv run --group bench maturin build --release --pgo --out dist
uv pip install --reinstall --no-deps dist/*.whl
uv run --group bench pytest benches/bench_validate.py
uv run --group bench pytest benches/bench_compare.py --benchmark-group-by=group
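
For a quick sanity check before installing the bench group, a per-call median can be approximated with the standard library alone. This is a minimal sketch, not the project's pytest-benchmark harness: `median_per_call` and the stand-in check `is_int_list` are hypothetical names, and the stand-in is plain Python rather than a valgebra call.

```python
import statistics
import time


def median_per_call(fn, arg, rounds=50, calls_per_round=200):
    """Return the median per-call time of fn(arg), in seconds."""
    samples = []
    for _ in range(rounds):
        start = time.perf_counter()
        for _ in range(calls_per_round):
            fn(arg)
        elapsed = time.perf_counter() - start
        samples.append(elapsed / calls_per_round)
    return statistics.median(samples)


def is_int_list(value):
    # Stand-in check (not valgebra): exact-int membership over a list.
    return type(value) is list and all(type(x) is int for x in value)


data = list(range(10_000))
seconds = median_per_call(is_int_list, data, rounds=5, calls_per_round=3)
print(f"{seconds * 1e6:.1f} us per call")
```

Swapping the stand-in for the validator under test gives a rough figure to compare against the published medians; the real harnesses remain the reference.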

Comparison is not apples-to-apples

The comparison runs the same shapes through three checkers that do different work. Read the ratios with that in mind:

  • valgebra checks membership of the object already in hand: no copy, no coercion. is_valid returns a bool through the membership fast path.
  • jsonschema (Draft202012Validator.is_valid) is also a pure check with no coercion — the closest semantic analogue — but it is pure Python.
  • pydantic (TypeAdapter.validate_python, strict mode) validates and constructs a value. Strict mode disables coercion, but it still builds and returns output, so it does strictly more work than a membership check. It is the relevant point of comparison because it is the fast, Rust-cored validator most users reach for.

The record shape compares valgebra's closed record against a pydantic TypedDict and a jsonschema object with additionalProperties: false, so all three check the same set of named fields.
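
The difference between a membership check and check-and-construct can be illustrated in plain Python. This is a sketch under stated assumptions: `FIELDS`, `is_member`, and `validate_and_build` are hypothetical stand-ins, not the valgebra or pydantic APIs.

```python
FIELDS = {"id": int, "name": str}  # hypothetical closed record


def is_member(value):
    # Pure check: inspects the dict in place and returns a bool.
    return (
        type(value) is dict
        and value.keys() == FIELDS.keys()
        and all(type(value[k]) is t for k, t in FIELDS.items())
    )


def validate_and_build(value):
    # Check-and-construct: same verdict, plus a freshly built output dict.
    if not is_member(value):
        raise ValueError("not a member")
    return {k: value[k] for k in FIELDS}


row = {"id": 1, "name": "a"}
out = validate_and_build(row)
print(is_member(row), out is row)
```

Both functions reach the same verdict, but the second also allocates and fills a new object, which is the extra work the ratios on this page include on the pydantic side.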

Baseline matrix

Machine class: AMD Ryzen 7 PRO 7840U (Zen 4, 8c/16t, up to 5.1 GHz, a 2023-era mobile part) under WSL2 on Linux 6.18.

Toolchain: rustc 1.96.0 (the build these numbers were measured on; the supported minimum is the lower rust-version in the manifest), CPython 3.14.6, pydantic 2.13.4, jsonschema 4.26.0, criterion 0.8.2, pytest-benchmark 5.2.3.

Build: the extension is the PGO release build, that is, the profile-guided, fat-LTO wheel the release ships (maturin build --release --pgo). A plain --release build is a few tens of percent slower on these shapes, and a debug build is not representative. pydantic's PyPI wheels are likewise PGO-built, so this is a release-to-release comparison.

Figures are the per-call median; re-run on your own hardware for absolute numbers. They are measured on the wheel carrying valgebra's full feature set: the per-validator precompute (record-field lookups, literal-union dispatch) and native string patterns. Carrying those features leaves the figures for these shapes unchanged: they earn their keep elsewhere, not by regressing the core.

End-to-end validation of a value that passes (median time per call, lower is better):

Shape | valgebra | pydantic (strict) | jsonschema
------|----------|-------------------|-----------
list[int], 10,000 elements | 47 us | 88 us | 26,000 us
Closed record, 50 int fields | 0.97 us | 1.9 us | 134 us
Nested list[...], depth 25 | 0.25 us | 1.9 us | 77 us

valgebra relative to pydantic on this machine: ~7x faster on deep nesting, ~2x faster on the wide record, and ~1.9x faster on the large flat array. It is consistently far ahead of pure-Python jsonschema — by two to three orders of magnitude on every shape. pydantic does strictly more work on the record (it constructs output), so read that shape as a margin over a heavier operation, not a like-for-like loss for pydantic.
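
The ratios can be recomputed from the table. The sketch below uses only the rounded medians printed above, so its results can differ slightly from ratios taken on the unrounded measurements.

```python
# Median time per call in microseconds: (valgebra, pydantic strict, jsonschema).
table_us = {
    "list[int], 10,000 elements": (47, 88, 26_000),
    "Closed record, 50 int fields": (0.97, 1.9, 134),
    "Nested list[...], depth 25": (0.25, 1.9, 77),
}

ratios = {}
for shape, (valgebra, pydantic, jsonschema) in table_us.items():
    # How many times longer each comparison point takes than valgebra.
    ratios[shape] = (pydantic / valgebra, jsonschema / valgebra)
    vs_pydantic, vs_jsonschema = ratios[shape]
    print(f"{shape}: {vs_pydantic:.1f}x vs pydantic, {vs_jsonschema:.0f}x vs jsonschema")
```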

Core micro-benchmarks (criterion, release+LTO, indicative single run):

Operation | Corpus | Median
----------|--------|-------
simplify | redundant Boolean expression, depth 8 | ~1.3 us
shifted | 64-field pool-indexed record | ~2.0 us
with_records_open | record spine, depth 32 | ~4.8 us

Honest limits

  • The numbers come from a single machine class. They establish relative behavior, not a universal ranking. Shared CI runners are too noisy for a tight wall-clock budget, so the merge gate measures a deterministic instruction count instead.
  • The comparison measures different operations (check vs check-and-construct vs pure-Python check): pydantic constructs output, valgebra only checks membership. The ratios answer "how fast is the validation step for each tool," not "how much faster is membership than construction" and not "are these tools interchangeable" (they are not). Deep nesting is the widest gap; the array and record margins are narrower but consistent. See the README for what valgebra is and is not for.
  • These figures are for the object path — validating a value already in hand. The JSON input path is measured separately, on the same machine class, in the JSON page.

How the record fast path is tuned

The closed-record membership check visits each dict entry once and matches the key against the declared fields, rather than looking up every declared field in turn (which builds a temporary Python string per field) and then scanning the dict a second time for undeclared keys. The key's UTF-8 is borrowed without allocating, and the field-name index is computed once when the validator is first used — with a fast non-cryptographic hasher, since the keys are the schema's own declared names rather than attacker input — then reused across calls, so a wide record no longer rebuilds and reallocates its name map on every validation. On the 50-field record above this measures ~1.0 us per call (PGO release build); the earlier per-field-lookup form was several times slower. Profiling with cachegrind attributed the removed cost to temporary-string creation, hashing, and allocation churn from the per-field lookups, and that attribution is an instruction count, so it holds across machine classes. The bool fast path and the aggregating explain walk stay membership-equivalent, locked by tests that assert both reach the same verdict across record shapes.
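
The two strategies can be contrasted in plain Python. This is an illustrative sketch, not the Rust implementation: the function names are hypothetical, exact-type field checks stand in for real field validators, and using a length comparison to detect missing fields is an assumption about one way to make a single pass sufficient.

```python
def closed_record_two_pass(value, fields):
    # Earlier form: look up every declared field, then rescan for extras.
    for name, typ in fields.items():
        if name not in value or type(value[name]) is not typ:
            return False
    return all(key in fields for key in value)


def closed_record_single_pass(value, fields):
    # Single pass: visit each entry once and match its key against the
    # declared fields. Equal sizes plus "every key is declared" implies
    # that no declared field is missing (assumed completeness check).
    if len(value) != len(fields):
        return False
    for key, item in value.items():
        typ = fields.get(key)
        if typ is None or type(item) is not typ:
            return False
    return True


# The field index is built once and reused across calls.
FIELDS_50 = {f"f{i}": int for i in range(50)}
good = {name: 0 for name in FIELDS_50}
print(closed_record_single_pass(good, FIELDS_50))
```

Both functions return the same verdict on every input; only the number of lookups and scans differs.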

How large literal unions dispatch

A union whose members are all literals (a Literal["a", "b", ...] enum, or a discriminator) is compiled once into value-keyed sets — one for the integer literals, one for the string literals. An exact int or str value is then a single set lookup rather than a scan of every branch, so membership cost stops growing with the number of literals. The same-type literal rule is preserved: the integer set is consulted only for an exact int (never a bool), the string set only for an exact str, and any other value — a bool, float, None, a subclass instance, a big integer, or a JSON value — falls back to the linear scan that remains the single source of truth. On a 32-literal union this cuts the per-call median several-fold; the decision is identical to the scan, locked by tests over the cross-type cases.
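
The dispatch can be sketched in Python. This is a hypothetical illustration rather than valgebra's code: `compile_literal_union` is an invented name, the linear rule is assumed to be "same exact type and equal value", and the big-integer fallback is omitted because Python integers are unbounded.

```python
def compile_literal_union(literals):
    # Built once: value-keyed sets for exact int and exact str literals.
    int_set = frozenset(x for x in literals if type(x) is int)
    str_set = frozenset(x for x in literals if type(x) is str)

    def scan(value):
        # Linear reference rule (assumed): same exact type, equal value.
        return any(type(value) is type(lit) and value == lit for lit in literals)

    def contains(value):
        if type(value) is int:   # exact int only; bool is a different type
            return value in int_set
        if type(value) is str:   # exact str only
            return value in str_set
        return scan(value)       # bool, float, None, subclasses, and so on

    return contains, scan


contains, scan = compile_literal_union((1, 2, "a", None))
print(contains(1), contains(True), contains("a"), contains(None))
```

The exact-type guard matters because `True == 1` in Python: without it, a set lookup would accept a bool where only the integer literal is declared.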

Regression gate

The wall-clock numbers above are for humans reading results; they are too noisy on shared CI runners to gate a merge. The merge gate is instead a deterministic instruction count: a fixed workload exercises the core schema operations (crates/valgebra-core/examples/perf_workload.rs), runs under cachegrind, and its executed-instruction count is compared against a committed budget (scripts/perf_budget.json) by scripts/perf_gate.py. The count is identical across runs of a given build, so a regression past the budget ceiling fails the build without flaking. The tolerance absorbs cross-environment startup and compiler-codegen drift while still catching algorithmic regressions, which are far larger than the tolerance.
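
The comparison at the heart of such a gate is small. The sketch below is a guess at its shape, not the contents of scripts/perf_gate.py: the JSON keys `instructions` and `tolerance` and the multiplicative ceiling are assumptions, since the schema of perf_budget.json is not shown here.

```python
import json


def within_budget(measured, budget, tolerance):
    """Return True when the instruction count is at or below the ceiling."""
    ceiling = budget * (1 + tolerance)  # assumed: tolerance is a fraction
    return measured <= ceiling


# Hypothetical budget document; the real file layout may differ.
budget_doc = json.loads('{"instructions": 1000000, "tolerance": 0.02}')

# Small drift inside the tolerance passes; a large jump fails.
print(within_budget(1_010_000, budget_doc["instructions"], budget_doc["tolerance"]))
print(within_budget(1_500_000, budget_doc["instructions"], budget_doc["tolerance"]))
```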

The gate covers the pure-Rust schema engine, which is portable enough for a committed budget. The end-to-end wall-clock suites run on the same CI lane with timing disabled, as a smoke test that they keep working.

The headline claim, that valgebra is pydantic-core-class, is gated too, by scripts/compare_gate.py. For each shape in a matrix it measures the ratio of per-call time (valgebra over pydantic-core), taking the minimum over many repeats, and compares each ratio against a recorded baseline (scripts/perf_compare.json) with a tolerance. A ratio cancels the runner's absolute speed: if the machine is slow, both libraries are slow in proportion, so the comparison survives the shared-runner noise an absolute budget cannot. A shape fails the merge gate when valgebra's ratio rises materially past its baseline. That is a competitive regression, whether valgebra slowed down or pydantic-core sped up.
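
Why a ratio survives a slow runner can be shown with a few lines of arithmetic. This is a sketch, not scripts/compare_gate.py: `min_time` and `ratio_regressed` are hypothetical names, and the multiplicative tolerance is an assumption.

```python
import time


def min_time(fn, arg, repeats=20):
    # Minimum over repeats: the least-disturbed run on a noisy machine.
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        best = min(best, time.perf_counter() - start)
    return best


def ratio_regressed(ratio, baseline, tolerance):
    # Assumed rule: fail when the ratio exceeds baseline * (1 + tolerance).
    return ratio > baseline * (1 + tolerance)


# A uniformly slower runner scales both times, so the ratio is unchanged.
valgebra_s, pydantic_s = 0.97e-6, 1.9e-6  # closed-record medians from the baseline matrix
fast_runner = valgebra_s / pydantic_s
slow_runner = (valgebra_s * 3) / (pydantic_s * 3)
print(round(fast_runner, 3), round(slow_runner, 3))
```

In a real gate, `min_time` would be called once per library and shape, and the quotient of the two results passed to `ratio_regressed`.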

Re-record the budgets after an intentional change with:

python scripts/perf_gate.py --update            # core instruction budget
python scripts/compare_gate.py --update         # competitive ratios