refactor(bench): baseline-vs-feature comparison with structured output #14313

Draft
decofe wants to merge 6 commits into master from zerosnacks/bench-autoopt

decofe commented on Apr 14, 2026

Replaces the version-comparison benchmarking tool with a baseline-vs-feature model. Adds structured JSON output, automated regression detection, and new benchmark types focused on test performance.

Changes

  • Baseline vs Feature comparison: compare two Foundry versions (branches, tags, commits) directly instead of iterating over a list of versions
  • Structured JSON output (--json): machine-readable bundle with per-benchmark comparisons and overall verdict
  • Automated verdict: each comparison is classified as improved, regressed, or neutral based on a configurable noise threshold (--noise-threshold); the process exits non-zero on regression
  • New benchmarks: forge_invariant_test and forge_fork_test (forge_fork_test requires an explicit --fork-url)
  • Fail-closed on errors: benchmark failures are now fatal instead of silently skipped
  • Security: fork URL passed via environment variable instead of shell string interpolation
  • Simplified workflow: single benchmark run instead of multiple invocations
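
The verdict rule in the list above can be illustrated with a short sketch. It is not the tool's implementation: it assumes the noise threshold is a percentage applied to the relative change in mean wall time, and the names `Comparison`, `classify`, and `exit_code` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Comparison:
    """One benchmark measured on both builds (hypothetical shape)."""
    name: str
    baseline_mean_s: float  # mean wall time on the baseline build
    feature_mean_s: float   # mean wall time on the feature build


def classify(c: Comparison, noise_threshold_pct: float) -> str:
    """Return 'improved', 'regressed', or 'neutral' for one benchmark.

    Assumption: the threshold is a symmetric percentage band around zero.
    """
    change_pct = (c.feature_mean_s - c.baseline_mean_s) / c.baseline_mean_s * 100.0
    if change_pct > noise_threshold_pct:
        return "regressed"  # feature is slower by more than the noise band
    if change_pct < -noise_threshold_pct:
        return "improved"   # feature is faster by more than the noise band
    return "neutral"        # difference is within noise


def exit_code(comparisons, noise_threshold_pct: float) -> int:
    """Non-zero if any single benchmark regressed, mirroring the PR text."""
    verdicts = [classify(c, noise_threshold_pct) for c in comparisons]
    return 1 if "regressed" in verdicts else 0
```

With a 5% threshold, a benchmark that goes from 10.0 s to 11.0 s would be classified as regressed under these assumptions, and one that goes from 10.0 s to 10.2 s as neutral.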

Usage

# Basic comparison
foundry-bench --baseline stable --feature nightly --force-install

# With JSON output for automation
foundry-bench --baseline stable --feature nightly --json

# Specific benchmarks including fork mode
foundry-bench --baseline stable --feature nightly \
  --benchmarks forge_test,forge_fuzz_test,forge_invariant_test,forge_fork_test \
  --fork-url https://eth.merkle.io

Co-Authored-By: zerosnacks <95942363+zerosnacks@users.noreply.github.com>

Prompted by: zerosnacks

Replace version-comparison benchmarking with baseline-vs-feature comparison.
Add structured JSON bundle output with automated regression detection.
Add new benchmark types: invariant test, fork test.
Make benchmark failures fatal instead of silently skipping.
Pass fork URL via environment variable to avoid shell interpolation.

Co-Authored-By: zerosnacks <95942363+zerosnacks@users.noreply.github.com>
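
To illustrate why passing the fork URL through the environment avoids shell interpolation, here is a minimal sketch. It does not reproduce the benchmark tool's code: the helper `run_with_fork_url` is hypothetical, the child process is a stand-in rather than forge, and the variable name `FORK_URL` is borrowed from the later commit description in this PR.

```python
import os
import subprocess
import sys


def run_with_fork_url(argv, fork_url):
    """Run argv without a shell, exposing fork_url only through FORK_URL."""
    env = dict(os.environ, FORK_URL=fork_url)
    # A list argv with the default shell=False means no shell ever parses the URL.
    # check=True raises on a non-zero exit, so a failed run is not silently skipped.
    return subprocess.run(argv, env=env, capture_output=True, text=True, check=True)


# Stand-in child process that echoes the variable back unchanged.
child = [sys.executable, "-c", "import os; print(os.environ['FORK_URL'])"]

# A value containing shell metacharacters; it is never interpreted as a command.
hostile = "https://example.invalid/?a=1;echo pwned $(id)"
result = run_with_fork_url(child, hostile)
```

Because the URL never appears in a command string, characters such as `;` or `$(...)` reach the child process as plain data.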
decofe and others added 5 commits April 14, 2026 21:15
Delete combine-benchmarks.sh, format-pr-comment.sh,
commit-and-read-benchmarks.sh, benchmark.sh, and LATEST.md.
Simplify workflow to run benchmarks in a read-only job and post
results via artifact-based publish step.

Co-Authored-By: zerosnacks <95942363+zerosnacks@users.noreply.github.com>
…ures

Add self-contained Solidity benchmark suite at benches/fixtures/bench-suite/
that replaces external repo dependencies as the default benchmark target.

The suite is designed to be backwards compatible (pragma >=0.8.0), has no
external dependencies (no forge-std, no git submodules), and targets specific
Foundry subsystems:

- ERC20: baseline EVM execution, storage reads/writes
- Vault: AMM constant-product pool (math-heavy, multi-contract)
- Registry: mapping-heavy key-value store (storage-bound, batch ops)
- FuzzERC20/FuzzVault: fuzzer input generation, property checking
- InvariantVault/InvariantRegistry: handler-based invariant testing
- UnitTests: test runner startup / TTFB

External repos can still be used via --repos flag.

Co-Authored-By: zerosnacks <95942363+zerosnacks@users.noreply.github.com>
Add three new test files to the built-in bench suite:

- CheatcodeTests.t.sol: exercises the cheatcode inspector across
  deal, prank, warp/roll, store/load, etch, snapshot/revertTo,
  mockCall, expectRevert, label, record/accesses, getNonce/setNonce,
  and a combined cheatcode storm
- ForkTests.t.sol: exercises vm.createFork, forked state reads/writes,
  WETH/USDC/DAI reads, deposit on fork, vm.rollFork
- MultiForkTests.t.sol: exercises multi-fork switching, vm.makePersistent,
  cross-fork state reads, fork switch stress test

New benchmark types: forge_cheatcode_test, forge_multifork_test.
The existing forge_fork_test now runs the targeted ForkTests instead of
the global --fork-url mode. Fork and multifork tests read the FORK_URL
environment variable via vm.envString. The shared Vm interface is
extracted to test/Vm.sol.

Co-Authored-By: zerosnacks <95942363+zerosnacks@users.noreply.github.com>
- Fix Vm.sol: bytes -> bytes calldata for etch()
- Use targetContracts() getter pattern instead of vm.targetContract()
  cheatcode (works on both stable and nightly)
- Tighten FuzzVault swap bounds to avoid liquidity edge cases
- Exclude Fork/Invariant tests from forge_test to avoid failures
  when FORK_URL is unset or vm version differs
- Clear fuzz failure cache before each benchmark run

Co-Authored-By: zerosnacks <95942363+zerosnacks@users.noreply.github.com>
Co-Authored-By: zerosnacks <95942363+zerosnacks@users.noreply.github.com>