
fix: restore mainnet prover agents to 4 replicas #22305

Merged
alexghr merged 1 commit into merge-train/spartan from claudebox/0be52f54ae6efa8c-5
Apr 3, 2026


@AztecBot (Collaborator) commented Apr 3, 2026

Summary

PROVER_REPLICAS was set to 0 in mainnet.env, so no prover agent pods were deployed. All 4 agents shut down at Apr 2 19:50:43 UTC and never came back. The prover node keeps starting an epoch proving job every ~38 min, but every job hits its deadline because no agents are available to compute proofs.

The proven chain is still advancing (external validators submit proofs on L1), but our fisherman prover infrastructure has been non-functional for ~22 hours.

Changes

  • Set PROVER_REPLICAS=4 in spartan/environments/mainnet.env (was 0)
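
Because one value in the env file decides whether any prover agents are deployed, a pre-deploy check could catch this kind of misconfiguration. The following is a minimal sketch and is not part of this PR: the function name `check_prover_replicas` is hypothetical, and it assumes the file uses plain, unquoted `KEY=value` lines.

```shell
# Hypothetical guard: fail if PROVER_REPLICAS is missing, non-numeric, or zero.
check_prover_replicas() {
  env_file="$1"
  # Use the last assignment in the file; later lines win when the file is sourced.
  value=$(sed -n 's/^PROVER_REPLICAS=//p' "$env_file" | tail -n 1)
  case "$value" in
    '' | *[!0-9]*)
      echo "PROVER_REPLICAS is missing or not a number in $env_file" >&2
      return 1
      ;;
  esac
  if [ "$value" -le 0 ]; then
    echo "PROVER_REPLICAS=$value in $env_file: no prover agents would be deployed" >&2
    return 1
  fi
  echo "PROVER_REPLICAS=$value"
}
```

Under those assumptions, running `check_prover_replicas spartan/environments/mainnet.env` should succeed after this change and would have returned a non-zero status while the value was 0.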

Test plan

  • Verify mainnet prover agents come back after deploy
  • Confirm epoch proofs start finalizing again
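
The first check above can be sketched as a small helper that counts Running pods in `kubectl get pods --no-headers` output. This is an illustration only: the function name `count_running_pods` and the `NAMESPACE` and `SELECTOR` variables are hypothetical placeholders, since the PR does not state the namespace or labels of the prover agent pods.

```shell
# Count pods whose STATUS column (third field) is "Running".
# Reads `kubectl get pods --no-headers` output on stdin.
count_running_pods() {
  awk '$3 == "Running" { n++ } END { print n + 0 }'
}

# Usage sketch (NAMESPACE and SELECTOR are hypothetical placeholders):
#   kubectl get pods -n "$NAMESPACE" -l "$SELECTOR" --no-headers | count_running_pods
```

With PROVER_REPLICAS=4, the count would be expected to reach 4 once the rollout completes.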

ClaudeBox log: https://claudebox.work/s/0be52f54ae6efa8c?run=5

@AztecBot added the ci-draft (Run CI on draft PRs) and claudebox (Owned by claudebox; it can push to this PR) labels Apr 3, 2026
@alexghr added the ci-skip label Apr 3, 2026
@alexghr marked this pull request as ready for review April 3, 2026 15:53
@alexghr enabled auto-merge (squash) April 3, 2026 15:53
@alexghr disabled auto-merge April 3, 2026 15:53
@alexghr merged commit 74a5b84 into merge-train/spartan Apr 3, 2026
38 of 47 checks passed
@alexghr deleted the claudebox/0be52f54ae6efa8c-5 branch April 3, 2026 15:54
github-merge-queue Bot pushed a commit that referenced this pull request Apr 6, 2026
BEGIN_COMMIT_OVERRIDE
fix: deflake HA governance voting test by polling for L1/DB convergence (#22220)
feat(world-state): add genesis timestamp support and GenesisData type (#22201)
chore: revert: feat(world-state): add genesis timestamp support and GenesisData type (#22201) (#22255)
fix(archiver): handle duplicate checkpoint from L1 reorg (#22252)
chore: update dashboard (#22260)
fix: remove detailed revert codes (#22274)
chore: use ESO in grafana (#22271)
chore: (A-751) robust response error handling in json-rpc client (#22246)
fix: separate fisherman StatefulSet from rpc-node and stop archiver pollution (#22183)
fix: restore mainnet prover agents to 4 replicas (#22305)
END_COMMIT_OVERRIDE
