fix: separate fisherman StatefulSet from rpc-node and stop archiver pollution #22183

Merged
PhilWindle merged 3 commits into merge-train/spartan from spyros/a-889-fix-fisherman-archiver-pollution
Apr 2, 2026

@spypsy spypsy commented Mar 31, 2026

The mainnet rpc-node pod runs in fisherman mode, causing it to push locally-built blocks into the archiver every slot. This triggers a conflict-prune-reorg cascade on every block (~every 72 seconds), leaving the node in a perpetually unstable state and exposing fake blocks to connected RPC clients — which invalidates any transactions anchored against them.

Part 1 — Code fix (checkpoint_proposal_job.ts): syncProposedBlockToArchiver now skips the archiver push when fishermanMode is true. The fisherman continues building blocks for fee analysis and validation, but those blocks no longer pollute the local archiver.
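
The guard described in Part 1 can be sketched roughly as follows. This is a minimal illustration, not the code from `checkpoint_proposal_job.ts`: the `Block`, `Archiver`, `JobConfig`, `ProposalJobSketch`, and `InMemoryArchiver` types are hypothetical stand-ins, and the method is synchronous here only to keep the sketch short. Only the names `syncProposedBlockToArchiver` and `fishermanMode` come from the PR description.

```typescript
// Hypothetical, simplified types: the real ones are not shown on this page.
interface Block {
  number: number;
}

interface Archiver {
  addBlock(block: Block): void;
}

interface JobConfig {
  fishermanMode: boolean;
}

class ProposalJobSketch {
  private readonly archiver: Archiver;
  private readonly config: JobConfig;

  constructor(archiver: Archiver, config: JobConfig) {
    this.archiver = archiver;
    this.config = config;
  }

  // Returns true if the block was pushed to the archiver, false if skipped.
  syncProposedBlockToArchiver(block: Block): boolean {
    // A fisherman still builds blocks for fee analysis and validation,
    // but must not write them into the local archiver.
    if (this.config.fishermanMode) {
      return false;
    }
    this.archiver.addBlock(block);
    return true;
  }
}

// Tiny in-memory archiver, used only to observe what gets pushed.
class InMemoryArchiver implements Archiver {
  readonly blocks: Block[] = [];

  addBlock(block: Block): void {
    this.blocks.push(block);
  }
}
```

With `fishermanMode: true` the in-memory archiver stays empty no matter how many blocks are built. With `fishermanMode: false` each proposed block is appended as before.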

Part 2 — Infrastructure split (mainnet.env, main.tf, variables.tf, deploy_network.sh): Adds a dedicated FISHERMAN_REPLICAS variable and a new fisherman Helm release (separate StatefulSet). mainnet.env now uses FISHERMAN_REPLICAS=1 instead of FISHERMAN_MODE=true on the rpc-node, so the rpc-node becomes a clean archiving/RPC node and the fisherman runs as mainnet-fisherman-aztec-node-0.
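
As a rough model of the split in Part 2, the sketch below shows how a replica count could drive a separate release. It is written in TypeScript for illustration only; the actual change lives in Terraform, Helm, and shell files that are not shown here. The `planReleases` and `podNames` helpers, the `rpc` release name, and its single replica are assumptions. The naming pattern `<network>-<release>-aztec-node-<ordinal>` is inferred from the one pod name given in the description, with the trailing ordinal following the usual StatefulSet convention.

```typescript
// Hypothetical description of one Helm release / StatefulSet.
interface ReleasePlan {
  release: string;
  replicas: number;
  fishermanMode: boolean;
}

// Derive the releases to deploy from environment-style variables.
function planReleases(env: Record<string, string | undefined>): ReleasePlan[] {
  const raw = env["FISHERMAN_REPLICAS"] ?? "0";
  const fishermanReplicas = Number.parseInt(raw, 10);
  if (!Number.isInteger(fishermanReplicas) || fishermanReplicas < 0) {
    throw new Error(`FISHERMAN_REPLICAS must be a non-negative integer, got "${raw}"`);
  }

  // The rpc-node never runs in fisherman mode in this model.
  const plans: ReleasePlan[] = [{ release: "rpc", replicas: 1, fishermanMode: false }];

  // The fisherman gets its own release only when replicas are requested.
  if (fishermanReplicas > 0) {
    plans.push({ release: "fisherman", replicas: fishermanReplicas, fishermanMode: true });
  }
  return plans;
}

// StatefulSet pods are named "<statefulset-name>-<ordinal>", ordinals from 0.
function podNames(network: string, plan: ReleasePlan): string[] {
  return Array.from(
    { length: plan.replicas },
    (_, ordinal) => `${network}-${plan.release}-aztec-node-${ordinal}`,
  );
}
```

Under these assumptions, `planReleases({ FISHERMAN_REPLICAS: "1" })` yields two releases, and `podNames("mainnet", ...)` applied to the fisherman plan gives the single pod `mainnet-fisherman-aztec-node-0` named in the description.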

Fixes A-889

@spypsy spypsy marked this pull request as draft March 31, 2026 13:31
@spypsy spypsy force-pushed the spyros/a-889-fix-fisherman-archiver-pollution branch from 42d52c8 to e2510c3 Compare March 31, 2026 13:32
@spypsy spypsy marked this pull request as ready for review April 1, 2026 16:40
@PhilWindle PhilWindle merged commit c4220ce into merge-train/spartan Apr 2, 2026
21 checks passed
@PhilWindle PhilWindle deleted the spyros/a-889-fix-fisherman-archiver-pollution branch April 2, 2026 17:06

AztecBot commented Apr 2, 2026

❌ Failed to cherry-pick to v4-next due to conflicts.

AztecBot pushed a commit that referenced this pull request Apr 2, 2026
AztecBot added a commit that referenced this pull request Apr 2, 2026
- .gitignore: removed ignition-fisherman.env entry (PR intent was to delete it)
- devnet-avm-prover.env, ignition-fisherman.env: deleted (already removed on next)
- mainnet.env: kept deleted (doesn't exist on v4-next)
- checkpoint_proposal_job.ts: merged both conditions (kept !== false from v4-next + added fishermanMode check from PR)
spypsy added a commit that referenced this pull request Apr 3, 2026
…ollution (#22183)

Fixes
[A-889](https://linear.app/aztec-labs/issue/A-889/separate-fisherman-node-from-rpc-node-and-prevent-archiver-pollution)
ludamad added a commit that referenced this pull request Apr 3, 2026
…ollution (backport #22183) (#22284)

## Summary

Backport of #22183
to v4-next.

**Code fix** (`checkpoint_proposal_job.ts`):
`syncProposedBlockToArchiver` now skips the archiver push when
`fishermanMode` is `true`, preventing spurious reorg cascades on
mainnet.

**Infrastructure split** (`main.tf`, `variables.tf`,
`deploy_network.sh`): Replaces `FISHERMAN_MODE` with
`FISHERMAN_REPLICAS` and adds a dedicated `fisherman` Helm release as a
separate StatefulSet.

## Conflicts resolved

- `spartan/.gitignore`: Removed `ignition-fisherman.env` entry (PR
intent was to delete it; `block-capacity.env` doesn't exist on v4-next)
- `spartan/environments/devnet-avm-prover.env`,
`ignition-fisherman.env`: Deleted (modify/delete — PR deletes, v4-next
had modifications)
- `spartan/environments/mainnet.env`: Kept deleted (doesn't exist on
v4-next)
- `checkpoint_proposal_job.ts`: Merged v4-next's `!== false` condition
with PR's `|| this.config.fishermanMode` addition
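
One way to read the merged condition from the last bullet is sketched below. The flag name `skipArchiverPush` is a hypothetical placeholder, since the page does not say which value v4-next compares against `false`, and `shouldSkipArchiverPush` is an illustrative helper rather than a function from the codebase. The point is only how `!== false` on an optional flag combines with the fisherman check.

```typescript
// Hypothetical config shape; only fishermanMode is named in the PR.
interface GuardConfig {
  skipArchiverPush?: boolean; // placeholder for the flag v4-next tests with `!== false`
  fishermanMode?: boolean;
}

// True when the proposed block should NOT be pushed to the archiver.
function shouldSkipArchiverPush(config: GuardConfig): boolean {
  // `!== false` treats both `undefined` and `true` as "skip": only an
  // explicit `false` opts in to pushing. Fisherman mode then forces a skip
  // even when the flag was explicitly set to `false`.
  return config.skipArchiverPush !== false || config.fishermanMode === true;
}
```

Under this reading, the push happens only when the flag is explicitly `false` and the node is not a fisherman. Every other combination skips the archiver.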

ClaudeBox log: https://claudebox.work/s/b82065b06bcc585b?run=1
github-merge-queue Bot pushed a commit that referenced this pull request Apr 6, 2026
BEGIN_COMMIT_OVERRIDE
fix: deflake HA governance voting test by polling for L1/DB convergence
(#22220)
feat(world-state): add genesis timestamp support and GenesisData type
(#22201)
chore: revert: feat(world-state): add genesis timestamp support and
GenesisData type (#22201) (#22255)
fix(archiver): handle duplicate checkpoint from L1 reorg (#22252)
chore: update dashboard (#22260)
fix: remove detailed revert codes (#22274)
chore: use ESO in grafana (#22271)
chore: (A-751) robust response error handling in json-rpc client
(#22246)
fix: separate fisherman StatefulSet from rpc-node and stop archiver
pollution (#22183)
fix: restore mainnet prover agents to 4 replicas (#22305)
END_COMMIT_OVERRIDE
AztecBot added a commit that referenced this pull request Apr 8, 2026
BEGIN_COMMIT_OVERRIDE
fix: pippenger edge case (#22256)
cherry-pick: fix: separate fisherman StatefulSet from rpc-node and stop
archiver pollution (#22183) — WITH CONFLICTS
fix: separate fisherman StatefulSet from rpc-node and stop archiver
pollution (backport #22183) (#22284)
fix: preserve DeployAccountMethod type in with() method chaining
(#22322)
docs: backport docs build/release infrastructure from #22106 and #22144
(#22223)
chore(docs): remove v5 nightly and devnet versioned docs (backport
#22193) (#22236)
chore: improve release-docs skill and add release-network-docs skill
(#22328)
chore: remove dead to_be_bytes fn (#22243)
fix: correct args length in `#[authorize_once]` (#22209)
chore: fix inconsistent usage of contract class hash fn (#22248)
chore: delete old field comparison fns in favor of lt (#22249)
fix: all account overrides + gas limits (#22173)
feat: allow for runtime length arrays of sorts and selects (#22250)
chore: remove dead pub global vars reexport (#22244)
chore: changed default wait behavior (#22325)
chore: apply code consistency consolidation (#22251)
fix(docs): simplify TypeScript API reference links (backport #22232)
(#22369)
fix: remove detailed revert codes (#22380)
fix: backport #21673 — prevent HA peer proposals from blocking
equivocation in duplicate proposal test (#21693)
fix: subfield note selectors (#22211)
END_COMMIT_OVERRIDE