chore: deflake duplicate attestations and proposals slash tests #21294

Merged
PhilWindle merged 2 commits into merge-train/spartan from mr/deflake-duplicate-proposal-and-attestation-slash-tests on Mar 10, 2026


@mrzeszutko mrzeszutko commented Mar 10, 2026

Summary

Fixes a race condition in the duplicate_attestation_slash and duplicate_proposal_slash e2e tests that caused intermittent timeouts waiting for slashing offenses to be detected.

Root Cause

Investigated CI failure 2abee794200ae6a7 (commit f71c235a670d, merge queue for PR #20555). This run did include the fix from #20990, yet the duplicate_attestation_slash test still failed with a timeout after ~478 seconds.

The failure sequence from the logs:

  1. awaitEpochWithProposer found the malicious proposer at slot 14 (epoch 7)
  2. The function returned, having warped L1 time to the start of epoch 7
  3. Sequencers were then started (await Promise.all(nodes.map(n => n.getSequencer()!.start())))
  4. By the time sequencers were ready to build blocks, L1 time had advanced past epoch 7 into epoch 8 (first block built at slot 16)
  5. The malicious proposer was never selected in epoch 8 or later, so no duplicate proposals/attestations were created
  6. The slasher had nothing to detect, and the test timed out

The core issue: awaitEpochWithProposer warped to the target epoch and returned, but starting sequencers takes real (wall-clock) time, during which L1 time keeps advancing. With only 2 slots per epoch (48s total), the target epoch passed before the sequencers could act.
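The timing can be illustrated with a toy model. This is only a sketch: the 24s slot duration is derived from the "2 slots per epoch (48s total)" figure above, and the 50s startup delay is a hypothetical value chosen for illustration, not one taken from the logs.

```typescript
// Toy model of the race; none of this is the test suite's real code.
const SLOTS_PER_EPOCH = 2; // from the description
const EPOCH_DURATION_SECONDS = 48; // from the description
const SLOT_DURATION_SECONDS = EPOCH_DURATION_SECONDS / SLOTS_PER_EPOCH; // 24

/** Epoch that a given slot falls into. */
function epochOfSlot(slot: number): number {
  return Math.floor(slot / SLOTS_PER_EPOCH);
}

/** First slot of a given epoch. */
function firstSlotOfEpoch(epoch: number): number {
  return epoch * SLOTS_PER_EPOCH;
}

/**
 * First slot in which a sequencer could build if the clock was warped to the
 * start of `warpedEpoch` and startup then took `startupSeconds` (hypothetical).
 */
function firstBuildableSlot(warpedEpoch: number, startupSeconds: number): number {
  const elapsedSlots = Math.floor(startupSeconds / SLOT_DURATION_SECONDS);
  return firstSlotOfEpoch(warpedEpoch) + elapsedSlots;
}

// Warp to epoch 7 (slot 14), then spend 50s starting sequencers.
const slot = firstBuildableSlot(7, 50);
console.log(`first buildable slot: ${slot}, epoch: ${epochOfSlot(slot)}`);
```

In this model, any startup delay of 48s or more after the warp lands the first block in epoch 8, which matches the sequence seen in the failing run.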

Fix

Renamed awaitEpochWithProposer to advanceToEpochBeforeProposer and changed the approach to a two-phase pattern:

  1. Find phase: The function now checks the next epoch's slots (N+1) instead of the current epoch's (N). When the target proposer is found, it returns { targetEpoch } while staying at epoch N -- one full epoch before the target.

  2. Start + warp phase (in the test): After the function returns, sequencers are started while still one epoch before the target. Only then does the test warp to the target epoch via advanceToEpoch(targetEpoch).

This eliminates the race because sequencers are already running when the target epoch begins.
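The ordering of the two phases can be sketched as follows. The interfaces here are hypothetical stand-ins (the real helpers in shared.ts and the test context have richer types, and `EpochNumber` is simplified to `number`); only the call order is meant to reflect the description above.

```typescript
// Hypothetical minimal interfaces, used for illustration only.
interface SequencerLike {
  start(): Promise<void>;
}
interface NodeLike {
  getSequencer(): SequencerLike | undefined;
}
interface TwoPhaseDeps {
  // Stays at epoch N and reports the epoch in which the target proposer has a slot.
  advanceToEpochBeforeProposer(): Promise<{ targetEpoch: number }>;
  // Warps L1 time to the start of the given epoch.
  advanceToEpoch(epoch: number): Promise<void>;
}

/** Find the target epoch, start sequencers, and only then warp. */
async function startThenWarp(nodes: NodeLike[], deps: TwoPhaseDeps): Promise<number> {
  // Phase 1: locate the target epoch while remaining one epoch before it.
  const { targetEpoch } = await deps.advanceToEpochBeforeProposer();

  // Phase 2a: start sequencers; the time this takes no longer eats into the target epoch.
  await Promise.all(nodes.map(n => n.getSequencer()!.start()));

  // Phase 2b: warp only once every sequencer is running.
  await deps.advanceToEpoch(targetEpoch);
  return targetEpoch;
}
```

The old flow effectively performed the warp before the sequencer start, so the start-up cost was paid inside the target epoch.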

The function can safely query future epoch slots because epochCache.getProposerAttesterAddressInSlot works for any slot within the lagInEpochsForValidatorSet window (typically 2 epochs ahead), and we only look 1 epoch ahead.
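A minimal sketch of the find phase, assuming a synchronous `proposerInSlot` lookup as a hypothetical stand-in for `epochCache.getProposerAttesterAddressInSlot` (whose real signature is not shown here), with addresses simplified to strings:

```typescript
type Address = string; // simplified; the real code uses a dedicated address type
type ProposerLookup = (slot: number) => Address | undefined;

interface FindResult {
  stayAtEpoch: number; // epoch N, where the chain should remain
  targetEpoch: number; // epoch N+1, where the target proposer has a slot
}

/**
 * Scans the slots of the NEXT epoch, one epoch at a time, until the target
 * proposer appears. The lookahead is always exactly one epoch.
 */
function findEpochBeforeProposer(
  currentEpoch: number,
  slotsPerEpoch: number,
  target: Address,
  proposerInSlot: ProposerLookup,
  maxEpochsToScan: number,
): FindResult | undefined {
  for (let epoch = currentEpoch; epoch < currentEpoch + maxEpochsToScan; epoch++) {
    const nextEpoch = epoch + 1;
    const firstSlot = nextEpoch * slotsPerEpoch;
    for (let slot = firstSlot; slot < firstSlot + slotsPerEpoch; slot++) {
      if (proposerInSlot(slot) === target) {
        return { stayAtEpoch: epoch, targetEpoch: nextEpoch };
      }
    }
    // A real helper would advance the chain to `nextEpoch` here before looking again.
  }
  return undefined;
}

// Example: the target proposer is scheduled at slot 14, with 2 slots per epoch.
const schedule: ProposerLookup = slot => (slot === 14 ? "0xbad" : "0xgood");
console.log(findEpochBeforeProposer(4, 2, "0xbad", schedule, 5));
```

Because the scan never looks more than one epoch past the current one, it stays inside the lookahead window described above.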

Changes

  • shared.ts: Renamed awaitEpochWithProposer -> advanceToEpochBeforeProposer. Now checks next epoch's slots and returns { targetEpoch: EpochNumber } instead of void.
  • duplicate_attestation_slash.test.ts: Updated to start sequencers before warping to target epoch.
  • duplicate_proposal_slash.test.ts: Same change. The offense assertions now filter for DUPLICATE_PROPOSAL offenses only: the two malicious nodes share the same key and each self-attests to its own (different) checkpoint proposal, so honest nodes also detect a DUPLICATE_ATTESTATION offense.
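The offense filtering in the last item might look roughly like this. The `Offense` shape and the `onlyDuplicateProposals` helper are hypothetical; only the two offense names come from the description.

```typescript
// Hypothetical offense shape; the real slasher types carry more fields.
type OffenseKind = "DUPLICATE_PROPOSAL" | "DUPLICATE_ATTESTATION";

interface Offense {
  validator: string;
  kind: OffenseKind;
}

/** Keeps only the offenses that the duplicate-proposal test is meant to assert on. */
function onlyDuplicateProposals(offenses: Offense[]): Offense[] {
  return offenses.filter(o => o.kind === "DUPLICATE_PROPOSAL");
}

// Honest nodes may report both kinds for the same malicious key.
const detected: Offense[] = [
  { validator: "0xbad", kind: "DUPLICATE_PROPOSAL" },
  { validator: "0xbad", kind: "DUPLICATE_ATTESTATION" },
];
console.log(onlyDuplicateProposals(detected));
```

Filtering before asserting keeps the test focused on the offense it targets, without failing on the side-effect attestation offense.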

Fixes A-632

@PhilWindle PhilWindle merged commit d1e64d0 into merge-train/spartan Mar 10, 2026
12 checks passed
@PhilWindle PhilWindle deleted the mr/deflake-duplicate-proposal-and-attestation-slash-tests branch March 10, 2026 12:21
AztecBot pushed a commit that referenced this pull request Mar 10, 2026
BEGIN_COMMIT_OVERRIDE
fix: (A-623) increase committee timeout in scenario smoke test (#21193)
feat: orchestrator enqueues via serial queue (#21247)
feat: rollup mana limit gas validation (#21219)
fix: make e2e HA test more deterministic (#21199)
chore: fix chonk_browser lint warning (#21265)
chore: deploy SPONSORED_FPC in test networks (#21254)
fix: (A-635) e2e bot flake on nonce mismatch (#21288)
chore: deflake duplicate attestations and proposals slash tests (#21294)
fix(sequencer): fix log when not enough txs (#21297)
chore: send env var to pods (#21307)
END_COMMIT_OVERRIDE
github-merge-queue Bot pushed a commit that referenced this pull request Mar 11, 2026
BEGIN_COMMIT_OVERRIDE
fix: (A-623) increase committee timeout in scenario smoke test (#21193)
feat: orchestrator enqueues via serial queue (#21247)
feat: rollup mana limit gas validation (#21219)
fix: make e2e HA test more deterministic (#21199)
chore: fix chonk_browser lint warning (#21265)
chore: deploy SPONSORED_FPC in test networks (#21254)
fix: (A-635) e2e bot flake on nonce mismatch (#21288)
chore: deflake duplicate attestations and proposals slash tests (#21294)
fix(sequencer): fix log when not enough txs (#21297)
chore: send env var to pods (#21307)
fix: Simulate gas in n tps test. Set min txs per block to 1 (#21312)
fix: update dependabot dependencies (#21238)
test: run nightly bench of block capacity (#20726)
fix: update block_capacity test to use new send() result types (#21345)
fix(node): fix index misalignment in findLeavesIndexes (#21327)
fix(log): do not log validation error if unregistered handler (#21111)
fix: limit parallel blocks in prover to max AVM parallel simulations (#21320)
fix: use native sha256 to speed up proving job id generation (#21292)
chore: remove v4-devnet-1 (#21044)
fix(validator): wait for l1 sync before processing block proposals (#21336)
fix(txpool): cap priority fee with max fees when computing priority (#21279)
chore: Properly compute finalized block (#21156)
fix: remove extra argument in KVArchiverDataStore constructor call (#21361)
chore: revert l2 slot time 72 -> 36 on scenario network (#21291)
fix(archiver): do not error if proposed block matches checkpointed (#21367)
fix(claude): rule to not append echo exit (#21368)
chore: reduce severity of errors due to HA node not acquiring signature (#21311)
fix: make reqresp batch retry test deterministic (#21322)
fix: (A-643) add buffer to maxFeePerBlobGas for gas estimation and fix bump loop truncation (#21323)
fix(e2e): use L2 priority fee in deploy_method same-block test (#21373)
fix: reqresp flake & add logging (#21334)
END_COMMIT_OVERRIDE
AztecBot pushed a commit that referenced this pull request Mar 19, 2026