fix(sequencer): re-check parent checkpoint validity before pipelined L1 submission #22586

Merged
spalladino merged 7 commits into merge-train/spartan from palla/a-921-pipelining-fix-delayed-signature-invalidation
Apr 21, 2026

@spalladino (Contributor) commented Apr 15, 2026

Motivation

With pipelining enabled, the sequencer optimistically builds a checkpoint on top of a proposed parent. If that parent checkpoint landed on L1 with invalid attestations, the pipelined checkpoint never invalidated it, because the invalidation was cleared at build time under the assumption that the parent would handle it.

Approach

At submission time (after the pipelining sleep), the sequencer now waits for the parent checkpoint to land on L1, then verifies that it matches expectations: the hash is correct, the attestations are valid, and no unexpected checkpoint has appeared. If the parent is invalid, the pipelined work is discarded and an invalidation is enqueued instead. The skipInvalidateBlockAsProposer config is respected so the committee member fallback path still works.
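
The decision described above can be sketched roughly as a pure function. This is a minimal illustration, not the code in this PR: the type names, the `decideOnParent` function, the reason strings, and the order of the checks are all assumptions made for the example. Only the five failure conditions and the skipInvalidateBlockAsProposer flag come from the description.

```typescript
// Hypothetical reason strings; the PR lists the five conditions but not these exact values.
type DiscardReason =
  | "archiver-sync-timeout"
  | "parent-not-on-l1"
  | "parent-hash-mismatch"
  | "parent-invalid-attestations"
  | "unexpected-parent-appeared";

// Hypothetical view of what the archiver reports once we stop waiting.
interface ObservedL1State {
  archiverSynced: boolean; // did the archiver reach the target slot in time?
  parent?: { hash: string; attestationsValid: boolean }; // checkpoint seen on L1 for the parent slot, if any
}

interface Decision {
  proceed: boolean; // submit the pipelined checkpoint?
  reason?: DiscardReason; // set only when the pipelined work is discarded
  enqueueInvalidation: boolean; // enqueue an invalidation for the parent?
}

function decideOnParent(
  expectedParentHash: string | undefined, // undefined: we built without a proposed parent
  observed: ObservedL1State,
  skipInvalidateBlockAsProposer: boolean,
): Decision {
  const proceed: Decision = { proceed: true, enqueueInvalidation: false };
  const discard = (reason: DiscardReason, invalidate = false): Decision => ({
    proceed: false,
    reason,
    // The config flag suppresses the proposer-side invalidation (committee fallback takes over).
    enqueueInvalidation: invalidate && !skipInvalidateBlockAsProposer,
  });

  if (!observed.archiverSynced) return discard("archiver-sync-timeout");
  if (expectedParentHash === undefined) {
    // Built without a proposed parent: nothing may have appeared for that slot.
    return observed.parent ? discard("unexpected-parent-appeared") : proceed;
  }
  if (!observed.parent) return discard("parent-not-on-l1");
  if (observed.parent.hash !== expectedParentHash) return discard("parent-hash-mismatch");
  if (!observed.parent.attestationsValid) return discard("parent-invalid-attestations", true);
  return proceed;
}
```

In this sketch only the invalid-attestations case enqueues an invalidation; whether the other failure cases should do so is left open here.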

Changes

  • sequencer-client (checkpoint_proposal_job): Restructured waitForAttestationsAndEnqueueSubmissionAsync to defer enqueueCheckpointForSubmission until after parent validation when pipelining. Added waitForParentCheckpointOnL1 which polls the archiver and checks 5 failure conditions: archiver sync timeout, parent not on L1, parent hash mismatch, parent invalid attestations, unexpected parent appeared. Added enqueueInvalidationForParent helper that respects skipInvalidateBlockAsProposer.
  • sequencer-client (events): New checkpoint-parent-mismatch event with slot, checkpoint number, and reason.
  • sequencer-client (metrics): New recordPipelineParentCheckpointMismatch counter with reason attribute.
  • telemetry-client: New SEQUENCER_PIPELINE_PARENT_CHECKPOINT_MISMATCH_COUNT metric definition.
  • sequencer-client (tests): 8 new unit tests covering all failure reasons and success paths via executeAndAwait, asserting publisher actions and emitted events.
  • end-to-end: Enabled pipelining (enableProposerPipelining: true, inboxLag: 2) in epochs_invalidate_block tests. Updated proposer invalidates multiple checkpoints test to verify the invalidation happens promptly (proposer path, not committee fallback).

Fixes A-909
Fixes A-921

@spalladino spalladino added the ci-no-fail-fast label (Sets NO_FAIL_FAST in the CI so the run is not aborted on the first failure) Apr 15, 2026
@spalladino spalladino marked this pull request as draft April 17, 2026 20:25
@spalladino spalladino force-pushed the palla/a-921-pipelining-fix-delayed-signature-invalidation branch from c554311 to 9aa1168 Compare April 17, 2026 22:33
@spalladino spalladino marked this pull request as ready for review April 20, 2026 14:31
@spalladino spalladino requested a review from Maddiaa0 April 20, 2026 20:39
private async waitForSyncedL2SlotNumber(waitForSlot: SlotNumber): Promise<boolean> {
  const targetSlotStart = Number(getTimestampForSlot(this.targetSlot, this.l1Constants));
  const targetSlotEndMs = (targetSlotStart + this.l1Constants.slotDuration) * 1000;
  const syncDelayTolerance = this.l1Constants.slotDuration * 1000;
@Maddiaa0 (Member) commented Apr 21, 2026

a tolerance of an entire l2 slot feels very long

Contributor Author

Agree. I intended an L1 slot but mistyped. Should've left this to Claude :-P

Contributor Author

Ended up setting it to two Ethereum slots in case there's a missed L1 slot
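
For reference, the arithmetic behind this thread can be sketched as follows. It is a rough illustration only: `slotDuration` appears in the snippet under review, while `ethereumSlotDuration`, the interface, and both function names are assumed for the example. The 72 s L2 slot length is an illustrative value, and 12 s is the Ethereum mainnet slot time.

```typescript
// Hypothetical shape: `slotDuration` is taken from the reviewed snippet, while
// `ethereumSlotDuration` is an assumed name for the L1 slot length (both in seconds).
interface SlotDurations {
  slotDuration: number;
  ethereumSlotDuration: number;
}

// Tolerance as first written: one full L2 slot, in milliseconds.
function originalToleranceMs(c: SlotDurations): number {
  return c.slotDuration * 1000;
}

// Tolerance after review: two L1 slots, so a single missed L1 slot is still covered.
function revisedToleranceMs(c: SlotDurations): number {
  return 2 * c.ethereumSlotDuration * 1000;
}

// Illustrative values only (see the note above the block).
const durations: SlotDurations = { slotDuration: 72, ethereumSlotDuration: 12 };
console.log(originalToleranceMs(durations)); // 72000
console.log(revisedToleranceMs(durations)); // 24000
```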

* - If we built without a proposed parent: no new checkpoint must have appeared for that slot.
* If the parent has invalid attestations, enqueues an invalidation. Returns whether to proceed with the proposal.
*/
protected async waitForParentCheckpointOnL1(): Promise<boolean> {
Member

name here doesn't quite capture all the work done by this function. maybe ensureParentCheckpointValidity?

Contributor Author

Renamed to waitForValidParentCheckpointOnL1 - IMO emphasis should be on the waiting part

@Maddiaa0 (Member) left a comment

great stuff

@spalladino spalladino force-pushed the palla/a-921-pipelining-fix-delayed-signature-invalidation branch from 10ff310 to 44206f5 Compare April 21, 2026 17:07
@spalladino spalladino enabled auto-merge (squash) April 21, 2026 17:07
spalladino and others added 7 commits April 21, 2026 14:17
…L1 submission

With pipelining, the sequencer optimistically builds on top of a proposed parent checkpoint. At submission time, we now wait for the parent to land on L1 and verify it matches expectations (hash, attestation validity). If invalid, the pipelined work is discarded and an invalidation is enqueued instead.

Fixes A-921

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The pipelining test was missing mocks for getSyncedL2SlotNumber and
getL2Tips, causing waitForParentCheckpointOnL1 to hang in a retryUntil
loop until the 120s Jest timeout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Avoid infinite loop in waitForSyncedL2SlotNumber when target slot has already expired (retryUntil treats timeout=0 as never-timeout)
- Rename checkpoint-parent-mismatch event to pipelined-checkpoint-discarded since it also fires for archiver-sync-timeout
- Replace `as any` casts in unit tests with proper InvalidateCheckpointRequest type

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
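
The first fix in the commit above (the expired-slot guard) can be illustrated with a small sketch. The `retryUntil` below is a local stand-in written for this example with the semantics stated in the commit message (a timeout of 0 never times out); it is not the real helper, and `remainingSeconds` and `waitUntilSynced` are hypothetical names.

```typescript
// Local stand-in for a polling helper: polls `fn` until it returns a value.
// A timeout of 0 means "never time out", which is the behavior the commit message describes.
async function retryUntil<T>(
  fn: () => Promise<T | undefined>,
  timeoutSec: number,
  intervalSec: number,
): Promise<T> {
  const start = Date.now();
  for (;;) {
    const result = await fn();
    if (result !== undefined) return result;
    if (timeoutSec > 0 && Date.now() - start > timeoutSec * 1000) throw new Error("timeout");
    await new Promise<void>(resolve => setTimeout(resolve, intervalSec * 1000));
  }
}

// Remaining time in seconds, or undefined if the deadline has already passed.
function remainingSeconds(deadlineMs: number, nowMs: number): number | undefined {
  const remainingMs = deadlineMs - nowMs;
  return remainingMs > 0 ? remainingMs / 1000 : undefined;
}

// Hypothetical wrapper: resolves to false on timeout or when the deadline has expired.
async function waitUntilSynced(isSynced: () => Promise<boolean>, deadlineMs: number): Promise<boolean> {
  const timeoutSec = remainingSeconds(deadlineMs, Date.now());
  // Guard: an expired deadline must not reach retryUntil as timeout=0,
  // because that would poll forever instead of failing fast.
  if (timeoutSec === undefined) return false;
  try {
    await retryUntil(async () => ((await isSynced()) ? true : undefined), timeoutSec, 0.1);
    return true;
  } catch {
    return false;
  }
}
```

Returning `undefined` rather than `0` for an expired deadline keeps the "no time left" case distinct from the helper's "no timeout" sentinel.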
@spalladino spalladino force-pushed the palla/a-921-pipelining-fix-delayed-signature-invalidation branch from 44206f5 to 9990a8b Compare April 21, 2026 17:17
@spalladino spalladino merged commit b3a425c into merge-train/spartan Apr 21, 2026
12 checks passed
@spalladino spalladino deleted the palla/a-921-pipelining-fix-delayed-signature-invalidation branch April 21, 2026 17:38