Skip to content

fix(prover-client): reject stale job promises and count timeouts toward retry limit#21842

Merged
PhilWindle merged 4 commits intomerge-train/spartanfrom
spy/fix-proving-broker-audit-711-715
Apr 21, 2026
Merged

fix(prover-client): reject stale job promises and count timeouts toward retry limit#21842
PhilWindle merged 4 commits intomerge-train/spartanfrom
spy/fix-proving-broker-audit-711-715

Conversation

@spypsy
Copy link
Copy Markdown
Member

@spypsy spypsy commented Mar 20, 2026

Summary

  • Fixes A-711: cleanUpProvingJobState was calling deferred.promise.catch(() => {}) before deferred.reject() to suppress unhandled rejections, but this doesn't work — .catch() creates a new branched promise; any code already awaiting the original promise still receives an unhandled rejection. Fixed by resolving with { status: 'rejected', reason: '...' } instead, consistent with how the rest of the class settles promises, and making unhandled rejections impossible.

Fixes A-711

spypsy added 2 commits March 20, 2026 13:54
…rd retry limit

Fixes A-711: cleanUpProvingJobState was deleting promises without settling
them first, causing any awaiter to hang forever. Now rejects each unsettled
promise before removal.

Fixes A-715: timed-out jobs were re-enqueued without incrementing the retry
counter, allowing them to loop forever bypassing maxRetries. Now increments
the retry count on each timeout re-enqueue.

Adds regression tests for both fixes.

Made-with: Cursor
Base automatically changed from merge-train/spartan to next March 20, 2026 22:34
@PhilWindle PhilWindle requested a review from alexghr March 25, 2026 09:29
@@ -632,8 +637,10 @@ export class ProvingBroker implements ProvingJobProducer, ProvingJobConsumer, Pr
const now = this.msTimeSource();
const msSinceLastUpdate = now - metadata.lastUpdatedAt;
if (msSinceLastUpdate >= this.jobTimeoutMs) {
const retries = this.retries.get(id) ?? 0;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to make this change. We should limit this PR to the other change only.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

undid A-715 fixes, will do in different PR

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This setting of this.retries should have been removed shouldn't it?

Comment thread yarn-project/prover-client/src/proving_broker/proving_broker.ts Outdated
Comment thread yarn-project/prover-client/src/proving_broker/proving_broker.ts Outdated
@spypsy spypsy changed the base branch from next to merge-train/spartan April 2, 2026 10:59
@spypsy spypsy force-pushed the spy/fix-proving-broker-audit-711-715 branch 3 times, most recently from 6ecca5c to 1da727e Compare April 2, 2026 11:54
@spypsy spypsy force-pushed the spy/fix-proving-broker-audit-711-715 branch from 1da727e to 5034e17 Compare April 7, 2026 11:32
@PhilWindle PhilWindle merged commit aef1970 into merge-train/spartan Apr 21, 2026
12 checks passed
@PhilWindle PhilWindle deleted the spy/fix-proving-broker-audit-711-715 branch April 21, 2026 11:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants