GH#15317: fix dispatch claim system — 5 design flaws causing duplicate dispatch #15324
marcusquinn wants to merge 1 commit into main
…317) Five design flaws caused duplicate dispatches, claim spam, and dispatch loops:
1. 'Dispatching worker' comment was LLM-posted, not deterministic: workers could die before posting, leaving no persistent signal for Layer 5 dedup. Now posted by dispatch_with_dedup() after confirming the worker PID is alive.
2. Layer 5 (has_dispatch_comment) skipped self-posted comments: the same runner's dispatch comment was invisible to its own dedup check. Removed self-skip.
3. Self-reclaim allowed the same runner to bypass claim dedup after 30s, which created dispatch loops. Replaced with stale-self detection that rejects and cleans up.
4. DISPATCH_CLAIM comments were never cleaned up: winner claims persisted forever. Now deleted after the deterministic dispatch comment is posted.
5. Pulse LLM was instructed to post 'Dispatching worker' manually, which could produce duplicates or be skipped. Updated pulse.md/pulse-sweep.md/automate.md to document automatic posting.
Evidence: awardsapp #2051 had 29 DISPATCH_CLAIM comments over 6 hours; aidevops #15285 had duplicate workers dispatched.
Caution: Review failed. Pull request was closed or merged during review.
Walkthrough
The dispatch claim system is restructured to eliminate duplicate worker dispatches by removing self-reclaim logic, making dispatch comment posting deterministic via the wrapper rather than relying on worker sessions, blocking on all recent dispatch comments regardless of author, and cleaning up stale claim artifacts post-dispatch.
Sequence Diagram
sequenceDiagram
actor Runner
participant PulseWrapper as pulse-wrapper.sh
participant ClaimHelper as dispatch-claim-helper.sh
participant DedupHelper as dispatch-dedup-helper.sh
participant GH as GitHub API
Runner->>PulseWrapper: dispatch_with_dedup()
PulseWrapper->>ClaimHelper: claim (with nonce)
alt Same Runner, Stale Nonce
ClaimHelper->>ClaimHelper: Detect CLAIM_STALE_SELF
ClaimHelper->>GH: Delete stale claim
ClaimHelper->>GH: Delete fresh claim
ClaimHelper-->>PulseWrapper: return 1 (CLAIM_STALE_SELF)
else Claim Success
ClaimHelper->>GH: Create DISPATCH_CLAIM comment
ClaimHelper-->>PulseWrapper: return 0 + comment_id
end
alt Stale Claim
PulseWrapper-->>Runner: Fail-open, exit 2
else Claim Won
PulseWrapper->>DedupHelper: has_dispatch_comment()
DedupHelper->>GH: Query recent dispatch comments
alt Recent Dispatch Comment Exists (Any Author)
DedupHelper-->>PulseWrapper: return 0 (blocked)
PulseWrapper-->>Runner: Skip dispatch
else No Dispatch Comment
DedupHelper-->>PulseWrapper: return 1 (proceed)
PulseWrapper->>GH: Launch worker
PulseWrapper->>GH: Post deterministic "Dispatching worker" comment
PulseWrapper->>GH: Delete DISPATCH_CLAIM (via comment_id)
PulseWrapper-->>Runner: Success
end
end
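The ordering in the diagram above can be sketched in shell roughly as follows. This is a minimal illustration, not the code from pulse-wrapper.sh: every helper is a hypothetical stub that only records that it was called, the comment id 42 is made up, and the return codes for the skip paths are assumptions loosely based on the diagram.

```shell
# Hypothetical stubs: each one appends its name to CALL_LOG instead of
# talking to the GitHub API or starting a real worker.
log_call() { printf '%s\n' "$1" >> "${CALL_LOG:-/dev/null}"; }

claim()                 { log_call "claim"; CLAIM_COMMENT_ID=42; return "${CLAIM_RC:-0}"; }
has_dispatch_comment()  { log_call "dedup-check"; return "${DEDUP_RC:-1}"; }
launch_worker()         { log_call "launch"; }
worker_alive()          { return "${ALIVE_RC:-0}"; }
post_dispatch_comment() { log_call "post-dispatch-comment"; }
delete_claim_comment()  { log_call "delete-claim:$1"; }

dispatch_with_dedup() {
    # Step 1: try to win the claim; a lost or stale-self claim stops here.
    claim || return 2
    # Step 2: any recent dispatch comment, from any author, blocks dispatch.
    if has_dispatch_comment; then
        return 1
    fi
    # Step 3: launch, then announce the worker only once it is confirmed alive.
    launch_worker
    worker_alive || return 1
    post_dispatch_comment
    # Step 4: the dispatch comment is now the persistent signal, so the
    # short-lived claim comment can be removed.
    delete_claim_comment "$CLAIM_COMMENT_ID"
    return 0
}
```

Setting `DEDUP_RC=0` before calling `dispatch_with_dedup` simulates an existing dispatch comment: the function then stops after the dedup check and never launches a worker.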
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 5 passed
Completion Summary
This summary was written by the worker at PR creation time for the deterministic merge pass.
🔍 Code Quality Report
[MONITOR] Code Review Monitoring Report
SonarCloud: 0 bugs, 0 vulnerabilities, 1 code smell
Wed Apr 1 21:54:46 UTC 2026: Code review monitoring started
Generated on: Wed Apr 1 21:54:49 UTC 2026
Generated by AI DevOps Framework Code Review Monitoring
Summary
Fixes 5 design flaws in the dispatch claim system that caused duplicate worker dispatches, claim comment spam, and same-runner dispatch loops.
Closes #15317
Root Causes Fixed
Fix 1: Deterministic "Dispatching worker" comment (pulse-wrapper.sh)
Problem: The "Dispatching worker" comment was LLM-posted by the worker session. Workers could crash before posting, leaving no persistent signal for Layer 5 dedup. Without this signal, the issue would be re-dispatched every pulse cycle.
Fix: dispatch_with_dedup() now posts the comment deterministically after confirming the worker PID is alive. Workers no longer need to post this comment.
Evidence: #2051 had 29 DISPATCH_CLAIM comments over 6 hours because workers kept dying before posting.
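The liveness check described in Fix 1 might look something like the sketch below. It is an illustration under stated assumptions: `post_comment` and `post_dispatch_comment_if_alive` are hypothetical names, and the stub writes to a local log file where the real wrapper would call GitHub.

```shell
# Hypothetical stand-in for the real GitHub call; it only records the comment.
post_comment() {
    # $1 = issue number, $2 = comment body
    printf '%s\t%s\n' "$1" "$2" >> "${COMMENT_LOG:-/dev/null}"
}

# Post the dispatch comment only if the worker process is still running.
# Returns 0 if the comment was posted, 1 if the worker was already gone.
post_dispatch_comment_if_alive() {
    local issue="$1" worker_pid="$2"
    # kill -0 sends no signal; it succeeds only if the PID exists and
    # the current user is allowed to signal it.
    if kill -0 "$worker_pid" 2>/dev/null; then
        post_comment "$issue" "Dispatching worker"
        return 0
    fi
    return 1
}
```

Because the wrapper, not the worker session, makes this call, a worker that crashes later still leaves the comment behind for the Layer 5 check.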
Fix 2: Remove self-skip in Layer 5 (dispatch-dedup-helper.sh)
Problem: has_dispatch_comment() skipped comments from self_login. The same runner's dispatch comment was invisible to its own dedup check, allowing re-dispatch every cycle.
Fix: All dispatch comments block regardless of author. The self_login parameter is kept for backward compatibility but no longer filters.
Evidence: #2040 had alex-solovyev dispatch twice (20:33 and 22:57) because Layer 5 skipped its own dispatch comment.
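A rough sketch of the author-agnostic check follows. The real has_dispatch_comment() queries GitHub; this version reads a tab-separated file instead, and the file format, argument order, and `max_age` window are all assumptions made for the example. The return convention (0 = blocked, 1 = proceed) follows the sequence diagram.

```shell
# Comments file format (an assumption): author<TAB>epoch<TAB>body
# Returns 0 (blocked) if any recent "Dispatching worker" comment exists,
# 1 (proceed) otherwise. self_login is accepted for compatibility with
# existing callers but is deliberately never used as a filter.
has_dispatch_comment() {
    local comments_file="$1" self_login="$2" now="$3" max_age="$4"
    awk -F '\t' -v now="$now" -v max_age="$max_age" '
        # The author field ($1) is never compared against anything.
        index($3, "Dispatching worker") == 1 && (now - $2) <= max_age { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$comments_file"
}
```

With this shape, a runner's own comment blocks it just like anyone else's, which is the behavior the #2040 evidence calls for.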
Fix 3: Remove self-reclaim from claim protocol (dispatch-claim-helper.sh)
Problem: After 30s, the same runner could "reclaim" its own stale claim, enabling re-dispatch loops: claim → dispatch → worker dies → 30s → self-reclaim → dispatch again.
Fix: Stale same-runner claims are now treated as lost (exit 1). Both the stale claim and the fresh one are deleted. Re-dispatch requires explicit kill/failure comment from the pulse.
Evidence: #2051 had 25 claims from alex-solovyev over 6 hours via self-reclaim loop.
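The stale-self rule could be sketched as below. This is not the code from dispatch-claim-helper.sh: the claims file format, the function names, and the `delete_comment` stub are hypothetical, and only the 30s threshold and the CLAIM_STALE_SELF label come from the PR.

```shell
# Hypothetical stand-in for the GitHub delete call; records the id instead.
delete_comment() {
    printf '%s\n' "$1" >> "${DELETE_LOG:-/dev/null}"
}

# Claims file format (an assumption): comment_id<TAB>runner<TAB>nonce<TAB>epoch
# Returns 1 (claim lost) if this runner already has a claim under a different
# nonce that is older than the stale threshold; returns 0 otherwise.
reject_stale_self() {
    local claims_file="$1" runner="$2" nonce="$3" now="$4" stale_after="${5:-30}"
    local stale_ids fresh_ids id
    # Appending "" forces string comparison so numeric-looking nonces
    # such as 007 and 7 are not treated as equal.
    stale_ids=$(awk -F '\t' -v r="$runner" -v n="$nonce" -v now="$now" -v s="$stale_after" \
        '$2 == r && ($3 "") != (n "") && (now - $4) > s { print $1 }' "$claims_file")
    if [ -z "$stale_ids" ]; then
        return 0
    fi
    fresh_ids=$(awk -F '\t' -v r="$runner" -v n="$nonce" \
        '$2 == r && ($3 "") == (n "") { print $1 }' "$claims_file")
    # Delete both the stale claim and the fresh one instead of reclaiming.
    for id in $stale_ids $fresh_ids; do
        delete_comment "$id"
    done
    echo "CLAIM_STALE_SELF"
    return 1
}
```

Treating the stale claim as a loss rather than something to take over is what breaks the claim, dispatch, worker dies, reclaim cycle described above.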
Fix 4: Clean up DISPATCH_CLAIM comment after dispatch (pulse-wrapper.sh)
Problem: Winner claim comments persisted forever as "audit trail", producing comment spam on issues.
Fix: After posting the deterministic dispatch comment, dispatch_with_dedup() deletes the DISPATCH_CLAIM comment. The claim served its purpose (8s consensus window); the dispatch comment is the persistent signal.
Fix 5: Document automatic dispatch comment posting (pulse.md, pulse-sweep.md, automate.md)
Problem: pulse.md instructed the LLM to post "Dispatching worker" manually, which could produce duplicates or be skipped entirely.
Fix: Updated docs to note that dispatch_with_dedup() posts this comment automatically. LLM sessions no longer need to post it.
Files Changed
- .agents/scripts/dispatch-claim-helper.sh: Remove self-reclaim, add stale-self detection
- .agents/scripts/dispatch-dedup-helper.sh: Remove self-skip in has_dispatch_comment()
- .agents/scripts/pulse-wrapper.sh: Post deterministic dispatch comment, clean up claim comment
- .agents/scripts/commands/pulse.md: Document automatic dispatch comment
- .agents/scripts/commands/pulse-sweep.md: Document automatic dispatch comment
- .agents/automate.md: Document automatic dispatch comment
- .agents/scripts/tests/test-dispatch-claim-helper.sh: Tests for new self-reclaim behavior
Runtime Testing
Risk level: High (dispatch coordination, state machines, cross-machine locking)
test-dispatch-claim-helper.sh)
Key Decisions
self_login kept in has_dispatch_comment signature: Backward compatibility. Callers pass it; removing it would require updating all call sites.
aidevops.sh v3.5.580 plugin for OpenCode v1.3.13 with claude-sonnet-4-6 spent 5m on this as a headless worker.