fix: atomic claim prevents scheduled tasks from executing twice by gabi-simons · Pull Request #657 · nanocoai/nanoclaw

gabi-simons · 2026-03-02T20:43:49Z

Summary

Fixes the scheduler race condition where tasks execute twice because enqueueTask() only deduplicates against pendingTasks — a running task has already been shifted out, so the next poll re-queues it.

Closes #138, #211, #300, #578, #669

The fix

3-line fix in group-queue.ts: Add runningTaskId to GroupState. enqueueTask() now checks both the pending array and the currently-running task ID, rejecting duplicates at the queue layer where the state lives.

Drift fix in task-scheduler.ts: computeNextRun() anchors interval tasks to scheduled_time + N*interval instead of Date.now() + interval, preventing cumulative timing drift. Includes a guard against zero/negative interval values.

Why not "atomic claim" (previous approach in this PR)

The first commits implemented claimDueTasks() — a SQLite transaction that advanced next_run before execution. Per review, this had real downsides:

Once-task crash risk: process dies after claim → task orphaned forever
No retry on failure: next_run already skipped ahead
Misleading transaction: Node.js is single-threaded, better-sqlite3 is synchronous — no interleaving possible

The dedup belongs in GroupQueue, not the DB layer.

Test plan

334/334 tests pass (npx vitest run)
New regression test: rejects duplicate enqueue of a currently-running task
Drift test: interval anchored to scheduled time, not wall clock
Deploy to staging and verify no duplicate task runs over 24h

Credits

Drift logic from PR #601 by @taslim. Issues reported by @baijunjie (#138), @Michaelliv (#300), and @kk17 (#669).

🤖 Generated with Claude Code

@taslim

…coai#138) Replace the two-phase getDueTasks() + deferred updateTaskAfterRun() with an atomic SQLite transaction (claimDueTasks) that advances next_run BEFORE dispatching tasks to the queue. This eliminates the race window where subsequent scheduler polls re-discover in-progress tasks. Key changes: - claimDueTasks(): SELECT + UPDATE in a single db.transaction(), so no poll can read stale next_run values. Once-tasks get next_run=NULL; recurring tasks get next_run advanced to the future. - computeNextRun(): anchors interval tasks to the scheduled time (not Date.now()) to prevent cumulative drift. Includes a while-loop to skip missed intervals and a guard against invalid interval values. - updateTaskAfterRun(): simplified to only record last_run/last_result since next_run is already handled by the claim. Closes nanocoai#138, nanocoai#211, nanocoai#300, nanocoai#578 Co-authored-by: @taslim (PR nanocoai#601) Co-authored-by: @baijunjie (Issue nanocoai#138) Co-authored-by: @Michaelliv (Issue nanocoai#300) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

gavrielc · 2026-03-02T21:37:09Z

The drift fix (anchoring to scheduled_time + N*interval) is a solid improvement — worth keeping regardless of approach.

But I think the dedup is being solved at the wrong layer. The GroupQueue already deduplicates pending tasks by ID (group-queue.ts:94), but that check misses currently-running tasks (they've been shifted out of pendingTasks). So the next poll re-discovers the task and re-queues it. That's the actual bug.

Advancing next_run before execution works around it, but has real downsides:

Once-task crash risk: process dies after claim (next_run = NULL) but before execution → task is orphaned forever (active status, but no poll finds it).
No retry on failure: execution fails, but next_run already skipped ahead.
Transaction framing is misleading: Node.js is single-threaded, better-sqlite3 is synchronous — no interleaving is possible. The transaction provides no protection over sequential statements.

The fix is ~3 lines in group-queue.ts — track the running task ID in GroupState and include it in the dedup check:

// Add to GroupState: runningTaskId: string | null
// In enqueueTask: if (state.isTaskContainer && state.runningTaskId === taskId) return;
// Set/clear in runTask()

This keeps next_run accurate (updated post-execution), preserves crash recovery, and avoids coupling scheduler logic into the DB layer. The computeNextRun drift fix can be applied in updateTaskAfterRun where it belongs.

@taslim

…on (nanocoai#138) Previous commits implemented an "atomic claim" approach (claimDueTasks) that advanced next_run before execution. Per Gavriel's review, this solved the symptom at the wrong layer and introduced crash-recovery risks for once-tasks. This commit reverts claimDueTasks and instead fixes the actual bug: GroupQueue.enqueueTask() only checked pendingTasks for duplicates, but running tasks had already been shifted out. Adding runningTaskId to GroupState closes that gap with a 3-line fix at the correct layer. The computeNextRun() drift fix is retained, applied post-execution where it belongs. Closes nanocoai#138, nanocoai#211, nanocoai#300, nanocoai#578 Co-authored-by: @taslim (PR nanocoai#601) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- nanocoai#657: computeNextRun() prevents interval drift on scheduled tasks - nanocoai#655: auto-retry with fresh session when resume fails (breaks infinite loops) - nanocoai#636: update next_run before enqueue to prevent duplicate task runs - nanocoai#651: reduce container stop timeout to 1s for faster restarts - nanocoai#622: add jq to container Dockerfile for safe JSON parsing - nanocoai#627: inject date/time context into all agent prompts

Koshkoshinsk · 2026-03-04T10:33:11Z

E2E Test Results

@gabi-simons — Full end-to-end testing completed. All checks pass.

Summary

#	Check	Expected	Result
1	Phase 2 (main): duplicate execution	2+ runs	2 runs (10:21:33 & 10:21:48) — bug reproduced
2	Phase 3 (PR): "Task already running, skipping"	Duplicate rejected	3 duplicates rejected
3	Phase 3 (PR): single execution in DB	1 row	1 row (`success`, 49793ms)
4	Phase 4: `npm test`	0 failures	351 tests passed (30 files, 0 failures)

Test Details

Bug reproduction (main): Reduced poll interval to 15s, created an interval task with sleep 25. On main, the task was enqueued twice within 15 seconds — confirmed duplicate execution.
Fix verification (PR branch): Same setup. Logs showed Task already running, skipping on 3 consecutive polls while the container was active. DB confirmed exactly 1 execution.
Unit tests: All 351 tests pass including the 4 new tests (computeNextRun anchoring, once-task null return, missed interval skip, duplicate enqueue rejection).
Edge case review: Verified that different tasks for the same group are not blocked, runningTaskId clears properly in finally on both success and failure paths, and messages queue correctly while tasks run.

No regressions found.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Taslim <9999802+taslim@users.noreply.github.com> Co-Authored-By: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-Authored-By: Michael <13676242+Michaelliv@users.noreply.github.com> Co-Authored-By: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com>

gavrielc · 2026-03-04T14:23:45Z

Great fix. Thanks!!!

Merge upstream/main which includes: - fix: scheduler race condition preventing duplicate task execution (nanocoai#657) - refactor: multi-channel architecture with self-registering channels (nanocoai#500) - feat: local whisper skill (nanocoai#702) - fix: WhatsApp error handling (nanocoai#695) Adapted our Telegram channel to use the new channel registry pattern (registerChannel + factory). Removed direct channel instantiation from index.ts in favor of the upstream barrel import approach. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@taslim

…coai#657) * fix: atomic claim prevents scheduled tasks from executing twice (nanocoai#138) Replace the two-phase getDueTasks() + deferred updateTaskAfterRun() with an atomic SQLite transaction (claimDueTasks) that advances next_run BEFORE dispatching tasks to the queue. This eliminates the race window where subsequent scheduler polls re-discover in-progress tasks. Key changes: - claimDueTasks(): SELECT + UPDATE in a single db.transaction(), so no poll can read stale next_run values. Once-tasks get next_run=NULL; recurring tasks get next_run advanced to the future. - computeNextRun(): anchors interval tasks to the scheduled time (not Date.now()) to prevent cumulative drift. Includes a while-loop to skip missed intervals and a guard against invalid interval values. - updateTaskAfterRun(): simplified to only record last_run/last_result since next_run is already handled by the claim. Closes nanocoai#138, nanocoai#211, nanocoai#300, nanocoai#578 Co-authored-by: @taslim (PR nanocoai#601) Co-authored-by: @baijunjie (Issue nanocoai#138) Co-authored-by: @Michaelliv (Issue nanocoai#300) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * style: apply prettier formatting Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: track running task ID in GroupQueue to prevent duplicate execution (nanocoai#138) Previous commits implemented an "atomic claim" approach (claimDueTasks) that advanced next_run before execution. Per Gavriel's review, this solved the symptom at the wrong layer and introduced crash-recovery risks for once-tasks. This commit reverts claimDueTasks and instead fixes the actual bug: GroupQueue.enqueueTask() only checked pendingTasks for duplicates, but running tasks had already been shifted out. Adding runningTaskId to GroupState closes that gap with a 3-line fix at the correct layer. The computeNextRun() drift fix is retained, applied post-execution where it belongs. Closes nanocoai#138, nanocoai#211, nanocoai#300, nanocoai#578 Co-authored-by: @taslim (PR nanocoai#601) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add changelog entry for scheduler duplicate fix Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add contributors for scheduler race condition fix Co-Authored-By: Taslim <9999802+taslim@users.noreply.github.com> Co-Authored-By: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-Authored-By: Michael <13676242+Michaelliv@users.noreply.github.com> Co-Authored-By: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: gavrielc <gabicohen22@yahoo.com> Co-authored-by: Taslim <9999802+taslim@users.noreply.github.com> Co-authored-by: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-authored-by: Michael <13676242+Michaelliv@users.noreply.github.com> Co-authored-by: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com>

@taslim

…coai#657) * fix: atomic claim prevents scheduled tasks from executing twice (nanocoai#138) Replace the two-phase getDueTasks() + deferred updateTaskAfterRun() with an atomic SQLite transaction (claimDueTasks) that advances next_run BEFORE dispatching tasks to the queue. This eliminates the race window where subsequent scheduler polls re-discover in-progress tasks. Key changes: - claimDueTasks(): SELECT + UPDATE in a single db.transaction(), so no poll can read stale next_run values. Once-tasks get next_run=NULL; recurring tasks get next_run advanced to the future. - computeNextRun(): anchors interval tasks to the scheduled time (not Date.now()) to prevent cumulative drift. Includes a while-loop to skip missed intervals and a guard against invalid interval values. - updateTaskAfterRun(): simplified to only record last_run/last_result since next_run is already handled by the claim. Closes nanocoai#138, nanocoai#211, nanocoai#300, nanocoai#578 Co-authored-by: @taslim (PR nanocoai#601) Co-authored-by: @baijunjie (Issue nanocoai#138) Co-authored-by: @Michaelliv (Issue nanocoai#300) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * style: apply prettier formatting Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: track running task ID in GroupQueue to prevent duplicate execution (nanocoai#138) Previous commits implemented an "atomic claim" approach (claimDueTasks) that advanced next_run before execution. Per Gavriel's review, this solved the symptom at the wrong layer and introduced crash-recovery risks for once-tasks. This commit reverts claimDueTasks and instead fixes the actual bug: GroupQueue.enqueueTask() only checked pendingTasks for duplicates, but running tasks had already been shifted out. Adding runningTaskId to GroupState closes that gap with a 3-line fix at the correct layer. The computeNextRun() drift fix is retained, applied post-execution where it belongs. Closes nanocoai#138, nanocoai#211, nanocoai#300, nanocoai#578 Co-authored-by: @taslim (PR nanocoai#601) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add changelog entry for scheduler duplicate fix Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add contributors for scheduler race condition fix Co-Authored-By: Taslim <9999802+taslim@users.noreply.github.com> Co-Authored-By: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-Authored-By: Michael <13676242+Michaelliv@users.noreply.github.com> Co-Authored-By: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: gavrielc <gabicohen22@yahoo.com> Co-authored-by: Taslim <9999802+taslim@users.noreply.github.com> Co-authored-by: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-authored-by: Michael <13676242+Michaelliv@users.noreply.github.com> Co-authored-by: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com>

# [1.3.0](v1.2.0...v1.3.0) (2026-03-08) ### Bug Fixes * add-voice-transcription skill drops WhatsApp registerChannel call ([nanocoai#766](https://github.com/terrylica/nanoclaw/issues/766)) ([47ad2e6](47ad2e6)) * aggressive false positive prevention — 5-layer MiniMax pipeline, devil's advocate round, FP learning ([8bfa372](8bfa372)) * atomic claim prevents scheduled tasks from executing twice ([nanocoai#657](https://github.com/terrylica/nanoclaw/issues/657)) ([f794185](f794185)), closes [nanocoai#138](https://github.com/terrylica/nanoclaw/issues/138) [nanocoai#138](https://github.com/terrylica/nanoclaw/issues/138) [nanocoai#211](https://github.com/terrylica/nanoclaw/issues/211) [nanocoai#300](https://github.com/terrylica/nanoclaw/issues/300) [nanocoai#578](https://github.com/terrylica/nanoclaw/issues/578) [nanocoai#601](https://github.com/terrylica/nanoclaw/issues/601) [nanocoai#138](https://github.com/terrylica/nanoclaw/issues/138) [nanocoai#300](https://github.com/terrylica/nanoclaw/issues/300) [nanocoai#138](https://github.com/terrylica/nanoclaw/issues/138) * cc-skills now reads label strategy + content types; Claude JSON parsing hardened ([fd7fc7f](fd7fc7f)) * correct misleading send_message tool description for scheduled tasks ([nanocoai#729](https://github.com/terrylica/nanoclaw/issues/729)) ([ec0e42b](ec0e42b)) * **db:** add LIMIT to unbounded message history queries ([nanocoai#692](https://github.com/terrylica/nanoclaw/issues/692)) ([nanocoai#735](https://github.com/terrylica/nanoclaw/issues/735)) ([74b02c8](74b02c8)) * format src/index.ts to pass CI prettier check ([nanocoai#711](https://github.com/terrylica/nanoclaw/issues/711)) ([df2bac6](df2bac6)), closes [nanocoai#710](https://github.com/terrylica/nanoclaw/issues/710) * grant write permissions to CLAUDE.md maintenance claude -p call ([9ddb433](9ddb433)) * rename _chatJid to chatJid in onMessage callback ([1436186](1436186)) * use 'state' instead of 'stateReason' for gh compatibility on bigblack ([a4f2e92](a4f2e92)) * **whatsapp:** add error handling to messages.upsert handler ([nanocoai#695](https://github.com/terrylica/nanoclaw/issues/695)) ([5e3d8b6](5e3d8b6)) * **whatsapp:** write pairing code to file for immediate access ([nanocoai#745](https://github.com/terrylica/nanoclaw/issues/745)) ([be19911](be19911)) ### Features * add /add-ollama skill for local model inference ([nanocoai#712](https://github.com/terrylica/nanoclaw/issues/712)) ([298c3ea](298c3ea)) * add ast-grep rules for Python static analysis ([a548761](a548761)) * add mise deploy task for bigblack deployment ([c39a1f4](c39a1f4)) * add NDJSON telemetry logging for all Telegram messages ([7f64ea6](7f64ea6)) * add update_task tool and return task ID from schedule_task ([68123fd](68123fd)) * cc-skills integration — enhanced issue creation with taxonomy-aware labels, type-specific templates, and discovery provenance ([602e65d](602e65d)) * CLAUDE.md maintenance creates GitHub issues with full link to Telegram ([ba34620](ba34620)) * CLAUDE.md maintenance, devil's advocate fix, OpenGrep + proactive scanning ([ce66e88](ce66e88)) * confidence scoring, verification scripts, log rotation — 3 more FP prevention layers ([0ff2c3c](0ff2c3c)) * iterative MiniMax self-validation (3 adversarial rounds) ([fc05aff](fc05aff)) * Phase 0 — enable Telegram channel and Docker Compose deployment ([ebbf59c](ebbf59c)) * Phase 2 — MiniMax orchestrator loop for continuous validation ([17e90a3](17e90a3)) * proactive algo correctness scanning with full Telegram + GitHub issue reporting ([4b68c3e](4b68c3e)) * **skills:** add image vision skill for WhatsApp ([nanocoai#770](https://github.com/terrylica/nanoclaw/issues/770)) ([af937d6](af937d6)) * **skills:** add pdf-reader skill ([nanocoai#772](https://github.com/terrylica/nanoclaw/issues/772)) ([0b260ec](0b260ec)) * **skills:** add use-local-whisper skill package ([nanocoai#702](https://github.com/terrylica/nanoclaw/issues/702)) ([03f792b](03f792b)) * timezone-aware context injection for agent prompts ([nanocoai#691](https://github.com/terrylica/nanoclaw/issues/691)) ([632713b](632713b)), closes [nanocoai#483](https://github.com/terrylica/nanoclaw/issues/483) [nanocoai#483](https://github.com/terrylica/nanoclaw/issues/483) [nanocoai#526](https://github.com/terrylica/nanoclaw/issues/526) * whole-repo scanning instead of 3-file batches ([1ace951](1ace951)) * wire trace UUIDs into all Telegram notifications ([b48f0e9](b48f0e9))

@taslim

…coai#657) * fix: atomic claim prevents scheduled tasks from executing twice (nanocoai#138) Replace the two-phase getDueTasks() + deferred updateTaskAfterRun() with an atomic SQLite transaction (claimDueTasks) that advances next_run BEFORE dispatching tasks to the queue. This eliminates the race window where subsequent scheduler polls re-discover in-progress tasks. Key changes: - claimDueTasks(): SELECT + UPDATE in a single db.transaction(), so no poll can read stale next_run values. Once-tasks get next_run=NULL; recurring tasks get next_run advanced to the future. - computeNextRun(): anchors interval tasks to the scheduled time (not Date.now()) to prevent cumulative drift. Includes a while-loop to skip missed intervals and a guard against invalid interval values. - updateTaskAfterRun(): simplified to only record last_run/last_result since next_run is already handled by the claim. Closes nanocoai#138, nanocoai#211, nanocoai#300, nanocoai#578 Co-authored-by: @taslim (PR nanocoai#601) Co-authored-by: @baijunjie (Issue nanocoai#138) Co-authored-by: @Michaelliv (Issue nanocoai#300) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * style: apply prettier formatting Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: track running task ID in GroupQueue to prevent duplicate execution (nanocoai#138) Previous commits implemented an "atomic claim" approach (claimDueTasks) that advanced next_run before execution. Per Gavriel's review, this solved the symptom at the wrong layer and introduced crash-recovery risks for once-tasks. This commit reverts claimDueTasks and instead fixes the actual bug: GroupQueue.enqueueTask() only checked pendingTasks for duplicates, but running tasks had already been shifted out. Adding runningTaskId to GroupState closes that gap with a 3-line fix at the correct layer. The computeNextRun() drift fix is retained, applied post-execution where it belongs. Closes nanocoai#138, nanocoai#211, nanocoai#300, nanocoai#578 Co-authored-by: @taslim (PR nanocoai#601) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add changelog entry for scheduler duplicate fix Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add contributors for scheduler race condition fix Co-Authored-By: Taslim <9999802+taslim@users.noreply.github.com> Co-Authored-By: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-Authored-By: Michael <13676242+Michaelliv@users.noreply.github.com> Co-Authored-By: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: gavrielc <gabicohen22@yahoo.com> Co-authored-by: Taslim <9999802+taslim@users.noreply.github.com> Co-authored-by: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-authored-by: Michael <13676242+Michaelliv@users.noreply.github.com> Co-authored-by: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com>

@taslim

…coai#657) * fix: atomic claim prevents scheduled tasks from executing twice (nanocoai#138) Replace the two-phase getDueTasks() + deferred updateTaskAfterRun() with an atomic SQLite transaction (claimDueTasks) that advances next_run BEFORE dispatching tasks to the queue. This eliminates the race window where subsequent scheduler polls re-discover in-progress tasks. Key changes: - claimDueTasks(): SELECT + UPDATE in a single db.transaction(), so no poll can read stale next_run values. Once-tasks get next_run=NULL; recurring tasks get next_run advanced to the future. - computeNextRun(): anchors interval tasks to the scheduled time (not Date.now()) to prevent cumulative drift. Includes a while-loop to skip missed intervals and a guard against invalid interval values. - updateTaskAfterRun(): simplified to only record last_run/last_result since next_run is already handled by the claim. Closes nanocoai#138, nanocoai#211, nanocoai#300, nanocoai#578 Co-authored-by: @taslim (PR nanocoai#601) Co-authored-by: @baijunjie (Issue nanocoai#138) Co-authored-by: @Michaelliv (Issue nanocoai#300) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * style: apply prettier formatting Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: track running task ID in GroupQueue to prevent duplicate execution (nanocoai#138) Previous commits implemented an "atomic claim" approach (claimDueTasks) that advanced next_run before execution. Per Gavriel's review, this solved the symptom at the wrong layer and introduced crash-recovery risks for once-tasks. This commit reverts claimDueTasks and instead fixes the actual bug: GroupQueue.enqueueTask() only checked pendingTasks for duplicates, but running tasks had already been shifted out. Adding runningTaskId to GroupState closes that gap with a 3-line fix at the correct layer. The computeNextRun() drift fix is retained, applied post-execution where it belongs. Closes nanocoai#138, nanocoai#211, nanocoai#300, nanocoai#578 Co-authored-by: @taslim (PR nanocoai#601) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add changelog entry for scheduler duplicate fix Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add contributors for scheduler race condition fix Co-Authored-By: Taslim <9999802+taslim@users.noreply.github.com> Co-Authored-By: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-Authored-By: Michael <13676242+Michaelliv@users.noreply.github.com> Co-Authored-By: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: gavrielc <gabicohen22@yahoo.com> Co-authored-by: Taslim <9999802+taslim@users.noreply.github.com> Co-authored-by: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-authored-by: Michael <13676242+Michaelliv@users.noreply.github.com> Co-authored-by: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com>

@taslim

…coai#657) * fix: atomic claim prevents scheduled tasks from executing twice (nanocoai#138) Replace the two-phase getDueTasks() + deferred updateTaskAfterRun() with an atomic SQLite transaction (claimDueTasks) that advances next_run BEFORE dispatching tasks to the queue. This eliminates the race window where subsequent scheduler polls re-discover in-progress tasks. Key changes: - claimDueTasks(): SELECT + UPDATE in a single db.transaction(), so no poll can read stale next_run values. Once-tasks get next_run=NULL; recurring tasks get next_run advanced to the future. - computeNextRun(): anchors interval tasks to the scheduled time (not Date.now()) to prevent cumulative drift. Includes a while-loop to skip missed intervals and a guard against invalid interval values. - updateTaskAfterRun(): simplified to only record last_run/last_result since next_run is already handled by the claim. Closes nanocoai#138, nanocoai#211, nanocoai#300, nanocoai#578 Co-authored-by: @taslim (PR nanocoai#601) Co-authored-by: @baijunjie (Issue nanocoai#138) Co-authored-by: @Michaelliv (Issue nanocoai#300) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * style: apply prettier formatting Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: track running task ID in GroupQueue to prevent duplicate execution (nanocoai#138) Previous commits implemented an "atomic claim" approach (claimDueTasks) that advanced next_run before execution. Per Gavriel's review, this solved the symptom at the wrong layer and introduced crash-recovery risks for once-tasks. This commit reverts claimDueTasks and instead fixes the actual bug: GroupQueue.enqueueTask() only checked pendingTasks for duplicates, but running tasks had already been shifted out. Adding runningTaskId to GroupState closes that gap with a 3-line fix at the correct layer. The computeNextRun() drift fix is retained, applied post-execution where it belongs. Closes nanocoai#138, nanocoai#211, nanocoai#300, nanocoai#578 Co-authored-by: @taslim (PR nanocoai#601) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add changelog entry for scheduler duplicate fix Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add contributors for scheduler race condition fix Co-Authored-By: Taslim <9999802+taslim@users.noreply.github.com> Co-Authored-By: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-Authored-By: Michael <13676242+Michaelliv@users.noreply.github.com> Co-Authored-By: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: gavrielc <gabicohen22@yahoo.com> Co-authored-by: Taslim <9999802+taslim@users.noreply.github.com> Co-authored-by: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-authored-by: Michael <13676242+Michaelliv@users.noreply.github.com> Co-authored-by: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com>

@taslim

…coai#657) * fix: atomic claim prevents scheduled tasks from executing twice (nanocoai#138) Replace the two-phase getDueTasks() + deferred updateTaskAfterRun() with an atomic SQLite transaction (claimDueTasks) that advances next_run BEFORE dispatching tasks to the queue. This eliminates the race window where subsequent scheduler polls re-discover in-progress tasks. Key changes: - claimDueTasks(): SELECT + UPDATE in a single db.transaction(), so no poll can read stale next_run values. Once-tasks get next_run=NULL; recurring tasks get next_run advanced to the future. - computeNextRun(): anchors interval tasks to the scheduled time (not Date.now()) to prevent cumulative drift. Includes a while-loop to skip missed intervals and a guard against invalid interval values. - updateTaskAfterRun(): simplified to only record last_run/last_result since next_run is already handled by the claim. Closes nanocoai#138, nanocoai#211, nanocoai#300, nanocoai#578 Co-authored-by: @taslim (PR nanocoai#601) Co-authored-by: @baijunjie (Issue nanocoai#138) Co-authored-by: @Michaelliv (Issue nanocoai#300) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * style: apply prettier formatting Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: track running task ID in GroupQueue to prevent duplicate execution (nanocoai#138) Previous commits implemented an "atomic claim" approach (claimDueTasks) that advanced next_run before execution. Per Gavriel's review, this solved the symptom at the wrong layer and introduced crash-recovery risks for once-tasks. This commit reverts claimDueTasks and instead fixes the actual bug: GroupQueue.enqueueTask() only checked pendingTasks for duplicates, but running tasks had already been shifted out. Adding runningTaskId to GroupState closes that gap with a 3-line fix at the correct layer. The computeNextRun() drift fix is retained, applied post-execution where it belongs. Closes nanocoai#138, nanocoai#211, nanocoai#300, nanocoai#578 Co-authored-by: @taslim (PR nanocoai#601) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add changelog entry for scheduler duplicate fix Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add contributors for scheduler race condition fix Co-Authored-By: Taslim <9999802+taslim@users.noreply.github.com> Co-Authored-By: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-Authored-By: Michael <13676242+Michaelliv@users.noreply.github.com> Co-Authored-By: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: gavrielc <gabicohen22@yahoo.com> Co-authored-by: Taslim <9999802+taslim@users.noreply.github.com> Co-authored-by: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-authored-by: Michael <13676242+Michaelliv@users.noreply.github.com> Co-authored-by: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com>

Merged upstream changes including: - DB query limits (nanocoai#692/nanocoai#735) - Timezone-aware context injection (nanocoai#691) - SDK bump to 0.2.68 - update_task tool and task ID passthrough - Sender allowlist (nanocoai#705) - Atomic task claiming (nanocoai#657) Resolved conflicts in index.ts (sender allowlist + our steering/cancel), ipc.ts (idempotency check + taskId passthrough), and task-scheduler.ts (computeNextRun refactor + our cost tracking/alerting). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@taslim

* fix: atomic claim prevents scheduled tasks from executing twice (#138) Replace the two-phase getDueTasks() + deferred updateTaskAfterRun() with an atomic SQLite transaction (claimDueTasks) that advances next_run BEFORE dispatching tasks to the queue. This eliminates the race window where subsequent scheduler polls re-discover in-progress tasks. Key changes: - claimDueTasks(): SELECT + UPDATE in a single db.transaction(), so no poll can read stale next_run values. Once-tasks get next_run=NULL; recurring tasks get next_run advanced to the future. - computeNextRun(): anchors interval tasks to the scheduled time (not Date.now()) to prevent cumulative drift. Includes a while-loop to skip missed intervals and a guard against invalid interval values. - updateTaskAfterRun(): simplified to only record last_run/last_result since next_run is already handled by the claim. Closes #138, #211, #300, #578 Co-authored-by: @taslim (PR #601) Co-authored-by: @baijunjie (Issue #138) Co-authored-by: @Michaelliv (Issue #300) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * style: apply prettier formatting Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: track running task ID in GroupQueue to prevent duplicate execution (#138) Previous commits implemented an "atomic claim" approach (claimDueTasks) that advanced next_run before execution. Per Gavriel's review, this solved the symptom at the wrong layer and introduced crash-recovery risks for once-tasks. This commit reverts claimDueTasks and instead fixes the actual bug: GroupQueue.enqueueTask() only checked pendingTasks for duplicates, but running tasks had already been shifted out. Adding runningTaskId to GroupState closes that gap with a 3-line fix at the correct layer. The computeNextRun() drift fix is retained, applied post-execution where it belongs. Closes #138, #211, #300, #578 Co-authored-by: @taslim (PR #601) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add changelog entry for scheduler duplicate fix Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add contributors for scheduler race condition fix Co-Authored-By: Taslim <9999802+taslim@users.noreply.github.com> Co-Authored-By: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-Authored-By: Michael <13676242+Michaelliv@users.noreply.github.com> Co-Authored-By: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: gavrielc <gabicohen22@yahoo.com> Co-authored-by: Taslim <9999802+taslim@users.noreply.github.com> Co-authored-by: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-authored-by: Michael <13676242+Michaelliv@users.noreply.github.com> Co-authored-by: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com>

…coai#657)

@taslim

…coai#657) * fix: atomic claim prevents scheduled tasks from executing twice (nanocoai#138) Replace the two-phase getDueTasks() + deferred updateTaskAfterRun() with an atomic SQLite transaction (claimDueTasks) that advances next_run BEFORE dispatching tasks to the queue. This eliminates the race window where subsequent scheduler polls re-discover in-progress tasks. Key changes: - claimDueTasks(): SELECT + UPDATE in a single db.transaction(), so no poll can read stale next_run values. Once-tasks get next_run=NULL; recurring tasks get next_run advanced to the future. - computeNextRun(): anchors interval tasks to the scheduled time (not Date.now()) to prevent cumulative drift. Includes a while-loop to skip missed intervals and a guard against invalid interval values. - updateTaskAfterRun(): simplified to only record last_run/last_result since next_run is already handled by the claim. Closes nanocoai#138, nanocoai#211, nanocoai#300, nanocoai#578 Co-authored-by: @taslim (PR nanocoai#601) Co-authored-by: @baijunjie (Issue nanocoai#138) Co-authored-by: @Michaelliv (Issue nanocoai#300) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * style: apply prettier formatting Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: track running task ID in GroupQueue to prevent duplicate execution (nanocoai#138) Previous commits implemented an "atomic claim" approach (claimDueTasks) that advanced next_run before execution. Per Gavriel's review, this solved the symptom at the wrong layer and introduced crash-recovery risks for once-tasks. This commit reverts claimDueTasks and instead fixes the actual bug: GroupQueue.enqueueTask() only checked pendingTasks for duplicates, but running tasks had already been shifted out. Adding runningTaskId to GroupState closes that gap with a 3-line fix at the correct layer. The computeNextRun() drift fix is retained, applied post-execution where it belongs. Closes nanocoai#138, nanocoai#211, nanocoai#300, nanocoai#578 Co-authored-by: @taslim (PR nanocoai#601) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add changelog entry for scheduler duplicate fix Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add contributors for scheduler race condition fix Co-Authored-By: Taslim <9999802+taslim@users.noreply.github.com> Co-Authored-By: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-Authored-By: Michael <13676242+Michaelliv@users.noreply.github.com> Co-Authored-By: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: gavrielc <gabicohen22@yahoo.com> Co-authored-by: Taslim <9999802+taslim@users.noreply.github.com> Co-authored-by: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-authored-by: Michael <13676242+Michaelliv@users.noreply.github.com> Co-authored-by: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com>

@taslim

…coai#657) * fix: atomic claim prevents scheduled tasks from executing twice (nanocoai#138) Replace the two-phase getDueTasks() + deferred updateTaskAfterRun() with an atomic SQLite transaction (claimDueTasks) that advances next_run BEFORE dispatching tasks to the queue. This eliminates the race window where subsequent scheduler polls re-discover in-progress tasks. Key changes: - claimDueTasks(): SELECT + UPDATE in a single db.transaction(), so no poll can read stale next_run values. Once-tasks get next_run=NULL; recurring tasks get next_run advanced to the future. - computeNextRun(): anchors interval tasks to the scheduled time (not Date.now()) to prevent cumulative drift. Includes a while-loop to skip missed intervals and a guard against invalid interval values. - updateTaskAfterRun(): simplified to only record last_run/last_result since next_run is already handled by the claim. Closes nanocoai#138, nanocoai#211, nanocoai#300, nanocoai#578 Co-authored-by: @taslim (PR nanocoai#601) Co-authored-by: @baijunjie (Issue nanocoai#138) Co-authored-by: @Michaelliv (Issue nanocoai#300) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * style: apply prettier formatting Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: track running task ID in GroupQueue to prevent duplicate execution (nanocoai#138) Previous commits implemented an "atomic claim" approach (claimDueTasks) that advanced next_run before execution. Per Gavriel's review, this solved the symptom at the wrong layer and introduced crash-recovery risks for once-tasks. This commit reverts claimDueTasks and instead fixes the actual bug: GroupQueue.enqueueTask() only checked pendingTasks for duplicates, but running tasks had already been shifted out. Adding runningTaskId to GroupState closes that gap with a 3-line fix at the correct layer. The computeNextRun() drift fix is retained, applied post-execution where it belongs. Closes nanocoai#138, nanocoai#211, nanocoai#300, nanocoai#578 Co-authored-by: @taslim (PR nanocoai#601) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add changelog entry for scheduler duplicate fix Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: add contributors for scheduler race condition fix Co-Authored-By: Taslim <9999802+taslim@users.noreply.github.com> Co-Authored-By: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-Authored-By: Michael <13676242+Michaelliv@users.noreply.github.com> Co-Authored-By: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: gavrielc <gabicohen22@yahoo.com> Co-authored-by: Taslim <9999802+taslim@users.noreply.github.com> Co-authored-by: BaiJunjie <7956480+baijunjie@users.noreply.github.com> Co-authored-by: Michael <13676242+Michaelliv@users.noreply.github.com> Co-authored-by: Kyle Zhike Chen <3477852+kk17@users.noreply.github.com>

gabi-simons requested a review from gavrielc as a code owner March 2, 2026 20:43

style: apply prettier formatting

15668ad

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

This was referenced Mar 3, 2026

🦞 OpenClaw 生态日报 2026-03-03 rollysys/agents-radar#28

Open

🦞 OpenClaw 生态日报 2026-03-03 duanyytop/agents-radar#66

Closed

gabi-simons and others added 2 commits March 3, 2026 19:52

Merge branch 'main' into fix/scheduler-race-condition-final

3f05647

This was referenced Mar 4, 2026

🦞 OpenClaw 生态日报 2026-03-04 duanyytop/agents-radar#71

Closed

🦞 OpenClaw 生态日报 2026-03-04 rollysys/agents-radar#33

Open

gabi-simons requested a review from Koshkoshinsk March 4, 2026 10:40

gabi-simons and others added 5 commits March 4, 2026 13:36

Merge branch 'main' into fix/scheduler-race-condition-final

142df02

docs: add changelog entry for scheduler duplicate fix

922219d

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge branch 'main' into fix/scheduler-race-condition-final

76d1c8b

Merge branch 'main' into fix/scheduler-race-condition-final

5410966

gavrielc merged commit f794185 into nanocoai:main Mar 4, 2026
1 check passed

taslim mentioned this pull request Mar 5, 2026

fix: prevent scheduled tasks from running twice when execution exceeds poll interval #601

Closed

5 tasks

This was referenced Mar 5, 2026

fix: prevent duplicate task runs when execution exceeds poll interval #636

Closed

fix: prevent scheduled tasks from running twice #536

Closed

fix: prevent duplicate and stale scheduled task execution #539

Closed

This was referenced Mar 5, 2026

fix(queue) issue #293: preempt idle group containers when scheduled tasks enqueue #294

Closed

fix: preempt active container when scheduled task is enqueued #296

Closed

XiRoSe pushed a commit to XiRoSe/nova-agent that referenced this pull request Apr 9, 2026

fix: atomic claim prevents scheduled tasks from executing twice (nano…

822df7f

…coai#657)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: atomic claim prevents scheduled tasks from executing twice#657

fix: atomic claim prevents scheduled tasks from executing twice#657
gavrielc merged 9 commits intonanocoai:mainfrom
gabi-simons:fix/scheduler-race-condition-final

gabi-simons commented Mar 2, 2026 •

edited

Loading

Uh oh!

gavrielc commented Mar 2, 2026

Uh oh!

Koshkoshinsk commented Mar 4, 2026

Uh oh!

Uh oh!

gavrielc commented Mar 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

gabi-simons commented Mar 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

The fix

Why not "atomic claim" (previous approach in this PR)

Test plan

Credits

Uh oh!

gavrielc commented Mar 2, 2026

Uh oh!

Koshkoshinsk commented Mar 4, 2026

E2E Test Results

Summary

Test Details

Uh oh!

Uh oh!

gavrielc commented Mar 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

gabi-simons commented Mar 2, 2026 •

edited

Loading