fix: signal active containers to close when a scheduled task is enqueued by aoberoi · Pull Request #320 · qwibitai/nanoclaw

aoberoi · 2026-02-20T08:29:16Z

Type of Change

Skill - adds a new skill in .claude/skills/
Fix - bug fix or security fix to source code
Simplification - reduces or simplifies source code

Description

Fixes #293. This is equivalent to Option A in that issue. I believe this is the cleanest way to fix the issue. Option B would screw up the workflow for scheduled tasks with context_mode: group, since a parallel container trying to resume a currently active session is going to lead to undefined outcomes. The single container per group invariant is worth preserving. Option C doesn't really provide any benefit if both idleWaiting and processing containers can correctly handle resuming sessions.

Also fixes a race condition (#320 (comment))

When a scheduled task was enqueued for a group with an active container, it would sit in pendingTasks until the container exited naturally — up to 30 minutes (idle timeout), or indefinitely if users kept resetting it.

Now writes the _close sentinel immediately when queuing a task against an active container. The container winds down after its current query, drainGroup fires, and the task runs in a new container within seconds.

I did some investigation to make sure the two kinds of scheduled tasks (context_mode: isolated and group) are considered. Here are all the scenarios:

Scenario 1: Container is idle, isolated task fires

Container already emitted the between-query session update marker before
going idle → sessions[group.folder] = "session-A" is current
_close written, container exits from waitForIpcMessage immediately
drainGroup runs isolated task with sessionId = undefined → creates fresh
"session-B" → discarded by scheduler
sessions[group.folder] is still "session-A"
User's next message: resumes "session-A" correctly ✓

Scenario 2: Container is busy (mid-query), isolated task fires

_close detected by pollIpcDuringQuery, stream.end() called
Current query runs to completion, emits OUTPUT_MARKER with newSessionId →
wrappedOnOutput updates sessions[group.folder] = "session-A" (or a new turn
ID)
Container exits (note: closedDuringQuery = true means it skips the
between-query session update marker, but the result OUTPUT_MARKER already
updated sessions)
drainGroup runs isolated task with sessionId = undefined → fresh session,
discarded
User's next message: resumes "session-A" correctly ✓

Scenario 3: Container is idle, group task fires

Same session preservation as Scenario 1
Group task runs with sessionId = "session-A" — resumes and extends that
session
Scheduler discards newSessionId returned from the task — but since resuming
doesn't create a new session ID, this is fine
User's next message: resumes "session-A", which now includes the group
task's turns ✓

Scenario 4: Container is busy, group task fires

Same as Scenario 2 — current query completes, sessions[group.folder]
updated
Group task runs resuming that session, extends it
User's next message: resumes correctly ✓

For Skills

I have not made any changes to source code
My skill contains instructions for Claude to follow (not pre-built code)
I tested this skill on a fresh clone

longregen · 2026-02-20T16:12:29Z

Please clean the commit history before creating a PR: 8c1cad8 and dc4a443 don't seem relevant for the purpose of the PR.

aoberoi · 2026-02-20T16:38:01Z

@longregen whoops! good catch

When a scheduled task was enqueued for a group with an active container, it would sit in pendingTasks until the container exited naturally — up to 30 minutes (idle timeout), or indefinitely if users kept resetting it. Now writes the _close sentinel immediately when queuing a task against an active container. The container winds down after its current query, drainGroup fires, and the task runs in a new container within seconds. Fixes qwibitai#293 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

aoberoi · 2026-02-20T16:45:52Z

cleaned the commit history

aoberoi · 2026-02-20T17:24:49Z

Found another related issue in my testing. Scheduled tasks can sometimes fire more than once because of a race condition. The race is between the SCHEDULER_POLL_INTERVAL (60s) and the execution time of a task (which can take longer than 60s).

The solution I'm working on is to add a new "running" state to a task, and make sure the scheduler doesn't pick up running tasks to run on the next scheduler poll interval. There was an alternative approach, to just mark the status = NULL, but I prefer this solution because its explicit (which can help for inspecting issues in the system) and we can actually implement scheduled task recovery if there's an unexpected crash.

The fix includes all the state transitions to make this work and some tests.

Introduces a status='running' transition set atomically at enqueue time, preventing the scheduler's 60s poll from re-enqueueing a task that is still executing. Also fixes updateTaskAfterRun to transition running→active (not running→running) for recurring tasks, and adds startup crash recovery via resetRunningTasks(). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…qwibitai#320) Migrate the three highest-complexity hooks from bash to TypeScript: - pr-review-loop.sh (452 lines) → src/hooks/pr-review-loop.ts - pr-kaizen-clear.sh (290 lines) → src/hooks/pr-kaizen-clear.ts - kaizen-reflect.sh (197 lines) → src/hooks/kaizen-reflect.ts Shared infrastructure: - src/hooks/hook-utils.ts — stdin JSON parsing, git helpers - src/hooks/parse-command.ts — command parsing (port of lib/parse-command.sh) - src/hooks/state-utils.ts — atomic state writes, typed objects, no stat portability - src/hooks/telegram-ipc.ts — Telegram notification via IPC Improvements over bash: - Atomic state writes (temp file + rename) prevent race conditions - Native JSON parsing (no jq pipelines or sed extraction) - No pipe-splitting corruption (IFS='|' read bug) - No stat portability issues (fs.statSync works everywhere) - Proper typed validation with clear error messages - 115 vitest tests covering all state machine paths Also: - Add priority:critical and priority:high labels to issue taxonomy - Update /pick-work and /make-a-dent to prefer high-priority issues - Old bash hooks deactivated with migration comments - Filed qwibitai#331 (worktree-du migration), qwibitai#332 (CI smoke tests), qwibitai#333 (shared lib retirement) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

First hook migrated to TypeScript, establishing the pattern for all future L3-L4 hook migrations per docs/hook-language-boundaries.md. What changed: - src/hooks/kaizen-reflect.ts: Full TS port of the PostToolUse hook - src/hooks/hook-io.ts: Shared hook I/O (stdin JSON, stdout advisory) - src/hooks/parse-command.ts: TS port of parse-command.sh library - src/hooks/state-utils.ts: TS port of state-utils.sh library - Thin bash wrapper (kaizen-reflect-ts.sh) delegates to npx tsx - settings.json updated to use the TS wrapper - 53 vitest tests covering all three modules - Old kaizen-reflect.sh marked as deactivated (kept for reference) Run tag: batch-260321-1108-3ef8/run-7 Closes Garsson-io/kaizen#320 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…261) First hook migrated to TypeScript, establishing the pattern for all future L3-L4 hook migrations per docs/hook-language-boundaries.md. What changed: - src/hooks/kaizen-reflect.ts: Full TS port of the PostToolUse hook - src/hooks/hook-io.ts: Shared hook I/O (stdin JSON, stdout advisory) - src/hooks/parse-command.ts: TS port of parse-command.sh library - src/hooks/state-utils.ts: TS port of state-utils.sh library - Thin bash wrapper (kaizen-reflect-ts.sh) delegates to npx tsx - settings.json updated to use the TS wrapper - 53 vitest tests covering all three modules - Old kaizen-reflect.sh marked as deactivated (kept for reference) Run tag: batch-260321-1108-3ef8/run-7 Closes Garsson-io/kaizen#320 Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ke tests & policy (qwibitai#320, qwibitai#332) Builds on the kaizen-reflect migration (already in main) to complete Phase 3: - pr-review-loop.ts: multi-round PR self-review state machine (452→240 lines) - pr-kaizen-clear.ts: kaizen gate validation + clearing (290→370 lines, with kaizen #198 waiver elimination) - wrapper-smoke.test.ts: 9 end-to-end smoke tests for bash→tsx→hook chain - resolve-project-root.sh: DRY shared lib for TS wrapper path resolution Both hooks use main's patterns: extracted processHookInput() for testability, async readHookInput from hook-io.ts, import guard, explicit stateDir params. Policy changes: - Policy #18: "Smoke tests ship WITH the feature — never after" - Priority labels (priority:critical, priority:high) in issue taxonomy - /pick-work and /make-a-dent updated to prefer priority-labeled issues 129 tests across 6 files, all passing. Closes Garsson-io/kaizen#320 Closes Garsson-io/kaizen#332 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…wibitai#320, qwibitai#332) (#268) * feat: migrate L3-L4 bash hooks to TypeScript — Phase 3 of #223 (kaizen qwibitai#320) Migrate the three highest-complexity hooks from bash to TypeScript: - pr-review-loop.sh (452 lines) → src/hooks/pr-review-loop.ts - pr-kaizen-clear.sh (290 lines) → src/hooks/pr-kaizen-clear.ts - kaizen-reflect.sh (197 lines) → src/hooks/kaizen-reflect.ts Shared infrastructure: - src/hooks/hook-utils.ts — stdin JSON parsing, git helpers - src/hooks/parse-command.ts — command parsing (port of lib/parse-command.sh) - src/hooks/state-utils.ts — atomic state writes, typed objects, no stat portability - src/hooks/telegram-ipc.ts — Telegram notification via IPC Improvements over bash: - Atomic state writes (temp file + rename) prevent race conditions - Native JSON parsing (no jq pipelines or sed extraction) - No pipe-splitting corruption (IFS='|' read bug) - No stat portability issues (fs.statSync works everywhere) - Proper typed validation with clear error messages - 115 vitest tests covering all state machine paths Also: - Add priority:critical and priority:high labels to issue taxonomy - Update /pick-work and /make-a-dent to prefer high-priority issues - Old bash hooks deactivated with migration comments - Filed qwibitai#331 (worktree-du migration), qwibitai#332 (CI smoke tests), qwibitai#333 (shared lib retirement) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: format hook files with Prettier Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: smoke tests for TS hook wrappers + "tests ship with feature" policy (qwibitai#332) - Add wrapper-smoke.test.ts: 9 tests verifying the full bash→tsx→hook chain for all 3 migrated hooks (pr-review-loop, pr-kaizen-clear, kaizen-reflect) - Fix wrapper path resolution: use `git rev-parse --show-toplevel` instead of `git worktree list` — the old approach resolved to main checkout where the TS hooks don't exist yet in other worktrees - Use randomized PR numbers + isolated STATE_DIR to prevent smoke test state from leaking into production state dir (incident: PR 99999 gate blocked session) - Add policy #18: "Smoke tests ship WITH the feature — never after" to .claude/kaizen/policies.md, CLAUDE.md, and /review-pr skill Closes Garsson-io/kaizen#332 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

aoberoi requested a review from gavrielc as a code owner February 20, 2026 08:29

aoberoi force-pushed the fix-293 branch from a75cc75 to 3619cd2 Compare February 20, 2026 16:45

VintLin mentioned this pull request Feb 24, 2026

fix: preempt active container when scheduled task is enqueued #296

Closed

Andy-NanoClaw-AI added PR: Fix Bug fix Status: Blocked Blocked by merge conflicts or dependencies labels Mar 5, 2026

gavrielc requested a review from gabi-simons as a code owner March 6, 2026 10:17

Andy-NanoClaw-AI added Status: Needs Review Ready for maintainer review Status: Blocked Blocked by merge conflicts or dependencies and removed Status: Blocked Blocked by merge conflicts or dependencies Status: Needs Review Ready for maintainer review labels Mar 9, 2026

github-actions bot mentioned this pull request Mar 10, 2026

🦞 OpenClaw Ecosystem Digest 2026-03-10 duanyytop/agents-radar#120

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: signal active containers to close when a scheduled task is enqueued#320

fix: signal active containers to close when a scheduled task is enqueued#320
aoberoi wants to merge 2 commits intoqwibitai:mainfrom
aoberoi:fix-293

aoberoi commented Feb 20, 2026 •

edited

Loading

Uh oh!

longregen commented Feb 20, 2026

Uh oh!

aoberoi commented Feb 20, 2026

Uh oh!

aoberoi commented Feb 20, 2026

Uh oh!

aoberoi commented Feb 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

aoberoi commented Feb 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Type of Change

Description

For Skills

Uh oh!

longregen commented Feb 20, 2026

Uh oh!

aoberoi commented Feb 20, 2026

Uh oh!

aoberoi commented Feb 20, 2026

Uh oh!

aoberoi commented Feb 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

aoberoi commented Feb 20, 2026 •

edited

Loading