t1314: Fix batch concurrency bypass in supervisor dispatch pipeline#2236
marcusquinn merged 2 commits into main
Conversation
When cmd_pulse() runs without --batch flag (e.g., from cron), Phase 2 now auto-detects active batches and dispatches through them, respecting each batch's concurrency limit. Previously, all queued tasks were dispatched via global SUPERVISOR_MAX_CONCURRENCY (default: 4), bypassing batch concurrency=1 settings entirely. Unbatched tasks still use global concurrency as a fallback.
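The batch-aware Phase 2 dispatch described above can be sketched roughly as follows. This is an illustrative outline only: `get_active_batches`, `running_count`, and `dispatch_task` are hypothetical helper names standing in for the supervisor's actual functions.

```shell
# Hypothetical sketch of batch-aware dispatch: each active batch is drained
# up to its own concurrency limit; only unbatched tasks fall back to the
# global SUPERVISOR_MAX_CONCURRENCY limit.
SUPERVISOR_MAX_CONCURRENCY="${SUPERVISOR_MAX_CONCURRENCY:-4}"

dispatch_pulse() {
  local batch_id concurrency running
  # First, dispatch batched tasks through each batch's own limit...
  while IFS=$'\t' read -r batch_id concurrency; do
    [[ -n "$batch_id" ]] || continue
    running=$(running_count "$batch_id")
    while (( running < concurrency )); do
      dispatch_task "$batch_id" || break
      running=$((running + 1))
    done
  done < <(get_active_batches)
  # ...then fall back to the global limit for unbatched tasks only.
  running=$(running_count "")
  while (( running < SUPERVISOR_MAX_CONCURRENCY )); do
    dispatch_task "" || break
    running=$((running + 1))
  done
}
```

With a batch at `concurrency=1`, the inner loop dispatches at most one task for that batch per pulse regardless of the global limit.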
…e (t1314)

- Fix batch affinity in auto-batch absorption (cron.sh): tasks are only absorbed into batches containing tasks from the same repo, preventing cross-repo serialization. Remaining unbatched tasks get a new auto-batch.
- Cap cmd_next() sibling limits to batch concurrency (state.sh): when a subtask belongs to a batch with concurrency=1, sibling parallelism is capped to that value instead of using SUPERVISOR_MAX_PARALLEL_SIBLINGS.
- Enforce blocked-by at dispatch time (dispatch.sh): re-check blocked-by dependencies in cmd_dispatch() before the concurrency check, not just during auto-pickup. Blocked tasks return to the queued state.
- Add human review gate for AI lifecycle auto-merge (ai-lifecycle.sh): the merge_pr action now requires an APPROVED review decision unless SUPERVISOR_AUTO_MERGE_ENABLED=true. Tasks without approval transition to the review_waiting state instead of merging.

Ref: GH#2163
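The sibling-limit cap amounts to taking the minimum of the global setting and the batch's concurrency. A minimal sketch, assuming a hypothetical helper name (`effective_sibling_limit` is illustrative, not the actual state.sh code):

```shell
# Cap sibling parallelism to the batch's concurrency when the task is
# batched; otherwise fall back to the global setting.
SUPERVISOR_MAX_PARALLEL_SIBLINGS="${SUPERVISOR_MAX_PARALLEL_SIBLINGS:-3}"

effective_sibling_limit() {
  local batch_concurrency="$1"  # empty string when the task is unbatched
  local limit="$SUPERVISOR_MAX_PARALLEL_SIBLINGS"
  if [[ -n "$batch_concurrency" ]] && (( batch_concurrency < limit )); then
    limit="$batch_concurrency"
  fi
  echo "$limit"
}
```

For a batch with `concurrency=1` this yields 1; an unbatched task keeps the global limit.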
Summary of Changes

This pull request significantly enhances the supervisor's batch processing pipeline by addressing several scenarios where batch concurrency limits and intended behaviors were inadvertently bypassed. The changes ensure that tasks are dispatched according to their batch configurations, maintain repository affinity within batches, enforce dependency checks more rigorously, and introduce a human review gate for automated PR merges. These improvements lead to more predictable and controlled task execution, especially in concurrent environments.
🔍 Code Quality Report

Generated on Tue Feb 24 20:54:54 UTC 2026 by AI DevOps Framework Code Review Monitoring.
Code Review
This pull request introduces several crucial fixes to address concurrency bypass issues within the supervisor pipeline, significantly improving its robustness and reliability. The changes to auto-detect and respect batch concurrency, enforce batch affinity for repositories, and cap sibling parallelism are well-implemented and directly tackle the core problems. The addition of a pre-dispatch dependency check and a human review gate for pull request merges are also valuable enhancements for safety and correctness. My review primarily identifies opportunities to improve debuggability by removing error suppression and to enhance maintainability by reducing code duplication. Overall, this is a strong set of improvements.
```bash
local current_review_decision="NONE"
if [[ -n "$pr_number" && -n "$pr_repo_slug" ]] && command -v gh &>/dev/null; then
    current_review_decision=$(gh pr view "$pr_number" --repo "$pr_repo_slug" \
        --json reviewDecision --jq '.reviewDecision // "NONE"' 2>/dev/null || echo "NONE")
```
The use of `2>/dev/null` here suppresses potentially useful error messages from the `gh` command. If the command fails due to authentication issues, an invalid repository slug, or other problems, these errors will be hidden, making debugging more difficult. It's better to let stderr be printed so that such issues are visible in the logs. The `|| echo "NONE"` construct already safely handles the command failing without exiting the script.
```diff
-        --json reviewDecision --jq '.reviewDecision // "NONE"' 2>/dev/null || echo "NONE")
+        --json reviewDecision --jq '.reviewDecision // "NONE"' || echo "NONE")
```
References
- Avoid using '2>/dev/null' for blanket suppression of command errors in shell scripts to ensure that authentication, syntax, or system issues remain visible for debugging.
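The pattern the reviewer recommends, keeping a fallback value while leaving stderr visible, can be demonstrated generically. Here `get_decision` is a stand-in for any failing command (such as the `gh` call above), not a real function in this codebase:

```shell
# A stand-in for a command that fails and writes a diagnostic to stderr.
get_decision() {
  echo "error: not authenticated" >&2
  return 1
}

# Suppressed: the fallback works, but the diagnostic is silently discarded.
quiet=$(get_decision 2>/dev/null || echo "NONE")

# Recommended: the same fallback, with the diagnostic still reaching
# stderr (and therefore the logs).
loud=$(get_decision || echo "NONE")
```

Both variants assign `NONE`; only the second lets the "not authenticated" message surface for debugging.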
```bash
local still_unbatched_count
still_unbatched_count=$(echo "$still_unbatched" | wc -l | tr -d ' ')
local auto_batch_id
auto_batch_id=$(cmd_batch "$auto_batch_name" --concurrency "$auto_base_concurrency" --tasks "$task_csv" 2>/dev/null)
```
Suppressing stderr with `2>/dev/null` can hide important diagnostic information from the `cmd_batch` command. If there's an issue creating the batch, the error message will be lost, making it harder to debug why `auto_batch_id` might be empty. It's better to allow errors to be printed to stderr.
```diff
-auto_batch_id=$(cmd_batch "$auto_batch_name" --concurrency "$auto_base_concurrency" --tasks "$task_csv" 2>/dev/null)
+auto_batch_id=$(cmd_batch "$auto_batch_name" --concurrency "$auto_base_concurrency" --tasks "$task_csv")
```
References
- Avoid using '2>/dev/null' for blanket suppression of command errors in shell scripts to ensure that authentication, syntax, or system issues remain visible for debugging.
```bash
# is_task_blocked() was only called during auto-pickup (cron.sh), not here.
# A task could become blocked between pickup and dispatch (e.g., a dependency
# was re-queued or failed). Re-check now to avoid wasting an Opus session.
if declare -f is_task_blocked &>/dev/null && [[ -f "$dispatch_todo_file" ]]; then
```
Using `&>/dev/null` suppresses both stdout and stderr. For `declare -f`, it's better to only suppress stdout with `>/dev/null`. This allows potential syntax errors or other issues from the shell to be visible on stderr, which is useful for debugging, while still hiding the function definition output.
```diff
-if declare -f is_task_blocked &>/dev/null && [[ -f "$dispatch_todo_file" ]]; then
+if declare -f is_task_blocked >/dev/null && [[ -f "$dispatch_todo_file" ]]; then
```
References
- In shell scripts, avoid blanket suppression of errors with '2>/dev/null'. For 'declare -f', redirecting only stdout ('>/dev/null') is sufficient as it does not output to stderr, allowing potential syntax errors to remain visible.
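To illustrate the point: `declare -f name` prints the function definition on stdout and reports existence via its exit status, so redirecting stdout alone is sufficient for an existence check.

```shell
my_func() { echo "hello"; }

# declare -f writes the definition to stdout and signals existence via
# exit status; stderr is not involved, so >/dev/null is enough.
if declare -f my_func >/dev/null; then
  echo "my_func is defined"
fi

if ! declare -f missing_func >/dev/null; then
  echo "missing_func is not defined"
fi
```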
```bash
# was re-queued or failed). Re-check now to avoid wasting an Opus session.
if declare -f is_task_blocked &>/dev/null && [[ -f "$dispatch_todo_file" ]]; then
    local todo_line
    todo_line=$(grep -E "^[[:space:]]*- \[.\] $task_id( |$)" "$dispatch_todo_file" 2>/dev/null | head -1 || true)
```
The `2>/dev/null` here is redundant because the existence of `$dispatch_todo_file` is already checked in the preceding `if` condition. Suppressing stderr can hide other potential `grep` errors, such as issues with the regex pattern. It's better to remove the suppression and let `|| true` handle any non-zero exit code from `grep` when no match is found.
```diff
-    todo_line=$(grep -E "^[[:space:]]*- \[.\] $task_id( |$)" "$dispatch_todo_file" 2>/dev/null | head -1 || true)
+    todo_line=$(grep -E "^[[:space:]]*- \[.\] $task_id( |$)" "$dispatch_todo_file" | head -1 || true)
```
References
- Avoid using `2>/dev/null` to suppress errors on file operations if the file's existence has already been verified by a preceding check (e.g., `[[ -f "$file" ]]` or an early return). This practice is redundant for 'file not found' errors and can mask other important issues like permissions problems.
```bash
local dispatch_blockers
if dispatch_blockers=$(is_task_blocked "$todo_line" "$dispatch_todo_file"); then
    log_warn "Task $task_id blocked by unresolved dependencies ($dispatch_blockers) at dispatch — returning to queued (t1314)"
    cmd_unclaim "$task_id" "${trepo:-.}" --force 2>/dev/null || true
```
Suppressing stderr with `2>/dev/null` for `cmd_unclaim` can hide important errors related to file permissions, git operations, or other issues within the unclaim logic. Since the command is followed by `|| true`, the script won't exit on failure, so it's safer to let stderr be visible for debugging purposes.
```diff
-    cmd_unclaim "$task_id" "${trepo:-.}" --force 2>/dev/null || true
+    cmd_unclaim "$task_id" "${trepo:-.}" --force || true
```
References
- Avoid using '2>/dev/null' for blanket suppression of command errors in shell scripts to ensure that authentication, syntax, or system issues remain visible for debugging.
```bash
        AND t.status = 'queued'
    )
    ORDER BY b.created_at ASC;
" 2>/dev/null || echo "")
```
Suppressing stderr with `2>/dev/null` can hide important diagnostic information, such as syntax errors in the SQL query or problems with the database file itself. Since the command already uses `|| echo ""` to prevent script termination on failure, it's better to allow errors to be printed to stderr for easier debugging. This pattern appears multiple times in this file.
| " 2>/dev/null || echo "") | |
| " || echo "") |
References
- Avoid using '2>/dev/null' for blanket suppression of command errors in shell scripts to ensure that authentication, syntax, or system issues remain visible for debugging.
```bash
next_tasks=$(db -separator $'\t' "$SUPERVISOR_DB" "
    SELECT id, repo, description, model
    FROM tasks
    WHERE status = 'queued'
      AND retries < max_retries
      AND (rate_limit_until IS NULL OR rate_limit_until <= strftime('%Y-%m-%dT%H:%M:%SZ','now'))
      AND NOT EXISTS (SELECT 1 FROM batch_tasks bt WHERE bt.task_id = id)
    ORDER BY retries ASC, created_at ASC
    LIMIT 10;
" 2>/dev/null || echo "")
```
This database query duplicates the logic already present in the `cmd_next` function when called with an empty batch ID. To improve maintainability and avoid code duplication, you should call `cmd_next "" 10` here instead of re-implementing the query.
```diff
-next_tasks=$(db -separator $'\t' "$SUPERVISOR_DB" "
-    SELECT id, repo, description, model
-    FROM tasks
-    WHERE status = 'queued'
-      AND retries < max_retries
-      AND (rate_limit_until IS NULL OR rate_limit_until <= strftime('%Y-%m-%dT%H:%M:%SZ','now'))
-      AND NOT EXISTS (SELECT 1 FROM batch_tasks bt WHERE bt.task_id = id)
-    ORDER BY retries ASC, created_at ASC
-    LIMIT 10;
-" 2>/dev/null || echo "")
+next_tasks=$(cmd_next "" 10)
```
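The tab-separated read pattern used by these queries can be demonstrated in isolation. The schema below is a stand-in for the supervisor's real tasks table, and plain `sqlite3` stands in for the project's `db` wrapper:

```shell
# Minimal demo of reading tab-separated rows from sqlite3, mirroring the
# db -separator $'\t' pattern above. Schema and data are illustrative only.
db_file=$(mktemp)
sqlite3 "$db_file" "CREATE TABLE tasks(id TEXT, repo TEXT, status TEXT);
  INSERT INTO tasks VALUES ('t1','repo-a','queued'), ('t2','repo-b','running');"

queued=""
while IFS=$'\t' read -r id repo; do
  queued="$queued$id in $repo;"
done < <(sqlite3 -separator $'\t' "$db_file" \
  "SELECT id, repo FROM tasks WHERE status = 'queued' ORDER BY id;")

rm -f "$db_file"
echo "$queued"
```

Splitting on a tab separator keeps the pattern safe for repo paths and descriptions that contain spaces.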
```bash
    WHERE bt.task_id = '$(sql_escape "$cid")'
      AND b.status IN ('active', 'running')
    LIMIT 1;
" 2>/dev/null || echo "")
```
Suppressing stderr with `2>/dev/null` can hide important diagnostic information, such as syntax errors in the SQL query or problems with the database file itself. Since the command already uses `|| echo ""` to prevent script termination on failure, it's better to allow errors to be printed to stderr for easier debugging.
| " 2>/dev/null || echo "") | |
| " || echo "") |
References
- Avoid using '2>/dev/null' for blanket suppression of command errors in shell scripts to ensure that authentication, syntax, or system issues remain visible for debugging.
Obsoleted by supervisor refactor #2291. The entire bash supervisor (37K lines) was replaced with a 123-line AI pulse system. All five contributing causes (cmd_pulse batch detection, auto-batch affinity, cmd_next sibling limits, blocked-by enforcement, auto-merge gate) no longer exist in the codebase. Closes GH#2163.



Summary
Fixes five contributing causes of batch concurrency bypass when `cmd_pulse()` runs without the `--batch` flag:

- When `cmd_pulse()` runs without `--batch`, Phase 2 now queries active batches from the DB and dispatches through each one, respecting their concurrency limits (including `concurrency=1`)
- `cmd_next()` now caps `SUPERVISOR_MAX_PARALLEL_SIBLINGS` to the batch's concurrency setting when a subtask belongs to a batch
- `cmd_dispatch()` re-checks `blocked-by:` dependencies before the concurrency check, not just during auto-pickup
- The `merge_pr` action requires an `APPROVED` review decision unless `SUPERVISOR_AUTO_MERGE_ENABLED=true` (default: false)

Files Changed

- `pulse.sh`
- `cron.sh`
- `state.sh`
- `dispatch.sh`
- `ai-lifecycle.sh`

Testing

- `bash -n` syntax check

Ref: GH#2163
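The `bash -n` check listed under Testing parses a script without executing any of it, which makes it safe to run against scripts with side effects. A quick demonstration:

```shell
# bash -n reads commands but does not execute them (parse-only check).
good=$(mktemp)
echo 'echo "well-formed script"' > "$good"
bash -n "$good" && echo "syntax OK"

# A broken script fails the same check without running anything.
bad=$(mktemp)
echo 'if true; then' > "$bad"   # missing 'fi'
bash -n "$bad" || echo "syntax error detected"

rm -f "$good" "$bad"
```

Note that `-n` catches only parse errors; runtime failures (unset variables, failed commands) still require actual execution or a linter such as shellcheck.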