Skip to content

t3756: address PR #948 review feedback - stderr suppression and race conditions#4533

Merged
alex-solovyev merged 1 commit intomainfrom
bugfix/t3756-pr948-review-feedback
Mar 14, 2026
Merged

t3756: address PR #948 review feedback - stderr suppression and race conditions#4533
alex-solovyev merged 1 commit intomainfrom
bugfix/t3756-pr948-review-feedback

Conversation

@marcusquinn
Copy link
Copy Markdown
Owner

@marcusquinn marcusquinn commented Mar 14, 2026

Summary

Addresses all review feedback from PR #948 (auto-recovery for stuck deploying tasks) flagged by Gemini and CodeRabbit.

Closes #3756

Changes

supervisor-archived/deploy.sh

  • 2>/dev/null2>>"$SUPERVISOR_LOG" on write_proof_log call (Gemini medium: style guide rule docs: update branch creation to recommend worktrees for parallel sessions #50 — stderr suppression only acceptable when redirecting to log files)
  • 2>/dev/null2>>"$SUPERVISOR_LOG" on DB query for deploying_recovery_attempts (Gemini medium: same rule)
  • Race condition fix in else branch (CodeRabbit major): when cmd_transition to deployed fails after all retries, re-check current task state before marking failed. If another process already moved the task out of deploying, skip the failed transition and log the observed state instead of clobbering it.

supervisor-archived/pulse.sh

  • Race condition fix in Phase 4d (CodeRabbit major + Gemini high): the force-deploy fallback unconditionally forced deployed on any stuck task. Now re-checks current state before forcing — if the task was already transitioned to failed/blocked/cancelled by a concurrent process, skip the force and log a warning.

tests/test-supervisor-state-machine.sh

  • Replace false-pass with skip (CodeRabbit nitpick): the test for cmd_pr_lifecycle auto-recovery accepted failed as a passing outcome, which masked regressions. Now uses skip when the task is still in deploying (no real repo context available in test isolation), and fail for any other unexpected state.
  • Fix SUPERVISOR_SCRIPT path: updated to point to supervisor-archived/supervisor-helper.sh (the file was moved during the supervisor refactor in refactor: replace 37K-line bash supervisor with 123-line AI pulse system #2291 but the test path was never updated).

Verification

  • shellcheck .agents/scripts/supervisor-archived/deploy.sh — clean
  • shellcheck tests/test-supervisor-state-machine.sh — clean
  • shellcheck .agents/scripts/supervisor-archived/pulse.sh — hits memory watchdog on large file (pre-existing, not introduced by this PR)

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Enhanced deployment auto-recovery with improved concurrent state transition handling to prevent race conditions.
    • Strengthened supervisor reliability with defensive state checks before forcing task state changes.
    • Increased error visibility in deployment recovery flows.
  • Chores

    • Updated internal script references for consistency.

… and race conditions (t3756)

- Replace 2>/dev/null with 2>>"$SUPERVISOR_LOG" in write_proof_log and DB
  query calls to comply with style guide rule #50 (Gemini review feedback)
- Add state re-check before marking task failed when cmd_transition to deployed
  fails after retries, preventing race condition where concurrent transition
  already moved the task out of deploying (CodeRabbit review feedback)
- Add state re-check in Phase 4d pulse before force-deploying stuck tasks,
  preventing clobber of concurrent transitions to failed/blocked/cancelled
- Fix test: replace false-pass on 'failed' outcome with skip when task is
  still in deploying (CodeRabbit nitpick - lenient assertion undermined test)
- Fix test SUPERVISOR_SCRIPT path to point to supervisor-archived/ location

Closes #3756
@gemini-code-assist
Copy link
Copy Markdown

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 14, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

These changes enhance concurrency safety in deployment automation by adding guarded state re-checks before task transitions, improving error logging to supervisor logs, and preventing race conditions where concurrent operations could corrupt task state. Test paths and expectations are updated accordingly.

Changes

Cohort / File(s) Summary
Supervisor Recovery Guards
.agents/scripts/supervisor-archived/deploy.sh, .agents/scripts/supervisor-archived/pulse.sh
Add defensive DB state re-checks before task transitions to avoid clobbering concurrent state changes. Replace immediate force-recovery with guarded fallback logic that verifies the task is still in deploying state before proceeding. Introduce additional error logging and conditional transition guards to improve observability and safety in concurrent environments.
Test Path & Expectations
tests/test-supervisor-state-machine.sh
Update supervisor helper script path and modify auto-recovery test logic (t222) to handle deploying status by skipping test, while maintaining original pass/fail paths and removing acceptance of failed as a recovery outcome.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

Suggested labels

code-reviews-actioned

Poem

🛡️ Guards stand watch o'er state machines,
Race conditions caught between,
With DB re-checks held so tight,
Concurrent transitions stay in flight,
Safety logs our cautious sight. ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately reflects the main objective of the PR: addressing PR #948 review feedback by fixing stderr suppression and race condition issues across multiple deployment scripts.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch bugfix/t3756-pr948-review-feedback
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 414 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Sat Mar 14 00:27:33 UTC 2026: Code review monitoring started
Sat Mar 14 00:27:34 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 414

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 414
  • VULNERABILITIES: 0

Generated on: Sat Mar 14 00:27:36 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@sonarqubecloud
Copy link
Copy Markdown

@alex-solovyev alex-solovyev merged commit 823bc63 into main Mar 14, 2026
25 of 26 checks passed
@alex-solovyev alex-solovyev deleted the bugfix/t3756-pr948-review-feedback branch March 14, 2026 00:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

quality-debt: PR #948 review feedback (medium)

2 participants