fix(ci): improve Claude Code review reliability #955

Merged
henrypark133 merged 1 commit into staging from fix/claude-review-reliability on Mar 11, 2026
@henrypark133
Collaborator

Summary

  • Add missing tools to --allowedTools: Read, Glob, Grep, Agent — the review prompt requires these but they weren't permitted, causing 8-9 permission denials per run and ~40% of reviews exiting without posting a comment
  • Simplify prompt: merge per-issue scoring agents (old step 4) into the review agents themselves, cutting agent spawns from 6+N to 6 and staying well within the 50-turn budget
  • Add guardrails: only the main process may post PR comments, and it must post exactly one comment before finishing (prevents fragmented output and ensures the staging gate always gets a comment to evaluate)
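
The first bullet amounts to a set difference between the tools the prompt needs and the tools the workflow permits. A minimal sketch of that check follows, assuming a comma-separated `--allowedTools` value; the `missing_tools` helper and the "before" value are hypothetical illustrations and are not taken from the actual workflow file.

```python
def missing_tools(allowed_tools_arg: str, required: set[str]) -> set[str]:
    """Return the required tool names absent from a comma-separated --allowedTools value."""
    allowed = set()
    for entry in allowed_tools_arg.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # A scoped entry such as "Bash(gh pr comment:*)" still permits the tool
        # named before the parenthesis, so keep only that name.
        allowed.add(entry.split("(", 1)[0].strip())
    return required - allowed


# Tools the review prompt relies on, as listed in the summary above.
REQUIRED_BY_PROMPT = {"Read", "Glob", "Grep", "Agent"}

# Hypothetical pre-fix value, for illustration only.
before = "Bash(gh pr view:*),Bash(gh pr diff:*),Bash(gh pr comment:*)"
after = before + ",Read,Glob,Grep,Agent"

print(sorted(missing_tools(before, REQUIRED_BY_PROMPT)))  # ['Agent', 'Glob', 'Grep', 'Read']
print(sorted(missing_tools(after, REQUIRED_BY_PROMPT)))   # []
```

The split on commas is deliberately naive: a scoped pattern that itself contains a comma would be parsed incorrectly, so treat this as a lint-style sanity check rather than a parser for the flag.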

Evidence of the problem

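| PR | Claude comments | Permission denials | Outcome |
|------|----------------|--------------------|-------------------|
| #925 | 0 | 9 | Gate FAILED |
| #950 | 0 | 8 | Gate blocked |
| #912 | 0 | unknown | No comment posted |
| #830 | 4 (fragmented) | 9 | Gate confused |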

Test plan

  • Merge to staging and wait for next staging-ci run to trigger a claude-review
  • Verify the Claude Code Review job has 0 (or near-0) permission denials in its result JSON
  • Verify the promotion PR receives exactly 1 Claude comment (not 0, not 4)
  • Confirm the staging gate passes without needing re-runs
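
The two "Verify" steps above could be scripted roughly as follows. This is a sketch under stated assumptions: that the job's result JSON exposes a `permission_denials` list, that PR comments are available as dictionaries shaped like GitHub REST API comment objects (`user.login`), and that the reviewer posts as `claude[bot]`. The function names and the bot login are hypothetical and should be adjusted to whatever the workflow actually produces.

```python
import json


def count_permission_denials(result_json: str) -> int:
    """Count denials in the review job's result JSON (field name is an assumption)."""
    result = json.loads(result_json)
    # A missing field is treated as zero denials, which could mask a schema change.
    return len(result.get("permission_denials", []))


def count_claude_comments(comments: list[dict], bot_login: str = "claude[bot]") -> int:
    """Count PR comments authored by the (assumed) Claude bot login."""
    return sum(1 for c in comments if c.get("user", {}).get("login") == bot_login)


def review_is_healthy(result_json: str, comments: list[dict], max_denials: int = 0) -> bool:
    """True when denials are within tolerance and exactly one Claude comment exists."""
    return (
        count_permission_denials(result_json) <= max_denials
        and count_claude_comments(comments) == 1
    )


# Illustrative inputs only; not real job output.
sample_result = json.dumps({"permission_denials": []})
sample_comments = [
    {"user": {"login": "claude[bot]"}, "body": "Review summary"},
    {"user": {"login": "henrypark133"}, "body": "Thanks"},
]
print(review_is_healthy(sample_result, sample_comments))  # True
```

Raising `max_denials` slightly would match the "0 (or near-0)" tolerance in the test plan; the comment count is kept strict because both 0 and 4 comments were failure modes in the evidence table.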

🤖 Generated with Claude Code

The Claude review step was failing ~40% of the time because:
- --allowedTools didn't include Read, Glob, Grep, Agent, causing 8-9
  permission denials per run and preventing Claude from reading files
  or spawning the subagents the prompt required
- Step 4 spawned N additional scoring agents per issue found, exhausting
  the 50-turn budget before the PR comment could be posted
- Subagents could independently post PR comments, causing fragmented output

Fix: add missing tools to --allowedTools, merge per-issue scoring into
the review agents themselves, and add guardrails ensuring exactly one
consolidated comment is always posted.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 11, 2026 18:47
@gemini-code-assist
Contributor

Note

Gemini is unable to generate a summary for this pull request due to the file types involved not being currently supported.

@github-actions (bot) added the labels "scope: ci", "size: M", "risk: medium", and "contributor: core" on Mar 11, 2026
Contributor

Copilot AI left a comment

Pull request overview

Updates the Claude Code review GitHub Action workflow to reduce tool permission denials, simplify the review prompt/agent flow, and add guardrails to ensure consistent PR comment output for staging promotion gates.

Changes:

  • Expand --allowedTools to include Read, Glob, Grep, and Agent.
  • Simplify the prompt by embedding severity/confidence scoring into the 4 review agents instead of spawning per-issue scoring agents.
  • Add “single commenter / single comment” guardrails to reduce fragmented or missing output.

Comment thread .github/workflows/claude-review.yml
@henrypark133 henrypark133 merged commit d313f44 into staging Mar 11, 2026
12 of 13 checks passed
@henrypark133 henrypark133 deleted the fix/claude-review-reliability branch March 11, 2026 21:05
@github-actions github-actions bot mentioned this pull request Mar 11, 2026
bkutasi pushed a commit to bkutasi/ironclaw that referenced this pull request Mar 28, 2026
drchirag1991 pushed a commit to drchirag1991/ironclaw that referenced this pull request Apr 8, 2026
Labels

  • contributor: core (20+ merged PRs)
  • risk: medium (Business logic, config, or moderate-risk modules)
  • scope: ci (CI/CD workflows)
  • size: M (50-199 changed lines)

3 participants