docs(skills): code-review v2 + GitHub endpoint fixes + minor text updates#2528
docs(skills): code-review v2 + GitHub endpoint fixes + minor text updates#2528ilblackdragon merged 6 commits intostagingfrom
Conversation
Rewrite the code-review skill from a 6-bullet checklist into a paranoid-architect workflow that handles both local diffs and GitHub PRs end-to-end: - Two input shapes: local `git diff` or `owner/repo N` / `github.com/.../pull/N` URLs. - Step 1 wraps GitHub fetches in `async def` + `FINAL(await ...)` to avoid the closure-capture quirk that kept tripping LLMs (see the paired codeact preamble update); reads metadata, diff, and files via three sequential awaits instead of `asyncio.gather`. - Step 2 reads each changed file in full (raw media type, no base64 module needed) so reviews account for surrounding context. - Step 3 runs the change through six lenses: correctness, edge cases, security (with a real adversarial checklist), test coverage, docs, architecture. - Step 4 renders findings as a severity table and asks which to post. - Step 5 posts line-level comments via the PR comments endpoint with the captured head SHA, falling back to issue comments for multi-file findings. Bumps `requires.skills` to include `github` so the activation pulls in the GitHub API recipes via the chain-loader. Adds a live e2e test (`e2e_live_code_review.rs`) plus a recorded trace fixture (PR #2483) so the workflow is replayable without hitting GitHub.
LLMs kept inventing a `search_issues` action and looping over
`/repos/{owner}/{repo}/pulls` for "my PRs" queries. Clarify the
GitHub tool surface in three places:
- `tools-src/github/src/lib.rs` and `registry/tools/github.json`:
enumerate the three real search actions and call out that
`search_issues_pull_requests` covers both. Add the canonical
`is:pr author:@me sort:updated-desc` recipe for cross-repo "my PRs".
- `skills/github/SKILL.md`: add an "Authenticated User & Cross-Repo
Queries" section with copy-paste recipes for `@me`, the search
endpoints with proper URL encoding, and the response-envelope
contract (`body` is parsed JSON for application/json, raw `str` for
diff endpoints — never call `json.loads()` on it, never write
`.get("body", body)` as a fallback).
There was a problem hiding this comment.
Code Review
This pull request significantly upgrades the code-review skill to version 2.0.0, introducing a "Paranoid Architect" persona and detailed workflows for performing deep analysis on both local changes and GitHub pull requests. The updates include specific instructions for fetching full file contexts, applying six analytical lenses (e.g., security, correctness, architecture), and posting findings back to GitHub. Corresponding documentation updates were made to the github skill to clarify search endpoints and response handling. Furthermore, a new end-to-end test suite and associated trace fixtures have been added to verify the skill's performance. Review feedback suggests improving the robustness of the GitHub API integration by URL-encoding file paths and demonstrating multi-line comment support in the provided examples.
| ```repl | ||
| r = await http( | ||
| method="GET", | ||
| url=f"https://api.github.com/repos/{owner}/{repo}/contents/{path}?ref={head_sha}", |
There was a problem hiding this comment.
The file path should be URL-encoded before being interpolated into the URL. While many code paths are safe, files with spaces or special characters will cause the request to fail. Since the github skill documentation (which this skill requires) already mentions encoding for query parameters, it's best to be consistent here as well. Note that you may need to import urllib.parse if the environment allows it.
| url=f"https://api.github.com/repos/{owner}/{repo}/contents/{path}?ref={head_sha}", | |
| url=f"https://api.github.com/repos/{owner}/{repo}/contents/{urllib.parse.quote(path)}?ref={head_sha}", |
| "body": "**High** — `state.store` accessed directly, bypassing dispatch. See `.claude/rules/tools.md`.", | ||
| "commit_id": head_sha, | ||
| "path": "src/channels/web/handlers/foo.rs", | ||
| "line": 142, |
There was a problem hiding this comment.
The GitHub API supports multi-line comments using start_line and start_side. Since the instructions in Step 3 (line 102) mention capturing a "line range", it would be helpful to include these fields in the example to show the agent how to post findings that span multiple lines.
| "line": 142, | |
| "start_line": 140, | |
| "start_side": "RIGHT", | |
| "line": 142, |
henrypark133
left a comment
There was a problem hiding this comment.
No verified blockers in the changed code.
I checked the GitHub tool schema/description updates, the code-review/github skill guidance, and the new live replay coverage. The substantive change here is tightening the model-facing contract around GitHub search actions and adding caller-level regression coverage so the agent has to fetch and name the requested PR instead of silently substituting another target.
Residual risk: most of this PR is prompt/skill behavior, so future regressions can still come from model drift, but the added replay fixture is the right guardrail for that class of bug.
…arness methods - Remove `.into_iter()` on `details` in catalog.rs (clippy::useless_conversion) - Add `with_skills_dir` to `LiveTestHarnessBuilder` for e2e_live_code_review test - Add `active_skill_names` to `TestRig` extracting from SkillActivated status events Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
zmanian
left a comment
There was a problem hiding this comment.
Summary
Upgrades the code-review skill to a paranoid-architect v2 that handles both local diffs and GitHub PRs with line-level comment posting, clarifies the GitHub skill's search endpoints, adds a 313-line live/replay e2e test for the new flow, and ships a 2979-line recorded trace fixture. Scope is broader than "minor text updates" but the pieces are coherent.
Findings
Blocking
None.
Suggestions
- Merge ordering with PR #2530.
tests/support/test_rig.rs:336-348addsactive_skill_names()which destructuresStatusUpdate::SkillActivated { skill_names }. PR #2530 adds afeedbackfield to that variant. Whichever merges second will need a..pattern added to the match. No runtime impact, but will break CI on the second PR unless the author rebases. Worth a heads-up comment in both PRs. - Ambiguous mode detection (
skills/code-review/SKILL.md:79): "If the message contains anything shaped likeowner/repofollowed by a number, treat it as a PR request and use the GitHub path, not git." A user saying "review nearai/ironclaw 2483 locally" would be ambiguous. Consider explicitly calling out that a trailinglocally/localkeyword should override. tools-src/github/src/lib.rs:2151schema description is now 500+ chars; JSON schema descriptions are typically shown truncated in tool registries. Keep the canonical copy here and move extended examples to theskills/github/SKILL.mdfile you already updated.crates/ironclaw_skills/src/catalog.rs:464.zip(details.into_iter())->.zip(details)is a drive-by clippy fix unrelated to docs. Fine, but noting for future splits.tests/e2e_live_code_review.rs:637hard-codesPR_2483_TITLE. A comment notes "update this constant alongside re-recording the trace fixture" — good, but consider fetching the title from the trace fixture at test time so drift is caught by the replay machinery instead of by a stale assert.
Nits
skills/github/SKILL.md:369-371Common Mistakes now has a line about/search/issuescovering both — this is duplicated in the Search section above; one place would suffice.registry/tools/github.jsondescription is now very long (300+ chars). Same caveat as the schema description above.
Verdict
COMMENT — ship it. The skill instructions are thoughtful and the regression test meaningfully guards against the "silent-substitution" failure mode. Coordinate the SkillActivated pattern match with PR #2530's author to avoid a CI break on the second merge.
henrypark133
left a comment
There was a problem hiding this comment.
No verified blockers in the updated docs, tool metadata, and test changes. The GitHub search contract is now explicit about search_issues_pull_requests, the skill guidance fixes the earlier ambiguous query details, and the live/replay coverage still exercises the real code-review owner/repo N flow instead of allowing a generic non-skill response to slip through. Residual risk is mostly prompt behavior rather than Rust logic, but the caller-level trace test is the right backstop for this PR.
…ne comments, description trimming (#2528) - URL-encode file paths in GitHub API content URLs - Add start_line/start_side to multi-line comment example - Add 'locally' keyword override for mode detection - Trim overly long schema descriptions - Remove duplicated /search/issues note from Common Mistakes - Fetch PR title from trace fixture instead of hard-coding Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PR 2528 Review — docs(skills): code-review v2 + GitHub endpoint fixesOverviewUpgrades the Strengths
Issues1. Impact: in replay mode the Fix: propagate the stored path into 2. 3. Fixture size. Minor / Nits
Risk SummaryThe SKILL.md / description churn is text-only and low risk. The live/replay test as written likely won't catch skill-activation regressions in replay mode due to issue #1 above — worth fixing before relying on it as a guardrail. |
Review feedback addressedAll comments from @gemini-code-assist and @zmanian have been addressed:
Item 8 (merge ordering with PR #2530 re: Quality gate clean: |
LiveTestHarnessBuilder::with_skills_dir() stored a PathBuf but only
used it as an is_some() flag — the actual SkillRegistry always pointed
at an empty temp directory. Now the stored path flows through
TestRigBuilder::with_skills_dir() into config.skills.local_dir and
the SkillRegistry constructor.
Also generalizes the hardcoded nearai/ironclaw repo name in the
github skill's response-handling example to {owner}/{repo}.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addressed @ilblackdragon review feedback
Commit: e72b3fc Quality gate clean: |
Summary
Cherry-picked from #2504.
Test plan
🤖 Generated with Claude Code