docs(skills): code-review v2 + GitHub endpoint fixes + minor text updates by serrrfirat · Pull Request #2528 · nearai/ironclaw

serrrfirat · 2026-04-16T13:33:29Z

Summary

Upgrade code-review skill to paranoid-architect v2: supports local diffs + GitHub PRs with line-level comments
Fix GitHub skill: clarify search endpoints, document response envelope structure, fix search_issues hallucination
Minor text updates across 8 other skills (commitment-digest, commitment-triage, decision-capture, delegation-tracker, idea-parking, product-prioritization, security-review, tech-debt-tracker)

Cherry-picked from #2504.

Test plan

Verify SKILL.md files parse correctly (valid frontmatter)
Spot-check code-review skill v2 instructions against actual GitHub API behavior

🤖 Generated with Claude Code

Rewrite the code-review skill from a 6-bullet checklist into a paranoid-architect workflow that handles both local diffs and GitHub PRs end-to-end: - Two input shapes: local `git diff` or `owner/repo N` / `github.com/.../pull/N` URLs. - Step 1 wraps GitHub fetches in `async def` + `FINAL(await ...)` to avoid the closure-capture quirk that kept tripping LLMs (see the paired codeact preamble update); reads metadata, diff, and files via three sequential awaits instead of `asyncio.gather`. - Step 2 reads each changed file in full (raw media type, no base64 module needed) so reviews account for surrounding context. - Step 3 runs the change through six lenses: correctness, edge cases, security (with a real adversarial checklist), test coverage, docs, architecture. - Step 4 renders findings as a severity table and asks which to post. - Step 5 posts line-level comments via the PR comments endpoint with the captured head SHA, falling back to issue comments for multi-file findings. Bumps `requires.skills` to include `github` so the activation pulls in the GitHub API recipes via the chain-loader. Adds a live e2e test (`e2e_live_code_review.rs`) plus a recorded trace fixture (PR #2483) so the workflow is replayable without hitting GitHub.

@me

LLMs kept inventing a `search_issues` action and looping over `/repos/{owner}/{repo}/pulls` for "my PRs" queries. Clarify the GitHub tool surface in three places: - `tools-src/github/src/lib.rs` and `registry/tools/github.json`: enumerate the three real search actions and call out that `search_issues_pull_requests` covers both. Add the canonical `is:pr author:@me sort:updated-desc` recipe for cross-repo "my PRs". - `skills/github/SKILL.md`: add an "Authenticated User & Cross-Repo Queries" section with copy-paste recipes for `@me`, the search endpoints with proper URL encoding, and the response-envelope contract (`body` is parsed JSON for application/json, raw `str` for diff endpoints — never call `json.loads()` on it, never write `.get("body", body)` as a fallback).

gemini-code-assist

Code Review

This pull request significantly upgrades the code-review skill to version 2.0.0, introducing a "Paranoid Architect" persona and detailed workflows for performing deep analysis on both local changes and GitHub pull requests. The updates include specific instructions for fetching full file contexts, applying six analytical lenses (e.g., security, correctness, architecture), and posting findings back to GitHub. Corresponding documentation updates were made to the github skill to clarify search endpoints and response handling. Furthermore, a new end-to-end test suite and associated trace fixtures have been added to verify the skill's performance. Review feedback suggests improving the robustness of the GitHub API integration by URL-encoding file paths and demonstrating multi-line comment support in the provided examples.

gemini-code-assist · 2026-04-16T13:37:15Z

+```repl
+r = await http(
+    method="GET",
+    url=f"https://api.github.com/repos/{owner}/{repo}/contents/{path}?ref={head_sha}",


The file path should be URL-encoded before being interpolated into the URL. While many code paths are safe, files with spaces or special characters will cause the request to fail. Since the github skill documentation (which this skill requires) already mentions encoding for query parameters, it's best to be consistent here as well. Note that you may need to import urllib.parse if the environment allows it.

Suggested change

url=f"https://api.github.com/repos/{owner}/{repo}/contents/{path}?ref={head_sha}",

url=f"https://api.github.com/repos/{owner}/{repo}/contents/{urllib.parse.quote(path)}?ref={head_sha}",

gemini-code-assist · 2026-04-16T13:37:15Z

+            "body": "**High** — `state.store` accessed directly, bypassing dispatch. See `.claude/rules/tools.md`.",
+            "commit_id": head_sha,
+            "path": "src/channels/web/handlers/foo.rs",
+            "line": 142,


The GitHub API supports multi-line comments using start_line and start_side. Since the instructions in Step 3 (line 102) mention capturing a "line range", it would be helpful to include these fields in the example to show the agent how to post findings that span multiple lines.

Suggested change

"line": 142,

"start_line": 140,

"start_side": "RIGHT",

"line": 142,

henrypark133

No verified blockers in the changed code.

I checked the GitHub tool schema/description updates, the code-review/github skill guidance, and the new live replay coverage. The substantive change here is tightening the model-facing contract around GitHub search actions and adding caller-level regression coverage so the agent has to fetch and name the requested PR instead of silently substituting another target.

Residual risk: most of this PR is prompt/skill behavior, so future regressions can still come from model drift, but the added replay fixture is the right guardrail for that class of bug.

…arness methods - Remove `.into_iter()` on `details` in catalog.rs (clippy::useless_conversion) - Add `with_skills_dir` to `LiveTestHarnessBuilder` for e2e_live_code_review test - Add `active_skill_names` to `TestRig` extracting from SkillActivated status events Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

zmanian

Summary

Upgrades the code-review skill to a paranoid-architect v2 that handles both local diffs and GitHub PRs with line-level comment posting, clarifies the GitHub skill's search endpoints, adds a 313-line live/replay e2e test for the new flow, and ships a 2979-line recorded trace fixture. Scope is broader than "minor text updates" but the pieces are coherent.

Findings

Blocking

None.

Suggestions

Merge ordering with PR #2530. tests/support/test_rig.rs:336-348 adds active_skill_names() which destructures StatusUpdate::SkillActivated { skill_names }. PR #2530 adds a feedback field to that variant. Whichever merges second will need a .. pattern added to the match. No runtime impact, but will break CI on the second PR unless the author rebases. Worth a heads-up comment in both PRs.
Ambiguous mode detection (skills/code-review/SKILL.md:79): "If the message contains anything shaped like owner/repo followed by a number, treat it as a PR request and use the GitHub path, not git." A user saying "review nearai/ironclaw 2483 locally" would be ambiguous. Consider explicitly calling out that a trailing locally / local keyword should override.
tools-src/github/src/lib.rs:2151 schema description is now 500+ chars; JSON schema descriptions are typically shown truncated in tool registries. Keep the canonical copy here and move extended examples to the skills/github/SKILL.md file you already updated.
crates/ironclaw_skills/src/catalog.rs:464 .zip(details.into_iter()) -> .zip(details) is a drive-by clippy fix unrelated to docs. Fine, but noting for future splits.
tests/e2e_live_code_review.rs:637 hard-codes PR_2483_TITLE. A comment notes "update this constant alongside re-recording the trace fixture" — good, but consider fetching the title from the trace fixture at test time so drift is caught by the replay machinery instead of by a stale assert.

Nits

skills/github/SKILL.md:369-371 Common Mistakes now has a line about /search/issues covering both — this is duplicated in the Search section above; one place would suffice.
registry/tools/github.json description is now very long (300+ chars). Same caveat as the schema description above.

Verdict

COMMENT — ship it. The skill instructions are thoughtful and the regression test meaningfully guards against the "silent-substitution" failure mode. Coordinate the SkillActivated pattern match with PR #2530's author to avoid a CI break on the second merge.

henrypark133

No verified blockers in the updated docs, tool metadata, and test changes. The GitHub search contract is now explicit about search_issues_pull_requests, the skill guidance fixes the earlier ambiguous query details, and the live/replay coverage still exercises the real code-review owner/repo N flow instead of allowing a generic non-skill response to slip through. Residual risk is mostly prompt behavior rather than Rust logic, but the caller-level trace test is the right backstop for this PR.

…ne comments, description trimming (#2528) - URL-encode file paths in GitHub API content URLs - Add start_line/start_side to multi-line comment example - Add 'locally' keyword override for mode detection - Trim overly long schema descriptions - Remove duplicated /search/issues note from Common Mistakes - Fetch PR title from trace fixture instead of hard-coding Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ilblackdragon · 2026-04-17T06:50:35Z

PR 2528 Review — docs(skills): code-review v2 + GitHub endpoint fixes

Overview

Upgrades the code-review skill to a paranoid-architect v2 that handles both local diffs and GitHub PRs with line-level comment posting, clarifies the github skill's search/user endpoints, tightens the search_issues_pull_requests tool description, lands a new live/replay e2e test for /code-review owner/repo N, and fixes one clippy nit.

Strengths

The code-review v2 SKILL.md is well-structured: two clear input shapes, six-lens review rubric, explicit severity table, and concrete repl blocks. The warnings about FINAL() semantics, asyncio closure-capture pitfalls, and not normalizing body with .get(\"body\", body) are specific and rooted in prior failures — exactly what a skill should teach.
github/SKILL.md now disambiguates search endpoints and documents the http-tool envelope shape ({status, headers, body}), which is the single biggest source of hallucinated GitHub calls.
tools-src/github/src/lib.rs description clarifies that there is no search_issues action; this matches the schema at line 2153.
The e2e test is thoughtful: it validates skill activation, target URL, PR title presence, a concrete file citation, and explicitly guards against the "empty shell / unknown fields" regression.

Issues

1. LiveTestHarnessBuilder::with_skills_dir() silently drops its PathBuf argument (likely bug).
tests/support/live_harness.rs:352,549,598 store skills_dir: Option<PathBuf> but then use it only as an is_some() flag to call rig_builder.with_skills(), which just flips enable_skills = true (test_rig.rs:740). The actual skills directory consumed by AppBuilder comes from config.skills.local_dir, which test_rig.rs:849,857,1049 hardcodes to temp_dir.path().join(\"skills\") — an empty dir. So .with_skills_dir(repo_skills_dir()) is equivalent to .with_skills_dir(PathBuf::from(\"/anywhere\")).

Impact: in replay mode the SkillRegistry has no skills loaded, no SkillActivated event fires for code-review, and the test's first assertion (active.iter().any(|s| s == \"code-review\")) should fail. If the fixture was recorded with SKILLS_DIR env var or some ad-hoc override, live mode is hiding this. This is the exact class of silent-argument-loss bug .claude/rules/testing.md calls out ("Test Through the Caller, Not Just the Helper").

Fix: propagate the stored path into TestRigBuilder so it writes config.skills.local_dir (and update the enable_skills block at test_rig.rs:1047 to use that path, not another temp subdir).

2. max_context_tokens doubled (1200 → 2500).
v2 is substantially longer, but 2500 is sizable against SKILLS_MAX_TOKENS. Worth confirming it still fits alongside github + coding in the default budget without starving other skills.

3. Fixture size.
code_review_pr_2483.json is ~100 KB / 2979 lines committed. Acceptable for replay determinism, but subsequent re-records will churn the repo. Consider whether this fixture needs pruning of x-github-* ratelimit/date headers before recording (they don't affect replay).

Minor / Nits

skills/code-review/SKILL.md:95 comment says "three serial GETs" — the example actually issues two (meta+diff on the same URL) plus files, which is three, but the rationale sentence about asyncio import ordering is a lot of caveat for what could also be solved by a single helper. Fine as-is.
skills/github/SKILL.md:174 example hardcodes nearai/ironclaw — generalize to {owner}/{repo} for consistency with the rest of the doc.
catalog.rs:464 clippy cleanup is correct (zip accepts IntoIterator).

Risk Summary

The SKILL.md / description churn is text-only and low risk. The live/replay test as written likely won't catch skill-activation regressions in replay mode due to issue #1 above — worth fixing before relying on it as a guardrail.

serrrfirat · 2026-04-17T07:05:46Z

Review feedback addressed

All comments from @gemini-code-assist and @zmanian have been addressed:

#	Reviewer	Fix
1	gemini-code-assist	URL-encode file paths in GitHub API content URLs (`urllib.parse.quote(path, safe='')`)
2	gemini-code-assist	Added `start_line`/`start_side` to multi-line comment example
3	zmanian	Added `locally`/`local` keyword override for ambiguous mode detection
4	zmanian	Trimmed `search_issues_pull_requests` schema description (350→110 chars)
5	zmanian	Removed duplicate `/search/issues` note from Common Mistakes
6	zmanian	Trimmed `registry/tools/github.json` description (334→151 chars)
7	zmanian	PR title now extracted from trace fixture at test time instead of hard-coded

Item 8 (merge ordering with PR #2530 re: SkillActivated pattern) — will handle on rebase when the second PR merges.

Quality gate clean: cargo fmt ✓ · cargo clippy ✓ · cargo test --lib ✓

LiveTestHarnessBuilder::with_skills_dir() stored a PathBuf but only used it as an is_some() flag — the actual SkillRegistry always pointed at an empty temp directory. Now the stored path flows through TestRigBuilder::with_skills_dir() into config.skills.local_dir and the SkillRegistry constructor. Also generalizes the hardcoded nearai/ironclaw repo name in the github skill's response-handling example to {owner}/{repo}. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

serrrfirat · 2026-04-17T07:31:31Z

Addressed @ilblackdragon review feedback

#	Issue	Fix
1	`with_skills_dir()` silently drops `PathBuf` — skills dir hardcoded to empty temp subdir	Propagated `skills_dir` through `TestRigBuilder` into `config.skills.local_dir`; `live_harness.rs` now passes the actual path instead of just flipping a boolean flag
2	`skills/github/SKILL.md:174` hardcodes `nearai/ironclaw`	Generalized to `{owner}/{repo}/pulls/123`
3	`max_context_tokens` (1200→2500) / fixture size	Noted — non-blocking, no code change

Commit: e72b3fc

Quality gate clean: cargo fmt ✓ · cargo clippy ✓ · cargo test --lib ✓

ilblackdragon added 2 commits April 16, 2026 16:33

github-actions bot added scope: docs Documentation size: S 10-49 changed lines risk: low Changes to docs, tests, or low-risk modules contributor: core 20+ merged PRs labels Apr 16, 2026

gemini-code-assist bot reviewed Apr 16, 2026

View reviewed changes

henrypark133 previously approved these changes Apr 16, 2026

View reviewed changes

serrrfirat dismissed henrypark133’s stale review via 0ad1931 April 16, 2026 20:50

zmanian reviewed Apr 17, 2026

View reviewed changes

henrypark133 previously approved these changes Apr 17, 2026

View reviewed changes

serrrfirat dismissed henrypark133’s stale review via 1e58c1f April 17, 2026 05:50

serrrfirat requested a review from zmanian April 17, 2026 07:05

serrrfirat requested a review from ilblackdragon April 17, 2026 07:31

ilblackdragon approved these changes Apr 17, 2026

View reviewed changes

Merge branch 'staging' into docs/skill-text-updates

2ba1a63

ilblackdragon merged commit 27d53f5 into staging Apr 17, 2026
15 checks passed

ilblackdragon deleted the docs/skill-text-updates branch April 17, 2026 14:12

github-actions bot mentioned this pull request Apr 17, 2026

chore: promote staging to staging-promote/2874d2e9-24567658122 (2026-04-17 14:25 UTC) #2578

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs(skills): code-review v2 + GitHub endpoint fixes + minor text updates#2528

docs(skills): code-review v2 + GitHub endpoint fixes + minor text updates#2528
ilblackdragon merged 6 commits intostagingfrom
docs/skill-text-updates

serrrfirat commented Apr 16, 2026

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Apr 16, 2026

Uh oh!

gemini-code-assist bot Apr 16, 2026

Uh oh!

henrypark133 left a comment

Uh oh!

zmanian left a comment

Uh oh!

henrypark133 left a comment

Uh oh!

ilblackdragon commented Apr 17, 2026

Uh oh!

serrrfirat commented Apr 17, 2026

Uh oh!

serrrfirat commented Apr 17, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

	url=f"https://api.github.com/repos/{owner}/{repo}/contents/{path}?ref={head_sha}",
	url=f"https://api.github.com/repos/{owner}/{repo}/contents/{urllib.parse.quote(path)}?ref={head_sha}",

Conversation

serrrfirat commented Apr 16, 2026

Summary

Test plan

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

henrypark133 left a comment

Choose a reason for hiding this comment

Uh oh!

zmanian left a comment

Choose a reason for hiding this comment

Summary

Findings

Blocking

Suggestions

Nits

Verdict

Uh oh!

henrypark133 left a comment

Choose a reason for hiding this comment

Uh oh!

ilblackdragon commented Apr 17, 2026

PR 2528 Review — docs(skills): code-review v2 + GitHub endpoint fixes

Overview

Strengths

Issues

Minor / Nits

Risk Summary

Uh oh!

serrrfirat commented Apr 17, 2026

Review feedback addressed

Uh oh!

serrrfirat commented Apr 17, 2026

Addressed @ilblackdragon review feedback

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants