Replace PR agent with pr-review skill and decouple comment posting#34460
Replace PR agent with pr-review skill and decouple comment posting#34460
Conversation
|
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.sh | bash -s -- 34460Or
iex "& { $(irm https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.ps1) } 34460" |
There was a problem hiding this comment.
Pull request overview
This PR migrates the Copilot-driven PR review automation from the legacy “pr agent” docs/scripts to a skill-based workflow centered on a new pr-review orchestrator (plus phase skills), and updates CI + commenting/labeling plumbing accordingly.
Changes:
- Introduces new PR review skills (
pr-review,pr-preflight,pr-gate,pr-report) and removes legacy PR agent instruction files. - Refactors
Review-PR.ps1to run a 4-step skill sequence (review → finalize → post summary → apply labels) and updates CI invocation. - Simplifies AI summary comment tooling by removing try-fix / write-tests / verify-tests comment scripts, and updates related docs.
Reviewed changes
Copilot reviewed 25 out of 25 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| eng/pipelines/ci-copilot.yml | Updates CI to invoke Review-PR.ps1 without legacy flags; publishes artifacts. |
| .github/skills/verify-tests-fail-without-fix/SKILL.md | Documentation wording adjustment for where verification output is consumed. |
| .github/skills/pr-review/SKILL.md | New orchestrator skill describing the 4-phase PR review workflow. |
| .github/skills/pr-report/SKILL.md | New “report/final recommendation” phase skill documentation. |
| .github/skills/pr-preflight/SKILL.md | New “context gathering” phase skill documentation. |
| .github/skills/pr-gate/SKILL.md | New “test verification” phase skill documentation and output format. |
| .github/skills/pr-finalize/SKILL.md | Updates wording to reference pr-review skill rather than the old agent. |
| .github/skills/ai-summary-comment/SKILL.md | Updates AI summary comment skill docs to match the new workflow structure. |
| .github/skills/ai-summary-comment/scripts/post-write-tests-comment.ps1 | Deletes legacy unified-comment section updater for write-tests attempts. |
| .github/skills/ai-summary-comment/scripts/post-verify-tests-comment.ps1 | Deletes legacy unified-comment section updater for verification results. |
| .github/skills/ai-summary-comment/scripts/post-try-fix-comment.ps1 | Deletes legacy unified-comment section updater for try-fix attempts. |
| .github/skills/ai-summary-comment/scripts/post-pr-finalize-comment.ps1 | Changes default behavior to inject into unified AI Summary comment; renames flag to -Standalone. |
| .github/skills/ai-summary-comment/scripts/post-ai-summary-comment.ps1 | Adjusts how review sessions are wrapped and displayed (commit info moved to top-level summary). |
| .github/skills/ai-summary-comment/NO-EXTERNAL-REFERENCES-RULE.md | Terminology updates (“PR agent” → “agent”). |
| .github/scripts/shared/Update-AgentLabels.ps1 | Terminology update in header; label parsing still reads gate/content.md. |
| .github/scripts/shared/Start-Emulator.ps1 | Adds -Headless flag and uses headless mode by default in CI contexts. |
| .github/scripts/Review-PR.ps1 | Major refactor to a 4-step skill-based orchestration flow. |
| .github/README-AI.md | Updates repository AI docs to point to the new skill-based workflow. |
| .github/instructions/sandbox.instructions.md | Updates guidance to reference pr-review skill instead of the old agent. |
| .github/docs/agent-labels.md | Updates documentation terminology and architecture description for the new workflow. |
| .github/copilot-instructions.md | Updates “available agents/skills” documentation to include pr-review and phase skills. |
| .github/agents/pr/SHARED-RULES.md | Removes legacy agent shared rules doc. |
| .github/agents/pr/post-gate.md | Removes legacy agent Phase 3/4 doc. |
| .github/agents/pr/PLAN-TEMPLATE.md | Removes legacy plan template for the old agent. |
| .github/agents/pr.md | Removes legacy PR agent definition. |
Comments suppressed due to low confidence (1)
.github/skills/ai-summary-comment/SKILL.md:282
- The expected directory structure uses PRAgent/validate/, but the actual posting/labeling scripts auto-load from PRAgent/gate/ (e.g., post-ai-summary-comment.ps1 and Update-AgentLabels.ps1). Please align this doc with the implemented folder names, otherwise agents will write files to the wrong location and the summary/labels won’t pick them up.
The agent writes phase output files that comment scripts auto-load:
CustomAgentLogsTmp/PRState/{PRNumber}/
├── PRAgent/
│ ├── pre-flight/
│ │ └── content.md # Phase 1 output (auto-loaded by post-ai-summary-comment.ps1)
│ ├── validate/
│ │ ├── content.md # Phase 2 output (auto-loaded by post-ai-summary-comment.ps1)
│ │ └── verify-tests-fail/ # Script output from verify-tests-fail.ps1
│ │ ├── verification-report.md
│ │ ├── verification-log.txt
│ │ ├── test-without-fix.log
│ │ └── test-with-fix.log
│ ├── try-fix/
</details>
Code Review: Skills Architecture RefactoringOverall: Good refactoring. The shift from agent-based to skill-based orchestration cuts ~3,800 lines, cleanly separates "what to do" (skill/phase files) from "how to post it" (scripts), and moves deterministic work out of LLM control. The 4-phase workflow is preserved intact. 🔴 Should fix before merge1. �� Nice to fix2. Fork fetch git errors swallowed (Review-PR.ps1 line 205) 3. 4. Orphaned legacy parameters (post-pr-finalize-comment.ps1 lines 49-59) 📋 Separate PR5. ✅ Already fixed
✅ Things done well
|
|
Thanks for the review @PureWeen ! All actionable items addressed in 7755046: 1. ✅ 2. ✅ Fork fetch errors swallowed — 3. ✅ 4. ✅ Orphaned legacy parameters — Removed all 11 unused legacy parameters from |
Refactor the monolithic PR agent into a pr-review orchestrator and distinct phase skills. Removed legacy agent files (agents/pr.md, PLAN-TEMPLATE.md, SHARED-RULES.md, post-gate.md) and added/registered modular skills (pr-review, pr-preflight, pr-gate, pr-report) with updated SKILL.md files. Updated docs (README-AI.md, copilot-instructions.md) and scripts to reference the new pr-review workflow, adjusted related PowerShell scripts and AI-summary tooling, and updated CI pipeline entries. This consolidates the workflow into an orchestrator that invokes separate preflight/gate/try-fix/report skills for clearer responsibilities and reuse. Replace pr agent with pr-review skill (split phases) Refactor the monolithic PR agent into a pr-review orchestrator and distinct phase skills. Removed legacy agent files (agents/pr.md, PLAN-TEMPLATE.md, SHARED-RULES.md, post-gate.md) and added/registered modular skills (pr-review, pr-preflight, pr-gate, pr-report) with updated SKILL.md files. Updated docs (README-AI.md, copilot-instructions.md) and scripts to reference the new pr-review workflow, adjusted related PowerShell scripts and AI-summary tooling, and updated CI pipeline entries. This consolidates the workflow into an orchestrator that invokes separate preflight/gate/try-fix/report skills for clearer responsibilities and reuse. Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com>
Introduce a new Step 0 (branch setup) into .github/scripts/Review-PR.ps1 and update the PR review skill doc. The script now documents a 5-step workflow, adds a -UseCurrentBranch switch and DryRun handling, and by default checks out main to create a pr-review-{PRNumber} branch. It fetches the PR (origin or fork), cherry-picks PR commits squashed into a single review commit, handles conflicts/cleanup, and prints helpful status messages. Also removed the inline Phase 0 instructions from .github/skills/pr-review/SKILL.md and clarified the rule to never switch branches (stay on the review branch set up by the caller). Examples and description text were updated accordingly.
Reorganize PR review documentation by moving pr-preflight, pr-gate, and pr-report from .github/skills/ to .github/pr-review/ and converting them into phase instruction files. Update .github/copilot-instructions.md and .github/skills/pr-review/SKILL.md to reference the new .github/pr-review/* paths and adjust wording to treat those documents as phase instructions (while leaving try-fix as the phase skill). Remove redundant frontmatter from the moved files and make small clarifications (branch setup note, no-comments reminder, and quick-reference table updates). This clarifies phase responsibilities and centralizes PR review guidance.
If the script is called while a previous baseline is still active (state file exists), auto-restore the previous baseline first before establishing the new one. This prevents the dirty-tree error that caused agents to loop endlessly trying to re-establish baseline. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Refactor review/commenting flow: move/rename the ai-summary comment script into .github/scripts, add a new post-pr-finalize-comment.ps1 script to inject pr-finalize summaries into the unified AI Summary comment (with auto-discovery, dry-run preview and update/create behavior), and update Review-PR.ps1 to invoke these scripts directly for Step 3. Enhance agent runtime logging in Review-PR.ps1: richer JSON-stream parsing with model display, turn/tool counting, tool icons, intent reporting, content previews, failure tracking, and a final stats summary. Remove legacy ai-summary-comment skill docs and duplicate script artifacts under .github/skills/ai-summary-comment (IMPROVEMENTS.md, NO-EXTERNAL-REFERENCES-RULE.md, SKILL.md and an old script were deleted).
Update PR review skill to mandate using mode: "sync" for all try-fix task invocations and forbid mode: "background" (background mode causes the orchestrator to proceed before attempts finish and try-fix results are lost). Require updating try-fix/content.md after every attempt (immediately after completion) and add corresponding checklist entries and common-mistakes notes to ensure try-fix outputs are persisted and visible in PR comments. (Changes to .github/skills/pr-review/SKILL.md)
Replace the previous cherry-pick flow with a git merge --squash workflow when preparing a review branch. Adds conflict handling: aborts the merge on conflicts, resets state, posts a comment on the PR notifying about merge conflicts, and exits the script. Also updates dry-run messaging, handles detached HEAD display for current branch, and cleans up temporary branches after the squash-merge. Improves robustness and provides clearer notification when a PR cannot be squashed for review. Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com>
Phase 3 try-fix was updated to use four AI models (claude-opus-4.6-1m, claude-sonnet-4.6, gpt-5.3-codex, gemini-3-pro-preview) instead of two. Adjusted multi-model configuration, model order, and all checklist items to reflect 4 sequential attempts and per-attempt content updates. Clarified cross-pollination wording (now requires querying all models), preserved the mandatory mode: "sync" requirement, and updated exhaustion guidance to reference 4 models. Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com>
The body consistently references 4 models but the Quick Reference table said ×2. Updated to ×4 to match. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Capture and restore the original branch instead of relying on `git checkout -`, and surface git pull/fetch outputs in warnings to aid debugging. Review-PR.ps1: record the original branch (or HEAD), use it on error and cleanup, include pull/fetch stdout in warnings, and print the review base SHA. post-pr-finalize-comment.ps1: simplify the parameter list by keeping only PreviewFile and removing many legacy/ignored parameters. These changes make branch restoration more reliable and improve diagnostic information while cleaning up the post-PR script API.
Re-Review (post-rebase)All previously flagged issues have been fixed ✅ — New changes since last review
🟡 One remaining issueStale references to deleted
These still list Otherwise looks good to merge. |
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Stale |
Multi-Model Code Review (5 LLMs × consensus filter)Consensus Findings (2+ models agree)🟡 MODERATE -- When
The conflict path (lines 237-243) handles all three correctly. Suggested fix -- mirror it: if ($LASTEXITCODE -ne 0) {
git branch -D $tempBranch 2>$null
git reset --hard HEAD 2>$null
git checkout $originalBranch 2>$null
git branch -D $reviewBranch 2>$null
Write-Error "Failed to create squashed commit"; exit 1
}🟡 MODERATE -- Old behavior: This affects all pipelines consuming Review generated by 5-model consensus (claude-opus-4.6 ×2, claude-sonnet-4.6, gemini-3-pro-preview, gpt-5.3-codex). Only issues flagged by 2+ models are included. |
Multi-Model Re-Review (Gemini 3 Pro, GPT 5.1, Sonnet 4.5)Previous fixes all confirmed intact ✅ (originalBranch, fork errors, legacy params, model count) Stale New findings from multi-model review🟡 🟡 Process cleanup too aggressive (Review-PR.ps1 lines 539-543) 🟡 Try-Fix mandatory vs skip contradiction (SKILL.md) 🟢 Minor: README-AI.md skill listing OverallAll three models agreed this is a good refactoring with clean architecture. The items above are the only net-new findings beyond what was already addressed. |
Re: Multi-Model Re-Review FindingsThanks for running the multi-model review — appreciate the thoroughness. However, I disagree with most of the findings here. Let me explain: 🟡
|
… condition - README-AI.md: Clarify that pr-preflight, pr-gate, pr-report are phase docs in .github/pr-review/, not standalone skills in skills/ - README-AI.md: Fix skill count to distinguish skills from phase docs - SKILL.md: Clarify Try-Fix mandatory language to explicitly state the exception (broken fix) alongside the rule (never skip when Gate passes) Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com>
Note
Are you waiting for the changes in this PR to be merged?
It would be very helpful if you could test the resulting artifacts from this PR and let us know in a comment if this change resolves your issue. Thank you!
🏗️ Overview
This PR replaces the monolithic PR agent with a modular, skill-based PR review orchestrator. The legacy 2-file agent is decomposed into 4 dedicated phase skills, each with clear boundaries, inputs, and outputs.
🔀 Before → After
✨ Key Changes
1️⃣ Agent → Skill Architecture
pr-reviewpr-preflightpr-gatetry-fix(existing)pr-report2️⃣ Decoupled Comment Posting
pr-reviewproduces output files only (CustomAgentLogsTmp/PRState/). Comments are posted separately viaai-summary-comment— no tight coupling.3️⃣ Unified Comment Model
Consolidated from 5 separate comment scripts → 2 sections in a single unified comment:
post-try-fix-comment.ps1post-verify-tests-comment.ps1post-write-tests-comment.ps1post-pr-finalize-comment.ps1now injects into the unified comment by default (use-Standalonefor legacy behavior).4️⃣ Simplified CI & Orchestration
Review-PR.ps1went from complex 3-phase script → clean 4-step orchestrator:ci-copilot.yml— removed explicit-RunFinalize -PostSummaryCommentflags (all phases run automatically).📁 Files Changed
✅ Added (4 files)
.github/skills/pr-review/SKILL.md.github/skills/pr-preflight/SKILL.md.github/skills/pr-gate/SKILL.md.github/skills/pr-report/SKILL.md❌ Deleted (7 files)
.github/agents/pr.mdpr-reviewskill.github/agents/pr/PLAN-TEMPLATE.md.github/agents/pr/SHARED-RULES.md.github/agents/pr/post-gate.mdpr-gate+pr-reportpost-try-fix-comment.ps1post-verify-tests-comment.ps1post-write-tests-comment.ps1🔧 Modified (14 files)
.github/scripts/Review-PR.ps1.github/skills/ai-summary-comment/SKILL.md.github/skills/ai-summary-comment/.../post-ai-summary-comment.ps1.github/skills/ai-summary-comment/.../post-pr-finalize-comment.ps1.github/copilot-instructions.md.github/README-AI.md.github/docs/agent-labels.md.github/scripts/shared/Update-AgentLabels.ps1.github/scripts/shared/Start-Emulator.ps1.github/skills/pr-finalize/SKILL.md.github/skills/verify-tests-fail-without-fix/SKILL.md.github/skills/ai-summary-comment/NO-EXTERNAL-REFERENCES-RULE.md.github/instructions/sandbox.instructions.mdeng/pipelines/ci-copilot.yml📊 Impact
Net reduction of ~2,400 lines — less code, better structure. 🎉