Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
37 changes: 37 additions & 0 deletions .github/agents/pr/SHARED-RULES.md
Original file line number Diff line number Diff line change
Expand Up @@ -119,6 +119,43 @@ EOF

---

## Agent Labels (Automated by Review-PR.ps1)

After all phases complete, `Review-PR.ps1` automatically applies GitHub labels based on phase outcomes. The agent does NOT need to apply labels β€” just write accurate `content.md` files.

### Label Categories

**Outcome labels** (mutually exclusive β€” exactly one per PR):
| Label | When Applied |
|-------|-------------|
| `s/agent-approved` | Report recommends APPROVE |
| `s/agent-changes-requested` | Report recommends REQUEST CHANGES |
| `s/agent-review-incomplete` | Agent didn't complete all phases |

**Signal labels** (additive β€” multiple can coexist):
| Label | When Applied |
|-------|-------------|
| `s/agent-gate-passed` | Gate phase passes |
| `s/agent-gate-failed` | Gate phase fails |
| `s/agent-fix-win` | Agent found a better alternative fix than the PR |
| `s/agent-fix-pr-picked` | PR's fix is the best β€” agent couldn't beat it |

**Tracking label** (always applied):
| Label | When Applied |
|-------|-------------|
| `s/agent-reviewed` | Every completed agent run |

### How Labels Are Determined

Labels are parsed from `content.md` files:
- **Outcome**: from `report/content.md` β€” looks for `Final Recommendation: APPROVE` or `REQUEST CHANGES`
- **Gate**: from `gate/content.md` β€” looks for `PASSED` or `FAILED`
- **Fix**: from `try-fix/content.md` β€” looks for alternative selected (win = agent beat PR) vs `Selected Fix: PR` (lose = PR was best)

**Agent responsibility**: Write clear, parseable `content.md` with standard markers (`βœ… PASSED`, `❌ FAILED`, `Selected Fix: PR`, `Final Recommendation: APPROVE`).

---

## No Direct Git Commands

**Never run git commands that change branch or file state.**
Expand Down
171 changes: 171 additions & 0 deletions .github/docs/agent-labels.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,171 @@
# Agent Workflow Labels

GitHub labels for tracking outcomes of the AI agent PR review workflow (`Review-PR.ps1`).

All labels use the **`s/agent-*`** prefix for easy querying on GitHub.

---

## Label Categories

### Outcome Labels

Mutually exclusive β€” exactly **one** is applied per PR review run.

| Label | Color | Description | Applied When |
|-------|-------|-------------|--------------|
| `s/agent-approved` | 🟒 `#2E7D32` | AI agent recommends approval β€” PR fix is correct and optimal | Report phase recommends APPROVE |
| `s/agent-changes-requested` | 🟠 `#E65100` | AI agent recommends changes β€” found a better alternative or issues | Report phase recommends REQUEST CHANGES |
| `s/agent-review-incomplete` | πŸ”΄ `#B71C1C` | AI agent could not complete all phases (blocker, timeout, error) | Agent exits without completing all phases |

When a new outcome label is applied, any previously applied outcome label is automatically removed.

### Signal Labels

Additive β€” **multiple** can coexist on a single PR.

| Label | Color | Description | Applied When |
|-------|-------|-------------|--------------|
| `s/agent-gate-passed` | 🟒 `#4CAF50` | AI verified tests catch the bug (fail without fix, pass with fix) | Gate phase passes |
| `s/agent-gate-failed` | 🟠 `#FF9800` | AI could not verify tests catch the bug | Gate phase fails |
| `s/agent-fix-win` | 🟒 `#66BB6A` | AI found a better alternative fix than the PR | Fix phase: alternative selected over PR's fix |
| `s/agent-fix-pr-picked` | 🟠 `#FF7043` | AI could not beat the PR fix β€” PR is the best among all candidates | Fix phase: PR selected as best after comparison |

Gate labels (`gate-passed`/`gate-failed`) are mutually exclusive with each other. Fix labels (`fix-win`/`fix-lose`) are mutually exclusive with each other.
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The label name s/agent-fix-pr-picked in the code does not match the documentation which refers to it as s/agent-fix-lose.

In the documentation at line 34, the table mentions "Fix labels (fix-win/fix-lose)" suggesting the label should be called s/agent-fix-lose, but the actual label defined in Update-AgentLabels.ps1 line 35 is s/agent-fix-pr-picked.

Either the code should use s/agent-fix-lose to match the documentation's naming pattern, or the documentation should be updated to consistently use s/agent-fix-pr-picked. The current mismatch could cause confusion when users try to query these labels.

Suggested change
Gate labels (`gate-passed`/`gate-failed`) are mutually exclusive with each other. Fix labels (`fix-win`/`fix-lose`) are mutually exclusive with each other.
Gate labels (`gate-passed`/`gate-failed`) are mutually exclusive with each other. Fix labels (`fix-win`/`fix-pr-picked`) are mutually exclusive with each other.

Copilot uses AI. Check for mistakes.

### Tracking Label

Always applied on every completed agent run.

| Label | Color | Description | Applied When |
|-------|-------|-------------|--------------|
| `s/agent-reviewed` | πŸ”΅ `#1565C0` | PR was reviewed by AI agent workflow (full 4-phase review) | Every completed agent run |

### Manual Label

Applied by MAUI maintainers, not by automation.

| Label | Color | Description | Applied When |
|-------|-------|-------------|--------------|
| `s/agent-fix-implemented` | 🟣 `#7B1FA2` | PR author implemented the agent's suggested fix | Maintainer applies when PR author adopts agent's recommendation |

---

## How It Works

### Architecture

```
Review-PR.ps1
β”œβ”€β”€ Phase 1: PR Agent Review (Copilot CLI)
β”‚ β”œβ”€β”€ Pre-Flight β†’ writes content.md
β”‚ β”œβ”€β”€ Gate β†’ writes content.md
β”‚ β”œβ”€β”€ Fix β†’ writes content.md
β”‚ └── Report β†’ writes content.md
β”œβ”€β”€ Phase 2: PR Finalize (optional)
β”œβ”€β”€ Phase 3: Post Comments (optional)
└── Phase 4: Apply Labels ← labels are applied here
β”œβ”€β”€ Parse content.md files
β”œβ”€β”€ Determine outcome + signal labels
β”œβ”€β”€ Apply via GitHub REST API
└── Non-fatal: errors warn but don't fail the workflow
```

Labels are applied exclusively from `Review-PR.ps1` Phase 4. No other script applies agent labels. This single-source design avoids label conflicts and simplifies debugging.

### How Labels Are Parsed

The `Parse-PhaseOutcomes` function in `Update-AgentLabels.ps1` reads `content.md` files from each phase directory:

| Source File | What's Parsed | Resulting Label |
|-------------|---------------|-----------------|
| `gate/content.md` | `**Result:** βœ… PASSED` | `s/agent-gate-passed` |
| `gate/content.md` | `**Result:** ❌ FAILED` | `s/agent-gate-failed` |
| `try-fix/content.md` | `**Selected Fix:** Candidate ...` | `s/agent-fix-win` |
| `try-fix/content.md` | `**Selected Fix:** PR ...` | `s/agent-fix-pr-picked` |
| `report/content.md` | `Final Recommendation: APPROVE` | `s/agent-approved` |
| `report/content.md` | `Final Recommendation: REQUEST CHANGES` | `s/agent-changes-requested` |
| *(missing report)* | No report file exists | `s/agent-review-incomplete` |

### Self-Bootstrapping

Labels are created automatically on first use via `Ensure-LabelExists`. No manual setup required. If a label already exists but has a stale description or color, it is updated.

---

## Querying Labels

All labels use the `s/agent-*` prefix, making them easy to filter on GitHub.

### Common Queries

```
# PRs the agent approved
is:pr label:s/agent-approved

# PRs where agent found a better fix
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment at line 107 says "PRs where agent found a better fix" but queries for s/agent-fix-pr-picked. This is semantically backwards.

According to the label definitions:

  • s/agent-fix-win = "AI found a better alternative fix than the PR"
  • s/agent-fix-pr-picked = "AI could not beat the PR fix β€” PR is the best"

So the query comment should say "PRs where agent could NOT beat the PR fix" or the query should use label:s/agent-fix-win instead.

Suggested change
# PRs where agent found a better fix
# PRs where agent could NOT beat the PR fix (PR fix was best)

Copilot uses AI. Check for mistakes.
is:pr label:s/agent-fix-pr-picked

# PRs where agent found better fix AND author implemented it
is:pr label:s/agent-changes-requested label:s/agent-fix-implemented

# PRs where tests don't catch the bug
is:pr label:s/agent-gate-failed

# Agent-reviewed PRs that are still open
is:pr is:open label:s/agent-reviewed

# All agent-reviewed PRs (total count)
is:pr label:s/agent-reviewed
```

### Metrics You Can Derive

| Metric | Query |
|--------|-------|
| Total agent reviews | `is:pr label:s/agent-reviewed` |
| Approval rate | Compare `label:s/agent-approved` vs `label:s/agent-changes-requested` counts |
| Gate pass rate | Compare `label:s/agent-gate-passed` vs `label:s/agent-gate-failed` counts |
| Fix win rate | Compare `label:s/agent-fix-win` vs `label:s/agent-fix-pr-picked` counts |
| Agent adoption rate | `label:s/agent-fix-implemented` / `label:s/agent-changes-requested` |
| Incomplete review rate | `label:s/agent-review-incomplete` / `label:s/agent-reviewed` |

---

## Implementation Details

### Files

| File | Purpose |
|------|---------|
| `.github/scripts/shared/Update-AgentLabels.ps1` | Label helper module (all label logic) |
| `.github/scripts/Review-PR.ps1` | Orchestrator that calls `Apply-AgentLabels` in Phase 4 |
| `.github/agents/pr/SHARED-RULES.md` | Documents label system for the PR agent |

### Key Functions

| Function | Description |
|----------|-------------|
| `Apply-AgentLabels` | Main entry point β€” parses phases and applies all labels |
| `Parse-PhaseOutcomes` | Reads `content.md` files, returns outcome/gate/fix results |
| `Update-AgentOutcomeLabel` | Applies one outcome label, removes conflicting ones |
| `Update-AgentSignalLabels` | Adds/removes gate and fix signal labels |
| `Update-AgentReviewedLabel` | Ensures tracking label is present |
| `Ensure-LabelExists` | Creates or updates a label in the repository |

### Design Principles

- **Idempotent**: Safe to re-run β€” checks before add/remove, GitHub ignores duplicate adds
- **Non-fatal**: Label failures emit warnings but never fail the overall workflow
- **Single source**: All labels applied from `Review-PR.ps1` only β€” no other scripts touch labels
- **Self-bootstrapping**: Labels are created on first use via GitHub API
- **Mutual exclusivity enforced**: Outcome labels and same-category signal labels automatically remove their counterpart

---

## Migrated From

The following old infrastructure was removed as part of this implementation:

- **`Update-VerificationLabels`** function in `verify-tests-fail.ps1` β€” removed (labels now come from `Review-PR.ps1` only)
- **`s/ai-reproduction-confirmed`** / **`s/ai-reproduction-failed`** labels β€” superseded by `s/agent-gate-passed` / `s/agent-gate-failed`
24 changes: 24 additions & 0 deletions .github/scripts/Review-PR.ps1
Original file line number Diff line number Diff line change
Expand Up @@ -531,6 +531,30 @@ if ($DryRun) {
}
}

# Phase 4: Apply Labels
Write-Host ""
Write-Host "╔═══════════════════════════════════════════════════════════╗" -ForegroundColor Blue
Write-Host "β•‘ PHASE 4: APPLY LABELS β•‘" -ForegroundColor Blue
Write-Host "β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•" -ForegroundColor Blue
Write-Host ""

$labelHelperPath = Join-Path $RepoRoot ".github/scripts/shared/Update-AgentLabels.ps1"
if (-not (Test-Path $labelHelperPath)) {
Write-Host "⚠️ Label helper missing, attempting targeted recovery..." -ForegroundColor Yellow
git checkout $savedHead -- $labelHelperPath 2>&1 | Out-Null
}

if (Test-Path $labelHelperPath) {
try {
. $labelHelperPath
Apply-AgentLabels -PRNumber $PRNumber -RepoRoot $RepoRoot
}
catch {
Write-Host "⚠️ Label application failed (non-fatal): $_" -ForegroundColor Yellow
}
} else {
Write-Host "⚠️ Label helper not found at: $labelHelperPath β€” skipping labels" -ForegroundColor Yellow
}
}
}

Expand Down
Loading