Add Azure DevOps CI investigation instructions with az CLI preference #34335
- Add .github/instructions/azdo-ci.instructions.md: teaches Copilot to always prefer az CLI for ADO queries, check availability first, and prompt the user to install/login if missing. Falls back to anonymous REST for dnceng-public with a noted limitation.
- Update .github/copilot-instructions.md: add Azure DevOps CI Access section under Development Environment Setup, marking az as strongly recommended with install/setup steps.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.sh | bash -s -- 34335
Or
iex "& { $(irm https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.ps1) } 34335"
Pull request overview
Adds repository guidance to steer CI/build investigations toward authenticated Azure DevOps access via the az CLI (and azure-devops extension), reducing reliance on anonymous REST calls that are rate-limited and can’t access internal org resources.
Changes:
- Added new `.github/instructions/azdo-ci.instructions.md` with `az`-first CI investigation workflow, common commands, and REST fallback guidance.
- Updated `.github/copilot-instructions.md` to recommend `az` for Azure DevOps CI investigation and link to the detailed instructions.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| .github/instructions/azdo-ci.instructions.md | Introduces az-preferred investigation guidance, command examples, and org access notes. |
| .github/copilot-instructions.md | Adds a brief “Azure DevOps CI Access” setup section pointing to the new instructions. |
.github/copilot-instructions.md
Outdated
- **Azure CLI (`az`)** — **strongly recommended** for investigating CI builds, test failures, and pipeline status
  - Install: `brew install azure-cli` (macOS) / `winget install Microsoft.AzureCLI` (Windows)
  - Setup: `az login && az extension add --name azure-devops`
The setup instructions say az login && az extension add --name azure-devops, but az login alone doesn’t authenticate Azure DevOps operations for many users; the azure-devops extension commonly needs az devops login (PAT / AZURE_DEVOPS_EXT_PAT). Suggest updating the “Setup” bullet to reflect the actual ADO auth flow so users don’t hit confusing 401/403 errors even after running az login.
Suggested change:
-   - Setup: `az login && az extension add --name azure-devops`
+   - Setup: `az login && az extension add --name azure-devops` **then** authenticate the extension with either `az devops login` (using a PAT) **or** by setting `AZURE_DEVOPS_EXT_PAT` to a valid PAT, to avoid 401/403 errors
**Before making ANY Azure DevOps API call**, check if `az` is installed and authenticated:

```bash
# Check if az is installed and logged in
az account show 2>/dev/null && az extension show --name azure-devops 2>/dev/null
```
The “check if az is installed and authenticated” step uses az account show (Azure subscription login) and bash-specific 2>/dev/null. Azure DevOps (azure-devops extension) authentication is typically via az devops login (PAT / AZURE_DEVOPS_EXT_PAT) and can work even when az account show fails; conversely, az account show can succeed while ADO auth is missing. Consider switching this check to an ADO call (e.g., az devops project list --org ...) and either make the snippet shell-agnostic or provide both bash + PowerShell variants.
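The ADO-level check suggested in this comment could look roughly like the following. This is a minimal POSIX shell sketch, not the content of the instructions file: the function name `check_ado_access`, its return codes, and its messages are hypothetical, and it assumes the azure-devops extension's `az devops project list` command accepts `--organization` and `--detect false`.

```bash
# Hypothetical helper: returns 0 only when az, the azure-devops extension,
# and access to the given ADO organization all work.
check_ado_access() {
  org=${1:-https://dev.azure.com/dnceng-public}

  # 1. Is the az CLI on PATH at all?
  if ! command -v az >/dev/null 2>&1; then
    echo "az CLI not found; install it before querying Azure DevOps" >&2
    return 1
  fi

  # 2. Is the azure-devops extension installed?
  if ! az extension show --name azure-devops >/dev/null 2>&1; then
    echo "azure-devops extension missing; run: az extension add --name azure-devops" >&2
    return 2
  fi

  # 3. Probe ADO itself instead of 'az account show', which only reflects
  #    the Azure subscription login and says nothing about ADO access.
  if ! az devops project list --organization "$org" --detect false >/dev/null 2>&1; then
    echo "cannot list projects in $org; authenticate with az devops login or AZURE_DEVOPS_EXT_PAT" >&2
    return 3
  fi

  return 0
}
```

A caller would run `check_ado_access https://dev.azure.com/dnceng-public` and stop with the printed guidance on any non-zero status. A PowerShell variant would still be needed for Windows users, as the comment notes.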
2. Login: az login
3. Extension: az extension add --name azure-devops
4. Defaults: `az devops configure --defaults organization=https://dev.azure.com/dnceng-public project=public`
In the “STOP and tell the user” setup snippet, step 2 recommends az login, but the Azure DevOps extension usually requires az devops login (PAT) or AZURE_DEVOPS_EXT_PAT to access organizations like dnceng-public/dnceng. Also, step 4 wraps the az devops configure ... command in backticks inside a fenced block, which makes copy/paste include literal backticks and fail. Update the snippet to use the correct ADO auth command and present commands without nested backticks.
Suggested change:
- 2. Login: az login
- 3. Extension: az extension add --name azure-devops
- 4. Defaults: `az devops configure --defaults organization=https://dev.azure.com/dnceng-public project=public`
+ 2. Login: az devops login # requires a Personal Access Token (PAT) or AZURE_DEVOPS_EXT_PAT
+ 3. Extension: az extension add --name azure-devops
+ 4. Defaults: az devops configure --defaults organization=https://dev.azure.com/dnceng-public project=public
…tions note

The separate instruction file had two problems:
- applyTo: "**" loaded 125 lines of CI content into every context (C# code, XAML, etc.)
- No applyTo value matches conversational CI questions without an open file

Replace with a concise note in copilot-instructions.md (always loaded), pointing to pr-build-status scripts for structured queries.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…triggers
- Get-PrBuildIds.ps1: diagnose when CI was never triggered (path filters,
draft PR, not queued) instead of silently returning empty output. Only
returns rows with valid BuildIds to downstream scripts.
- SKILL.md: add binlog analysis workflow (binlogtool) for MSBuild/XamlC/
NuGet failures where text logs say 'Build FAILED' with no detail.
- SKILL.md: expand trigger phrases ('why is CI red', 'build failed', etc.)
so the skill surfaces for more natural CI investigation questions.
- SKILL.md: add 'stop if tool missing' policy and 'focus on first error'
rule. Update description and version to 1.2.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Align with dotnet/android's naming convention for CI investigation skills. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Preserve all behavioral rules; cut prose, redundant examples, verbose descriptions, and bash code blocks that don't add instructional value. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Restore copilot-instructions.md to original content, keeping only the Azure DevOps CI Access section with minimal wording, skill rename, and az login clarification for dnceng-public. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tterns table

- Add az CLI as optional prereq for binlog artifact download
- Add binlogtool reconstruct (full text log) and doublewrites (double-write detection)
- Add Common Build Error Patterns table (CS/NU/XamlC/##[error]/TimeoutException/MT/BL)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove 'When to Use' section (duplicates front matter description)
- Add pipeline investigation priority order (maui-pr > devicetests > uitests)
- Add concrete binlog decision tree (when to use vs. skip)
- Fix bash context: rm -rf instead of Remove-Item
- Add --detect false to az pipelines command
- Remove duplicate Prerequisites section

Addresses feedback from Opus, Sonnet, and Codex reviews on PR #34335.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…fact name

maui-pr build artifacts use names like 'Windows_NT_Build Windows (Debug)_Attempt1' and are Container type (not PipelineArtifact). Update section to list artifacts first and use the correct download approach.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Helix work items can exit with code 0 while individual tests fail, reported only via ADO test results API (not in job logs). The script previously reported 0 failures for these builds.

Add Section 3 that queries ResultSummaryByBuild (public, no auth needed) to cross-check ADO test results:
- Warns with Get-HelixLogs guidance when failures exist but logs show clean
- Quietly notes test failure count when build errors already explain them

Verified against build 1325582 (8 test failures in passing Helix jobs).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
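The cross-check described in this commit boils down to comparing two counts. A minimal shell sketch of that decision follows. The actual script is PowerShell and is not shown here, so the function name `classify_ci_signal`, its two arguments, and the labels it prints are all hypothetical.

```bash
# Hypothetical sketch of the cross-check logic.
#   $1 = number of failures found in the job logs
#   $2 = number of failed tests reported by the ADO test results API
classify_ci_signal() {
  log_failures=$1
  ado_failures=$2

  if [ "$ado_failures" -gt 0 ] && [ "$log_failures" -eq 0 ]; then
    # Logs look clean but tests failed: the case the script used to miss.
    echo "hidden-test-failures"
  elif [ "$ado_failures" -gt 0 ]; then
    # Build errors already explain the failing tests; just note the count.
    echo "failures-explained-by-build-errors"
  else
    echo "no-test-failures-reported"
  fi
}
```

For a build like the one cited above, with clean logs and 8 failed tests, `classify_ci_signal 0 8` lands in the first branch, which is where the commit says the script warns and points to Get-HelixLogs.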
…act binlog download

Container-type build artifacts (e.g. Windows_NT_Build*) require auth and the ADO File Container API (5.0-preview) to download. az pipelines artifact download does not support them.

- Add Get-BuildBinlogs.ps1 that uses Bearer token to list and download .binlog files from Container artifacts
- Update SKILL.md binlog section to use the new script
- Verified: 7 binlogs present in Windows device test Container artifact

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…m format)

- Fix container ID regex: resource.data format is '#/ID/ArtifactName', not '#/ID/'
- Fix download format: use OctetStream (not 'file') for binary artifact download

Verified: downloads all 7 binlogs from Windows device test Container artifact

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
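For illustration, extracting the container ID from a `resource.data` value of the form `#/ID/ArtifactName` can be sketched in POSIX shell as below. The real fix lives in the PowerShell script's regex, which is not shown here. `parse_container_id` is a hypothetical helper, the sample ID in the usage note is made up, and the digits-only check is an assumption about the ID format.

```bash
# Hypothetical helper: print the container ID from a resource.data string
# such as '#/12345/ArtifactName'. Returns 1 if the string does not match.
parse_container_id() {
  data=$1

  # Require the '#/<something>/<anything>' shape; the artifact name after
  # the second slash may be empty or contain spaces and parentheses.
  case $data in
    '#/'*/*) ;;
    *) return 1 ;;
  esac

  rest=${data#'#/'}   # drop the leading '#/'
  id=${rest%%/*}      # keep everything before the next '/'

  # Assumption: container IDs are purely numeric.
  case $id in
    ''|*[!0-9]*) return 1 ;;
  esac

  printf '%s\n' "$id"
}
```

Calling `parse_container_id '#/12345/Windows_NT_Build Windows (Debug)_Attempt1'` prints `12345`. The same call also accepts the older `#/12345/` shape, so a parser written this way does not depend on whether the artifact name is present.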
…xt skill (#34438)

<!-- Please let the below note in for people that find this PR -->
> [!NOTE]
> Are you waiting for the changes in this PR to be merged?
> It would be very helpful if you could [test the resulting artifacts](https://github.com/dotnet/maui/wiki/Testing-PR-Builds) from this PR and let us know in a comment if this change resolves your issue. Thank you!

## Summary

This PR adopts the [`dotnet/arcade-skills`](https://github.com/dotnet/arcade-skills) plugin system for CI investigation in dotnet/maui, replacing the need for custom in-repo PowerShell scripts.

Two files are added:
1. **`.github/copilot/settings.json`** — repo-level plugin declaration that auto-installs the `dotnet-dnceng` plugin for all users
2. **`.github/skills/azdo-build-investigator/SKILL.md`** — thin MAUI-specific context supplement (~60 lines, no scripts)

---

## Background & Motivation

### The Problem

When investigating CI failures on dotnet/maui PRs, contributors and AI agents need to:
- Query Azure DevOps builds across 3 pipelines (`maui-pr`, `maui-pr-devicetests`, `maui-pr-uitests`)
- Dig into Helix test logs for device test failures
- Analyze MSBuild binlogs for obscure build failures
- Detect hidden test failures caused by XHarness exiting with code 0 even when tests fail

PR #34335 (`feature/azdo-ci-instructions`) addressed this with ~700 lines of custom PowerShell scripts. During review of that PR, we discovered [`dotnet/arcade-skills`](https://github.com/dotnet/arcade-skills) — a .NET engineering-maintained plugin that provides native MCP tooling for exactly this problem space, and already lists `dotnet/maui` as a supported repository.

### Why arcade-skills Instead of Custom Scripts

The `dotnet-dnceng` plugin in arcade-skills provides:

| MCP Server | Tool | Replaces |
|------------|------|---------|
| `ado-dnceng-public` | Native ADO queries via `@azure-devops/mcp` | `Get-BuildInfo.ps1`, `Get-BuildErrors.ps1`, `Get-PrBuildIds.ps1` |
| `hlx` | Helix test infrastructure via `lewing.helix.mcp` | `Get-HelixLogs.ps1` |
| `mcp-binlog-tool` | MSBuild binlog analysis via `baronfel.binlog.mcp` | `Get-BuildBinlogs.ps1` + manual `binlogtool` |

MCP tools are first-class AI primitives — the AI calls them directly with structured parameters rather than running shell scripts and parsing text output. This is more reliable and maintainable.

### Auto-Loading (No User Action Required)

The key mechanism that makes this work seamlessly: **`.github/copilot/settings.json`** supports `enabledPlugins` (introduced in Copilot CLI v0.0.422) — a "declarative plugin auto-install" that runs at session startup when a user opens this repository. Every user who opens dotnet/maui gets the `dotnet-dnceng` plugin with all its MCP servers automatically. No `/plugin install` command needed.

```json
{
  "extraKnownMarketplaces": [
    { "url": "https://github.com/dotnet/arcade-skills" }
  ],
  "enabledPlugins": ["dotnet-dnceng@dotnet-arcade-skills"]
}
```

---

## What the MAUI Context Skill Adds

The arcade-skills `ci-analysis` skill is excellent but contains outdated MAUI-specific information (it lists `maui-public` as the pipeline name, which is wrong). The thin `azdo-build-investigator/SKILL.md` provides corrections and MAUI-specific domain knowledge:

### Correct Pipeline Names/IDs

| Pipeline | Definition ID | Purpose |
|----------|--------------|---------|
| `maui-pr` | **302** | Main build — check first |
| `maui-pr-devicetests` | **314** | Helix device tests |
| `maui-pr-uitests` | **313** | Appium UI tests |

### XHarness Exit-0 Blind Spot

XHarness (used in `maui-pr-devicetests`) **exits with code 0 even when tests fail**. The ADO job shows ✅ "Succeeded" while actual test failures hide inside Helix work items. The SKILL.md documents how to detect this via the Helix `ResultSummaryByBuild` API.

This quirk was discovered while investigating PRs with the `s/agent-gate-failed` label where CI appeared green but tests were actually failing.

### Container Artifact Quirk for Binlogs

MAUI build artifacts are **Container type** (not `PipelineArtifact`), so standard `az pipelines runs artifact download` does not work for binlogs. The SKILL.md documents the correct download approach using the ADO File Container API with Bearer auth.

---

## Relationship to PR #34335

PR #34335 (`feature/azdo-ci-instructions`) adds the same investigation capability via 5 custom PowerShell scripts. This PR supersedes that approach. The knowledge gained building those scripts (XHarness exit-0 discovery, Container artifact API approach, pipeline IDs) is preserved in the SKILL.md here.

We recommend closing #34335 in favor of this approach, which:
- Has ~5% of the code to maintain
- Uses MCP tooling that will improve over time as arcade-skills evolves
- Auto-loads for all contributors without any setup

---

## Files Changed

```
.github/copilot/settings.json                     (new) — repo-level plugin auto-install
.github/skills/azdo-build-investigator/SKILL.md   (new) — MAUI-specific CI context
```

## Testing

- Verified `.github/copilot/settings.json` schema matches Copilot CLI v0.0.422+ `enabledPlugins` format
- Verified `dotnet-dnceng@dotnet-arcade-skills` resolves against the marketplace at `https://github.com/dotnet/arcade-skills/.github/plugin/marketplace.json`
- SKILL.md pipeline IDs verified against live ADO builds: maui-pr=302, maui-pr-devicetests=314, maui-pr-uitests=313

/cc @PureWeen

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Note
Are you waiting for the changes in this PR to be merged?
It would be very helpful if you could test the resulting artifacts from this PR and let us know in a comment if this change resolves your issue. Thank you!
Description
Adds instruction guidance so that GitHub Copilot (and other AI agents) prefer the `az` CLI when investigating Azure DevOps CI builds, test failures, and pipeline status.

Problem

When asking Copilot about CI failures, it defaults to anonymous `curl`/`Invoke-RestMethod` calls against ADO APIs. This works for `dnceng-public` but:
- … `dnceng`) builds at all

Changes

New: `.github/instructions/azdo-ci.instructions.md` (`applyTo: "**"`)
- … `az` CLI before making ADO API calls
- … `az` is missing
- … `az` commands for builds, timelines, logs, test results, and artifacts
- … `dnceng-public` (public) vs `dnceng` (auth required) distinction

Modified: `.github/copilot-instructions.md`
- … `az` CLI as strongly recommended with install/setup steps

What this does NOT change
- … `pr-build-status` PowerShell scripts (they provide valuable log pre-filtering for Helix and MSBuild errors)

Design decision

This approach was validated by multi-model review (Opus 4.6, Sonnet 4.6, Gemini 3 Pro, GPT-5.2 Codex). All agreed:
- … `az` cannot, and provide anonymous access for contributors without `az`
- … `az` should be optional — `dnceng-public` is genuinely public, requiring auth would be a regression

See also: dotnet/android#10885 for a complementary skill-based approach (binlog analysis).