
fix: prevent azd+Terraform template variable interpolation failures #1585

Open
tmeschter wants to merge 32 commits into microsoft:main from tmeschter:260330-Issue-1558

Member

@tmeschter tmeschter commented Mar 30, 2026

Summary

Fixes #1558. Addresses the azd+Terraform template variable interpolation gap that caused the terraform-azure-container-apps-deploy integration test to time out at 30 minutes.

The root cause: azure-prepare generated main.tfvars.json with Go-style template variables ({{ .Env.AZURE_ENV_NAME }}), but azd does NOT interpolate these in .tfvars.json files -- only in azure.yaml. Literal template strings were passed to Terraform, causing cascading failures (resource naming errors, state conflicts, retry spirals).

Changes

Fix 1: azure-prepare (v1.1.3 → v1.1.4) -- Prevent the problem at source

  • plugin/skills/azure-prepare/references/recipes/azd/terraform.md
    • Added warning against using Go-style template variables in .tfvars.json files
    • Documented correct variable passing: azd auto-mapping, TF_VAR_* env vars, terraform.tfvars HCL
    • Removed incorrect env("DATABASE_NAME") example that encouraged template-style usage
    • Added troubleshooting entries for template interpolation failures

Fix 2: azure-validate (v1.0.3 → v1.0.4) -- Catch it before deployment

  • plugin/skills/azure-validate/references/recipes/terraform/README.md
    • Added Step 10: Template Variable Resolution Check for azd+Terraform
    • Includes grep commands to detect unresolved {{ .Env.* }} patterns and .tfvars.json files
    • Provides remediation steps so the error surfaces during validation, not at deploy time
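
As a rough illustration of the Step 10 detection described above, the sketch below searches Terraform inputs for unresolved Go-style expressions. The function name `find_go_templates` and the default `infra` directory are assumptions made for this example, not the exact commands added to the recipe.

```shell
# Hypothetical helper (not the recipe's exact command): print every line in
# *.tf and *.tfvars.json files that still contains a Go-style {{ .Env.* }}
# expression. Exit status follows grep: 0 = matches found, 1 = none.
find_go_templates() {
  grep -rnE --include='*.tf' --include='*.tfvars.json' \
    '\{\{[[:space:]]*\.Env\.' "${1:-infra}"
}
```

A validation step could call `find_go_templates infra` and treat a zero exit status as a failure, since any match means a literal template string would reach Terraform.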

Fix 3: azure-deploy (v1.0.11 → v1.0.12) -- Document error and add pre-deploy guard

  • plugin/skills/azure-deploy/references/recipes/azd/errors.md
    • Added "Unresolved Terraform Template Variables" error section with full symptom/cause/solution
  • plugin/skills/azure-deploy/references/pre-deploy-checklist.md
    • Added Step 9: Verify Terraform Variable Resolution (azd+Terraform only)
  • plugin/skills/azure-deploy/references/recipes/terraform/errors.md
    • Added entries for template interpolation failure and state clearing behavior

Testing

These are documentation/skill-instruction changes (no runtime code). The fixes guide the agent to:

  1. Never generate main.tfvars.json with template expressions (prevention)
  2. Detect unresolved template variables during azure-validate (early detection)
  3. Recover correctly if the error is encountered during azure-deploy (error handling)

…icrosoft#1558)

Address azd template variable interpolation gap that caused deployment
timeouts in terraform-azure-container-apps-deploy integration tests.

azure-prepare (v1.0.13):
- Add warning against using Go-style template variables in .tfvars.json
- Document correct variable passing: azd auto-mapping, TF_VAR_* env vars
- Remove incorrect env() function usage in variable example
- Add troubleshooting entries for template interpolation errors

azure-validate (v1.0.3):
- Add Step 10: Template Variable Resolution Check for azd+Terraform
- Detect unresolved {{ .Env.* }} patterns and .tfvars.json files
- Provide remediation steps to fix before deployment

azure-deploy (v1.0.9):
- Add Unresolved Terraform Template Variables error section with solution
- Add pre-deploy Step 9: Verify Terraform Variable Resolution
- Add Terraform state management error entries
- Document azd state clearing behavior and remote backend recommendation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 30, 2026 19:08
Contributor

Copilot AI left a comment


Pull request overview

Updates the Azure skill documentation to prevent and detect azd + Terraform variable interpolation failures caused by unresolved Go-template expressions being passed into Terraform via .tfvars.json files (Fixes #1558).

Changes:

  • Bumps skill versions for azure-prepare, azure-validate, and azure-deploy.
  • Adds azd+Terraform guidance to prevent generating main.tfvars.json with {{ .Env.* }} templates and recommends alternative variable passing approaches.
  • Adds pre-validation and pre-deploy checks plus error-recovery documentation to surface unresolved template variables earlier.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| plugin/skills/azure-validate/SKILL.md | Version bump to reflect documentation updates. |
| plugin/skills/azure-validate/references/recipes/terraform/README.md | Adds Step 10 to detect unresolved {{ .Env.* }} and .tfvars.json usage for azd+Terraform. |
| plugin/skills/azure-prepare/SKILL.md | Version bump to reflect documentation updates. |
| plugin/skills/azure-prepare/references/recipes/azd/terraform.md | Adds warning and revised guidance for passing variables to Terraform without template interpolation. |
| plugin/skills/azure-deploy/SKILL.md | Version bump to reflect documentation updates. |
| plugin/skills/azure-deploy/references/recipes/terraform/errors.md | Adds new Terraform error entries related to unresolved templates and state persistence behavior. |
| plugin/skills/azure-deploy/references/recipes/azd/errors.md | Adds a dedicated error section for unresolved Terraform template variables with remediation steps. |
| plugin/skills/azure-deploy/references/pre-deploy-checklist.md | Adds a pre-deploy guard step to detect unresolved templates / main.tfvars.json before running azd up. |
Comments suppressed due to low confidence (1)

plugin/skills/azure-prepare/references/recipes/azd/terraform.md:132

  • The “azd auto-mapping” description is internally inconsistent: it says azd passes AZURE_ENV_NAME/AZURE_LOCATION as Terraform variables “when they match variable names in variables.tf”, but the common Terraform variable names shown (environment_name, location) do not match those AZD env var names. Please reword this to accurately describe the mapping behavior (e.g., whether azd maps specific AZURE_* env vars to conventional Terraform variable names, or only passes variables whose names exactly match).
> **Do NOT generate `main.tfvars.json`** with template variables. Instead, pass variables to Terraform
> using one of these methods (in order of preference):
>
> 1. **azd auto-mapping** — azd automatically passes `AZURE_ENV_NAME`, `AZURE_LOCATION`, and
>    `AZURE_SUBSCRIPTION_ID` as Terraform variables when they match variable names in `variables.tf`
> 2. **`TF_VAR_*` environment variables** — Set via `azd env set TF_VAR_myvar value`
> 3. **`terraform.tfvars` (HCL format)** — Static defaults only; no template expressions
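
For the second method in the quoted guidance, Terraform reads a variable from the environment only when the name is `TF_VAR_` followed by the exact, case-sensitive variable name. The sketch below shows that naming rule; `to_tf_var_name` is a hypothetical helper, and `database_name` and `mydb` are placeholder values.

```shell
# Hypothetical helper: derive the TF_VAR_ name for an upper-case key,
# assuming the Terraform variable uses the lower-case form of the key.
to_tf_var_name() {
  printf 'TF_VAR_%s\n' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
}

# Export the value under the derived name (TF_VAR_database_name) so that
# Terraform can resolve var.database_name from the environment.
export "$(to_tf_var_name DATABASE_NAME)=mydb"
```

The helper only illustrates the naming rule. It does not describe how azd itself forwards values to Terraform.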

tmeschter and others added 3 commits March 30, 2026 12:33
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 30, 2026 20:31
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 30, 2026 21:25
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (2)

plugin/skills/azure-deploy/references/pre-deploy-checklist.md:145

  • Step 9 fails only if infra/main.tfvars.json exists, but the azure-validate recipe flags any *.tfvars.json files as problematic for azd+Terraform. This mismatch can let other .tfvars.json files slip past the pre-deploy guard (or confuse readers). Consider either checking for all *.tfvars.json here too, or narrowing the validation guidance so both documents align.
# Fail if main.tfvars.json exists (should not be used with azd)
if test -f infra/main.tfvars.json; then
  echo "ERROR: Remove main.tfvars.json — use TF_VAR_* env vars instead"
  exit 1
fi
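
One way to align the pre-deploy guard with the validation recipe, as the comment suggests, is to inspect every `*.tfvars.json` file rather than only `infra/main.tfvars.json`. The following is a minimal sketch under that assumption. Both function names are hypothetical, and the guard flags unresolved Go-style templates instead of rejecting a file for merely existing.

```shell
# Hypothetical helper: list every *.tfvars.json file under a directory,
# skipping Terraform's .terraform/ working directory.
list_tfvars_json() {
  find "${1:-infra}" -type f -name '*.tfvars.json' ! -path '*/.terraform/*'
}

# Hypothetical guard: return 1 if any listed file still contains a
# Go-style {{ .Env.* }} expression, printing the offending paths to stderr.
check_tfvars_json() {
  bad=$(list_tfvars_json "$1" | while IFS= read -r f; do
    grep -lE '\{\{[[:space:]]*\.Env\.' "$f" || true
  done)
  if [ -n "$bad" ]; then
    echo "ERROR: unresolved Go-style template variables in:" >&2
    printf '%s\n' "$bad" >&2
    return 1
  fi
  return 0
}
```

Using the same file listing in both the validation recipe and the pre-deploy checklist would keep the two documents from drifting apart.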

plugin/skills/azure-prepare/references/recipes/azd/terraform.md:257

  • The example implies that azd env set DATABASE_NAME mydb will be automatically passed to Terraform as var.database_name. Elsewhere in this PR the documented “auto-mapping” is specifically for AZURE_ENV_NAME, AZURE_LOCATION, and AZURE_SUBSCRIPTION_ID; Terraform itself only auto-reads environment variables via the TF_VAR_ prefix. To avoid misleading guidance and broken deployments, clarify what azd actually maps automatically and/or update the example to use TF_VAR_database_name (or the supported azd→Terraform variable mechanism).
azd automatically maps its environment variables to Terraform variables. Define the variable
in `variables.tf` and set it via `azd env set`:

```bash
# Set azd variable
azd env set DATABASE_NAME mydb
# In variables.tf: azd passes this automatically
variable "database_name" {
  type    = string
}
```

Collaborator

@wbreza wbreza left a comment


🔍 Code Review: PR #1585

What Looks Good

  • Defense-in-depth strategy is excellent — the prevent (prepare) → detect (validate) → recover (deploy) layering across three skills is great architecture
  • Root cause correctly identified — Go-style {{ .Env.* }} template syntax truly is the problem in .tfvars.json files
  • Correctly removes the bogus env("DATABASE_NAME") Terraform function that doesn't exist in Terraform >= 0.12
  • Version bumps are correct across all three skills
  • The "State cleared on each azd provision" error entry captures a real gotcha

Key Concern: Remediation Contradicts Standard azd Patterns

The diagnosis is spot-on, but the remediation goes too far: it tells the agent to delete main.tfvars.json and never generate it, when main.tfvars.json is actually azd's standard parameter template file for Terraform projects. The azd source code (terraform_provider.go) reads this file, runs envsubst to substitute `${VAR}` references, then passes the resolved file to Terraform via -var-file=. Microsoft's own sample templates (e.g., `todo-nodejs-mongo-terraform`) use exactly this pattern.

The correct fix should be: *"Use `${VAR}` syntax in main.tfvars.json, not Go-style `{{ .Env.* }}`"*, rather than eliminating the file entirely.
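
To make the substitution step concrete, the sketch below imitates envsubst-style replacement with `sed` for two placeholders. It is an illustration only, not azd's implementation. `render_tfvars` is a hypothetical name, the `${AZURE_ENV_NAME}` and `${AZURE_LOCATION}` placeholders are assumed for the example, and values containing `|` or `&` would need escaping.

```shell
# Hypothetical helper: print a tfvars template with ${AZURE_ENV_NAME} and
# ${AZURE_LOCATION} replaced by the supplied values.
# Usage: render_tfvars <template-file> <env-name> <location>
render_tfvars() {
  sed -e 's|[$]{AZURE_ENV_NAME}|'"$2"'|g' \
      -e 's|[$]{AZURE_LOCATION}|'"$3"'|g' "$1"
}
```

A Go-style `{{ .Env.AZURE_ENV_NAME }}` expression passes through this kind of substitution unchanged, which matches the failure described in the PR summary, where literal template strings reached Terraform.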

Similarly, the "azd auto-mapping" claims (that azd automatically passes environment variables to Terraform by matching names in variables.tf) don't reflect azd's actual mechanism. Variables flow through main.tfvars.json substitution or explicit TF_VAR_* environment variables.

It looks like there are patterns across the PR that contradict the existing azd patterns for Terraform development. If any of the existing azd patterns are wrong or need updating, it'd be great to figure out what those are and collaborate on getting them fixed — rather than working around them. Let's align the guidance to match how azd actually works.

Findings Summary

| Priority | Count | Description |
| --- | --- | --- |
| 🔴 Critical | 3 | Incorrect main.tfvars.json prohibition; wrong auto-mapping claims; misleading variable guidance |
| 🟠 High | 2 | Wrong remediation approach; grep pattern scope issues |
| 🟡 Medium | 2 | Pre-deploy check would reject standard templates; repeated content (~500+ tokens) |
| 🟢 Low | 2 | Prior Copilot bot comments on get-value are incorrect (PR is right); legacy spec note is valid |
| Total | 9 | |

Overall Assessment: Comment

Note: The 3 unresolved Copilot bot comments about azd env get-value vs get-values are incorrect — both are valid azd CLI commands. The PR's singular form is the correct one for single-value retrieval.

fanyang-mono and others added 12 commits April 2, 2026 10:13
…ent confusion (microsoft#1584)

* fix: rename .azure/plan.md to .azure/deployment-plan.md to prevent confusion with session-state plan.md

The agent was confusing the workspace deployment plan (.azure/plan.md) with the
session-state plan.md file, causing the 'creates correct files for AZD with Bicep
recipe' integration test to fail (issue microsoft#1562).

Renaming to deployment-plan.md eliminates the name collision and makes the file's
purpose self-documenting. Updated all references across azure-prepare,
azure-validate, and azure-deploy skills, their reference docs, and all tests.

Closes microsoft#1562

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: update azure-deploy trigger keyword snapshots for deployment-plan rename

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ft#1570)

Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.34.1 to 4.35.1.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](github/codeql-action@3869755...c10b806)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.35.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…#1533)

* feat: remove strong verbiage

* feat: update phrasing

* fix: undo tool name change

* chore: bump version to 1.0.1

---------

Co-authored-by: Michael Ren <mren@microsoft.com>
…rosoft#1586)

* Add Docker build context validation to azure-validate skill

Pre-validate Docker build context during azure-validate by checking
for package-lock.json when npm ci is specified in a Dockerfile. This
prevents Docker build failures during azd package/up that waste time
and can push deployments past test timeouts.

Changes:
- AZD recipe: Add step 9 (Docker Build Context Validation) between
  Build Verification and Package Validation
- AZCLI recipe: Enhance Docker Build step with build context
  pre-validation before attempting docker build
- AZD/AZCLI errors: Add npm ci / package-lock.json missing error entry
- Bump azure-validate version to 1.0.3
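
The build-context check described in this commit message could look roughly like the sketch below. The function name `check_npm_ci_context` and the error wording are assumptions for this example, not the text added to the recipes.

```shell
# Hypothetical pre-validation: if the Dockerfile in a build context runs
# `npm ci`, require a package-lock.json in the same context.
check_npm_ci_context() {
  context_dir="${1:-.}"
  dockerfile="$context_dir/Dockerfile"
  # Nothing to validate when the build context has no Dockerfile.
  [ -f "$dockerfile" ] || return 0
  if grep -q 'npm ci' "$dockerfile" && [ ! -f "$context_dir/package-lock.json" ]; then
    echo "ERROR: Dockerfile runs 'npm ci' but package-lock.json is missing in $context_dir" >&2
    return 1
  fi
  return 0
}
```

Running a check like this before `azd package` or `azd up` surfaces the missing lock file in seconds instead of partway through a Docker build.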

Fixes microsoft#1557

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Update plugin/skills/azure-validate/references/recipes/azd/README.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update plugin/skills/azure-validate/references/recipes/azd/README.md

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…upport (microsoft#1605)

Add .claude-plugin/plugin.json to the sync-to-microsoft-azure-skills job
so the Azure plugin is discoverable in the Claude marketplace. Updates
the copy, URL replacement, version restore, and version bump steps.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…icrosoft#1606)

* Copy .claude-plugin/plugin.json to repo root for Claude marketplace support

Add .claude-plugin/plugin.json to the sync-to-microsoft-azure-skills job
so the Azure plugin is discoverable in the Claude marketplace. Updates
the copy, URL replacement, version restore, and version bump steps.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Copy hooks to top-level folder in azure-skills in addition to skills

Update the sync-to-microsoft-azure-skills job in the publish pipeline
to also copy hooks/ and copilot-hooks.json to the repo root, matching
how skills/ is already copied. Also add the new paths to the URL
replacement step.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ments (microsoft#1220)

* update azure-prepare skill to check subscription policies

* update azure-prepare skill to check functionality before deploying

* Add role assignment verification step to azure-prepare skill

Add new Phase 2 step 5 (Verify Role Assignments) between security
hardening and functional verification. Includes reference doc with
service-to-role mapping table, MCP tool usage, and common RBAC
mistakes (e.g., generic Contributor lacking data-plane access).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Update plugin/skills/azure-prepare/references/role-verification.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update plugin/skills/azure-prepare/references/functional-verification.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Add live role verification step to azure-validate skill

Add step 4 (Live Role Verification) to query Azure for provisioned
RBAC assignments and cross-check against expected roles. Complements
the static role check in azure-prepare: prepare checks generated
Bicep/Terraform, validate checks live Azure state.

Includes reference doc with MCP tool usage, CLI commands, common
issues table, and decision tree for pass/fail criteria.

Bumps azure-validate version 1.0.0 -> 1.0.1.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Clarify azure-prepare role check as static only

Replace MCP live-query section with static code review guidance.
Live role verification is the responsibility of azure-validate
step 4 (live-role-verification.md). This removes the overlap
between the two skills.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: move role verification across prepare/validate/deploy skills

- Remove static role check (step 5) from azure-prepare — prepare just generates
- Add static role check as step 4 in azure-validate (pre-deployment)
- Move live role check from azure-validate step 4 to azure-deploy step 8 (post-deployment)
- Move role-verification.md from azure-prepare to azure-validate references
- Move live-role-verification.md from azure-validate to azure-deploy references
- Update all step number cross-references in functional-verification.md
- Bump versions: prepare 1.0.6->1.0.7, validate 1.0.1->1.0.2, deploy 1.0.5->1.0.6

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: add azure__role to azure-deploy MCP Tools table

Step 8 (Live Role Verification) references azure__role for RBAC
assignment listing, but the tool was missing from the MCP Tools
table. Agents could incorrectly assume only the three listed tools
are available. Bump version 1.0.6 -> 1.0.7.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* update test snapshots

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* update the versions and added integration/ unit tests

* Update plugin/skills/azure-validate/references/role-verification.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* update skills and the test runs

* update snapshots

* update to correct versions

* update reference name

* update versioning

* update steps

* fix: update azure-deploy trigger test snapshots

Keywords were removed from the SKILL.md description in a previous PR
but the trigger test snapshots were not regenerated, causing 2 snapshot
failures in the pipeline.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: update azure-deploy trigger test snapshots

Keywords were removed from the SKILL.md description in a previous PR
but the trigger test snapshots were not regenerated, causing 2 snapshot
failures in the pipeline.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* update snapshot

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…) (microsoft#1604)

The workflow routing entries loaded durable.md and the DTS README but not
bicep.md — so the agent had overview docs but not the Bicep patterns
needed to generate .bicep files. Also adds 'order processing' keyword.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ost-forecast and azure cost-optimization skills (microsoft#1221)

* initial implementation

* update the guardrails for query and forecast

* reduce token limit of reference files

* update unit tests

* fix breaking PR checks

* fix pr check errors and code review comments

* refactor to azure-cost (#1)

* update tests and references to combined azure cost skill

* Remove unused test fixture files

Delete cost-query-sample.json and cost-forecast-sample.json from
tests/azure-cost/fixtures/ as they are not referenced by any test
files. No other skills in the repo use fixture files either, so
these add maintenance overhead without value.

Addresses PR review comment microsoft#14 and microsoft#15 (fixtures removed entirely
rather than fixing hard-coded dates, since they were unused).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* updates tests

* Consolidate azure-cost tests to standard 3-file layout, fix CI gates

- Consolidate 12 test files into standard 3-file structure (unit/triggers/integration)
- Rewrite integration tests using canonical withTestResult pattern
- Move all positive trigger prompts into triggers.test.ts
- Move all sub-area unit assertions into unit.test.ts
- Delete 9 redundant sub-area test files
- Regenerate snapshot with Jest 30 header format
- Bump sensei version 1.0.1 -> 1.0.2
- Bump azure-prepare version 1.0.10 -> 1.0.11

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix PR review comments: canonical azqr tool name, table formatting

- Change azure__extension_azqr to mcp_azure_mcp_extension_azqr in SKILL.md
- Fix missing space in 429 table row in cost-forecast/error-handling.md

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address PR review comments: MCP tools table, code block languages, remove phantom skill

- Add azure__extension_azqr and azure__aks to MCP Tools table for consistency
- Add yaml language to azqr code block in SKILL.md
- Add text language to portal link code block in report-template.md
- Remove non-existent azure-create-app row from tests/README.md coverage grid

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: extract azure-cost workflows into separate reference files

Move each workflow (query, optimization, forecast) into dedicated
reference files under references/ for progressive disclosure. This
reduces SKILL.md from 575 lines (23KB) to 139 lines (7KB), so the
agent only loads the workflow it needs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor: move workflow files into their respective folders

Move cost-query-workflow.md, cost-optimization-workflow.md, and
cost-forecast-workflow.md from references/ into cost-query/,
cost-optimization/, and cost-forecast/ as workflow.md. Update all
links in SKILL.md and cross-references. Bump version to 1.0.2.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fixes skill quality issues

* fix: update azure-cost tests for refactored skill structure

- Update snapshot to match new description with DO NOT USE FOR clause
- Update unit tests to load workflow files directly (content moved from
  SKILL.md to cost-query/, cost-forecast/, cost-optimization/ folders)
- Fix heading level assertions (## not ### in standalone workflow files)
- Remove 3 shouldNotTrigger prompts that contain cost keywords and
  correctly trigger the keyword-based matcher

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: reset azure-cost version to 1.0.0 and fix YAML comment syntax

- Reset version to 1.0.0 for new skill directory (was incorrectly 1.0.3)
- Change // optional to # optional in YAML code block

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Sai Koumudi Kaluvakolanu <saikoumudi@gmail.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…soft#1649)

* Document principal type mismatch error in AZD errors reference

AZD base templates (e.g. functions-quickstart-python-http-azd) create RBAC
role assignments with hardcoded principalType 'User' for the deploying
identity. In CI/CD where a service principal is used, ARM rejects this
with a PrincipalType mismatch error. The agent had no guidance for this
failure and spent multiple retries before finding the fix.

Adding this to the AZD errors reference gives the agent a direct path to
the solution: set allowUserIdentityPrincipal to false in main.bicep. It
also warns against the ineffective workaround of clearing
AZURE_PRINCIPAL_ID.

Fixes microsoft#1624

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Update plugin/skills/azure-deploy/references/recipes/azd/errors.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Harshaa and others added 7 commits April 2, 2026 10:14
Co-authored-by: Harsha Nair <hnair@microsoft.com>
…icrosoft#1498)

* feat: add GEPA integration to sensei skill + quality score workflow

Add GEPA (Genetic-Pareto) evolutionary optimization as an optional
enhancement to sensei's Ralph loop for automated SKILL.md improvement.

Changes:
- .github/skills/sensei/SKILL.md: Added --gepa flag, GEPA mode docs,
  Step 5-GEPA in the Ralph loop
- .github/skills/sensei/scripts/gepa/auto_evaluator.py: Auto-discovers
  test harness at runtime, builds GEPA evaluators, scores/optimizes skills
- pipelines/gepa-quality-score.yml: PR quality gate that scores SKILL.md
  quality and posts results as PR comment

The auto-evaluator requires zero manual configuration. It reads
triggers.test.ts to extract shouldTrigger/shouldNotTrigger arrays
and builds a composite evaluator (content quality + trigger accuracy).

Existing tests are NOT replaced or modified.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: address PR review feedback for GEPA integration

- Bump sensei SKILL.md version 1.0.0 → 1.0.2 (fixes Skill Structure CI)
- Remove unused imports: sys, dataclass, field (fixes CodeQL warnings)
- Extract strip_frontmatter() helper to replace fragile content.index()
  parsing that could raise ValueError on malformed frontmatter
- Deduplicate frontmatter stripping logic between score_skill/optimize_skill
- Add explicit permissions block (contents: read, pull-requests: write)
- Use sticky comment pattern (<!-- gepa-quality-score --> marker) to avoid
  PR comment spam on re-runs
- Fix display results to match workflow_dispatch single-skill input
- Rename quality gate step to '(advisory)' to clarify non-blocking behavior

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: skip PR comment step for forked PRs

Forked PRs have reduced GITHUB_TOKEN permissions, which would cause
the comment step to fail. Only post comments when the PR originates
from the same repository.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: strip comments in trigger parsing + clarify GEPA step scope

- Strip single-line (//) and multi-line (/* */) comments from trigger
  test arrays before extracting strings, preventing commented-out
  example prompts from polluting trigger accuracy scoring
- Fix SKILL.md step 5b to clarify GEPA only replaces step 5 (IMPROVE
  FRONTMATTER), not step 6 (IMPROVE TESTS)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: correct docstring and SKILL.md to reflect actual evaluator behavior

The evaluator parses trigger prompt arrays and uses content heuristics
for scoring — it does not execute Jest tests or incorporate test
pass/fail results. Updated docs to accurately describe this.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: address round 3 review feedback

- Remove unused params: as_json from score_skill, fast from build_evaluator
- Pin all actions to commit SHAs matching repo convention (checkout v6,
  setup-python v6.2.0, upload-artifact v7.0.0, github-script v8.0.0)
- Pin gepa dependency to v0.7.0 for reproducible CI
- Remove DO NOT USE FOR from scoring criteria (conflicts with repo
  guidance that discourages it due to keyword contamination risk)
- Add quality_score_raw field for full-precision threshold comparisons
- Enhance parse_trigger_arrays to resolve ...varName spread patterns
  by extracting strings from referenced arrays in the same file
- Clarify SKILL.md step 5b: GEPA uses trigger definitions as config,
  does not execute Jest tests
- Add NOTE about future workflow_run commenting pattern migration

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix: address PR review feedback — split workflow, fix regex, update docs

- Split gepa-quality-score.yml into read-only scoring workflow +
  workflow_run-triggered commenter (gepa-quality-score-comment.yml),
  matching the repo's existing pr.yml / pr-comment.yml pattern
- Fix API key regex to also match 'api key:' with whitespace separator
- Update PR description to clarify ASI uses heuristic scoring
  (Jest integration is planned for future iteration)
- Remove pull-requests:write from scoring workflow permissions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Bumps the minor-and-patch group in /scripts with 4 updates: [@vitest/coverage-v8](https://github.com/vitest-dev/vitest/tree/HEAD/packages/coverage-v8), [fast-xml-parser](https://github.com/NaturalIntelligence/fast-xml-parser), [typescript-eslint](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/typescript-eslint) and [vitest](https://github.com/vitest-dev/vitest/tree/HEAD/packages/vitest).


Updates `@vitest/coverage-v8` from 4.1.0 to 4.1.2
- [Release notes](https://github.com/vitest-dev/vitest/releases)
- [Commits](https://github.com/vitest-dev/vitest/commits/v4.1.2/packages/coverage-v8)

Updates `fast-xml-parser` from 5.5.8 to 5.5.9
- [Release notes](https://github.com/NaturalIntelligence/fast-xml-parser/releases)
- [Changelog](https://github.com/NaturalIntelligence/fast-xml-parser/blob/master/CHANGELOG.md)
- [Commits](NaturalIntelligence/fast-xml-parser@v5.5.8...v5.5.9)

Updates `typescript-eslint` from 8.57.1 to 8.57.2
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/typescript-eslint/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v8.57.2/packages/typescript-eslint)

Updates `vitest` from 4.1.0 to 4.1.2
- [Release notes](https://github.com/vitest-dev/vitest/releases)
- [Commits](https://github.com/vitest-dev/vitest/commits/v4.1.2/packages/vitest)

---
updated-dependencies:
- dependency-name: "@vitest/coverage-v8"
  dependency-version: 4.1.2
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: minor-and-patch
- dependency-name: fast-xml-parser
  dependency-version: 5.5.9
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: minor-and-patch
- dependency-name: typescript-eslint
  dependency-version: 8.57.2
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: minor-and-patch
- dependency-name: vitest
  dependency-version: 4.1.2
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: minor-and-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
)

* Initial plan

* Add kvenkatrajan as codeowner for entra-app-registration

Agent-Logs-Url: https://github.com/microsoft/GitHub-Copilot-for-Azure/sessions/0062a31a-0103-4dbf-a191-8264b9deea81

Co-authored-by: kvenkatrajan <102772054+kvenkatrajan@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: kvenkatrajan <102772054+kvenkatrajan@users.noreply.github.com>
…ing claims

Key changes based on wbreza's review of PR microsoft#1585:

- Replace 'Do NOT generate main.tfvars.json' with correct guidance:
  use ${VAR} syntax (azd envsubst), not Go-style {{ .Env.* }}
- Remove incorrect 'azd auto-mapping' claims — variables flow via
  main.tfvars.json substitution or explicit TF_VAR_* env vars
- Fix pre-deploy check: validate syntax in main.tfvars.json instead
  of rejecting the file's existence
- Scope grep patterns with --include='*.tf' --include='*.tfvars.json'
  to avoid false positives from .terraform/ and READMEs
- Align grep patterns consistently across all files
- Update remediation steps to fix syntax rather than delete files
- Add main.tfvars.json example with correct ${VAR} syntax

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…-validate 1.0.4

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 2, 2026 17:18
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

plugin/skills/azure-validate/references/recipes/terraform/README.md:133

  • `azd env get-value` is used here, but the rest of the repo consistently documents `azd env get-values` (plural) for reading environment variables (e.g., plugin/skills/azure-deploy/references/pre-deploy-checklist.md:113-117). If `get-value` isn’t supported by the azd version users have, this command will fail; please switch to the documented `get-values` pattern (and extract the single variable) or use the existing `eval $(azd env get-values)` approach used elsewhere in the repo.
2. For additional variables, use **`TF_VAR_*` environment variables**:
   ```bash
   azd env set TF_VAR_environment_name "$(azd env get-value AZURE_ENV_NAME)"
   ```
  1. Verify that variables.tf declares all required variables
</details>

Copilot AI review requested due to automatic review settings April 2, 2026 17:27
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

plugin/skills/azure-validate/references/recipes/terraform/README.md:132

  • `azd env get-value <NAME>` isn’t used elsewhere in this repo’s docs (they consistently use `azd env get-values`). To avoid relying on a possibly-nonexistent subcommand and to match existing guidance, switch this example to `azd env get-values` (then grep/parse) or `eval $(azd env get-values)` before referencing `$AZURE_ENV_NAME`.
2. For additional variables, use **`TF_VAR_*` environment variables**:
   ```bash
   azd env set TF_VAR_environment_name "$(azd env get-value AZURE_ENV_NAME)"
   ```
</details>
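
The "get-values, then grep/parse" alternative suggested above could look roughly like the sketch below. It assumes `azd env get-values` prints one `KEY="value"` pair per line; `get_env_value` is a hypothetical helper name, not part of azd or this repo.

```shell
#!/bin/sh
# Hypothetical helper: read KEY="value" lines on stdin and print the value
# for the variable named in $1. Prints nothing if the name is absent.
get_env_value() {
  grep -E "^$1=" | head -n 1 | cut -d= -f2- | sed -e 's/^"//' -e 's/"$//'
}

# Assumed usage (not executed here, since it needs an azd environment):
#   env_name="$(azd env get-values | get_env_value AZURE_ENV_NAME)"
#   azd env set TF_VAR_environment_name "$env_name"
```

Reading from stdin keeps the parsing step independent of azd, so it can be checked against sample output before being wired into a real environment.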

@tmeschter
Member Author

@wbreza This is ready for another look.

Copilot AI review requested due to automatic review settings April 3, 2026 22:22
Contributor

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Updates Azure skill documentation to prevent and detect unresolved Go-template variables being passed into Terraform via main.tfvars.json in azd+Terraform projects, addressing integration test timeouts and cascading deploy failures (Fixes #1558).

Changes:

  • Documented the azd+Terraform interpolation limitation and the correct ${VAR} / TF_VAR_* patterns.
  • Added validation + pre-deploy checks to detect unresolved {{ .Env.* }} in Terraform inputs early.
  • Expanded error troubleshooting guidance for unresolved template variables and state behavior.
Show a summary per file

| File | Description |
| --- | --- |
| plugin/skills/azure-validate/references/recipes/terraform/README.md | Adds a validation step to detect unresolved Go-style template variables in Terraform inputs. |
| plugin/skills/azure-validate/SKILL.md | Bumps azure-validate skill version to reflect the documentation update. |
| plugin/skills/azure-prepare/references/recipes/azd/terraform.md | Adds warnings + corrected examples for passing variables from azd to Terraform without Go templates. |
| plugin/skills/azure-prepare/SKILL.md | Bumps azure-prepare skill version to reflect the documentation update. |
| plugin/skills/azure-deploy/references/recipes/terraform/errors.md | Adds troubleshooting entries for unresolved templates and azd state-copy behavior. |
| plugin/skills/azure-deploy/references/recipes/azd/errors.md | Adds a dedicated “Unresolved Terraform Template Variables” error section with remediation steps. |
| plugin/skills/azure-deploy/references/pre-deploy-checklist.md | Adds a mandatory pre-deploy guard step for azd+Terraform to catch unresolved templates. |
| plugin/skills/azure-deploy/SKILL.md | Bumps azure-deploy skill version to reflect the documentation update. |

Copilot's findings

  • Files reviewed: 8/8 changed files
  • Comments generated: 2

 metadata:
   author: Microsoft
-  version: "1.1.6"
+  version: "1.1.7"

Copilot AI Apr 3, 2026

The PR description says azure-prepare is being bumped from v1.1.3 → v1.1.4, but this change bumps azure-prepare/SKILL.md from 1.1.6 → 1.1.7. Similarly, the description mentions azure-deploy v1.0.11 → v1.0.12 while the diff bumps 1.0.14 → 1.0.15. Please reconcile the PR description with the actual version bumps (either update the description or adjust the versions) so release notes stay accurate.

Comment on lines +136 to +139
if grep -rn '{{ *\.Env\.' infra/ --include='*.tf' --include='*.tfvars.json'; then
  echo "ERROR: Unresolved Go-style template variables found"
  exit 1
fi

Copilot AI Apr 3, 2026

This guard silently passes when infra/ doesn’t exist (grep exits non-zero, so the if body won’t run), which can produce a false sense of safety in docs meant to be 'MANDATORY for azd+Terraform'. Consider adding an explicit test -d infra/ check with a clear error (or an explicit 'skip' message) so the snippet behaves deterministically and avoids false negatives.
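
One way the suggested directory check might be folded into the guard is sketched below. The function name `check_unresolved_templates` and its return codes (0 clean, 1 templates found, 2 directory missing) are illustrative assumptions, not what the PR implements; the grep pattern is the one quoted in the snippet under review.

```shell
#!/bin/sh
# Hypothetical wrapper around the pre-deploy grep guard.
# Return codes are this sketch's own convention:
#   0 = no Go-style templates found, 1 = templates found, 2 = directory missing.
check_unresolved_templates() {
  dir="${1:-infra}"
  if [ ! -d "$dir" ]; then
    echo "ERROR: directory '$dir' not found; cannot verify Terraform inputs" >&2
    return 2
  fi
  # Same pattern and file scoping as the snippet under review.
  if grep -rn --include='*.tf' --include='*.tfvars.json' '{{ *\.Env\.' "$dir"; then
    echo "ERROR: Unresolved Go-style template variables found" >&2
    return 1
  fi
  echo "OK: no Go-style template variables in $dir"
  return 0
}
```

Returning a distinct code for the missing directory lets a caller decide whether that case is a hard failure or an explicit skip, instead of treating it as a pass.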

Collaborator

@wbreza wbreza left a comment

Code Review: PR #1585

✅ What Looks Good

  • Defense-in-depth architecture is excellent — prevent (prepare) → detect (validate) → recover (deploy) layering across three skills is well-designed
  • Prior critical feedback addressed — the latest version correctly keeps main.tfvars.json with ${VAR} syntax instead of prohibiting the file entirely; auto-mapping claims replaced with accurate envsubst description
  • Correctly removes bogus env("DATABASE_NAME") — this Terraform function doesn’t exist in Terraform ≥0.12
  • Version bumps correct across all three skills

🟡 Medium (3 findings)

See inline comments for details.

| # | Finding | File |
| --- | --- | --- |
| 1 | JSON code blocks contain `//` comments: invalid JSON that agent may copy literally | terraform.md |
| 2 | Content repetition across skills (~500+ tokens): acceptable trade-off for self-contained skill context | Multiple |
| 3 | Version conflict with PR #1587: both bump azure-prepare to 1.1.7; coordinate merge order | SKILL.md |

🟢 Low (2 findings)

| # | Finding |
| --- | --- |
| 4 | Pre-deploy grep scope is a subset of validate grep (intentional quick guard vs thorough check) |
| 5 | PR description version numbers stale from rebasing |

Summary

| Priority | Count |
| --- | --- |
| Critical | 0 |
| High | 0 |
| Medium | 3 |
| Low | 2 |
| Total | 5 |

Overall Assessment: Approve — prior critical concerns have been addressed. Remaining findings are non-blocking suggestions.

⚠️ Note: Both this PR and #1587 bump azure-prepare from 1.1.6 → 1.1.7. Whichever merges second will need to bump to 1.1.8.


```json
// infra/main.tfvars.json — azd substitutes ${VAR} references via envsubst
{
```
Collaborator

[🟡 Medium] JSON code block contains // comments

JSON doesn’t support comments. If the agent copies this block literally into main.tfvars.json, it will produce invalid JSON.

Consider moving the comment outside the code fence, or switching the language tag to jsonc.
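
As a rough illustration of how such a copied comment could be caught, the sketch below flags lines that begin with `//` in a file. `has_line_comments` is a hypothetical helper name, and the check is only a heuristic: it misses trailing comments and block comments, so a real JSON parser remains the more reliable test.

```shell
#!/bin/sh
# Hypothetical heuristic: succeed (exit 0) and print matching lines if the
# file named in $1 contains a line that starts with // (optionally indented).
# URLs such as "https://..." inside string values are not flagged, because
# the pattern is anchored to the start of the line.
has_line_comments() {
  grep -nE '^[[:space:]]*//' "$1"
}

# Assumed usage:
#   if has_line_comments infra/main.tfvars.json; then
#     echo "ERROR: // comments make main.tfvars.json invalid JSON" >&2
#   fi
```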

Development

Successfully merging this pull request may close these issues.

Integration test failure: azure-deploy – containerized web app Terraform Container Apps [Deployment failure]