Description
Why
AI instruction files (AGENTS.md, copilot-instructions.md, scoped .instructions.md files) reference specific paths, commands, and directory structures. When the codebase evolves, these references can become stale — leading agents to use wrong commands or look in wrong directories. A lightweight, non-blocking CI check catches this drift early, surfacing warnings in PR reviews without blocking merges.
Scope / Proposed changes
- New file: .github/workflows/instruction-drift.yml (~60 lines)
Proposed contents
name: Instruction Drift Check

on:
  pull_request:
    branches: ['main']
    paths:
      - 'AGENTS.md'
      - 'CLAUDE.md'
      - '.github/copilot-instructions.md'
      - '.github/instructions/**'
      - 'lm_eval/**'
      - 'tests/**'
      - 'docs/**'
      - 'pyproject.toml'
      - '.pre-commit-config.yaml'
  workflow_dispatch:

jobs:
  check-drift:
    name: Check instruction file references
    runs-on: ubuntu-latest
    timeout-minutes: 5
    continue-on-error: true  # Non-blocking: warns but doesn't fail the PR
    steps:
      - name: Checkout Code
        uses: actions/checkout@v6

      - name: Validate referenced paths exist
        run: |
          EXIT_CODE=0
          INSTRUCTION_FILES=(
            "AGENTS.md"
            "CLAUDE.md"
            ".github/copilot-instructions.md"
          )

          # Also check any .instructions.md files
          while IFS= read -r -d '' file; do
            INSTRUCTION_FILES+=("$file")
          done < <(find .github/instructions -name '*.instructions.md' -print0 2>/dev/null || true)

          for file in "${INSTRUCTION_FILES[@]}"; do
            if [ ! -f "$file" ]; then
              continue
            fi
            echo "::group::Checking $file"
            # Extract paths that look like references to repo files/dirs
            # Matches patterns like: lm_eval/api/, docs/new_task_guide.md, tests/models/
            # URLs are removed first so their path segments are not mistaken for repo paths.
            # The loop reads from process substitution (not a pipe) so that EXIT_CODE
            # is updated in the current shell rather than in a subshell.
            while read -r path; do
              # Strip leading/trailing whitespace, backticks, quotes, parentheses,
              # and sentence-ending periods. POSIX classes are used because \s inside
              # a sed bracket expression would match a literal backslash or "s".
              path=$(echo "$path" | sed -e 's|^[[:space:]`"(/]*||' -e 's|[[:space:]`").]*$||')
              # Skip if empty
              [ -z "$path" ] && continue
              # Check if path exists (as file or directory)
              if [ ! -e "$path" ] && [ ! -e "${path%/}" ]; then
                echo "::warning file=$file::Referenced path '$path' does not exist in the repo"
                EXIT_CODE=1
              fi
            done < <(sed -E 's#https?://[^[:space:])]*##g' "$file" | \
              grep -oP '(?:^|\s|`|"|\(|/)(?:lm_eval|tests|docs|scripts|\.github|\.pre-commit)[/\w.-]+' | \
              sort -u)
            echo "::endgroup::"
          done

          if [ "$EXIT_CODE" -ne 0 ]; then
            echo ""
            echo "::notice::Some instruction files reference paths that don't exist. Please update the references."
          fi
          exit "$EXIT_CODE"
Labels to apply
- Base: agent-readiness
- Priority: priority:low
- Area: tooling
Depends on
- Create priority and agent-readiness labels for issue triage #3610 (label creation, for the tooling label)
- Add .github/copilot-instructions.md for repository-wide Copilot guidance #3611 (.github/copilot-instructions.md)
- Add AGENTS.md for AI agent guardrails and repo context #3612 (AGENTS.md)
- Add path-scoped Copilot instructions for task YAML authoring #3614 (tasks.instructions.md)
- Add path-scoped Copilot instructions for model backend code #3615 (models.instructions.md)
All instruction files should exist before adding a CI check that validates them.
Related existing issues
None — no existing issues cover instruction file validation or drift checking.
Acceptance criteria
- .github/workflows/instruction-drift.yml exists
- Workflow triggers on PRs to main when instruction files or key code paths change
- continue-on-error: true is set (non-blocking)
- Workflow correctly identifies paths referenced in instruction files
- Stale references produce GitHub Actions ::warning annotations (visible in PR review)
- Workflow does NOT block PR merging
- Workflow passes when all references are valid
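The criteria about stale and valid references can be smoke-tested locally before the workflow is merged. The script below is only a sketch under stated assumptions: check_file is a hypothetical helper name, its regex is a simplified version of the one proposed for the workflow, and the sample AGENTS.md content is made up for the test. It builds a throwaway directory, references one path that exists and one that does not, and prints the same ::warning line format the workflow uses.

```shell
#!/usr/bin/env bash
# Hypothetical local smoke test; check_file is an illustrative name and is
# not part of the proposed workflow.
check_file() {
  # Print a ::warning line for each referenced path that does not exist
  # relative to the current directory. Return 1 if any were stale.
  file=$1
  stale=$(
    grep -oP '(?:^|\s|`|"|\()(?:lm_eval|tests|docs|scripts)/[\w./-]*' "$file" |
      sed -e 's|^[[:space:]`"(]*||' -e 's|[.]*$||' |
      sort -u |
      while read -r path; do
        [ -e "$path" ] || echo "$path"
      done
  )
  [ -z "$stale" ] && return 0
  printf '%s\n' "$stale" | while read -r path; do
    echo "::warning file=$file::Referenced path '$path' does not exist in the repo"
  done
  return 1
}

# Throwaway fixture: lm_eval/api/ exists, docs/missing_guide.md does not.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p lm_eval/api
printf '%s\n' 'Core code lives in lm_eval/api/ and guides in docs/missing_guide.md.' > AGENTS.md
check_file AGENTS.md || echo "stale references found (non-blocking)"
```

Because the stale paths are collected through command substitution and the status is returned by the function, the result does not depend on a variable being set inside a pipeline subshell.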
Avoid drift/duplication notes
- This workflow is intentionally non-blocking (continue-on-error: true). It produces warnings, not failures.
- The path extraction regex is conservative: it only looks for repo-relative paths starting with known directories (lm_eval/, tests/, docs/, etc.).
- If new instruction files are added in the future, add them to the INSTRUCTION_FILES array. Files matching *.instructions.md under .github/instructions are picked up automatically by the find command.
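To see what the conservative regex actually picks up, the extraction step can be run on a sample line outside CI. This is a minimal sketch, not the workflow itself: extract_paths is a hypothetical helper name, and the URL pre-filter and the trimming of trailing periods are assumptions added here so that links and sentence punctuation are not reported as repo paths. It needs GNU grep for the -P option.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read text on stdin and print the repo-relative paths
# it references, one per line, de-duplicated.
extract_paths() {
  # 1. Drop URLs so their path segments are not mistaken for repo paths (assumption).
  # 2. Keep only tokens that start with a known top-level directory.
  # 3. Trim the leading delimiter and any trailing quote, bracket, or period.
  sed -E 's#https?://[^[:space:])]*##g' |
    grep -oP '(?:^|\s|`|"|\(|/)(?:lm_eval|tests|docs|scripts|\.github|\.pre-commit)[/\w.-]+' |
    sed -e 's|^[[:space:]`"(/]*||' -e 's|[[:space:]`").]*$||' |
    sort -u
}

# Sample line with a backticked directory, a bare file path, and a URL.
printf '%s\n' 'Run `tests/models/` and read docs/new_task_guide.md (see https://example.com/docs/api).' |
  extract_paths
```

For the sample line this prints docs/new_task_guide.md and tests/models/. The URL contributes nothing, and names outside the known directories (for example a top-level src/ path) are ignored by design.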