t1359: Mission-aware browser QA in milestone validation

**Task ID:** `t1359` | **Status:** open | **Estimate:** `~4h`
**Logged:** 2026-02-27
**Tags:** `feature` `browser` `mission` `auto-dispatch`

## Description

Extend the milestone validation worker (t1357.6) with mission-aware, Playwright-based visual testing: launch the app, navigate flows, screenshot key pages, and compare the results against the acceptance criteria. The goal is to detect layout bugs, broken links, and missing content. Integrates with the existing browser automation stack (Playwright > Stagehand > DevTools).

**Blocked by:** `t1357.6`

## Task Brief

# t1359: Mission-Aware Browser QA in Milestone Validation

## Origin

- **Created:** 2026-02-28
- **Session:** Claude Code headless worker
- **Created by:** ai-worker (dispatched from issue #2503)
- **Conversation context:** Part of the mission system (t1357) dependent features. The milestone validation worker (t1357.6) runs tests/build/lint after milestone features complete, but has only a stub for browser testing — it runs the project's existing Playwright test suite if present. t1359 adds mission-aware visual QA: launch the app, navigate flows defined in the mission's acceptance criteria, screenshot key pages, detect layout bugs, broken links, and missing content.

## What

A `browser-qa-worker.sh` script and `workflows/browser-qa.md` documentation that provide Playwright-based visual QA for milestone validation. The worker:

1. Launches the app (or connects to a running instance at a given URL)
2. Navigates key user flows (from mission acceptance criteria or a configurable flow list)
3. Screenshots every page visited (stored in `{mission-dir}/assets/qa/`)
4. Checks for broken links (HTTP status on all `<a>` hrefs)
5. Checks for missing/error content (empty body, error states, 404 pages)
6. Detects console errors (JS exceptions, failed network requests)
7. Captures ARIA snapshots for structural validation
8. Outputs a structured pass/fail report with screenshot paths
9. Integrates with `milestone-validation-worker.sh` via a new `--browser-qa` flag
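As a rough illustration of step 8, the per-page result could be reduced to a one-line JSON record plus an exit status. This is only a sketch: the function name `page_report` and the field names are assumptions made for illustration, not the worker's actual report schema.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: function and field names are illustrative only,
# not the real report schema of browser-qa-worker.sh.

# Print a one-line JSON report for a single page.
# Returns 0 when the page passes and 1 when it fails.
# Inputs are assumed to be JSON-safe (no quotes or backslashes).
page_report() {
  local url="$1" broken_links="$2" console_errors="$3" screenshot="$4"
  local status="pass"
  # Any broken link or console error fails the page.
  if [ "$broken_links" -gt 0 ] || [ "$console_errors" -gt 0 ]; then
    status="fail"
  fi
  printf '{"url":"%s","status":"%s","broken_links":%d,"console_errors":%d,"screenshot":"%s"}\n' \
    "$url" "$status" "$broken_links" "$console_errors" "$screenshot"
  [ "$status" = "pass" ]
}
```

For example, `page_report http://localhost:3000 0 2 assets/qa/home.png` would print a record with `"status":"fail"` and return a non-zero status, which a caller such as the milestone validation worker could use as its pass/fail signal.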

## Why

The existing `--browser-tests` flag in `milestone-validation-worker.sh` only runs the project's own Playwright test suite (if it has one). Many projects — especially POC-mode missions — won't have a pre-written test suite. The browser QA worker provides zero-config visual validation: give it a URL and acceptance criteria, and it checks that the app renders correctly, links work, and key flows are navigable. This closes the gap between "tests pass" and "the app actually works visually."

## How (Approach)

### New files

1. **`.agents/scripts/browser-qa-worker.sh`** (~400-600 lines)
   - Uses Playwright via `npx playwright` (Node.js) for headless browser automation
   - Generates a temporary `.mjs` script that Node.js runs against the Playwright API
   - Checks: page loads, screenshots, broken links, console errors, ARIA snapshots
   - Configurable via CLI flags and mission file criteria
   - Outputs JSON report + human-readable summary

2. **`.agents/workflows/browser-qa.md`** (~100-150 lines)
   - Documents the browser QA system
   - Integration guide with milestone validation
   - Configuration options and examples

### Modified files

3. **`.agents/scripts/milestone-validation-worker.sh`**
   - Add `--browser-qa` flag (distinct from existing `--browser-tests`)
   - Add `--browser-qa-flows` flag for custom flow definitions
   - Add `run_browser_qa()` function that invokes `browser-qa-worker.sh`
   - Keep existing `run_browser_tests()` unchanged (backward compatible)

4. **`.agents/workflows/milestone-validation.md`**
   - Add browser QA to the "Optional Checks" table
   - Document the new flags

5. **`.agents/subagent-index.toon`**
   - Add `browser-qa-worker.sh` to scripts section
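The changes to `milestone-validation-worker.sh` listed above might look roughly like the following. This is a minimal sketch under stated assumptions: the variable names, the `BROWSER_QA_WORKER` override, and the worker's `--flows` flag are hypothetical; only `--browser-qa`, `--browser-qa-flows`, `--browser-url`, `--url`, and `run_browser_qa()` come from this brief.

```shell
#!/usr/bin/env bash
# Hypothetical sketch. Variable names, BROWSER_QA_WORKER, and the worker's
# --flows flag are assumptions made for illustration.

BROWSER_QA=false
BROWSER_URL=""
BROWSER_QA_FLOWS=""

# Parse only the browser QA flags; everything else is skipped here.
parse_args() {
  while [ $# -gt 0 ]; do
    case "$1" in
      --browser-qa)
        BROWSER_QA=true
        shift
        ;;
      --browser-url | --browser-qa-flows)
        if [ $# -lt 2 ]; then
          echo "missing value for $1" >&2
          return 2
        fi
        if [ "$1" = "--browser-url" ]; then
          BROWSER_URL="$2"
        else
          BROWSER_QA_FLOWS="$2"
        fi
        shift 2
        ;;
      *)
        shift # other flags would be handled elsewhere in the real script
        ;;
    esac
  done
}

# Delegate to the QA worker; a no-op unless --browser-qa was given.
run_browser_qa() {
  [ "$BROWSER_QA" = true ] || return 0
  set -- --url "$BROWSER_URL"
  if [ -n "$BROWSER_QA_FLOWS" ]; then
    set -- "$@" --flows "$BROWSER_QA_FLOWS"
  fi
  "${BROWSER_QA_WORKER:-browser-qa-worker.sh}" "$@"
}
```

Keeping `run_browser_qa()` as a thin wrapper means the existing `run_browser_tests()` path is untouched, and the worker's exit status propagates directly to the caller.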

### Key design decisions

- **Separate script, not inline**: Browser QA is complex enough to warrant its own script. The milestone validation worker delegates to it, keeping both scripts focused.
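- **Playwright via npx**: No new dependencies, since Playwright is already in the browser automation stack. `npx playwright` fetches the Playwright package on demand if it is not installed locally; browser binaries may still need a one-time `npx playwright install`.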
- **Temporary .mjs script**: The QA logic runs as a Node.js script generated by the shell wrapper. This gives full Playwright API access while keeping the orchestration in bash.
- **Zero-config default**: With just `--browser-qa --browser-url http://localhost:3000`, the worker crawls the homepage, checks links, screenshots pages, and reports errors. Custom flows are optional.
- **Mission-aware**: When invoked from milestone validation, reads the mission file's acceptance criteria to determine what flows to test.
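The "temporary `.mjs` script" decision could be sketched as a shell function that writes a per-run script through a heredoc. The function name, file layout, and the checks inside the generated script are illustrative assumptions, not the worker's actual code. The script is written under the project directory rather than `/tmp` so that Node's module resolution can find `playwright` in `node_modules`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the "generated .mjs" pattern. Function and file
# names are illustrative, not taken from browser-qa-worker.sh.

# Write a per-run QA script under work_dir and print its path.
# work_dir is assumed to be the project root, so the generated script can
# resolve the `playwright` package from node_modules.
# NOTE: url and out_dir are embedded verbatim; a real implementation would
# need to escape quotes and backslashes before interpolating them.
generate_qa_script() {
  local url="$1" out_dir="$2" work_dir="${3:-.}"
  local tmp_dir script
  tmp_dir="$(mktemp -d "$work_dir/.browser-qa-XXXXXX")" || return 1
  script="$tmp_dir/qa.mjs"
  cat >"$script" <<EOF
import { chromium } from 'playwright';

const url = '${url}';
const outDir = '${out_dir}';
const errors = [];

const browser = await chromium.launch();
const page = await browser.newPage();
// Collect console errors, uncaught exceptions, and failed requests.
page.on('console', (msg) => { if (msg.type() === 'error') errors.push(msg.text()); });
page.on('pageerror', (err) => errors.push(err.message));
page.on('requestfailed', (req) => errors.push('request failed: ' + req.url()));

const response = await page.goto(url, { waitUntil: 'load' });
await page.screenshot({ path: outDir + '/home.png', fullPage: true });
console.log(JSON.stringify({ url, status: response ? response.status() : null, errors }));
await browser.close();
EOF
  printf '%s\n' "$script"
}
```

A caller would then run something like `script="$(generate_qa_script http://localhost:3000 assets/qa)"`, execute it with `node "$script"`, and remove the temporary directory afterwards. Orchestration stays in bash while the generated script has the full Playwright API available.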

## Acceptance Criteria

- [ ] `browser-qa-worker.sh --url http://localhost:3000` runs headless Playwright, screenshots the page, checks for console errors and broken links
  ```yaml
  verify:
    method: bash
    run: "test -f ~/.aidevops/agents/scripts/browser-qa-worker.sh && bash ~/.aidevops/agents/scripts/browser-qa-worker.sh --help | grep -q 'browser-qa-worker'"
  ```
- [ ] `milestone-validation-worker.sh` accepts `--browser-qa` flag and delegates to `browser-qa-worker.sh`
  ```yaml
  verify:
    method: bash
    run: "grep -q 'browser-qa' ~/.aidevops/agents/scripts/milestone-validation-worker.sh"
  ```
- [ ] Screenshots are saved to configurable output directory (default: `{mission-dir}/assets/qa/`)
- [ ] Broken link detection reports HTTP status codes for all `<a>` hrefs on visited pages
- [ ] Console error detection captures JS exceptions and failed network requests
- [ ] ShellCheck passes on all new/modified `.sh` files with zero violations
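The broken-link criterion above could be approximated with `curl`, as a simple alternative to checking links from inside Playwright. This is a sketch only: the function names and the pass/fail thresholds (2xx and 3xx count as working, `000` means no HTTP response) are assumptions, not rules from this brief.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: function names and thresholds are assumptions.

# Map an HTTP status code to a verdict.
classify_status() {
  case "$1" in
    2?? | 3??) echo ok ;;
    000) echo unreachable ;; # curl reports 000 when no response was received
    *) echo broken ;;
  esac
}

# Fetch a URL (following redirects) and print "<code> <verdict> <url>".
check_link() {
  local url="$1" code
  code="$(curl -s -L -o /dev/null -w '%{http_code}' --max-time 10 "$url")" || true
  code="${code:-000}"
  printf '%s %s %s\n' "$code" "$(classify_status "$code")" "$url"
}
```

Running `check_link` over every `href` collected from a visited page would produce one status line per link, which matches the "reports HTTP status codes" wording of the criterion.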

## Context & Decisions

- The existing `--browser-tests` flag runs the project's own Playwright test suite. `--browser-qa` is complementary — it provides generic visual QA that works even without a test suite.
- Playwright is the fastest browser tool in the stack (about 1.4s for navigate + screenshot). Stagehand would add AI overhead (~7s) that isn't needed for deterministic checks.
- ARIA snapshots (~0.01s, 50-200 tokens) are preferred over screenshots for structural validation. Screenshots are captured for human review but not used for automated pass/fail decisions.
- The worker generates a temporary `.mjs` file rather than shipping a permanent JS file because the QA logic needs to be parameterised per-run (URLs, flows, output paths).

## Relevant Files

- `scripts/milestone-validation-worker.sh` — existing validation worker to extend
- `workflows/milestone-validation.md` — existing validation docs to update
- `workflows/mission-orchestrator.md` — orchestrator that invokes validation
- `tools/browser/browser-automation.md` — browser tool selection guide
- `tools/browser/playwright.md` — Playwright reference
- `scripts/accessibility/playwright-contrast.mjs` — existing Playwright script pattern

## Dependencies

- **Blocked by:** t1357.6 (milestone validation worker) — MERGED
- **Blocks:** None
- **External:** Playwright (`npx playwright`) — already in the stack

---
*Synced from TODO.md by issue-sync-helper.sh*
