Conversation context: Part of the mission system (t1357) dependent features. The milestone validation worker (t1357.6) runs tests/build/lint after milestone features complete, but has only a stub for browser testing — it runs the project's existing Playwright test suite if present. t1359 adds mission-aware visual QA: launch the app, navigate flows defined in the mission's acceptance criteria, screenshot key pages, detect layout bugs, broken links, and missing content.
What
A browser-qa-worker.sh script and workflows/browser-qa.md documentation that provide Playwright-based visual QA for milestone validation. The worker:
Launches the app (or connects to a running instance at a given URL)
Navigates key user flows (from mission acceptance criteria or a configurable flow list)
Screenshots every page visited (stored in {mission-dir}/assets/qa/)
Checks for broken links (HTTP status on all <a> hrefs)
Checks for missing/error content (empty body, error states, 404 pages)
Outputs a structured pass/fail report with screenshot paths
Integrates with milestone-validation-worker.sh via a new --browser-qa flag
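The brief does not specify the report format. As a minimal sketch, assuming a plain-text report with one line per check, the pass/fail output could be produced like this. The helper name qa_report and the pipe-separated input format are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: render a pass/fail report from check results.
# Input (stdin): one result per line as "<result>|<check name>|<screenshot path>",
# where <result> is "pass" or "fail". This format is an assumption, not the
# worker's actual schema.
qa_report() {
  local result name shot failed=0 total=0
  while IFS='|' read -r result name shot; do
    [ -n "$result" ] || continue
    total=$((total + 1))
    if [ "$result" = "fail" ]; then
      failed=$((failed + 1))
      printf 'FAIL  %s  (%s)\n' "$name" "$shot"
    else
      printf 'PASS  %s  (%s)\n' "$name" "$shot"
    fi
  done
  printf 'SUMMARY  %d/%d checks passed\n' "$((total - failed))" "$total"
  # A non-zero exit status lets the caller treat any failed check as a failed run.
  [ "$failed" -eq 0 ]
}
```

Returning a non-zero status on any failure would let milestone-validation-worker.sh treat the QA step like its other validation steps, though the brief does not state how the result is propagated.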
Why
The existing --browser-tests flag in milestone-validation-worker.sh only runs the project's own Playwright test suite (if it has one). Many projects — especially POC-mode missions — won't have a pre-written test suite. The browser QA worker provides zero-config visual validation: give it a URL and acceptance criteria, and it checks that the app renders correctly, links work, and key flows are navigable. This closes the gap between "tests pass" and "the app actually works visually."
Key design decisions
Separate script, not inline: Browser QA is complex enough to warrant its own script. The milestone validation worker delegates to it, keeping both scripts focused.
Playwright via npx: No new dependencies, because Playwright is already in the browser automation stack. Uses npx playwright, which fetches the package on demand if it is missing (browser binaries may still need a one-time npx playwright install).
Temporary .mjs script: The QA logic runs as a Node.js script generated by the shell wrapper. This gives full Playwright API access while keeping the orchestration in bash.
Zero-config default: With just --browser-qa --browser-url http://localhost:3000, the worker crawls the homepage, checks links, screenshots pages, and reports errors. Custom flows are optional.
Mission-aware: When invoked from milestone validation, reads the mission file's acceptance criteria to determine what flows to test.
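The delegation described above could be sketched roughly as follows. The flag names --browser-qa, --browser-url and --browser-qa-flows, the function name run_browser_qa() and the worker's --url option come from the brief. The worker's --flows option and the QA_WORKER variable are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: how milestone-validation-worker.sh might delegate to the QA worker.
# QA_WORKER and the worker's --flows option are hypothetical.
QA_WORKER="${QA_WORKER:-browser-qa-worker.sh}"

run_browser_qa() {
  local enabled=0 url="" flows=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --browser-qa)
        enabled=1
        shift
        ;;
      --browser-url | --browser-qa-flows)
        if [ "$#" -lt 2 ]; then
          echo "error: $1 requires a value" >&2
          return 2
        fi
        if [ "$1" = "--browser-url" ]; then url="$2"; else flows="$2"; fi
        shift 2
        ;;
      *)
        shift # other validation flags are out of scope for this sketch
        ;;
    esac
  done

  # Browser QA is opt-in: do nothing unless --browser-qa was passed.
  [ "$enabled" -eq 1 ] || return 0
  if [ -z "$url" ]; then
    echo "error: --browser-qa requires --browser-url" >&2
    return 2
  fi

  # Zero-config default: only the URL is passed; flows are appended if given.
  local -a cmd=("$QA_WORKER" --url "$url")
  if [ -n "$flows" ]; then
    cmd+=(--flows "$flows")
  fi
  "${cmd[@]}"
}
```

Keeping the parsing in one function and the QA logic in the separate worker matches the "separate script, not inline" decision, and leaves run_browser_tests() untouched.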
Acceptance Criteria
browser-qa-worker.sh --url http://localhost:3000 runs headless Playwright, screenshots the page, checks for console errors and broken links
Screenshots are saved to a configurable output directory (default: {mission-dir}/assets/qa/)
Broken link detection reports HTTP status codes for all <a> hrefs on visited pages
Console error detection captures JS exceptions and failed network requests
ShellCheck passes on all new/modified .sh files with zero violations
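To illustrate the broken-link criterion, here is a minimal sketch of turning collected status codes into a report. The helper name report_broken_links and its "status url" input format are hypothetical, and treating 2xx/3xx as healthy and everything else as broken is an assumption: the brief only says that status codes are reported. In the actual worker this logic may well live inside the generated .mjs script instead.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: read "<http status> <url>" lines on stdin and report
# the links considered broken. A status of 000 is what curl prints for
# %{http_code} when no response was received.
report_broken_links() {
  local code url broken=0
  while read -r code url; do
    [ -n "$code" ] || continue
    case "$code" in
      2[0-9][0-9] | 3[0-9][0-9]) ;; # reachable or redirected
      *)
        broken=$((broken + 1))
        printf 'BROKEN %s %s\n' "$code" "$url"
        ;;
    esac
  done
  printf 'broken_links=%d\n' "$broken"
  # Fail the check if any link was classified as broken.
  [ "$broken" -eq 0 ]
}
```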
Context & Decisions
The existing --browser-tests flag runs the project's own Playwright test suite. --browser-qa is complementary — it provides generic visual QA that works even without a test suite.
Playwright is the fastest browser automation tool in the stack (1.4s navigate+screenshot). Stagehand would add AI overhead (~7s) that isn't needed for deterministic checks.
ARIA snapshots (~0.01s, 50-200 tokens) are preferred over screenshots for structural validation. Screenshots are captured for human review but not used for automated pass/fail decisions.
The worker generates a temporary .mjs file rather than shipping a permanent JS file because the QA logic needs to be parameterised per-run (URLs, flows, output paths).
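A rough sketch of the per-run generation step, assuming the wrapper bakes the URL and screenshot directory into the script as quoted string literals. The function names js_quote and generate_qa_script, their arguments, and the home.png file name are hypothetical. The Playwright calls used (chromium.launch, page.on('pageerror'), page.goto, page.screenshot) are standard Playwright API, but the generated script is only a placeholder for the real QA logic.

```shell
#!/usr/bin/env bash
# Quote a value as a double-quoted JavaScript string literal.
# Only backslashes and double quotes are escaped; values containing newlines
# are not supported by this sketch.
js_quote() {
  local s=$1
  s=${s//\\/\\\\}
  s=${s//\"/\\\"}
  printf '"%s"' "$s"
}

# Usage (hypothetical): generate_qa_script <url> <screenshot-dir> <script-path>
# The script path must be somewhere Node can resolve the playwright package from.
generate_qa_script() {
  local url=$1 out_dir=$2 script_path=$3
  cat >"$script_path" <<EOF
// Generated per run by the shell wrapper; do not edit.
import { chromium } from 'playwright';

const url = $(js_quote "$url");
const outDir = $(js_quote "$out_dir");

const browser = await chromium.launch();
const page = await browser.newPage();
const errors = [];
page.on('pageerror', (err) => errors.push(String(err)));
await page.goto(url, { waitUntil: 'load' });
await page.screenshot({ path: outDir + '/home.png', fullPage: true });
await browser.close();
console.log(JSON.stringify({ url, errors }));
process.exit(errors.length === 0 ? 0 : 1);
EOF
}
```

The wrapper would then run the generated file with Node and delete it afterwards (for example from an EXIT trap). Escaping the values before embedding them avoids breaking the script when a URL contains quotes.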
Relevant Files
scripts/milestone-validation-worker.sh — existing validation worker to extend
workflows/milestone-validation.md — existing validation docs to update
workflows/mission-orchestrator.md — orchestrator that invokes validation
Task ID: t1359 | Status: open | Estimate: ~4h | Logged: 2026-02-27
Tags: feature, browser, mission, auto-dispatch
Description
Mission-aware browser QA in milestone validation: extend the milestone validation worker (t1357.6) with Playwright-based visual testing. Launch app, navigate flows, screenshot key pages, compare against acceptance criteria. Detect layout bugs, broken links, missing content. Integrates with existing browser automation stack (Playwright > Stagehand > DevTools).
Blocked by: t1357.6
Task Brief
t1359: Mission-Aware Browser QA in Milestone Validation
How (Approach)
New files
.agents/scripts/browser-qa-worker.sh (~400-600 lines): npx playwright (Node.js) for headless browser automation; .mjs script that Playwright executes
.agents/workflows/browser-qa.md (~100-150 lines)
Modified files
.agents/scripts/milestone-validation-worker.sh: --browser-qa flag (distinct from existing --browser-tests); --browser-qa-flows flag for custom flow definitions; run_browser_qa() function that invokes browser-qa-worker.sh; run_browser_tests() unchanged (backward compatible)
.agents/workflows/milestone-validation.md
.agents/subagent-index.toon: browser-qa-worker.sh added to scripts section
Additional acceptance criterion
milestone-validation-worker.sh accepts --browser-qa flag and delegates to browser-qa-worker.sh
Additional relevant files
tools/browser/browser-automation.md: browser tool selection guide
tools/browser/playwright.md: Playwright reference
scripts/accessibility/playwright-contrast.mjs: existing Playwright script pattern
Dependencies
npx playwright: already in the stack
Synced from TODO.md by issue-sync-helper.sh