Skip to content

t1359: Mission-aware browser QA for milestone validation#2543

Merged
marcusquinn merged 3 commits intomainfrom
feature/t1359-browser-qa-milestone-validation
Feb 28, 2026
Merged

t1359: Mission-aware browser QA for milestone validation#2543
marcusquinn merged 3 commits intomainfrom
feature/t1359-browser-qa-milestone-validation

Conversation

@marcusquinn
Copy link
Copy Markdown
Owner

@marcusquinn marcusquinn commented Feb 28, 2026

Summary

  • Add Playwright-based visual testing pipeline for mission milestone validation with browser-qa-helper.sh (smoke tests, multi-viewport screenshots, broken link crawling, WCAG accessibility checks)
  • Create browser-qa.md subagent doc with severity mapping, 6-step pipeline, and visual regression guidance
  • Extend milestone-validation.md with browser QA integration, "How to Think" section, severity classification, and dev server management
  • Wire into mission orchestrator Phase 4 and browser automation decision tree

Changes

File What
scripts/browser-qa-helper.sh New — CLI: run, screenshot, links, a11y, smoke commands (912 lines)
tools/browser/browser-qa.md New — Subagent doc: severity mapping, 6-step pipeline, visual regression
workflows/milestone-validation.md Extended — Browser QA section, severity table, "How to Think", dev server mgmt
workflows/mission-orchestrator.md Updated — Phase 4 references browser QA pipeline
tools/browser/browser-automation.md Updated — Decision tree extended with QA branch
AGENTS.md Updated — Domain index includes browser-qa entry
subagent-index.toon Updated — Index entries for browser-qa-helper and milestone-validation

Verification

  • ShellCheck: 0 violations on browser-qa-helper.sh
  • Markdown lint: clean on all .md files
  • Script is executable (mode 755)
  • All shared-constants.sh dependencies verified (logging, error constants)
  • Rebased onto latest main, merge conflicts with t1357.6 resolved cleanly

Supersedes #2540 (closed, could not reopen after rebase).

Closes #2503

Summary by CodeRabbit

  • New Features

    • Introduced comprehensive browser QA capabilities including smoke tests, multi-viewport screenshot capture, link validation, and accessibility checks for milestone validation.
    • Enhanced milestone validation workflow with browser QA integration and improved testing pipeline orchestration.
  • Documentation

    • Added QA documentation and updated validation workflows to support new browser testing features.

Add Playwright-based visual testing pipeline for mission milestone
validation. Creates browser-qa-helper.sh (smoke tests, screenshots,
broken link detection, accessibility checks), browser-qa.md subagent
doc, and milestone-validation.md workflow. Integrates with existing
browser automation stack and mission orchestrator Phase 4.
@gemini-code-assist
Copy link
Copy Markdown

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Feb 28, 2026

Warning

Rate limit exceeded

@marcusquinn has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 13 minutes and 2 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 12088b9 and bd279db.

📒 Files selected for processing (5)
  • .agents/scripts/browser-qa-helper.sh
  • .agents/subagent-index.toon
  • .agents/tools/browser/browser-qa.md
  • .agents/workflows/milestone-validation.md
  • .agents/workflows/mission-orchestrator.md

Walkthrough

The PR introduces a mission-aware browser QA system for milestone validation, including a new Playwright-based QA helper script, comprehensive documentation, and integration across existing workflows and tooling references.

Changes

Cohort / File(s) Summary
Index & Catalog Updates
AGENTS.md, subagent-index.toon
Updated domain index and public tool registry to register the new browser-qa tooling, added tags and workflow entry for milestone validation.
Browser QA Implementation
scripts/browser-qa-helper.sh, tools/browser/browser-qa.md
Introduced Playwright-based QA pipeline with screenshot capture, link crawling, accessibility checks, and smoke testing; paired with comprehensive documentation defining subagent metadata, quick reference, and detailed QA methodology.
Workflow Documentation & Integration
tools/browser/browser-automation.md, workflows/milestone-validation.md, workflows/mission-orchestrator.md
Updated workflow guidance and orchestrator references to integrate browser QA into validation pipeline; added browser QA decision tree, dev server management, and failure classification guidance.

Sequence Diagram

sequenceDiagram
    participant MVW as Milestone<br/>Validation Worker
    participant QA as Browser QA<br/>Helper Script
    participant PM as Playwright<br/>Manager
    participant APP as Application<br/>Under Test
    participant RPT as QA Report<br/>Generator

    MVW->>QA: invoke with URL + config
    QA->>QA: verify prerequisites<br/>(Node, npx, playwright)
    QA->>QA: poll for URL availability
    QA->>PM: generate dynamic .mjs<br/>QA script
    PM->>APP: navigate to start URL
    APP-->>PM: page loaded
    PM->>PM: capture screenshot
    PM->>PM: crawl internal links
    PM->>PM: check console errors
    PM->>PM: validate ARIA structure
    APP-->>PM: link responses + errors
    PM->>RPT: aggregate findings
    RPT-->>QA: JSON + findings
    QA->>QA: format markdown report
    QA-->>MVW: return QA results
    MVW->>MVW: consolidate validation<br/>summary
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

enhancement, needs-review

Poem

🎬 Screenshots dance, links respond,
Accessibility reigns supreme,
Mission milestones shine beyond
The smoke tests' gleaming dream.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title directly and clearly summarizes the main change: introducing mission-aware browser QA for milestone validation via new scripts and documentation.
Linked Issues check ✅ Passed The pull request fully implements all primary coding objectives from issue #2503: Playwright-based visual QA script with run/screenshot/links/a11y/smoke commands, integration with milestone-validation workflow, zero-config CLI operation, and JSON/markdown reporting.
Out of Scope Changes check ✅ Passed All changes are directly scoped to implementing mission-aware browser QA for milestone validation. Documentation updates, script additions, and workflow extensions all align with the issue #2503 objectives; no extraneous changes detected.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/t1359-browser-qa-milestone-validation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 97 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Sat Feb 28 00:51:21 UTC 2026: Code review monitoring started
Sat Feb 28 00:51:22 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 97

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 97
  • VULNERABILITIES: 0

Generated on: Sat Feb 28 00:51:24 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.agents/subagent-index.toon (1)

102-120: ⚠️ Potential issue | 🟠 Major

Fix workflow table cardinality and duplicate milestone-validation entry.

Line 102 declares workflows[15], but Lines 103-119 currently define 16 rows, and milestone-validation appears twice (Lines 110 and 119). This can break deterministic parsing/indexing.

Proposed fix
-<!--TOON:workflows[15]{name,file,purpose}:
+<!--TOON:workflows[15]{name,file,purpose}:
 ...
-milestone-validation,workflows/milestone-validation.md,Milestone validation for mission system — tests build lint browser QA
+milestone-validation,workflows/milestone-validation.md,Milestone validation worker for mission orchestration with browser QA
 ...
-mission-skill-learning,workflows/mission-skill-learning.md,Auto-capture reusable patterns from missions and suggest promotion
-milestone-validation,workflows/milestone-validation.md,Milestone validation worker for mission orchestration with browser QA
+mission-skill-learning,workflows/mission-skill-learning.md,Auto-capture reusable patterns from missions and suggest promotion
 -->
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/subagent-index.toon around lines 102 - 120, The workflows table
declares workflows[15] but contains 16 entries and duplicates the
milestone-validation row; update the declaration or the entries so counts match
(either change workflows[15] to workflows[16] or remove the duplicated
milestone-validation line) and ensure the remaining entries (e.g., git-workflow,
plans, release, version-bump, pr, preflight, postflight, milestone-validation,
browser-qa, ralph-loop, session-review, review-issue-pr, bug-fixing,
feature-development, conversation-starter, mission-skill-learning) are unique
and in the intended order.
🧹 Nitpick comments (2)
.agents/workflows/milestone-validation.md (1)

233-256: Guard DEV_PID cleanup in the shell snippet.

Line 255 assumes DEV_PID exists, but it’s only conditionally assigned. Guard it to avoid brittle copy/paste behavior in strict shells.

Proposed snippet adjustment
-# Cleanup
-kill "$DEV_PID" 2>/dev/null || true
+# Cleanup
+if [[ -n "${DEV_PID:-}" ]]; then
+  kill "$DEV_PID" 2>/dev/null || true
+fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/workflows/milestone-validation.md around lines 233 - 256, The
cleanup unconditionally calls kill "$DEV_PID" even though DEV_PID is only set
when a dev/start script is launched; update the snippet to guard the cleanup by
checking DEV_PID before killing it (e.g., ensure DEV_PID is non-empty and
optionally verify the PID is alive with a kill -0 check) and only then call
kill, falling back to a no-op on failure; refer to the DEV_PID variable and the
dev/start launch block when making the change.
.agents/workflows/mission-orchestrator.md (1)

673-673: Deduplicate the repeated milestone-validation Related reference.

Line 673 repeats workflows/milestone-validation.md, which is already listed earlier in the same Related section. Keeping one canonical entry reduces doc drift.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/workflows/mission-orchestrator.md at line 673, There is a duplicate
Related entry for `workflows/milestone-validation.md`; remove the repeated line
so only the single canonical `workflows/milestone-validation.md` entry remains
in the Related section (keep the earlier occurrence and delete the later `-
`workflows/milestone-validation.md` — Milestone validation worker (t1357.6)`
line).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.agents/scripts/browser-qa-helper.sh:
- Around line 359-375: The contrast findings collected into the results array
(variables: contrast_script, contrast_result, results) are never merged into the
final a11y report that currently emits only the ARIA/structure JSON; update the
emission logic that prints the generated script's ARIA/structure output to
include the corresponding contrast entry for each page (merge each page's
contrast_result into the same JSON object you output for ARIA/structure), ensure
the output stays valid JSON, and keep existing logging (log_warn/log_info)
behavior when the contrast script is missing or fails.
- Around line 153-161: The generated script injects raw CLI inputs into JS
variables (baseUrl, viewports, pages, outputDir, timeout, fullPage) which allows
quotes/backticks to break or alter execution; fix by serializing and escaping
all CLI-derived values before embedding (e.g., use a robust JSON/string-escaping
routine to produce safe JS literals for strings and arrays) and replace the
direct interpolations for baseUrl, viewports, pages, outputDir, timeout,
fullPage with the escaped/serialized versions; apply the same escaping pattern
to the other template injection sites noted (around the other generation
blocks).
- Around line 202-205: Because set -e can abort the script before rm -f
"$script_file" runs when node "$script_file" fails, wrap the node invocation so
cleanup always happens: either install a trap that unconditionally removes
"$script_file" and exits with the captured status (use trap 'rm -f
"$script_file"; exit $EXIT_CODE' EXIT with a saved EXIT_CODE), or temporarily
disable errexit (set +e) before node "$script_file", capture its exit code into
exit_code, then rm -f "$script_file" and return that exit_code, and (if you
changed shell flags) restore -e; apply the same pattern around the other
occurrences of node "$script_file" / rm -f "$script_file" identified in the
comment.
- Around line 47-55: The check_playwright() routine currently treats
`@playwright/mcp` as an acceptable fallback which is incorrect because generated
runtime scripts (screenshot, links, a11y, smoke) unconditionally import
'playwright'; remove the fallback: stop checking for `@playwright/mcp` and only
verify npx --no-install playwright --version, and fail loudly (log_error and
return non-zero) when playwright is absent; also update any other identical
checks in this file that validate runtime dependencies for the generated scripts
so they only accept the real 'playwright' package (not `@playwright/mcp`),
ensuring clear error messages that instruct installing 'playwright' and running
npx playwright install.

In @.agents/tools/browser/browser-qa.md:
- Around line 136-139: Update the "Common findings" section to accurately
reflect the crawler's scope: revise the "**Common findings**:" bullets (the
lines listing "404s from renamed/deleted pages", "Links to placeholder URLs
(`#`, `javascript:void(0)`)", and "Links to API endpoints that return errors
without auth") to note that the crawler only checks absolute internal http* URLs
and therefore will not reliably detect placeholder links like `#` or
`javascript:void(0)`; either remove those placeholder items or add an explicit
caveat stating they are outside current crawler detection and recommend an
alternate manual or regex-based check for placeholders.

---

Outside diff comments:
In @.agents/subagent-index.toon:
- Around line 102-120: The workflows table declares workflows[15] but contains
16 entries and duplicates the milestone-validation row; update the declaration
or the entries so counts match (either change workflows[15] to workflows[16] or
remove the duplicated milestone-validation line) and ensure the remaining
entries (e.g., git-workflow, plans, release, version-bump, pr, preflight,
postflight, milestone-validation, browser-qa, ralph-loop, session-review,
review-issue-pr, bug-fixing, feature-development, conversation-starter,
mission-skill-learning) are unique and in the intended order.

---

Nitpick comments:
In @.agents/workflows/milestone-validation.md:
- Around line 233-256: The cleanup unconditionally calls kill "$DEV_PID" even
though DEV_PID is only set when a dev/start script is launched; update the
snippet to guard the cleanup by checking DEV_PID before killing it (e.g., ensure
DEV_PID is non-empty and optionally verify the PID is alive with a kill -0
check) and only then call kill, falling back to a no-op on failure; refer to the
DEV_PID variable and the dev/start launch block when making the change.

In @.agents/workflows/mission-orchestrator.md:
- Line 673: There is a duplicate Related entry for
`workflows/milestone-validation.md`; remove the repeated line so only the single
canonical `workflows/milestone-validation.md` entry remains in the Related
section (keep the earlier occurrence and delete the later `-
`workflows/milestone-validation.md` — Milestone validation worker (t1357.6)`
line).

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9f7c54c and 12088b9.

📒 Files selected for processing (7)
  • .agents/AGENTS.md
  • .agents/scripts/browser-qa-helper.sh
  • .agents/subagent-index.toon
  • .agents/tools/browser/browser-automation.md
  • .agents/tools/browser/browser-qa.md
  • .agents/workflows/milestone-validation.md
  • .agents/workflows/mission-orchestrator.md

Comment thread .agents/scripts/browser-qa-helper.sh
Comment thread .agents/scripts/browser-qa-helper.sh
Comment thread .agents/scripts/browser-qa-helper.sh Outdated
Comment thread .agents/scripts/browser-qa-helper.sh
Comment thread .agents/tools/browser/browser-qa.md
@alex-solovyev
Copy link
Copy Markdown
Collaborator

Worker running 4h+ with no PR activity — killed as stuck. Re-dispatching fresh worker to address CodeRabbit review feedback.

- Fix workflow table cardinality (workflows[15] -> [16]) and remove
  duplicate milestone-validation entry in subagent-index.toon
- Remove duplicate milestone-validation Related entry in mission-orchestrator.md
- Guard DEV_PID cleanup with -n check in milestone-validation.md
- Remove @playwright/mcp fallback in check_playwright() — require real playwright
- Add js_escape_string() and escape all CLI inputs before JS embedding
- Merge contrast findings into a11y report via process.argv JSON passing
- Fix set -e safety: use || exit_code=$? pattern for node invocations
- Update browser-qa.md common findings to note crawler scope limitation

Addresses all actionable comments from CodeRabbit CHANGES_REQUESTED review.
@github-actions
Copy link
Copy Markdown
Contributor

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 98 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Sat Feb 28 01:14:39 UTC 2026: Code review monitoring started
Sat Feb 28 01:14:40 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 98

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 98
  • VULNERABILITIES: 0

Generated on: Sat Feb 28 01:14:43 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

Replace fragile manual bash array-to-JSON serialization with jq-based
construction. The prior commit passed contrast data via process.argv[2]
but built the JSON by concatenating raw strings — if contrast_result
contained special characters, the JSON would be malformed. Now uses
jq --argjson for safe JSON composition throughout the contrast
collection loop.
@github-actions
Copy link
Copy Markdown
Contributor

🔍 Code Quality Report

�[0;35m[MONITOR]�[0m Code Review Monitoring Report

�[0;34m[INFO]�[0m Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 98 code smells

�[0;34m[INFO]�[0m Recent monitoring activity:
Sat Feb 28 01:20:40 UTC 2026: Code review monitoring started
Sat Feb 28 01:20:41 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 98

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 98
  • VULNERABILITIES: 0

Generated on: Sat Feb 28 01:20:43 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@sonarqubecloud
Copy link
Copy Markdown

@marcusquinn
Copy link
Copy Markdown
Owner Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Feb 28, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@marcusquinn
Copy link
Copy Markdown
Owner Author

@coderabbitai resolve

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Feb 28, 2026

✅ Actions performed

Comments resolved and changes approved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

t1359: Mission-aware browser QA in milestone validation

2 participants