t1359: Mission-aware browser QA for milestone validation#2543
t1359: Mission-aware browser QA for milestone validation#2543marcusquinn merged 3 commits intomainfrom
Conversation
Add Playwright-based visual testing pipeline for mission milestone validation. Creates browser-qa-helper.sh (smoke tests, screenshots, broken link detection, accessibility checks), browser-qa.md subagent doc, and milestone-validation.md workflow. Integrates with existing browser automation stack and mission orchestrator Phase 4.
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (5)
WalkthroughThe PR introduces a mission-aware browser QA system for milestone validation, including a new Playwright-based QA helper script, comprehensive documentation, and integration across existing workflows and tooling references. Changes
Sequence DiagramsequenceDiagram
participant MVW as Milestone<br/>Validation Worker
participant QA as Browser QA<br/>Helper Script
participant PM as Playwright<br/>Manager
participant APP as Application<br/>Under Test
participant RPT as QA Report<br/>Generator
MVW->>QA: invoke with URL + config
QA->>QA: verify prerequisites<br/>(Node, npx, playwright)
QA->>QA: poll for URL availability
QA->>PM: generate dynamic .mjs<br/>QA script
PM->>APP: navigate to start URL
APP-->>PM: page loaded
PM->>PM: capture screenshot
PM->>PM: crawl internal links
PM->>PM: check console errors
PM->>PM: validate ARIA structure
APP-->>PM: link responses + errors
PM->>RPT: aggregate findings
RPT-->>QA: JSON + findings
QA->>QA: format markdown report
QA-->>MVW: return QA results
MVW->>MVW: consolidate validation<br/>summary
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Sat Feb 28 00:51:24 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.agents/subagent-index.toon (1)
102-120:⚠️ Potential issue | 🟠 MajorFix workflow table cardinality and duplicate
milestone-validationentry.Line 102 declares
workflows[15], but Lines 103-119 currently define 16 rows, andmilestone-validationappears twice (Lines 110 and 119). This can break deterministic parsing/indexing.Proposed fix
-<!--TOON:workflows[15]{name,file,purpose}: +<!--TOON:workflows[15]{name,file,purpose}: ... -milestone-validation,workflows/milestone-validation.md,Milestone validation for mission system — tests build lint browser QA +milestone-validation,workflows/milestone-validation.md,Milestone validation worker for mission orchestration with browser QA ... -mission-skill-learning,workflows/mission-skill-learning.md,Auto-capture reusable patterns from missions and suggest promotion -milestone-validation,workflows/milestone-validation.md,Milestone validation worker for mission orchestration with browser QA +mission-skill-learning,workflows/mission-skill-learning.md,Auto-capture reusable patterns from missions and suggest promotion -->🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.agents/subagent-index.toon around lines 102 - 120, The workflows table declares workflows[15] but contains 16 entries and duplicates the milestone-validation row; update the declaration or the entries so counts match (either change workflows[15] to workflows[16] or remove the duplicated milestone-validation line) and ensure the remaining entries (e.g., git-workflow, plans, release, version-bump, pr, preflight, postflight, milestone-validation, browser-qa, ralph-loop, session-review, review-issue-pr, bug-fixing, feature-development, conversation-starter, mission-skill-learning) are unique and in the intended order.
🧹 Nitpick comments (2)
.agents/workflows/milestone-validation.md (1)
233-256: GuardDEV_PIDcleanup in the shell snippet.Line 255 assumes
DEV_PIDexists, but it’s only conditionally assigned. Guard it to avoid brittle copy/paste behavior in strict shells.Proposed snippet adjustment
-# Cleanup -kill "$DEV_PID" 2>/dev/null || true +# Cleanup +if [[ -n "${DEV_PID:-}" ]]; then + kill "$DEV_PID" 2>/dev/null || true +fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.agents/workflows/milestone-validation.md around lines 233 - 256, The cleanup unconditionally calls kill "$DEV_PID" even though DEV_PID is only set when a dev/start script is launched; update the snippet to guard the cleanup by checking DEV_PID before killing it (e.g., ensure DEV_PID is non-empty and optionally verify the PID is alive with a kill -0 check) and only then call kill, falling back to a no-op on failure; refer to the DEV_PID variable and the dev/start launch block when making the change..agents/workflows/mission-orchestrator.md (1)
673-673: Deduplicate the repeatedmilestone-validationRelated reference.Line 673 repeats
workflows/milestone-validation.md, which is already listed earlier in the same Related section. Keeping one canonical entry reduces doc drift.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.agents/workflows/mission-orchestrator.md at line 673, There is a duplicate Related entry for `workflows/milestone-validation.md`; remove the repeated line so only the single canonical `workflows/milestone-validation.md` entry remains in the Related section (keep the earlier occurrence and delete the later `- `workflows/milestone-validation.md` — Milestone validation worker (t1357.6)` line).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.agents/scripts/browser-qa-helper.sh:
- Around line 359-375: The contrast findings collected into the results array
(variables: contrast_script, contrast_result, results) are never merged into the
final a11y report that currently emits only the ARIA/structure JSON; update the
emission logic that prints the generated script's ARIA/structure output to
include the corresponding contrast entry for each page (merge each page's
contrast_result into the same JSON object you output for ARIA/structure), ensure
the output stays valid JSON, and keep existing logging (log_warn/log_info)
behavior when the contrast script is missing or fails.
- Around line 153-161: The generated script injects raw CLI inputs into JS
variables (baseUrl, viewports, pages, outputDir, timeout, fullPage) which allows
quotes/backticks to break or alter execution; fix by serializing and escaping
all CLI-derived values before embedding (e.g., use a robust JSON/string-escaping
routine to produce safe JS literals for strings and arrays) and replace the
direct interpolations for baseUrl, viewports, pages, outputDir, timeout,
fullPage with the escaped/serialized versions; apply the same escaping pattern
to the other template injection sites noted (around the other generation
blocks).
- Around line 202-205: Because set -e can abort the script before rm -f
"$script_file" runs when node "$script_file" fails, wrap the node invocation so
cleanup always happens: either install a trap that unconditionally removes
"$script_file" and exits with the captured status (use trap 'rm -f
"$script_file"; exit $EXIT_CODE' EXIT with a saved EXIT_CODE), or temporarily
disable errexit (set +e) before node "$script_file", capture its exit code into
exit_code, then rm -f "$script_file" and return that exit_code, and (if you
changed shell flags) restore -e; apply the same pattern around the other
occurrences of node "$script_file" / rm -f "$script_file" identified in the
comment.
- Around line 47-55: The check_playwright() routine currently treats
`@playwright/mcp` as an acceptable fallback which is incorrect because generated
runtime scripts (screenshot, links, a11y, smoke) unconditionally import
'playwright'; remove the fallback: stop checking for `@playwright/mcp` and only
verify npx --no-install playwright --version, and fail loudly (log_error and
return non-zero) when playwright is absent; also update any other identical
checks in this file that validate runtime dependencies for the generated scripts
so they only accept the real 'playwright' package (not `@playwright/mcp`),
ensuring clear error messages that instruct installing 'playwright' and running
npx playwright install.
In @.agents/tools/browser/browser-qa.md:
- Around line 136-139: Update the "Common findings" section to accurately
reflect the crawler's scope: revise the "**Common findings**:" bullets (the
lines listing "404s from renamed/deleted pages", "Links to placeholder URLs
(`#`, `javascript:void(0)`)", and "Links to API endpoints that return errors
without auth") to note that the crawler only checks absolute internal http* URLs
and therefore will not reliably detect placeholder links like `#` or
`javascript:void(0)`; either remove those placeholder items or add an explicit
caveat stating they are outside current crawler detection and recommend an
alternate manual or regex-based check for placeholders.
---
Outside diff comments:
In @.agents/subagent-index.toon:
- Around line 102-120: The workflows table declares workflows[15] but contains
16 entries and duplicates the milestone-validation row; update the declaration
or the entries so counts match (either change workflows[15] to workflows[16] or
remove the duplicated milestone-validation line) and ensure the remaining
entries (e.g., git-workflow, plans, release, version-bump, pr, preflight,
postflight, milestone-validation, browser-qa, ralph-loop, session-review,
review-issue-pr, bug-fixing, feature-development, conversation-starter,
mission-skill-learning) are unique and in the intended order.
---
Nitpick comments:
In @.agents/workflows/milestone-validation.md:
- Around line 233-256: The cleanup unconditionally calls kill "$DEV_PID" even
though DEV_PID is only set when a dev/start script is launched; update the
snippet to guard the cleanup by checking DEV_PID before killing it (e.g., ensure
DEV_PID is non-empty and optionally verify the PID is alive with a kill -0
check) and only then call kill, falling back to a no-op on failure; refer to the
DEV_PID variable and the dev/start launch block when making the change.
In @.agents/workflows/mission-orchestrator.md:
- Line 673: There is a duplicate Related entry for
`workflows/milestone-validation.md`; remove the repeated line so only the single
canonical `workflows/milestone-validation.md` entry remains in the Related
section (keep the earlier occurrence and delete the later `-
`workflows/milestone-validation.md` — Milestone validation worker (t1357.6)`
line).
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
.agents/AGENTS.md.agents/scripts/browser-qa-helper.sh.agents/subagent-index.toon.agents/tools/browser/browser-automation.md.agents/tools/browser/browser-qa.md.agents/workflows/milestone-validation.md.agents/workflows/mission-orchestrator.md
|
Worker running 4h+ with no PR activity — killed as stuck. Re-dispatching fresh worker to address CodeRabbit review feedback. |
- Fix workflow table cardinality (workflows[15] -> [16]) and remove duplicate milestone-validation entry in subagent-index.toon - Remove duplicate milestone-validation Related entry in mission-orchestrator.md - Guard DEV_PID cleanup with -n check in milestone-validation.md - Remove @playwright/mcp fallback in check_playwright() — require real playwright - Add js_escape_string() and escape all CLI inputs before JS embedding - Merge contrast findings into a11y report via process.argv JSON passing - Fix set -e safety: use || exit_code=$? pattern for node invocations - Update browser-qa.md common findings to note crawler scope limitation Addresses all actionable comments from CodeRabbit CHANGES_REQUESTED review.
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Sat Feb 28 01:14:43 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
Replace fragile manual bash array-to-JSON serialization with jq-based construction. The prior commit passed contrast data via process.argv[2] but built the JSON by concatenating raw strings — if contrast_result contained special characters, the JSON would be malformed. Now uses jq --argjson for safe JSON composition throughout the contrast collection loop.
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Sat Feb 28 01:20:43 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai resolve |
✅ Actions performedComments resolved and changes approved. |



Summary
browser-qa-helper.sh(smoke tests, multi-viewport screenshots, broken link crawling, WCAG accessibility checks)browser-qa.mdsubagent doc with severity mapping, 6-step pipeline, and visual regression guidancemilestone-validation.mdwith browser QA integration, "How to Think" section, severity classification, and dev server managementChanges
scripts/browser-qa-helper.shrun,screenshot,links,a11y,smokecommands (912 lines)tools/browser/browser-qa.mdworkflows/milestone-validation.mdworkflows/mission-orchestrator.mdtools/browser/browser-automation.mdAGENTS.mdsubagent-index.toonVerification
browser-qa-helper.sh.mdfilesSupersedes #2540 (closed, could not reopen after rebase).
Closes #2503
Summary by CodeRabbit
New Features
Documentation