Merged
29 changes: 28 additions & 1 deletion .agents/AGENTS.md
@@ -5,7 +5,7 @@ mode: subagent

**New to aidevops?** Type `/onboarding` to get started with an interactive setup wizard.

-**Supported tools:** [OpenCode](https://opencode.ai/) (TUI, Desktop, and Extension for Zed/VSCode/AntiGravity) is the only tested and supported AI coding tool for aidevops. The claude-code CLI is used as a companion tool called from within OpenCode. aidevops is also available in the Claude marketplace.
+**Supported tools:** [OpenCode](https://opencode.ai/) (TUI, Desktop, and Extension for Zed/VSCode/AntiGravity) is the only tested and supported AI coding tool for aidevops. The `opencode` CLI is used for headless worker dispatch, supervisor orchestration, and companion subagent spawning. aidevops is also available in the Claude marketplace.

**Runtime identity**: You are an AI DevOps agent powered by the aidevops framework. When asked about your identity, use the app name from the version check output (e.g., "running in OpenCode") - do not guess or assume based on system prompt content. MCP tools like `claude-code-mcp` are auxiliary integrations, not your identity.

@@ -201,6 +201,32 @@ worktree-helper.sh add feature/x # Fallback

**Full docs**: `workflows/git-workflow.md`, `tools/git/worktrunk.md`

## Autonomous Orchestration

**CLI**: `opencode` is the ONLY supported CLI for worker dispatch. Never use `claude` CLI.

**Supervisor** (`supervisor-helper.sh`): Manages parallel task execution with SQLite state machine.

```bash
# Add tasks and create batch
supervisor-helper.sh add t001 --repo "$(pwd)" --description "Task description"
supervisor-helper.sh batch "my-batch" --concurrency 3 --tasks "t001,t002,t003"

# Install cron pulse (REQUIRED for autonomous operation)
supervisor-helper.sh cron install

# Manual pulse (cron does this automatically every 2 minutes)
supervisor-helper.sh pulse --batch <batch-id>

# Monitor
supervisor-helper.sh dashboard --batch <batch-id>
supervisor-helper.sh status <batch-id>
```

**Cron pulse is mandatory** for autonomous operation. Without it, the supervisor is passive and requires manual `pulse` calls. The pulse cycle: check workers -> evaluate outcomes -> dispatch next -> cleanup.

**Full docs**: `tools/ai-assistants/headless-dispatch.md`, `supervisor-helper.sh help`

## Session Completion

Run `/session-review` before ending. Suggest new sessions after PR merge, domain switch, or 3+ hours.
@@ -239,6 +265,7 @@ Orchestration agents can create drafts in `draft/` for reusable parallel process
| Video | `tools/video/video-prompt-design.md`, `tools/video/remotion.md`, `tools/video/higgsfield.md` |
| Voice | `tools/voice/speech-to-speech.md`, `voice-helper.sh talk` (voice bridge) |
| Parallel agents | `tools/ai-assistants/headless-dispatch.md`, `tools/ai-assistants/runners/` |
| Orchestration | `supervisor-helper.sh` (batch dispatch, cron pulse, self-healing) |
| MCP dev | `tools/build-mcp/build-mcp.md` |
| Agent design | `tools/build-agent/build-agent.md` |
| Framework | `aidevops/architecture.md` |
9 changes: 5 additions & 4 deletions .agents/scripts/agent-test-helper.sh
@@ -60,17 +60,18 @@ readonly SUITES_DIR="${TEST_DIR}/suites"
 readonly RESULTS_DIR="${TEST_DIR}/results"
 readonly BASELINES_DIR="${TEST_DIR}/baselines"

-# CLI detection
+# CLI detection - opencode is the primary and only supported CLI
 detect_cli() {
     local cli="${AGENT_TEST_CLI:-}"
     if [[ -n "$cli" ]]; then
         echo "$cli"
         return 0
     fi
-    if command -v claude >/dev/null 2>&1; then
-        echo "claude"
-    elif command -v opencode >/dev/null 2>&1; then
+    if command -v opencode >/dev/null 2>&1; then
         echo "opencode"
+    elif command -v claude >/dev/null 2>&1; then
+        # DEPRECATED: claude CLI fallback - install opencode instead
+        echo "claude"
     else
         echo ""
     fi
4 changes: 4 additions & 0 deletions .agents/scripts/pre-edit-check.sh
@@ -78,6 +78,10 @@ is_docs_only() {
return 1
}

RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
BOLD='\033[1m'

# Check if we're in a git repository
30 changes: 23 additions & 7 deletions .agents/scripts/supervisor-helper.sh
@@ -64,6 +64,17 @@
# - Mail: mail-helper.sh for escalation
# - Memory: memory-helper.sh for cross-batch learning
# - Git: wt/worktree-helper.sh for isolation
#
# IMPORTANT - Orchestration Requirements:
# - CLI: opencode is the ONLY supported CLI for worker dispatch.
# claude CLI fallback is DEPRECATED and will be removed.
# Install: npm i -g opencode (https://opencode.ai/)
# - Cron pulse: For autonomous operation, install the cron pulse:
# supervisor-helper.sh cron install
# This runs `pulse` every 2 minutes to check/dispatch/evaluate workers.
# Without cron, the supervisor is passive and requires manual `pulse` calls.
# - Batch lifecycle: add tasks -> create batch -> cron pulse handles the rest
# The pulse cycle: check workers -> evaluate outcomes -> dispatch next -> cleanup

set -euo pipefail

@@ -236,7 +247,7 @@ check_system_load() {
     process_count=$(ps aux 2>/dev/null | wc -l | tr -d ' ')
     echo "process_count=$process_count"

-    # Supervisor worker process count (opencode/claude spawned by supervisor)
+    # Supervisor worker process count (opencode workers spawned by supervisor)
     local supervisor_process_count=0
     if [[ -d "$SUPERVISOR_DIR/pids" ]]; then
         for pid_file in "$SUPERVISOR_DIR/pids"/*.pid; do
@@ -1718,25 +1729,30 @@ detect_dispatch_mode() {

 #######################################
 # Resolve the AI CLI tool to use for dispatch
-# Prefers opencode, falls back to claude
+# Resolve AI CLI for worker dispatch
+# opencode is the ONLY supported CLI for aidevops supervisor workers.
+# claude CLI fallback is DEPRECATED and will be removed in a future release.
 #######################################
 resolve_ai_cli() {
-    # Prefer opencode (supports Anthropic auth + zen free models as fallback)
+    # opencode is the primary and only supported CLI
     if command -v opencode &>/dev/null; then
         echo "opencode"
         return 0
     fi
+    # DEPRECATED: claude CLI fallback - will be removed
     if command -v claude &>/dev/null; then
+        log_warning "Using deprecated claude CLI fallback. Install opencode: npm i -g opencode"
         echo "claude"
         return 0
     fi
-    log_error "Neither opencode nor claude CLI found. Install one to dispatch workers."
+    log_error "opencode CLI not found. Install it: npm i -g opencode"
+    log_error "See: https://opencode.ai/docs/installation/"
Comment on lines +1742 to +1749

⚠️ Potential issue | 🔴 Critical

Fix undefined log_warning call (will exit under set -e).

log_warning isn’t defined in this script (only log_warn exists). Under set -euo pipefail, hitting the claude fallback will terminate the supervisor.

🐛 Suggested fix
-        log_warning "Using deprecated claude CLI fallback. Install opencode: npm i -g opencode"
+        log_warn "Using deprecated claude CLI fallback. Install opencode: npm i -g opencode"
🤖 Prompt for AI Agents
In @.agents/scripts/supervisor-helper.sh around lines 1742 - 1749, The issue:
the script calls undefined log_warning in the claude CLI fallback, which will
fail under set -e; update the fallback to call the existing log_warn function
(or define log_warning) so the logging call succeeds; locate the block that
checks "if command -v claude" and replace log_warning with log_warn (or add a
compatible log_warning wrapper that delegates to log_warn) and keep the rest of
the flow (echo "claude" and return 0) unchanged.
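
The wrapper option mentioned in the prompt above could look roughly like the sketch below. It is an illustration, not the script's code: the real `log_warn` body in `supervisor-helper.sh` is not shown in this diff, so the one here is a stand-in that is assumed to write to stderr, and `resolve_cli_demo` is a hypothetical reduction of the deprecated fallback branch. The point is only that, once `log_warning` delegates to `log_warn`, the call succeeds under `set -euo pipefail`.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for the script's existing logger (assumption: it writes to stderr).
log_warn() {
    echo "[WARN] $*" >&2
}

# Compatibility wrapper: the misspelled name delegates to the real logger,
# so any call site using log_warning keeps working.
log_warning() {
    log_warn "$@"
}

# Hypothetical reduction of the deprecated claude fallback branch.
resolve_cli_demo() {
    log_warning "Using deprecated claude CLI fallback. Install opencode: npm i -g opencode"
    echo "claude"
    return 0
}

# The warning goes to stderr, so stdout carries only the CLI name.
cli="$(resolve_cli_demo)"
echo "resolved: $cli"
```

Renaming the call to `log_warn`, as in the suggested fix, is the smaller change; a wrapper like this only makes sense if other call sites use the `log_warning` name as well.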

     return 1
 }

#######################################
# Resolve the best available model for a given task tier
-# Priority: Anthropic SOTA via opencode > claude CLI > opencode zen free
+# Priority: Anthropic SOTA via opencode (only supported CLI)
#
# Tiers:
# coding - Best SOTA model for code tasks (default)
@@ -2100,7 +2116,7 @@ _list_descendants() {

#######################################
# Kill all orphaned worker processes (emergency cleanup)
-# Finds opencode/claude processes with PPID=1 that match supervisor patterns
+# Finds opencode worker processes with PPID=1 that match supervisor patterns
#######################################
cmd_kill_workers() {
local dry_run=false
@@ -2151,7 +2167,7 @@ cmd_kill_workers() {

     log_info "Protected PIDs (active workers + self): $(echo "$protected_pattern" | tr '|' ' ' | wc -w | tr -d ' ') processes"

-    # Find orphaned opencode/claude processes (PPID=1, not in any terminal session)
+    # Find orphaned opencode worker processes (PPID=1, not in any terminal session)
     local orphan_count=0
     local killed_count=0

132 changes: 132 additions & 0 deletions .agents/seo/mom-test-ux.md
@@ -0,0 +1,132 @@
---
description: "Mom Test UX evaluation - Apple-inspired 'Would this confuse my mom?' usability and CRO analysis. Use when the user wants a UX audit, usability review, conversion friction analysis, or asks why users aren't converting."
mode: subagent
tools:
  read: true
  write: false
  edit: false
  bash: true
  webfetch: true
---

# Mom Test UX / CRO Agent

<!-- AI-CONTEXT-START -->

## Quick Reference

- **Purpose**: Evaluate any page/screen with "Would this confuse my mom?" heuristic
- **Philosophy**: If a non-technical person can't complete the task in under 10 seconds of thought, the UX has failed
- **Input**: URL, screenshot, or ARIA snapshot
- **Output**: Actionable fix table with severity, effort, and impact ratings

## The 6 UX Principles

Every element is evaluated against these principles:

| # | Principle | Mom Test Question |
|---|-----------|-------------------|
| 1 | **Clarity** | "What is this page asking me to do?" |
| 2 | **Simplicity** | "Why are there so many things on this screen?" |
| 3 | **Consistency** | "This button looked different on the last page?" |
| 4 | **Feedback** | "Did anything happen when I clicked that?" |
| 5 | **Discoverability** | "Where do I go to find X?" |
| 6 | **Forgiveness** | "I clicked the wrong thing - how do I go back?" |

## Severity Ranking

| Level | Label | Definition | Example |
|-------|-------|------------|---------|
| S1 | **Blocker** | User cannot complete the task | CTA invisible, form broken, dead click |
| S2 | **Major** | User completes task but with significant confusion | Ambiguous labels, hidden pricing, unclear next step |
| S3 | **Minor** | User notices friction but works through it | Inconsistent styling, slow feedback, extra clicks |
| S4 | **Polish** | Professional refinement | Spacing, micro-copy tone, animation timing |

<!-- AI-CONTEXT-END -->

## Workflow

### Step 1: Capture the Page

Use browser automation to get the page state:

```bash
# ARIA snapshot (preferred - fast, structured, no vision tokens)
playwright screenshot --aria-snapshot https://example.com/pricing

# Full screenshot (for layout/visual issues)
playwright screenshot https://example.com/pricing --full-page
```
Comment on lines +53 to +59

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

Playwright CLI screenshot command --aria-snapshot flag documentation

💡 Result:

There is no --aria-snapshot flag for the Playwright CLI screenshot command in the official docs/CLI reference. The CLI docs explicitly point you to discover the current screenshot options via npx playwright screenshot --help (and overall CLI options via npx playwright --help). [1]

“Aria snapshots” are a separate feature: they produce a YAML representation of the accessibility tree and are generated/used via Playwright tests (e.g., expect(locator).toMatchAriaSnapshot(...), updated with npx playwright test -u/--update-snapshots) and via the Code Generator UI (“Assert snapshot” / “Aria snapshot” tab)—not via the screenshot CLI. [2]

Sources: [1] [2]


🌐 Web query:

Playwright screenshot ARIA snapshot command line

💡 Result:

Screenshot (CLI)

npx playwright screenshot --full-page https://example.com example.png

(See also npx playwright screenshot --help for options like --wait-for-timeout.) [1]

ARIA snapshot (there isn’t a single “aria snapshot” CLI like screenshot)

ARIA snapshots are produced via Playwright Test assertions (or via Codegen), then written/updated using the test runner. [2]

1) Add an ARIA snapshot assertion in a test

import { test, expect } from '@playwright/test';

test('aria snapshot', async ({ page }) => {
  await page.goto('https://example.com');
  await expect(page.getByRole('main')).toMatchAriaSnapshot({ name: 'main.aria.yml' });
});

[2]

2) Generate / update the .aria.yml snapshot from the command line

npx playwright test --update-snapshots
# or: npx playwright test -u

[2][3]

Optional (print YAML to stdout):

const yaml = await page.locator('body').ariaSnapshot();
console.log(yaml);

[2]

Sources: [1] [2] [3]


Remove non-existent --aria-snapshot flag and use correct ARIA snapshot workflow.

The playwright screenshot --aria-snapshot flag does not exist in the official Playwright CLI. ARIA snapshots are generated through Playwright tests, not the screenshot command. Replace this example with the correct approach:

// In test file
import { test, expect } from '@playwright/test';

test('aria snapshot', async ({ page }) => {
  await page.goto('https://example.com/pricing');
  await expect(page.getByRole('main')).toMatchAriaSnapshot({ name: 'pricing.aria.yml' });
});

Then update snapshots via:

npx playwright test --update-snapshots

This example violates the .agents/**/*.md guideline requiring code examples to be authoritative. If ARIA snapshots aren't the focus of this doc, use a file:line reference to canonical instructions instead.

🤖 Prompt for AI Agents
In @.agents/seo/mom-test-ux.md around lines 53 - 59, Replace the non-existent
CLI flag usage "playwright screenshot --aria-snapshot" with the canonical ARIA
snapshot test workflow: remove references to the "--aria-snapshot" flag and the
misuse of "playwright screenshot", and instead show creating an actual
Playwright test that navigates to the page and calls the ARIA snapshot matcher
(e.g., toMatchAriaSnapshot on a locator such as page.getByRole('main')), then
instruct readers to update snapshots with "npx playwright test
--update-snapshots"; if ARIA snapshots aren’t central to this doc, replace the
example with a file:line reference to the official Playwright ARIA snapshot docs
instead.
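
One way to catch a non-existent flag before it reaches the docs is to check the CLI's own help text, which the first source above points to (`npx playwright screenshot --help`). The sketch below is a hypothetical helper and not part of this PR: `help_mentions_flag` reads help text on stdin and succeeds only if the exact flag appears. The sample help text is made up for illustration and is not real Playwright output; it reuses only the two flags named in the sources above.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeeds if the help text on stdin lists the exact flag.
# A delimiter is required after the flag so that "--full" does not match
# "--full-page".
help_mentions_flag() {
    local flag="$1"
    grep -Eq -- "(^|[[:space:],])${flag}([[:space:],=]|\$)"
}

# Made-up sample help text (NOT real Playwright output).
sample_help='Options:
  --full-page                 capture the whole page
  --wait-for-timeout <ms>     wait before capturing'

for flag in --full-page --aria-snapshot; do
    if help_mentions_flag "$flag" <<<"$sample_help"; then
        echo "$flag: listed"
    else
        echo "$flag: not listed"
    fi
done
```

Against the real CLI the helper would be fed the actual help output, for example `help_mentions_flag --aria-snapshot < <(npx playwright screenshot --help)`; the result then depends on the installed Playwright version.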


Or use Stagehand/Playwright programmatically. For manual review, the user provides the URL and you fetch with `webfetch`.

### Step 2: Screen-by-Screen Analysis

For each screen/page, generate the findings table:

| Confusing Element | Mom's Reaction | Principle | Severity | Fix |
|-------------------|----------------|-----------|----------|-----|
| CTA says "Get Started" with no context | "Get started with what?" | Clarity | S2 | Change to "Start Free 14-Day Trial" |
| Three pricing tiers with 20+ feature rows | "I don't know which one I need" | Simplicity | S2 | Highlight recommended plan, collapse features into "Most popular for..." |
| Form shows no error until submit | "Did it work? Nothing happened" | Feedback | S1 | Add inline validation on blur |
| Navigation has "Solutions" dropdown with 12 items | "I just want to see what you do" | Discoverability | S3 | Reduce to 4-5 grouped categories |
| No back button in checkout flow | "I'm stuck, I'll just leave" | Forgiveness | S1 | Add breadcrumb and back navigation |

### Step 3: Quick Wins Matrix

Prioritise fixes by effort vs. impact:

| Fix | Impact | Effort | Priority |
|-----|--------|--------|----------|
| Rewrite CTA copy | High | 10 min | **Do first** |
| Add inline form validation | High | 2-4 hrs | **Do first** |
| Add breadcrumb nav | Medium | 1-2 hrs | **Schedule** |
| Redesign pricing table | High | 1-2 days | **Plan** |
| Adjust spacing/polish | Low | 30 min | **Batch later** |

Priority rules: High impact + Low effort = Do first. High impact + High effort = Plan. Low impact = Batch later.

### Step 4: CRO Recommendations

Apply proven UX patterns that directly improve conversion:

| Pattern | Why It Works | Implementation |
|---------|-------------|----------------|
| Single primary CTA per viewport | Reduces decision paralysis | Remove competing links near the main action |
| Social proof near decision point | Reduces anxiety at commitment | Add testimonial/count badge within 200px of CTA |
| Progress indicator on multi-step flows | Sets expectation, reduces abandonment | "Step 2 of 3" breadcrumb bar |
| Benefit-first headlines | Answers "what's in it for me" instantly | Replace feature-speak with outcome language |
| Friction logging | Quantifies UX debt | Track rage clicks, dead clicks, U-turns in analytics |

## Browser Automation Integration

### Automated UX Scan

Use Playwright to programmatically check for common Mom Test failures:

1. **ARIA snapshot** - Parse `page.accessibility.snapshot()` for structure issues
2. **Hidden CTAs** - Query `button, a[href]` and flag any with `offsetParent === null`
3. **Vague link text** - Flag buttons/links with text like "Click here", "Learn more", "Submit"
4. **Unlabelled inputs** - Find `input` elements missing `<label>` and `aria-label`
5. **Missing feedback** - Check forms for absence of `aria-live` regions or inline validation

Each finding maps to a severity (S1-S4) and principle (Clarity/Simplicity/etc).

## Output Format

The final report follows this structure:

1. **One-line verdict**: Pass / Needs Work / Fail (with overall severity)
2. **Findings table**: Every issue with Severity, Principle, Mom's Reaction, Fix
3. **Quick wins**: Sorted by impact/effort, with time estimates
4. **CRO opportunities**: Patterns applicable to this specific page
5. **Accessibility flags**: Any WCAG violations found during analysis (cross-ref `seo/seo-audit-skill.md`)

Every fix must be **specific and implementable** - not "improve the copy" but "Change H1 from 'Welcome to Our Platform' to 'Ship 2x Faster With Zero DevOps'".

## Related

- `seo/seo-audit-skill.md` - Full SEO audit (references page-cro for conversion)
- `seo/analytics-tracking.md` - Measure UX improvements with event tracking
- `tools/browser/browser-automation.md` - Browser tool selection for page analysis
- `seo/programmatic-seo.md` - Applying UX patterns at scale across generated pages