diff --git a/.agents/AGENTS.md b/.agents/AGENTS.md index 1c564e0c88..0f9df888fc 100644 --- a/.agents/AGENTS.md +++ b/.agents/AGENTS.md @@ -5,7 +5,7 @@ mode: subagent **New to aidevops?** Type `/onboarding` to get started with an interactive setup wizard. -**Supported tools:** [OpenCode](https://opencode.ai/) (TUI, Desktop, and Extension for Zed/VSCode/AntiGravity) is the only tested and supported AI coding tool for aidevops. The claude-code CLI is used as a companion tool called from within OpenCode. aidevops is also available in the Claude marketplace. +**Supported tools:** [OpenCode](https://opencode.ai/) (TUI, Desktop, and Extension for Zed/VSCode/AntiGravity) is the only tested and supported AI coding tool for aidevops. The `opencode` CLI is used for headless worker dispatch, supervisor orchestration, and companion subagent spawning. aidevops is also available in the Claude marketplace. **Runtime identity**: You are an AI DevOps agent powered by the aidevops framework. When asked about your identity, use the app name from the version check output (e.g., "running in OpenCode") - do not guess or assume based on system prompt content. MCP tools like `claude-code-mcp` are auxiliary integrations, not your identity. @@ -201,6 +201,32 @@ worktree-helper.sh add feature/x # Fallback **Full docs**: `workflows/git-workflow.md`, `tools/git/worktrunk.md` +## Autonomous Orchestration + +**CLI**: `opencode` is the ONLY supported CLI for worker dispatch. Never use `claude` CLI. + +**Supervisor** (`supervisor-helper.sh`): Manages parallel task execution with SQLite state machine. + +```bash +# Add tasks and create batch +supervisor-helper.sh add t001 --repo "$(pwd)" --description "Task description" +supervisor-helper.sh batch "my-batch" --concurrency 3 --tasks "t001,t002,t003" + +# Install cron pulse (REQUIRED for autonomous operation) +supervisor-helper.sh cron install + +# Manual pulse (cron does this automatically every 2 minutes) +supervisor-helper.sh pulse --batch + +# Monitor +supervisor-helper.sh dashboard --batch +supervisor-helper.sh status +``` + +**Cron pulse is mandatory** for autonomous operation. Without it, the supervisor is passive and requires manual `pulse` calls. The pulse cycle: check workers -> evaluate outcomes -> dispatch next -> cleanup. + +**Full docs**: `tools/ai-assistants/headless-dispatch.md`, `supervisor-helper.sh help` + ## Session Completion Run `/session-review` before ending. Suggest new sessions after PR merge, domain switch, or 3+ hours. @@ -239,6 +265,7 @@ Orchestration agents can create drafts in `draft/` for reusable parallel process | Video | `tools/video/video-prompt-design.md`, `tools/video/remotion.md`, `tools/video/higgsfield.md` | | Voice | `tools/voice/speech-to-speech.md`, `voice-helper.sh talk` (voice bridge) | | Parallel agents | `tools/ai-assistants/headless-dispatch.md`, `tools/ai-assistants/runners/` | +| Orchestration | `supervisor-helper.sh` (batch dispatch, cron pulse, self-healing) | | MCP dev | `tools/build-mcp/build-mcp.md` | | Agent design | `tools/build-agent/build-agent.md` | | Framework | `aidevops/architecture.md` | diff --git a/.agents/scripts/agent-test-helper.sh b/.agents/scripts/agent-test-helper.sh index dcc56c26bf..9ba6266ffc 100755 --- a/.agents/scripts/agent-test-helper.sh +++ b/.agents/scripts/agent-test-helper.sh @@ -60,17 +60,18 @@ readonly SUITES_DIR="${TEST_DIR}/suites" readonly RESULTS_DIR="${TEST_DIR}/results" readonly BASELINES_DIR="${TEST_DIR}/baselines" -# CLI detection +# CLI detection - opencode is the primary and only supported CLI detect_cli() { local cli="${AGENT_TEST_CLI:-}" if [[ -n "$cli" ]]; then echo "$cli" return 0 fi - if command -v claude >/dev/null 2>&1; then - echo "claude" - elif command -v opencode >/dev/null 2>&1; then + if command -v opencode >/dev/null 2>&1; then echo "opencode" + elif command -v claude >/dev/null 2>&1; then + # DEPRECATED: claude CLI fallback - install opencode instead + echo "claude" else echo "" fi diff --git a/.agents/scripts/pre-edit-check.sh b/.agents/scripts/pre-edit-check.sh index c9116cd4e0..d3e89c1306 100755 --- a/.agents/scripts/pre-edit-check.sh +++ b/.agents/scripts/pre-edit-check.sh @@ -78,6 +78,10 @@ is_docs_only() { return 1 } +RED='\033[0;31m' +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +NC='\033[0m' BOLD='\033[1m' # Check if we're in a git repository diff --git a/.agents/scripts/supervisor-helper.sh b/.agents/scripts/supervisor-helper.sh index 788f7b0022..9cb622cb6c 100755 --- a/.agents/scripts/supervisor-helper.sh +++ b/.agents/scripts/supervisor-helper.sh @@ -64,6 +64,17 @@ # - Mail: mail-helper.sh for escalation # - Memory: memory-helper.sh for cross-batch learning # - Git: wt/worktree-helper.sh for isolation +# +# IMPORTANT - Orchestration Requirements: +# - CLI: opencode is the ONLY supported CLI for worker dispatch. +# claude CLI fallback is DEPRECATED and will be removed. +# Install: npm i -g opencode (https://opencode.ai/) +# - Cron pulse: For autonomous operation, install the cron pulse: +# supervisor-helper.sh cron install +# This runs `pulse` every 2 minutes to check/dispatch/evaluate workers. +# Without cron, the supervisor is passive and requires manual `pulse` calls. +# - Batch lifecycle: add tasks -> create batch -> cron pulse handles the rest +# The pulse cycle: check workers -> evaluate outcomes -> dispatch next -> cleanup set -euo pipefail @@ -236,7 +247,7 @@ check_system_load() { process_count=$(ps aux 2>/dev/null | wc -l | tr -d ' ') echo "process_count=$process_count" - # Supervisor worker process count (opencode/claude spawned by supervisor) + # Supervisor worker process count (opencode workers spawned by supervisor) local supervisor_process_count=0 if [[ -d "$SUPERVISOR_DIR/pids" ]]; then for pid_file in "$SUPERVISOR_DIR/pids"/*.pid; do @@ -1718,25 +1729,30 @@ detect_dispatch_mode() { ####################################### # Resolve the AI CLI tool to use for dispatch -# Prefers opencode, falls back to claude +# Resolve AI CLI for worker dispatch +# opencode is the ONLY supported CLI for aidevops supervisor workers. +# claude CLI fallback is DEPRECATED and will be removed in a future release. ####################################### resolve_ai_cli() { - # Prefer opencode (supports Anthropic auth + zen free models as fallback) + # opencode is the primary and only supported CLI if command -v opencode &>/dev/null; then echo "opencode" return 0 fi + # DEPRECATED: claude CLI fallback - will be removed if command -v claude &>/dev/null; then + log_warning "Using deprecated claude CLI fallback. Install opencode: npm i -g opencode" echo "claude" return 0 fi - log_error "Neither opencode nor claude CLI found. Install one to dispatch workers." + log_error "opencode CLI not found. Install it: npm i -g opencode" + log_error "See: https://opencode.ai/docs/installation/" return 1 } ####################################### # Resolve the best available model for a given task tier -# Priority: Anthropic SOTA via opencode > claude CLI > opencode zen free +# Priority: Anthropic SOTA via opencode (only supported CLI) # # Tiers: # coding - Best SOTA model for code tasks (default) @@ -2100,7 +2116,7 @@ _list_descendants() { ####################################### # Kill all orphaned worker processes (emergency cleanup) -# Finds opencode/claude processes with PPID=1 that match supervisor patterns +# Finds opencode worker processes with PPID=1 that match supervisor patterns ####################################### cmd_kill_workers() { local dry_run=false @@ -2151,7 +2167,7 @@ cmd_kill_workers() { log_info "Protected PIDs (active workers + self): $(echo "$protected_pattern" | tr '|' ' ' | wc -w | tr -d ' ') processes" - # Find orphaned opencode/claude processes (PPID=1, not in any terminal session) + # Find orphaned opencode worker processes (PPID=1, not in any terminal session) local orphan_count=0 local killed_count=0 diff --git a/.agents/seo/mom-test-ux.md b/.agents/seo/mom-test-ux.md new file mode 100644 index 0000000000..62ed9b51ae --- /dev/null +++ b/.agents/seo/mom-test-ux.md @@ -0,0 +1,132 @@ +--- +description: "Mom Test UX evaluation - Apple-inspired 'Would this confuse my mom?' usability and CRO analysis. Use when the user wants a UX audit, usability review, conversion friction analysis, or asks why users aren't converting." +mode: subagent +tools: + read: true + write: false + edit: false + bash: true + webfetch: true +--- + +# Mom Test UX / CRO Agent + + + +## Quick Reference + +- **Purpose**: Evaluate any page/screen with "Would this confuse my mom?" heuristic +- **Philosophy**: If a non-technical person can't complete the task in under 10 seconds of thought, the UX has failed +- **Input**: URL, screenshot, or ARIA snapshot +- **Output**: Actionable fix table with severity, effort, and impact ratings + +## The 6 UX Principles + +Every element is evaluated against these principles: + +| # | Principle | Mom Test Question | +|---|-----------|-------------------| +| 1 | **Clarity** | "What is this page asking me to do?" | +| 2 | **Simplicity** | "Why are there so many things on this screen?" | +| 3 | **Consistency** | "This button looked different on the last page?" | +| 4 | **Feedback** | "Did anything happen when I clicked that?" | +| 5 | **Discoverability** | "Where do I go to find X?" | +| 6 | **Forgiveness** | "I clicked the wrong thing - how do I go back?" | + +## Severity Ranking + +| Level | Label | Definition | Example | +|-------|-------|------------|---------| +| S1 | **Blocker** | User cannot complete the task | CTA invisible, form broken, dead click | +| S2 | **Major** | User completes task but with significant confusion | Ambiguous labels, hidden pricing, unclear next step | +| S3 | **Minor** | User notices friction but works through it | Inconsistent styling, slow feedback, extra clicks | +| S4 | **Polish** | Professional refinement | Spacing, micro-copy tone, animation timing | + + + +## Workflow + +### Step 1: Capture the Page + +Use browser automation to get the page state: + +```bash +# ARIA snapshot (preferred - fast, structured, no vision tokens) +playwright screenshot --aria-snapshot https://example.com/pricing + +# Full screenshot (for layout/visual issues) +playwright screenshot https://example.com/pricing --full-page +``` + +Or use Stagehand/Playwright programmatically. For manual review, the user provides the URL and you fetch with `webfetch`. + +### Step 2: Screen-by-Screen Analysis + +For each screen/page, generate the findings table: + +| Confusing Element | Mom's Reaction | Principle | Severity | Fix | +|-------------------|----------------|-----------|----------|-----| +| CTA says "Get Started" with no context | "Get started with what?" | Clarity | S2 | Change to "Start Free 14-Day Trial" | +| Three pricing tiers with 20+ feature rows | "I don't know which one I need" | Simplicity | S2 | Highlight recommended plan, collapse features into "Most popular for..." | +| Form shows no error until submit | "Did it work? Nothing happened" | Feedback | S1 | Add inline validation on blur | +| Navigation has "Solutions" dropdown with 12 items | "I just want to see what you do" | Discoverability | S3 | Reduce to 4-5 grouped categories | +| No back button in checkout flow | "I'm stuck, I'll just leave" | Forgiveness | S1 | Add breadcrumb and back navigation | + +### Step 3: Quick Wins Matrix + +Prioritise fixes by effort vs. impact: + +| Fix | Impact | Effort | Priority | +|-----|--------|--------|----------| +| Rewrite CTA copy | High | 10 min | **Do first** | +| Add inline form validation | High | 2-4 hrs | **Do first** | +| Add breadcrumb nav | Medium | 1-2 hrs | **Schedule** | +| Redesign pricing table | High | 1-2 days | **Plan** | +| Adjust spacing/polish | Low | 30 min | **Batch later** | + +Priority rules: High impact + Low effort = Do first. High impact + High effort = Plan. Low impact = Batch later. + +### Step 4: CRO Recommendations + +Apply proven UX patterns that directly improve conversion: + +| Pattern | Why It Works | Implementation | +|---------|-------------|----------------| +| Single primary CTA per viewport | Reduces decision paralysis | Remove competing links near the main action | +| Social proof near decision point | Reduces anxiety at commitment | Add testimonial/count badge within 200px of CTA | +| Progress indicator on multi-step flows | Sets expectation, reduces abandonment | "Step 2 of 3" breadcrumb bar | +| Benefit-first headlines | Answers "what's in it for me" instantly | Replace feature-speak with outcome language | +| Friction logging | Quantifies UX debt | Track rage clicks, dead clicks, U-turns in analytics | + +## Browser Automation Integration + +### Automated UX Scan + +Use Playwright to programmatically check for common Mom Test failures: + +1. **ARIA snapshot** - Parse `page.accessibility.snapshot()` for structure issues +2. **Hidden CTAs** - Query `button, a[href]` and flag any with `offsetParent === null` +3. **Vague link text** - Flag buttons/links with text like "Click here", "Learn more", "Submit" +4. **Unlabelled inputs** - Find `input` elements missing `