t1357: Mission system: autonomous long-running project orchestration

**Task ID:** `t1357` | **Status:** open | **Estimate:** `~28h (ai:20h test:5h read:3h)` | **Plan:** `p034`
**Logged:** 2026-02-27
**Tags:** `plan` `feature` `orchestration` `mission`

## Description

Mission system: autonomous long-running project orchestration — `/mission` command and orchestration agent that takes a high-level goal, decomposes into milestones/features, manages resources (accounts, credentials, infrastructure), and drives autonomous execution over hours/days. Two modes: POC (skip ceremony, commit to main) and Full (standard worktree/PR/review). Budget analysis recommends outcome levels for given constraints. Missions start homeless in `~/.aidevops/missions/` and migrate to `todo/missions/` when a repo exists. Self-organising folder structure with temporary agents/scripts. Inspired by Factory.ai Missions but extended to full project lifecycle (research, procurement, communication, infrastructure).

## Subtasks

- [x] t1357.1 Create mission state file template — `templates/mission-template.md` with YAML frontmatter (id, title, status, mode, budget, model_routing, preferences), milestone/feature tracking, resource requirements, budget tracking table, decision log, mission agents section. #auto-dispatch ~2h model:sonnet ref:GH#2495 pr:#2507 completed:2026-02-27
- [x] t1357.2 Create `/mission` command — `scripts/commands/mission.md` with interactive scoping interview (reuses `/define` probe techniques), mode selection (POC/Full), budget input, constraint gathering, milestone decomposition using opus-tier reasoning, mission file creation, optional repo creation (`aidevops init` + `git init`). Headless mode for supervisor dispatch. #auto-dispatch ~6h model:opus ref:GH#2496 pr:#2508 completed:2026-02-27
- [x] t1357.3 Create mission orchestrator agent — agent doc with self-organisation guidance, file/folder management patterns, temporary agent creation (draft tier), improvement feedback to aidevops, reference patterns for using existing aidevops capabilities, research guidance for unknown domains. #auto-dispatch ~4h model:opus blocked-by:t1357.1 ref:GH#2497 pr:#2510 completed:2026-02-27
- [ ] t1357.4 Add POC mode to `/full-loop` — skip worktrees (commit to main for dedicated repos, single branch for existing repos), skip brief requirement, skip PR review loops, skip postflight, informal commits. Flag: `--poc` or detected from mission mode. #auto-dispatch ~2h model:sonnet blocked-by:t1357.2 ref:GH#2498
- [x] t1357.5 Integrate mission awareness into pulse supervisor — add "check active missions" phase to pulse cycle. For each active mission: check current milestone status, dispatch undispatched features, detect milestone completion and trigger validation, advance milestones, track budget spend. Mission features become regular TODO entries with `mission:mNNN` tag. #auto-dispatch ~4h model:opus blocked-by:t1357.2,t1357.3 ref:GH#2499 pr:#2510 completed:2026-02-28
- [x] t1357.6 Create milestone validation worker — specialised worker dispatched after all features in a milestone complete. Pulls mission branch, runs full test suite + build, optionally runs Playwright browser tests (UI missions), reports pass/fail with specific issues, creates fix tasks on failure linked to milestone. #auto-dispatch ~4h model:sonnet blocked-by:t1357.5 ref:GH#2500 pr:#2519 completed:2026-02-28
- [x] t1357.7 Implement budget analysis and recommendation engine — mission agent analyses likely outcomes for requested budget (time/money/tokens). Recommends budget scales: "For $X/Yh you get basic MVP; for $A/Bh you get production-ready with tests; for $C/Dh you get polished with docs and monitoring." Uses model routing cost data, pattern history for estimation, and task complexity heuristics. Integrates with budget-tracker-helper.sh. #auto-dispatch ~4h model:opus blocked-by:t1357.2 ref:GH#2501 pr:#2516 completed:2026-02-27
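
The subtask entries above carry their own scheduling metadata (`model:`, `blocked-by:`, checkbox state). As a rough illustration of how a supervisor might pick the next dispatchable entry, here is a minimal Python sketch. The regular expressions, the `Task` class, and the `dispatchable` helper are assumptions inferred from the entry format shown here, not the actual pulse implementation.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed entry shape, inferred from the subtask lines in this issue:
# "- [x] t1357.3 <description> ~4h model:opus blocked-by:t1357.1 ..."
ENTRY_RE = re.compile(r"^- \[(?P<done>[ x])\] (?P<id>t\d+(?:\.\d+)?) ")
BLOCKED_RE = re.compile(r"blocked-by:(\S+)")
MODEL_RE = re.compile(r"\bmodel:(\w+)")


@dataclass
class Task:
    id: str
    done: bool
    model: Optional[str] = None
    blocked_by: List[str] = field(default_factory=list)


def parse_entry(line: str) -> Optional[Task]:
    """Parse one TODO line; return None if it is not a task entry."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    blocked = BLOCKED_RE.search(line)
    model = MODEL_RE.search(line)
    return Task(
        id=m.group("id"),
        done=m.group("done") == "x",
        model=model.group(1) if model else None,
        blocked_by=blocked.group(1).split(",") if blocked else [],
    )


def dispatchable(tasks: List[Task]) -> List[str]:
    """Return IDs of open tasks whose blockers are all complete."""
    done = {t.id for t in tasks if t.done}
    return [
        t.id
        for t in tasks
        if not t.done and all(b in done for b in t.blocked_by)
    ]
```

Applied to the list above, this would report `t1357.4` as dispatchable, since its only blocker (`t1357.2`) is complete.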

## Plan: Purpose

Close the gap between "I have an idea" and "autonomous execution." Current aidevops handles task-level work (`/full-loop`) and supervisor dispatch (`/pulse`), but nothing takes a high-level goal and drives it to completion over days. Missions extend beyond code into research, procurement, infrastructure setup, and 3rd-party communication, making aidevops a true autonomous project agent.

Inspired by Factory.ai Missions (multi-day autonomous coding, Feb 2026) but significantly broader in scope. Factory solves "multi-day coding tasks." This solves "autonomous project lifecycle from idea to delivery."

<details><summary>Plan: Context &amp; Architecture</summary>


**Context**

**Key design decisions:**
- Mission state in git (markdown), not a database — consistent with "GitHub + TODO.md are the database" principle
- Orchestrator as pulse extension, not separate daemon — avoids new process management
- POC mode is a flag, not a separate system — same pipeline, fewer gates
- Milestones sequential, features within milestones parallelisable — Factory found this works better than broad parallelism
- One orchestrator layer, not recursive — Factory notes recursive depth as open question; one layer suffices for our scale
- Budget analysis before execution — mission agent should tell you what you'll get for your budget before starting
- Mission-specific agents are draft-tier: temporary tools, promoted if generally useful

**Factory.ai Missions analysis (Feb 2026):**
- Median mission: ~2 hours. 65% run >1 hour. 37% run >4 hours. 14% run >24 hours.
- Missions use ~2x token weight per message vs normal sessions (19K vs 11K)
- Multi-model: orchestrator (opus), workers (sonnet), validators (varies), research (cheapest)
- Key insight: "serial execution with targeted parallelization has worked better than broad parallelism"
- Open questions they identified: parallelization balance, correctness over long horizons, worker scope, recursive management depth

**What aidevops already has (strong overlap):**
- Worker dispatch, task decomposition, fresh context per worker, multi-model routing, git as source of truth, validation (preflight/postflight), failure recovery, skill/memory, browser QA, task briefs, worker efficiency, autonomous operation

**What's genuinely new:**
- Mission-level orchestration (goal → milestones → features → validation → completion)
- Milestone validation (pause after milestone N, validate integration, then proceed)
- Mission state persistence (durable entity grouping tasks into a coherent goal)
- Automatic re-planning (validation failure → create fix tasks)
- POC/shortcut mode
- Budget feasibility analysis and outcome-level recommendations
- Self-organising mission folders
- Autonomous procurement (payment agent)
- 3rd-party communication (email agent)

**Architecture**

```text
/mission "Build a CRM with contacts, deals, and email"
    │
    ▼
Phase 1: SCOPING (interactive interview, opus-tier)
    ├── Goal, mode (POC/Full), budget, constraints, preferences
    ├── Existing repo / new repo / homeless (no repo yet)
    └── Budget analysis: "For $X you get Y; for $A you get B"
    │
    ▼
Phase 2: DECOMPOSITION (opus-tier)
    ├── Research phase (if needed)
    ├── 3-7 milestones (sequential)
    ├── 2-5 features per milestone (parallelisable)
    ├── Resource requirements (accounts, services, credentials)
    └── Creates mission.md + TODO entries + GitHub issues
    │
    ▼
Phase 3: EXECUTION (autonomous, pulse-integrated)
    ├── For each milestone (sequential):
    │   ├── Dispatch features as workers
    │   ├── Self-organise: create agents/scripts as needed
    │   ├── Track budget (time, money, tokens)
    │   └── On complete → milestone validation
    │       ├── Pass → advance
    │       └── Fail → create fix tasks, re-validate
    │
    ▼
Phase 4: COMPLETION
    ├── Final validation, budget reconciliation
    ├── Offer improvements back to aidevops
    └── Summary report
```
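
The Phase 3 loop in the diagram (sequential milestones, parallel features, validation with fix tasks on failure) can be sketched in Python as below. This is an illustrative model only: `run_feature` and `validate` are hypothetical callbacks standing in for worker dispatch and the milestone validation worker, and `max_fix_rounds` is an assumed safety limit that the plan does not specify.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Milestone:
    name: str
    features: List[str]
    fix_tasks: List[str] = field(default_factory=list)


def run_mission(
    milestones: List[Milestone],
    run_feature: Callable[[str], None],
    validate: Callable[[Milestone], List[str]],
    max_fix_rounds: int = 2,  # assumed limit, not from the plan
) -> List[str]:
    """Run milestones in order; features inside each one run concurrently.

    validate() returns a list of issues; an empty list means the
    milestone passed. Returns the names of milestones that passed.
    """
    passed = []
    for ms in milestones:  # sequential across milestones
        pending = list(ms.features)
        for _ in range(max_fix_rounds + 1):
            with ThreadPoolExecutor() as pool:  # parallel within a milestone
                list(pool.map(run_feature, pending))
            issues = validate(ms)
            if not issues:
                passed.append(ms.name)
                break
            # Validation failed: turn each issue into a fix task, then re-validate.
            pending = [f"fix: {issue}" for issue in issues]
            ms.fix_tasks.extend(pending)
        else:
            break  # never passed; later milestones depend on this one
    return passed
```

Stopping at the first milestone that cannot be validated keeps the "one orchestrator layer" model simple: the mission pauses for attention rather than building on an unvalidated base.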

</details>

<details><summary>Plan: Progress</summary>

- [x] (2026-02-27) Phase 1: Foundation (template, command, orchestrator agent) ~12h
  - [x] t1357.1 Mission state file template ~2h
  - [x] t1357.2 `/mission` command ~6h
  - [x] t1357.3 Mission orchestrator agent ~4h
- [ ] Phase 2: Execution modes (POC mode, pulse integration) ~6h
  - [ ] t1357.4 POC mode in `/full-loop` ~2h
  - [x] t1357.5 Pulse integration ~4h
- [x] Phase 3: Validation & budget (milestone validation, budget engine) ~8h
  - [x] t1357.6 Milestone validation worker ~4h
  - [x] t1357.7 Budget analysis engine ~4h
- [ ] Dependent features ~24h
  - [ ] t1358 Payment agent ~8h
  - [ ] t1359 Browser QA in validation ~4h
  - [ ] t1360 Email agent for missions ~4h
  - [ ] t1361 Skill learning ~4h
  - [ ] t1362 Progress dashboard ~4h

</details>

<details><summary>Plan: Decision Log</summary>

(To be populated during implementation)

</details>

## Task Brief

# t1357: Mission System — Autonomous Long-Running Project Orchestration

## Origin

- **Created:** 2026-02-27
- **Session:** Claude Code interactive session
- **Created by:** marcusquinn (human) + ai-interactive
- **Conversation context:** Analysis of Factory.ai Missions (multi-day autonomous coding) led to a broader vision: an autonomous project agent that can research, procure, communicate, build, and self-organise across days/weeks. Not just "multi-day coding" but a full project lifecycle from idea to delivery.

## What

A `/mission` command and mission orchestration agent that takes a high-level goal ("Build a CRM", "Migrate this codebase to TypeScript", "Research and prototype a recommendation engine"), decomposes it into milestones and features, manages resources (accounts, credentials, payments, infrastructure), and drives autonomous execution over hours to days — with two modes:

1. **POC mode** — fast iteration, skip ceremony (briefs, PRs, reviews), commit to main or a single branch
2. **Full mode** — production-quality with standard worktree/PR/review workflows

The mission agent must:
- Analyse budget feasibility and recommend budget scales for various outcome levels
- Self-organise its files and folders as needs are discovered
- Create temporary agents and scripts for the mission, and offer improvements back to aidevops
- Use browser automation for reviewing its own progress and visual research
- Handle email, secrets, and account management for 3rd-party interactions
- Know and respect budgets (time, money, tokens) with model provider options
- Reference aidevops patterns for how to do things it already has working examples for
- Research its own examples when aidevops doesn't have them
- Know user preferences and constraints

### Mission homes:
- `~/.aidevops/missions/{id}/` — homeless missions (no repo yet, POC drafting)
- `todo/missions/{id}/` — missions attached to a project repo

### Mission folder structure:
```text
{mission-id}/
├── mission.md          # State file (source of truth)
├── research/           # Gathered research, comparisons, references
├── agents/             # Mission-specific temporary agents
├── scripts/            # Mission-specific temporary scripts
└── assets/             # Screenshots, PDFs, exports, visual research
```
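
A minimal Python sketch of resolving a mission home and creating this skeleton follows. The function names are hypothetical, and the placeholder `mission.md` content is an assumption; the real state file would presumably be generated from `templates/mission-template.md`.

```python
from pathlib import Path
from typing import Optional

SUBDIRS = ("research", "agents", "scripts", "assets")


def mission_home(
    mission_id: str,
    repo_root: Optional[Path] = None,
    user_home: Optional[Path] = None,
) -> Path:
    """Repo-attached missions live under todo/missions/; others are homeless."""
    if repo_root is not None:
        return repo_root / "todo" / "missions" / mission_id
    return (user_home or Path.home()) / ".aidevops" / "missions" / mission_id


def scaffold(home: Path) -> Path:
    """Create the mission folder skeleton; safe to call repeatedly."""
    for name in SUBDIRS:
        (home / name).mkdir(parents=True, exist_ok=True)
    state = home / "mission.md"
    if not state.exists():
        # Placeholder only: stands in for the rendered mission template.
        state.write_text("---\nstatus: planning\n---\n", encoding="utf-8")
    return state
```

Creating all four subfolders up front is a simplification; the brief describes a structure that grows as needs are discovered, so a real implementation might create them lazily.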

## Why

Current aidevops handles task-level work (`/full-loop`) and supervisor-level dispatch (`/pulse`), but nothing takes a high-level goal and drives it to completion over days. The gap between "I have an idea" and "tasks are in TODO.md ready for dispatch" requires manual decomposition. Missions close this gap and extend beyond code into research, procurement, and infrastructure setup — making aidevops a true autonomous project agent.

Factory.ai's Missions validates the market need but their scope is narrower (coding only). Our vision includes the full project lifecycle.

## How (Approach)

### Phase 1: Foundation (t1357.1-t1357.3)
- Create mission state file template (`templates/mission-template.md`)
- Create `/mission` command (`scripts/commands/mission.md`) with interactive scoping interview
- Create mission orchestrator agent doc with self-organisation guidance

### Phase 2: Execution Modes (t1357.4-t1357.5)
- Add POC mode to `/full-loop` (skip worktrees, skip review, commit to main/branch)
- Integrate mission awareness into pulse supervisor

### Phase 3: Validation & Budget (t1357.6-t1357.7)
- Create milestone validation worker
- Implement budget analysis and recommendation engine (time/money/tokens)

### Dependent Features (t1358-t1362)
- Payment agent for autonomous procurement
- Mission-aware browser QA in milestone validation
- Email agent for 3rd-party communication during missions
- Mission skill learning (auto-capture reusable patterns)
- Mission progress dashboard (CLI + browser)

### Key patterns to follow:
- `scripts/commands/define.md` — interview technique for scoping
- `scripts/commands/pulse.md` — supervisor dispatch pattern
- `scripts/commands/full-loop.md` — worker execution pattern
- `workflows/plans.md` — planning and task decomposition
- `tools/build-agent/build-agent.md` — agent creation lifecycle (draft tier)
- `reference/orchestration.md` — model routing and dispatch

### Key design decisions:
- Mission state in git (markdown), not a database — consistent with "GitHub + TODO.md are the database"
- Orchestrator as pulse extension, not separate daemon
- POC mode is a flag, not a separate system
- Milestones sequential, features within milestones parallelisable
- One orchestrator layer (no recursive sub-orchestrators)
- Missions start homeless in `~/.aidevops/missions/`, migrate to `todo/missions/` when a repo exists
- Mission agents/scripts are temporary (draft tier), with promotion path to aidevops shared
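
The homeless-to-repo migration in the decisions above could look roughly like the following Python sketch. `migrate_mission` is a hypothetical helper; the brief does not say how the move is performed or how conflicts are handled, so refusing to overwrite an existing destination is an assumption.

```python
import shutil
from pathlib import Path


def migrate_mission(mission_id: str, homeless_root: Path, repo_root: Path) -> Path:
    """Move a homeless mission folder into <repo>/todo/missions/<id>."""
    src = homeless_root / mission_id
    dst = repo_root / "todo" / "missions" / mission_id
    if not src.is_dir():
        raise FileNotFoundError(f"no homeless mission at {src}")
    if dst.exists():
        # Assumed policy: never overwrite an existing repo-attached mission.
        raise FileExistsError(f"mission already present at {dst}")
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dst))
    return dst
```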

## Acceptance Criteria

- [ ] `/mission "description"` starts an interactive scoping interview
  ```yaml
  verify:
    method: bash
    run: "test -f ~/.aidevops/agents/scripts/commands/mission.md"
  ```
- [ ] Mission state file created in correct location (homeless or repo-attached)
  ```yaml
  verify:
    method: codebase
    pattern: "status: planning"
    path: "templates/mission-template.md"
  ```
- [ ] POC mode commits directly to main (dedicated repo) or single branch (existing repo)
- [ ] Full mode uses standard worktree + PR workflow
- [ ] Budget analysis recommends outcome levels for given budget
- [ ] Mission self-organises its folder (research/, agents/, scripts/, assets/)
- [ ] Pulse supervisor dispatches mission features as workers
- [ ] Milestone validation runs after all features in a milestone complete
- [ ] Mission agents created in draft tier with promotion path
- [ ] Budget tracking (time, money, tokens) with alerts at thresholds
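
To make the last criterion concrete, here is a small Python sketch of per-dimension budget tracking with threshold alerts. The 50/80/100% thresholds and the `BudgetTracker` class are illustrative assumptions; the brief does not define the thresholds, and this is not the interface of `budget-tracker-helper.sh`.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical alert thresholds, in percent of each budget dimension.
THRESHOLDS_PCT = (50, 80, 100)


@dataclass
class BudgetTracker:
    limits: Dict[str, float]  # e.g. {"money": 200.0, "time_h": 40.0}
    spent: Dict[str, float] = field(default_factory=dict)
    fired: set = field(default_factory=set)

    def record(self, dimension: str, amount: float) -> List[str]:
        """Add spend and return alerts for thresholds newly crossed."""
        total = self.spent.get(dimension, 0.0) + amount
        self.spent[dimension] = total
        alerts = []
        for pct in THRESHOLDS_PCT:
            key = (dimension, pct)
            crossed = total * 100 >= pct * self.limits[dimension]
            if crossed and key not in self.fired:
                self.fired.add(key)  # each threshold alerts only once
                alerts.append(f"{dimension}: {pct}% of budget used")
        return alerts
```

Tracking time, money, and tokens as separate keys in `limits` lets one mission hit a token alert without implying anything about its money or time budget.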

## Context & Decisions

- Factory.ai Missions validated the concept but their scope is coding-only. Our vision extends to full project lifecycle (research, procurement, communication, infrastructure).
- Milestones are sequential with parallel features within — Factory found "serial execution with targeted parallelization has worked better than broad parallelism."
- One orchestrator layer, not recursive — Factory notes recursive management depth as an open question; for our scale, one layer suffices.
- POC mode exists because most missions start as proof-of-concept. The ceremony of briefs/PRs/reviews is valuable for production work but counterproductive for exploration.
- Budget analysis is critical — the mission agent should tell you "for $200 and 40h, you'll get X; for $500 and 80h, you'll get Y" before starting.
- Mission-specific agents are draft-tier by design. They're temporary tools for the mission. If they prove generally useful, they get promoted to custom/ or shared/.
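
The budget recommendation described above can be illustrated with a short Python sketch that picks the most ambitious outcome level fitting a given budget. The tier labels come from the t1357.7 subtask text and the first two price points from the example above; the third price point, the `Outcome` class, and the selection rule are hypothetical, and a real engine would derive tiers from model routing costs and pattern history.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Outcome:
    level: str
    cost_usd: float
    hours: float


# Illustrative tiers only; the last row's figures are made up for the example.
EXAMPLE_TIERS = [
    Outcome("basic MVP", 200, 40),
    Outcome("production-ready with tests", 500, 80),
    Outcome("polished with docs and monitoring", 900, 120),
]


def recommend(
    tiers: List[Outcome], budget_usd: float, budget_hours: float
) -> Optional[Outcome]:
    """Return the most expensive tier that fits both budgets, or None."""
    affordable = [
        t for t in tiers if t.cost_usd <= budget_usd and t.hours <= budget_hours
    ]
    return max(affordable, key=lambda t: t.cost_usd, default=None)


def summarise(tiers: List[Outcome]) -> List[str]:
    """Render the 'for $X and Yh you get Z' lines shown before a mission starts."""
    ordered = sorted(tiers, key=lambda t: t.cost_usd)
    return [f"For ${t.cost_usd:.0f} and {t.hours:.0f}h: {t.level}" for t in ordered]
```

Returning `None` when nothing fits gives the scoping interview a clear signal to ask for a larger budget or a smaller goal before execution begins.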

## Relevant Files

- `scripts/commands/define.md` — interview pattern to reuse for mission scoping
- `scripts/commands/pulse.md` — supervisor dispatch to extend with mission awareness
- `scripts/commands/full-loop.md` — worker execution to add POC mode
- `workflows/plans.md` — planning patterns for milestone/feature decomposition
- `templates/brief-template.md` — brief format (used in full mode, skipped in POC)
- `tools/build-agent/build-agent.md` — agent lifecycle tiers (draft for mission agents)
- `reference/orchestration.md` — model routing for mission orchestrator/workers
- `tools/ai-assistants/headless-dispatch.md` — worker dispatch patterns
- `tools/browser/browser-automation.md` — browser QA for milestone validation
- `services/email/` — email capabilities for 3rd-party communication
- `tools/credentials/` — secret management for mission accounts

## Dependencies

- **Blocked by:** None (greenfield)
- **Blocks:** None directly, but enables a new class of autonomous work
- **External:** None for MVP; payment agent (t1358) needs virtual card provider; email agent (t1360) needs SES or similar configured

## Estimate Breakdown

| Phase | Time | Notes |
|-------|------|-------|
| Research/read | 2h | Existing patterns, Factory analysis |
| t1357.1 Mission template | 2h | State file format |
| t1357.2 /mission command | 6h | Interactive scoping + decomposition |
| t1357.3 Mission orchestrator agent | 4h | Self-organisation, guidance |
| t1357.4 POC mode in /full-loop | 2h | Skip ceremony flags |
| t1357.5 Pulse integration | 4h | Mission-aware dispatch |
| t1357.6 Milestone validation | 4h | Integration testing worker |
| t1357.7 Budget analysis engine | 4h | Feasibility + recommendations |
| **Total** | **~28h** | |

| Dependent features | Time | Notes |
|---|---|---|
| t1358 Payment agent | 8h | Virtual cards, budget enforcement |
| t1359 Mission browser QA | 4h | Visual validation in milestones |
| t1360 Email agent for missions | 4h | 3rd-party communication |
| t1361 Mission skill learning | 4h | Auto-capture reusable patterns |
| t1362 Mission progress dashboard | 4h | CLI + browser progress view |
| **Dependent total** | **~24h** | |

---
*Synced from TODO.md by issue-sync-helper.sh*

*GitHub issue: #2494*