Observability: Status API (health, active containers, tasks, audit) #773

## Summary

Add a lightweight HTTP server exposing status and health endpoints. This is the foundation for the rest of the observability work: once the system can be queried over HTTP, dashboards, uptime monitors, and alerting can all be built on top of it.

## Motivation

Currently there's no way to check system health without shelling in and reading logs or querying SQLite directly. We need a programmatic interface to answer: Is it alive? What's running? What happened recently?

## Endpoints

### `GET /health`
Quick liveness/readiness check. Returns:
```json
{
  "status": "ok",
  "uptime_seconds": 84321,
  "whatsapp_connected": true,
  "channels": ["whatsapp", "discord"],
  "db_ok": true,
  "last_message_at": "2026-03-06T10:23:00Z"
}
```
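A minimal sketch of how the `/health` payload could be assembled. The input shape (`HealthInputs`) and the rule for what counts as operational are assumptions for illustration, not existing code; only the response field names come from the example above. It also assumes `channels` lists the channels that are currently connected.

```typescript
// Hypothetical input shape: these names are illustrative, not from the codebase.
interface HealthInputs {
  startedAtMs: number;               // process start time, epoch milliseconds
  channels: Record<string, boolean>; // channel name -> currently connected?
  dbOk: boolean;                     // result of a cheap DB probe
  lastMessageAt: Date | null;        // null if no message has been seen yet
}

interface HealthResponse {
  status: "ok" | "degraded";
  uptime_seconds: number;
  whatsapp_connected: boolean;
  channels: string[];
  db_ok: boolean;
  last_message_at: string | null;
}

// Format as ISO-8601 UTC without milliseconds, matching the example payload.
function isoSeconds(d: Date): string {
  return d.toISOString().replace(/\.\d{3}Z$/, "Z");
}

function buildHealth(
  input: HealthInputs,
  nowMs: number,
): { code: number; body: HealthResponse } {
  const connected = Object.keys(input.channels).filter(
    (name) => input.channels[name] === true,
  );
  // Assumed rule: operational = DB probe passed and at least one channel is up.
  const ok = input.dbOk && connected.length > 0;
  return {
    code: ok ? 200 : 503,
    body: {
      status: ok ? "ok" : "degraded",
      uptime_seconds: Math.floor((nowMs - input.startedAtMs) / 1000),
      whatsapp_connected: input.channels["whatsapp"] === true,
      channels: connected,
      db_ok: input.dbOk,
      last_message_at: input.lastMessageAt ? isoSeconds(input.lastMessageAt) : null,
    },
  };
}
```

Returning the status code alongside the body keeps the function pure, so the 200 vs. non-200 decision can be unit-tested without starting a server.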

### `GET /status`
Active system state:
```json
{
  "active_containers": [
    { "name": "nanoclaw-main-1709721600", "group": "main", "duration_s": 45, "type": "message" }
  ],
  "active_count": 1,
  "queue_depths": { "main": 2, "work": 0 },
  "registered_groups": 3,
  "pending_tasks": 1
}
```
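One way the `/status` payload might be derived from a point-in-time snapshot of the queue. `QueueSnapshot` and `ActiveContainer` are hypothetical shapes standing in for whatever `GroupQueue` ends up exposing; only the output field names are taken from the example above.

```typescript
// Hypothetical snapshot shape: GroupQueue's real internals may differ.
interface ActiveContainer {
  name: string;
  group: string;
  type: string;        // e.g. "message"
  startedAtMs: number; // epoch milliseconds when the container started
}

interface QueueSnapshot {
  active: ActiveContainer[];
  queued: Record<string, number>; // group name -> number of queued items
  registeredGroups: number;
  pendingTasks: number;
}

function buildStatus(snap: QueueSnapshot, nowMs: number) {
  const active_containers = snap.active.map((c) => ({
    name: c.name,
    group: c.group,
    // Clamp at zero in case of small clock adjustments.
    duration_s: Math.max(0, Math.floor((nowMs - c.startedAtMs) / 1000)),
    type: c.type,
  }));
  return {
    active_containers,
    active_count: active_containers.length,
    queue_depths: { ...snap.queued }, // copy so callers cannot mutate queue state
    registered_groups: snap.registeredGroups,
    pending_tasks: snap.pendingTasks,
  };
}
```

Deriving `active_count` from the array rather than tracking it separately means the two values cannot disagree.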

### `GET /tasks`
Scheduled tasks with recent run history:
```json
{
  "tasks": [
    {
      "id": 1,
      "group": "main",
      "schedule": "0 9 * * *",
      "status": "active",
      "last_run": "2026-03-06T09:00:00Z",
      "last_result": "success",
      "last_duration_ms": 12340
    }
  ]
}
```
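As a rough illustration, the `/tasks` response could be built by pairing each task with its most recent run. The row types below are assumptions: the actual columns of `scheduled_tasks` and `task_run_logs` are not specified in this issue, and the SQLite access itself is left out.

```typescript
// Hypothetical row shapes: real column names may differ.
interface TaskRow {
  id: number;
  group: string;
  schedule: string; // cron expression
  status: string;
}

interface RunLogRow {
  task_id: number;
  run_at: string; // ISO-8601 UTC timestamp
  result: string;
  duration_ms: number;
}

function buildTasks(tasks: TaskRow[], runs: RunLogRow[]) {
  // Keep only the latest run per task.
  const latest = new Map<number, RunLogRow>();
  for (const run of runs) {
    const seen = latest.get(run.task_id);
    // ISO-8601 UTC strings in a uniform format sort chronologically as strings.
    if (!seen || run.run_at > seen.run_at) latest.set(run.task_id, run);
  }
  return {
    tasks: tasks.map((t) => {
      const last = latest.get(t.id);
      return {
        id: t.id,
        group: t.group,
        schedule: t.schedule,
        status: t.status,
        // Tasks that have never run report null for all last_* fields.
        last_run: last ? last.run_at : null,
        last_result: last ? last.result : null,
        last_duration_ms: last ? last.duration_ms : null,
      };
    }),
  };
}
```

In practice the "latest run per task" step may be easier to express in SQL; doing it in code here just keeps the sketch independent of any particular SQLite driver.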

### `GET /audit`
Recent agent activity (the last N container runs; the acceptance criteria set N to 50):
```json
{
  "recent_runs": [
    {
      "group": "main",
      "started_at": "2026-03-06T10:20:00Z",
      "duration_ms": 34000,
      "exit_code": 0,
      "type": "message",
      "trigger": "whatsapp"
    }
  ]
}
```
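If N is made adjustable per request, a small helper could clamp it to the cap. The `limit` query parameter here is a hypothetical addition, not part of the endpoint as specified; the cap of 50 comes from the acceptance criteria.

```typescript
const MAX_AUDIT_RUNS = 50;

// Parse a hypothetical ?limit= query parameter, falling back to the cap
// for missing, non-numeric, or non-positive values.
function parseLimit(requestUrl: string, max: number = MAX_AUDIT_RUNS): number {
  // req.url is path-only in node:http, so a dummy base is needed to parse it.
  const raw = new URL(requestUrl, "http://localhost").searchParams.get("limit");
  if (raw === null) return max;
  const n = Number.parseInt(raw, 10);
  if (!Number.isFinite(n) || n < 1) return max;
  return Math.min(n, max);
}
```

Clamping rather than rejecting keeps the endpoint forgiving for ad hoc `curl` use while still bounding the size of the SQLite query.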

## Implementation Details

- **New file:** `src/status-server.ts` (~150 lines)
- **No new dependencies** — use `node:http` directly
- **Port:** configurable via `STATUS_PORT` env var, default `9100`
- **Bind:** `127.0.0.1` by default (local only)
- **Data sources:** GroupQueue in-memory state, SQLite DB, channel connection status
- Needs read access to `GroupQueue` state (active containers, queue depths) — may need to expose a `getStatus()` method
- Needs read access to channel connection status from `index.ts`
- Task/audit data queried from SQLite (`scheduled_tasks`, `task_run_logs`)
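The points above could come together roughly as follows. This is a sketch under assumptions, not the final `src/status-server.ts`: the `Handler` type, `resolvePort`, and `createStatusServer` are illustrative names, and the handlers that read `GroupQueue` and SQLite are left to the caller.

```typescript
import { createServer, type IncomingMessage, type Server, type ServerResponse } from "node:http";

// Hypothetical handler contract: each route returns a status code and a JSON body.
type Handler = () => { code: number; body: unknown };

// Read STATUS_PORT from the environment, falling back to 9100 when unset or invalid.
function resolvePort(env: Record<string, string | undefined>): number {
  const n = Number.parseInt(env["STATUS_PORT"] ?? "", 10);
  return Number.isInteger(n) && n > 0 && n < 65536 ? n : 9100;
}

function createStatusServer(routes: Record<string, Handler>): Server {
  return createServer((req: IncomingMessage, res: ServerResponse) => {
    // Ignore the query string when matching routes.
    const path = new URL(req.url ?? "/", "http://localhost").pathname;
    const handler = routes[path];

    let code = 404;
    let body: unknown = { error: "not found" };
    if (req.method !== "GET") {
      code = 405;
      body = { error: "method not allowed" };
    } else if (handler) {
      try {
        ({ code, body } = handler());
      } catch {
        // A failing data source should not take the status server down.
        code = 500;
        body = { error: "internal error" };
      }
    }
    res.writeHead(code, { "Content-Type": "application/json" });
    res.end(JSON.stringify(body));
  });
}

// Example wiring (not executed here):
// const server = createStatusServer({
//   "/health": () => ({ code: 200, body: { status: "ok" } }),
// });
// server.listen(resolvePort(process.env), "127.0.0.1"); // local only by default
```

Passing the routes in as plain functions keeps the server module free of direct imports from `index.ts` or the queue, which should make it easier to test in isolation.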

## Acceptance Criteria

- [ ] HTTP server starts alongside main process
- [ ] `/health` returns 200 when system is operational, includes channel connection status
- [ ] `/status` shows active containers with names, groups, and durations
- [ ] `/tasks` lists scheduled tasks with last run info
- [ ] `/audit` shows recent container runs (last 50)
- [ ] Server binds to localhost only by default
- [ ] Port configurable via env var
- [ ] No new npm dependencies
- [ ] Graceful shutdown (server closes when process exits)
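For the graceful-shutdown criterion, one possible shape is a helper that closes the server once on `SIGINT` or `SIGTERM`. `installGracefulShutdown` and `Closable` are hypothetical names; `Closable` matches the `close(callback)` method of `http.Server`, so a real server can be passed in directly.

```typescript
// Minimal interface satisfied by http.Server; also easy to fake in tests.
interface Closable {
  close(callback?: (err?: Error) => void): unknown;
}

function installGracefulShutdown(
  server: Closable,
  onClosed: () => void = () => {},
): () => void {
  let closing = false;
  const shutdown = (): void => {
    if (closing) return; // ignore repeated signals
    closing = true;
    // http.Server.close stops accepting new connections and invokes the
    // callback once existing connections have ended.
    server.close(() => onClosed());
  };
  process.once("SIGINT", shutdown);
  process.once("SIGTERM", shutdown);
  return shutdown; // returned so the main process can also call it directly
}
```

Idle keep-alive connections can delay `close`; on Node 18.2 and later, `server.closeIdleConnections()` is available if that turns out to matter here.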

## Labels
`observability`, `phase-1`
