Skip to content

feat(ownership): centralized ownership model with typed identities, DB-backed pairing, and OwnershipCache#1898

Merged
henrypark133 merged 50 commits intostagingfrom
feature/ownership-model
Apr 4, 2026
Merged

feat(ownership): centralized ownership model with typed identities, DB-backed pairing, and OwnershipCache#1898
henrypark133 merged 50 commits intostagingfrom
feature/ownership-model

Conversation

@henrypark133
Copy link
Copy Markdown
Collaborator

@henrypark133 henrypark133 commented Apr 2, 2026

Summary

Replaces the ad-hoc user_id: String / "default" ownership model with a formally typed, centrally enforced ownership system that works uniformly for single-tenant and multi-tenant deployments.

This PR now also includes a follow-up hardening pass for web/channel pairing:

  • renumber PostgreSQL migrations after the latest staging versions
  • unify Telegram verification and DM pairing around a shared code-challenge module
  • make pending pairing lists admin-only in the web API
  • switch members to a simple self-claim code-entry flow in the web UI
  • immediately invalidate auth and pairing caches when deleting a user

What Changed

Core Types (src/ownership/)

  • OwnerId(String) newtype replaces raw &str / String user_id parameters throughout
  • Identity { owner_id: OwnerId, role: UserRole } — single struct flowing from channel boundary through all scope constructors and authorization checks
  • can_act_on(actor: &Identity, resource_owner: &OwnerId) -> bool — sole authorization check site; pure ownership equality, no admin bypass
  • OwnershipCache — in-process write-through cache mapping (channel, external_id) → Identity; zero DB reads on warm path

Three-Scope Access Model (src/tenant.rs)

  • TenantScope — now carries full Identity (not raw user_id: String); with_identity() preferred constructor, new() bridge for gradual migration
  • SystemScope (renamed from AdminScope) — internal background processes only (scheduler, worker, self-repair, heartbeat); db() escape hatch replaced with typed workspace_for_user()
  • AdminScope (new, lean) — requires UserRole::Admin; user management only; constructable only via AdminScope::new(identity, db) -> Option<Self>

Database

  • V18scope TEXT NOT NULL DEFAULT 'user' on wasm_tools and dynamic_tools (extension point for future admin-shared tools)
  • V19channel_identities(owner_id, channel, external_id) — canonical mapping from external channel identity to OwnerId
  • V20pairing_requests(channel, external_id, code, owner_id, expires_at) — replaces file-based pairing store
  • ChannelPairingStore sub-trait on Databaseresolve_channel_identity, upsert_pairing_request, approve_pairing (transactional, BEGIN IMMEDIATE in libSQL)
  • Fixed idempotency runner bug in IDEMPOTENT_ADD_COLUMN_MIGRATIONS — was only checking first matching entry per version

Bootstrap & Migration

  • bootstrap_ownership(db, config) — runs every startup after V1–V20, before FK enforcement; creates owner user row, rewrites all user_id = 'default' rows across tables, migrates nearai.session_token from plaintext settings to encrypted secrets
  • All hardcoded "default" user IDs in production code replaced with Config.owner_id or "<unset>" sentinels
  • get_or_create_user is atomic (ON CONFLICT DO NOTHING / INSERT OR IGNORE)

DB-Backed Pairing System (src/pairing/)

  • Complete rewrite from file-based sync storage to a DB-backed async pairing system
  • PairingStore::new(db, cache) wraps ChannelPairingStore + OwnershipCache; cache hit → zero DB reads
  • approve(code, owner_id) — unified flow; no separate admin path
  • Shared OwnershipCache instance flows from AppComponents through WASM channel setup and Signal channel
  • New shared src/code_challenge.rs module centralizes one-time code generation, normalization, and challenge rendering
  • Telegram owner verification and DM pairing now share the same code-challenge primitives while keeping their transport-specific UX

Web Pairing / User Management Follow-up

  • GET /api/pairing/{channel} is now admin-only
  • POST /api/pairing/{channel}/approve is the authenticated self-claim endpoint for pairing codes
  • Blank / whitespace-only pairing codes are rejected before hitting the DB
  • Web UI split by role:
    • admins still see pending pairing requests and approve from the channel card
    • members no longer see pending lists and instead enter a code to claim their own channel identity
  • Deleting a user now invalidates both cached bearer auth and cached pairing identities immediately

IncomingMessage Simplification (src/channels/channel.rs)

  • owner_id field removed — it was always a duplicate of user_id
  • user_id is now the canonical resolved OwnerId from channel_identities lookup
  • sender_id retained for reply routing

Settings → Secrets Migration (src/llm/session.rs)

  • SessionManager holds Arc<dyn SecretsStore>; save_session writes to secrets first, settings only as fallback when no secrets store is configured
  • nearai.session_token migrated from plaintext settings table to AES-256-GCM encrypted secrets on startup; plaintext deleted only after verifying decryption round-trip

Authorization Consistency

  • Web handlers in jobs.rs and routines.rs now use can_act_on() instead of inline user_id string comparisons

Security Properties

  • Users own their resources. Each resource (routine, job, credential, tool, channel config) has exactly one owner. No user — including admin — can read another user's data.
  • can_act_on is the sole authorization check. All ownership checks call through this function; no scattered inline comparisons.
  • Admin is not a bypass. UserRole::Admin gates AdminScope construction only; it does not override can_act_on.
  • Single-tenant = multi-tenant with one user. Identical code paths, no branching on tenant mode.
  • Pending pairing codes are no longer enumerable by non-admin users. Members can only redeem a code for themselves.

Test Coverage

  • Ownership-model unit/integration coverage from the original PR
  • Added handler-level pairing regressions for:
    • admin-only pending pairing list access
    • member self-claim flow for valid codes
    • blank-code rejection
    • immediate auth/pairing cache eviction on delete
  • Existing Telegram verification tests updated to exercise the shared code-challenge flow

Test Plan

  • cargo fmt
  • cargo clippy --all --all-features -- -D warnings
  • cargo test --lib pairing_ --no-default-features --features libsql
  • cargo test --lib telegram_ --no-default-features --features libsql
  • cargo test --lib telegram_message_matches_verification_code_variants --no-default-features --features libsql
  • cargo test --test pairing_integration --no-default-features --features libsql
  • broader full-suite / browser E2E coverage

Spec

docs/superpowers/specs/2026-04-01-ownership-model-design.md

henrypark133 and others added 28 commits April 1, 2026 13:15
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…ix doc comment

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…orkspace_for_user(), fix stale variable names

- Add SystemScope::workspace_for_user() that wraps Workspace::new_with_db
- Remove SystemScope::db() which exposed the raw Arc<dyn Database>
- Update 3 callers (routine_engine.rs x2, heartbeat.rs x1) to use the new method
- Fix stale comment: "admin context" -> "system context" in SystemScope
- Rename `admin` bindings to `system` in agent_loop.rs for clarity

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…tity() constructor and bridge new()

- TenantScope: replace `user_id: String` field with `identity: Identity`; add `with_identity()` preferred constructor; keep `new(user_id, db)` as Member-role bridge; add `identity()` accessor; all internal method bodies use `identity.owner_id.as_str()` in place of `&self.user_id`
- TenantCtx: replace `user_id: String` field with `identity: Identity`; update constructor signature; add `identity()` accessor; `user_id()` delegates to `identity.owner_id.as_str()`; cost/rate methods updated accordingly
- agent_loop: split `tenant_ctx(&str)` into bridge + new `tenant_ctx_with_identity(Identity)` which holds the full body; bridge delegates to avoid duplication

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…uests migrations

- PostgreSQL: V16__tool_scope.sql adds scope column to wasm_tools/dynamic_tools
- PostgreSQL: V17__channel_identities.sql creates channel identity resolution table
- PostgreSQL: V18__pairing_requests.sql creates pairing request table replacing file-based store
- libSQL SCHEMA: adds scope column to wasm_tools/dynamic_tools, channel_identities, pairing_requests tables
- libSQL INCREMENTAL_MIGRATIONS: versions 17-19 for existing databases
- IDEMPOTENT_ADD_COLUMN_MIGRATIONS: handles fresh-install/upgrade dual path for scope columns
- Runner updated to check ALL idempotent columns per version before skipping SQL
- Test: test_ownership_model_tables_created verifies all new tables/columns exist after migrations

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…rsion sequence offset

Replace datetime('now') with strftime('%Y-%m-%dT%H:%M:%fZ', 'now') in the
channel_identities and pairing_requests table definitions (both in SCHEMA and
INCREMENTAL_MIGRATIONS) to match the project-standard RFC 3339 timestamp format
with millisecond precision. Also add a comment clarifying that libSQL incremental
migration version numbers are independent from PostgreSQL VN migration numbers.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
… migration, replace hardcoded 'default' user IDs

- Add V19__ownership_fk.sql (programmatic-only, not in auto-migration sweep)
- Add `migrate_default_owner` to Database trait + both PgBackend and LibSqlBackend
- Add `get_or_create_user` default method to UserStore trait
- Add `bootstrap_ownership()` to app.rs, called in init_database() after connect_with_handles
- Replace hardcoded "default" owner_id in cli/config.rs, cli/mcp.rs, cli/mod.rs, orchestrator/mod.rs
- Add TODO(ownership) comments in llm/session.rs and tools/mcp/client.rs for deferred constructors

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…ult_owner, V19 FK inline constant, fix remaining 'default' user IDs

- Delete migrations/V19__ownership_fk.sql so refinery no longer auto-applies FK constraints before bootstrap_ownership runs; add OWNERSHIP_FK_SQL constant with TODO for future programmatic application
- Remove racy SELECT+INSERT default in UserStore::get_or_create_user; both PostgreSQL (ON CONFLICT DO NOTHING) and libSQL (INSERT OR IGNORE) now use atomic upserts
- Wrap migrate_default_owner in explicit transactions on both backends for atomicity
- Make bootstrap_ownership failure fatal (propagate error instead of warn-and-continue)
- Fix mcp auth/test --user: change from default_value="default" to Option<String> resolved from configured owner_id
- Replace hardcoded "default" user IDs in channels/wasm/setup.rs with config.owner_id
- Replace "default" sentinel in OrchestratorState test helper with "<unset>" to make the test-only nature explicit

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…inel strings to <unset>

- Gate ContextManager::create_job() behind #[cfg(test)]; production code must
  use create_job_for_user() with an explicit user_id to prevent DB rows with
  user_id = 'default' being silently created on the production write path.
- Change the placeholder user_id in McpClient::new(), new_with_name(), and
  new_with_config() from "default" to "<unset>" so accidental secrets/settings
  lookups surface immediately rather than silently touching the wrong DB partition.
- Same sentinel change for SessionManager::new() and new_async() in session.rs;
  these are overwritten by attach_store() at startup with the real owner_id.
- Update tests that asserted the old "default" sentinel to expect "<unset>", and
  switch test_list_jobs_tool / test_job_status_tool to create_job_for_user("default")
  to keep ownership alignment with JobContext::default().

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…tity, upsert/approve pairing, PostgreSQL + libSQL implementations

Adds PairingRequestRecord, ChannelPairingStore trait (5 methods), and
generate_pairing_code() to src/db/mod.rs; implements for PgBackend in
postgres.rs and LibSqlBackend in libsql/pairing.rs; wires ChannelPairingStore
into the Database supertrait bound; all 6 libSQL unit tests pass.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…-insensitive/expired/double-approve tests

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…ion on warm path

Converts src/ownership.rs to src/ownership/ module directory and adds
src/ownership/cache.rs with a write-through in-process cache mapping
(channel, external_id) -> Identity. Wired as Arc<OwnershipCache> on
AppComponents for Task 8 pairing integration. All 7 cache unit tests pass.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
… DB-backed store

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…t_pairing_response_structure

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…e construction

Adds 5 focused unit tests verifying TenantScope::with_identity stores the
full Identity (owner_id + role), TenantScope::new creates a Member-role
identity, and AdminScope::new returns Some for Admin and None for Member.
Uses LibSqlBackend::new_memory() as the test DB stub.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…ershipCache

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…nd ChannelPairingStore

Adds tests/ownership_integration.rs covering migrate_default_owner idempotency,
TenantScope per-user setting isolation (including Admin role bypass check),
and the full ChannelPairingStore lifecycle (upsert, approve, remove, multi-channel isolation).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…tion from integration suite

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…Cache

Replaces the file-based pairing store (~/.ironclaw/*-pairing.json,
*-allowFrom.json) with a DB-backed async implementation that delegates
to ChannelPairingStore and writes through to OwnershipCache on reads.

- PairingStore::new(db, cache) uses the DB; new_noop() for test/no-DB
- resolve_identity() cache-first lookup via OwnershipCache
- approve(code, owner_id) removes channel arg (DB looks up by code)
- All WASM host functions updated: pairing_upsert_request uses block_in_place,
  pairing-is-allowed renamed to pairing-resolve-identity returning Option<String>,
  pairing-read-allow-from deprecated (returns empty list)
- Signal channel receives PairingStore via new(config, db) constructor
- Web gateway pairing handlers read from state.store (DB) directly
- extensions.rs derive_activation_status drops PairingStore dependency;
  derives status from extension.active and owner_binding flag instead
- All test call sites updated to use new_noop()

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…tializers, fix disk-full post-edit compile

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…canonical resolved OwnerId

`owner_id` on `IncomingMessage` was always a duplicate of `user_id` —
both fields held the same value at every call site. Remove the field and
`with_owner_id()` builder, update the four WASM-wrapper and HTTP test
assertions to use `user_id`, and drop the redundant struct literal field
in the routine_engine test helper.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…reen, chat UI, owner login

Adds five Playwright-based browser tests to the ownership model E2E suite
verifying the web UI experience: authenticated owner sees chat input, unauthenticated
browser sees auth screen, owner can send a message and receive a response, settings
tab renders without errors, and basic page structure is correct after login.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…o encrypted secrets store

Moves nearai.session_token from the plaintext DB settings table to the
AES-256-GCM encrypted secrets store (key: nearai_session_token).

- SessionManager gains an `attach_secrets()` method that wires in the
  secrets store; `save_session` writes to it when available and
  `load_session_from_secrets` is called preferentially over settings
- `migrate_session_credential()` runs idempotently on each startup in
  `init_secrets()`, reading the JSON session from settings, writing it
  to secrets, then deleting the plaintext copy
- Wizard's `persist_session_to_db` now writes to secrets first, falling
  back to plaintext settings only when secrets store is unavailable
- Plaintext settings path is preserved as fallback for installs without
  a secrets store (no master key configured)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…ecryption before deleting plaintext

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…rshipCache across channels, add dynamic_tools to migration, fix doc comment

- libSQL migrate_default_owner: wrap UPDATE loop in async closure + match to emit ROLLBACK on any mid-transaction failure (mirroring approve_pairing pattern)
- Both backends: add dynamic_tools to the migrate_default_owner table list so agent-built tools are migrated on first pairing
- setup_wasm_channels: accept Arc<OwnershipCache> parameter instead of allocating a fresh cache, share the AppComponents cache
- SignalChannel::new: accept Arc<OwnershipCache> parameter and pass it to PairingStore instead of allocating a new cache
- PairingStore: fix module-level and struct-level doc comments to accurately describe lazy cache population after approve()

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…stead of raw string comparisons

Replace 12 raw `user_id != user.user_id` / `user_id == user.user_id` string comparisons
in jobs.rs and 4 in routines.rs with calls through the canonical `can_act_on` function
from `crate::ownership`, which is the spec-mandated authorization mechanism.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 2, 2026 03:25
@github-actions github-actions bot added the scope: agent Agent core (agent loop, router, scheduler) label Apr 2, 2026
drchirag1991 pushed a commit to drchirag1991/ironclaw that referenced this pull request Apr 8, 2026
…B-backed pairing, and OwnershipCache (nearai#1898)

* feat(ownership): add OwnerId, Identity, UserRole, can_act_on types

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix(ownership): private OwnerId field, ResourceScope serde derives, fix doc comment

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* refactor(tenant): replace SystemScope::db() escape hatch with typed workspace_for_user(), fix stale variable names

- Add SystemScope::workspace_for_user() that wraps Workspace::new_with_db
- Remove SystemScope::db() which exposed the raw Arc<dyn Database>
- Update 3 callers (routine_engine.rs x2, heartbeat.rs x1) to use the new method
- Fix stale comment: "admin context" -> "system context" in SystemScope
- Rename `admin` bindings to `system` in agent_loop.rs for clarity

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix(tenant): rename stale admin binding to system_store in heartbeat.rs

* refactor(tenant): TenantScope/TenantCtx carry Identity, add with_identity() constructor and bridge new()

- TenantScope: replace `user_id: String` field with `identity: Identity`; add `with_identity()` preferred constructor; keep `new(user_id, db)` as Member-role bridge; add `identity()` accessor; all internal method bodies use `identity.owner_id.as_str()` in place of `&self.user_id`
- TenantCtx: replace `user_id: String` field with `identity: Identity`; update constructor signature; add `identity()` accessor; `user_id()` delegates to `identity.owner_id.as_str()`; cost/rate methods updated accordingly
- agent_loop: split `tenant_ctx(&str)` into bridge + new `tenant_ctx_with_identity(Identity)` which holds the full body; bridge delegates to avoid duplication

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* feat(db): add V16 tool scope, V17 channel_identities, V18 pairing_requests migrations

- PostgreSQL: V16__tool_scope.sql adds scope column to wasm_tools/dynamic_tools
- PostgreSQL: V17__channel_identities.sql creates channel identity resolution table
- PostgreSQL: V18__pairing_requests.sql creates pairing request table replacing file-based store
- libSQL SCHEMA: adds scope column to wasm_tools/dynamic_tools, channel_identities, pairing_requests tables
- libSQL INCREMENTAL_MIGRATIONS: versions 17-19 for existing databases
- IDEMPOTENT_ADD_COLUMN_MIGRATIONS: handles fresh-install/upgrade dual path for scope columns
- Runner updated to check ALL idempotent columns per version before skipping SQL
- Test: test_ownership_model_tables_created verifies all new tables/columns exist after migrations

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix(db): use correct RFC3339 timestamp default in libSQL, document version sequence offset

Replace datetime('now') with strftime('%Y-%m-%dT%H:%M:%fZ', 'now') in the
channel_identities and pairing_requests table definitions (both in SCHEMA and
INCREMENTAL_MIGRATIONS) to match the project-standard RFC 3339 timestamp format
with millisecond precision. Also add a comment clarifying that libSQL incremental
migration version numbers are independent from PostgreSQL VN migration numbers.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* feat(ownership): bootstrap_ownership(), migrate_default_owner, V19 FK migration, replace hardcoded 'default' user IDs

- Add V19__ownership_fk.sql (programmatic-only, not in auto-migration sweep)
- Add `migrate_default_owner` to Database trait + both PgBackend and LibSqlBackend
- Add `get_or_create_user` default method to UserStore trait
- Add `bootstrap_ownership()` to app.rs, called in init_database() after connect_with_handles
- Replace hardcoded "default" owner_id in cli/config.rs, cli/mcp.rs, cli/mod.rs, orchestrator/mod.rs
- Add TODO(ownership) comments in llm/session.rs and tools/mcp/client.rs for deferred constructors

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix(ownership): atomic get_or_create_user, transactional migrate_default_owner, V19 FK inline constant, fix remaining 'default' user IDs

- Delete migrations/V19__ownership_fk.sql so refinery no longer auto-applies FK constraints before bootstrap_ownership runs; add OWNERSHIP_FK_SQL constant with TODO for future programmatic application
- Remove racy SELECT+INSERT default in UserStore::get_or_create_user; both PostgreSQL (ON CONFLICT DO NOTHING) and libSQL (INSERT OR IGNORE) now use atomic upserts
- Wrap migrate_default_owner in explicit transactions on both backends for atomicity
- Make bootstrap_ownership failure fatal (propagate error instead of warn-and-continue)
- Fix mcp auth/test --user: change from default_value="default" to Option<String> resolved from configured owner_id
- Replace hardcoded "default" user IDs in channels/wasm/setup.rs with config.owner_id
- Replace "default" sentinel in OrchestratorState test helper with "<unset>" to make the test-only nature explicit

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix(ownership): remove default user_id from create_job(), change sentinel strings to <unset>

- Gate ContextManager::create_job() behind #[cfg(test)]; production code must
  use create_job_for_user() with an explicit user_id to prevent DB rows with
  user_id = 'default' being silently created on the production write path.
- Change the placeholder user_id in McpClient::new(), new_with_name(), and
  new_with_config() from "default" to "<unset>" so accidental secrets/settings
  lookups surface immediately rather than silently touching the wrong DB partition.
- Same sentinel change for SessionManager::new() and new_async() in session.rs;
  these are overwritten by attach_store() at startup with the real owner_id.
- Update tests that asserted the old "default" sentinel to expect "<unset>", and
  switch test_list_jobs_tool / test_job_status_tool to create_job_for_user("default")
  to keep ownership alignment with JobContext::default().

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* feat(db): add ChannelPairingStore sub-trait with resolve_channel_identity, upsert/approve pairing, PostgreSQL + libSQL implementations

Adds PairingRequestRecord, ChannelPairingStore trait (5 methods), and
generate_pairing_code() to src/db/mod.rs; implements for PgBackend in
postgres.rs and LibSqlBackend in libsql/pairing.rs; wires ChannelPairingStore
into the Database supertrait bound; all 6 libSQL unit tests pass.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix(db): atomic libSQL approve_pairing with BEGIN IMMEDIATE, add case-insensitive/expired/double-approve tests

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* feat(ownership): add OwnershipCache for zero-DB-read identity resolution on warm path

Converts src/ownership.rs to src/ownership/ module directory and adds
src/ownership/cache.rs with a write-through in-process cache mapping
(channel, external_id) -> Identity. Wired as Arc<OwnershipCache> on
AppComponents for Task 8 pairing integration. All 7 cache unit tests pass.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* test(e2e): add ownership model E2E tests and extend pairing tests for DB-backed store

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix(e2e): remove unused asyncio import, add fallback assertion in test_pairing_response_structure

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* test(tenant): unit tests for TenantScope::with_identity and AdminScope construction

Adds 5 focused unit tests verifying TenantScope::with_identity stores the
full Identity (owner_id + role), TenantScope::new creates a Member-role
identity, and AdminScope::new returns Some for Admin and None for Member.
Uses LibSqlBackend::new_memory() as the test DB stub.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix(ownership): recover from RwLock poison instead of expect() in OwnershipCache

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* test(ownership): integration tests for bootstrap, tenant isolation, and ChannelPairingStore

Adds tests/ownership_integration.rs covering migrate_default_owner idempotency,
TenantScope per-user setting isolation (including Admin role bypass check),
and the full ChannelPairingStore lifecycle (upsert, approve, remove, multi-channel isolation).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix(test): remove duplicate pairing tests and flaky random-code assertion from integration suite

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* feat(pairing): rewrite PairingStore to DB-backed async with OwnershipCache

Replaces the file-based pairing store (~/.ironclaw/*-pairing.json,
*-allowFrom.json) with a DB-backed async implementation that delegates
to ChannelPairingStore and writes through to OwnershipCache on reads.

- PairingStore::new(db, cache) uses the DB; new_noop() for test/no-DB
- resolve_identity() cache-first lookup via OwnershipCache
- approve(code, owner_id) removes channel arg (DB looks up by code)
- All WASM host functions updated: pairing_upsert_request uses block_in_place,
  pairing-is-allowed renamed to pairing-resolve-identity returning Option<String>,
  pairing-read-allow-from deprecated (returns empty list)
- Signal channel receives PairingStore via new(config, db) constructor
- Web gateway pairing handlers read from state.store (DB) directly
- extensions.rs derive_activation_status drops PairingStore dependency;
  derives status from extension.active and owner_binding flag instead
- All test call sites updated to use new_noop()

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix(pairing): add missing pairing_store field to all GatewayState initializers, fix disk-full post-edit compile

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* feat(channels): remove owner_id from IncomingMessage, user_id is the canonical resolved OwnerId

`owner_id` on `IncomingMessage` was always a duplicate of `user_id` —
both fields held the same value at every call site. Remove the field and
`with_owner_id()` builder, update the four WASM-wrapper and HTTP test
assertions to use `user_id`, and drop the redundant struct literal field
in the routine_engine test helper.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix(channels): remove stale owner_id param from make_message test helper

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* test(e2e): add browser/Playwright tests for ownership model — auth screen, chat UI, owner login

Adds five Playwright-based browser tests to the ownership model E2E suite
verifying the web UI experience: authenticated owner sees chat input, unauthenticated
browser sees auth screen, owner can send a message and receive a response, settings
tab renders without errors, and basic page structure is correct after login.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* feat(settings): migrate channel credentials from plaintext settings to encrypted secrets store

Moves nearai.session_token from the plaintext DB settings table to the
AES-256-GCM encrypted secrets store (key: nearai_session_token).

- SessionManager gains an `attach_secrets()` method that wires in the
  secrets store; `save_session` writes to it when available and
  `load_session_from_secrets` is called preferentially over settings
- `migrate_session_credential()` runs idempotently on each startup in
  `init_secrets()`, reading the JSON session from settings, writing it
  to secrets, then deleting the plaintext copy
- Wizard's `persist_session_to_db` now writes to secrets first, falling
  back to plaintext settings only when secrets store is unavailable
- Plaintext settings path is preserved as fallback for installs without
  a secrets store (no master key configured)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix(settings): settings fallback only when no secrets store, verify decryption before deleting plaintext

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix(ownership): ROLLBACK in libSQL migrate_default_owner, shared OwnershipCache across channels, add dynamic_tools to migration, fix doc comment

- libSQL migrate_default_owner: wrap UPDATE loop in async closure + match to emit ROLLBACK on any mid-transaction failure (mirroring approve_pairing pattern)
- Both backends: add dynamic_tools to the migrate_default_owner table list so agent-built tools are migrated on first pairing
- setup_wasm_channels: accept Arc<OwnershipCache> parameter instead of allocating a fresh cache, share the AppComponents cache
- SignalChannel::new: accept Arc<OwnershipCache> parameter and pass it to PairingStore instead of allocating a new cache
- PairingStore: fix module-level and struct-level doc comments to accurately describe lazy cache population after approve()

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix(web): use can_act_on for authorization in job/routine handlers instead of raw string comparisons

Replace 12 raw `user_id != user.user_id` / `user_id == user.user_id` string comparisons
in jobs.rs and 4 in routines.rs with calls through the canonical `can_act_on` function
from `crate::ownership`, which is the spec-mandated authorization mechanism.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* chore: include remaining modified files in ownership model branch

* fix: add pairing_store field to test GatewayState initializers, update PairingStore API calls in integration tests

Add missing `pairing_store: None` to all GatewayState struct initializers
in test files. Migrate old file-based PairingStore API calls
(PairingStore::new(), PairingStore::with_base_dir()) to the new DB-backed
API (PairingStore::new_noop()). Rewrite pairing_integration.rs to use
LibSqlBackend with the new async DB-backed PairingStore API.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* chore: cargo fmt

* fix(pairing): truly no-op PairingStore noop mode, ensure owner user in CLI, fix signal safety comments

- PairingStore::upsert_request now returns a dummy record in noop mode instead of
  erroring, and approve silently succeeds (matching the doc promise of "writes
  are silently discarded").
- PairingStore::approve now accepts a channel parameter, matching the updated
  DB trait signature and propagated to all call sites (CLI, web server, tests).
- CLI run_pairing_command ensures the owner user row exists before approval to
  satisfy the FK constraint on channel_identities.owner_id.
- Signal channel block_in_place safety comments corrected from "WASM channel
  callbacks" to "Signal channel message processing".

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix(pairing): thread channel through approve_pairing, add created flag, retry on code collision, remove redundant indexes

Addresses PR review comments:
- approve_pairing validates code belongs to the given channel
- PairingRequestRecord.created replaces timing heuristic
- upsert retries on UNIQUE violation (up to 3 attempts)
- redundant indexes removed (UNIQUE creates implicit index)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix(ownership): migrate api_tokens, serialize PG approvals, propagate resolved owner_id

Addresses PR review P1/P2 regressions:

- api_tokens included in migrate_default_owner (both backends)
- PostgreSQL approve_pairing uses FOR UPDATE to prevent concurrent approvals
- Signal resolve_sender_identity returns owner_id, set as IncomingMessage.user_id
  with raw phone number preserved as sender_id for reply routing
- Feishu uses resolved owner_id from pairing_resolve_identity in emitted message
- PairingStore noop mode logs warning when pairing admission is impossible

[skip-regression-check]

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* fix(pr-review): sanitize DB errors in pairing handlers, fix doc comments, add TODO for derive_activation_status

- Pairing list/approve handlers no longer leak DB error details to clients
- NotFound errors return user-friendly 'Invalid or expired pairing code' message
- Module doc in pairing/store.rs corrected (remove -> evict, no insert method)
- wit_compat.rs stub comment corrected to match actual Val shape
- TODO added for derive_activation_status has_paired approximation

* fix(pr-review): propagate libSQL query errors in approve_pairing, round-trip validate session credential migration, fix test doc comment

- libSQL approve_pairing: .ok().flatten() replaced with .map_err() to propagate DB errors
- migrate_session_credential: round-trip compares decrypted secret against plaintext before deleting
- ownership_integration.rs: doc comment corrected to match actual test coverage

* fix(pairing): store meta, wrap upserts in transactions, case-insensitive role/channel, log Signal DB errors, use auth role in handlers

- Store meta JSONB/TEXT column in pairing_requests (PG migration V18, libSQL schema + incremental migration 19)
- Wrap upsert_pairing_request in transactions (PG: client.transaction(), libSQL: BEGIN IMMEDIATE/COMMIT/ROLLBACK)
- Case-insensitive role parsing: eq_ignore_ascii_case("admin") in both backends
- Case-insensitive channel matching in approve_pairing: LOWER(channel) = LOWER($2)
- Log DB errors in Signal resolve_sender_identity instead of silently discarding
- Use auth role from UserIdentity in web handlers (jobs.rs, routines.rs) via identity_from_auth helper
- Fix variable shadowing: rename `let channel` to `let req_channel` in libsql approve_pairing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(security): add auth to pairing list, cache eviction on deactivate, runtime assert in Signal, remove default fallback, warn on noop pairing codes

Addresses zmanian's review:
- nearai#1: pairing_list_handler requires AuthenticatedUser
- nearai#2: OwnershipCache.evict_user() evicts all entries for a user on suspension
- nearai#3: debug_assert! for multi-thread runtime in Signal block_in_place
- nearai#9: Noop PairingStore warns when generating unredeemable codes
- nearai#10: cli/mcp.rs default fallback replaced with <unset>

* fix(pairing): consistent LOWER() channel matching in resolve_channel_identity, fix wizard doc comment, fix E2E test assertion for ActionResponse convention

* fix(pairing): apply LOWER() consistently across all ChannelPairingStore queries (upsert, list_pending, remove)

All channel matching now uses LOWER() in both PostgreSQL and libSQL backends:
- upsert_pairing_request: WHERE LOWER(channel) = LOWER($1)
- list_pending_pairings: WHERE LOWER(channel) = LOWER($1)
- remove_channel_identity: WHERE LOWER(channel) = LOWER($1)

Previously only resolve_channel_identity and approve_pairing used LOWER(),
causing inconsistent matching when channel names differed by case.

* fix(pairing): unify code challenge flow and harden web pairing

* test: harden pairing review follow-ups

* fix: guard wasm pairing callbacks by runtime flavor

* fix(pairing): normalize channel keys and serialize pg upserts

* chore(web): clean up ownership review follow-ups

* Preserve WASM pairing allowlist compatibility

---------

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
drchirag1991 pushed a commit to drchirag1991/ironclaw that referenced this pull request Apr 8, 2026
…sts (nearai#2064)

* fix(staging): repair broken test build and macOS-incompatible SSRF tests

Staging tip (06a6759) was failing `cargo test` for several unrelated
reasons. This commit gets the test suite back to green on both Linux CI
and macOS.

1. **wrapper.rs:5595** — `PairingStore::new()` was called with zero args
   in `test_http_request_rejects_private_ip_targets`. The signature
   changed to `(db, cache)` in nearai#1898 and the other test in the same file
   (line 5573) was updated to use `PairingStore::new_noop()`, but this
   one was missed. Switch to `new_noop()` to match.

2. **CLI snapshot** — clap's `render_long_help` for `--auto-approve`
   now emits an indented blank line between the short and long
   description (10 spaces, not empty). Update the snapshot to match the
   new output and refresh `assertion_line` (438 -> 461).

3. **validate_base_url IPv6 bracket bug** — `Url::host_str()` returns
   IPv6 literals WITH the surrounding brackets (e.g. `[::1]`), but
   `IpAddr::parse` does not accept brackets — it wants bare `::1`. As a
   result the IPv6 SSRF defense was effectively dead code: every `[…]`
   host failed to parse and fell through to the DNS-resolution path.
   This passed on Linux CI by accident (because `to_socket_addrs` on
   Linux also fails on bracketed strings), but broke on any host whose
   resolver returns a public IP for unresolvable lookups (ISP captive
   portals, ad-injecting DNS providers). Strip the brackets before
   parsing so the IPv6 detection actually works as intended.

4. **DNS-hijack-tolerant test guards** — two tests
   (`validate_base_url_rejects_dns_failure`,
   `test_validate_public_https_url_fails_closed_on_dns_error`) rely on
   RFC 6761's promise that `.invalid` lookups fail. On networks with
   DNS hijacking that promise doesn't hold and the lookups succeed
   (typically resolving to a public ad-server IP). Probe with
   `ironclaw-dns-hijack-probe.invalid` and skip the test with an
   eprintln on hijacked-DNS networks. Coverage on CI is unchanged.

5. **ExtensionManager test isolation** — the
   `extension_manager_with_process_manager_constructs` integration test
   passes `store: None`, which makes `list()` fall back to file-based
   `load_mcp_servers()` reading `~/.ironclaw/mcp-servers.json`. Any
   locally installed MCP server (e.g. notion) leaked into the test and
   broke the empty assertion. Set `IRONCLAW_BASE_DIR` to a fresh
   tempdir at the top of the test (before the LazyLock is initialized)
   to fully isolate.

After this commit, `./scripts/dev-setup.sh` runs end-to-end and
`cargo test` passes 4709/0 locally on macOS as well as CI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(staging): address PR review feedback

- helpers.rs: simplify IPv6 bracket stripping to `host.trim_matches(...)`
  per gemini-code-assist suggestion. Functionally equivalent to the
  chained strip_prefix/strip_suffix/unwrap_or but cleaner; for any host
  string from `Url::host_str()` (which only ever returns matched
  brackets), the result is identical.

- setup/channels.rs: replace blocking `std::net::ToSocketAddrs` probe
  with `tokio::time::timeout(2s, tokio::net::lookup_host(...))` per
  Copilot review. The previous synchronous lookup could block a tokio
  worker thread inside `#[tokio::test]` and stall the suite on slow or
  offline DNS. The async resolver with a hard 2-second cap eliminates
  both risks.

- module_init_integration.rs: drop the brittle `is_empty()` assertion
  and the env-var override entirely, addressing both Copilot's review
  comment about parallel test ordering and the reviewer's deeper
  concern about touching process env from an integration test that
  cannot access the crate-private ENV_MUTEX. The test's actual purpose
  is to verify that ExtensionManager constructs and `list()` returns
  Ok — that's exactly what `is_ok()` checks. The empty assertion was
  always brittle (the test creates empty TOOL/CHANNEL dirs but does not
  isolate ~/.ironclaw, so any locally installed MCP server leaks in)
  and trying to "fix" it by mutating IRONCLAW_BASE_DIR from inside an
  integration test introduces order-dependent behaviour that the user
  flagged: parallel tests in the same binary can race the LazyLock,
  and there is no integration-test-visible mutex to serialise env
  mutations. Removing the assertion is the principled fix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ironclaw-ci ironclaw-ci bot mentioned this pull request Apr 10, 2026
@ironclaw-ci ironclaw-ci bot mentioned this pull request Apr 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contributor: core 20+ merged PRs risk: high Safety, secrets, auth, or critical infrastructure scope: agent Agent core (agent loop, router, scheduler) scope: channel/cli TUI / CLI channel scope: channel/wasm WASM channel runtime scope: channel/web Web gateway channel scope: channel Channel infrastructure scope: db/libsql libSQL / Turso backend scope: db/postgres PostgreSQL backend scope: db Database trait / abstraction scope: docs Documentation scope: extensions Extension management scope: llm LLM integration scope: orchestrator Container orchestrator scope: pairing Pairing mode scope: setup Onboarding / setup scope: tool/builtin Built-in tools scope: tool/mcp MCP client scope: worker Container worker size: XL 500+ changed lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants