Skip to content

feat: Add NEAR key management with transaction signing and policy engine#14

Closed
ilblackdragon wants to merge 1 commit intomainfrom
key-management
Closed

feat: Add NEAR key management with transaction signing and policy engine#14
ilblackdragon wants to merge 1 commit intomainfrom
key-management

Conversation

@ilblackdragon
Copy link
Copy Markdown
Member

Implements hybrid-custody NEAR key management where the agent holds scoped function-call keys for routine operations while high-value operations require explicit user approval through the existing channel approval flow.

Core infrastructure:

  • Ed25519 key generation/import via ed25519-dalek (not near-crypto)
  • AES-256-GCM encrypted storage via existing SecretsStore
  • Hand-rolled borsh-serializable NEAR transaction types
  • NEP-413 intent signing and MPC chain signature support
  • Configurable policy engine with transaction analysis pipeline
  • Daily spend tracking with automatic midnight UTC reset
  • Encrypted backup/restore with Argon2id KDF
  • CLI subcommands: generate, import, list, info, remove, export, policy, backup, restore
  • NEAR ed25519 secret key leak detection (Critical/Block)
  • WASM sign-payload host function (keys never enter WASM memory)
  • KeyManager wired into AgentDeps for agent-wide access

Security invariants: private keys never reach the LLM or WASM boundary, signing happens in host Rust code with Zeroize on drop, every transaction is analyzed before signing, most-restrictive policy rule wins.

Implements hybrid-custody NEAR key management where the agent holds scoped
function-call keys for routine operations while high-value operations require
explicit user approval through the existing channel approval flow.

Core infrastructure:
- Ed25519 key generation/import via ed25519-dalek (not near-crypto)
- AES-256-GCM encrypted storage via existing SecretsStore
- Hand-rolled borsh-serializable NEAR transaction types
- NEP-413 intent signing and MPC chain signature support
- Configurable policy engine with transaction analysis pipeline
- Daily spend tracking with automatic midnight UTC reset
- Encrypted backup/restore with Argon2id KDF
- CLI subcommands: generate, import, list, info, remove, export, policy, backup, restore
- NEAR ed25519 secret key leak detection (Critical/Block)
- WASM sign-payload host function (keys never enter WASM memory)
- KeyManager wired into AgentDeps for agent-wide access

Security invariants: private keys never reach the LLM or WASM boundary,
signing happens in host Rust code with Zeroize on drop, every transaction
is analyzed before signing, most-restrictive policy rule wins.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comment thread src/keys/types.rs
/// Rules: 2-64 chars, lowercase alphanumeric + `.`, `-`, `_`.
/// No leading/trailing separators, no consecutive separators.
#[derive(Clone, PartialEq, Eq, Hash, Serialize, Deserialize)]
pub struct NearAccountId(String);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of handrolling these can we not use near protocol crates? These types should already exist in near sdk

serrrfirat pushed a commit to serrrfirat/ironclaw that referenced this pull request Feb 11, 2026
Reviews PRs nearai#10, nearai#13, nearai#14, nearai#17, nearai#18, nearai#20, nearai#28 covering:
- Critical: hand-rolled NEAR tx serialization and key mgmt (PR nearai#14)
- High: hooks system can bypass safety layer (PR nearai#18)
- High: DM pairing token security needs verification (PR nearai#17)
- Medium: auth bypass when API key mode set without key (PR nearai#20)
- Medium: safety error retry classification in failover (PR nearai#28)
- Low: Okta WASM tool and benchmarking harness

https://claude.ai/code/session_01B75Rq9u593YG9Kc4FG487Z
@tribendu
Copy link
Copy Markdown

Code Review: IronClaw PR #14 - NEAR Key Management

Summary

This PR introduces comprehensive NEAR blockchain key management with ed25519 signing, a policy engine for transaction approval, backup/restore functionality, and WASM payload signing capabilities. The implementation demonstrates solid security fundamentals: proper use of Argon2id for key derivation, AES-256-GCM for encryption, Zeroize for secure memory handling, and a leak detector for secret key detection. However, there are several concerning areas requiring attention before this security-critical code should be merged.

Pros

  • Strong encryption defaults: Uses Argon2id (memory-hard) for backup key derivation with AES-256-GCM (authenticated encryption), avoiding weak KDFs like PBKDF2
  • Memory safety: Implements Zeroize trait on SigningKey to clear key material from memory immediately after use
  • Defense in depth: Policy engine supports multiple layers (transfer limits, whitelists, daily spend limits, function call rules)
  • Secret leak detection: Added regex pattern for NEAR ed25519 secret keys (80-90 char base58) with Block action
  • WASM boundary security: Private keys never enter WASM memory; signing happens in host code with only signature returned
  • Proper key validation: Account IDs validated on construction, public keys validated for correct length and format
  • Comprehensive test coverage: Tests for policy evaluation, transaction analysis, key serialization, glob matching, and spend tracking

Concerns

  1. Policy bypass in scope check (src/keys/policy.rs:456-472): The function call policy check has an early return for scoped keys that may allow transactions outside the key's actual scope. The logic action.value_yocto == 0 should verify the receiver_id matches the key's scoped receiver, otherwise a function-call key scoped to contract A could potentially be used for zero-deposit calls to contract B.

  2. Incomplete Argon2 parameters (src/keys/mod.rs:853): Uses Argon2::default() which uses Argon2id but doesn't specify memory cost (m), iterations (t), or parallelism (p). The default of 2^16 KB memory may be insufficient for security-critical backup encryption. Should explicitly set: Argon2::new(Algorithm::Argon2id, Version::V0x13, Params::new(65536, 3, 4, None)?).

  3. No key rotation mechanism: The implementation stores keys indefinitely without any rotation policy. Long-lived keys increase attack surface if the secrets store is compromised.

  4. Missing rate limiting on RPC client (src/keys/rpc.rs): The RPC client makes network calls without any timeout or retry logic. Long-running requests could cause the agent to hang indefinitely.

  5. Policy stored unencrypted (src/cli/key.rs:45-48): The policy configuration at ~/.ironclaw/key_policy.json is stored in plaintext. While it doesn't contain secrets, it reveals security limits to attackers who gain filesystem access.

  6. No transaction nonce management: The implementation fetches nonce per transaction with nonce = access_key.nonce + 1 but doesn't handle concurrent transaction ordering. Multiple simultaneous transactions could result in nonce collisions.

  7. Incomplete PayloadSigner implementation: The PayloadSigner trait is defined but never actually implemented with a real key manager integration. The WASM signing capability will fail at runtime.

Suggestions

  1. Harden Argon2 parameters for backup encryption with explicit memory cost (64MB), iterations (3), and parallelism (4):

    use argon2::{Algorithm, Version, Params};
    let params = Params::new(65536, 3, 4, None)?;
    let argon2 = Argon2::new(Algorithm::Argon2id, Version::V0x13, params);
  2. Add transaction queue or nonce management to prevent concurrent transaction nonce collisions.

  3. Implement actual signer injection into WASM capabilities by connecting KeyManager as the PayloadSigner implementation.

  4. Add RPC timeouts and exponential backoff for network resilience:

    let client = reqwest::Client::builder()
        .timeout(Duration::from_secs(30))
        .build()?;
  5. Consider encrypting policy file or at minimum adding file permissions checks (0600).

  6. Add integration tests for end-to-end signing flows with actual NEAR testnet.

  7. Implement key rotation with explicit rotation policy and metadata.

Copy link
Copy Markdown
Member Author

@ilblackdragon ilblackdragon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: feat: Add NEAR key management with transaction signing and policy engine

This is a well-structured PR that adds critical security infrastructure. The hybrid custody model, the policy engine, and the WASM signing boundary are all architecturally sound. The test coverage is strong. That said, since this is key management -- the kind of code where a bug means lost funds -- I have several concerns that should be addressed before merging.


Security Issues (Must Fix)

1. TOCTOU race in generate_key and import_key (src/keys/mod.rs)

Both methods call load_store() to check for duplicates, then call load_store() again later to save. Between the check and the save, another concurrent call (or even the same KeyManager from another task) could insert a key with the same label. The second load_store() clobbers silently.

// Line ~133: First load to check duplicates
let store = self.load_store().await?;
if store.keys.contains_key(label) { ... }

// ... key generation and secrets store insertion happen here ...

// Line ~179: Second load to save -- races with first!
let mut store = self.load_store().await?;
store.keys.insert(label.to_string(), metadata.clone());
self.save_store(&store).await?;

This is the same TOCTOU pattern flagged in CLAUDE.md under "Fix the pattern, not just the instance." Both generate_key and import_key have this. The secrets store create call could succeed while the metadata save races, leaving an orphaned secret. At minimum, do a single load, mutate, save cycle. Better: use a file lock or an RwLock on the KeyManager.

2. Binary body scanning regression in leak_detector (src/safety/leak_detector.rs)

The PR removes the String::from_utf8_lossy scan for HTTP request bodies and replaces it with a strict from_utf8 check:

// Before (secure):
let body_str = String::from_utf8_lossy(body_bytes);
self.scan_and_clean(&body_str)?;

// After (bypasses scanning):
if let Ok(body_str) = std::str::from_utf8(body_bytes) {
    self.scan_and_clean(body_str)?;
}
// Binary bodies are not scanned

This means an attacker can prepend a single 0xFF byte to an exfiltration payload containing ed25519:... or sk-proj-... and the leak detector will skip the body entirely. The old code handled this correctly with from_utf8_lossy. The deleted test test_scan_http_request_blocks_secret_in_binary_body explicitly tested this exact attack vector. Revert this change.

3. Metadata file (~/.ironclaw/keys.json) stores key labels in plaintext on disk without filesystem permissions hardening

The save_store method writes keys.json with default umask permissions. On multi-user systems, another user could read which keys exist, their account IDs, public keys, networks, etc. While the private keys themselves are in the encrypted secrets store, metadata leakage is still an operational security concern. At minimum, set file mode to 0600 after write (or use std::os::unix::fs::OpenOptionsExt with mode).

4. Policy file (~/.ironclaw/key_policy.json) is unprotected

The policy file controls what the agent can auto-approve. If an attacker can modify this file, they can set transfer_auto_approve_max_yocto to u128::MAX and whitelist their own account. The file should have restrictive permissions and, ideally, a MAC/checksum so modifications are detected.


Correctness Issues

5. wrapper.rs rewrite is a breaking change (net -1180 lines)

The PR replaces the entire WASM tool wrapper with a new implementation that uses the low-level Val API instead of wasmtime::component::bindgen!(). This drops:

  • WasiView implementation (WASI context)
  • All HTTP host functions (http-request with timeout, headers, body)
  • Credential injection (placeholder substitution, host-based injection, redaction)
  • OAuth token refresh (OAuthRefreshConfig)
  • LeakDetector integration in WASM responses
  • tool-invoke and secret-exists host function bindings
  • The bindgen!() macro entirely (replaced with manual Val marshalling)

The WIT file also removes the timeout-ms parameter from http-request. Only log, now-millis, workspace-read, and the new sign-payload are wired up in the new wrapper. This means all existing WASM tools that use HTTP, secrets, tool invocation, or credentials will break. This seems unintentional or at least needs to be called out in the PR description.

6. Spend tracking not recorded on approval-then-sign path (src/keys/mod.rs:314)

Spend is only recorded in the AutoApprove branch of sign_transaction:

PolicyDecision::AutoApprove => {
    // ...signs...
    if analysis.total_value_yocto > 0 {
        let _ = self.spend_tracker.record_spend(...).await;
    }
}

But when approval is granted and the transaction is subsequently signed (the ApprovalRequired path), there is no code path that records the spend. The daily limit can be bypassed by always going through the approval flow for small transactions.

7. _domain parameter unused in build_chain_signature_action (src/keys/chain_signatures.rs:168)

The SignatureDomain parameter is accepted but ignored. The generated sign call doesn't pass the domain to the MPC contract. This could result in signing with the wrong curve if the contract defaults differ from what the caller expects.


Design Feedback

8. Flat file storage for keys metadata and spend tracking

Both keys.json and spend_tracking.json are read-modify-write JSON files. This has no concurrency safety (see #1), no atomicity (a crash mid-write corrupts the file), and doesn't follow the project's Database trait pattern. Per CLAUDE.md: "All new features that touch persistence MUST support both backends." Consider:

  • Atomic writes (write to temp file, then rename)
  • Or migrate to the Database trait with proper implementations

9. Default policy is too permissive for full-access keys

PolicyConfig::default() sets deny_full_access_operations: false, meaning a full-access key can be used for transfers, function calls, etc. The PR description says "high-value operations require explicit user approval" but the default policy auto-approves zero-value transfers from full-access keys (because transfer_auto_approve_max_yocto: 0 means "up to 0 yocto is auto-approved"). While this technically blocks non-zero transfers, function calls with zero deposit through a full-access key on an arbitrary contract are also auto-approved if the key is scoped. Consider defaulting deny_full_access_operations: true to match the "hybrid custody" philosophy.

10. let _ = self.secrets_store.delete(...) silently ignores deletion errors (src/keys/mod.rs:265)

In remove_key, if the secret deletion fails but the metadata is already removed, the key becomes orphaned in the secrets store with no way to clean it up. At least log the error.


Style / Minor

11. let-else refactoring in capabilities.rs and leak_detector.rs

The PR refactors if let ... && ... chains into nested if let / if blocks. This is fine for compatibility, but the commit message doesn't mention it. These should be in a separate commit or at least noted in the PR description since they touch security-critical code paths (endpoint matching, leak detection).

12. format_yocto has precision loss for amounts between 1 milliNEAR and 1 NEAR

let frac = (yocto % ONE_NEAR) / ONE_MILLI_NEAR;

This integer division truncates. E.g., 1.999 NEAR formats as "1.999 NEAR" but 0.001999 NEAR formats as "0.001 NEAR" (the 999 microNEAR is lost). For a financial display, consider at least 6 decimal places.

13. No #[cfg(test)] on InMemorySecretsStore import

The tests in mod.rs use InMemorySecretsStore -- make sure this type is available at test time (it appears to be, but worth a compilation check with --no-default-features).


Summary

The core key management design is solid -- encrypted storage, Zeroize on drop, keys never reaching WASM, policy-before-sign pipeline. But the TOCTOU race, the leak detector regression, and the wrapper.rs rewrite breaking existing WASM tools are blockers. The spend tracking gap is a financial correctness bug that should also be fixed before merge.

Recommended disposition: Request changes on items 1, 2, 5, 6 before merging.

ilblackdragon added a commit that referenced this pull request Apr 14, 2026
…, binary writes

- Add pre-intercept safety param validation so sandbox-dispatched calls
  go through the same checks as host-dispatched calls (#1)
- Set network_mode: "none" on sandbox containers to prevent outbound
  network access (#3)
- Reject binary content in containerized write instead of silently
  corrupting via from_utf8_lossy (#5)
- Cap list_dir depth to 10 to prevent unbounded traversal (#8)
- Change container creation log from info! to debug! to avoid breaking
  REPL/TUI output (#10)
- Make is_truthy case-insensitive so SANDBOX_ENABLED=True works (#11)
- Return error instead of unwrap_or_default for missing container ID (#12)
- Propagate set_permissions errors instead of silently ignoring (#13)
- Return error for missing daemon output key instead of defaulting to
  empty object (#14)
- Add env mutex guard in sandbox_live_e2e test (#15)
- Fix rustfmt formatting for let-chain in canonicalize_under_root

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ilblackdragon added a commit that referenced this pull request Apr 19, 2026
* feat(engine-v2): mount-backend abstraction for per-project sandbox (Phase 1)

Adds the engine-side `MountBackend` trait + minimal `WorkspaceMounts` registry
and a host-side bridge interceptor that routes sandbox-eligible tool calls
(`file_read`, `file_write`, `list_dir`, `apply_patch`, `shell`) through a
backend when their path argument starts with `/project/`. Default behavior is
unchanged: until `EffectBridgeAdapter::set_workspace_mounts(Some(...))` is
called (Phase 6), the interception path is dormant.

This is the first phase of the per-project sandbox plan
(`docs/plans/2026-04-10-engine-v2-sandbox.md`) and a deliberately small subset
of the unified Workspace VFS proposed in #1894 — just enough
abstraction so the sandbox can be a `MountBackend` rather than a special case
in the bridge. When #1894's full mount table lands, the sandbox backend slots
in unchanged.

Engine crate (`crates/ironclaw_engine/src/workspace/`):
- `mount.rs` — `MountBackend` trait, `MountError` (NotFound / InvalidPath /
  PermissionDenied / Io / Tool / Backend / Unsupported), `DirEntry`,
  `EntryKind`, `ShellOutput`
- `filesystem.rs` — `FilesystemBackend`: passthrough host-fs implementation
  with two-layer path validation (lexical reject of absolute / `..`, then
  symlink-escape canonicalization). `read`/`write`/`list` fully implemented;
  `patch`/`shell` return `Unsupported` so the bridge falls through to the
  host tool until Phase 5
- `registry.rs` — `WorkspaceMounts` per-project registry with lazy
  `ProjectMountFactory`, longest-prefix-match resolution, cached and
  invalidatable

Bridge (`src/bridge/sandbox/`):
- `intercept.rs` — `maybe_intercept` and `SANDBOX_TOOL_NAMES`. Returns
  `Handled(json)` on a successful backend dispatch, `FellThrough` for
  non-sandbox tools, host paths, missing path params, or `Unsupported`
  backend ops
- `effect_adapter.rs` — `workspace_mounts` field + `set_workspace_mounts`
  setter; interception block in `execute_action_internal` right before
  `execute_tool_with_safety`, gated on the optional mount table

Tests (31 new):
- 17 engine workspace unit tests covering trait error mapping, path safety
  (lexical + symlink), longest-prefix routing, and lazy factory caching
- 9 bridge sandbox unit tests including `intercept_actually_dispatches_into_backend`
  (counting backend) which proves the interceptor reaches the backend
- 5 integration tests in `tests/engine_v2_sandbox_integration.rs` driving
  `EffectBridgeAdapter::execute_action()` end-to-end per the
  "Test Through the Caller" rule (`.claude/rules/testing.md`), including
  a host-path-falls-through test that asserts the sandbox tempdir was
  not touched, and a `..`-escape test that verifies no `/etc/passwd`
  content leaks even after safety-layer redaction

Drive-by: feature-gate two pre-existing dead-code helpers in
`crates/ironclaw_skills/src/parser.rs` on `#[cfg(feature = "registry")]` to
match their only call site, fixing a pre-existing clippy warning that blocked
the workspace's `-D warnings` policy when `ironclaw_skills` is built with
`default-features = false` (as the engine crate does).

Verification:
- `cargo fmt --check` clean
- `cargo clippy --all --benches --tests --examples --all-features` zero warnings
- 31 / 31 new tests passing; no existing tests broken

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(engine-v2): per-project sandbox — Phases 2–7 + live Docker e2e test

Completes the per-project sandbox plan (docs/plans/2026-04-10-engine-v2-sandbox.md
Phases 2–7), building on Phase 1's mount-backend abstraction (#2211).

Phase 2 — Project workspace folder:
- `Project.workspace_path: Option<PathBuf>` field + `with_workspace_path()`
- Host-side `project_workspace_path()`, `ensure_project_workspace_dir()` (creates
  `~/.ironclaw/projects/<id>/` mode 0700, idempotent)
- `FilesystemMountFactory` taking a `ProjectPathResolver` closure (decoupled from
  `Store`); wired into `EffectBridgeAdapter` via `set_workspace_mounts()`

Phase 3 — Standalone daemon binary:
- `src/bin/sandbox_daemon.rs` — NDJSON over stdin/stdout, health/shutdown/execute_tool
- Constructs ReadFileTool/WriteFileTool/ListDirTool/ApplyPatchTool/ShellTool with
  `base_dir=/project` (override via `IRONCLAW_SANDBOX_BASE_DIR`)

Phase 4 — Dockerfile.sandbox:
- Multi-stage build: rust-slim builder (+ python3 for pyo3) compiles sandbox_daemon;
  debian-slim runtime with tini PID 1, common build tools, `/project` mount target

Phase 5 — ProjectSandboxManager + ContainerizedFilesystemBackend:
- protocol.rs: Request/Response/RpcError matching daemon wire format
- transport.rs: `SandboxTransport` trait (seam for testing without Docker)
- containerized_backend.rs: `ContainerizedFilesystemBackend` impls `MountBackend`,
  translates relative→`/project/<rel>`, maps tool-error→MountError
- docker_transport.rs: real bollard exec session, serialized Mutex, lazy reconnect
- lifecycle.rs: deterministic `ironclaw-sandbox-<pid>` naming, ensure_running/stop/remove
- manager.rs: `ProjectSandboxManager` per-project transport cache

Phase 6 — Router gating on ENGINE_V2_SANDBOX:
- `engine_v2_sandbox_enabled()` helper (truthy: 1/true/yes/on)
- Router selects `ContainerizedMountFactory` when enabled + Docker reachable;
  falls back to `FilesystemMountFactory` with warning otherwise

Live e2e bugs caught and fixed:
- Shell without explicit `workdir` defaulted to host (not sandbox); fixed by
  defaulting to `/project/` in `extract_path_param`
- `ContainerizedFilesystemBackend::shell` parsed `stdout`/`stderr` but host
  ShellTool returns merged `output` field; fixed with fallback key lookup
- SANDBOX_TOOL_NAMES only had v2 names (`file_read`/`file_write`) but host
  registry uses v1 names (`read_file`/`write_file`); added both aliases

Tests (62 sandbox-related, all green):
- 27 bridge sandbox unit tests (intercept, workspace_path, factory, protocol,
  lifecycle, containerized_backend with ScriptedTransport mock)
- 7 containerized-backend tests (including 2 regression tests for the shell bugs)
- 5 engine v2 sandbox integration tests (EffectBridgeAdapter end-to-end)
- 5 daemon binary smoke tests (real subprocess + NDJSON I/O)
- 17 engine workspace unit tests
- 1 live Docker e2e test: agent clones nearai/ironclaw into sandbox, renames
  to megaclaw via sed, verifies with grep — 70s, $0.09, recorded trace committed

Verification:
- `cargo fmt --check` clean
- `cargo clippy --all --benches --tests --examples --all-features` zero warnings
- All 62 sandbox tests passing; no existing tests broken

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: replace .expect() with Result in DockerTransport::ensure_session

CI's no-panics checker flagged the .expect("just inserted") in production
code. Replace with .ok_or_else() returning MountError::Backend.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: multi-tenant project paths + unify sandbox env var with v1

Two issues addressed:

1. Project workspace paths now namespace by user_id:
   `~/.ironclaw/projects/<user_id>/<project_id>/` instead of
   `~/.ironclaw/projects/<project_id>/`. Prevents filesystem collisions
   in multi-tenant deployments where two users could theoretically have
   the same project UUID.

2. Sandbox enablement now reads `SANDBOX_ENABLED` (same env var as v1
   sandbox) in addition to `ENGINE_V2_SANDBOX`. Either being truthy
   enables the per-project sandbox. This means a single flag governs
   sandbox behavior regardless of engine version, while the v2-specific
   override remains available for transitional setups.

Tests: 30 bridge sandbox unit tests passing (added multi-tenant path
tests + env var combination tests).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address PR review — TOCTOU race, shell env passthrough, canonicalize guard

Three issues flagged by the code review bot on #2211:

1. TOCTOU race in WorkspaceMounts::resolve (HIGH): Added double-checked
   locking — re-check the cache after acquiring the write lock so two
   threads racing on the same project's first access don't both call
   factory.build(). The second thread finds the insert from the first.

2. Shell intercept ignores env parameter (MEDIUM): The shell arm in
   maybe_intercept was passing HashMap::new() instead of forwarding
   the tool call's env map. Fixed to parse parameters["env"] and pass
   it through to backend.shell().

3. Canonicalization fails when root doesn't exist (MEDIUM): When
   self.root hasn't been created yet (first write to a new project),
   canonicalize_under_root would walk up to a real ancestor and the
   starts_with check against the non-existent root would always fail.
   Now skips canonicalization entirely when root doesn't exist — lexical
   safety is already guaranteed by safe_join.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address PR review round 2 — apply_patch schema, content validation, dir perms, docs

- Fix apply_patch schema mismatch: MountBackend::patch now takes
  (old_string, new_string, replace_all) matching ApplyPatchTool's
  actual contract. Previously sent {patch: diff} which would fail
  with invalid_params in the containerized daemon.
- Validate file_write content param: return error instead of silently
  writing empty string when content is missing.
- Log stderr frames from sandbox daemon at debug! instead of silently
  discarding them in docker_transport StreamReader.
- Tighten permissions on intermediate directories created by
  ensure_project_workspace_dir (projects/, <user_id>/) to 0o700,
  not just the leaf.
- Fix stale module doc in sandbox/mod.rs (referenced "Phase 5 will
  add" but all phases shipped).
- Fix doc path mismatch: workspace path is <user_id>/<project_id>/,
  not <project_id>/ (workspace_path.rs, CLAUDE.md, design plan).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address PR review round 3 — symlink safety, visibility, debug logging

- Close TOCTOU window in canonicalize_under_root: re-canonicalize and
  verify containment when the reassembled path exists on disk
- Fix list_dir_recursive: use symlink_metadata (lstat) so symlinks are
  detected instead of followed; validate directories against root before
  recursive traversal
- Tighten is_mountable_path to /project/, /memory/, /home/ prefixes
  instead of any absolute path (defense-in-depth)
- Narrow sandbox module visibility to pub(crate) and remove unused
  pub use re-exports
- Remove concrete types (FilesystemBackend, DirEntry, EntryKind,
  ShellOutput) from engine crate top-level re-exports; access via
  ironclaw_engine::workspace:: module path
- Add debug! tracing to sandbox intercept routing decisions
- Add read_file/write_file v1 aliases to daemon SUPPORTED_TOOLS health
  response
- Remove developer-local path from sandbox mod.rs doc comment
- Merge staging to fix CI (user_timezone field on ThreadExecutionContext)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address PR review round 4 — safety validation, network isolation, binary writes

- Add pre-intercept safety param validation so sandbox-dispatched calls
  go through the same checks as host-dispatched calls (#1)
- Set network_mode: "none" on sandbox containers to prevent outbound
  network access (#3)
- Reject binary content in containerized write instead of silently
  corrupting via from_utf8_lossy (#5)
- Cap list_dir depth to 10 to prevent unbounded traversal (#8)
- Change container creation log from info! to debug! to avoid breaking
  REPL/TUI output (#10)
- Make is_truthy case-insensitive so SANDBOX_ENABLED=True works (#11)
- Return error instead of unwrap_or_default for missing container ID (#12)
- Propagate set_permissions errors instead of silently ignoring (#13)
- Return error for missing daemon output key instead of defaulting to
  empty object (#14)
- Add env mutex guard in sandbox_live_e2e test (#15)
- Fix rustfmt formatting for let-chain in canonicalize_under_root

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address review round 5 — path traversal, error types, tests

Security fixes:
- Sanitize user_id in workspace path to prevent directory traversal via
  malicious user IDs containing `..` or `/`
- Add Component::ParentDir check in ContainerizedFilesystemBackend::container_path
  matching the defense-in-depth approach of FilesystemBackend::safe_join

Correctness:
- Use MountError::Tool instead of MountError::InvalidPath for missing
  tool parameters (content, old_string, new_string) — fixes confusing
  LLM-visible error messages
- Fix clippy sort_by_key suggestion in registry.rs

Cleanup:
- Remove spurious Notify import and dead _notify_link function

New tests:
- ContainerizedFilesystemBackend path traversal rejection (read + write)
- container_path unit tests for safe and unsafe paths
- Adversarial user_id test in workspace_path
- Daemon-side path traversal test in sandbox_daemon_smoke

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address review round 6 — param normalization, error types, edge cases

- Normalize sandbox params via prepare_tool_params() before validation,
  matching the host execution path (fixes inconsistent validation)
- Return ToolError::InvalidParameters instead of EngineError::Effect for
  sandbox param validation failures (consistent error surface)
- ensure_dir checks path.is_dir() not path.exists() (rejects files)
- Empty user_id returns "_anonymous" sentinel instead of empty hex string
  that would drop the tenant namespace via PathBuf::join("")
- Restore ENGINE_V2_SANDBOX env var after sandbox live E2E test
- Tighten is_mountable_path to /project/ only (no mounts for /memory/
  or /home/ yet)
- Add v1 tool name aliases (read_file, write_file) to SUPPORTED_TOOLS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: unify sandbox env var — remove ENGINE_V2_SANDBOX, use SANDBOX_ENABLED only

Single env var controls sandboxing for both engine versions. The
transitional ENGINE_V2_SANDBOX override is removed from code, tests,
docs, and Dockerfile.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: double-checked locking in transport_for, explicit stdin close in smoke test

- ProjectSandboxManager::transport_for no longer holds the mutex across
  the Docker ensure_running await. Uses double-checked locking so
  concurrent projects initialize in parallel.
- sandbox_daemon_smoke: explicitly take() stdin before wait_with_output
  so EOF is sent even without a shutdown request.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address review — network mode, error types, race, protocol dedup

- Change sandbox container network_mode from "none" to default bridge
  so git clone / cargo build / pip install work inside the container
- Fix binary content rejection to use MountError::Tool instead of
  MountError::InvalidPath (semantic mismatch)
- Fix list depth: use actual depth value instead of depth.max(1)
- Fix orphan container race in transport_for by holding lock across
  container creation instead of double-checked locking
- Deduplicate protocol types: daemon now imports from shared
  bridge::sandbox::protocol instead of defining its own copies
- Make bridge::sandbox pub (narrow exposure: only protocol and
  workspace_path sub-modules are pub)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update plan doc — sandbox uses bridge networking, not network_mode=none

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants