chore: promote staging to staging-promote/79ad2e38-24572736232 (2026-04-17 16:19 UTC)#2587
Merged
henrypark133 merged 1 commit intomainfrom Apr 18, 2026
Merged
Conversation
…#1958) * fix(engine): security hardening for v2 orchestrator and Monty sandbox Address deferred security items C1/C2/C4/M2 from PR #1557 review: C1/C2 — Orchestrator self-modification approval gates: - memory_write tool now returns ApprovalRequirement::Always for protected orchestrator paths when ORCHESTRATOR_SELF_MODIFY=true, forcing human approval before any orchestrator or prompt overlay patch is written - Store adapter validates Python syntax via Monty parser before persisting orchestrator patches, preventing broken code from consuming failure budget - Content hash (SHA-256) stamped on all protected docs for audit trail C4 — Sandbox security test coverage (6 new tests): - sandbox_enforces_rlm_query_depth_limit: depth check at max recursion - sandbox_rejects_final_injection: FINAL() captures payload literally - sandbox_rejects_tool_name_injection: dynamic names can't bypass leases - sandbox_context_variable_is_not_mutable: Python mutations don't affect Rust - sandbox_handles_deep_recursion: infinite recursion terminates safely - validate_syntax_rejects_broken_code: syntax validation unit test M2 — Remove ownership-bypassing thread operations: - Delete stop_thread_system() and inject_message_system() — dead code with no callers, was a privilege escalation footgun. Ownership-checking variants (stop_thread/inject_message) are the only API now. Also: document child lease budget snapshot semantics (intentional design), add EngineError::InvalidInput variant, re-export validate_python_syntax. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: cargo fmt Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: collapse nested if per clippy suggestion Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: cargo fmt Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(engine): address PR #1958 review findings Review feedback from Copilot and serrrfirat: 1. Use ORCHESTRATOR_FAILURES_TITLE constant consistently in both branches of the self-modify gate (was string literal in deny branch). 2. Normalize metadata to {} before stamping content_hash, so the audit trail is reliably present even for docs with null/non-object metadata. 3. Add 256KB size cap to validate_python_syntax() to prevent pathological inputs from causing CPU/memory pressure on the store write path. 4. Tighten sandbox_rejects_tool_name_injection test assertion to verify the expected "not_found" outcome, not just absence of "ESCAPED". 5. Change ApprovalRequirement::Always → UnlessAutoApproved for protected orchestrator writes. The v2 effect bridge maps Always to hard denial (LeaseDenied), making the self-modify path unusable. UnlessAutoApproved triggers the gate/pause flow so human approval is possible. 6. Extend is_protected_orchestrator_path() to cover physical workspace paths (engine/orchestrator/*) in addition to logical aliases. Prevents bypassing the approval gate by writing to the persisted file path. 7. Persist project_id and user_id in frontmatter (serialize_knowledge_doc) and restore them on load (deserialize_knowledge_doc). Previously all reloaded docs got project_id=nil and user_id="legacy", making them invisible to project-scoped queries after restart. 8. Synthesize MemoryDocs from raw .py orchestrator files on startup (synthesize_orchestrator_doc_from_py). Orchestrator versions are persisted as engine/orchestrator/v{N}.py but load_knowledge_docs could not parse them — they silently disappeared on restart. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(engine): PR #1958 round-2 review — close path-bypass, forgeable-metadata, and rehydration holes Address the critical/high findings from @serrrfirat's second review on PR #1958. Every fix is paired with a caller-level regression test per the "Test Through the Caller" rule in .claude/rules/testing.md. ## Critical — path normalization bypass `is_protected_orchestrator_path` and `is_orchestrator_code_path` both did raw `starts_with` on user input. Paths like `engine/./orchestrator/v3.py`, `engine//orchestrator/v3.py`, and `engine/knowledge/../orchestrator/v3.py` resolved to the protected location but sidestepped the check, letting LLM-authored writes skip the approval gate *and* syntax validation. Added `normalize_workspace_path` (in tools/builtin/memory.rs) and `normalize_path` (in bridge/store_adapter.rs): strip `.` and empty segments, reject `..` traversal outright. Both guards now normalize first, then match against both `.system/engine/orchestrator/` and the legacy `engine/orchestrator/` prefix. Traversal attempts are treated as protected so they route through the approval gate — and `execute()` rejects them with `InvalidParameters` before reaching the workspace. ## High — forgeable `is_system_internal` metadata The gate previously trusted `doc.metadata.source == "compiled_in"` to mark a write as system-internal. An LLM tool call could craft a doc with that field and bypass both denial and validation. Replaced with a `tokio::task_local!`-backed trusted-write scope in the new `crates/ironclaw_engine/src/runtime/internal_write.rs` module: runtime::with_trusted_internal_writes(async { store.save_memory_doc(&seed).await }) The orchestrator v0 seeder in `MissionManager::seed_orchestrator_v0` enters the scope; the store gate reads `is_trusted_internal_write_active()`. Task-locals cannot be set from outside a trusted callsite and do not propagate across `tokio::spawn`, so an untrusted caller cannot inherit the flag. A unit test asserts the no-propagation property. ## High — rehydration rendered orchestrator invisible after restart `synthesize_orchestrator_doc_from_py` returned a doc with `ProjectId::nil()`, but `list_memory_docs` filters by exact project, so persisted `.py` orchestrator files disappeared from project-scoped queries after a process restart and the runtime silently reverted to compiled-in defaults. Override `HybridStore::list_shared_memory_docs` to surface docs flagged as "physically global" (by title — orchestrator, failure tracker, prompt overlay) for any project query, regardless of the stored project_id. Physical storage is one file per workspace; the override reflects that. New end-to-end round-trip test writes an orchestrator via the store, rebuilds a fresh HybridStore from the same workspace, and asserts the rehydrated doc is visible to both the original project and an unrelated project. ## High — env var read on every gate check Both `memory_write::requires_approval` and `save_memory_doc` read `ORCHESTRATOR_SELF_MODIFY` from the environment on every call. Env vars are global mutable state; a future sandbox escape could flip a security gate mid-flight. Centralized in `runtime::self_modify_enabled()`: a process-wide `OnceLock<bool>` seeded on first read. Tool, store, engine loop, and self-improvement mission all share the same snapshot. Tests need to flip the flag, so the module also exposes a `SelfModifyTestGuard` that overrides the snapshot and serializes concurrent tests via a process-wide `Mutex`. The override layer is compiled out of release builds (`cfg(debug_assertions)`). ## High — PR description / code mismatch + hard denial regression Original PR said `ApprovalRequirement::Always`, code used `UnlessAutoApproved`. The `Always` path is mapped by the v2 effect bridge to `LeaseDenied` (permanent refusal), not to a resumable approval gate. Fixed the code to `UnlessAutoApproved` (already in the previous round), but added an end-to-end regression test in `effect_adapter.rs` that drives `EffectBridgeAdapter::execute_action` with the real `MemoryWriteTool` and asserts the protected target produces `GatePaused(Approval)` — not `LeaseDenied`. Sibling test asserts that with self-modify disabled, the write surfaces as a non-resumable refusal so the agent doesn't loop on an unreachable approval. ## Medium — audit-hash scope The content hash stamped on protected docs is **write-time audit only**: the workspace file is the trust boundary, and anyone with workspace access can edit the raw file bypassing save_memory_doc. Documented this explicitly in `save_memory_doc` so future readers don't mistake it for a runtime integrity check. Also cleaned up the clippy `map_for_value` nit by switching to `if let Some(map) = ...`. ## Test coverage (all caller-level, new) Unit tests (41 new): - `memory.rs` path normalization & protected-path guard (dot-segment bypass, double-slash bypass, traversal, legacy path, canonical path, logical alias, unrelated path) — 9 cases - `memory.rs` requires_approval branches (enabled + protected, disabled + protected, physical path, dot bypass, traversal bypass, unprotected, missing target) — 7 cases - `store_adapter.rs` normalize_path, is_orchestrator_code_path, synthesize_orchestrator_doc_from_py, validate_orchestrator_content, is_protected_orchestrator_doc, is_globally_shared — 23 cases - `internal_write.rs` trusted-write scope semantics — 3 cases - `list_shared_memory_docs` override surfaces global vs project-scoped docs correctly — 2 cases Integration tests (11 new, libsql feature): - `dispatch.rs::integration_tests` — drives `ToolDispatcher::dispatch()` against the real `MemoryWriteTool` for all bypass paths (protected alias, physical path, dot segment, double slash, traversal, unprotected baseline) — 6 cases - `store_adapter.rs::migration_tests::orchestrator_py_round_trips_through_restart` — full write → restart → load → cross-project query cycle - `store_adapter.rs::migration_tests::knowledge_md_doc_round_trips_project_id_and_user_id` — asserts frontmatter `project_id`/`user_id` survive restart (was previously dropped, making docs invisible to project queries) - `store_adapter.rs::migration_tests::invalid_python_orchestrator_is_rejected_at_write_time` — validator gate fires before persistence - `effect_adapter.rs::tests::memory_write_orchestrator_target_paused_for_approval_when_self_modify_enabled` — the UnlessAutoApproved regression test - `effect_adapter.rs::tests::memory_write_orchestrator_target_refused_when_self_modify_disabled` — asserts no gate pauses when self-modify is off ## Quality gate - `cargo fmt` — clean - `cargo clippy -p ironclaw --lib --tests --features libsql` — 0 warnings - `cargo clippy -p ironclaw_engine --all-targets` — 0 warnings (crate-local) - `cargo test -p ironclaw_engine` — 358/358 pass - `cargo test -p ironclaw --lib --features libsql` — 4735/4735 pass - `cargo test --test engine_v2_gate_integration --test engine_v2_skill_codeact` — 27/27 pass Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(engine): round-3 review — syntax validation in memory_write, Always-approval gate, global doc visibility Critical: MemoryWriteTool::execute() now validates Python syntax for protected .py paths before writing to workspace (was bypassing the Store-level validator entirely). High: ApprovalRequirement::Always now produces GatePaused(Approval) with allow_always=false instead of hard LeaseDenied; memory_write returns Always (not UnlessAutoApproved) for protected paths so session auto-approve cannot silently skip the gate. High: Global docs (orchestrator, failures, prompt overlay) have project_id normalized to nil on save so they surface immediately from any project query, not just after restart. Medium: Traversal paths no longer trigger spurious approval gates — requires_approval returns Never for normalization failures so execute() rejects them immediately as InvalidParameters. Medium: Dead code removed from is_orchestrator_code_path (equality checks unreachable after .py suffix requirement). Medium: Document prompt overlay validation skip and syntax validation threat model; expanded validate_python_syntax test coverage (size cap, empty input, unicode, error format). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): update deny.toml — remove stale wasmtime advisories, add rand 0.8.5 Wasmtime was upgraded and no longer triggers RUSTSEC-2025-0046, RUSTSEC-2025-0118, RUSTSEC-2026-0020, RUSTSEC-2026-0021. The stale ignores caused cargo-deny to fail with "advisory was not encountered". Added RUSTSEC-2026-0097 (rand 0.8.5 unsound aliased mutable ref in ThreadRng during reseed from custom logger) — transitive dep via monty/wasmtime; upgrade to rand 0.9+ tracked separately. [skip-regression-check] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(engine): path-boundary check in requires_approval, fix stale docstring Address two Copilot review comments: - requires_approval used raw starts_with on normalized path, matching unrelated paths like orchestrator_backup/. Added path-boundary checks. - Updated stale docstring on the Always-approval gate test to reflect current behavior (Always→GatePaused, not UnlessAutoApproved). [skip-regression-check] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(engine): deduplicate requires_approval gate, move syntax validation after param checks - requires_approval now delegates to is_protected_orchestrator_path instead of duplicating the matching logic (reviewer concern about drift between the two checks) - Syntax validation for protected .py paths moved after patch-mode parameter validation (empty old_string, missing new_string) so an empty-string replace can't create a huge intermediate string before being rejected [skip-regression-check] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(engine): address PR #1958 round-4 review findings 5 issues from serrrfirat review (2026-04-13): 1. High — clamp `always` in `resolve_gate` against `pending.resume_kind` so a caller-supplied `always: true` can no longer install a session- wide auto-approval on an `Approval { allow_always: false }` gate (orchestrator self-modify writes). Extracted to `clamp_always_to_ resume_kind` with unit tests covering all three ResumeKind variants. 2. Medium — expand `validate_python_syntax` rustdoc to document that `MontyRun::new()` parses and prepares only (no heap/namespaces, no module-level execution) and explain the 256 KB size cap as a bound on parser allocation, not an execution-time safeguard. 3. Medium — remove the `doc.title == ORCHESTRATOR_FAILURES_TITLE` shortcut from the store-adapter self-modify gate. The two legitimate callers (`record_orchestrator_failure`, `reset_orchestrator_failures`) now wrap their `save_memory_doc` in `with_trusted_internal_writes`. Additionally reject untrusted writes to the failures title regardless of self-modify state, since no LLM-reachable code path should ever persist the system-internal tracker. 4. Medium — add a parity test asserting `normalize_path` (store adapter) and `normalize_workspace_path` (memory tool) agree on a canonical input set. Shared extraction isn't clean across the bridge/tool boundary; the test is the lighter guard against drift on this security boundary. 5. Medium — tighten `MemoryWriteTool::requires_approval` commentary with an explicit cross-reference to the `execute()` rejection site (~line 444) and a load-bearing invariant warning so a future refactor that weakens `execute()` will also be forced to flip this branch to `Always`. Also collapse a clippy::collapsible_if that appeared in the traversal gate check. [skip-regression-check] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Auto-promotion from staging CI
Batch range:
a53eac5c2dec6b6cd5c08189086093fde64aa9cb..e65ba2e4d9f46149207ec8ceb09f25b31d07f8bdPromotion branch:
staging-promote/e65ba2e4-24575255629Base:
staging-promote/79ad2e38-24572736232Triggered by: Staging CI batch at 2026-04-17 16:19 UTC
Commits in this batch (72):
ironclaw profile listsubcommand (feat(cli): addironclaw profile listsubcommand #2288)reasoning_contentfields in chat completions response (fix: duplicatereasoning_contentfields in chat completions response #2493)Current commits in this promotion (1)
Current base:
staging-promote/79ad2e38-24572736232Current head:
staging-promote/e65ba2e4-24575255629Current range:
origin/staging-promote/79ad2e38-24572736232..origin/staging-promote/e65ba2e4-24575255629Auto-updated by staging promotion metadata workflow
Waiting for gates:
Auto-created by staging-ci workflow