chore: promote staging to staging-promote/e2eb340c-22999151534 (2026-03-13 03:36 UTC) by ironclaw-ci[bot] · Pull Request #1096 · nearai/ironclaw

ironclaw-ci · 2026-03-13T03:36:52Z

Auto-promotion from staging CI

Batch range: e2eb340c049c02e860c53e94e9631c5cf3f397ed..3c619b627297d042d52fd87c915d31284e7df907
Promotion branch: staging-promote/3c619b62-23035039465
Base: staging-promote/e2eb340c-22999151534
Triggered by: Staging CI batch at 2026-03-13 03:36 UTC

Commits in this batch (35):

5a62cea refactor: extract safety module into ironclaw_safety crate (#1024)
c937dfa fix(registry): use versioned artifact URLs and checksums for all WASM manifests (#1007)
ef34943 fix: release lock guards before awaiting channel send (#869) (#1003)
c26f116 fix(deploy): harden production container and bootstrap security (#1014)
0b81342 Fix UTF-8 unsafe truncation in WASM emit_message (#1015)
8a26cfa fix(mcp): open MCP OAuth in same browser as gateway (#951)
4faf81a fix(mcp): include OAuth state parameter in authorization URLs (#1049)
f776d96 fix: remove all inline event handlers for CSP script-src compliance (#1063)
863702a feat: add MiniMax as a built-in LLM provider (#940)
d420abf fix(memory): reject absolute filesystem paths with corrective routing (#934)
006c15e style(agent): remove unnecessary Worker re-export (#923)
6bbf87b feat(routines): enable tool access in lightweight routine execution (#257) (#730)
8df51c0 feat: enhance HTTP tool parameter parsing (#911)
c94ecf1 feat: add Slack approval buttons for tool execution in DMs (#796)
0b122cb feat(web-chat): add hover copy button for user/assistant messages (#948)
c592c50 discord: mentions + signature verification in WASM channel (#335)
fd574b2 Expose the shared agent session manager via AppComponents (#532)
5dfa666 feat: adds context-llm tool support (#616)
bcda73c feat(cli): add cron subcommand for managing scheduled routines (#1017)
8ac24e7 style: fix formatting in cli/mod.rs and mcp/auth.rs (#1071)
e1691a8 feat: configurable hybrid search fusion strategy (#234)
d5828b2 feat(tools): add reusable sensitive JSON redaction helper (#457)
442a42d fix(web): recompute cron next_fire_at when re-enabling routines (#1080)
7a9cbb3 fix(routines): run cron checks immediately on ticker startup (#1066)
e522d33 fix(web): make approval requests appear without page reload (#996) (#1073)
6f00490 fix: relax approval requirements for low-risk tools (#922)
d8bcfe1 fix(service): set CLI_ENABLED=false in macOS launchd plist (#1079)
1ba6a83 fix(http): fail closed when webhook secret is missing at runtime (#1075)
c54f739 fix: resolve bug_bash UX/logging issues (#1054 #1055 #1058) (#1072)
c7dec64 feat(ci): include commit history in staging promotion PRs (#952)
8a60fa2 fix: add tool_info schema discovery for WASM tools (#1086)
9fbdd42 fix(extensions): fix lifecycle bugs + comprehensive E2E tests (#1070)
cd1245a fix(ci): repair staging-ci workflow parsing (#1090)
15c5d3e fix(wasm): address #1086 review followups -- description hint and coercion safety (#1092)
3c619b6 fix(ci): repair staging promotion workflow behavior (#1091)

Current commits in this promotion (31)

Current base: staging-promote/e2eb340c-22999151534
Current head: staging-promote/3c619b62-23035039465
Current range: origin/staging-promote/e2eb340c-22999151534..origin/staging-promote/3c619b62-23035039465

863702a feat: add MiniMax as a built-in LLM provider (#940)
d420abf fix(memory): reject absolute filesystem paths with corrective routing (#934)
006c15e style(agent): remove unnecessary Worker re-export (#923)
6bbf87b feat(routines): enable tool access in lightweight routine execution (#257) (#730)
8df51c0 feat: enhance HTTP tool parameter parsing (#911)
c94ecf1 feat: add Slack approval buttons for tool execution in DMs (#796)
0b122cb feat(web-chat): add hover copy button for user/assistant messages (#948)
c592c50 discord: mentions + signature verification in WASM channel (#335)
fd574b2 Expose the shared agent session manager via AppComponents (#532)
5dfa666 feat: adds context-llm tool support (#616)
bcda73c feat(cli): add cron subcommand for managing scheduled routines (#1017)
8ac24e7 style: fix formatting in cli/mod.rs and mcp/auth.rs (#1071)
e1691a8 feat: configurable hybrid search fusion strategy (#234)
d5828b2 feat(tools): add reusable sensitive JSON redaction helper (#457)
442a42d fix(web): recompute cron next_fire_at when re-enabling routines (#1080)
7a9cbb3 fix(routines): run cron checks immediately on ticker startup (#1066)
e522d33 fix(web): make approval requests appear without page reload (#996) (#1073)
6f00490 fix: relax approval requirements for low-risk tools (#922)
d8bcfe1 fix(service): set CLI_ENABLED=false in macOS launchd plist (#1079)
1ba6a83 fix(http): fail closed when webhook secret is missing at runtime (#1075)
c54f739 fix: resolve bug_bash UX/logging issues (#1054 #1055 #1058) (#1072)
c7dec64 feat(ci): include commit history in staging promotion PRs (#952)
8a60fa2 fix: add tool_info schema discovery for WASM tools (#1086)
9fbdd42 fix(extensions): fix lifecycle bugs + comprehensive E2E tests (#1070)
cd1245a fix(ci): repair staging-ci workflow parsing (#1090)
15c5d3e fix(wasm): address #1086 review followups -- description hint and coercion safety (#1092)
3c619b6 fix(ci): repair staging promotion workflow behavior (#1091)
a89cf37 fix(registry): bump telegram channel version for capabilities change (#1064)
c47237b fix(ci): add missing attachments field and crates/ dir to Dockerfiles (#1100)
5e77585 chore: periodic sync main into staging (resolved conflicts) (#1098)
1e00b1f fix(ci): checkout promotion PR head for metadata refresh (#1097)

Auto-updated by staging promotion metadata workflow

Waiting for gates:

Tests: pending
E2E: pending
Claude Code review: pending (will post comments on this PR)

Auto-created by staging-ci workflow

Add MiniMax to the provider registry with OpenAI-compatible protocol.
Available models:
- MiniMax-M2.5 (default) - 204,800 token context window
- MiniMax-M2.5-highspeed - same performance, faster inference
Configuration:
  LLM_BACKEND=minimax
  MINIMAX_API_KEY=<your-key>
Supports both global (api.minimax.io) and China mainland (api.minimaxi.com) endpoints via MINIMAX_BASE_URL env var.
Co-authored-by: PR Bot <pr-bot@minimaxi.com>

…#934)
* ci(staging): use default branch instead of hardcoded main
* fix(memory): route absolute paths to filesystem tools

…257) (#730)
* Rebase onto staging
* fix(routines): prevent autonomy-escalation in lightweight routines
  - Add ROUTINE_TOOL_DENYLIST to block routine_create/update/delete/fire and restart from being callable by lightweight routines
  - Deduplicate sentinel logic by reusing handle_text_response() in the no-tools path
  - Filter tool definitions sent to LLM to only include callable tools, avoiding wasted tokens on tools that would be rejected
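
The denylist filtering described above could look roughly like the sketch below. Only `ROUTINE_TOOL_DENYLIST` is named in the commit message; the expanded tool names (`routine_update`, `routine_delete`, `routine_fire`) and the helpers `is_callable` and `callable_tools` are hypothetical and exist only for illustration.

```rust
/// Assumed contents: the commit message abbreviates these as
/// "routine_create/update/delete/fire and restart".
const ROUTINE_TOOL_DENYLIST: &[&str] = &[
    "routine_create",
    "routine_update",
    "routine_delete",
    "routine_fire",
    "restart",
];

/// Hypothetical helper: true when a lightweight routine may call `name`.
fn is_callable(name: &str) -> bool {
    !ROUTINE_TOOL_DENYLIST.contains(&name)
}

/// Hypothetical helper: keeps only the tool names that would not be rejected,
/// so denied tools are never advertised to the LLM.
fn callable_tools<'a>(all: &[&'a str]) -> Vec<&'a str> {
    all.iter().copied().filter(|name| is_callable(name)).collect()
}
```

Filtering before the request is built means the model never sees a tool it cannot call, which is the token saving the commit mentions.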

* feat: enhance HTTP tool parameter parsing
  - Add support for stringified JSON arrays in headers parameter.
  - Introduce timeout_secs parameter parsing to accept both numbers and string representations.
  - Implement save_to parameter parsing to handle empty strings as None.
  - Update HTTP request handling to incorporate timeout and save_to parameters.
  - Add unit tests for new parsing functions to ensure correct behavior.
* feat(http): enhance HTTP tool with timeout and header parsing improvements
  - Introduced default and maximum request timeout constants to manage resource usage.
  - Refactored header parsing logic to separate functions for better readability and maintainability.
  - Updated timeout handling to ensure it respects the maximum allowed value.
  - Added unit tests to validate new header parsing functionality.
* refactor(http): replace hardcoded timeout with effective_timeout variable in HTTP tool error handling
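
A minimal sketch of the lenient `timeout_secs` and `save_to` parsing described above. The constant values, the `Param` stand-in for a JSON value, and both function names are assumptions; the PR text does not show the real ones.

```rust
/// Hypothetical limits; the actual constants are not shown in the PR text.
const DEFAULT_TIMEOUT_SECS: u64 = 30;
const MAX_TIMEOUT_SECS: u64 = 300;

/// Minimal stand-in for a JSON parameter that may arrive as a number or a string.
enum Param {
    Number(u64),
    Text(String),
}

/// Accepts a number or its string representation, falls back to the default
/// when absent, and clamps the result to the maximum allowed value.
fn parse_timeout_secs(param: Option<&Param>) -> Result<u64, String> {
    let requested = match param {
        None => DEFAULT_TIMEOUT_SECS,
        Some(Param::Number(n)) => *n,
        Some(Param::Text(s)) => s
            .trim()
            .parse::<u64>()
            .map_err(|e| format!("invalid timeout_secs {s:?}: {e}"))?,
    };
    Ok(requested.min(MAX_TIMEOUT_SECS))
}

/// Treats an empty string the same as a missing value.
fn parse_save_to(raw: Option<&str>) -> Option<String> {
    raw.filter(|s| !s.is_empty()).map(str::to_owned)
}
```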

* feat: add channel-relay integration for Slack via external relay service
  - Add RelayChannel and RelayClient for connecting to channel-relay SSE streams
  - Add RelayConfig with env-based configuration (CHANNEL_RELAY_URL, CHANNEL_RELAY_API_KEY)
  - Add channel-relay extension lifecycle: install, OAuth auth, activate with hot-add
  - Add proxy message sending through channel-relay for Slack chat.postMessage
  - Add extension registry entry for Slack relay with OAuth auth hint
  - Add relay integration test with mock SSE server
  - Wire relay channel into app startup with reconnect on stored credentials
  - Add AuthRequired extension error variant for cleaner auth flow detection
  [skip-regression-check]
* chore: apply cargo fmt
* fix: remove remaining Telegram test references in relay channel
* fix: address PR #790 review feedback -- parser handle leak, CSRF, circuit breaker
  - Fix parser handle leak on reconnect by sharing Arc<RwLock> instead of creating a local copy in start() (shutdown now aborts the correct task)
  - Add CSRF state nonce to OAuth flow: generate in auth_channel_relay, validate in slack_relay_oauth_callback_handler, one-time use
  - Remove dead proxy_slack method, update integration test to use proxy_provider
  - Add reconnect circuit breaker (max_consecutive_failures, default 50)
  - Fix stale docs (Telegram refs), extract event_types constants
* fix: double backoff in reconnect loop and UTF-8 chunk-boundary corruption
  - Remove second sleep+backoff in list_connections error branch to prevent O(4^n) backoff growth (was sleeping and doubling twice per iteration)
  - Buffer raw bytes in SSE parser instead of per-chunk String::from_utf8_lossy to prevent U+FFFD corruption when multi-byte chars span chunk boundaries
* feat: add Slack approval buttons for tool execution in DMs
  Send Block Kit Approve/Deny buttons via relay when a tool requires approval in a DM context. Auto-deny approval-requiring tools in shared channels to prevent prompt injection and stuck threads.
* fix: address PR #796 review -- use PreflightOutcome::Rejected, add tests
  - Auto-deny in non-DM relay channels now uses PreflightOutcome::Rejected instead of manually pushing to reason_ctx.messages, so the post-flight handler properly records the error in the turn
  - Add regression tests for relay auto-deny decision logic
  - Remove test_clean.db artifact
* feat: restore Block Kit approval buttons in send_status
  The send_status implementation was accidentally dropped during the staging merge. Restores Approve/Deny Block Kit buttons for DM tool approval, with required sender_id validation, payload size docs, and 4 regression tests. Also removes test_clean.db.
* fix: apply rustfmt formatting to dispatcher test code
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
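
The chunk-boundary fix above (buffering raw bytes instead of decoding each chunk lossily) can be illustrated with the standard library alone. This is a sketch of the general technique, not the relay's SSE parser; `Utf8ChunkBuffer` is a hypothetical name, and the sketch deliberately does not handle bytes that are invalid rather than merely incomplete.

```rust
/// Accumulates raw bytes and releases only text that is complete UTF-8.
#[derive(Default)]
struct Utf8ChunkBuffer {
    pending: Vec<u8>,
}

impl Utf8ChunkBuffer {
    /// Appends a chunk and returns the longest valid UTF-8 prefix buffered so far.
    /// A trailing partial multi-byte sequence stays buffered until the next chunk.
    /// Caveat: a genuinely invalid byte would stay buffered forever in this sketch.
    fn push(&mut self, chunk: &[u8]) -> String {
        self.pending.extend_from_slice(chunk);
        let valid_len = match std::str::from_utf8(&self.pending) {
            Ok(text) => text.len(),
            Err(err) => err.valid_up_to(),
        };
        // The prefix is known to be valid, so the lossy conversion inserts no U+FFFD here.
        let text = String::from_utf8_lossy(&self.pending[..valid_len]).into_owned();
        self.pending.drain(..valid_len);
        text
    }
}
```

Decoding each chunk independently would turn the split halves of a multi-byte character into replacement characters; holding the tail back until the rest arrives avoids that.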

* ci(staging): use default branch instead of hardcoded main
* feat(web-chat): add hover copy button for message bubbles
* fix(web-chat): address Gemini review for copy state and streaming safety
* chore(pr): drop unrelated staging workflow change from #948

* discord: address PR feedback on polling, auth, and tests
* discord: add signature verification dependencies on latest main
* test(discord): expand coverage for helper and signature edge cases
---------
Co-authored-by: firat.sertgoz <f@nuff.tech>

* Expose agent session manager via AppComponents
* Polish AppComponents session manager naming

* feat: adds context-llm tool support
  Introduces a new tool for the LLM Context endpoint of the Brave Search API: https://api-dashboard.search.brave.com/documentation/services/llm-context.
* minor refactoring
* Update registry/tools/llm-context.json
* Update tools-src/llm-context/llm-context-tool.capabilities.json
* Update tools-src/llm-context/src/lib.rs
* chore: address feedback from review
* address feedback
* address feedback
* fix: remove snippet-counting fn
---------
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* feat(cli): add cron subcommand for managing scheduled routines
  Rebase onto staging branch and address collaborator review:
  - Fix .unwrap_or(None) → proper error propagation in set_enabled()
  - Add --yes/-y flag for non-interactive deletion with confirmation prompt
  - Add --json flag for machine-readable output in list and history
  - Preserve error context chain with {e:#} in run_cron_cli()
  Note: GATEWAY_USER_ID is trusted from the environment; future work may add authentication for multi-tenant deployments.
* fix(cli): reject invalid cron timezones
* refactor(cli): rename cron subcommand to routines
  The system manages all routine types (cron, webhook, event, manual), not just cron schedules. Rename the CLI subcommand to reflect this:
  - `ironclaw cron` -> `ironclaw routines` (with `cron` as hidden alias)
  - List shows all routines by default, add --trigger filter
  - Remove cron-trigger-only validation
  - Simplify require_routine helper (no trigger type check)
---------
Co-authored-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* style: fix formatting in cli/mod.rs and mcp/auth.rs
* fix(cli): add missing use_tools and max_tool_rounds fields to routines create
  The routines CLI create command was missing the new Lightweight fields added after the cron->routines rename merged.
* fix(clippy): move path_routing_tests after production code in memory.rs
  Fixes items_after_test_module lint by moving the test module to the end of the file, after all production structs and impls.
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat: configurable hybrid search fusion strategy (#169)
  Add WeightedScore fusion as an alternative to the default RRF algorithm. Users can now tune search behavior via env vars (SEARCH_FUSION_STRATEGY, SEARCH_FTS_WEIGHT, SEARCH_VECTOR_WEIGHT, SEARCH_RRF_K) or by passing SearchConfig with the new fields. Default behavior (RRF, k=60) is unchanged.
  - Add FusionStrategy enum (Rrf/WeightedScore) to workspace::search
  - Add weighted_score_fusion() and fuse_results() dispatcher
  - Add config/search.rs with WorkspaceSearchConfig from env vars
  - Wire search defaults through Workspace struct
  - Update both postgres and libsql backends to use fuse_results()
  - Add 7 new tests (4 fusion + 3 config)
* fix: swap default search weights to match issue #169 spec (0.7 vector / 0.3 FTS)
  The issue spec says "0.7/0.3 (vector/keyword) for weighted mode" but our defaults had fts_weight=0.7, vector_weight=0.3 (inverted). Also fixes the misleading docstring on weighted_score_fusion that claimed 1/rank normalizes to [0,1].
* fix: validate weight inputs and update stale doc comments
  - Reject NaN, infinite, and negative values for SEARCH_FTS_WEIGHT and SEARCH_VECTOR_WEIGHT with a clear ConfigError
  - Fix module-level docs that incorrectly claimed WeightedScore "normalizes per-method scores to [0,1]"
  - Update SearchResult.score doc from "Combined RRF score" to strategy-agnostic "Combined fusion score"
* fix: validate weight setters against NaN/inf/negative values
  with_fts_weight() and with_vector_weight() now silently ignore non-finite (NaN, ±inf) and negative values, matching the env var validation already in place for SEARCH_FTS_WEIGHT / SEARCH_VECTOR_WEIGHT. Values > 1.0 remain valid since weights are normalized internally.
* fix: use crate-wide ENV_MUTEX in search config tests
  Replace the module-local `ENV_MUTEX` in `search.rs` with a shared `crate::config::helpers::ENV_MUTEX` to prevent cross-module env races when `cargo test` runs tests in parallel. Addresses copilot review comment. Tracked in #245.
* fix: per-strategy weight defaults to match issue #169 spec
  RRF mode now defaults to 0.5/0.5 (fts/vector) and WeightedScore defaults to 0.3/0.7, matching the acceptance criteria in #169. Previously both modes used 0.3/0.7 uniformly.
* fix: reject both weights=0 in weighted fusion mode
  When both SEARCH_FTS_WEIGHT and SEARCH_VECTOR_WEIGHT are 0.0 under WeightedScore strategy, all scores would be 0.0, producing arbitrary ordering. RRF mode is unaffected since it ignores weights entirely. Addresses Copilot review comment. The other comment (rrf_k=0 division by zero) is a false positive -- ranks are 1-based, so k=0 just gives inverse-rank scoring with no infinity.
* fix: clarify weight doc comments and error key
  - SearchConfig field docs: clarify that Default always uses 0.5, per-strategy defaults only apply via WorkspaceSearchConfig::resolve()
  - WorkspaceSearchConfig field docs: same clarification
  - Error key for both-weights-zero now references both env vars
* fix: remove broken intra-doc links to pub(crate) resolve()
  WorkspaceSearchConfig::resolve is pub(crate), so linking to it from public field docs triggers rustdoc private_intra_doc_links warnings. Switch to plain-text references.
* fix: add document_path to weighted_score_fusion results
  The weighted_score_fusion function was missing the document_path field added in a recent main branch commit, causing a compile error after rebase.
* chore: trigger CI re-check after rebase
* fix: resolve pre-existing staging fmt and clippy issues
  - Fix import ordering in cli/mod.rs (cargo fmt)
  - Fix line wrapping in tools/mcp/auth.rs (cargo fmt)
  - Move path_routing_tests before MemoryTreeTool to fix clippy::items_after_test_module
  [skip-regression-check]
* fix: remove duplicate path_routing_tests module after rebase [skip-regression-check]
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
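
To make the two fusion strategies concrete, here is a small self-contained sketch. The RRF formula (1 / (k + rank), 1-based ranks) is the standard one and matches the k=0 remark above. The weighted form (normalized weight times 1 / rank) is an assumption inferred from the docstring discussion, and `rrf_fuse`, `weighted_fuse`, and `ranked` are hypothetical names, not the crate's `fuse_results()` or `weighted_score_fusion()`.

```rust
use std::collections::HashMap;

/// Sorts fused scores from highest to lowest, breaking ties by document id.
fn ranked(scores: HashMap<String, f64>) -> Vec<(String, f64)> {
    let mut out: Vec<(String, f64)> = scores.into_iter().collect();
    out.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    out
}

/// Reciprocal Rank Fusion: every list adds 1 / (k + rank), with rank starting at 1.
/// Weights play no role here.
fn rrf_fuse(fts: &[&str], vector: &[&str], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in [fts, vector] {
        for (i, id) in list.iter().enumerate() {
            let rank = (i + 1) as f64;
            *scores.entry(id.to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    ranked(scores)
}

/// Weighted fusion (assumed form): normalized weight times 1 / rank per list.
/// Returns None for negative, non-finite, or all-zero weights.
fn weighted_fuse(
    fts: &[&str],
    vector: &[&str],
    fts_w: f64,
    vec_w: f64,
) -> Option<Vec<(String, f64)>> {
    let total = fts_w + vec_w;
    if !total.is_finite() || fts_w < 0.0 || vec_w < 0.0 || total == 0.0 {
        return None;
    }
    let mut scores: HashMap<String, f64> = HashMap::new();
    for (list, weight) in [(fts, fts_w / total), (vector, vec_w / total)] {
        for (i, id) in list.iter().enumerate() {
            let rank = (i + 1) as f64;
            *scores.entry(id.to_string()).or_insert(0.0) += weight / rank;
        }
    }
    Some(ranked(scores))
}
```

With FTS results `[a, b]` and vector results `[b, c]`, RRF at k=60 favors `b` (present in both lists) and then `a`, while 0.3/0.7 weighting ranks `c` above `a` because the vector list counts for more.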

* feat(tools): add reusable sensitive JSON redaction helper
* fix(tools): harden sensitive-key tokenization and context matching

* fix(routines): run cron check immediately at ticker startup
* test/ci: add routine_engine test and fix style lint drift

…1073)
* fix(web): show approval requests in realtime without reload
* Update src/channels/web/static/app.js
---------
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* fix: relax approval requirements for low-risk tools
  Remove unnecessary UnlessAutoApproved friction from list_dir, image_gen, image_analyze, image_edit, tool_install, tool_auth, tool_upgrade, and build_tool. These operate on trusted inputs or are low-risk operations so they now use the trait default (Never). For the http tool, GET requests without credentials now return Never instead of UnlessAutoApproved, while credential-bearing requests and non-GET methods retain their existing approval levels. [skip-regression-check]
* style: apply cargo fmt
* fix: address review feedback on approval changes
  Rename test_requires_approval_returns_unless_auto_approved to test_requires_approval_returns_never to match the asserted behavior. In http requires_approval(), treat missing method as unknown (falls through to UnlessAutoApproved) instead of defaulting to GET, since the schema requires method. Updated comment to reflect this.
* fix: make http method optional, default to GET
  Make method optional in schema (only url is required) and default to GET in both execute() and requires_approval(). This aligns approval logic with execution and reduces friction for simple GET requests.
* fix: restore UnlessAutoApproved for build_tool, tool_install, tool_upgrade
  Address review feedback: these tools modify the system's trust boundary (shell execution, WASM installation, version mutation) and should retain approval gating. tool_auth kept as Never per owner decision.
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
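
The final http approval rule (method optional, defaulting to GET; only credential-free GETs skip approval) can be sketched as follows. The enum here is a simplified stand-in with just the two levels the commit message names, and `http_requires_approval` with its `has_credentials` flag is a hypothetical signature. Collapsing every other case into `UnlessAutoApproved` is a simplification, since the message says those cases keep their existing levels without listing them.

```rust
/// Simplified stand-in: only the two levels named in the commit message.
#[derive(Debug, PartialEq)]
enum ApprovalRequirement {
    Never,
    UnlessAutoApproved,
}

/// Hypothetical signature. A missing method defaults to GET, mirroring execute().
fn http_requires_approval(method: Option<&str>, has_credentials: bool) -> ApprovalRequirement {
    let method = method.unwrap_or("GET");
    if method.eq_ignore_ascii_case("GET") && !has_credentials {
        ApprovalRequirement::Never
    } else {
        // Simplification: the real code keeps whatever level these cases already had.
        ApprovalRequirement::UnlessAutoApproved
    }
}
```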

* fix(service): set CLI_ENABLED=false in macOS launchd plist
* Update src/service.rs
---------
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* fix(web,db): improve auth UX + reduce naive timestamp log noise
* fix(clippy): keep memory test modules at end of file

* feat(ci): include commit history in staging promotion PRs and merge commits
  Promotion PRs from staging->main previously had opaque bodies showing only the batch SHA range. Now they enumerate all non-merge commits in each batch as a flat markdown list, visible both in the PR body and embedded in the merge commit message via --subject/--body flags.
* fix(ci): use unique delimiter for commit_summary output
  Replace hardcoded COMMIT_SUMMARY_DELIM with a uuidgen-based delimiter to prevent theoretical collisions with commit message content.
* fix(ci): use heredoc for PR body to avoid GFM code-block rendering
  The inline --body string had 10 leading spaces per line (from YAML indentation), which GitHub-flavored Markdown renders as a code block. Move the body into a heredoc variable so content starts at column 0.
* fix(ci): truncate commit list at 50 and include PR number in merge subject
  - Cap commit enumeration at 50 entries with a truncation note to avoid blowing past GitHub PR body/merge message limits on large batches.
  - Prefix merge commit subject with #PR_NUMBER for traceability in git log.
* fix(ci): address review -- shell expansion, body-file, uuidgen
  1. Replace heredoc with string concatenation to prevent shell expansion of commit messages containing $, backticks, or backslashes
  2. Use --body-file for merge commit body for robustness
  3. Replace uuidgen with date +%s for portability
  Addresses: #952 (review)
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add tool_info schema discovery for WASM tools
* refactor: simplify WASM schema and hint state
* refactor: store tool_info registry reference as Weak

* feat(extensions): unify auth and configure into single entrypoint
  Refactors the extension lifecycle to eliminate the divergence between chat and gateway paths that caused Telegram setup via chat to fail (missing webhook secret auto-generation, no token validation).
  Key changes:
  - Rename save_setup_secrets() → configure(): single entrypoint for providing secrets to any extension (WasmChannel, WasmTool, MCP). Validates, stores, auto-generates, and activates.
  - Add configure_token(): convenience wrapper for single-token callers (chat auth card, WebSocket, agent auth mode).
  - Refactor auth() to pure status check: remove token parameter, delete token-storing branches from auth_mcp/auth_wasm_tool, rename auth_wasm_channel → auth_wasm_channel_status.
  - Add ConfigureResult/MissingSecret types for structured responses.
  - Replace hardcoded Telegram token validation with generic validation_endpoint from capabilities.json.
  - Update all callers (9 files) to use the new interface.
* fix: use ValidationFailed error variant instead of string matching
  Replace brittle msg.contains("Invalid token") checks with a proper ExtensionError::ValidationFailed variant. configure() now returns this variant for token validation failures, and callers match on it directly instead of parsing error message strings.
* fix: address review -- SSRF protection, error typing, missing-secret selection, WS auth
  1. SSRF: call validate_fetch_url() before validation_endpoint HTTP request
  2. Transport errors map to ExtensionError::Other (not ValidationFailed)
  3. configure_token() picks first *missing* secret, not first non-optional
  4. WebSocket error path re-emits AuthRequired on ValidationFailed
* test: add regression tests for extension lifecycle refactoring
  - test_configure_token_picks_first_missing_secret: verifies multi-secret channels can be configured one secret at a time (commit ce106f4)
  - test_auth_is_read_only_for_wasm_channel: verifies auth() has no side effects and doesn't store secrets (commit 47f8eb6)
  - test_validation_failed_is_distinct_error_variant: verifies the typed error variant can be pattern-matched (commit a318161)
* fix: address review comments -- activation dispatch, dead code, caps consolidation
  - Fix configure() fallthrough bug: dispatch activation by ExtensionKind instead of unconditionally calling activate_wasm_channel() for all non-WasmTool types (MCP servers and channel relays now use their correct activation methods)
  - Remove dead MissingSecret struct and missing_secrets field (never populated, flagged by reviewer)
  - Consolidate capabilities file parsing in configure(): parse once and reuse for allowed names, validation_endpoint, and auto-generation
  - Fix auth() doc comment: note MCP OAuth side effects
  - Fix stale save_setup_secrets reference in server.rs comment
  - Add regression test for activation dispatch bug
* fix(extensions): fix 5 extension lifecycle bugs found during E2E testing
  Bug fixes in src/extensions/manager.rs:
  - Add auth guard to activate_wasm_tool() blocking activation when secrets are missing (NeedsSetup), matching activate_wasm_channel() behavior
  - Evict WasmToolRuntime module cache on remove() so reinstall uses fresh binary
  - Clear activation_errors on remove() for both WasmTool and WasmChannel
  - Clean up in-progress OAuth flows on remove() (abort TCP listener, purge pending flow entries)
  Bug fix in src/channels/web/server.rs:
  - Broadcast AuthCompleted SSE event on expired OAuth callback so web UI doesn't stay stuck showing "auth required"
  E2E test coverage:
  - test_wasm_lifecycle.py: 35 tests covering install/configure/activate/remove/reinstall lifecycle with regression tests for bugs 1 and 3
  - test_extension_oauth.py: 9 tests covering OAuth round-trip flow
  - test_tool_execution.py: 5 tests for tool invocation via chat
  - test_pairing.py: 4 tests for pairing request lifecycle
  - Enhanced conftest.py, helpers.py, mock_llm.py for OAuth mock support
  [skip-regression-check]
* fix(web): unify extension auth UX and add lifecycle regressions
* test: fix pending oauth flow fixtures after rebase
* test(e2e): fix playwright route ordering for extensions reloads
* test: address e2e review follow-ups
* test: address remaining PR review comments
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
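
The move from string matching to a typed error variant can be illustrated with a reduced enum. The variant names `ValidationFailed` and `Other` come from the commit messages above; their `String` payloads and the `needs_reauth` helper are assumptions made for this sketch.

```rust
/// Reduced stand-in for the extension error type; payload shapes are assumed.
#[derive(Debug)]
enum ExtensionError {
    /// The token was rejected by the validation endpoint.
    ValidationFailed(String),
    /// Transport and other failures, which must not be treated as a bad token.
    Other(String),
}

/// Hypothetical caller-side check: decide on the variant, not on message text.
fn needs_reauth(err: &ExtensionError) -> bool {
    matches!(err, ExtensionError::ValidationFailed(_))
}
```

A transport error whose message happens to contain "Invalid token" is classified correctly here, which a `msg.contains("Invalid token")` check would get wrong.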

…rcion safety (#1092)

Two fixes from the review of #1086 (tool_info schema discovery):

1. Replace fragile description string mutation (append_schema_hint_if_permissive / strip_schema_hint) with composition at display time. The raw description stays clean; the tool_info hint is composed in the Tool::schema() override only when the advertised schema is permissive. This also includes the tool name and `include_schema: true` in the hint for better LLM guidance.
2. Make effective_for_coercion use the load-time extracted schema from PreparedModule instead of re-calling the WASM schema() export on the already-running instance mid-execution. This avoids potential state contamination from calling schema() after linear memory is initialized for execution.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
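A minimal sketch of the first fix in #1092, composition at display time rather than mutation of the stored description. The `ToolEntry` struct, its field names, and the hint wording are hypothetical; only the idea (raw description untouched, hint added only for permissive schemas, hint mentions the tool name and `include_schema: true`) comes from the commit message.

```rust
/// Hypothetical tool record: the stored description is never mutated.
struct ToolEntry {
    name: String,
    description: String,
    /// True when the advertised schema accepts arbitrary parameters.
    schema_is_permissive: bool,
}

impl ToolEntry {
    /// Compose the text shown to the LLM on demand. Because nothing is
    /// appended to `description` itself, there is nothing to strip later.
    fn display_description(&self) -> String {
        if self.schema_is_permissive {
            // Hint wording is invented for this sketch.
            format!(
                "{} (Call tool_info with name \"{}\" and include_schema: true for the full schema.)",
                self.description, self.name
            )
        } else {
            self.description.clone()
        }
    }
}
```

Since the hint is derived from the current state each time, a schema that later becomes strict simply stops producing the hint, with no stale suffix left in the stored text.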

* fix(ci): repair staging-ci workflow parsing
* fix(ci): chain staging promotion to latest open branch
* feat(ci): carry staging batch summaries into release PRs
* test(ci): add dry-run dispatch for promotion metadata workflows
* fix(ci): fetch only release tags for batch summaries
* fix(ci): address review feedback on batch summaries
* fix(ci): harden metadata workflows and dedupe body helpers
* fix(ci): pass repo explicitly to gh pr list

…1064) The validation_endpoint addition to telegram.capabilities.json requires a version bump to pass the CI version-check gate on staging promotion. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

…#1100)

The Discord channel's poll_channel_mentions emit_message call was missing the required `attachments: vec![]` field, causing a WASM compilation failure. Both Dockerfiles were also missing `COPY crates/ crates/`, which is needed for the extracted ironclaw_safety crate.

[skip-regression-check]

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* chore: promote staging to main (2026-03-10 15:19 UTC) (#865)

* fix: Channel HTTP: server doesn't start after config change (no hot-r… (#779)

* fix: Channel HTTP: server doesn't start after config change (no hot-reload)
* review fixes
* review fixes
* fix linter
* fix code style

* fix: prevent session lock contention blocking message processing (#783)

* fix: prevent session lock contention blocking message processing

## Problem

After container restart, POST /api/chat/send returns 202 ACCEPTED but messages don't appear in conversation_messages and agent never responds. Messages get stuck in "stale state" after restart.

Root cause: Session lock was held for entire duration of chat_threads_handler and chat_history_handler, including during slow database queries. This blocked the agent loop from acquiring the session lock to process incoming messages, causing them to hang indefinitely.

## Solution

1. **Release session lock early in chat_threads_handler**: Only acquire lock when reading active_thread at response time, not during DB queries for thread list. DB operations no longer block message processing.
2. **Release session lock early in chat_history_handler**: Only acquire lock when accessing in-memory thread state, not during paginated DB queries or thread ownership checks. DB operations no longer block message processing.
3. **Add comprehensive logging**: Track message flow from receipt through session resolution, thread hydration, and state transitions. Helps diagnose future issues:
   - Message queued to agent loop (chat_send_handler)
   - Processing message from channel (handle_message)
   - Hydrating thread from DB (maybe_hydrate_thread)
   - Resolving session and thread (resolve_thread)
   - Checking thread state (process_user_input)
   - Persisting user message (persist_user_message)

## Impact

- Message processing no longer blocks on session lock contention
- API response times for thread list/history queries unaffected (DB queries still happen, but lock is not held)
- Better diagnostics for future debugging

## Testing

- All 2756 tests pass
- Code compiles with zero clippy warnings
- No changes to user-facing API or behavior, only lock timing

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* security: redact PII from info-level logs

Downgrade user_id and channel logging to debug level to prevent exposing Personally Identifiable Information (PII) in production logs.

The user_id field can contain sensitive information such as phone numbers (e.g., for Signal messages). Logging PII in cleartext at the info level creates a security and privacy risk, as these logs may be stored in persistent storage, indexed by log management systems, or accessible to unauthorized personnel.

Changes:

- Info level: logs only message_id (UUID) for tracking
- Debug level: logs user_id, channel, thread_id for troubleshooting

This maintains debugging capability for developers while protecting user privacy in production logs.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>

* chore: sync main into staging (#855)

* fix(ci): secrets can't be used in step if conditions [skip-regression-check] (#787)

GitHub Actions step-level `if:` doesn't have access to `secrets` context. Replace `if: secrets.X != ''` with `continue-on-error: true` and let the Set token step handle the fallback.
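The early lock release from the session lock contention fix (#783) above can be sketched as follows. This is an illustration under stated assumptions, not the handler code: `Session`, `load_threads_from_db`, and `chat_threads` are hypothetical names, and a synchronous `std::sync::Mutex` is used only to keep the example self-contained.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical in-memory session state.
struct Session {
    active_thread: Option<String>,
}

/// Stand-in for a slow database query (hypothetical).
fn load_threads_from_db() -> Vec<String> {
    vec!["t1".to_string(), "t2".to_string()]
}

/// Run the slow query without holding the session lock, then take the lock
/// just long enough to copy the one in-memory field the response needs.
fn chat_threads(session: &Arc<Mutex<Session>>) -> (Vec<String>, Option<String>) {
    // Slow work first, with no lock held.
    let threads = load_threads_from_db();

    // The guard lives only inside this block and is dropped at its end.
    let active = {
        let guard = session.lock().unwrap();
        guard.active_thread.clone()
    };

    (threads, active)
}
```

Scoping the guard in its own block makes the critical section visible in the code and keeps it from silently growing when someone adds another query to the handler.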
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(ci): clean up staging pipeline — remove hacks, skip redundant checks [skip-regression-check] (#794) - Remove continue-on-error from staging-ci.yml app token steps (secrets are configured) - Skip test.yml and code_style.yml on PRs targeting staging (staging-ci.yml already runs tests before promoting, promotion PR gets full CI on main) - Allow ironclaw-ci[bot] in Claude Code review for bot-created promotion PRs Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix(ci): run fmt + clippy on staging PRs, skip Windows clippy [skip-regression-check] (#802) - Remove branches:[main] filter from code_style.yml so it runs on all PRs - Gate clippy-windows with `if: github.base_ref == 'main'` (skip on staging PRs) - Update rollup job to allow skipped clippy-windows - Simplify claude-review.yml to only trigger on labeled event (avoids duplicate runs) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * feat: persist user_id in save_job and expose job_id on routine runs (#709) * feat: persist worker events to DB and fix activity tab rendering In-process Worker (used by Scheduler::dispatch_job) now persists events via save_job_event at key execution points: plan creation, LLM responses, tool_use, tool_result, and job completion/failure/stuck. Event data shapes match the container worker format so the gateway activity tab renders them correctly. Frontend: tool_result errors now show a red X icon with danger styling instead of a silent empty output. The result event falls back to the error field when message is absent. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: wire RoutineEngine into gateway for direct manual trigger firing Replace the message-channel hack in routines_trigger_handler with a direct call to RoutineEngine::fire_manual(), ensuring FullJob routines dispatch correctly when triggered from the web UI. Inject the engine into GatewayState from Agent::run after construction. 
Also persists user_id in save_job for both PG and libSQL backends, removes the source='sandbox' filter so all jobs are visible, and exposes job_id on RoutineRunInfo for the frontend job link. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: remove stale gateway_state argument from Agent::new test call sites The gateway_state parameter was removed from Agent::new during rebase (replaced by post-construction set_routine_engine_slot), but three test call sites still passed the extra None argument. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address PR review — restore sandbox source filter, remove blank lines - Revert removal of `source = 'sandbox'` filter in all SandboxStore queries (8 sites across PG and libSQL). Sandbox-specific APIs should stay scoped to sandbox jobs; unified job listing for the Jobs tab should use a separate query path. - Remove extra blank lines in agent_loop.rs and worker.rs that caused formatting CI failure. [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address review — regenerate Cargo.lock, add user_id regression test - Regenerate Cargo.lock from main's lockfile to eliminate dependency version downgrades (anyhow, syn, etc.) that were churn from rebase. - Add regression test verifying user_id round-trips through save_job and get_job in the libSQL backend. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: remove trailing blank line in libsql jobs.rs [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: add Postgres-side regression test for user_id persistence in save_job Mirrors the existing libSQL test (test_save_job_persists_user_id) for the Postgres backend. Gated behind #[cfg(feature = "postgres")] + #[ignore] since it requires a running PostgreSQL instance (integration tier). 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * feat(llm): per-provider unsupported parameter filtering (#749, #728) (#809) Add declarative `unsupported_params` field to provider definitions in providers.json. Parameters listed are stripped from requests before sending, preventing 400 errors from providers that reject them (e.g. gpt-5 family and kimi-k2.5 rejecting custom temperature values). - Add `unsupported_params` to ProviderDefinition and RegistryProviderConfig - Propagate from registry through config resolution - Generic strip helpers handle temperature, max_tokens, stop_sequences - Apply filtering in RigAdapter and AnthropicOAuthProvider - Mark openai and tinfoil providers as unsupporting temperature - Update openai default model to gpt-5-mini Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Illia Polosukhin <ilblackdragon@gmail.com> * fix: Chat input is hidden in mobile browser mode (#877) * fix: stop XML-escaping tool output content (#598) (#874) * fix(ci): secrets can't be used in step if conditions [skip-regression-check] (#787) GitHub Actions step-level `if:` doesn't have access to `secrets` context. Replace `if: secrets.X != ''` with `continue-on-error: true` and let the Set token step handle the fallback. 
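The per-provider `unsupported_params` filtering from #809 above could look roughly like the sketch below. The `CompletionRequest` struct and the `strip_unsupported` function are hypothetical; only the three parameter names (temperature, max_tokens, stop_sequences) come from the commit message.

```rust
/// Hypothetical request shape with the three parameters named in #809.
#[derive(Debug, Clone, PartialEq)]
struct CompletionRequest {
    temperature: Option<f32>,
    max_tokens: Option<u32>,
    stop_sequences: Option<Vec<String>>,
}

/// Clear every parameter the provider declares as unsupported, so it is
/// omitted from the outgoing request instead of triggering a 400.
fn strip_unsupported(req: &mut CompletionRequest, unsupported: &[&str]) {
    for param in unsupported {
        match *param {
            "temperature" => req.temperature = None,
            "max_tokens" => req.max_tokens = None,
            "stop_sequences" => req.stop_sequences = None,
            // Unknown names are ignored in this sketch.
            _ => {}
        }
    }
}
```

Keeping the list declarative means a new provider quirk is a data change in the provider definition rather than a new code path.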
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(ci): clean up staging pipeline — remove hacks, skip redundant checks [skip-regression-check] (#794) - Remove continue-on-error from staging-ci.yml app token steps (secrets are configured) - Skip test.yml and code_style.yml on PRs targeting staging (staging-ci.yml already runs tests before promoting, promotion PR gets full CI on main) - Allow ironclaw-ci[bot] in Claude Code review for bot-created promotion PRs Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix(ci): run fmt + clippy on staging PRs, skip Windows clippy [skip-regression-check] (#802) - Remove branches:[main] filter from code_style.yml so it runs on all PRs - Gate clippy-windows with `if: github.base_ref == 'main'` (skip on staging PRs) - Update rollup job to allow skipped clippy-windows - Simplify claude-review.yml to only trigger on labeled event (avoids duplicate runs) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * feat: persist user_id in save_job and expose job_id on routine runs (#709) * feat: persist worker events to DB and fix activity tab rendering In-process Worker (used by Scheduler::dispatch_job) now persists events via save_job_event at key execution points: plan creation, LLM responses, tool_use, tool_result, and job completion/failure/stuck. Event data shapes match the container worker format so the gateway activity tab renders them correctly. Frontend: tool_result errors now show a red X icon with danger styling instead of a silent empty output. The result event falls back to the error field when message is absent. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: wire RoutineEngine into gateway for direct manual trigger firing Replace the message-channel hack in routines_trigger_handler with a direct call to RoutineEngine::fire_manual(), ensuring FullJob routines dispatch correctly when triggered from the web UI. Inject the engine into GatewayState from Agent::run after construction. 
Also persists user_id in save_job for both PG and libSQL backends, removes the source='sandbox' filter so all jobs are visible, and exposes job_id on RoutineRunInfo for the frontend job link. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: remove stale gateway_state argument from Agent::new test call sites The gateway_state parameter was removed from Agent::new during rebase (replaced by post-construction set_routine_engine_slot), but three test call sites still passed the extra None argument. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address PR review — restore sandbox source filter, remove blank lines - Revert removal of `source = 'sandbox'` filter in all SandboxStore queries (8 sites across PG and libSQL). Sandbox-specific APIs should stay scoped to sandbox jobs; unified job listing for the Jobs tab should use a separate query path. - Remove extra blank lines in agent_loop.rs and worker.rs that caused formatting CI failure. [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address review — regenerate Cargo.lock, add user_id regression test - Regenerate Cargo.lock from main's lockfile to eliminate dependency version downgrades (anyhow, syn, etc.) that were churn from rebase. - Add regression test verifying user_id round-trips through save_job and get_job in the libSQL backend. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: remove trailing blank line in libsql jobs.rs [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: add Postgres-side regression test for user_id persistence in save_job Mirrors the existing libSQL test (test_save_job_persists_user_id) for the Postgres backend. Gated behind #[cfg(feature = "postgres")] + #[ignore] since it requires a running PostgreSQL instance (integration tier). 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * feat(llm): per-provider unsupported parameter filtering (#749, #728) (#809) Add declarative `unsupported_params` field to provider definitions in providers.json. Parameters listed are stripped from requests before sending, preventing 400 errors from providers that reject them (e.g. gpt-5 family and kimi-k2.5 rejecting custom temperature values). - Add `unsupported_params` to ProviderDefinition and RegistryProviderConfig - Propagate from registry through config resolution - Generic strip helpers handle temperature, max_tokens, stop_sequences - Apply filtering in RigAdapter and AnthropicOAuthProvider - Mark openai and tinfoil providers as unsupporting temperature - Update openai default model to gpt-5-mini Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix: stop XML-escaping tool output content in wrap_for_llm (#598) Remove content escaping that corrupted JSON in tool output. The <tool_output> structural boundary is preserved but content now passes through raw, fixing downstream parse failures. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Henry Park <henrypark133@gmail.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(safety): allow empty string tool params (#848) * fix(safety): allow empty string tool params * fix(safety): preserve heuristic checks and add path context to tool validation This follow-up refactor addresses PR review feedback by restoring heuristic checks (whitespace ratio, character repetition) for tool parameter validation and improving error reporting. Changes: - Restored heuristic warnings in validate_non_empty_input so they apply to both user input and tool parameters (when non-empty). - Refactored check_strings to recursively build and pass JSON paths (e.g., "metadata.tags[1]"). 
- Updated validation errors to use the specific JSON path as the field name instead of the generic "input". - Added regression tests for whitespace/repetition warnings and JSON path reporting in tool parameters. This ensures the safety layer remains semantically neutral about empty strings (fixing the memory_tree path: "" issue) while maintaining rigorous protection and providing better developer ergonomics. * style: run cargo fmt * perf: optimize release and dist build profiles (#843) * perf: optimize release and dist build profiles Add [profile.release] with strip=true and panic="abort" for smaller, faster release binaries. Upgrade [profile.dist] from lto="thin" to lto="fat" with codegen-units=1 for maximum optimization in CI releases. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: remove panic=abort from release profile Reviewers (zmanian, Copilot, Gemini) correctly flagged that panic=abort in the release profile would kill the entire process on any tokio task panic, breaking fault isolation for the long-running server. Removed from release profile entirely. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * feat: add PR template with risk assessment (#837) * feat: add PR template with risk assessment and review tracks Add a pull request template that includes summary, change type, validation checklist, security/database impact sections, blast radius, and rollback plan. Update CONTRIBUTING.md with review track definitions (A/B/C) based on change risk level. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: expand CONTRIBUTING.md with setup, workflow, and guidelines Add getting started, development workflow, code style summary, database change guidance, and dependency management sections. 
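The JSON path reporting added to check_strings in #848 above (e.g., "metadata.tags[1]") can be sketched with a small recursive walk. The `Value` enum is a hypothetical stand-in for a JSON type so the example needs no external crate, and `collect_string_paths` is an illustrative name, not the function in the safety crate.

```rust
/// Minimal stand-in for a JSON value (hypothetical; not serde_json).
enum Value {
    Str(String),
    Array(Vec<Value>),
    Object(Vec<(String, Value)>),
}

/// Walk the value and record each string together with the path that leads
/// to it, so a validation error can name the exact field.
fn collect_string_paths(value: &Value, path: &str, out: &mut Vec<(String, String)>) {
    match value {
        Value::Str(s) => out.push((path.to_string(), s.clone())),
        Value::Array(items) => {
            for (i, item) in items.iter().enumerate() {
                // Array elements extend the path with an index suffix.
                collect_string_paths(item, &format!("{}[{}]", path, i), out);
            }
        }
        Value::Object(fields) => {
            for (key, child) in fields {
                // Object keys are joined with a dot, except at the root.
                let child_path = if path.is_empty() {
                    key.clone()
                } else {
                    format!("{}.{}", path, key)
                };
                collect_string_paths(child, &child_path, out);
            }
        }
    }
}
```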
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * feat: add fuzzing targets for untrusted input parsers (#835) * feat: add fuzzing targets for untrusted input parsers Add cargo-fuzz infrastructure with 5 fuzz targets exercising security-critical code paths: - fuzz_safety_sanitizer: Aho-Corasick + regex injection detection - fuzz_safety_validator: Input validation (length, encoding, patterns) - fuzz_leak_detector: Secret leak scanning (API keys, tokens) - fuzz_tool_params: Tool parameter JSON validation - fuzz_config_env: TOML/JSON config parsing Each target exercises real IronClaw business logic with invariant assertions. Includes corpus directories and setup documentation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: improve fuzz targets to exercise real IronClaw code paths - fuzz_config_env: exercise SafetyLayer end-to-end (sanitize, validate, policy check) instead of generic TOML/JSON parsing - fuzz_tool_params: add validate_tool_schema coverage alongside validate_tool_params - Add "fuzz" to workspace exclude in root Cargo.toml - Update README descriptions to match actual target behavior [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: replace redundant detect() call with meaningful invariant assertion Replace the double sanitize()+detect() call with an assertion that critical severity warnings always trigger content modification. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: rewrite fuzz_config_env to exercise IronClaw safety code directly Replace SafetyLayer wrapper usage with direct Sanitizer, Validator, and LeakDetector instantiation and invocation. Adds meaningful consistency assertions (non-empty output, valid-means-no-errors, scan/clean agreement). Removes the config construction that was only exercising struct instantiation. 
[skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix(wasm): run leak scan before credential injection in tools wrapper (#791) * fix(wasm): run leak scan before credential injection in tools wrapper The tools WASM wrapper runs the LeakDetector on HTTP request headers AFTER inject_host_credentials() has already substituted real secrets (e.g., xoxb- Slack bot tokens). This causes the leak detector to flag the tool's own legitimate outbound API calls as secret exfiltration. Move the scan to run on raw_headers before any credential injection, matching the fix already applied to the channels wrapper in #421. Fixes the same class of bug as #421 (which only fixed channels/wasm/wrapper.rs). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * perf: inline leak scan to avoid Vec allocation on every HTTP request Address review feedback: instead of cloning all header keys/values into a Vec to pass to scan_http_request(), iterate over raw_headers directly using scan_and_clean(). This also provides more specific error messages (URL vs header vs body). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: fix cargo fmt formatting in leak scan loop Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix(setup): drain residual terminal events before secret input (#747) (#849) * fix(ci): secrets can't be used in step if conditions [skip-regression-check] (#787) GitHub Actions step-level `if:` doesn't have access to `secrets` context. Replace `if: secrets.X != ''` with `continue-on-error: true` and let the Set token step handle the fallback. 
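The ordering fix from #791 above (scan raw headers first, inject credentials afterwards) can be sketched as below. `looks_like_secret` is a deliberately crude stand-in for the LeakDetector, and `prepare_headers` is a hypothetical name; the real wrapper and its scan_and_clean() call are not reproduced.

```rust
/// Crude stand-in for a leak detector: flags Slack-style bot tokens only.
fn looks_like_secret(value: &str) -> bool {
    value.contains("xoxb-")
}

/// Scan the headers the tool supplied, then add the host-owned credential.
/// Scanning after injection would flag the host's own legitimate token.
fn prepare_headers(
    raw_headers: &[(String, String)],
    credential: (&str, &str),
) -> Result<Vec<(String, String)>, String> {
    // 1. Inspect only what the untrusted tool provided.
    for (name, value) in raw_headers {
        if looks_like_secret(value) {
            return Err(format!("possible secret in header {}", name));
        }
    }

    // 2. Inject the host credential after the scan has passed.
    let mut headers = raw_headers.to_vec();
    headers.push((credential.0.to_string(), credential.1.to_string()));
    Ok(headers)
}
```

With this ordering a tool that tries to smuggle a token out in its own headers is still rejected, while the injected credential never passes through the detector.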
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(ci): clean up staging pipeline — remove hacks, skip redundant checks [skip-regression-check] (#794) - Remove continue-on-error from staging-ci.yml app token steps (secrets are configured) - Skip test.yml and code_style.yml on PRs targeting staging (staging-ci.yml already runs tests before promoting, promotion PR gets full CI on main) - Allow ironclaw-ci[bot] in Claude Code review for bot-created promotion PRs Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix(ci): run fmt + clippy on staging PRs, skip Windows clippy [skip-regression-check] (#802) - Remove branches:[main] filter from code_style.yml so it runs on all PRs - Gate clippy-windows with `if: github.base_ref == 'main'` (skip on staging PRs) - Update rollup job to allow skipped clippy-windows - Simplify claude-review.yml to only trigger on labeled event (avoids duplicate runs) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * feat: persist user_id in save_job and expose job_id on routine runs (#709) * feat: persist worker events to DB and fix activity tab rendering In-process Worker (used by Scheduler::dispatch_job) now persists events via save_job_event at key execution points: plan creation, LLM responses, tool_use, tool_result, and job completion/failure/stuck. Event data shapes match the container worker format so the gateway activity tab renders them correctly. Frontend: tool_result errors now show a red X icon with danger styling instead of a silent empty output. The result event falls back to the error field when message is absent. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: wire RoutineEngine into gateway for direct manual trigger firing Replace the message-channel hack in routines_trigger_handler with a direct call to RoutineEngine::fire_manual(), ensuring FullJob routines dispatch correctly when triggered from the web UI. Inject the engine into GatewayState from Agent::run after construction. 
Also persists user_id in save_job for both PG and libSQL backends, removes the source='sandbox' filter so all jobs are visible, and exposes job_id on RoutineRunInfo for the frontend job link. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: remove stale gateway_state argument from Agent::new test call sites The gateway_state parameter was removed from Agent::new during rebase (replaced by post-construction set_routine_engine_slot), but three test call sites still passed the extra None argument. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address PR review — restore sandbox source filter, remove blank lines - Revert removal of `source = 'sandbox'` filter in all SandboxStore queries (8 sites across PG and libSQL). Sandbox-specific APIs should stay scoped to sandbox jobs; unified job listing for the Jobs tab should use a separate query path. - Remove extra blank lines in agent_loop.rs and worker.rs that caused formatting CI failure. [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address review — regenerate Cargo.lock, add user_id regression test - Regenerate Cargo.lock from main's lockfile to eliminate dependency version downgrades (anyhow, syn, etc.) that were churn from rebase. - Add regression test verifying user_id round-trips through save_job and get_job in the libSQL backend. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: remove trailing blank line in libsql jobs.rs [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: add Postgres-side regression test for user_id persistence in save_job Mirrors the existing libSQL test (test_save_job_persists_user_id) for the Postgres backend. Gated behind #[cfg(feature = "postgres")] + #[ignore] since it requires a running PostgreSQL instance (integration tier). 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * feat(llm): per-provider unsupported parameter filtering (#749, #728) (#809) Add declarative `unsupported_params` field to provider definitions in providers.json. Parameters listed are stripped from requests before sending, preventing 400 errors from providers that reject them (e.g. gpt-5 family and kimi-k2.5 rejecting custom temperature values). - Add `unsupported_params` to ProviderDefinition and RegistryProviderConfig - Propagate from registry through config resolution - Generic strip helpers handle temperature, max_tokens, stop_sequences - Apply filtering in RigAdapter and AnthropicOAuthProvider - Mark openai and tinfoil providers as unsupporting temperature - Update openai default model to gpt-5-mini Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix: skip the regression check [skip-regression-check] --------- Co-authored-by: Henry Park <henrypark133@gmail.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Illia Polosukhin <ilblackdragon@gmail.com> * feat(agent): add context size logging before LLM prompt (#810) * fix(ci): secrets can't be used in step if conditions [skip-regression-check] (#787) GitHub Actions step-level `if:` doesn't have access to `secrets` context. Replace `if: secrets.X != ''` with `continue-on-error: true` and let the Set token step handle the fallback. 
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(ci): clean up staging pipeline — remove hacks, skip redundant checks [skip-regression-check] (#794) - Remove continue-on-error from staging-ci.yml app token steps (secrets are configured) - Skip test.yml and code_style.yml on PRs targeting staging (staging-ci.yml already runs tests before promoting, promotion PR gets full CI on main) - Allow ironclaw-ci[bot] in Claude Code review for bot-created promotion PRs Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix(ci): run fmt + clippy on staging PRs, skip Windows clippy [skip-regression-check] (#802) - Remove branches:[main] filter from code_style.yml so it runs on all PRs - Gate clippy-windows with `if: github.base_ref == 'main'` (skip on staging PRs) - Update rollup job to allow skipped clippy-windows - Simplify claude-review.yml to only trigger on labeled event (avoids duplicate runs) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * feat: persist user_id in save_job and expose job_id on routine runs (#709) * feat: persist worker events to DB and fix activity tab rendering In-process Worker (used by Scheduler::dispatch_job) now persists events via save_job_event at key execution points: plan creation, LLM responses, tool_use, tool_result, and job completion/failure/stuck. Event data shapes match the container worker format so the gateway activity tab renders them correctly. Frontend: tool_result errors now show a red X icon with danger styling instead of a silent empty output. The result event falls back to the error field when message is absent. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: wire RoutineEngine into gateway for direct manual trigger firing Replace the message-channel hack in routines_trigger_handler with a direct call to RoutineEngine::fire_manual(), ensuring FullJob routines dispatch correctly when triggered from the web UI. Inject the engine into GatewayState from Agent::run after construction. 
Also persists user_id in save_job for both PG and libSQL backends, removes the source='sandbox' filter so all jobs are visible, and exposes job_id on RoutineRunInfo for the frontend job link. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: remove stale gateway_state argument from Agent::new test call sites The gateway_state parameter was removed from Agent::new during rebase (replaced by post-construction set_routine_engine_slot), but three test call sites still passed the extra None argument. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address PR review — restore sandbox source filter, remove blank lines - Revert removal of `source = 'sandbox'` filter in all SandboxStore queries (8 sites across PG and libSQL). Sandbox-specific APIs should stay scoped to sandbox jobs; unified job listing for the Jobs tab should use a separate query path. - Remove extra blank lines in agent_loop.rs and worker.rs that caused formatting CI failure. [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address review — regenerate Cargo.lock, add user_id regression test - Regenerate Cargo.lock from main's lockfile to eliminate dependency version downgrades (anyhow, syn, etc.) that were churn from rebase. - Add regression test verifying user_id round-trips through save_job and get_job in the libSQL backend. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: remove trailing blank line in libsql jobs.rs [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: add Postgres-side regression test for user_id persistence in save_job Mirrors the existing libSQL test (test_save_job_persists_user_id) for the Postgres backend. Gated behind #[cfg(feature = "postgres")] + #[ignore] since it requires a running PostgreSQL instance (integration tier). 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * feat(agent): add context size logging before LLM prompt --------- Co-authored-by: Henry Park <henrypark133@gmail.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Illia Polosukhin <ilblackdragon@gmail.com> * fix: preserve text before tool-call XML in forced-text responses (#852) * fix: preserve text before tool-call XML in forced-text responses (#789) Local models (Qwen3, DeepSeek, GLM) emit <tool_call> XML even when no tools are available (force_text mode). The existing strip_xml_tag() discards everything from an unclosed opening tag onward, producing an empty string that triggers the "I'm not sure how to respond" fallback. Add truncate_at_tool_tags() — a code-region-aware pre-processing step that truncates at the first tool-call XML tag BEFORE clean_response() runs, preserving all useful text before the tag. Protect all 7 clean_response() call sites. Case-insensitive matching handles models that emit <TOOL_CALL> or <Tool_Call> variants. Secondary fix: add has_native_thinking() model detection to skip <think>/<final> system prompt injection for models with built-in reasoning (Qwen3, QwQ, DeepSeek-R1, GLM-Z1, etc.), preventing thinking-only responses that clean to empty. Wire with_model_name(active_model_name()) at all 9 production sites that construct Reasoning, so the runtime model name (not static config) drives system prompt generation. 126 new/updated tests covering truncation edge cases, code-block awareness, Unicode, case-insensitivity, StubLlm integration for complete/plan/evaluate_success/respond_with_tools paths, model detection, and conditional system prompt generation. 
Closes #789 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address Copilot review — unclosed-only truncation, ASCII case folding - truncate_at_tool_tags() now only truncates at UNCLOSED tool tags; properly closed tags (e.g. <tool_call>...</tool_call>) are left intact for clean_response() to strip normally, preserving any text after them - Switch from to_lowercase() to to_ascii_lowercase() to prevent byte offset misalignment with non-ASCII characters whose lowercase form has different byte length (e.g. Kelvin sign U+212A) - Add closing_tag_for() helper to derive closing tags from open patterns - Fix doc comment: "fenced markdown code blocks or inline code spans" (not "indented", which find_code_regions() doesn't detect) - Add regression tests: closed vs unclosed for each tag variant, Unicode + case-insensitive offset safety, and mixed closed/unclosed Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: minor review items — consistent ascii_lowercase, closing_tag_for tests - Switch has_native_thinking() from to_lowercase() to to_ascii_lowercase() for consistency with truncate_at_tool_tags() approach - Add unit tests for closing_tag_for(): standard tags, space-suffixed patterns, pipe-delimited tags, and exhaustive coverage of all TOOL_TAG_PATTERNS entries - Add test for mixed closed+unclosed tags of different types Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * Feat/docker shell edition (#804) * fix(ci): secrets can't be used in step if conditions [skip-regression-check] (#787) GitHub Actions step-level `if:` doesn't have access to `secrets` context. Replace `if: secrets.X != ''` with `continue-on-error: true` and let the Set token step handle the fallback. 
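Note on the unclosed-only truncation and ASCII case folding described in the Copilot review fix above: the following is a minimal sketch, not the code from the PR. The function name `truncate_at_unclosed_tool_call` is hypothetical, it handles only the single `<tool_call>` tag, and it omits the code-region awareness that the real `truncate_at_tool_tags()` is said to have. It relies on `to_ascii_lowercase()` changing only the bytes `A`-`Z`, so offsets found in the lowered copy are valid in the original string.

```rust
/// Hypothetical sketch: cut `text` at the first `<tool_call>` tag that has no
/// matching `</tool_call>` after it. Properly closed pairs are left intact so a
/// later cleanup pass can strip them. Matching is ASCII case-insensitive.
fn truncate_at_unclosed_tool_call(text: &str) -> &str {
    const OPEN: &str = "<tool_call>";
    const CLOSE: &str = "</tool_call>";

    // ASCII folding keeps every byte offset identical to the original string.
    // Full Unicode `to_lowercase()` can change byte lengths (the Kelvin sign
    // U+212A is 3 bytes, its lowercase `k` is 1 byte), which would misalign
    // offsets computed on the lowered copy.
    let lower = text.to_ascii_lowercase();

    let mut search_from = 0;
    while let Some(rel_open) = lower[search_from..].find(OPEN) {
        let open_at = search_from + rel_open;
        let after_open = open_at + OPEN.len();
        match lower[after_open..].find(CLOSE) {
            // Closed pair: skip past it and keep scanning.
            Some(rel_close) => search_from = after_open + rel_close + CLOSE.len(),
            // Unclosed tag: keep only the text before it.
            None => return text[..open_at].trim_end(),
        }
    }
    text
}
```

For example, `truncate_at_unclosed_tool_call("Hello <TOOL_CALL>{")` keeps `"Hello"`, while input containing only closed pairs is returned unchanged.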
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(ci): clean up staging pipeline — remove hacks, skip redundant checks [skip-regression-check] (#794) - Remove continue-on-error from staging-ci.yml app token steps (secrets are configured) - Skip test.yml and code_style.yml on PRs targeting staging (staging-ci.yml already runs tests before promoting, promotion PR gets full CI on main) - Allow ironclaw-ci[bot] in Claude Code review for bot-created promotion PRs Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Henry Park <henrypark133@gmail.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(mcp): strip top-level null params before forwarding to MCP servers (#795) * feat(llm): per-provider unsupported parameter filtering (#749, #728) (#809) Add declarative `unsupported_params` field to provider definitions in providers.json. Parameters listed are stripped from requests before sending, preventing 400 errors from providers that reject them (e.g. gpt-5 family and kimi-k2.5 rejecting custom temperature values). - Add `unsupported_params` to ProviderDefinition and RegistryProviderConfig - Propagate from registry through config resolution - Generic strip helpers handle temperature, max_tokens, stop_sequences - Apply filtering in RigAdapter and AnthropicOAuthProvider - Mark openai and tinfoil providers as unsupporting temperature - Update openai default model to gpt-5-mini Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix(mcp): strip top-level null params before forwarding to MCP servers LLMs frequently emit `"field": null` for optional parameters in tool calls. Many MCP servers reject explicit nulls for fields that should simply be absent — e.g. Notion returns 400 for `"sort": null` in a search call, expecting the field to be omitted entirely. Strip top-level null keys from the params object before calling `call_tool()`. 
Only top-level keys are stripped; nested nulls are preserved since they may be semantically meaningful. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Illia Polosukhin <ilblackdragon@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * Add event-triggered routines and workflow skill templates (#756) * Add event-triggered routines and workflow skill templates * fix(ci): secrets can't be used in step if conditions [skip-regression-check] (#787) GitHub Actions step-level `if:` doesn't have access to `secrets` context. Replace `if: secrets.X != ''` with `continue-on-error: true` and let the Set token step handle the fallback. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(ci): clean up staging pipeline — remove hacks, skip redundant checks [skip-regression-check] (#794) - Remove continue-on-error from staging-ci.yml app token steps (secrets are configured) - Skip test.yml and code_style.yml on PRs targeting staging (staging-ci.yml already runs tests before promoting, promotion PR gets full CI on main) - Allow ironclaw-ci[bot] in Claude Code review for bot-created promotion PRs Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix: address PR review feedback for event_emit security and quality Security fixes: - Require approval (UnlessAutoApproved) for event_emit, matching routine_fire - Enable sanitization on event_emit payload (external JSON reaches LLM) - Remove user_id parameter from event_emit to prevent IDOR — always use ctx.user_id Correctness fixes: - Rename source → event_source in event_emit for consistency with routine_create - Use json_value_as_filter_string for filter parsing (handles numbers/booleans) - Case-insensitive matching for event source and event_type - Add debug logging for missing filter keys in payload - Fix skill_install_routine_webhook_sim test missing .with_skills() - Fix schema_validator test for event_emit payload properties Code quality: - Move EventEmitTool 
struct/impl after RoutineHistoryTool (fix split layout) - Deduplicate routine_to_info into RoutineInfo::from_routine in types.rs - Add test section headers in e2e_routine_heartbeat.rs - Clarify event_emit description to specify system_event routines only Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(ci): run fmt + clippy on staging PRs, skip Windows clippy [skip-regression-check] (#802) - Remove branches:[main] filter from code_style.yml so it runs on all PRs - Gate clippy-windows with `if: github.base_ref == 'main'` (skip on staging PRs) - Update rollup job to allow skipped clippy-windows - Simplify claude-review.yml to only trigger on labeled event (avoids duplicate runs) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * feat: persist user_id in save_job and expose job_id on routine runs (#709) * feat: persist worker events to DB and fix activity tab rendering In-process Worker (used by Scheduler::dispatch_job) now persists events via save_job_event at key execution points: plan creation, LLM responses, tool_use, tool_result, and job completion/failure/stuck. Event data shapes match the container worker format so the gateway activity tab renders them correctly. Frontend: tool_result errors now show a red X icon with danger styling instead of a silent empty output. The result event falls back to the error field when message is absent. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: wire RoutineEngine into gateway for direct manual trigger firing Replace the message-channel hack in routines_trigger_handler with a direct call to RoutineEngine::fire_manual(), ensuring FullJob routines dispatch correctly when triggered from the web UI. Inject the engine into GatewayState from Agent::run after construction. Also persists user_id in save_job for both PG and libSQL backends, removes the source='sandbox' filter so all jobs are visible, and exposes job_id on RoutineRunInfo for the frontend job link. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: remove stale gateway_state argument from Agent::new test call sites The gateway_state parameter was removed from Agent::new during rebase (replaced by post-construction set_routine_engine_slot), but three test call sites still passed the extra None argument. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address PR review — restore sandbox source filter, remove blank lines - Revert removal of `source = 'sandbox'` filter in all SandboxStore queries (8 sites across PG and libSQL). Sandbox-specific APIs should stay scoped to sandbox jobs; unified job listing for the Jobs tab should use a separate query path. - Remove extra blank lines in agent_loop.rs and worker.rs that caused formatting CI failure. [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address review — regenerate Cargo.lock, add user_id regression test - Regenerate Cargo.lock from main's lockfile to eliminate dependency version downgrades (anyhow, syn, etc.) that were churn from rebase. - Add regression test verifying user_id round-trips through save_job and get_job in the libSQL backend. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: remove trailing blank line in libsql jobs.rs [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: add Postgres-side regression test for user_id persistence in save_job Mirrors the existing libSQL test (test_save_job_persists_user_id) for the Postgres backend. Gated behind #[cfg(feature = "postgres")] + #[ignore] since it requires a running PostgreSQL instance (integration tier). 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix: make routine_system_event_emit test create routine before emitting - Add routine_create step to trace fixture so event_emit has a matching routine to fire - Assert fired_routines > 0, not just key presence (Copilot review) - Add .with_auto_approve_tools(true) since event_emit now requires approval Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: renumber test headers after system_event test insertion Test 4 was duplicated (routine_cooldown and heartbeat_findings). Renumber heartbeat_findings to Test 5 and heartbeat_empty_skip to Test 6. [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: merge staging and add missing RoutineEngine args in test RoutineEngine::new on staging requires `tools` and `safety` params. Update system_event_trigger_matches_and_filters test to pass them. [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address new Copilot review comments - Add .with_auto_approve_tools(true) to skill_install_routine_webhook_sim test so event_emit doesn't block on approval - Fix module-level doc comment for event_emit to specify system_event trigger [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: deduplicate json_value_as_string helper Remove private `json_value_as_string` from routine_engine.rs and use the identical public `json_value_as_filter_string` from routine.rs, eliminating divergence risk. (Copilot review) [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Henry Park <henrypark133@gmail.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: enable WASM credential injection in No-DB environments (#845) * fix(wasm): enable credential injection in no-DB environments via env var fallback When a secrets store is unavailable (e.g. 
no-DB mode), WASM channel credentials were silently not injected, causing channels to start without credentials. Fix by: - Changing `inject_channel_credentials_from_secrets` to accept `Option<&dyn SecretsStore>` — secrets store is tried first when present - Adding env var fallback (`inject_env_credentials`) for credentials not covered by the secrets store - Enforcing a channel-name prefix security check on env var names to prevent WASM channels from reading unrelated host credentials (e.g. `AWS_SECRET_ACCESS_KEY`) - Extracting pure `resolve_env_credentials` helper for testability - Adding case-insensitive prefix matching for secrets store lookup Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(wasm): inject credentials at startup when no secrets store (setup.rs path) The startup path (setup_wasm_channels -> register_channel) was guarded by `if let Some(secrets) = secrets_store`, so in No-DB mode credentials were never injected and the channel started without them. Fix by: - Changing inject_channel_credentials to accept Option<&dyn SecretsStore> - Always calling it (removing the if-let guard) — env var fallback runs even when secrets_store is None - Adding channel-name prefix security check to the env var fallback path (e.g. TELEGRAM_ for channel "telegram"), consistent with manager.rs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(test): correct misleading comment on ICTEST1_UNRELATED_OTHER placeholder * fix(wasm): guard against empty channel name in credential injection An empty channel_name would produce prefix "_", allowing any env var starting with "_" to pass the security check and be injected. Add an early-return guard in resolve_env_credentials, inject_env_credentials, and inject_channel_credentials. Add a test to cover this path. 
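Note on the channel-name prefix check and the empty-name guard described above: a minimal sketch under stated assumptions. The function name `allowed_env_credentials` and the slice-of-pairs environment are hypothetical stand-ins for illustration; the PR's actual helpers (`resolve_env_credentials`, `inject_env_credentials`) may be shaped differently. The prefix form (`TELEGRAM_` for channel `telegram`) follows the commit text.

```rust
/// Hypothetical sketch: pick the environment variables a WASM channel may read.
/// Only names starting with `<CHANNEL_NAME>_` are allowed, so a channel named
/// `telegram` can see `TELEGRAM_BOT_TOKEN` but not `AWS_SECRET_ACCESS_KEY`.
fn allowed_env_credentials<'a>(
    channel_name: &str,
    env: &'a [(String, String)],
) -> Vec<(&'a str, &'a str)> {
    // Guard: an empty channel name would yield the prefix "_", which would
    // let any variable starting with "_" pass the check.
    if channel_name.is_empty() {
        return Vec::new();
    }

    let prefix = format!("{}_", channel_name.to_ascii_uppercase());
    env.iter()
        .filter(|(name, _)| name.starts_with(&prefix))
        .map(|(name, value)| (name.as_str(), value.as_str()))
        .collect()
}
```

Passing the environment in as a slice, rather than reading `std::env::vars()` inside the function, keeps the check pure and easy to test, which appears to be the motivation for extracting a pure helper in the commit.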
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: lizican123 <lizican123@gmail.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: promote to main (#878) * fix: replace unsafe env::set_var with thread-safe inject_single_var in SIGHUP handler Fixes race condition where SIGHUP handler modifies global environment variables while other threads may be reading them via Config::from_env(). Changes: - Replace unsafe { std::env::set_var() } with ironclaw::config::inject_single_var() - Uses INJECTED_VARS mutex instead of unsafe global state modification - All reads via optional_env() check the thread-safe overlay first - Prevents data races between SIGHUP reload and concurrent config reads Verification: - All 2,787 tests pass - Zero clippy warnings - Code compiles successfully Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com> * fix: spawn webhook restart as background task to avoid blocking I/O across lock Prevents holding Mutex lock during async I/O operations (TcpListener::bind, task shutdown). The SIGHUP handler no longer blocks webhook processing during listener restart. Changes: - Read old_addr and drop lock immediately - Spawn restart_with_addr() as background task via tokio::spawn - Lock is only held during the actual restart operation, not the signal handler Benefits: - SIGHUP handler returns immediately without blocking - Webhook requests not delayed by listener restart I/O - Lock contention significantly reduced Verification: - All 2,787 tests pass - Zero clippy warnings - Code compiles successfully Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com> * fix: add graceful shutdown mechanism for SIGHUP handler background task Prevents unbounded loop without cancellation token. The SIGHUP handler now listens for a shutdown signal and exits cleanly during graceful termination. Changes: - Create broadcast channel for shutdown signaling - SIGHUP handler uses tokio::select! 
to wait for shutdown or SIGHUP - Send shutdown signal to all background tasks after agent.run() completes - Ensures clean task lifecycle and no orphaned background tasks Benefits: - Proper task cancellation during graceful shutdown - Follows Tokio best practices for background task management - No background tasks orphaned when runtime shuts down Verification: - All 2,787 tests pass - Zero clippy warnings - Code compiles successfully Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com> * refactor: replace stringly-typed parameter filtering with typed enum and single helper Fixes DRY violation where unsupported parameter filtering was duplicated across rig_adapter.rs and anthropic_oauth.rs using string contains checks. Changes: - Add UnsupportedParam typed enum in provider.rs (Temperature, MaxTokens, StopSequences) - Create strip_unsupported_completion_params() helper function - Create strip_unsupported_tool_params() helper function - Update rig_adapter.rs to use shared helpers - Update anthropic_oauth.rs to use shared helpers - Replace 60+ lines of duplicate stringly-typed logic Benefits: - Type safety: parameter names checked at compile time - Single source of truth: adding a new param updates one place - Reduced maintenance burden: no duplicate logic to keep in sync - Better code clarity: named enum variant is self-documenting Verification: - All 2,787 tests pass - Zero clippy warnings - Code compiles successfully Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com> * docs: clarify intentional parameter asymmetry between completion and tool requests Add documentation explaining why strip_unsupported_tool_params does not handle StopSequences: the field doesn't exist in ToolCompletionRequest. 
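Note on the typed-enum refactor described above: a minimal sketch of the idea, assuming a simplified `CompletionRequest` with three optional fields. The enum variants and the helper name `strip_unsupported_completion_params` come from the commit text; the struct layout and field types are assumptions made for illustration.

```rust
/// Parameters a provider may reject (variant names taken from the commit text).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UnsupportedParam {
    Temperature,
    MaxTokens,
    StopSequences,
}

/// Simplified stand-in for the real request type (field layout is assumed).
#[derive(Debug, Default, PartialEq)]
struct CompletionRequest {
    temperature: Option<f32>,
    max_tokens: Option<u32>,
    stop_sequences: Option<Vec<String>>,
}

/// Clear every field the provider has declared unsupported. The exhaustive
/// `match` means adding a new enum variant fails to compile until it is
/// handled here, which is the single-source-of-truth benefit the commit cites.
fn strip_unsupported_completion_params(
    req: &mut CompletionRequest,
    unsupported: &[UnsupportedParam],
) {
    for param in unsupported {
        match param {
            UnsupportedParam::Temperature => req.temperature = None,
            UnsupportedParam::MaxTokens => req.max_tokens = None,
            UnsupportedParam::StopSequences => req.stop_sequences = None,
        }
    }
}
```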
Changes: - Add clarifying comments to strip_unsupported_tool_params() - Explain why StopSequences is only in CompletionRequest - Note that ToolCompletionRequest only supports Temperature and MaxTokens - Inline comment confirms no action needed for StopSequences This addresses the appearance of incomplete implementation without changing logic, as the asymmetry is intentional and correct (ToolCompletionRequest lacks the field). Verification: - All 2,787 tests pass - Zero clippy warnings - Code compiles successfully Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com> * perf: isolate webhook_secret to reduce lock contention on hot path Move webhook_secret from shared HttpChannelState RwLock into its own Arc<RwLock<>>. This eliminates contention between secret validation and other state operations. Changes: - Change webhook_secret field type from RwLock<Option<SecretString>> to Arc<RwLock<Option<SecretString>>> - Update initialization in HttpChannel::new() - Update comments to explain isolation rationale Benefits: - Reduce lock contention on webhook request hot path (secret validation) - Rarely-changing field (SIGHUP only) isolated from frequent state accesses - Other state operations (tx, pending_responses) no longer wait behind secret reads - Minimal code change: only field declaration and initialization The Arc wrapper allows cloning the RwLock handle to separate concerns. With this change, every webhook request acquires its own isolated lock for secret validation, not the shared HttpChannelState lock. This scales better under high request volume. Verification: - All 2,787 tests pass - Zero clippy warnings - Code compiles successfully Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com> * fix: prevent partial state corruption on SIGHUP restart failure Ensure atomicity of configuration reload: if webhook listener restart fails, secret update is skipped to prevent inconsistent state. 
Changes:
- Wait for restart_with_addr() to complete (don't spawn background task)
- Track restart result with restart_failed flag
- Only update secret if restart succeeded or wasn't needed
- Ensure listener and secret stay synchronized

Problem addressed:
- Before: restart spawned as background task, secret updated immediately
- If restart failed, secret was changed but listener still on old address
- This left system in inconsistent state (partial corruption)

Solution:
- Make restart blocking (SIGHUP handler can wait, it's not on request hot path)
- Atomically update secret only after successful restart
- Flag prevents race between restart and secret update

Benefits:
- Configuration changes are atomic (both succeed or both fail together)
- No partial state corruption on restart failure
- Failed restarts don't silently leave inconsistent state
- Secret and listener address stay in sync

Verification:
- All 2,787 tests pass
- Zero clippy warnings
- Code compiles successfully

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* refactor: generalize hot-secret-swapping with ChannelSecretUpdater trait

Decouple SIGHUP handler from HTTP channel internals by introducing a trait for channels that support zero-downtime secret updates.
Changes:
- Add ChannelSecretUpdater trait in channels/channel.rs
- Implement ChannelSecretUpdater for HttpChannelState
- Export trait from channels module
- Update SIGHUP handler to use trait-based secret updater collection
- Replace explicit HTTP channel knowledge with generic updater loop

Benefits:
- SIGHUP handler no longer depends on HttpChannelState details
- Tight coupling removed: main.rs doesn't need HTTP channel imports
- Extensible: new channels can opt-in by implementing the trait
- Scalable: multiple channels supported without main.rs changes
- Maintainable: adding channels requires only trait implementation, not SIGHUP handler edits

Pattern:
- ChannelSecretUpdater trait defines the interface for all updaters
- Channels that support hot-secret-swapping implement the trait
- SIGHUP handler loops through all registered updaters generically

Verification:
- All 2,787 tests pass
- Zero clippy warnings
- Code compiles successfully

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* feat: validate parameter names at deserialization time, not just tests

Add custom serde deserializer for unsupported_params that validates parameter names at runtime when loading providers.json (or user overrides).
Changes:
- Add unsupported_params_de module with custom deserializer
- Only allows: "temperature", "max_tokens", "stop_sequences"
- Invalid parameter names cause immediate deserialization error
- Update ProviderDefinition to use custom deserializer
- Enhanced test with explicit parameter name validation
- Add new test that verifies invalid parameters are rejected

Problem solved:
- Before: Invalid param names (e.g., "temperrature") silently ignored
- Now: Rejected at deserialization time with clear error message
- Prevents runtime failures caused by typos in configuration

Example error:
unsupported parameter name 'temperrature': must be one of: temperature, max_tokens, stop_sequences

Benefits:
- Fail-fast: errors caught when loading config, not at runtime
- Clear feedback: error message lists valid parameter names
- Type safety: validators run during deserialization
- Configuration errors detected immediately, not silently ignored

Verification:
- All 2,788 tests pass (including new validation test)
- Zero clippy warnings
- Code compiles successfully

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>

* merge: resolve conflicts for PR #800 and #822 into staging (#881)

* fix(ci): secrets can't be used in step if conditions [skip-regression-check] (#787)

GitHub Actions step-level `if:` doesn't have access to `secrets` context. Replace `if: secrets.X != ''` with `continue-on-error: true` and let the Set token step handle the fallback.
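Note on the fail-fast validation of `unsupported_params` names described above: a minimal sketch of the validation step in plain std Rust. The PR is said to implement this as a custom serde deserializer; the function below, `parse_unsupported_params`, is a hypothetical stand-in that shows only the name check and reproduces the error format quoted in the commit message.

```rust
/// Allowed parameter kinds (variant names taken from the commit text).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UnsupportedParam {
    Temperature,
    MaxTokens,
    StopSequences,
}

/// Hypothetical sketch: convert configured names into typed values, rejecting
/// the first unknown name instead of silently ignoring it.
fn parse_unsupported_params(names: &[&str]) -> Result<Vec<UnsupportedParam>, String> {
    names
        .iter()
        .map(|name| match *name {
            "temperature" => Ok(UnsupportedParam::Temperature),
            "max_tokens" => Ok(UnsupportedParam::MaxTokens),
            "stop_sequences" => Ok(UnsupportedParam::StopSequences),
            other => Err(format!(
                "unsupported parameter name '{}': must be one of: \
                 temperature, max_tokens, stop_sequences",
                other
            )),
        })
        // Collecting into `Result` stops at the first error.
        .collect()
}
```

In a serde-based version, the same check would presumably run inside the deserializer so that a typo in providers.json fails when the file is loaded.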
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(ci): clean up staging pipeline — remove hacks, skip redundant checks [skip-regression-check] (#794) - Remove continue-on-error from staging-ci.yml app token steps (secrets are configured) - Skip test.yml and code_style.yml on PRs targeting staging (staging-ci.yml already runs tests before promoting, promotion PR gets full CI on main) - Allow ironclaw-ci[bot] in Claude Code review for bot-created promotion PRs Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * fix(ci): run fmt + clippy on staging PRs, skip Windows clippy [skip-regression-check] (#802) - Remove branches:[main] filter from code_style.yml so it runs on all PRs - Gate clippy-windows with `if: github.base_ref == 'main'` (skip on staging PRs) - Update rollup job to allow skipped clippy-windows - Simplify claude-review.yml to only trigger on labeled event (avoids duplicate runs) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * feat: persist user_id in save_job and expose job_id on routine runs (#709) * feat: persist worker events to DB and fix activity tab rendering In-process Worker (used by Scheduler::dispatch_job) now persists events via save_job_event at key execution points: plan creation, LLM responses, tool_use, tool_result, and job completion/failure/stuck. Event data shapes match the container worker format so the gateway activity tab renders them correctly. Frontend: tool_result errors now show a red X icon with danger styling instead of a silent empty output. The result event falls back to the error field when message is absent. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: wire RoutineEngine into gateway for direct manual trigger firing Replace the message-channel hack in routines_trigger_handler with a direct call to RoutineEngine::fire_manual(), ensuring FullJob routines dispatch correctly when triggered from the web UI. Inject the engine into GatewayState from Agent::run after construction. 
Also persists user_id in save_job for both PG and libSQL backends, removes the source='sandbox' filter so all jobs are visible, and exposes job_id on RoutineRunInfo for the frontend job link. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: remove stale gateway_state argument from Agent::new test call sites The gateway_state parameter was removed from Agent::new during rebase (replaced by post-construction set_routine_engine_slot), but three test call sites still passed the extra None argument. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address PR review — restore sandbox source filter, remove blank lines - Revert removal of `source = 'sandbox'` filter in all SandboxStore queries (8 sites across PG and libSQL). Sandbox-specific APIs should stay scoped to sandbox jobs; unified job listing for the Jobs tab should use a separate query path. - Remove extra blank lines in agent_loop.rs and worker.rs that caused formatting CI failure. [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address review — regenerate Cargo.lock, add user_id regression test - Regenerate Cargo.lock from main's lockfile to eliminate dependency version downgrades (anyhow, syn, etc.) that were churn from rebase. - Add regression test verifying user_id round-trips through save_job and get_job in the libSQL backend. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: remove trailing blank line in libsql jobs.rs [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: add Postgres-side regression test for user_id persistence in save_job Mirrors the existing libSQL test (test_save_job_persists_user_id) for the Postgres backend. Gated behind #[cfg(feature = "postgres")] + #[ignore] since it requires a running PostgreSQL instance (integration tier). 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> * refactor: unify three agentic loops into single AgenticLoop engine (#654) Replace three independent copy-pasted agentic loops (dispatcher, worker, container runtime) with a single shared engine in `agentic_loop.rs` that all consumers customize via the `LoopDelegate` trait. Phase 1 — Shared engine (`src/agent/agentic_loop.rs`, 205 lines): - `run_agentic_loop()` owns the core LLM → tool exec → repeat cycle - `LoopDelegate` trait (Send + Sync, &dyn dispatch) with 6 hook points - Tool intent nudge logic consolidated (was duplicated in 3 files) - Iteration limit + force-text behavior preserved Phase 2 — Three delegate implementations: - `ChatDelegate` (dispatcher.rs): 3-phase approval flow, hooks, cost guard, context compaction, skill attenuation, interruption - `JobDelegate` (worker/job.rs): planning pre-loop phase, parallel JoinSet exec, mark_completed/stuck/failed, SSE streaming, self-repair - `ContainerDelegate` (worker/container.rs): sequential tool exec, HTTP-proxied LLM, container-safe tools, credential injection Phase 3 — File moves and cleanup: - Delete `src/agent/worker.rs` — job logic moved to `src/worker/job.rs` - Rename `src/worker/runtime.rs` → `src/worker/container.rs` - Re-export `Worker`/`WorkerDeps` from `crate::worker` in `agent/mod.rs` - Update `scheduler.rs` imports to new worker location Shared helpers (`src/tools/execute.rs`): - `execute_tool_with_safety()` replaces 4 copies of validate → timeout → execute → serialize - `process_tool_result()` replaces 3 copies of sanitize → wrap → ChatMessage (also used by thread_ops.rs approval resume paths) Net result: -2,408 lines, zero duplicated loop logic, single code path for tool intent nudge and completion detection. Closes #654 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address review feedback from Copilot 1. 
scheduler.rs: Replace `unwrap_or` fallback with proper error propagation when parsing tool output JSON — surfaces bugs instead of silently changing the output type. 2. worker/job.rs: Drop MutexGuard before the cancellation `.await` in `check_signals()` to avoid holding a lock across an async I/O call (prevents `await_holding_lock` lint). 3. worker/job.rs: Restore consecutive rate-limit counter (MAX_CONSECUTIVE_RATE_LIMITS = 10) so sustained rate limiting marks the job stuck with "Persistent rate limiting" instead of silently burning through max_iterations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: incorporate staging changes — token budget tracking + mark_failed Merge staging's changes into the refactored JobDelegate: - Add token budget tracking in call_llm (update_context/add_tokens) - mark_stuck → mark_failed for iteration cap and rate-limit exhaustion (aligns with staging's #788 fix) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address zmanian's PR review — eliminate type erasure, clean up Address all 6 review points from zmanian on PR #800: 1. Replace LoopOutcome::Custom(Box<dyn Any>) with typed LoopOutcome::NeedApproval(Box<PendingApproval>) — eliminates type erasure and downcast, resolves clippy large_enum_variant. 2. Remove dead max_tool_iterations field from ChatDelegate struct. 3. Add on_tool_intent_nudge() hook to LoopDelegate trait with implementations in Job and Container delegates for observability. 4. Fix SSE events in job worker to emit raw sanitized content instead of XML-wrapped <tool_output> tags. 5. Remove 4 duplicate completion tests from job.rs that were already covered by the shared util module. 6. Avoid logging full tool results — use result_size_bytes in debug logs (execute.rs, job.rs). Also updates path references in CLAUDE.md, COVERAGE_PLAN.md, and add-sse-event.md command. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat(doctor): expand diagnostics from 7 to 16 health checks * test: add unit tests for agentic_loop and execute shared modules Add 16 tests covering the two new critical shared modules: agentic_loop.rs (10 tests): - Text response exits loop immediately - Tool call → text response continuation - LoopSignal::Stop exits before LLM call - LoopSignal::InjectMessage adds user message to context - Max iterations terminates with LoopOutcome::MaxIterations - Tool intent nudge fires twice then caps - before_llm_call early exit bypasses LLM - truncate_for_preview: short string, long string, multibyte safety execute.rs (6 tests): - execute_tool_with_safety success path - Missing tool returns ToolError::NotFound - Tool execution failure propagates - Per-tool timeout enforcement (50ms) - process_tool_result XML wrapping on success - process_tool_result error formatting All 2,777 unit tests pass, 0 clippy warnings. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: cargo fmt Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address code review — 9 issues across agentic loop, job worker, container CRITICAL fixes: - Rate-limit exhaustion now returns Err(LlmError::RateLimited) instead of Ok(Text("")), stopping the loop immediately with no ghost iteration. Below-threshold retries still use Text("") with an explicit empty-string guard in handle_text_response to skip injection. - check_signals drains the entire message channel before returning, prioritizing Stop over UserMessage. Previously returned early on first UserMessage, silently dropping any queued Stop or additional messages. - check_signals now detects all non-progressing job states (Cancelled, Failed, Stuck, Completed, Submitted, Accepted) instead of only Cancelled and Failed. HIGH fixes: - Error path in process_tool_result_job applies truncate_for_preview to bound error strings in SSE/DB events (was unbounded). 
- Document Send+Sync lifetime constraint on LoopDelegate trait.
- Test mock before_llm_call refactored from double-lock to single lock acquisition, eliminating deadlock risk on refactor.

MEDIUM fixes:
- CompletionReport includes actual iteration count via shared Arc<Mutex<u32>> tracker (was hardcoded 0).
- process_tool_result_job return type changed from Result<bool> to Result<()>; the bool was always false (dead API).
- Deduplicate truncate in container.rs; now uses truncate_for_preview from agentic_loop.

Verified: 0 clippy warnings, 2781 tests pass, cargo fmt clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Henry Park <henrypark133@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Illia Polosukhin <ilblackdragon@gmail.com>
Co-authored-by: Umesh Kumar Singh <brijbiharisingh1971@outlook.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>

* Revert "Feat/docker shell edition" + fix fmt/clippy (#886)

* Revert "Feat/docker shell edition (#804)"

This reverts commit c566faf28fb77c2fa4df92c2947fb48f1a25df9b.

* style: fix formatting issues from revert

Run cargo fmt to fix formatting across 7 files after the revert of the docker shell edition feature.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: centralize test cre…
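The check_signals fix listed under CRITICAL fixes (drain the whole channel, then let Stop win over any queued user messages) might look roughly like the sketch below. It uses a synchronous `std::sync::mpsc` channel so that it runs without an async runtime, whereas the real worker is async. `WorkerMessage`, `LoopSignal::Continue`, and the choice to join queued messages with newlines are assumptions made for illustration; only `LoopSignal::Stop`, `LoopSignal::InjectMessage`, and the name `check_signals` appear in the commit messages.

```rust
use std::sync::mpsc::{channel, Receiver, TryRecvError};

// Hypothetical message type; the real channel payload is not shown on this page.
#[derive(Debug, PartialEq)]
enum WorkerMessage {
    Stop,
    UserMessage(String),
}

// Stop and InjectMessage are named in the commit message; Continue is assumed.
#[derive(Debug, PartialEq)]
enum LoopSignal {
    Continue,
    Stop,
    InjectMessage(String),
}

// Drain every queued message before deciding, so that a Stop queued behind a
// UserMessage is not left unread (the bug described in the fix).
fn check_signals(rx: &Receiver<WorkerMessage>) -> LoopSignal {
    let mut saw_stop = false;
    let mut user_messages: Vec<String> = Vec::new();

    loop {
        match rx.try_recv() {
            Ok(WorkerMessage::Stop) => saw_stop = true,
            Ok(WorkerMessage::UserMessage(text)) => user_messages.push(text),
            // Nothing left to read, or all senders are gone: stop draining.
            Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => break,
        }
    }

    if saw_stop {
        // Stop takes priority over any user messages seen in the same drain.
        LoopSignal::Stop
    } else if user_messages.is_empty() {
        LoopSignal::Continue
    } else {
        // Assumption: queued messages are combined into a single injection.
        LoopSignal::InjectMessage(user_messages.join("\n"))
    }
}

fn main() {
    let (tx, rx) = channel();
    tx.send(WorkerMessage::UserMessage("first".to_string())).unwrap();
    tx.send(WorkerMessage::Stop).unwrap();

    // An early return on the first UserMessage would miss the queued Stop.
    assert_eq!(check_signals(&rx), LoopSignal::Stop);
}
```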

octo-patch and others added 27 commits March 12, 2026 11:17

fix(memory): reject absolute filesystem paths with corrective routing (#934)

d420abf

* ci(staging): use default branch instead of hardcoded main
* fix(memory): route absolute paths to filesystem tools

style(agent): remove unnecessary Worker re-export (#923)

006c15e

Expose the shared agent session manager via AppComponents (#532)

fd574b2

* Expose agent session manager via AppComponents
* Polish AppComponents session manager naming

feat(tools): add reusable sensitive JSON redaction helper (#457)

d5828b2

* feat(tools): add reusable sensitive JSON redaction helper
* fix(tools): harden sensitive-key tokenization and context matching

fix(web): recompute cron next_fire_at when re-enabling routines (#1080)

442a42d

fix(routines): run cron checks immediately on ticker startup (#1066)

7a9cbb3

* fix(routines): run cron check immediately at ticker startup
* test/ci: add routine_engine test and fix style lint drift

fix(http): fail closed when webhook secret is missing at runtime (#1075)

1ba6a83

fix: resolve bug_bash UX/logging issues (#1054 #1055 #1058) (#1072)

c54f739

* fix(web,db): improve auth UX + reduce naive timestamp log noise
* fix(clippy): keep memory test modules at end of file

fix: add tool_info schema discovery for WASM tools (#1086)

8a60fa2

* fix: add tool_info schema discovery for WASM tools
* refactor: simplify WASM schema and hint state
* refactor: store tool_info registry reference as Weak

fix(ci): repair staging-ci workflow parsing (#1090)

cd1245a

ironclaw-ci bot added the staging-promotion label Mar 13, 2026

github-actions bot added the "scope: agent" (Agent core: agent loop, router, scheduler) and "scope: channel/cli" (TUI / CLI channel) labels Mar 13, 2026

zmanian and others added 5 commits March 12, 2026 21:02

fix(registry): bump telegram channel version for capabilities change (#1064)

a89cf37

The validation_endpoint addition to telegram.capabilities.json requires a version bump to pass the CI version-check gate on staging promotion.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

fix(ci): checkout promotion PR head for metadata refresh (#1097)

1e00b1f

Merge pull request #1102 from nearai/staging-promote/1e00b1fe-23036363919

3149c91

chore: promote staging to staging-promote/3c619b62-23035039465 (2026-03-13 04:35 UTC)

github-actions bot added the "scope: sandbox" (Docker sandbox) label Mar 13, 2026

henrypark133 merged commit 2b8063a into staging-promote/e2eb340c-22999151534 Mar 13, 2026
18 checks passed

henrypark133 deleted the staging-promote/3c619b62-23035039465 branch March 13, 2026 05:56

bkutasi pushed a commit to bkutasi/ironclaw that referenced this pull request Mar 28, 2026

Merge pull request nearai#1096 from nearai/staging-promote/3c619b62-23035039465

03d6430

chore: promote staging to staging-promote/e2eb340c-22999151534 (2026-03-13 03:36 UTC)

drchirag1991 pushed a commit to drchirag1991/ironclaw that referenced this pull request Apr 8, 2026

Merge pull request nearai#1096 from nearai/staging-promote/8d3f19cc-23035039465

45099d8

chore: promote staging to staging-promote/775bb0cd-22999151534 (2026-03-13 03:36 UTC)

henrypark133 merged 32 commits into staging-promote/e2eb340c-22999151534 from staging-promote/3c619b62-23035039465

13 participants

Current commits in this promotion (31)
