chore: promote staging to staging-promote/59014516-23505370929 (2026-03-24 23:13 UTC) #1627
Merged
henrypark133 merged 16 commits into staging-promote/59014516-23505370929 from Mar 25, 2026
Conversation
* Fix hosted OAuth refresh via proxy * Address OAuth refresh review feedback * Address new OAuth refresh review comments * Address additional OAuth refresh review feedback * Harden proxy exchange redirects
* fix: restore owner-scoped gateway startup * fix: split gateway owner and sender scope * fix: keep multi-user gateway sender identity * test: cover gateway sender scope regression * test: harden e2e startup teardown race * fix: align gateway owner scope across auth modes
* feat(cli): show credential auth status in `tool info`
`ironclaw tool info` now checks the secrets store and shows whether
each required credential is configured or missing, consolidated into
a single Auth section that deduplicates across http.credentials,
auth, and setup.required_secrets. Secrets already shown in Auth are
filtered from the Secrets section to avoid redundancy.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(cli): address review feedback on tool info auth status
- Fix clippy collapsible-if by using `if let` + `&&`
- Use HashMap<String, usize> for O(1) dedup instead of HashSet + linear scan
- Add --user flag to `tool info` for checking non-default user credentials
- Show "? unknown" on secrets store errors instead of silently reporting missing
- Surface secrets store init failure via eprintln instead of silent .ok()
- Sort auth entries by secret name for deterministic output
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
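The deduplication and sorting described in this commit could look roughly like the sketch below. It is not the PR's code: `AuthEntry`, `dedup_auth_entries`, and the `(secret name, source)` input shape are hypothetical names chosen for illustration. The map stores each secret's position in the output vector, so a repeated name is found in O(1) instead of by scanning the vector.

```rust
use std::collections::HashMap;

/// Hypothetical entry type; the real struct used by `tool info` is not shown in the PR text.
#[derive(Debug, Clone, PartialEq)]
struct AuthEntry {
    secret_name: String,
    /// Which capability sections mentioned this secret (e.g. "auth").
    sources: Vec<String>,
}

/// Merge (secret_name, source) pairs into one entry per secret.
/// `index` maps a secret name to its position in `entries`.
fn dedup_auth_entries(pairs: &[(&str, &str)]) -> Vec<AuthEntry> {
    let mut index: HashMap<String, usize> = HashMap::new();
    let mut entries: Vec<AuthEntry> = Vec::new();

    for (name, source) in pairs {
        if let Some(&i) = index.get(*name) {
            // Already seen: only record the extra source once.
            if !entries[i].sources.iter().any(|s| s.as_str() == *source) {
                entries[i].sources.push(source.to_string());
            }
        } else {
            index.insert(name.to_string(), entries.len());
            entries.push(AuthEntry {
                secret_name: name.to_string(),
                sources: vec![source.to_string()],
            });
        }
    }

    // Sort by secret name so the printed output is deterministic.
    // The index map is not used after this point, so stale positions are harmless.
    entries.sort_by(|a, b| a.secret_name.cmp(&b.secret_name));
    entries
}
```

Calling it with a secret that appears under both `auth` and `setup.required_secrets` yields a single entry listing both sources.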
* fix(cli): only filter secrets when auth section renders, add regression test
When the secrets store fails to initialize, the Auth section is not
rendered. Previously, secret names were still filtered from the Secrets
section, causing credential names to disappear entirely. Now secrets
are only filtered when the Auth section will actually be displayed.
Adds test verifying auth secret deduplication across auth, setup, and
http.credentials sections, plus secrets store existence checks.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor(cli): extract collect_auth_secrets helper, always render Auth section
Address review feedback:
- Extract dedup logic into `collect_auth_secrets()` so the test exercises
the same code path as production (not a re-implementation)
- Always render the Auth section when auth secrets exist, showing
"? unknown" status when the secrets store is unavailable instead of
hiding credential names entirely
- Lazily init secrets store only when capabilities contain auth secrets,
avoiding spurious warnings for tools with no auth
- Add test for empty capabilities edge case
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
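A minimal sketch of the status rule described above, assuming the store lookup can be modeled as an `Option<Result<bool, String>>`: `None` when the secrets store could not be initialized, `Some(Err(_))` when a lookup failed. `AuthStatus`, `auth_status`, and the "configured"/"missing" labels are assumptions; only the "? unknown" text comes from the commit messages.

```rust
/// Hypothetical status of one required credential.
#[derive(Debug, Clone, Copy, PartialEq)]
enum AuthStatus {
    Configured,
    Missing,
    Unknown,
}

impl AuthStatus {
    /// Label shown in the Auth section. Only "? unknown" is taken from the
    /// PR text; the other two labels are placeholders.
    fn label(self) -> &'static str {
        match self {
            AuthStatus::Configured => "configured",
            AuthStatus::Missing => "missing",
            AuthStatus::Unknown => "? unknown",
        }
    }
}

/// Map a store lookup to a status without ever reporting "missing"
/// when the store itself is unavailable or errored.
fn auth_status(lookup: Option<Result<bool, String>>) -> AuthStatus {
    match lookup {
        Some(Ok(true)) => AuthStatus::Configured,
        Some(Ok(false)) => AuthStatus::Missing,
        // Store unavailable or lookup error: we cannot tell either way.
        Some(Err(_)) | None => AuthStatus::Unknown,
    }
}
```

Keeping the unavailable case distinct from `Missing` is what lets the Auth section still list credential names when the store cannot be opened.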
* style(cli): move HashMap/HashSet imports to top of file
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(cli): use correct tagged JSON format for credential location in test
The CredentialLocationSchema uses serde tagged enum format
({"type": "bearer"}), not a bare string ("AuthorizationBearer").
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: extract AppEvent to crates/ironclaw_common
SseEvent was defined in src/channels/web/types.rs but imported by 12+
modules across agent, orchestrator, worker, tools, and extensions; it had
become the application-wide event protocol, not a web transport concern.
Create crates/ironclaw_common as a shared workspace crate and move the
enum there as AppEvent. Also move the truncate_preview utility, which was
similarly leaked from the web gateway into agent modules.
- New crate: crates/ironclaw_common (AppEvent, truncate_preview)
- Rename SseEvent → AppEvent, from_sse_event → from_app_event
- web/types.rs re-exports AppEvent for internal gateway use
- web/util.rs re-exports truncate_preview
- Wire format unchanged (serde renames are on variants, not the enum)
Aligned with the event bus direction on refactor/architectural-hardening
where DomainEvent (≡ AppEvent) is wrapped in a SystemEvent envelope.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: add AppEvent::event_type() helper, deduplicate match blocks
Address Gemini review: extract the variant→string match into a single
method on AppEvent, replacing the duplicated 22-arm matches in sse.rs and
types.rs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: rename leftover sse vars/tests to match AppEvent rename
Address Copilot review: rename sse_event vars to app_event in
orchestrator/api.rs and ws.rs, rename test functions from
test_ws_server_from_sse_* to test_ws_server_from_app_event_*, and update
stale SSE comments.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: add Deserialize to AppEvent, round-trip test, fix stale comments
Address zmanian review:
- Add Deserialize derive to AppEvent so downstream consumers can
deserialize incoming events
- Add event_type_matches_serde_type_field test that round-trips every
variant through serde and asserts event_type() matches the serialized
"type" field; this catches drift between serde renames and the manual match
- Add round_trip_deserialize test for basic Serialize/Deserialize parity
- Update remaining "SSE" references in comments across server.rs,
manager.rs, ws_gateway_integration.rs, and worker/job.rs
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
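The `event_type()` helper can be pictured as a single variant-to-string match that every transport reuses. The sketch below is an assumption-laden illustration, not the crate's code: the real enum has 22 variants and serde attributes, while this one has three variants with made-up fields and made-up wire names, and `sse_frame` is a hypothetical caller.

```rust
/// Trimmed-down stand-in for the shared event enum. `ReasoningUpdate` and
/// `JobReasoning` are named in the PR; the fields and `Response` are assumed.
#[derive(Debug, Clone, PartialEq)]
enum AppEvent {
    Response { content: String },
    ReasoningUpdate { narrative: String },
    JobReasoning { job_id: String, narrative: String },
}

impl AppEvent {
    /// Single source of truth for the event name. The strings here are
    /// illustrative; the real ones must match the serde renames.
    fn event_type(&self) -> &'static str {
        match self {
            AppEvent::Response { .. } => "response",
            AppEvent::ReasoningUpdate { .. } => "reasoning_update",
            AppEvent::JobReasoning { .. } => "job_reasoning",
        }
    }
}

/// Hypothetical caller: build a Server-Sent Events frame from an event and
/// its already-serialized JSON payload, without repeating the match.
fn sse_frame(event: &AppEvent, data_json: &str) -> String {
    format!("event: {}\ndata: {}\n\n", event.event_type(), data_json)
}
```

Because the match has no wildcard arm, adding a variant without a name is a compile error, which complements the serde round-trip test the later commit adds.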
* fix: ensure LLM calls always end with user message (closes #763)
Claude 4.6 models (claude-sonnet-4-6, claude-opus-4-6) no longer support
assistant message prefill: any LLM call where the conversation ends on an
assistant message is rejected with HTTP 400 "This model does not support
assistant message prefill". The same root cause also triggers NEAR AI's
"No user query found in messages" 400 error for the routine engine path.
Two fixes:
1. src/worker/container.rs: before_llm_call()
After poll_and_inject_prompt(), if no user follow-up arrived and
handle_text_response() left an assistant message at the end of the
conversation, inject a sentinel "Continue." user message before the
next LLM call.
2. src/agent/routine_engine.rs: execute_lightweight_with_tools()
Before the force_text final completion call, ensure messages end with a
user-role message. Tool result messages (Role::Tool) satisfy Anthropic
but not NEAR AI; assistant messages satisfy neither.
Also updates the worker system prompt to instruct the agent to include
the phrase "The job is complete" in its final message, so the agentic
loop can detect termination reliably.
Tested with claude-sonnet-4-6 and claude-opus-4-6.
Workaround: ANTHROPIC_MODEL=claude-sonnet-4-20250514 (still supports prefill).
* fix: broaden sentinel guard to any non-user message (per review)
Gemini suggested the Role::Assistant check in before_llm_call() is too
specific. Changed to !Role::User to match the routine_engine.rs fix and
cover tool results too.
* fix: address zmanian review - JobDelegate sentinel, shared helper, NearAI complete() flattening
- Extract ensure_ends_with_user_message() to src/util.rs with 4 unit tests
(empty list, after assistant, after tool result, no-op when already user)
- Add sentinel guard to JobDelegate::before_llm_call() in src/worker/job.rs
so scheduler jobs (CreateJob / /job path) no longer hit Claude 4.6 /
NEAR AI 400s
- Replace inline guards in ContainerDelegate and routine_engine.rs with the
shared helper; all 3 call sites now use one implementation
- Fix complete() in nearai_chat.rs to apply flatten_tool_messages when
flatten_tool_messages=true; previously only complete_with_tools()
flattened, so force_text paths could still send role:"tool" messages to
NEAR AI
- Update stale comment in container.rs: "assistant message" → "non-user message"
- Add flatten tests in nearai_chat.rs covering the complete() path
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* ci: fix fmt and tar advisory
---------
Co-authored-by: Jacob Lasky <jacob.lasky@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Illia Polosukhin <ilblackdragon@gmail.com>
Co-authored-by: firat.sertgoz <f@nuff.tech>
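A minimal sketch of what a shared helper like `ensure_ends_with_user_message()` might do, based only on the behavior these commits describe. The `Role` and `ChatMessage` types are simplified stand-ins, the signature is a guess, and the choice to append the sentinel to an empty list is an assumption (the PR lists an "empty list" test but does not state the expected outcome).

```rust
/// Simplified stand-in for the project's message role type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
    Tool,
}

/// Simplified stand-in for a chat message.
#[derive(Debug, Clone, PartialEq)]
struct ChatMessage {
    role: Role,
    content: String,
}

/// Append a sentinel "Continue." user message unless the conversation
/// already ends with a user message. Checking for "not User" rather than
/// "is Assistant" also covers trailing tool results.
fn ensure_ends_with_user_message(messages: &mut Vec<ChatMessage>) {
    let ends_with_user = matches!(messages.last(), Some(m) if m.role == Role::User);
    if !ends_with_user {
        // Assumption: an empty list also gets the sentinel.
        messages.push(ChatMessage {
            role: Role::User,
            content: "Continue.".to_string(),
        });
    }
}
```

The helper is idempotent: a second call finds the sentinel (a user message) at the end and leaves the list unchanged.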
… all surfaces (#1513)
* feat(agent): thread per-tool reasoning from LLM through to REPL, HTTP, SSE, and DB
Add end-to-end agent reasoning summaries so users can see *why* the agent
chose specific tools, not just what it did.
- Add `reasoning: Option<String>` to `ToolCall` (all providers)
- Populate from LLM response content in `Reasoning::respond_with_tools` and
`select_tools`, with per-tool override when providers supply it
- Extend `Turn` with `narrative` and `TurnToolCall` with `rationale` +
`tool_call_id` for identity-based result matching
- Persist reasoning in DB via existing tool_calls JSON (no migration)
- Add `StatusUpdate::ReasoningUpdate` and `SseEvent::ReasoningUpdate` +
`SseEvent::JobReasoning` for real-time streaming
- Emit reasoning events in both chat dispatcher and worker job path
- Add `/reasoning [N|all]` command for inspecting turn reasoning
- Surface `narrative` and `rationale` in HTTP `/api/chat/history`
Based on the design from #361 and #456, reconstructed cleanly with
Option<String> to minimize blast radius (vs mandatory String that broke
compilation in #456).
Closes #456
Co-Authored-By: panosAthDBX <47406510+panosAthDBX@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address PR review feedback from Gemini and Copilot
- Fix `_ => Ok(None)` in agent_loop.rs to avoid accidental shutdown
- Fix fallback in record_tool_result_for/record_tool_error_for to use first pending call instead of last_mut (parallel execution safety)
- Include per-tool decisions in WASM channel reasoning messages
- Apply truncate_at_tool_tags + clean_response to shared_reasoning in select_tools (parity with respond_with_tools)
- Persist turn-level narrative to DB in tool_calls JSON wrapper
- Parse both old (array) and new (object) tool_calls formats in build_turns_from_db_messages for backward compatibility
- Populate reasoning from action.reasoning in execute_plan ToolCalls
[skip-regression-check]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address second round of review comments + merge fixes
- Add reasoning: None to new github_copilot.rs ToolCall sites (from staging merge)
- Run cargo fmt on 4 files with formatting diffs
- Truncate narrative to 1000 chars before DB persistence
- Clone turn data and drop session lock in /reasoning command
- Extract ToolDecisionDto::from_json_array shared helper (deduplicate worker/job.rs and orchestrator/api.rs)
- Add unit tests for wrapped tool_calls JSON format with narrative
[skip-regression-check]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address third round of review comments (Copilot + serrrfirat)
- Reword ToolCall.reasoning docstring to reflect provider-supplied or fallback contract
- Sanitize narrative through SafetyLayer before storage/emission
- Clean per-tool reasoning via truncate_at_tool_tags + clean_response in select_tools (parity with shared reasoning)
- Convert 4 approval-path recording sites in thread_ops.rs to identity-based record_tool_result_for/record_tool_error_for
- Preserve tool_call_id and reasoning through restore_from_messages
- Fix has_result/has_error to reject JSON null values
- Truncate tool_call_id to 128 chars before DB persistence
- Add 4 unit tests for record_tool_result_for/error_for edge cases
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address zmanian review - sanitize JobDelegate reasoning + warn on dropped results
- Sanitize narrative and per-tool rationale through SafetyLayer in JobDelegate reasoning events (parity with ChatDelegate)
- Add tracing::warn when record_tool_result_for/error_for drops a result because no matching or pending tool call exists
- Add 3 unit tests for reasoning normalization (thinking tags, tool tags, empty-after-cleaning)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address 4 remaining unreplied review comments
- Clean per-tool reasoning in respond_with_tools via truncate_at_tool_tags + clean_response (parity with select_tools)
- Handle wrapped JSON format in rebuild_chat_messages_from_db so cold hydration works after persist_tool_calls format change
- Update persist_tool_calls doc comment to describe new JSON shape
- Sanitize per-tool rationale through SafetyLayer in ChatDelegate before emission and storage (parity with JobDelegate)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address zmanian review round 2
- Add tracing::debug on fallback-to-pending path in record_tool_result_for and record_tool_error_for (item 1)
- Add comment explaining why /reasoning is special-cased in agent_loop.rs (item 4)
- Items 2 (narrative persistence), 3 (rationale sanitization), and 5 (catch-all fix) were already addressed in prior commits
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: panosAthDBX <47406510+panosAthDBX@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
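The identity-based result matching described in these commits (match on `tool_call_id`, fall back to the first pending call, drop the result if neither exists) can be sketched as follows. This is an illustration under assumptions: the `TurnToolCall` fields shown, the free-function signature, and the `Option<usize>` return value are hypothetical, and the real code logs via `tracing` where this sketch only returns `None`.

```rust
/// Simplified stand-in for a recorded tool call within a turn.
#[derive(Debug, Clone, PartialEq)]
struct TurnToolCall {
    tool_call_id: Option<String>,
    name: String,
    result: Option<String>,
}

/// Record `result` on the call whose id matches `tool_call_id`.
/// If no id matches, fall back to the FIRST call that has no result yet
/// (not the last one, which would misattribute results when tools run in
/// parallel). Returns the index that was updated, or None if the result
/// had to be dropped because nothing matched and nothing was pending.
fn record_tool_result_for(
    calls: &mut [TurnToolCall],
    tool_call_id: &str,
    result: &str,
) -> Option<usize> {
    let idx = calls
        .iter()
        .position(|c| c.tool_call_id.as_deref() == Some(tool_call_id))
        .or_else(|| calls.iter().position(|c| c.result.is_none()))?;
    calls[idx].result = Some(result.to_string());
    Some(idx)
}
```

With two pending calls `a` and `b`, a result for `b` lands on the second call even though it arrives first, which is the case positional matching gets wrong.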
* Fix REPL single-message hang and cap CI test duration * Fix Clippy nested-if lint in REPL startup * Fix single-message approval flow * Handle empty single-message REPL exits * Wait for one-shot event routines before exit
* Fix REPL single-message hang and cap CI test duration * Fix Clippy nested-if lint in REPL startup * Fix single-message approval flow * Handle empty single-message REPL exits * Wait for one-shot event routines before exit * Fix MCP lifecycle trace user scope
* Fix REPL single-message hang and cap CI test duration * Fix Clippy nested-if lint in REPL startup * Fix single-message approval flow * Handle empty single-message REPL exits * Wait for one-shot event routines before exit * Fix MCP lifecycle trace user scope * Normalize cron schedules on routine create
…3131 chore: promote staging to staging-promote/ab0ad948-23563320113 (2026-03-25 21:37 UTC)
…0113 chore: promote staging to staging-promote/c949521d-23562109203 (2026-03-25 20:47 UTC)
…9203 chore: promote staging to staging-promote/0341fcc9-23558273569 (2026-03-25 20:19 UTC)
…3569 chore: promote staging to staging-promote/6daa2f15-23538193544 (2026-03-25 18:47 UTC)
…3544 chore: promote staging to staging-promote/82822d7b-23516534944 (2026-03-25 12:01 UTC)
492d9d2 into staging-promote/59014516-23505370929
12 of 13 checks passed
bkutasi pushed a commit to bkutasi/ironclaw that referenced this pull request on Mar 28, 2026
…3516534944 chore: promote staging to staging-promote/59014516-23505370929 (2026-03-24 23:13 UTC)
drchirag1991 pushed a commit to drchirag1991/ironclaw that referenced this pull request on Apr 8, 2026
…3516534944 chore: promote staging to staging-promote/cf800da6-23505370929 (2026-03-24 23:13 UTC)
Auto-promotion from staging CI
Batch range: 0d1a5c210b877f89bcb87e6f1d8584396d12f208..82822d7b2556a1cf29c6525d211cadd9b0a5917f
Promotion branch: staging-promote/82822d7b-23516534944
Base: staging-promote/59014516-23505370929
Triggered by: Staging CI batch at 2026-03-24 23:13 UTC
Commits in this batch (39):
feat(cli): add `ironclaw hooks list` subcommand (#1023)
Current commits in this promotion (3)
Current base: staging-promote/59014516-23505370929
Current head: staging-promote/82822d7b-23516534944
Current range: origin/staging-promote/59014516-23505370929..origin/staging-promote/82822d7b-23516534944
Auto-updated by staging promotion metadata workflow
Waiting for gates:
Auto-created by staging-ci workflow