
chore: promote staging to staging-promote/59014516-23505370929 (2026-03-24 23:13 UTC)#1627

Merged
henrypark133 merged 16 commits into staging-promote/59014516-23505370929 from staging-promote/82822d7b-23516534944
Mar 25, 2026

Conversation


@ironclaw-ci ironclaw-ci bot commented Mar 24, 2026

Auto-promotion from staging CI

Batch range: 0d1a5c210b877f89bcb87e6f1d8584396d12f208..82822d7b2556a1cf29c6525d211cadd9b0a5917f
Promotion branch: staging-promote/82822d7b-23516534944
Base: staging-promote/59014516-23505370929
Triggered by: Staging CI batch at 2026-03-24 23:13 UTC

Commits in this batch (39):

Current commits in this promotion (3)

Current base: staging-promote/59014516-23505370929
Current head: staging-promote/82822d7b-23516534944
Current range: origin/staging-promote/59014516-23505370929..origin/staging-promote/82822d7b-23516534944

Auto-updated by staging promotion metadata workflow

Waiting for gates:

  • Tests: pending
  • E2E: pending
  • Claude Code review: pending (will post comments on this PR)

Auto-created by staging-ci workflow

zmanian and others added 3 commits March 24, 2026 11:48
* Fix hosted OAuth refresh via proxy

* Address OAuth refresh review feedback

* Address new OAuth refresh review comments

* Address additional OAuth refresh review feedback

* Harden proxy exchange redirects
* fix: restore owner-scoped gateway startup

* fix: split gateway owner and sender scope

* fix: keep multi-user gateway sender identity

* test: cover gateway sender scope regression

* test: harden e2e startup teardown race

* fix: align gateway owner scope across auth modes
@github-actions bot added the following labels on Mar 24, 2026:

  • `scope: agent`: Agent core (agent loop, router, scheduler)
  • `scope: channel/cli`: TUI / CLI channel
  • `scope: channel/web`: Web gateway channel
  • `scope: tool/wasm`: WASM tool sandbox
  • `scope: extensions`: Extension management
  • `scope: docs`: Documentation
  • `size: XL`: 500+ changed lines
  • `risk: medium`: Business logic, config, or moderate-risk modules
  • `contributor: core`: 20+ merged PRs

ilblackdragon and others added 13 commits March 24, 2026 23:01
* feat(cli): show credential auth status in `tool info`

`ironclaw tool info` now checks the secrets store and shows whether
each required credential is configured or missing, consolidated into
a single Auth section that deduplicates across http.credentials,
auth, and setup.required_secrets. Secrets already shown in Auth are
filtered from the Secrets section to avoid redundancy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(cli): address review feedback on tool info auth status

- Fix clippy collapsible-if by using `if let` + `&&`
- Use HashMap<String, usize> for O(1) dedup instead of HashSet + linear scan
- Add --user flag to `tool info` for checking non-default user credentials
- Show "? unknown" on secrets store errors instead of silently reporting missing
- Surface secrets store init failure via eprintln instead of silent .ok()
- Sort auth entries by secret name for deterministic output

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
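
The "? unknown" behaviour described above can be pictured as a three-state status rather than a boolean. The following is a minimal sketch, not the actual ironclaw code: `AuthStatus`, `from_lookup`, and the "configured" / "missing" labels are hypothetical names chosen for illustration; only the "? unknown" string comes from the commit message.

```rust
use std::fmt;

/// Hypothetical tri-state status for one required credential.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthStatus {
    Configured,
    Missing,
    /// The secrets store could not answer, so we do not claim "missing".
    Unknown,
}

impl AuthStatus {
    /// Map a secrets-store existence check onto a status.
    /// A store error becomes `Unknown` instead of being reported as `Missing`.
    pub fn from_lookup<E>(lookup: Result<bool, E>) -> Self {
        match lookup {
            Ok(true) => AuthStatus::Configured,
            Ok(false) => AuthStatus::Missing,
            Err(_) => AuthStatus::Unknown,
        }
    }
}

impl fmt::Display for AuthStatus {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Only "? unknown" is taken from the commit text; the other labels are assumed.
        let label = match self {
            AuthStatus::Configured => "configured",
            AuthStatus::Missing => "missing",
            AuthStatus::Unknown => "? unknown",
        };
        f.write_str(label)
    }
}
```

Keeping the error case as its own variant means a store outage cannot be mistaken for a credential that was never set.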

* fix(cli): only filter secrets when auth section renders, add regression test

When the secrets store fails to initialize, the Auth section is not
rendered. Previously, secret names were still filtered from the Secrets
section, causing credential names to disappear entirely. Now secrets
are only filtered when the Auth section will actually be displayed.

Adds test verifying auth secret deduplication across auth, setup, and
http.credentials sections, plus secrets store existence checks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(cli): extract collect_auth_secrets helper, always render Auth section

Address review feedback:
- Extract dedup logic into `collect_auth_secrets()` so the test exercises
  the same code path as production (not a re-implementation)
- Always render the Auth section when auth secrets exist, showing
  "? unknown" status when the secrets store is unavailable instead of
  hiding credential names entirely
- Lazily init secrets store only when capabilities contain auth secrets,
  avoiding spurious warnings for tools with no auth
- Add test for empty capabilities edge case

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
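
For readers unfamiliar with the dedup approach mentioned in these commits, here is a rough sketch of how a `collect_auth_secrets()`-style helper could combine a `HashMap<String, usize>` index with a final sort by secret name. The signature, the `AuthSecret` struct, and the `(section, name)` input shape are assumptions made for this example; the real helper operates on the tool's capabilities and may look quite different.

```rust
use std::collections::HashMap;

/// One row of the consolidated Auth section (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AuthSecret {
    pub name: String,
    /// Capability sections that mention this secret.
    pub sources: Vec<&'static str>,
}

/// Deduplicate secret names across `(section, name)` pairs, then sort by name
/// so the output is deterministic.
pub fn collect_auth_secrets(entries: &[(&'static str, &str)]) -> Vec<AuthSecret> {
    // name -> position in `out`, giving O(1) lookup instead of a linear scan.
    let mut index: HashMap<String, usize> = HashMap::new();
    let mut out: Vec<AuthSecret> = Vec::new();

    for &(source, name) in entries {
        let existing = index.get(name).copied();
        match existing {
            Some(i) => {
                // Already seen: just remember the extra section it came from.
                if !out[i].sources.contains(&source) {
                    out[i].sources.push(source);
                }
            }
            None => {
                index.insert(name.to_string(), out.len());
                out.push(AuthSecret {
                    name: name.to_string(),
                    sources: vec![source],
                });
            }
        }
    }

    // Sorting happens last; `index` is not used after this point.
    out.sort_by(|a, b| a.name.cmp(&b.name));
    out
}
```

Because production code and tests would both call the same function, a test built on it exercises the real dedup path rather than a re-implementation.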

* style(cli): move HashMap/HashSet imports to top of file

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(cli): use correct tagged JSON format for credential location in test

The CredentialLocationSchema uses serde tagged enum format
({"type": "bearer"}), not a bare string ("AuthorizationBearer").

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: extract AppEvent to crates/ironclaw_common

SseEvent was defined in src/channels/web/types.rs but imported by 12+
modules across agent, orchestrator, worker, tools, and extensions — it
had become the application-wide event protocol, not a web transport
concern.

Create crates/ironclaw_common as a shared workspace crate and move the
enum there as AppEvent.  Also move the truncate_preview utility which
was similarly leaked from the web gateway into agent modules.

- New crate: crates/ironclaw_common (AppEvent, truncate_preview)
- Rename SseEvent → AppEvent, from_sse_event → from_app_event
- web/types.rs re-exports AppEvent for internal gateway use
- web/util.rs re-exports truncate_preview
- Wire format unchanged (serde renames are on variants, not the enum)

Aligned with the event bus direction on refactor/architectural-hardening
where DomainEvent (≡ AppEvent) is wrapped in a SystemEvent envelope.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: add AppEvent::event_type() helper, deduplicate match blocks

Address Gemini review: extract the variant→string match into a single
method on AppEvent, replacing the duplicated 22-arm matches in sse.rs
and types.rs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
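
To illustrate the idea behind `AppEvent::event_type()`, the sketch below uses a trimmed-down stand-in enum. The variants' fields and the returned strings are illustrative assumptions; the real enum has many more variants and its wire names are defined by its serde renames.

```rust
/// Trimmed-down stand-in for the shared event enum (fields are illustrative).
#[derive(Debug, Clone, PartialEq)]
pub enum AppEvent {
    Response { content: String },
    ReasoningUpdate { narrative: String },
    JobReasoning { job_id: String, narrative: String },
    Heartbeat,
}

impl AppEvent {
    /// Single source of truth for the variant -> event-type string mapping,
    /// so transports do not each carry their own copy of this match.
    /// The strings here are assumed, not the project's actual wire names.
    pub fn event_type(&self) -> &'static str {
        match self {
            AppEvent::Response { .. } => "response",
            AppEvent::ReasoningUpdate { .. } => "reasoning_update",
            AppEvent::JobReasoning { .. } => "job_reasoning",
            AppEvent::Heartbeat => "heartbeat",
        }
    }
}
```

Since the match is exhaustive, adding a variant without extending `event_type()` fails to compile, which is one reason to keep the mapping in a single place.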

* refactor: rename leftover sse vars/tests to match AppEvent rename

Address Copilot review: rename sse_event vars to app_event in
orchestrator/api.rs and ws.rs, rename test functions from
test_ws_server_from_sse_* to test_ws_server_from_app_event_*, and
update stale SSE comments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: add Deserialize to AppEvent, round-trip test, fix stale comments

Address zmanian review:
- Add Deserialize derive to AppEvent so downstream consumers can
  deserialize incoming events
- Add event_type_matches_serde_type_field test that round-trips every
  variant through serde and asserts event_type() matches the serialized
  "type" field — catches drift between serde renames and the manual match
- Add round_trip_deserialize test for basic Serialize/Deserialize parity
- Update remaining "SSE" references in comments across server.rs,
  manager.rs, ws_gateway_integration.rs, and worker/job.rs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: ensure LLM calls always end with user message (closes #763)

Claude 4.6 models (claude-sonnet-4-6, claude-opus-4-6) no longer support
assistant message prefill — any LLM call where the conversation ends on an
assistant message is rejected with HTTP 400 "This model does not support
assistant message prefill".

The same root cause also triggers NEAR AI's "No user query found in messages"
400 error for the routine engine path.

Two fixes:

1. src/worker/container.rs — before_llm_call()
   After poll_and_inject_prompt(), if no user follow-up arrived and
   handle_text_response() left an assistant message at the end of the
   conversation, inject a sentinel "Continue." user message before
   the next LLM call.

2. src/agent/routine_engine.rs — execute_lightweight_with_tools()
   Before the force_text final completion call, ensure messages end
   with a user-role message. Tool result messages (Role::Tool) satisfy
   Anthropic but not NEAR AI; assistant messages satisfy neither.

Also updates the worker system prompt to instruct the agent to include
the phrase "The job is complete" in its final message, so the agentic
loop can detect termination reliably.

Tested with claude-sonnet-4-6 and claude-opus-4-6.
Workaround: ANTHROPIC_MODEL=claude-sonnet-4-20250514 (still supports prefill).

* fix: broaden sentinel guard to any non-user message (per review)

Gemini suggested the Role::Assistant check in before_llm_call() is too
specific. Changed to !Role::User to match the routine_engine.rs fix and
cover tool results too.

* fix: address zmanian review — JobDelegate sentinel, shared helper, NearAI complete() flattening

- Extract ensure_ends_with_user_message() to src/util.rs with 4 unit tests
  (empty list, after assistant, after tool result, no-op when already user)
- Add sentinel guard to JobDelegate::before_llm_call() in src/worker/job.rs
  so scheduler jobs (CreateJob / /job path) no longer hit Claude 4.6 / NEAR AI 400s
- Replace inline guards in ContainerDelegate and routine_engine.rs with the
  shared helper — all 3 call sites now use one implementation
- Fix complete() in nearai_chat.rs to apply flatten_tool_messages when
  flatten_tool_messages=true — previously only complete_with_tools() flattened,
  so force_text paths could still send role:"tool" messages to NEAR AI
- Update stale comment in container.rs: "assistant message" → "non-user message"
- Add flatten tests in nearai_chat.rs covering the complete() path

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
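
A shared guard of the kind described above might look roughly like the following. This is a sketch under stated assumptions: `Role`, `ChatMessage`, and the `bool` return value are simplified stand-ins rather than the project's real types, and appending the sentinel to an empty list is a guess at the intended behaviour. The "Continue." text and the "any non-user message" rule come from the commit messages.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    System,
    User,
    Assistant,
    Tool,
}

/// Simplified stand-in for the project's chat message type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ChatMessage {
    pub role: Role,
    pub content: String,
}

/// Sentinel text quoted in the commit message.
pub const SENTINEL: &str = "Continue.";

/// Append a sentinel user message unless the conversation already ends with
/// a user-role message. Returns `true` when a sentinel was appended.
pub fn ensure_ends_with_user_message(messages: &mut Vec<ChatMessage>) -> bool {
    let ends_with_user = matches!(messages.last(), Some(m) if m.role == Role::User);
    if ends_with_user {
        return false; // no-op: the provider will see a user turn last
    }
    // Covers assistant, tool, and system tails, plus the empty list (assumed).
    messages.push(ChatMessage {
        role: Role::User,
        content: SENTINEL.to_string(),
    });
    true
}
```

Checking for "not user" rather than "is assistant" is what lets the same helper satisfy both providers mentioned above, since tool-result tails are also replaced by a user turn.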

* ci: fix fmt and tar advisory

---------

Co-authored-by: Jacob Lasky <jacob.lasky@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Illia Polosukhin <ilblackdragon@gmail.com>
Co-authored-by: firat.sertgoz <f@nuff.tech>
… all surfaces (#1513)

* feat(agent): thread per-tool reasoning from LLM through to REPL, HTTP, SSE, and DB

Add end-to-end agent reasoning summaries so users can see *why* the
agent chose specific tools, not just what it did.

- Add `reasoning: Option<String>` to `ToolCall` (all providers)
- Populate from LLM response content in `Reasoning::respond_with_tools`
  and `select_tools`, with per-tool override when providers supply it
- Extend `Turn` with `narrative` and `TurnToolCall` with `rationale` +
  `tool_call_id` for identity-based result matching
- Persist reasoning in DB via existing tool_calls JSON (no migration)
- Add `StatusUpdate::ReasoningUpdate` and `SseEvent::ReasoningUpdate` +
  `SseEvent::JobReasoning` for real-time streaming
- Emit reasoning events in both chat dispatcher and worker job path
- Add `/reasoning [N|all]` command for inspecting turn reasoning
- Surface `narrative` and `rationale` in HTTP `/api/chat/history`

Based on the design from #361 and #456, reconstructed cleanly with
Option<String> to minimize blast radius (vs mandatory String that broke
compilation in #456).

Closes #456

Co-Authored-By: panosAthDBX <47406510+panosAthDBX@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
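
The optional `reasoning` field with a per-tool override can be sketched as below. The `ToolCall` struct here is reduced to three fields, and `apply_shared_reasoning` is a hypothetical helper name; the actual population logic lives in `respond_with_tools` and `select_tools` and is not reproduced here.

```rust
/// Reduced stand-in for the provider-agnostic tool call type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToolCall {
    pub id: String,
    pub name: String,
    /// Optional so existing construction sites can simply pass `None`.
    pub reasoning: Option<String>,
}

/// Fill in missing per-tool reasoning from the shared response text.
/// Reasoning supplied by the provider for a specific tool is left untouched.
pub fn apply_shared_reasoning(calls: &mut [ToolCall], shared: Option<&str>) {
    // Treat whitespace-only shared text as absent.
    let shared = shared.map(str::trim).filter(|s| !s.is_empty());
    for call in calls.iter_mut() {
        let has_own = call
            .reasoning
            .as_deref()
            .map_or(false, |r| !r.trim().is_empty());
        if !has_own {
            call.reasoning = shared.map(str::to_string);
        }
    }
}
```

Using `Option<String>` instead of a mandatory `String` keeps the change additive, which matches the stated goal of minimizing blast radius.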

* fix: address PR review feedback from Gemini and Copilot

- Fix `_ => Ok(None)` in agent_loop.rs to avoid accidental shutdown
- Fix fallback in record_tool_result_for/record_tool_error_for to use
  first pending call instead of last_mut (parallel execution safety)
- Include per-tool decisions in WASM channel reasoning messages
- Apply truncate_at_tool_tags + clean_response to shared_reasoning in
  select_tools (parity with respond_with_tools)
- Persist turn-level narrative to DB in tool_calls JSON wrapper
- Parse both old (array) and new (object) tool_calls formats in
  build_turns_from_db_messages for backward compatibility
- Populate reasoning from action.reasoning in execute_plan ToolCalls

[skip-regression-check]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
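
The identity-based matching with a "first pending" fallback could be sketched as follows. The struct layouts and the `bool` return value are assumptions for illustration; only the method name and the matching rule (by `tool_call_id` first, otherwise the first call still lacking a result) are taken from the commit messages.

```rust
/// Reduced stand-in for a recorded tool call within a turn.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TurnToolCall {
    pub tool_call_id: Option<String>,
    pub name: String,
    pub result: Option<String>,
}

#[derive(Debug, Default)]
pub struct Turn {
    pub tool_calls: Vec<TurnToolCall>,
}

impl Turn {
    /// Record `result` against the call with a matching id. If no id matches,
    /// fall back to the first call that is still pending. Returns `false`
    /// when the result had to be dropped because nothing matched.
    pub fn record_tool_result_for(&mut self, tool_call_id: &str, result: &str) -> bool {
        let by_id = self
            .tool_calls
            .iter()
            .position(|c| c.tool_call_id.as_deref() == Some(tool_call_id));
        // First pending, not last: with parallel execution the last entry
        // is not necessarily the call that just finished.
        let target = by_id.or_else(|| self.tool_calls.iter().position(|c| c.result.is_none()));
        match target {
            Some(i) => {
                self.tool_calls[i].result = Some(result.to_string());
                true
            }
            None => false,
        }
    }
}
```

A caller could log a warning whenever this returns `false`, which is the situation a later commit in this batch addresses with `tracing::warn`.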

* fix: address second round of review comments + merge fixes

- Add reasoning: None to new github_copilot.rs ToolCall sites (from staging merge)
- Run cargo fmt on 4 files with formatting diffs
- Truncate narrative to 1000 chars before DB persistence
- Clone turn data and drop session lock in /reasoning command
- Extract ToolDecisionDto::from_json_array shared helper (deduplicate
  worker/job.rs and orchestrator/api.rs)
- Add unit tests for wrapped tool_calls JSON format with narrative

[skip-regression-check]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address third round of review comments (Copilot + serrrfirat)

- Reword ToolCall.reasoning docstring to reflect provider-supplied or
  fallback contract
- Sanitize narrative through SafetyLayer before storage/emission
- Clean per-tool reasoning via truncate_at_tool_tags + clean_response
  in select_tools (parity with shared reasoning)
- Convert 4 approval-path recording sites in thread_ops.rs to
  identity-based record_tool_result_for/record_tool_error_for
- Preserve tool_call_id and reasoning through restore_from_messages
- Fix has_result/has_error to reject JSON null values
- Truncate tool_call_id to 128 chars before DB persistence
- Add 4 unit tests for record_tool_result_for/error_for edge cases

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
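
The length caps mentioned in these review rounds (1000 chars for the narrative, 128 for `tool_call_id`) need care in Rust, because slicing a `&str` at an arbitrary byte offset can panic in the middle of a multi-byte character. The helper below is a hypothetical sketch, and whether the project counts characters or bytes is not stated in the source; this version counts `char`s.

```rust
/// Limits quoted in the commit messages; the constant names are made up here.
pub const MAX_NARRATIVE_CHARS: usize = 1000;
pub const MAX_TOOL_CALL_ID_CHARS: usize = 128;

/// Return at most `max_chars` characters of `input`, always cutting on a
/// character boundary so the slice cannot panic on multi-byte text.
pub fn truncate_chars(input: &str, max_chars: usize) -> &str {
    // `nth(max_chars)` finds the byte offset of the first character to drop.
    match input.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &input[..byte_idx],
        None => input, // already short enough
    }
}
```

For example, `truncate_chars(narrative, MAX_NARRATIVE_CHARS)` would be applied just before the value is persisted.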

* fix: address zmanian review — sanitize JobDelegate reasoning + warn on dropped results

- Sanitize narrative and per-tool rationale through SafetyLayer in
  JobDelegate reasoning events (parity with ChatDelegate)
- Add tracing::warn when record_tool_result_for/error_for drops a
  result because no matching or pending tool call exists
- Add 3 unit tests for reasoning normalization (thinking tags,
  tool tags, empty-after-cleaning)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address 4 remaining unreplied review comments

- Clean per-tool reasoning in respond_with_tools via truncate_at_tool_tags
  + clean_response (parity with select_tools)
- Handle wrapped JSON format in rebuild_chat_messages_from_db so cold
  hydration works after persist_tool_calls format change
- Update persist_tool_calls doc comment to describe new JSON shape
- Sanitize per-tool rationale through SafetyLayer in ChatDelegate before
  emission and storage (parity with JobDelegate)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address zmanian review round 2

- Add tracing::debug on fallback-to-pending path in record_tool_result_for
  and record_tool_error_for (item 1)
- Add comment explaining why /reasoning is special-cased in agent_loop.rs
  (item 4)
- Items 2 (narrative persistence), 3 (rationale sanitization), and 5
  (catch-all fix) were already addressed in prior commits

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: panosAthDBX <47406510+panosAthDBX@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Fix REPL single-message hang and cap CI test duration

* Fix Clippy nested-if lint in REPL startup

* Fix single-message approval flow

* Handle empty single-message REPL exits

* Wait for one-shot event routines before exit
* Fix REPL single-message hang and cap CI test duration

* Fix Clippy nested-if lint in REPL startup

* Fix single-message approval flow

* Handle empty single-message REPL exits

* Wait for one-shot event routines before exit

* Fix MCP lifecycle trace user scope
* Fix REPL single-message hang and cap CI test duration

* Fix Clippy nested-if lint in REPL startup

* Fix single-message approval flow

* Handle empty single-message REPL exits

* Wait for one-shot event routines before exit

* Fix MCP lifecycle trace user scope

* Normalize cron schedules on routine create
…3131

chore: promote staging to staging-promote/ab0ad948-23563320113 (2026-03-25 21:37 UTC)
…0113

chore: promote staging to staging-promote/c949521d-23562109203 (2026-03-25 20:47 UTC)
…9203

chore: promote staging to staging-promote/0341fcc9-23558273569 (2026-03-25 20:19 UTC)
…3569

chore: promote staging to staging-promote/6daa2f15-23538193544 (2026-03-25 18:47 UTC)
…3544

chore: promote staging to staging-promote/82822d7b-23516534944 (2026-03-25 12:01 UTC)
@henrypark133 henrypark133 merged commit 492d9d2 into staging-promote/59014516-23505370929 Mar 25, 2026
12 of 13 checks passed
@henrypark133 henrypark133 deleted the staging-promote/82822d7b-23516534944 branch March 25, 2026 22:13
@github-actions github-actions bot added the scope: channel Channel infrastructure label Mar 25, 2026
@github-actions bot added the following labels on Mar 25, 2026:

  • `scope: channel/wasm`: WASM channel runtime
  • `scope: tool`: Tool infrastructure
  • `scope: tool/builtin`: Built-in tools
  • `scope: llm`: LLM integration
  • `scope: workspace`: Persistent memory / workspace
  • `scope: orchestrator`: Container orchestrator
  • `scope: worker`: Container worker
  • `scope: ci`: CI/CD workflows
  • `scope: dependencies`: Dependency updates

bkutasi pushed a commit to bkutasi/ironclaw that referenced this pull request Mar 28, 2026
…3516534944

chore: promote staging to staging-promote/59014516-23505370929 (2026-03-24 23:13 UTC)
drchirag1991 pushed a commit to drchirag1991/ironclaw that referenced this pull request Apr 8, 2026
…3516534944

chore: promote staging to staging-promote/cf800da6-23505370929 (2026-03-24 23:13 UTC)

Labels

  • `contributor: core`: 20+ merged PRs
  • `risk: medium`: Business logic, config, or moderate-risk modules
  • `scope: agent`: Agent core (agent loop, router, scheduler)
  • `scope: channel/cli`: TUI / CLI channel
  • `scope: channel/wasm`: WASM channel runtime
  • `scope: channel/web`: Web gateway channel
  • `scope: channel`: Channel infrastructure
  • `scope: ci`: CI/CD workflows
  • `scope: dependencies`: Dependency updates
  • `scope: docs`: Documentation
  • `scope: extensions`: Extension management
  • `scope: llm`: LLM integration
  • `scope: orchestrator`: Container orchestrator
  • `scope: tool/builtin`: Built-in tools
  • `scope: tool/wasm`: WASM tool sandbox
  • `scope: tool`: Tool infrastructure
  • `scope: worker`: Container worker
  • `scope: workspace`: Persistent memory / workspace
  • `size: XL`: 500+ changed lines
  • `staging-promotion`


4 participants