Skip to content

fix: restore issue-2402 v2 gate resume and action alias consistency#2458

Merged
serrrfirat merged 7 commits intostagingfrom
sisyphus/issue-2402-staging
Apr 16, 2026
Merged

fix: restore issue-2402 v2 gate resume and action alias consistency#2458
serrrfirat merged 7 commits intostagingfrom
sisyphus/issue-2402-staging

Conversation

@serrrfirat
Copy link
Copy Markdown
Collaborator

Summary

  • add a staging regression for auth/external gate resumes that proved v2 could lose ActionResult call pairing when pending.call_id was empty or stale, then fix the resume branches to resolve or synthesize the correct correlator before resuming the thread
  • add a staging regression for hyphen/underscore action aliases in the real structured executor path, then make granted-action matching consistent so lease preflight, policy, and consumption agree on aliased action names
  • validate the fixes with targeted staging regressions, adjacent router/capability tests, and cargo check -p ironclaw_engine -p ironclaw

Validation

  • cargo test -p ironclaw_engine alias_normalization_stays_consistent_between_preflight_and_consume
  • cargo test -p ironclaw covers_action
  • cargo test -p ironclaw resolve_gate_repairs_call_id_for_resume_output_auth_resume
  • cargo test -p ironclaw resolved_call_id_
  • cargo test -p ironclaw handle_with_engine_reemits_approval_status_for_pending_gate
  • cargo check -p ironclaw_engine -p ironclaw

serrrfirat and others added 2 commits April 14, 2026 15:20
Keep lease preflight, policy, and consumption consistent for hyphen/underscore action aliases so installed tools do not fail mid-turn after being allowed.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Resolve or synthesize the original action call id when resuming auth or external callback gates so resumed ActionResult messages remain correctly paired with the waiting assistant call.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
@github-actions github-actions Bot added size: L 200-499 changed lines risk: low Changes to docs, tests, or low-risk modules contributor: core 20+ merged PRs labels Apr 14, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces normalization for action name aliases, allowing both hyphens and underscores to be used interchangeably when checking granted actions. It also refactors the resolution of call IDs for pending gates into a shared helper function, ensuring that historical call IDs are correctly recovered or synthesized when resuming threads. This fix prevents correlation issues for the LLM when processing action results after a gate resolution. New tests were added to verify alias normalization and call ID repair logic. I have no feedback to provide.

Add a higher-fidelity v2 gate integration regression that proves an install auth resume can flow directly into an aliased follow-up tool call and still complete the thread instead of stalling.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
@github-actions github-actions Bot added size: XL 500+ changed lines and removed size: L 200-499 changed lines labels Apr 14, 2026
@serrrfirat serrrfirat linked an issue Apr 14, 2026 that may be closed by this pull request
3 tasks
Resolve structured preflight action definitions with the same hyphen/underscore alias semantics as lease matching so aliased calls cannot bypass approval or deny policies.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
henrypark133
henrypark133 previously approved these changes Apr 14, 2026
Copy link
Copy Markdown
Collaborator

@henrypark133 henrypark133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Gate resume + action alias consistency (Risk: Medium)

Clean fix that correctly addresses the gate resume call_id corruption and action name alias inconsistency.

Positives:

  • action_name_matches() provides clean bidirectional hyphen/underscore normalization
  • GrantedActions::covers() updated with same normalization — transitively fixes find_and_consume
  • resolved_or_synthetic_call_id_for_pending_action fixes both resume_output paths in resolve_gate
  • Comprehensive test suite: 4 new tests including full install→auth→aliased-follow-up integration test
  • aliased_action_name_still_triggers_policy_approval verifies policy enforcement isn't bypassed

Convention notes:

  • The new helper loads the thread from store for call_id resolution — minor extra DB read but necessary since the resume_output paths don't have the thread pre-loaded
  • Test structure follows the "test through the caller" pattern well

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added scope: agent Agent core (agent loop, router, scheduler) risk: medium Business logic, config, or moderate-risk modules and removed risk: low Changes to docs, tests, or low-risk modules labels Apr 14, 2026
Copy link
Copy Markdown
Contributor

@standardtoaster standardtoaster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question on resolved_call_id_for_pending_action (router.rs:189-197): the resolved_ids set is built from thread.messages, but in production the orchestrator writes ActionResult messages to thread.internal_messages via sync_runtime_state. The test uses thread.add_message() which goes to messages, so it passes, but the production resolved_ids set may always be empty.

This probably doesn't matter in practice since new gates always have call_id set (line 185 short-circuits), but if the legacy fallback path is meant to work, it should scan internal_messages instead of messages. Is this a known limitation?

The `resolved_call_id_for_pending_action` legacy fallback scanned only
`thread.messages`, but in production the orchestrator writes ActionResult
and assistant messages to `thread.internal_messages` via
`sync_runtime_state`. This meant the `resolved_ids` set was always empty
and the fallback never found a match, silently falling through to a
synthetic id.

Scan both `messages` and `internal_messages` so the legacy path works
correctly for orchestrator-driven threads.

Addresses review feedback from @standardtoaster on #2458.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator Author

@serrrfirat serrrfirat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch @standardtoaster — you're right. resolved_call_id_for_pending_action was only scanning thread.messages, but in production the orchestrator writes ActionResult (and assistant-with-actions) messages to thread.internal_messages via sync_runtime_state. The resolved_ids set was always empty and the fallback never matched, silently falling through to a synthetic id.

Fixed in 893e319: the function now chains both messages and internal_messages iterators. Added a regression test (resolved_call_id_legacy_fallback_scans_internal_messages) that puts messages in internal_messages only — this test would have failed before the fix.

@github-actions github-actions Bot added size: L 200-499 changed lines and removed size: XL 500+ changed lines labels Apr 15, 2026
Copy link
Copy Markdown
Collaborator

@henrypark133 henrypark133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Gate resume + action alias consistency

No verified findings in the current diff. I rechecked the prior internal_messages concern and the fix now scans both visible and internal transcripts before synthesizing a fallback call id. The added engine-v2 coverage also exercises the auth-resume plus aliased follow-up path that previously risked hanging.

Copy link
Copy Markdown
Contributor

@standardtoaster standardtoaster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'll move persist_v2_tool_calls into the Completed match arm on #2452 so it doesn't fire during GatePaused.

@serrrfirat serrrfirat merged commit 853b6a5 into staging Apr 16, 2026
15 checks passed
@serrrfirat serrrfirat deleted the sisyphus/issue-2402-staging branch April 16, 2026 12:11
@henrypark133 henrypark133 mentioned this pull request Apr 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contributor: core 20+ merged PRs risk: medium Business logic, config, or moderate-risk modules scope: agent Agent core (agent loop, router, scheduler) size: L 200-499 changed lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[QA] Bot enters infinite "Calling LLM" loop after tool operations

4 participants