fix: restore issue-2402 v2 gate resume and action alias consistency#2458
fix: restore issue-2402 v2 gate resume and action alias consistency#2458serrrfirat merged 7 commits intostagingfrom
Conversation
Keep lease preflight, policy, and consumption consistent for hyphen/underscore action aliases so installed tools do not fail mid-turn after being allowed. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Resolve or synthesize the original action call id when resuming auth or external callback gates so resumed ActionResult messages remain correctly paired with the waiting assistant call. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
There was a problem hiding this comment.
Code Review
This pull request introduces normalization for action name aliases, allowing both hyphens and underscores to be used interchangeably when checking granted actions. It also refactors the resolution of call IDs for pending gates into a shared helper function, ensuring that historical call IDs are correctly recovered or synthesized when resuming threads. This fix prevents correlation issues for the LLM when processing action results after a gate resolution. New tests were added to verify alias normalization and call ID repair logic. I have no feedback to provide.
Add a higher-fidelity v2 gate integration regression that proves an install auth resume can flow directly into an aliased follow-up tool call and still complete the thread instead of stalling. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Resolve structured preflight action definitions with the same hyphen/underscore alias semantics as lease matching so aliased calls cannot bypass approval or deny policies. Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
henrypark133
left a comment
There was a problem hiding this comment.
Review: Gate resume + action alias consistency (Risk: Medium)
Clean fix that correctly addresses the gate resume call_id corruption and action name alias inconsistency.
Positives:
action_name_matches()provides clean bidirectional hyphen/underscore normalizationGrantedActions::covers()updated with same normalization — transitively fixesfind_and_consumeresolved_or_synthetic_call_id_for_pending_actionfixes bothresume_outputpaths inresolve_gate- Comprehensive test suite: 4 new tests including full install→auth→aliased-follow-up integration test
aliased_action_name_still_triggers_policy_approvalverifies policy enforcement isn't bypassed
Convention notes:
- The new helper loads the thread from store for call_id resolution — minor extra DB read but necessary since the resume_output paths don't have the thread pre-loaded
- Test structure follows the "test through the caller" pattern well
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
standardtoaster
left a comment
There was a problem hiding this comment.
One question on resolved_call_id_for_pending_action (router.rs:189-197): the resolved_ids set is built from thread.messages, but in production the orchestrator writes ActionResult messages to thread.internal_messages via sync_runtime_state. The test uses thread.add_message() which goes to messages, so it passes, but the production resolved_ids set may always be empty.
This probably doesn't matter in practice since new gates always have call_id set (line 185 short-circuits), but if the legacy fallback path is meant to work, it should scan internal_messages instead of messages. Is this a known limitation?
The `resolved_call_id_for_pending_action` legacy fallback scanned only `thread.messages`, but in production the orchestrator writes ActionResult and assistant messages to `thread.internal_messages` via `sync_runtime_state`. This meant the `resolved_ids` set was always empty and the fallback never found a match, silently falling through to a synthetic id. Scan both `messages` and `internal_messages` so the legacy path works correctly for orchestrator-driven threads. Addresses review feedback from @standardtoaster on #2458. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
serrrfirat
left a comment
There was a problem hiding this comment.
Good catch @standardtoaster — you're right. resolved_call_id_for_pending_action was only scanning thread.messages, but in production the orchestrator writes ActionResult (and assistant-with-actions) messages to thread.internal_messages via sync_runtime_state. The resolved_ids set was always empty and the fallback never matched, silently falling through to a synthetic id.
Fixed in 893e319: the function now chains both messages and internal_messages iterators. Added a regression test (resolved_call_id_legacy_fallback_scans_internal_messages) that puts messages in internal_messages only — this test would have failed before the fix.
henrypark133
left a comment
There was a problem hiding this comment.
Review: Gate resume + action alias consistency
No verified findings in the current diff. I rechecked the prior internal_messages concern and the fix now scans both visible and internal transcripts before synthesizing a fallback call id. The added engine-v2 coverage also exercises the auth-resume plus aliased follow-up path that previously risked hanging.
standardtoaster
left a comment
There was a problem hiding this comment.
LGTM. I'll move persist_v2_tool_calls into the Completed match arm on #2452 so it doesn't fire during GatePaused.
Summary
ActionResultcall pairing whenpending.call_idwas empty or stale, then fix the resume branches to resolve or synthesize the correct correlator before resuming the threadcargo check -p ironclaw_engine -p ironclawValidation