fix(engine): surface action errors to LLM with [ACTION FAILED] prefix#2326
Merged
serrrfirat merged 2 commits intostagingfrom Apr 15, 2026
Merged
fix(engine): surface action errors to LLM with [ACTION FAILED] prefix#2326serrrfirat merged 2 commits intostagingfrom
serrrfirat merged 2 commits intostagingfrom
Conversation
Contributor
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
256d28f to
c37eda5
Compare
When tools fail (e.g. "No lease for action"), the orchestrator appended the raw error JSON as an ActionResult message. The LLM frequently ignored these errors and claimed success — a trust/hallucination issue. Prefix failed action results with "[ACTION FAILED] <tool>:" so the LLM receives an unmissable signal that the tool call did not succeed. The Rust executor already sets `is_error: true` on lease and policy failures. Closes #2279 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
c37eda5 to
e0c163a
Compare
henrypark133
previously approved these changes
Apr 14, 2026
Collaborator
henrypark133
left a comment
There was a problem hiding this comment.
Review: Surface action errors to LLM with [ACTION FAILED] prefix (Risk: Low)
Clean fix. When is_error is true on an ActionResult, the output is now prefixed with [ACTION FAILED] action_name: so the LLM can clearly distinguish failures from successful outputs in its next turn.
Positives:
- Python orchestrator change is minimal (2 lines) and targeted
- Rust loop_engine test
failed_action_result_is_explicit_in_next_llm_messageverifies the prefix appears in the message content - TODO comment for consecutive error tracking is well-scoped with issue reference (#2325)
LGTM.
Resolve conflict in orchestrator/default.py: combine PR's action_name in error prefix with staging's batch error/success counting. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
henrypark133
approved these changes
Apr 15, 2026
Collaborator
henrypark133
left a comment
There was a problem hiding this comment.
Review: Surface action errors to LLM with [ACTION FAILED] prefix
Clean fix. The Python-side prefixing change is minimal, and the new loop_engine test verifies the next LLM turn now sees an explicit failure marker including the action name.
Residual risk is low: I only spot-checked the engine path and ran cargo test -p ironclaw_engine failed_action_result_is_explicit_in_next_llm_message locally.
This was referenced Apr 15, 2026
This was referenced Apr 16, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
[ACTION FAILED] <tool>:in the Python orchestrator so the LLM can't ignore tool failures and falsely claim successTODO(#2325)for consecutive action error counting, deferred pending unified progress-tracking designContext
When tools fail (e.g., "No lease for action"), the orchestrator appended the raw error JSON as a generic
ActionResultmessage — indistinguishable from success. The LLM frequently ignored these errors and told the user "Successfully sent!" despite every tool call failing.The Rust executor already sets
is_error: trueon lease and policy failures. This change reads that flag and prefixes the message content so the error is unmissable.Error counting (failing the thread after repeated action errors) is tracked in #2325 and depends on the progress-tracking design discussed in #2240.
Closes #2279
Test plan
cargo test -p ironclaw_engine— 281 tests pass🤖 Generated with Claude Code