feat(advisor): advisor tool orchestration loop for non-Anthropic providers#25579
ishaan-berri merged 15 commits into litellm_ishaan_april11 from
Conversation
…R_TOOL_DESCRIPTION constants
…g exception: When the advisor loop hits max_uses, inject a tool_result error so the executor sees the cap and continues without further advice — matches Anthropic server-side behaviour (error_code: max_uses_exceeded).
Greptile Summary

This PR adds an orchestration loop for the advisor tool on non-Anthropic providers.

Confidence Score: 5/5 — Safe to merge; only one minor dead-code cleanup remains. All prior P0/P1 concerns are resolved. The one remaining finding is a dead helper function that has no effect on correctness. All 11 unit tests and integration tests are mocked and pass; the implementation logic is sound. In advisor.py, the unused _inject_max_uses_error function should be removed.
| Filename | Overview |
|---|---|
| litellm/llms/anthropic/experimental_pass_through/messages/interceptors/advisor.py | Core orchestration loop — AdvisorMaxIterationsError is defined and tests pass, but _inject_max_uses_error is dead code never called anywhere in the loop. |
| litellm/llms/anthropic/experimental_pass_through/messages/handler.py | Interceptor dispatch injected cleanly before the normal backend path; api_key/api_base forwarded explicitly as required. |
| litellm/llms/anthropic/common_utils.py | strip_advisor_blocks_from_messages gains replace_with_text mode; backward-compatible default, shallow copy in handle() ensures original messages are not mutated. |
| litellm/constants.py | Adds ADVISOR_NATIVE_PROVIDERS, ADVISOR_MAX_USES, ADVISOR_TOOL_DESCRIPTION constants; clean additions with no side-effects. |
| tests/test_litellm/llms/anthropic/messages/test_advisor_orchestration.py | 11 unit tests; all paths mocked — no real network calls. AdvisorMaxIterationsError import resolved. Covers can_handle, loop, max_uses, streaming, history stripping, tool translation. |
| tests/test_litellm/llms/anthropic/experimental_pass_through/messages/test_advisor_integration.py | Integration tests exercise the full anthropic_messages() dispatch path with mocked LLM sub-calls; covers native bypass and max_uses propagation. |
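The replace_with_text mode and copy-on-write behaviour described for strip_advisor_blocks_from_messages can be illustrated with a standalone sketch (a hypothetical reimplementation for illustration, not the PR's actual helper):

```python
def strip_advisor_blocks(messages, replace_with_text=False):
    """Drop advisor_tool_result blocks; optionally replace them with plain text.

    Returns new message dicts (shallow copies) so the caller's list is not mutated,
    mirroring the behaviour described for handle() above.
    """
    cleaned = []
    for msg in messages:
        content = msg.get("content")
        if not isinstance(content, list):
            cleaned.append(msg)
            continue
        new_content = []
        for block in content:
            if block.get("type") == "advisor_tool_result":
                if replace_with_text:
                    # keep the advice as an ordinary text block
                    new_content.append({"type": "text", "text": str(block.get("content", ""))})
                # default (backward-compatible) mode: drop the block entirely
            else:
                new_content.append(block)
        cleaned.append({**msg, "content": new_content})
    return cleaned

msgs = [{"role": "user", "content": [
    {"type": "text", "text": "hi"},
    {"type": "advisor_tool_result", "content": "advice"},
]}]
print(len(strip_advisor_blocks(msgs)[0]["content"]))  # 1
```

The shallow copy via `{**msg, ...}` is what keeps the original request's messages untouched between loop iterations.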
Sequence Diagram

```mermaid
sequenceDiagram
    participant Caller
    participant anthropic_messages
    participant AdvisorOrchestrationHandler
    participant Executor as Executor (e.g. openai/gpt-4.1-mini)
    participant Advisor as Advisor (e.g. claude-opus-4-6)
    Caller->>anthropic_messages: request(tools=[advisor_20260301], model=openai/...)
    anthropic_messages->>AdvisorOrchestrationHandler: can_handle? → True
    anthropic_messages->>AdvisorOrchestrationHandler: handle(...)
    loop Until end_turn or max_uses exceeded
        AdvisorOrchestrationHandler->>Executor: call(synthetic_advisor_tool)
        alt stop_reason = tool_use (advisor called)
            Executor-->>AdvisorOrchestrationHandler: tool_use(name=advisor, question=...)
            AdvisorOrchestrationHandler->>Advisor: sub-call(no tools, question)
            Advisor-->>AdvisorOrchestrationHandler: advice text
            AdvisorOrchestrationHandler->>AdvisorOrchestrationHandler: inject tool_result into messages
        else stop_reason = end_turn
            Executor-->>AdvisorOrchestrationHandler: final text response
            AdvisorOrchestrationHandler-->>Caller: response (or FakeStream if stream=True)
        end
    end
    Note over AdvisorOrchestrationHandler: If iteration > max_uses → raise AdvisorMaxIterationsError
```
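The loop in the diagram can be sketched in plain Python. This is a standalone mock, not the PR's API: `run_advisor_loop`, `fake_executor`, and `fake_advisor` are illustrative names, and the message-injection shape is simplified.

```python
ADVISOR_MAX_USES = 5  # default cap, mirroring litellm/constants.py

class AdvisorMaxIterationsError(Exception):
    """Raised when the advisor orchestration loop exceeds max_uses."""

def run_advisor_loop(executor, advisor, messages, max_uses=ADVISOR_MAX_USES):
    iteration = 0
    while True:
        response = executor(messages)
        if response["stop_reason"] == "end_turn":
            return response  # final answer, hand back to caller
        # stop_reason == "tool_use": the executor wants advice
        iteration += 1
        if iteration > max_uses:
            raise AdvisorMaxIterationsError(
                f"advisor called more than max_uses={max_uses} times"
            )
        advice = advisor(response["question"])      # sub-call to the advisor model
        messages = messages + [{"role": "user", "content": advice}]  # inject result, re-call

# two fakes: the executor asks for advice once, then finishes
def fake_executor(messages):
    if len(messages) == 1:
        return {"stop_reason": "tool_use", "question": "How do I size the cache?"}
    return {"stop_reason": "end_turn", "text": "done"}

def fake_advisor(question):
    return f"advice on: {question}"

result = run_advisor_loop(fake_executor, fake_advisor,
                          [{"role": "user", "content": "Implement an LRU cache"}])
print(result["stop_reason"])  # end_turn
```

Note the sketch raises once the cap is exceeded rather than continuing, which is the exit condition the review below argues for.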
Reviews (4): Last reviewed commit: "docs(advisor): move supported providers ..."
```python
iteration += 1
if iteration > max_uses:
    # Per Anthropic spec: inject max_uses_exceeded error result so the
    # executor sees the cap and continues without further advice.
    current_messages = _inject_max_uses_error(
        current_messages, executor_response, advisor_use_block
    )
    continue
```
Infinite loop + missing `AdvisorMaxIterationsError` class
After iteration > max_uses, the code injects the error message and continues — but there is no exit condition. If the executor keeps calling the advisor tool on the next iteration, iteration is incremented again, iteration > max_uses is still true, another error is injected, and the loop runs forever. Additionally, the test test_loop_max_uses_raises imports AdvisorMaxIterationsError from this module, but the class is never defined here (or anywhere in the codebase), so that test will fail with ImportError at runtime.
The fix is to define the exception class and raise it instead of continue-ing after the cap is exceeded:
```python
class AdvisorMaxIterationsError(Exception):
    """Raised when the advisor orchestration loop exceeds max_uses."""

iteration += 1
if iteration > max_uses:
    raise AdvisorMaxIterationsError(
        f"Advisor tool called more than max_uses={max_uses} times. "
        "Increase max_uses in the advisor tool definition to allow more iterations."
    )
```

```python
raise ValueError(
    "advisor tool definition must include a 'model' field specifying the advisor model"
)
max_uses: int = advisor_tool.get("max_uses") or ADVISOR_MAX_USES
```
advisor_tool.get("max_uses") or ADVISOR_MAX_USES evaluates 0 as falsy, so a caller setting "max_uses": 0 to disable advisor calls gets the default limit (5) instead of an immediate cap. Use an explicit None check to preserve intent:
```python
# before
max_uses: int = advisor_tool.get("max_uses") or ADVISOR_MAX_USES
# after
max_uses: int = advisor_tool.get("max_uses") if advisor_tool.get("max_uses") is not None else ADVISOR_MAX_USES
```
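The falsy-zero pitfall the suggestion fixes can be demonstrated in isolation (a standalone sketch; `resolve_max_uses_buggy`/`resolve_max_uses_fixed` are illustrative names, not the PR's functions):

```python
ADVISOR_MAX_USES = 5  # default, mirroring litellm/constants.py

def resolve_max_uses_buggy(advisor_tool: dict) -> int:
    # `or` treats 0 as falsy, silently restoring the default cap
    return advisor_tool.get("max_uses") or ADVISOR_MAX_USES

def resolve_max_uses_fixed(advisor_tool: dict) -> int:
    # explicit None check preserves an intentional "max_uses": 0
    raw = advisor_tool.get("max_uses")
    return raw if raw is not None else ADVISOR_MAX_USES

print(resolve_max_uses_buggy({"max_uses": 0}))  # 5 — caller's 0 is lost
print(resolve_max_uses_fixed({"max_uses": 0}))  # 0 — intent preserved
```

Both variants agree when the key is absent or set to a positive integer; they only diverge on 0 (and other falsy values such as an explicit None).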
```python
from litellm.llms.anthropic.experimental_pass_through.messages.interceptors.advisor import (
    AdvisorMaxIterationsError,
    AdvisorOrchestrationHandler,
)
```
Import of non-existent `AdvisorMaxIterationsError` will fail
AdvisorMaxIterationsError is not defined anywhere in advisor.py or any other module in this PR. This import will raise ImportError at test runtime, making test_loop_max_uses_raises always fail. Once the class is defined in advisor.py (see the companion comment on the loop), this import will work correctly.
@greptile review
…tive provider table
… litellm native loop
Relevant issues
Fixes #25516
Pre-Submission checklist

- Added testing in the `tests/test_litellm/` directory (adding at least 1 test is a hard requirement)
- Passes `make test-unit`
- CI (LiteLLM team)
Type
Changes
Implements the advisor tool loop for non-Anthropic providers. The `advisor_20260301` tool lets a fast executor model (Sonnet/Haiku) consult a high-intelligence advisor (Opus) mid-generation. Anthropic handles this server-side natively; for all other providers LiteLLM now runs the orchestration loop itself.

How it works
Advisor tool definition (caller sets this):
```json
{
  "type": "advisor_20260301",
  "name": "advisor",
  "model": "claude-opus-4-6",
  "max_uses": 5,
  "api_base": "optional-proxy-url",
  "api_key": "optional-key"
}
```

Live E2E results (gpt-4.1-mini executor + claude-opus-4-6 advisor)
Scenario 1 — Complex coding task (LRU cache)
Scenario 2 — Tricky concurrency (async bounded semaphore)
Scenario 3 — max_uses=1 cap enforced
Scenario 4 — Trivial question (advisor not called)
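Requests for scenarios like the ones above can be assembled from the advisor tool definition shown earlier. This is a payload sketch built from values quoted in this PR description, not a guaranteed client API:

```python
# Sketch of a request body; field names follow the advisor tool definition above.
request = {
    "model": "openai/gpt-4.1-mini",  # non-Anthropic executor → LiteLLM runs the loop itself
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Implement an LRU cache in Python"}],
    "tools": [{
        "type": "advisor_20260301",
        "name": "advisor",
        "model": "claude-opus-4-6",  # advisor sub-call target
        "max_uses": 1,               # Scenario 3: cap enforced after one consult
    }],
}

advisor_tool = request["tools"][0]
# the handler raises ValueError when 'model' is missing (see the review hunk above)
assert advisor_tool.get("model") is not None
print(advisor_tool["type"])  # advisor_20260301
```

With an Anthropic executor model the same payload would bypass the handler entirely, since Anthropic resolves the advisor tool server-side.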
Tests (11 unit tests, all passing)

- `can_handle` routing (anthropic skips, non-anthropic fires)
- `max_uses` cap — injects `max_uses_exceeded` error result, executor continues
- `FakeAnthropicMessagesStreamIterator` wraps final response
- `advisor_tool_result` blocks cleaned before re-send
- `advisor_20260301` → regular function tool
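The last item — translating `advisor_20260301` into a regular function tool for the executor — can be sketched as follows. The helper name, description text, and input schema are assumptions for illustration (the real description lives in the ADVISOR_TOOL_DESCRIPTION constant, and the diagram above shows the executor passing a `question` argument):

```python
def advisor_to_function_tool(advisor_tool: dict) -> dict:
    """Hypothetical sketch: build the synthetic function tool shown to the executor.

    The "model", "max_uses", "api_key"/"api_base" fields are orchestration
    metadata consumed by the loop, so they are not forwarded to the executor.
    """
    return {
        "name": advisor_tool.get("name", "advisor"),
        "description": "Consult a higher-intelligence advisor model.",  # stand-in for ADVISOR_TOOL_DESCRIPTION
        "input_schema": {
            "type": "object",
            "properties": {"question": {"type": "string"}},
            "required": ["question"],
        },
    }

tool = advisor_to_function_tool({
    "type": "advisor_20260301",
    "name": "advisor",
    "model": "claude-opus-4-6",
    "max_uses": 5,
})
print(tool["name"])  # advisor
```

The executor only ever sees this plain function tool; when it calls it, the loop routes the `question` to the advisor model and feeds the answer back as a `tool_result`.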