
feat(advisor): advisor tool orchestration loop for non-Anthropic providers#25579

Merged
ishaan-berri merged 15 commits into litellm_ishaan_april11 from feat/anthropic-advisor-tool on Apr 12, 2026

Conversation

@ishaan-berri (Contributor) commented Apr 12, 2026

Relevant issues

Fixes #25516

Pre-Submission checklist

  • I have added tests in the tests/test_litellm/ directory (adding at least 1 test is a hard requirement)
  • My PR passes all unit tests via make test-unit
  • My PR's scope is as isolated as possible: it only solves 1 specific problem

CI (LiteLLM team)

  • Branch creation CI run
    Link:
  • CI run for the last commit
    Link:
  • Merge / cherry-pick CI run
    Links:

Type

  • Bug fix
  • New feature

Changes

Implements the advisor tool loop for non-Anthropic providers. The advisor_20260301 tool lets a fast executor model (Sonnet/Haiku) consult a high-intelligence advisor (Opus) mid-generation. Anthropic handles this server-side natively; for all other providers LiteLLM now runs the orchestration loop itself.

How it works

User /messages request (OpenAI/Bedrock/Vertex/any non-Anthropic provider)
        │
        ▼
MessagesInterceptor registry
  └── AdvisorOrchestrationHandler.can_handle()
        ├── tools contains advisor_20260301?  YES
        └── provider is non-native?           YES  ──► intercept
                                               NO   ──► pass through to provider
        │
        ▼  (intercept path)
Strip advisor_20260301 from tools,
replace with synthetic function tool
        │
        ▼
┌─────────────────────────────────────────────────────────┐
│  Orchestration loop                                     │
│                                                         │
│   ┌─────────────────────────────────────────┐          │
│   │  EXECUTOR CALL  (non-streaming)         │          │
│   │  model: e.g. openai/gpt-4.1-mini        │          │
│   │  tools: [synthetic advisor fn tool, ...]│          │
│   └─────────────────┬───────────────────────┘          │
│                     │                                   │
│          ┌──────────┴──────────┐                       │
│          │ stop_reason?        │                       │
│          │                     │                       │
│    tool_use(advisor)      end_turn / other             │
│          │                     │                       │
│          ▼                     ▼                       │
│   ┌─────────────┐      ◄── EXIT LOOP                  │
│   │ ADVISOR     │          return final response        │
│   │ SUB-CALL    │          (or wrap in FakeStream)      │
│   │ model:      │                                       │
│   │  opus-4-6   │                                       │
│   │ no tools    │                                       │
│   └──────┬──────┘                                       │
│          │ advice text                                  │
│          ▼                                              │
│   inject tool_result into messages                      │
│   (or max_uses_exceeded error if cap hit)               │
│          │                                              │
│          └──────────────────────► repeat               │
└─────────────────────────────────────────────────────────┘
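In plain Python, the loop above can be sketched roughly as follows. This is a minimal illustration with stubbed backend calls; `run_advisor_loop`, `call_executor`, and `call_advisor` are hypothetical names for this sketch, not LiteLLM's actual internals:

```python
# Minimal sketch of the orchestration loop. call_executor / call_advisor are
# hypothetical stand-ins for the real executor and advisor sub-calls.

def run_advisor_loop(messages, advisor_tool, call_executor, call_advisor):
    raw = advisor_tool.get("max_uses")
    max_uses = raw if raw is not None else 5
    uses = 0
    while True:
        response = call_executor(messages)
        if response["stop_reason"] != "tool_use":
            # end_turn / max_tokens / other: exit the loop, return final response
            return response
        tool_use = next(b for b in response["content"] if b["type"] == "tool_use")
        messages = messages + [{"role": "assistant", "content": response["content"]}]
        if uses >= max_uses:
            # Cap hit: inject an error tool_result instead of making a sub-call.
            result = ("Advisor unavailable: max_uses limit reached. "
                      "Continue without advisor.")
        else:
            uses += 1
            result = call_advisor(tool_use["input"]["question"])
        messages = messages + [{
            "role": "user",
            "content": [{"type": "tool_result",
                         "tool_use_id": tool_use["id"],
                         "content": result}],
        }]
```

Each iteration either returns the executor's final answer or feeds the advisor's advice (or the cap error) back as a tool_result and repeats.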

Advisor tool definition (caller sets this):

{
  "type": "advisor_20260301",
  "name": "advisor",
  "model": "claude-opus-4-6",
  "max_uses": 5,
  "api_base": "optional-proxy-url",
  "api_key":  "optional-key"
}
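Validation of such a definition might look like this. This is an illustrative helper, not the PR's code; note the explicit None check so that "max_uses": 0 disables the advisor instead of silently falling back to the default:

```python
DEFAULT_MAX_USES = 5  # assumed default; the PR defines ADVISOR_MAX_USES in litellm/constants.py

def parse_advisor_tool(tool: dict) -> tuple:
    """Validate an advisor_20260301 tool definition and return (model, max_uses)."""
    if tool.get("type") != "advisor_20260301":
        raise ValueError("not an advisor_20260301 tool")
    model = tool.get("model")
    if not model:
        raise ValueError(
            "advisor tool definition must include a 'model' field specifying the advisor model"
        )
    # Explicit None check: max_uses=0 must disable the advisor, not fall
    # back to the default limit.
    raw = tool.get("max_uses")
    max_uses = raw if raw is not None else DEFAULT_MAX_USES
    return model, max_uses
```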

Live E2E results (gpt-4.1-mini executor + claude-opus-4-6 advisor)

Scenario 1 — Complex coding task (LRU cache)

USER: Implement a Python LRU Cache class with O(1) get and put.
      Use the advisor tool before you start writing code.

[EXECUTOR → LiteLLM]
  stop_reason: tool_use  (1.0s)
  TOOL_USE advisor — "What is the best way to implement a Python LRU Cache
  class that supports O(1) get and put? Outline the data structures involved."

[ADVISOR SUB-CALL → claude-opus-4-6]  (15.2s)
  Advice: Use a hash map + doubly linked list with dummy head/tail nodes.
  dict gives O(1) lookup; linked list gives O(1) move-to-front and eviction.

[EXECUTOR → LiteLLM]  (7.0s)
  stop_reason: max_tokens
  TEXT: "I have gathered the recommended approach... Now I will implement..."
  → class Node with __slots__; class LRUCache with get/put in O(1)

FINAL: 23.1s total | 1 advisor call | 0 advisor blocks in output ✓

Scenario 2 — Tricky concurrency (async bounded semaphore)

USER: Write a thread-safe Python bounded semaphore supporting async context
      managers and a non-blocking try_acquire(). Ask the advisor first.

[EXECUTOR → LiteLLM]
  stop_reason: tool_use  (0.9s)
  TOOL_USE advisor — "How to implement a thread-safe Python bounded semaphore
  with async context managers and non-blocking try_acquire()?"

[ADVISOR SUB-CALL → claude-opus-4-6]  (16.8s)
  Advice: Use asyncio.Lock for internal state; threading.Lock for cross-thread
  safety; FIFO deque for waiters; loop.call_soon_threadsafe for thread→async bridge.

[EXECUTOR → LiteLLM]  (6.5s)
  stop_reason: max_tokens
  TEXT: "I have consulted an advisor... Here is the complete implementation:"
  → class AsyncBoundedSemaphore with acquire/release/__aenter__/__aexit__/try_acquire

FINAL: 24.3s total | 1 advisor call | 0 advisor blocks in output ✓

Scenario 3 — max_uses=1 cap enforced

USER: Design a Python priority queue backed by a Fibonacci heap.
      Use the advisor as many times as you want.  (max_uses=1 set by caller)

[EXECUTOR → LiteLLM]
  stop_reason: tool_use  (0.8s)  → calls advisor (use 1/1)

[ADVISOR SUB-CALL → claude-opus-4-6]  (11.9s)
  Advice: Fibonacci heap with circular doubly-linked root list;
  O(1) insert/find-min, O(log n) amortized extract-min, O(1) decrease-key.

[EXECUTOR → LiteLLM]
  stop_reason: tool_use  (0.6s)  → tries to call advisor again

[LiteLLM injects error tool_result — no sub-call made]
  content: "Advisor unavailable: max_uses limit reached. Continue without advisor."

[EXECUTOR → LiteLLM]  (6.1s)
  stop_reason: max_tokens
  TEXT: "Here is the continuation... class FibonacciNode / class FibonacciHeap..."

FINAL: 19.4s total | 1 advisor call (cap respected) | 0 advisor blocks ✓

Scenario 4 — Trivial question (advisor not called)

USER: What does list.append() do in Python? One sentence.

[EXECUTOR → LiteLLM]
  stop_reason: end_turn  (0.5s)
  TEXT: "list.append() in Python adds a single element to the end of a list."

FINAL: 0.5s total | 0 advisor calls | clean passthrough ✓

Tests (11 unit tests, all passing)

  • can_handle routing (anthropic skips, non-anthropic fires)
  • Orchestration loop with mocked backends (3 iterations, advisor injected correctly)
  • max_uses cap — injects max_uses_exceeded error result, executor continues
  • Streaming — FakeAnthropicMessagesStreamIterator wraps final response
  • History stripping — prior advisor_tool_result blocks cleaned before re-send
  • Synthetic tool translation — advisor_20260301 → regular function tool
========================= 11 passed in 13.34s =========================
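As a rough illustration of the synthetic-tool translation the tests cover (the field names and description below are assumptions for this sketch, not the PR's exact schema):

```python
def to_synthetic_function_tool(advisor_tool: dict) -> dict:
    """Translate an advisor_20260301 tool into a plain function tool the
    executor model understands. Description and schema here are illustrative
    assumptions, not the PR's literal definition."""
    return {
        "name": advisor_tool.get("name", "advisor"),
        "description": (
            "Consult a high-intelligence advisor model for guidance on hard "
            "problems. Provide a clear, self-contained question."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "question": {
                    "type": "string",
                    "description": "Question to ask the advisor model.",
                }
            },
            "required": ["question"],
        },
    }
```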


gitguardian bot commented Apr 12, 2026

✅ There are no secrets present in this pull request anymore.

If these secrets were true positives and are still valid, we highly recommend that you revoke them. While these secrets were previously flagged, we no longer have a reference to the specific commits where they were detected. Once a secret has been leaked into a git repository, you should consider it compromised, even if it was deleted immediately.



@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@codspeed-hq

codspeed-hq bot commented Apr 12, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing feat/anthropic-advisor-tool (dd87f3b) with main (f74d626)


@ishaan-berri force-pushed the feat/anthropic-advisor-tool branch from 6b397f8 to 742e2fe on April 12, 2026 00:48
@codecov

codecov bot commented Apr 12, 2026

Codecov Report

❌ Patch coverage is 91.60305% with 11 lines in your changes missing coverage. Please review.

File | Patch % | Missing lines
...ntal_pass_through/messages/interceptors/advisor.py | 89.65% | 9 ⚠️
litellm/llms/anthropic/common_utils.py | 92.30% | 2 ⚠️


…g exception

When the advisor loop hits max_uses, inject a tool_result error so the executor
sees the cap and continues without further advice — matches Anthropic server-side
behaviour (error_code: max_uses_exceeded).
@greptile-apps

greptile-apps bot commented Apr 12, 2026

Greptile Summary

This PR adds an orchestration loop for the advisor_20260301 tool on non-Anthropic providers (OpenAI, Bedrock, Vertex, etc.), intercepting the /messages path, translating the advisor tool to a regular function tool, and running an executor→advisor→executor loop until the model finishes or max_uses is exceeded. Previously flagged issues (AdvisorMaxIterationsError not defined, max_uses=0 falsy coercion) are fully resolved in this revision.

Confidence Score: 5/5

Safe to merge; only one minor dead-code cleanup remains.

All prior P0/P1 concerns are resolved. The one remaining finding is a dead helper function that has no effect on correctness. All 11 unit tests and integration tests are mocked and pass; the implementation logic is sound.

advisor.py — the unused _inject_max_uses_error function should be removed.

Important Files Changed

Filename | Overview
litellm/llms/anthropic/experimental_pass_through/messages/interceptors/advisor.py | Core orchestration loop — AdvisorMaxIterationsError is defined and tests pass, but _inject_max_uses_error is dead code never called anywhere in the loop.
litellm/llms/anthropic/experimental_pass_through/messages/handler.py | Interceptor dispatch injected cleanly before the normal backend path; api_key/api_base forwarded explicitly as required.
litellm/llms/anthropic/common_utils.py | strip_advisor_blocks_from_messages gains a replace_with_text mode; backward-compatible default, and the shallow copy in handle() ensures the original messages are not mutated.
litellm/constants.py | Adds ADVISOR_NATIVE_PROVIDERS, ADVISOR_MAX_USES, ADVISOR_TOOL_DESCRIPTION constants; clean additions with no side effects.
tests/test_litellm/llms/anthropic/messages/test_advisor_orchestration.py | 11 unit tests, all paths mocked — no real network calls. AdvisorMaxIterationsError import resolved. Covers can_handle, loop, max_uses, streaming, history stripping, tool translation.
tests/test_litellm/llms/anthropic/experimental_pass_through/messages/test_advisor_integration.py | Integration tests exercise the full anthropic_messages() dispatch path with mocked LLM sub-calls; covers native bypass and max_uses propagation.
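The history-stripping step mentioned for common_utils.py might be sketched like this (a hypothetical illustration; the PR's strip_advisor_blocks_from_messages may differ in block names and details):

```python
def strip_advisor_blocks(messages, replace_with_text=False):
    """Remove prior advisor tool_result blocks before re-sending history.
    Sketch only: the block type name 'advisor_tool_result' is an assumption."""
    cleaned = []
    for msg in messages:
        content = msg.get("content")
        if not isinstance(content, list):
            cleaned.append(msg)  # plain-string content: keep untouched
            continue
        new_content = []
        for block in content:
            if block.get("type") == "advisor_tool_result":
                if replace_with_text:
                    # replace_with_text mode: keep the advice as a plain text block
                    new_content.append(
                        {"type": "text", "text": str(block.get("content", ""))}
                    )
                continue  # default mode: drop the advisor block entirely
            new_content.append(block)
        if new_content:
            cleaned.append({**msg, "content": new_content})
    return cleaned
```

The original message list is never mutated; each cleaned message is a shallow copy, mirroring the non-mutation guarantee noted in the review.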

Sequence Diagram

sequenceDiagram
    participant Caller
    participant anthropic_messages
    participant AdvisorOrchestrationHandler
    participant Executor as Executor (e.g. openai/gpt-4.1-mini)
    participant Advisor as Advisor (e.g. claude-opus-4-6)

    Caller->>anthropic_messages: request(tools=[advisor_20260301], model=openai/...)
    anthropic_messages->>AdvisorOrchestrationHandler: can_handle? → True
    anthropic_messages->>AdvisorOrchestrationHandler: handle(...)

    loop Until end_turn or max_uses exceeded
        AdvisorOrchestrationHandler->>Executor: call(synthetic_advisor_tool)
        alt stop_reason = tool_use (advisor called)
            Executor-->>AdvisorOrchestrationHandler: tool_use(name=advisor, question=...)
            AdvisorOrchestrationHandler->>Advisor: sub-call(no tools, question)
            Advisor-->>AdvisorOrchestrationHandler: advice text
            AdvisorOrchestrationHandler->>AdvisorOrchestrationHandler: inject tool_result into messages
        else stop_reason = end_turn
            Executor-->>AdvisorOrchestrationHandler: final text response
            AdvisorOrchestrationHandler-->>Caller: response (or FakeStream if stream=True)
        end
    end

    Note over AdvisorOrchestrationHandler: If iteration > max_uses → raise AdvisorMaxIterationsError

Reviews (4): Last reviewed commit: "docs(advisor): move supported providers ..."

Comment on lines +132 to +139
iteration += 1
if iteration > max_uses:
    # Per Anthropic spec: inject max_uses_exceeded error result so the
    # executor sees the cap and continues without further advice.
    current_messages = _inject_max_uses_error(
        current_messages, executor_response, advisor_use_block
    )
    continue

P0 Infinite loop + missing AdvisorMaxIterationsError class

After iteration > max_uses, the code injects the error message and continues — but there is no exit condition. If the executor keeps calling the advisor tool on the next iteration, iteration is incremented again, iteration > max_uses is still true, another error is injected, and the loop runs forever. Additionally, the test test_loop_max_uses_raises imports AdvisorMaxIterationsError from this module, but the class is never defined here (or anywhere in the codebase), so that test will fail with ImportError at runtime.

The fix is to define the exception class and raise it instead of continue-ing after the cap is exceeded:

class AdvisorMaxIterationsError(Exception):
    """Raised when the advisor orchestration loop exceeds max_uses."""
    pass

iteration += 1
if iteration > max_uses:
    raise AdvisorMaxIterationsError(
        f"Advisor tool called more than max_uses={max_uses} times. "
        "Increase max_uses in the advisor tool definition to allow more iterations."
    )

raise ValueError(
    "advisor tool definition must include a 'model' field specifying the advisor model"
)
max_uses: int = advisor_tool.get("max_uses") or ADVISOR_MAX_USES

P1 max_uses=0 silently ignored

advisor_tool.get("max_uses") or ADVISOR_MAX_USES evaluates 0 as falsy, so a caller setting "max_uses": 0 to disable advisor calls gets the default limit (5) instead of an immediate cap. Use an explicit None check to preserve intent:

Suggested change
max_uses: int = advisor_tool.get("max_uses") or ADVISOR_MAX_USES
max_uses: int = advisor_tool.get("max_uses") if advisor_tool.get("max_uses") is not None else ADVISOR_MAX_USES

Comment on lines +221 to +224
from litellm.llms.anthropic.experimental_pass_through.messages.interceptors.advisor import (
AdvisorMaxIterationsError,
AdvisorOrchestrationHandler,
)

P0 Import of non-existent AdvisorMaxIterationsError will fail

AdvisorMaxIterationsError is not defined anywhere in advisor.py or any other module in this PR. This import will raise ImportError at test runtime, making test_loop_max_uses_raises always fail. Once the class is defined in advisor.py (see the companion comment on the loop), this import will work correctly.

@ishaan-berri (Contributor, Author)

@greptile review

@vercel

vercel bot commented Apr 12, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
litellm | Ready | Preview, Comment | Apr 12, 2026 1:29am


@ishaan-berri ishaan-berri changed the base branch from main to litellm_ishaan_april11 April 12, 2026 01:32
@ishaan-berri ishaan-berri merged commit 329a526 into litellm_ishaan_april11 Apr 12, 2026
47 of 48 checks passed
@ishaan-berri ishaan-berri deleted the feat/anthropic-advisor-tool branch April 12, 2026 01:32


Development

Successfully merging this pull request may close these issues.

feat(advisor): Anthropic Claude Code /advisor rollout

2 participants