
fix(anthropic): pass logging_obj to client.post for litellm_overhead_time_ms#24071

Merged
krrish-berri-2 merged 1 commit intoBerriAI:litellm_oss_staging_04_02_2026_p1from
milan-berri:fix/anthropic-litellm-overhead-time-ms
Apr 3, 2026

Conversation

@milan-berri
Collaborator

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/test_litellm/ directory. Adding at least 1 test is a hard requirement (see details)
  • My PR passes all unit tests on make test-unit (anthropic chat handler tests: poetry run pytest tests/test_litellm/llms/anthropic/chat/test_anthropic_chat_handler.py -v)
  • My PR's scope is as isolated as possible; it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🐛 Bug Fix

Changes

  • Fix: Pass logging_obj from the Anthropic chat handler into every client.post() / async_handler.post() call so the @track_llm_api_timing() decorator on the HTTP handler can set model_call_details["llm_api_duration_ms"]. Without it, litellm_overhead_time_ms stayed null when using Anthropic with LITELLM_DETAILED_TIMING=true.
  • Files: litellm/llms/anthropic/chat/handler.py (4 call sites: async/sync streaming in make_call/make_sync_call, async/sync non-streaming in acompletion/completion).
  • Test: test_make_call_passes_logging_obj_to_client_post in tests/test_litellm/llms/anthropic/chat/test_anthropic_chat_handler.py ensures make_call passes logging_obj to client.post.
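For illustration, a minimal self-contained sketch of the mechanism being fixed. The decorator and logging object below are simplified stand-ins, not LiteLLM's actual implementations; they only show why the timing is recorded when `logging_obj` is forwarded as a keyword argument and silently skipped when it is not:

```python
import time
from functools import wraps


def track_llm_api_timing():
    """Simplified stand-in for LiteLLM's decorator: it reads logging_obj
    from kwargs and records the wrapped call's duration on it."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            logging_obj = kwargs.get("logging_obj")  # None if not forwarded
            start = time.perf_counter()
            result = func(*args, **kwargs)
            if logging_obj is not None:
                duration_ms = (time.perf_counter() - start) * 1000
                logging_obj.model_call_details["llm_api_duration_ms"] = duration_ms
            return result
        return wrapper
    return decorator


class FakeLoggingObj:
    """Stand-in for the real logging object; only the dict matters here."""
    def __init__(self):
        self.model_call_details = {}


@track_llm_api_timing()
def post(url, json=None, logging_obj=None):
    return {"status": 200}  # pretend HTTP call


# With logging_obj forwarded, the decorator records the duration.
logging_obj = FakeLoggingObj()
post("https://api.anthropic.com/v1/messages", json={}, logging_obj=logging_obj)
assert "llm_api_duration_ms" in logging_obj.model_call_details

# Without it (the pre-fix behavior), nothing is recorded.
other = FakeLoggingObj()
post("https://api.anthropic.com/v1/messages", json={})
assert "llm_api_duration_ms" not in other.model_call_details
```

This is why the fix is purely additive at the call sites: the decorator was already in place on the HTTP handler, it just never saw the logging object.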

…time_ms

When LITELLM_DETAILED_TIMING=true, litellm_overhead_time_ms was null for
Anthropic because the handler did not pass logging_obj to client.post(),
so track_llm_api_timing could not set llm_api_duration_ms. Pass
logging_obj=logging_obj at all four post() call sites (make_call,
make_sync_call, acompletion, completion). Add test to ensure make_call
passes logging_obj to client.post.

Made-with: Cursor
@vercel

vercel Bot commented Mar 19, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
litellm Ready Preview, Comment Mar 19, 2026 1:26am


@codspeed-hq
Contributor

codspeed-hq Bot commented Mar 19, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing milan-berri:fix/anthropic-litellm-overhead-time-ms (9e2de44) with main (488b93c)

Open in CodSpeed

@greptile-apps
Contributor

greptile-apps Bot commented Mar 19, 2026

Greptile Summary

This PR fixes a bug where litellm_overhead_time_ms remained null when using the Anthropic provider with LITELLM_DETAILED_TIMING=true. The root cause was that logging_obj was not being forwarded to client.post() / async_handler.post() calls in litellm/llms/anthropic/chat/handler.py, so the @track_llm_api_timing() decorator on AsyncHTTPHandler.post / HTTPHandler.post couldn't retrieve it from kwargs to record llm_api_duration_ms on the logging object.

Changes:

  • logging_obj is now passed as a keyword argument at all four call sites: async streaming (make_call), sync streaming (make_sync_call), async non-streaming (acompletion_function), and sync non-streaming (completion).
  • A new pytest.mark.asyncio unit test verifies that make_call correctly forwards logging_obj to client.post, using mocks with no real network calls.

Observations:

  • The fix itself is minimal, correct, and consistent across all four paths.
  • The test only covers the async streaming path; the sync streaming and non-streaming paths have no corresponding tests, leaving three of the four fixes without test coverage.
  • The "Relevant issues" section in the PR description is empty — no issue number is linked despite the checklist item for it.

Confidence Score: 4/5

  • Safe to merge; the change is minimal and correct, with the only gap being partial test coverage of the four fixed call sites.
  • The core fix — passing logging_obj to client.post() at all four call sites — is straightforward, well-scoped, and aligns exactly with how the @track_llm_api_timing() decorator extracts the object from kwargs. The HTTP handler already accepted logging_obj as an optional parameter, so no interface changes are needed. The score is 4 rather than 5 because only one of the four fixed paths has a unit test; the sync streaming and non-streaming paths lack test coverage, creating a small risk of silent regression.
  • No files require special attention; both changed files are correct. The test file could benefit from broader coverage of the sync and non-streaming call sites.

Important Files Changed

Filename Overview
litellm/llms/anthropic/chat/handler.py Passes logging_obj to all four client.post() / async_handler.post() call sites so the @track_llm_api_timing() decorator can extract it from kwargs and set llm_api_duration_ms on the logging object; change is minimal and correct.
tests/test_litellm/llms/anthropic/chat/test_anthropic_chat_handler.py Adds a mocked async test verifying make_call forwards logging_obj to client.post; test is correct and uses no real network calls, but only covers one of the four fixed call sites (async streaming); sync streaming and non-streaming paths remain untested.

Sequence Diagram

```mermaid
sequenceDiagram
    participant Caller
    participant AnthropicChatCompletion as handler.py
    participant HTTPHandler as AsyncHTTPHandler/HTTPHandler
    participant Decorator as track_llm_api_timing()

    Caller->>AnthropicChatCompletion: completion() / acompletion()
    AnthropicChatCompletion->>HTTPHandler: client.post(..., logging_obj=logging_obj)
    Note over HTTPHandler,Decorator: Decorator intercepts call
    Decorator->>Decorator: start_time = now()<br/>logging_obj = kwargs.get("logging_obj")
    Decorator->>HTTPHandler: execute post()
    HTTPHandler-->>Decorator: response
    Decorator->>Decorator: end_time = now()<br/>logging_obj.model_call_details["llm_api_duration_ms"] = duration_ms
    Decorator-->>AnthropicChatCompletion: response
    AnthropicChatCompletion-->>Caller: ModelResponse / StreamWrapper
```

Last reviewed commit: "fix(anthropic): pass..."

Comment on lines +35 to +37
```python
mock_client.post.assert_called_once()
call_kwargs = mock_client.post.call_args[1]
assert call_kwargs.get("logging_obj") is logging_obj
```
Contributor


P2 Use .kwargs instead of index-based call_args[1]

call_args[1] is the legacy tuple-indexing API for accessing keyword arguments on a call object. The modern, more readable approach is .kwargs. Both work, but the newer form is clearer and less likely to break if positional args are added.

Suggested change

```diff
 mock_client.post.assert_called_once()
-call_kwargs = mock_client.post.call_args[1]
+call_kwargs = mock_client.post.call_args.kwargs
 assert call_kwargs.get("logging_obj") is logging_obj
```

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
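To see the difference concretely, here is a quick self-contained example of the two accessors on a `unittest.mock` call object (the URL and sentinel value are illustrative):

```python
from unittest.mock import MagicMock

mock_client = MagicMock()
mock_client.post("https://example.invalid", logging_obj="sentinel")

# Legacy tuple-indexing API: call_args is a (args, kwargs) pair,
# so index [1] is the kwargs dict.
assert mock_client.post.call_args[1]["logging_obj"] == "sentinel"

# Named attributes (Python 3.8+): clearer, and robust if positional
# arguments are later added to the call under test.
assert mock_client.post.call_args.kwargs["logging_obj"] == "sentinel"
assert mock_client.post.call_args.args == ("https://example.invalid",)
```

Both forms return the same dict; `.kwargs` simply names what `[1]` leaves implicit.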

Comment on lines +13 to +37
```python
@pytest.mark.asyncio
async def test_make_call_passes_logging_obj_to_client_post():
    """make_call must pass logging_obj to client.post so track_llm_api_timing can set llm_api_duration_ms for litellm_overhead_time_ms."""
    mock_client = AsyncMock()
    mock_response = MagicMock()
    mock_response.aiter_lines = MagicMock(return_value=iter([b'data: {"type":"message_start"}\n', b'data: {"type":"message_delta"}\n']))
    mock_client.post.return_value = mock_response

    logging_obj = MagicMock()

    await make_call(
        client=mock_client,
        api_base="https://api.anthropic.com/v1/messages",
        headers={},
        data="{}",
        model="claude-3-5-haiku",
        messages=[{"role": "user", "content": "Hi"}],
        logging_obj=logging_obj,
        timeout=60.0,
        json_mode=False,
    )

    mock_client.post.assert_called_once()
    call_kwargs = mock_client.post.call_args[1]
    assert call_kwargs.get("logging_obj") is logging_obj
```
Contributor


P2 Test only covers one of four fixed call sites

The PR fixes logging_obj being passed at four separate call sites in handler.py:

  1. make_call (async streaming) — ✅ covered by this test
  2. make_sync_call (sync streaming) — ❌ not tested
  3. acompletion_function async non-streaming — ❌ not tested
  4. completion sync non-streaming — ❌ not tested

Per the project's testing guidelines (at least 1 test is a hard requirement), this technically satisfies the bar, but adding tests for the sync path and non-streaming paths would prevent regressions where a future refactor accidentally drops logging_obj from one of those call sites.

Consider adding parallel tests for make_sync_call and the non-streaming paths.
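A sync-path test could follow the same pattern. The sketch below is hypothetical: `make_sync_call` here is a minimal stand-in whose signature is invented for illustration (the real handler function takes more parameters); only the forwarding assertion mirrors the merged test:

```python
from unittest.mock import MagicMock


def make_sync_call(client, api_base, data, logging_obj):
    """Stand-in for the real handler function (signature simplified):
    it must forward logging_obj as a keyword argument to client.post."""
    return client.post(api_base, data=data, logging_obj=logging_obj)


def test_make_sync_call_passes_logging_obj_to_client_post():
    # Plain MagicMock suffices on the sync path; no event loop needed.
    mock_client = MagicMock()
    logging_obj = MagicMock()

    make_sync_call(
        client=mock_client,
        api_base="https://api.anthropic.com/v1/messages",
        data="{}",
        logging_obj=logging_obj,
    )

    mock_client.post.assert_called_once()
    assert mock_client.post.call_args.kwargs.get("logging_obj") is logging_obj


test_make_sync_call_passes_logging_obj_to_client_post()
```

The non-streaming paths could be covered the same way, mocking the respective client and asserting on `call_args.kwargs`.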

@milan-berri
Collaborator Author

before and after:
[two screenshots: litellm_overhead_time_ms before and after the fix]

@ishaan-jaff ishaan-jaff changed the base branch from main to litellm_ishaan_march_18 March 19, 2026 01:43
@ishaan-jaff ishaan-jaff changed the base branch from litellm_ishaan_march_18 to main March 19, 2026 01:44
@krrish-berri-2 krrish-berri-2 changed the base branch from main to litellm_oss_staging_04_02_2026_p1 April 3, 2026 18:58
@krrish-berri-2 krrish-berri-2 merged commit a18215c into BerriAI:litellm_oss_staging_04_02_2026_p1 Apr 3, 2026
18 of 40 checks passed
Sameerlite pushed a commit that referenced this pull request Apr 8, 2026
…time_ms (#24071)

When LITELLM_DETAILED_TIMING=true, litellm_overhead_time_ms was null for
Anthropic because the handler did not pass logging_obj to client.post(),
so track_llm_api_timing could not set llm_api_duration_ms. Pass
logging_obj=logging_obj at all four post() call sites (make_call,
make_sync_call, acompletion, completion). Add test to ensure make_call
passes logging_obj to client.post.

Made-with: Cursor
krrish-berri-2 added a commit that referenced this pull request Apr 9, 2026
* fix(vertex_ai): support pluggable (executable) credential_source for WIF auth (#24700)

The WIF credential dispatch in load_auth() only handled identity_pool and
aws credential types. When credential_source.executable was present (used
for Azure Managed Identity via Workload Identity Federation), it fell
through to identity_pool.Credentials which rejected it with MalformedError.

Add dispatch to google.auth.pluggable.Credentials for executable-type
credential sources, following the same pattern as the existing identity_pool
and aws helpers.

Fixes authentication for Azure Container Apps → GCP Vertex AI via WIF
with executable credential sources.

* feat(logging): add component and logger fields to JSON logs for 3rd p… (#24447)

* feat(logging): add component and logger fields to JSON logs for 3rd party filtering

* Let user-supplied extra fields win over auto-generated component/logger, tighten test assertions

* Feat - Add organization into the metrics metadata for org_id & org_alias (#24440)

* Add org_id and org_alias label names to Prometheus metric definitions

* Add user_api_key_org_alias to StandardLoggingUserAPIKeyMetadata

* Populate user_api_key_org_alias in pre-call metadata

* Pass org_id and org_alias into per-request Prometheus metric labels

* Add test for org labels on per-request Prometheus metrics

* chore: resolve test mockdata

* Address review: populate org_alias from DB view, add feature flag, use .get() for org metadata

* Add org labels to failure path and verify flag behavior in test

* Fix test: build flag-off enum_values without org fields

* Gate org labels behind feature flag in get_labels() instead of static metric lists

* Scope org label injection to metrics that carry team context, remove orphaned budget label defs, add test teardown

* Use explicit metric allowlist for org label injection instead of team heuristic

* Fix duplicate org label guard, move _org_label_metrics to class constant

* Reset custom_prometheus_metadata_labels after duplicate label assertion

* fix: emit org labels by default, remove flag, fix missing org_alias in all metadata paths

* fix: emit org labels by default, no opt-in flag required

* fix: write org_alias to metadata unconditionally in proxy_server.py

* fix: 429s from batch creation being converted to 500 (#24703)

* add us gov models (#24660)

* add us gov models

* added max tokens

* Litellm dev 04 02 2026 p1 (#25052)

* fix: replace hardcoded url

* fix: Anthropic web search cost not tracked for Chat Completions

The ModelResponse branch in response_object_includes_web_search_call()
only checked url_citation annotations and prompt_tokens_details, missing
Anthropic's server_tool_use.web_search_requests field. This caused
_handle_web_search_cost() to never fire for Anthropic Claude models.

Also routes vertex_ai/claude-* models to the Anthropic cost calculator
instead of the Gemini one, since Claude on Vertex uses the same
server_tool_use billing structure as the direct Anthropic API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix(anthropic): pass logging_obj to client.post for litellm_overhead_time_ms (#24071)

When LITELLM_DETAILED_TIMING=true, litellm_overhead_time_ms was null for
Anthropic because the handler did not pass logging_obj to client.post(),
so track_llm_api_timing could not set llm_api_duration_ms. Pass
logging_obj=logging_obj at all four post() call sites (make_call,
make_sync_call, acompletion, completion). Add test to ensure make_call
passes logging_obj to client.post.

Made-with: Cursor

* sap - add additional parameters for grounding

- additional parameter for grounding added for the sap provider

* sap - fix models

* (sap) add filtering, masking, translation SAP GEN AI Hub modules

* (sap) add tests and docs for new SAP modules

* (sap) add support of multiple modules config

* (sap) code refactoring

* (sap) rename file

* test(): add safeguard tests

* (sap) update tests

* (sap) update docs, solve merge conflict in transformation.py

* (sap) linter fix

* (sap) Align embedding request transformation with current API

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) mock commit

* (sap) run black formater

* (sap) add literals to models, add negative tests, fix test for tool transformation

* (sap) fix formating

* (sap) fix models

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) commit for rerun bot review

* (sap) minor improve

* (sap) fix after bot review

* (sap) lint fix

* docs(sap): update documentation

* fix(sap): change creds priority

* fix(sap): change creds priority

* fix(sap): fix sap creds unit test

* fix(sap): linter fix

* fix(sap): linter fix

* linter fix

* (sap) update logic of fetching creds, add additional tests

* (sap) clean up code

* (sap) fix after review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) add a possibility to put the service key by both variants

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) update test

* (sap) update service key resolve function

* (sap) run black formater

* (sap) fix validate credentials, add negative tests for credential fetching

* (sap) fix validate credentials, add negative tests for credential fetching

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) fix after bot review

* (sap) lint fix

* (sap) lint fix

* feat: support service_tier in gemini

* chore: add a service_tier field mapping from openai to gemini

* fix: use x-gemini-service-tier header in response

* docs: add service_tier to gemini docs

* chore: add defaut/standard mapping, and some tests

* chore: tidying up some case insensitivity

* chore: remove unnecessary guard

* fix: remove redundant test file

* fix: handle 'auto' case-insensitively

* fix: return service_tier on final steamed chunk

* chore: black

* feat: enable supports_service_tier to gemini models

* Fix get_standard_logging_metadata tests

* Fix test_get_model_info_bedrock_models

* Fix test_get_model_info_bedrock_models

* Fix remaining tests

* Fix mypy issues

* Fix tests

* Fix merge conflicts

* Fix code qa

* Fix code qa

* Fix code qa

* Fix greptile review

---------

Co-authored-by: michelligabriele <gabriele.michelli@icloud.com>
Co-authored-by: Josh <36064836+J-Byron@users.noreply.github.com>
Co-authored-by: mubashir1osmani <mubashir.osmani777@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: milan-berri <milan@berri.ai>
Co-authored-by: Alperen Kömürcü <alperen.koemuercue@sap.com>
Co-authored-by: Vasilisa Parshikova <vasilisa.parshikova@sap.com>
Co-authored-by: Lin Xu <lin.xu03@sap.com>
Co-authored-by: Mark McDonald <macd@google.com>
Co-authored-by: Sameer Kankute <sameer@berri.ai>