fix(anthropic): raise default httpx read timeout for streaming; add configurable timeout param #5529

Open
SuperMarioYL wants to merge 1 commit into livekit:main from SuperMarioYL:fix/anthropic-httpx-read-timeout

@SuperMarioYL

Summary

Fixes #5508.

The Anthropic LLM plugin used httpx.AsyncClient(timeout=5.0), which sets all httpx sub-timeouts — including the per-chunk SSE read timeout — to 5 seconds. Claude's adaptive-thinking phases routinely pause for 10–30 s before emitting the first content chunk, so the 5 s read timeout fires during normal usage and raises APIConnectionError, killing voice sessions mid-turn.
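The failure mode described above can be reproduced without any network access. The following is a minimal sketch using only `asyncio`, not the plugin's or httpx's actual code: `fake_stream` and `read_with_chunk_timeout` are hypothetical names, and the timings are scaled down 100x so the example finishes quickly. It shows that a timeout applied per chunk fires when the pause before the first chunk exceeds it, even though the stream itself is healthy.

```python
import asyncio
from typing import AsyncIterator


async def fake_stream(first_chunk_delay: float) -> AsyncIterator[str]:
    """Hypothetical stand-in for an SSE stream that pauses before its first chunk."""
    await asyncio.sleep(first_chunk_delay)
    yield "Hello"
    yield " world"


async def read_with_chunk_timeout(stream: AsyncIterator[str], read_timeout: float) -> str:
    """Apply the timeout to each chunk separately, the way a read timeout behaves."""
    chunks = []
    iterator = stream.__aiter__()
    while True:
        try:
            chunk = await asyncio.wait_for(iterator.__anext__(), timeout=read_timeout)
        except StopAsyncIteration:
            break
        chunks.append(chunk)
    return "".join(chunks)


async def demo() -> tuple[str, str]:
    # Scaled timings: 0.05 s stands in for 5 s, 0.30 s for 30 s, 0.15 s for a 15 s pause.
    try:
        tight = await read_with_chunk_timeout(fake_stream(0.15), read_timeout=0.05)
    except asyncio.TimeoutError:
        tight = "timed out"
    generous = await read_with_chunk_timeout(fake_stream(0.15), read_timeout=0.30)
    return tight, generous


tight_result, generous_result = asyncio.run(demo())
```

With the tight window the read is abandoned during the initial pause; with the generous window the same stream completes normally.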

Changes

  • Default split timeout: httpx.Timeout(5.0, read=30.0) — connect stays at 5 s (genuine TCP failures surface fast) while the per-chunk read window is 30 s (covers standard thinking budgets with headroom).
  • New timeout constructor parameter: timeout: httpx.Timeout | None = None lets callers supply a custom timeout without constructing a whole anthropic.AsyncClient — e.g. httpx.Timeout(5.0, read=60.0) for extended thinking or very large contexts. Aligns with the pattern already used by the OpenAI plugin.
  • Six unit tests in tests/test_plugin_anthropic.py cover the default split, tight connect value, custom timeout pass-through, and client= precedence over timeout=.
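To make the resolution order in the list above concrete, here is a stdlib-only model of it. This is a sketch under stated assumptions, not the plugin's implementation: `SplitTimeout` is a hypothetical stand-in for `httpx.Timeout` (where a single default value applies to every phase unless a phase is overridden), and `resolve_timeout` is an invented helper that mirrors the described precedence of `client=` over `timeout=` over the default.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SplitTimeout:
    """Hypothetical stand-in for httpx.Timeout: one value per phase, in seconds."""

    connect: float
    read: float
    write: float
    pool: float

    @classmethod
    def from_default(cls, default: float, *, read: Optional[float] = None) -> "SplitTimeout":
        # Every phase inherits the default unless explicitly overridden.
        return cls(
            connect=default,
            read=default if read is None else read,
            write=default,
            pool=default,
        )


# Models the PR's new default: tight connect, generous per-chunk read.
DEFAULT_TIMEOUT = SplitTimeout.from_default(5.0, read=30.0)


def resolve_timeout(
    timeout: Optional[SplitTimeout] = None,
    client_timeout: Optional[SplitTimeout] = None,
) -> SplitTimeout:
    """Pick the effective timeout: caller's client first, then timeout=, then the default."""
    if client_timeout is not None:
        return client_timeout
    return timeout if timeout is not None else DEFAULT_TIMEOUT
```

In this model, passing only `timeout=` changes the read window, while a caller-supplied client keeps whatever timeout it was built with.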

Before / after

# Before — all sub-timeouts at 5 s, kills thinking phases
httpx.AsyncClient(timeout=5.0, ...)

# After — tight connect, generous read
httpx.AsyncClient(timeout=timeout or httpx.Timeout(5.0, read=30.0), ...)

Usage (new parameter)

# Default (30 s read) — sufficient for most models
llm = anthropic.LLM(model="claude-sonnet-4-6")

# Extended thinking or very large contexts
llm = anthropic.LLM(
    model="claude-opus-4-6",
    timeout=httpx.Timeout(5.0, read=60.0),
)

Test plan

  • uv run pytest tests/test_plugin_anthropic.py -v — 6/6 pass
  • uv run ruff check — no errors
  • uv run ruff format --check — no changes needed

…d timeout param

The Anthropic plugin was using a flat `httpx.AsyncClient(timeout=5.0)` which
sets all sub-timeouts — including the per-chunk read timeout for streaming SSE
responses — to 5 seconds.  Claude's adaptive-thinking phases and large-context
requests can produce 10-30 s silences between streamed chunks; the 5 s read
timeout fires during those pauses, raising `APIConnectionError` and killing
voice sessions mid-turn.

Changes:
- Default httpx timeout is now `httpx.Timeout(5.0, read=30.0)`: connect stays
  tight at 5 s (genuine TCP failures surface quickly) while the per-chunk read
  window is 30 s (covers standard thinking budgets with headroom).
- New `timeout: httpx.Timeout | None = None` constructor parameter lets callers
  supply a fully-customised timeout without having to build a whole
  `anthropic.AsyncClient` (e.g. `httpx.Timeout(5.0, read=60.0)` for extended
  thinking or very large contexts).  Matches the pattern already used by the
  OpenAI plugin.
- Six unit tests in `tests/test_plugin_anthropic.py` verify the default split,
  the tight connect value, custom timeout pass-through, and that a caller-
  supplied `client=` wins over `timeout=`.

Fixes livekit#5508.
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.

Contributor

@longcw longcw left a comment

Looks good to me! One nit:

parallel_tool_calls: NotGivenOr[bool] = NOT_GIVEN,
tool_choice: NotGivenOr[ToolChoice] = NOT_GIVEN,
caching: NotGivenOr[Literal["ephemeral"]] = NOT_GIVEN,
timeout: httpx.Timeout | None = None,

Maybe use NOT_GIVEN here:

Suggested change
- timeout: httpx.Timeout | None = None,
+ timeout: NotGivenOr[httpx.Timeout] = NOT_GIVEN,
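
For readers unfamiliar with the sentinel pattern behind this suggestion, the sketch below reimplements it with the standard library only. It is an illustration, not the project's code: the `NotGiven` class, `is_given`, and `effective_read_timeout` here are hypothetical stand-ins, and the real `NotGivenOr` / `NOT_GIVEN` definitions in the codebase may differ. The idea is that a dedicated sentinel distinguishes "argument omitted" from any value the caller might pass deliberately.

```python
from typing import TypeVar, Union


class NotGiven:
    """Hypothetical sentinel type meaning 'the caller did not pass this argument'."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = NotGiven()
T = TypeVar("T")
NotGivenOr = Union[T, NotGiven]

DEFAULT_READ_TIMEOUT = 30.0  # seconds, mirroring the PR's default read window


def is_given(value: object) -> bool:
    """Return True when the caller supplied a real value instead of the sentinel."""
    return not isinstance(value, NotGiven)


def effective_read_timeout(read_timeout: NotGivenOr[float] = NOT_GIVEN) -> float:
    # Fall back to the default only when the argument was omitted entirely.
    return read_timeout if is_given(read_timeout) else DEFAULT_READ_TIMEOUT
```

One consequence of this design is that the fallback has to be an explicit `is_given` check; whether a shorthand like `timeout or default` still works depends on how the real sentinel defines truthiness.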

Development

Successfully merging this pull request may close these issues.

Anthropic plugin: default 5s httpx timeout too aggressive for adaptive-thinking / large-context models

3 participants