feat: optimize auto routing #3

Merged
eigmax merged 7 commits into main from feat/fallback on Feb 8, 2026

eigmax (Collaborator) commented Feb 8, 2026

feat: intelligent model fallback with cross-provider routing

When a model fails (context window exceeded, billing errors, API failures),
the router now automatically retries with alternative models instead of
showing raw API errors to users.

Key behaviors:

  • Context window exceeded → escalate to higher-tier model with larger context
  • Insufficient credits (402) → try same-tier cheaper alternatives first,
    then escalate to next tier
  • Cross-provider fallback: if OpenRouter credits are exhausted, automatically
    route to direct Anthropic/Groq/OpenAI APIs using separately configured keys
  • Pre-check: estimate token count before API call and skip models whose
    context window is too small
  • User-friendly error messages when all fallbacks are exhausted
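
The behaviors listed above can be sketched as a small routing loop. This is a minimal illustration, not the code in loop.py: the tier names, the `Model` fields, the 4-characters-per-token estimate, and every function name here (`estimate_tokens`, `pick_next`, `route_with_fallback`) are assumptions made for the example.

```python
from dataclasses import dataclass

TIER_ORDER = ["simple", "medium", "complex"]  # hypothetical tier names


@dataclass(frozen=True)
class Model:
    name: str
    tier: str
    context_length: int
    cost: float  # relative cost, lower is cheaper (assumed field)


class ContextWindowExceeded(Exception):
    """Raised by the caller-supplied `call` when the prompt does not fit."""


class InsufficientCredits(Exception):
    """Raised by the caller-supplied `call` on a 402-style billing failure."""


def estimate_tokens(text):
    # Rough heuristic of about 4 characters per token (an assumption).
    return len(text) // 4 + 1


def next_tier(tier):
    i = TIER_ORDER.index(tier) + 1
    return TIER_ORDER[i] if i < len(TIER_ORDER) else None


def pick_next(catalog, current, tried, billing):
    """Return the next model to try, or None when nothing is left."""
    if billing:
        # Billing failure: try untried, cheaper models in the same tier first.
        peers = [m for m in catalog
                 if m.tier == current.tier and m.name not in tried
                 and m.cost < current.cost]
        if peers:
            return min(peers, key=lambda m: m.cost)
    # Otherwise walk up the tier chain.
    tier = next_tier(current.tier)
    while tier is not None:
        higher = [m for m in catalog if m.tier == tier and m.name not in tried]
        if billing:
            if higher:
                return min(higher, key=lambda m: m.cost)
        else:
            # Context failure: only models with a larger window can help.
            higher = [m for m in higher
                      if m.context_length > current.context_length]
            if higher:
                return max(higher, key=lambda m: m.context_length)
        tier = next_tier(tier)
    return None


def route_with_fallback(prompt, start, catalog, call):
    """Return (result, escalation_count) or raise a user-friendly error."""
    needed = estimate_tokens(prompt)
    current, tried, escalations = start, set(), 0
    while current is not None:
        tried.add(current.name)
        billing = False
        # Pre-check: skip the API call when the prompt cannot fit.
        if needed <= current.context_length:
            try:
                return call(current, prompt), escalations
            except InsufficientCredits:
                billing = True
            except ContextWindowExceeded:
                pass
        current = pick_next(catalog, current, tried, billing)
        escalations += 1
    raise RuntimeError(
        "All fallback models failed. Check your credits or shorten the request."
    )
```

Here `call` stands in for the real provider call, so the loop can be exercised with a stub. The returned counter plays the role of the escalation_count metric mentioned under Changes.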

Changes:

  • catalog.rs: parse context_length from OpenRouter API, single Catalog struct
  • config.rs: add tier escalation chain (next_tier/prev_tier), tier_alternatives
    with models from multiple providers (Groq, DeepSeek, Anthropic, OpenAI)
  • router.rs: expose get_context_length, get_fallback_model, get_tier_alternatives
  • metrics.rs: track escalation_count
  • litellm_provider.py: classify billing vs context errors, set all provider
    API keys, skip openrouter/ prefix for direct-provider calls
  • loop.py: pre-check + escalation loop with billing-aware routing
  • schema.py: add get_all_api_keys() for multi-provider support
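
As a rough illustration of the litellm_provider.py items above, the sketch below separates billing errors from context errors and drops the openrouter/ prefix for direct-provider calls. The function names and the message patterns are assumptions for the example. Only the 402 status and the openrouter/ prefix come from the description, and the real classifier may inspect exception types instead of message text.

```python
# Hypothetical message fragments; the real classifier may use other signals.
BILLING_PATTERNS = ("insufficient credits", "payment required", "billing")
CONTEXT_PATTERNS = ("context length", "context window", "maximum context")

OPENROUTER_PREFIX = "openrouter/"


def classify_error(status_code, message):
    """Return 'billing', 'context', or 'other' for a failed completion call."""
    text = message.lower()
    if status_code == 402 or any(p in text for p in BILLING_PATTERNS):
        return "billing"
    if any(p in text for p in CONTEXT_PATTERNS):
        return "context"
    return "other"


def resolve_model_name(model, use_openrouter):
    """Add the openrouter/ prefix only when routing through OpenRouter."""
    has_prefix = model.startswith(OPENROUTER_PREFIX)
    if use_openrouter:
        return model if has_prefix else OPENROUTER_PREFIX + model
    # Direct-provider call: strip the prefix if it is present.
    return model[len(OPENROUTER_PREFIX):] if has_prefix else model
```

A caller could use the 'billing' result to switch to a direct provider key and the 'context' result to escalate to a model with a larger window.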

eigmax merged commit 576a017 into main on Feb 8, 2026
3 of 5 checks passed
eigmax deleted the feat/fallback branch on February 8, 2026 at 15:16