
Revert "[Feat] Lazy-load optional feature routers on first request" #26727

Merged
krrish-berri-2 merged 1 commit into litellm_internal_staging from revert-26534-litellm_importMemReduction2
Apr 29, 2026

Conversation

@krrish-berri-2
Contributor

Reverts #26534

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@krrish-berri-2 krrish-berri-2 enabled auto-merge (squash) April 29, 2026 00:16
@greptile-apps
Contributor

greptile-apps Bot commented Apr 29, 2026

Greptile Summary

This PR fully reverts #26534, replacing the LazyFeatureMiddleware approach with eager imports and app.include_router() calls for all optional feature routers at startup. The deleted file (_lazy_features.py) and associated tests are removed cleanly. The trade-off is the loss of the ~700 MB idle memory saving and deferred startup cost that #26534 introduced.

  • google_router is now registered earlier in the router list (before pass_through_router) rather than at the very end, where #26534 had deliberately placed it. The code removed by this revert carried a comment warning about a /models/{name}:method vs. /models overlap, and that comment is now gone; the new placement should be verified against the OpenAI /models route order.

Confidence Score: 4/5

Revert is structurally clean; a P1 router-ordering concern for google_router should be verified before merge.

The revert is complete and consistent — deleted code, removed tests, and restored eager includes all match. One real concern: google_router's registration position moved earlier relative to its intentional prior placement (which had a comment about /models route overlap). If the Vertex AI /models/{name}:method route now shadows the core OpenAI models endpoint, that would be a live routing regression. No P0 issues; capped at 4 due to the P1.

litellm/proxy/proxy_server.py — verify google_router position relative to the OpenAI /models route registration order.

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/proxy/_lazy_features.py | File deleted as part of the revert; the entire lazy-loading middleware and feature registry are removed cleanly. |
| litellm/proxy/proxy_server.py | Restores eager imports and app.include_router() calls for all previously lazy modules. google_router is now registered earlier in the list (before pass_through_router) instead of in its former last position; the deleted comment noted a /models overlap that may still apply. |
| tests/proxy_unit_tests/test_proxy_routes.py | Correctly removes force-loading boilerplate for lazy features. All routes are now present at import time, so the simplification is valid. |
| tests/test_litellm/proxy/test_proxy_server.py | Drops all lazy-feature-specific test classes (registry shape, startup absence, middleware behavior), which is appropriate since the implementation they tested no longer exists. |
| tests/test_litellm/proxy/vector_store_endpoints/test_vector_store_endpoints.py | Removes the manual lazy-route force-registration workaround, now redundant since vector store routes are eagerly registered at startup. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Proxy Server Import] --> B[Eager import of ALL routers at module load time]
    B --> C[app.include_router calls for all routers at startup]
    C --> D[app.mount BASE_MCP_ROUTE]
    D --> E[Server Ready — all routes visible in /openapi.json immediately]

    F[Old Lazy Approach] --> G[Eager import of core routers only]
    G --> H[attach_lazy_features middleware]
    H --> I{Incoming Request}
    I -->|Path matches feature prefix| J[Import module on first hit ~1-3s off-thread]
    J --> K[Register router, invalidate openapi_schema]
    K --> L[Serve request]
    I -->|Already loaded / no match| L
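
The "Old Lazy Approach" branch of the flowchart can be illustrated with a minimal, stdlib-only sketch. It is not the code from the deleted _lazy_features.py: the class name LazyFeatureRegistry, its methods, and the prefix-to-module mapping are hypothetical, and the sketch only covers the "import module on first hit" step. Router registration, the off-thread import, and the openapi_schema invalidation shown in the flowchart are left out.

```python
import importlib
import threading
from types import ModuleType
from typing import Dict, List, Optional


class LazyFeatureRegistry:
    """Hypothetical registry: maps a URL path prefix to a module imported on first use."""

    def __init__(self, features: Dict[str, str]) -> None:
        self._features = dict(features)          # prefix -> dotted module name
        self._loaded: Dict[str, ModuleType] = {}  # prefix -> imported module
        self._lock = threading.Lock()             # guards concurrent first hits

    def resolve(self, path: str) -> Optional[ModuleType]:
        """Return the feature module for `path`, importing it on the first hit.

        Returns None when no feature prefix matches (the "no match" edge).
        """
        for prefix, module_name in self._features.items():
            if path == prefix or path.startswith(prefix + "/"):
                with self._lock:
                    if prefix not in self._loaded:
                        self._loaded[prefix] = importlib.import_module(module_name)
                    return self._loaded[prefix]
        return None

    def loaded_prefixes(self) -> List[str]:
        """Prefixes whose modules have been imported so far."""
        return sorted(self._loaded)


# Stand-in stdlib modules play the role of heavy optional feature modules.
registry = LazyFeatureRegistry({"/vector_stores": "json", "/mcp": "csv"})
```

With the eager approach restored by this PR, the equivalent of every import_module call runs at module load time, which is why all routes are visible immediately and why the idle memory saving is lost.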

Comments Outside Diff (1)

  1. litellm/proxy/proxy_server.py, lines 547-548

    P1 google_router registration position changed

    In the code being reverted (the lazy-loading version from #26534), google_router was included last, right before attach_lazy_features, with the explicit comment "Eager: /models/{name}:method overlaps with the OpenAI /models endpoint." That comment and the deliberate placement are both gone in this revert. FastAPI matches routes in registration order, so placing google_router earlier (before pass_through_router and many other routers) could cause its /models/{name}:method pattern to shadow the OpenAI /models route if the OpenAI route is registered later. Worth confirming that the ordering is intentional and that the /models endpoint still returns the expected response.
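
The ordering concern can be explored with a simplified, stdlib-only model of first-match routing. This is a sketch under stated assumptions, not FastAPI's or LiteLLM's actual matching code: the handler names, the route templates (including the /models/{model_id} route and the :generateContent suffix), and the helpers compile_path and first_match are all hypothetical, and each {param} is assumed to match exactly one path segment.

```python
import re
from typing import List, Optional, Pattern, Tuple

Route = Tuple[str, Pattern[str]]  # (handler name, compiled path regex)


def compile_path(template: str) -> Pattern[str]:
    """Compile a template such as '/models/{name}:generateContent'.

    Assumption: every {param} matches one path segment, i.e. one or more
    characters other than '/'.
    """
    literal_parts = re.split(r"\{[A-Za-z_][A-Za-z0-9_]*\}", template)
    return re.compile("^" + "[^/]+".join(re.escape(p) for p in literal_parts) + "$")


def first_match(routes: List[Route], path: str) -> Optional[str]:
    """Return the first registered handler whose pattern matches `path`."""
    for handler, pattern in routes:
        if pattern.match(path):
            return handler
    return None


def build(order: List[Tuple[str, str]]) -> List[Route]:
    """Register routes in the given order."""
    return [(handler, compile_path(template)) for handler, template in order]


# Hypothetical routes standing in for the OpenAI-style and Google-style endpoints.
OPENAI_LIST = ("openai_list_models", "/models")
OPENAI_GET = ("openai_get_model", "/models/{model_id}")
GOOGLE_METHOD = ("google_generate_content", "/models/{name}:generateContent")

google_first = build([GOOGLE_METHOD, OPENAI_LIST, OPENAI_GET])
google_last = build([OPENAI_LIST, OPENAI_GET, GOOGLE_METHOD])

for path in ("/models", "/models/gpt-4", "/models/gemini-pro:generateContent"):
    print(path, "|", first_match(google_first, path), "|", first_match(google_last, path))
```

In this model the plain /models path resolves to the same handler in both orders, while a path ending in :generateContent resolves differently depending on which parameterized route was registered first. Whether the real routers overlap in the same way depends on their actual path templates, so a request against the running proxy remains the authoritative check.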

Reviews (1): Last reviewed commit: "Revert "lazy-load optional feature route..."

@codecov

codecov Bot commented Apr 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@krrish-berri-2 krrish-berri-2 merged commit fd32f29 into litellm_internal_staging Apr 29, 2026
43 checks passed
@krrish-berri-2 krrish-berri-2 deleted the revert-26534-litellm_importMemReduction2 branch April 29, 2026 00:21

3 participants