[model-gateway] bugfix: backward compatibility for GET endpoints by alphabetc1 · Pull Request #15413 · sgl-project/sglang

alphabetc1 · 2025-12-18T16:30:12Z

Summary

This commit adds backward-compatible metadata discovery: when /server_info or /model_info is unavailable, the router falls back to legacy /get_server_info and /get_model_info to keep older SGLang workers compatible with the newer SGLang router.

Root Cause

Newer router versions use /server_info and /model_info, but older SGLang workers don’t implement these endpoints (only the legacy /get_* ones). As a result, server_info and model_info are lost:

When dp-aware is disabled on the router: the worker registers to the router normally, but the model information is missing.
When dp-aware is enabled on the router: the worker fails to register to the router and reports the error "Step discover_dp_info failed", because this step has a hard dependency on the dp_size information.
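
The two cases differ in how strictly the router depends on the missing metadata. The following is a minimal sketch of that distinction, not the router's actual code: `DiscoveredInfo`, `model_id_or_unknown`, and `require_dp_size` are hypothetical names, and the text after "Step discover_dp_info failed" in the error string is invented for illustration.

```rust
/// Metadata the router hopes to learn from a worker.
/// Hypothetical type: field names are illustrative, not the router's own.
#[derive(Debug, Default, Clone)]
pub struct DiscoveredInfo {
    pub model_id: Option<String>,
    pub dp_size: Option<usize>,
}

/// Soft dependency (dp-aware disabled): registration can proceed,
/// but the model id degrades to a placeholder.
pub fn model_id_or_unknown(info: &DiscoveredInfo) -> String {
    info.model_id
        .clone()
        .unwrap_or_else(|| "unknown".to_string())
}

/// Hard dependency (dp-aware enabled): without dp_size the step must fail.
/// The suffix of the message is an assumption made for this sketch.
pub fn require_dp_size(info: &DiscoveredInfo) -> Result<usize, String> {
    info.dp_size
        .ok_or_else(|| "Step discover_dp_info failed: dp_size is missing".to_string())
}
```

With an empty `DiscoveredInfo`, the first helper yields the "unknown" id seen in the /v1/models output, while the second returns an error and registration would stop.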

How to reproduce:
1. Launch an SGLang worker: `python -m sglang.launch_server --model-path shakechen/Llama-2-7b-chat-hf --port 8000`
2. Launch an SGLang router: `python -m sglang_router.launch_router --worker-urls http://127.0.0.1:8000 --port 30000`
3. Query the router's model list: `curl http://127.0.0.1:30000/v1/models`

# For sglang 0.5.5 + router 0.2.2, it works as expected:
# curl http://127.0.0.1:30000/v1/models
# {"models":["shakechen/Llama-2-7b-chat-hf"]}
INFO:     127.0.0.1:38382 - "GET /health HTTP/1.1" 200 OK
INFO:     127.0.0.1:38400 - "GET /get_server_info HTTP/1.1" 200 OK

# For sglang 0.5.5 + router 0.2.4, the returned model id is "unknown":
# curl http://127.0.0.1:30000/v1/models
# {"object":"list","data":[{"id":"unknown","object":"model","owned_by":"local"}]}
INFO:     127.0.0.1:45944 - "GET /server_info HTTP/1.1" 404 Not Found
INFO:     127.0.0.1:45944 - "GET /model_info HTTP/1.1" 404 Not Found
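
To check directly which metadata endpoints a given worker serves, a small probe can request all four paths and print the status codes. This is a sketch using only the Rust standard library. The helper names (`build_get_request`, `parse_status_code`, `probe`, `report`) are made up for this example, and it assumes a plain-HTTP worker without an API key.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;
use std::time::Duration;

/// Build a minimal HTTP/1.1 GET request for `path` on `host`.
pub fn build_get_request(host: &str, path: &str) -> String {
    format!(
        "GET {} HTTP/1.1\r\nHost: {}\r\nConnection: close\r\n\r\n",
        path, host
    )
}

/// Extract the numeric status code from a raw HTTP response, if present.
pub fn parse_status_code(response: &str) -> Option<u16> {
    let status_line = response.lines().next()?;
    status_line.split_whitespace().nth(1)?.parse().ok()
}

/// Send one GET to `addr` (for example "127.0.0.1:8000") and return the status.
pub fn probe(addr: &str, path: &str) -> std::io::Result<Option<u16>> {
    let mut stream = TcpStream::connect(addr)?;
    stream.set_read_timeout(Some(Duration::from_secs(5)))?;
    stream.write_all(build_get_request(addr, path).as_bytes())?;
    let mut raw = Vec::new();
    stream.read_to_end(&mut raw)?;
    Ok(parse_status_code(&String::from_utf8_lossy(&raw)))
}

/// Print the status of the new and legacy metadata endpoints on one worker.
pub fn report(addr: &str) {
    for path in ["/server_info", "/get_server_info", "/model_info", "/get_model_info"] {
        match probe(addr, path) {
            Ok(Some(code)) => println!("{} -> {}", path, code),
            Ok(None) => println!("{} -> unparseable response", path),
            Err(e) => println!("{} -> connection error: {}", path, e),
        }
    }
}
```

Calling `report("127.0.0.1:8000")` against the worker from step 1 would show whether it answers the new paths, the legacy paths, or both.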

Solution

On 404 from /server_info or /model_info, automatically retry via the corresponding legacy /get_* endpoint, and emit a deprecation warning. A FIXME notes this fallback will be removed together with the worker’s legacy /get_server_info and /get_model_info endpoints in the future.
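
The control flow can be sketched without any HTTP library by injecting the transport as a closure. This is an illustration under stated assumptions, not the PR's code: `HttpResult` and `fetch_with_legacy_fallback` are hypothetical names, and the real implementation works with reqwest responses and the `warn!` macro rather than `eprintln!`.

```rust
/// Outcome of a single GET, reduced to what the fallback decision needs.
/// Hypothetical type for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct HttpResult {
    pub status: u16,
    pub body: String,
}

/// Fetch `/{endpoint}`, retrying `/get_{endpoint}` only when the first
/// attempt returns 404. `get` is an injected transport that receives the
/// full URL, so the sketch runs without a network.
pub fn fetch_with_legacy_fallback<F>(
    base_url: &str,
    endpoint: &str,
    mut get: F,
) -> Result<String, String>
where
    F: FnMut(&str) -> Result<HttpResult, String>,
{
    let new_url = format!("{}/{}", base_url, endpoint);
    let first = get(&new_url)?;
    if first.status != 404 {
        // Any non-404 answer (success or error) is final; no fallback.
        return check_status(first, &new_url);
    }

    // Older workers only serve the legacy path; warn so operators can upgrade.
    let old_url = format!("{}/get_{}", base_url, endpoint);
    eprintln!(
        "'/{0}' returned 404, falling back to deprecated '/get_{0}'",
        endpoint
    );
    let second = get(&old_url)?;
    check_status(second, &old_url)
}

/// Map a response to its body on 2xx, or to an error message otherwise.
fn check_status(res: HttpResult, url: &str) -> Result<String, String> {
    if (200..300).contains(&res.status) {
        Ok(res.body)
    } else {
        Err(format!("Server returned status {} from {}", res.status, url))
    }
}
```

Only a 404 triggers the retry. Other failures, such as a 500 from the new endpoint, are returned as errors, so the fallback cannot mask a genuinely broken worker.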


gemini-code-assist · 2025-12-18T16:30:28Z

Summary of Changes

Hello @alphabetc1, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a compatibility issue where the router failed to retrieve metadata from older SGLang workers due to a mismatch in API endpoint versions. It implements a robust fallback mechanism that allows the router to gracefully handle 404 responses from modern endpoints by automatically retrying with their deprecated legacy counterparts. This ensures continued functionality for older worker versions while providing clear deprecation warnings, facilitating a smoother transition to the updated API.

Highlights

Backward Compatibility for Metadata Discovery: The router now includes a fallback mechanism to discover metadata from older SGLang workers that do not support the newer /server_info and /model_info endpoints.
Legacy Endpoint Fallback: If a request to /server_info or /model_info results in a 404 'Not Found' error, the system will automatically retry the request using the corresponding deprecated legacy endpoints, /get_server_info and /get_model_info.
Deprecation Warning: A warning message is now logged when the fallback to legacy endpoints occurs, explicitly stating that these /get_* endpoints are deprecated and will be removed in a future version, encouraging migration to the newer endpoints.
New get_json_fallback Function: A new asynchronous utility function, get_json_fallback, has been introduced to encapsulate the logic for retrying with legacy endpoints, handling HTTP requests, and parsing JSON responses, along with emitting deprecation warnings.


gemini-code-assist

Code Review

This pull request adds backward compatibility for metadata discovery by falling back to legacy endpoints when the new ones are not available. The implementation is correct, but it introduces some code duplication. My review includes suggestions to refactor the new code to improve maintainability by reducing this duplication.

gemini-code-assist · 2025-12-18T16:33:19Z

sgl-model-gateway/src/core/steps/worker/local/discover_metadata.rs

+async fn get_json_fallback(
+    base_url: &str,
+    endpoint: &str,
+    api_key: Option<&str>,
+) -> Result<Value, String> {
+    // FIXME: This fallback logic should be removed together with /get_server_info
+    // and /get_model_info endpoints in http_server.py
+    warn!(
+        concat!(
+            "Endpoint '/{}' returned 404, falling back to '/get_{}' for backward compatibility. ",
+            "The '/get_{}' endpoint is deprecated and will be removed in a future version. ",
+            "Please use '/{}' instead."
+        ),
+        endpoint, endpoint, endpoint, endpoint
+    );
+
+    let old_url = format!("{}/get_{}", base_url, endpoint);
+    let mut req = HTTP_CLIENT.get(&old_url);
+    if let Some(key) = api_key {
+        req = req.bearer_auth(key);
+    }
+
+    let response = req
+        .send()
+        .await
+        .map_err(|e| format!("Failed to connect to {}: {}", old_url, e))?;
+
+    if !response.status().is_success() {
+        return Err(format!(
+            "Server returned status {} from {}",
+            response.status(),
+            old_url
+        ));
+    }
+
+    response
+        .json::<Value>()
+        .await
+        .map_err(|e| format!("Failed to parse response from {}: {}", old_url, e))
+}


To reduce code duplication in get_server_info and get_model_info, you can make this function generic to handle deserialization. This moves the serde_json::from_value logic inside get_json_fallback and simplifies the call sites. This change will allow you to simplify the fallback logic in both get_server_info and get_model_info as suggested in the other comments. Using serde::de::DeserializeOwned is a bit more idiomatic here.

async fn get_json_fallback<T: serde::de::DeserializeOwned>(
    base_url: &str,
    endpoint: &str,
    api_key: Option<&str>,
) -> Result<T, String> {
    // FIXME: This fallback logic should be removed together with /get_server_info
    // and /get_model_info endpoints in http_server.py
    warn!(
        concat!(
            "Endpoint '/{}' returned 404, falling back to '/get_{}' for backward compatibility. ",
            "The '/get_{}' endpoint is deprecated and will be removed in a future version. ",
            "Please use '/{}' instead."
        ),
        endpoint, endpoint, endpoint, endpoint
    );

    let old_url = format!("{}/get_{}", base_url, endpoint);
    let mut req = HTTP_CLIENT.get(&old_url);
    if let Some(key) = api_key {
        req = req.bearer_auth(key);
    }

    let response = req
        .send()
        .await
        .map_err(|e| format!("Failed to connect to {}: {}", old_url, e))?;

    if !response.status().is_success() {
        return Err(format!(
            "Server returned status {} from {}",
            response.status(),
            old_url
        ));
    }

    let value: Value = response
        .json()
        .await
        .map_err(|e| format!("Failed to parse response from {}: {}", old_url, e))?;

    serde_json::from_value(value)
        .map_err(|e| format!("Failed to parse {} from fallback response: {}", endpoint, e))
}
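
The idea behind the suggestion, parsing into the target type inside one generic helper so call sites shrink to a single call, can be shown in a crate-free sketch. Everything here is an assumption made for illustration: `FromBody` stands in for `serde::de::DeserializeOwned`, the toy body formats replace JSON, the fields on `ServerInfo` and `ModelInfo` are invented, and `get_fallback` takes an injected fetch closure instead of performing HTTP.

```rust
/// Stand-in for `serde::de::DeserializeOwned` so the sketch needs no crates.
pub trait FromBody: Sized {
    fn from_body(body: &str) -> Result<Self, String>;
}

#[derive(Debug, PartialEq)]
pub struct ServerInfo {
    pub dp_size: usize, // illustrative field
}

#[derive(Debug, PartialEq)]
pub struct ModelInfo {
    pub model_path: String, // illustrative field
}

impl FromBody for ServerInfo {
    // Toy format: the body is just the dp_size as decimal text.
    fn from_body(body: &str) -> Result<Self, String> {
        body.trim()
            .parse::<usize>()
            .map(|dp_size| ServerInfo { dp_size })
            .map_err(|e| format!("bad dp_size: {}", e))
    }
}

impl FromBody for ModelInfo {
    // Toy format: the body is the model path itself.
    fn from_body(body: &str) -> Result<Self, String> {
        let path = body.trim();
        if path.is_empty() {
            Err("empty model path".to_string())
        } else {
            Ok(ModelInfo { model_path: path.to_string() })
        }
    }
}

/// Generic fallback: fetch the legacy body, then parse it into `T` in one
/// place, so callers no longer repeat the deserialization step.
pub fn get_fallback<T, F>(endpoint: &str, fetch_legacy: F) -> Result<T, String>
where
    T: FromBody,
    F: FnOnce(&str) -> Result<String, String>,
{
    let legacy_path = format!("/get_{}", endpoint);
    let body = fetch_legacy(&legacy_path)?;
    T::from_body(&body)
        .map_err(|e| format!("Failed to parse {} from fallback response: {}", endpoint, e))
}
```

A call site then reads `get_fallback::<ServerInfo, _>("server_info", fetch)`, mirroring the turbofish form proposed in the review.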

gemini-code-assist · 2025-12-18T16:33:19Z

sgl-model-gateway/src/core/steps/worker/local/discover_metadata.rs

+    // If /server_info returns 404, fallback to /get_server_info for backward compatibility
+    if response.status() == reqwest::StatusCode::NOT_FOUND {
+        let json = get_json_fallback(base_url, "server_info", api_key).await?;
+        return serde_json::from_value(json)
+            .map_err(|e| format!("Failed to parse server info: {}", e));
+    }


With the suggested generic get_json_fallback function, this block can be simplified to a single call, removing the duplicated deserialization logic.

    // If /server_info returns 404, fallback to /get_server_info for backward compatibility
    if response.status() == reqwest::StatusCode::NOT_FOUND {
        return get_json_fallback::<ServerInfo>(base_url, "server_info", api_key).await;
    }

gemini-code-assist · 2025-12-18T16:33:19Z

sgl-model-gateway/src/core/steps/worker/local/discover_metadata.rs

+    // If /model_info returns 404, fallback to /get_model_info for backward compatibility
+    if response.status() == reqwest::StatusCode::NOT_FOUND {
+        let json = get_json_fallback(base_url, "model_info", api_key).await?;
+        return serde_json::from_value(json)
+            .map_err(|e| format!("Failed to parse model info: {}", e));
+    }


With the suggested generic get_json_fallback function, this block can be simplified to a single call, removing the duplicated deserialization logic.

    // If /model_info returns 404, fallback to /get_model_info for backward compatibility
    if response.status() == reqwest::StatusCode::NOT_FOUND {
        return get_json_fallback::<ModelInfo>(base_url, "model_info", api_key).await;
    }

fix: fallback to /get_* endpoints when /server_info or /model_info returns 404

54a1aab

alphabetc1 requested a review from slin1237 as a code owner December 18, 2025 16:30

github-actions bot added the model-gateway label Dec 18, 2025

gemini-code-assist bot reviewed Dec 18, 2025


alphabetc1 changed the title ~~fix: GET endpoint compatibility with the old version~~ fix(gateway): backward compatibility for GET endpoints Dec 18, 2025

alphabetc1 changed the title ~~fix(gateway): backward compatibility for GET endpoints~~ [model-gateway] Backward compatibility for GET endpoints Dec 20, 2025

alphabetc1 changed the title ~~[model-gateway] Backward compatibility for GET endpoints~~ [model-gateway] bugfix: backward compatibility for GET endpoints Dec 20, 2025

slin1237 added the run-ci label Dec 20, 2025

slin1237 approved these changes Dec 20, 2025


slin1237 merged commit 1d90b19 into sgl-project:main Dec 20, 2025
65 of 68 checks passed

alphabetc1 deleted the bugfix/get_endpoint_compatibility branch December 21, 2025 03:32

jiaming1130 pushed a commit to zhuyijie88/sglang that referenced this pull request Dec 25, 2025

[model-gateway] bugfix: backward compatibility for GET endpoints (sgl-project#15413)

6a55050

alphabetc1 mentioned this pull request Jan 5, 2026

[Feature] request for "classify" method for sglang router #15240

Open

2 tasks

GuoYechang pushed a commit to GuoYechang/sglang that referenced this pull request Jan 13, 2026

[model-gateway] bugfix: backward compatibility for GET endpoints (sgl-project#15413)

3d4f945
