[bugfix] add httpexception handle in stream case by Bruce-x-1997 · Pull Request #10305 · sgl-project/sglang

Bruce-x-1997 · 2025-09-11T06:12:18Z

Motivation

We found HTTP exceptions in production: when a streaming request hits a 503 (the request queue is full), the response is not what the client expects.
The HTTP server does not catch HTTPException, so it returns a plain, non-streaming 503 response even though the request is a streaming one.
As users, we expect a streaming error response instead.

Modifications

Add HTTPException handling in generate_request and v1/chat/completions.
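
A minimal sketch of the idea, not the code in this PR: the streaming generator catches the exception and emits it as one more server-sent event instead of letting it escape as a plain 503 response. The `HTTPException` class below is a stand-in that mirrors the `status_code` and `detail` attributes of `fastapi.HTTPException`; `fake_generate`, `stream_results`, and the shape of the error payload are assumptions made for illustration.

```python
import asyncio
import json


class HTTPException(Exception):
    """Stand-in for fastapi.HTTPException (assumed attributes: status_code, detail)."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


async def fake_generate(fail: bool):
    """Hypothetical upstream generator; raises when the request queue is full."""
    if fail:
        raise HTTPException(503, "The request queue is full.")
    yield {"text": "Hello"}
    yield {"text": " world"}


async def stream_results(fail: bool):
    """Wrap the upstream generator so errors are sent as SSE chunks."""
    try:
        async for out in fake_generate(fail):
            yield f"data: {json.dumps(out)}\n\n"
    except HTTPException as e:
        # Assumed error format; the real payload may differ.
        error = {"error": {"message": e.detail, "code": e.status_code}}
        yield f"data: {json.dumps(error)}\n\n"
    # Terminate the stream the same way in both the success and error case.
    yield "data: [DONE]\n\n"


async def collect(fail: bool):
    return [chunk async for chunk in stream_results(fail)]


chunks = asyncio.run(collect(fail=True))
ok_chunks = asyncio.run(collect(fail=False))
```

With this shape, a streaming client always receives a well-formed event stream and can inspect the `error` field rather than having to handle a non-streaming response body.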

Checklist

Format your code according to the Format code with pre-commit.
Add unit tests according to the Run and add unit tests.
Update documentation according to Write documentations.
Provide accuracy and speed benchmark results according to Test the accuracy and Benchmark the speed.

gemini-code-assist

Summary of Changes

Hello @Bruce-x-1997, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where HTTP exceptions, such as a 503 "request queue full" error, were not properly handled during streaming requests, leading to unstreamed or unexpected responses. The changes ensure that such exceptions are caught and formatted into a proper streaming error response, improving the robustness and user experience of the streaming API.

Highlights

Improved HTTP Exception Handling: Implemented HTTPException handling for streaming requests in http_server.py to ensure graceful error responses instead of unexpected behavior when exceptions like 503 (request queue full) occur.
Consistent Streaming Error Responses: Extended HTTPException handling to the v1/chat/completions endpoint in serving_chat.py to provide a standardized streaming error format for clients.
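
From the client side, a streaming error response could be consumed roughly as follows. This is a sketch under assumptions: the `data: ` framing and `[DONE]` terminator follow the usual server-sent events convention, while the `error` key and its fields are a hypothetical payload shape, and `parse_sse_chunks` is a helper invented for this example.

```python
import json


def parse_sse_chunks(lines):
    """Yield ("data", payload) or ("error", payload) for each SSE data line."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank separator lines
        body = line[len("data: "):]
        if body == "[DONE]":
            return  # end of stream
        payload = json.loads(body)
        kind = "error" if "error" in payload else "data"
        yield kind, payload


events = list(
    parse_sse_chunks(
        [
            'data: {"text": "Hi"}',
            "",
            'data: {"error": {"message": "The request queue is full.", "code": 503}}',
            "",
            "data: [DONE]",
        ]
    )
)
```

Here the client sees the error as a regular event in the stream and can decide whether to retry, without special-casing a non-streaming body.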

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request aims to fix a bug where HTTPExceptions were not being handled correctly in streaming responses, causing an incorrect error format. The changes introduce except HTTPException blocks in http_server.py and serving_chat.py to catch these exceptions and yield a properly formatted streaming error response.

The overall approach is correct and addresses the issue described. I've found one critical issue in serving_chat.py where an incorrect argument is passed, which would cause a TypeError. I've also left a comment in http_server.py regarding some code duplication that could be refactored for better maintainability.

Once the critical issue is addressed, this PR should be good to go.

python/sglang/srt/entrypoints/openai/serving_chat.py

python/sglang/srt/entrypoints/http_server.py

Bruce-x-1997 · 2025-09-11T06:16:24Z

@slin1237 please help review it and trigger CI

Bruce-x-1997 · 2025-09-15T06:07:10Z

@slin1237 could you trigger CI again? I don't see any errors related to my PR.

Bruce-x-1997 · 2025-09-23T02:24:57Z

@slin1237 hello, could you trigger CI again?
The failing CI seems unrelated to my changes.

hnyls2002 · 2025-10-12T12:40:25Z

@Bruce-x-1997 Please resolve the conflicts. Also, @harrisonlimh can check on this change

sglang/python/sglang/srt/managers/scheduler.py

Lines 1597 to 1606 in 20a6c0a

    
    self.send_to_tokenizer.send_pyobj(
        AbortReq(
            finished_reason={
                "type": "abort",
                "status_code": HTTPStatus.SERVICE_UNAVAILABLE,
                "message": message,
            },
            rid=req_to_abort.rid,
        )
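
To illustrate how an abort reason like the one above could surface as an HTTP exception on the receiving side, here is a hedged sketch. It is not taken from the repository: `raise_if_aborted` is a hypothetical helper, and `HTTPException` is a local stand-in with the same `status_code` and `detail` attributes as `fastapi.HTTPException`.

```python
from http import HTTPStatus


class HTTPException(Exception):
    """Stand-in for fastapi.HTTPException (assumed attributes: status_code, detail)."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def raise_if_aborted(finished_reason):
    """Hypothetical helper: turn a 503 abort reason into an HTTPException."""
    if not finished_reason or finished_reason.get("type") != "abort":
        return
    status = finished_reason.get("status_code")
    if status == HTTPStatus.SERVICE_UNAVAILABLE:
        raise HTTPException(int(status), finished_reason.get("message", ""))


reason = {
    "type": "abort",
    "status_code": HTTPStatus.SERVICE_UNAVAILABLE,
    "message": "The request queue is full.",
}

try:
    raise_if_aborted(reason)
    raised = None
except HTTPException as e:
    raised = e
```

If the exception is raised inside a streaming generator, it needs to be caught there and converted into a streamed error chunk, which is the case this PR targets.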

hnyls2002 · 2025-10-21T11:04:10Z

@Jimmy-L99 Please take a look at this and check whether it is similar to your fix #11904

Bruce-x-1997 requested review from CatherineSue, ispobock, merrymercy and slin1237 as code owners September 11, 2025 06:12

gemini-code-assist bot reviewed Sep 11, 2025

[bugfix] add httpexception handle in stream case

af5058b

Bruce-x-1997 force-pushed the bruce-online-fix-13 branch from 7bd4a09 to af5058b Compare September 11, 2025 06:15

adarshxs added the express-lane label (A PR may be merged without a full CI check) Oct 8, 2025

JustinTong0323 added the run-ci label Oct 12, 2025

merrymercy requested a review from JustinTong0323 as a code owner November 29, 2025 07:06

Bruce-x-1997 closed this Dec 22, 2025

Bruce-x-1997 reopened this Dec 22, 2025

Bruce-x-1997 wants to merge 1 commit into sgl-project:main from GMISWE:bruce-online-fix-13

4 participants
