

[bugfix] add httpexception handle in stream case#10305

Open
Bruce-x-1997 wants to merge 1 commit into sgl-project:main from GMISWE:bruce-online-fix-13

Conversation

@Bruce-x-1997
Contributor

Motivation

We observed some HTTP exceptions online: when a streaming request hits a 503 (request queue full), the response is unexpected.
The HTTP server does not catch HTTPException, so it returns a plain, non-streamed 503 error even though the request is a streaming one.
As a user, we expect a streaming error response instead.

Modifications

Add HTTPException handling in generate_request and v1/chat/completions.
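The fix follows a common pattern for SSE-style streaming endpoints. Below is a minimal, self-contained sketch, not the actual sglang code: `handle_request` is a hypothetical backend call, and the `HTTPException` class here is a stand-in for `fastapi.HTTPException` so the example runs without dependencies. The point is that the exception is caught inside the streaming generator and emitted as an in-stream error chunk, rather than letting the server return a non-streamed 503.

```python
import json


class HTTPException(Exception):
    """Stand-in for fastapi.HTTPException, so this sketch is self-contained."""

    def __init__(self, status_code: int, detail: str):
        self.status_code = status_code
        self.detail = detail


def handle_request(rid):
    # Hypothetical backend call: raises 503 when the request queue is full.
    raise HTTPException(status_code=503, detail="The request queue is full.")


def generate_stream(rid):
    """Yield SSE chunks; convert HTTPException into a streamed error chunk."""
    try:
        for chunk in handle_request(rid):
            yield f"data: {json.dumps(chunk)}\n\n"
    except HTTPException as e:
        # Instead of a plain 503 response, emit the error inside the stream.
        error = {"error": {"code": e.status_code, "message": e.detail}}
        yield f"data: {json.dumps(error)}\n\n"
        yield "data: [DONE]\n\n"


chunks = list(generate_stream("req-1"))
```

Without the `except` block, the exception would propagate out of the generator and the client would receive a non-streamed 503 body mid-stream.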

Checklist


@gemini-code-assist gemini-code-assist bot left a comment


Summary of Changes

Hello @Bruce-x-1997, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where HTTP exceptions, such as a 503 "request queue full" error, were not properly handled during streaming requests, leading to unstreamed or unexpected responses. The changes ensure that such exceptions are caught and formatted into a proper streaming error response, improving the robustness and user experience of the streaming API.

Highlights

  • Improved HTTP Exception Handling: Implemented HTTPException handling for streaming requests in http_server.py to ensure graceful error responses instead of unexpected behavior when exceptions like 503 (request queue full) occur.
  • Consistent Streaming Error Responses: Extended HTTPException handling to the v1/chat/completions endpoint in serving_chat.py to provide a standardized streaming error format for clients.
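As a sketch of what a standardized streaming error might look like (the exact payload shape in serving_chat.py may differ; the helper name here is hypothetical, and the structure mirrors the OpenAI-style error object):

```python
import json


def streaming_error_chunk(status_code: int, message: str) -> str:
    """Format an error as an SSE data line, mirroring an OpenAI-style error object."""
    payload = {
        "error": {
            "message": message,
            "type": "service_unavailable" if status_code == 503 else "api_error",
            "code": status_code,
        }
    }
    return f"data: {json.dumps(payload)}\n\n"


chunk = streaming_error_chunk(503, "The request queue is full.")
```

A client already consuming the SSE stream can parse this like any other chunk, instead of having to special-case a non-streamed error body.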
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature              | Command             | Description
Code Review          | /gemini review      | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary     | Provides a summary of the current pull request in its current state.
Comment              | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments.
Help                 | /gemini help        | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request aims to fix a bug where HTTPExceptions were not being handled correctly in streaming responses, causing an incorrect error format. The changes introduce except HTTPException blocks in http_server.py and serving_chat.py to catch these exceptions and yield a properly formatted streaming error response.

The overall approach is correct and addresses the issue described. I've found one critical issue in serving_chat.py where an incorrect argument is passed, which would cause a TypeError. I've also left a comment in http_server.py regarding some code duplication that could be refactored for better maintainability.

Once the critical issue is addressed, this PR should be good to go.

@Bruce-x-1997
Contributor Author

@slin1237 please help review it and trigger CI

@Bruce-x-1997
Contributor Author

@slin1237 could you trigger it again? I don't see any error related to my PR

@Bruce-x-1997
Contributor Author

@slin1237 hello, could you trigger it again?
The failing CI seems unrelated to my changes

@adarshxs adarshxs added the express-lane A PR may be merged without a full CI check label Oct 8, 2025
@hnyls2002
Collaborator

hnyls2002 commented Oct 12, 2025

@Bruce-x-1997 Please resolve the conflicts. Also, @harrisonlimh can check on this change

self.send_to_tokenizer.send_pyobj(
    AbortReq(
        finished_reason={
            "type": "abort",
            "status_code": HTTPStatus.SERVICE_UNAVAILABLE,
            "message": message,
        },
        rid=req_to_abort.rid,
    )
)

@hnyls2002
Collaborator

@Jimmy-L99 Please take a look at this and check whether it is similar to your fix #11904

