[Fix]HTTP Stream raise exception#11904
Conversation
Summary of ChangesHello @jinmingyi1998, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves a critical issue where HTTP exceptions were prematurely raised during streaming operations, leading to service interruptions. It introduces a more robust exception handling mechanism for streaming requests, ensuring a graceful termination. Additionally, the PR significantly enhances the system's observability by adding detailed performance metrics across the request lifecycle and integrating a structured Logstash-compatible logging formatter. These changes aim to improve system stability, provide deeper insights into performance bottlenecks, and streamline log analysis. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
16e62e4 to
3845619
Compare
There was a problem hiding this comment.
Code Review
This pull request primarily addresses a crash that occurs when an exception is raised during an HTTP stream. The fix correctly handles this by breaking the loop and sending an abort message to the client for streaming requests, which is a solid improvement. A large portion of the changes also introduces extensive performance metric collection and structured logging capabilities, likely for an internal tool called "PPinfer". While these additions are valuable for observability, I've identified a few minor issues: a duplicated metric logging call, a repeated attribute in a newly added utility file, and an opportunity to improve exception re-raising for better debuggability. Overall, the pull request is in good shape, and the core fix is sound.
|
/gemini review |
|
/gemini summary |
There was a problem hiding this comment.
Code Review
This pull request correctly addresses an issue where an HTTPException would crash the tokenizer_manager during a streaming request. The change introduces a check for obj.stream and avoids raising an exception for streaming requests, instead breaking the loop to allow a graceful finish. The logic is sound. I've added one suggestion to refactor the code for better maintainability by reducing duplication.
Summary of ChangesThis pull request addresses a critical issue where HTTP exceptions were prematurely raised during streaming operations within the Highlights
Changelog
Activity
|
|
@Simon-Li @CatherineSue could you please take a look? |
Motivation
stream is started, cannot raise HTTP Exception
Modifications
if stream, just break loop
when tokenizer_manager get a exception from a stream request, just break the loop, client will receive
before:
after:
How I found and reproduce
I run with p-d disag, set
SGLANG_DISAGGREGATION_WAITING_TIMEOUT=1, decode will raise KV Transfer error.tokenizer_managerraise a HTTPException in stream and crash then.Checklist