Conversation
Summary of ChangesHello @slin1237, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This PR simplifies the gRPC health check on the server side by removing the generation-based validation, making it a simple 'server is alive' check. Concurrently, it dramatically increases the gRPC client's channel timeout from 30 seconds to 3600 seconds, likely to accommodate longer-running operations without premature disconnections. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request simplifies the gRPC health check to a liveness probe and increases the client-side timeout. The changes are logical, but there are a few areas for improvement regarding robustness and configurability. I've left some suggestions to address these points.
| return sglang_scheduler_pb2.HealthCheckResponse( | ||
| healthy=True, message="Health check passed" | ||
| ) |
There was a problem hiding this comment.
The simplified health check always returns healthy=True. This can be problematic during a graceful shutdown, as a load balancer or service orchestrator might continue to send traffic to a server that is about to terminate. The previous implementation correctly handled this by checking self.request_manager.gracefully_exit. This check should be restored to ensure the server reports as unhealthy when it's in the process of shutting down.
if self.request_manager.gracefully_exit:
return sglang_scheduler_pb2.HealthCheckResponse(
healthy=False, message="Server is shutting down"
)
return sglang_scheduler_pb2.HealthCheckResponse(
healthy=True, message="Health check passed"
)| request: sglang_scheduler_pb2.HealthCheckRequest, | ||
| context: grpc.aio.ServicerContext, |
There was a problem hiding this comment.
The request and context arguments are no longer used in the HealthCheck method. According to PEP 8, it's good practice to prefix unused arguments with an underscore to make this explicit.
| request: sglang_scheduler_pb2.HealthCheckRequest, | |
| context: grpc.aio.ServicerContext, | |
| _request: sglang_scheduler_pb2.HealthCheckRequest, | |
| _context: grpc.aio.ServicerContext, |
|
|
||
| let channel = Channel::from_shared(http_endpoint)? | ||
| .timeout(Duration::from_secs(30)) | ||
| .timeout(Duration::from_secs(3600)) |
There was a problem hiding this comment.
Hardcoding the timeout to 3600 seconds (1 hour) makes the client inflexible. While a long timeout might be necessary for some use cases, it can also mask problems with an unresponsive server.
It would be better to make this value configurable, for example, by reading it from an environment variable. This would allow operators to tune the timeout to their specific needs.
Example:
use std::env;
use std::time::Duration;
// In connect()
const DEFAULT_GRPC_TIMEOUT_SECS: u64 = 3600;
let timeout_secs = env::var("SGLANG_GRPC_CLIENT_TIMEOUT")
.ok()
.and_then(|s| s.parse().ok())
.unwrap_or(DEFAULT_GRPC_TIMEOUT_SECS);
let channel = Channel::from_shared(http_endpoint)?
.timeout(Duration::from_secs(timeout_secs))
// ...
Checklist