[model-gateway] Use UUIDs for router-managed worker resources#15540
[model-gateway] Use UUIDs for router-managed worker resources#15540slin1237 merged 1 commit intosgl-project:mainfrom
Conversation
Summary of ChangesHello @alphabetc1, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the usability of the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request adds a more convenient endpoint DELETE /workers for deleting workers, accepting the URL via query parameter or JSON body. This is a nice usability improvement and reuses existing logic well. However, I've found a critical borrow-checker issue in the new handler that will prevent compilation, for which I've provided a fix. I've also included a suggestion to improve code readability.
sgl-model-gateway/src/server.rs
Outdated
| let url_from_query = req.uri().query().and_then(|qs| { | ||
| url::form_urlencoded::parse(qs.as_bytes()) | ||
| .find(|(k, _)| k == "url") | ||
| .map(|(_, v)| v.into_owned()) | ||
| }); | ||
|
|
||
| let url = if let Some(url) = url_from_query { | ||
| Some(url) | ||
| } else { | ||
| // Fallback: JSON body: {"url":"http://..."} | ||
| let (_parts, body) = req.into_parts(); |
There was a problem hiding this comment.
The current implementation has a borrow checker error. req.uri().query() borrows req, and this borrow is still active when req.into_parts() is called in the else block. This attempts to move req while it is borrowed, which will cause a compilation failure.
To fix this, you should first extract the query string into an owned String. This releases the borrow on req, allowing it to be safely consumed later.
| let url_from_query = req.uri().query().and_then(|qs| { | |
| url::form_urlencoded::parse(qs.as_bytes()) | |
| .find(|(k, _)| k == "url") | |
| .map(|(_, v)| v.into_owned()) | |
| }); | |
| let url = if let Some(url) = url_from_query { | |
| Some(url) | |
| } else { | |
| // Fallback: JSON body: {"url":"http://..."} | |
| let (_parts, body) = req.into_parts(); | |
| let query_str = req.uri().query().map(str::to_string); | |
| let url_from_query = query_str.as_deref().and_then(|qs| { | |
| url::form_urlencoded::parse(qs.as_bytes()) | |
| .find(|(k, _)| k == "url") | |
| .map(|(_, v)| v.into_owned()) | |
| }); | |
| let url = if let Some(url) = url_from_query { | |
| Some(url) | |
| } else { | |
| // Fallback: JSON body: {"url":"http://..."} | |
| let (_parts, body) = req.into_parts(); |
sgl-model-gateway/src/server.rs
Outdated
| .and_then(|bytes| { | ||
| (!bytes.is_empty()) | ||
| .then(|| serde_json::from_slice::<DeleteWorkerQuery>(&bytes).ok()) | ||
| .flatten() | ||
| }) |
There was a problem hiding this comment.
This chain of then and flatten to handle an empty request body is a bit dense and can be hard to follow. Using a more explicit if check for an empty byte slice would make the logic clearer and improve maintainability.
.and_then(|bytes| {
if bytes.is_empty() {
None
} else {
serde_json::from_slice::<DeleteWorkerQuery>(&bytes).ok()
}
})|
the reason why we used endpoint instead of UUID of each resource is to make it compatible with old delete worker endpoint |
Agree, this is more RESTful and also more convenient to use.
Meanwhile, GET/PUT |
|
feel free to make that change
|
3d9cf19 to
7d8e236
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a significant and beneficial breaking change by switching from URL-based worker identification to UUIDs. This improves the robustness and usability of the worker management API. The changes are well-implemented across the documentation, tests, and core server logic. My review focuses on a few areas for improvement: addressing a race condition in UUID reservation, optimizing a lookup function, and reducing code duplication in the API handlers for better maintainability.
| if let Some(existing_id) = self.url_to_id.get(url) { | ||
| return existing_id.clone(); | ||
| } | ||
| let worker_id = WorkerId::new(); | ||
| self.url_to_id.insert(url.to_string(), worker_id.clone()); | ||
| worker_id |
There was a problem hiding this comment.
There's a race condition in this function. If two threads call reserve_id_for_url for the same new URL concurrently, both can pass the if let Some(...) check, generate a new WorkerId, and insert it into the map. The last one to insert will win, but both threads will return a WorkerId, one of which will be for a mapping that was immediately overwritten. This can lead to inconsistent state.
To fix this and make the operation atomic, you can use the entry API of DashMap.
| if let Some(existing_id) = self.url_to_id.get(url) { | |
| return existing_id.clone(); | |
| } | |
| let worker_id = WorkerId::new(); | |
| self.url_to_id.insert(url.to_string(), worker_id.clone()); | |
| worker_id | |
| self.url_to_id | |
| .entry(url.to_string()) | |
| .or_insert_with(WorkerId::new) | |
| .clone() |
| self.url_to_id | ||
| .iter() | ||
| .find_map(|entry| (entry.value() == worker_id).then(|| entry.key().clone())) |
There was a problem hiding this comment.
The fallback logic in this function iterates over the url_to_id map to find a URL by its worker ID. This is an O(N) operation and can become inefficient if the number of workers is large.
For better performance, consider adding a reverse mapping, e.g., id_to_url: Arc<DashMap<WorkerId, String>>. This would make the lookup an O(1) operation. You would need to update this new map in reserve_id_for_url and remove functions accordingly.
| let worker_info = WorkerInfo { | ||
| id: worker_id.as_str().to_string(), | ||
| url: worker_url.clone(), | ||
| model_id: "unknown".to_string(), | ||
| priority: 0, | ||
| cost: 1.0, | ||
| worker_type: "unknown".to_string(), | ||
| is_healthy: false, | ||
| load: 0, | ||
| connection_mode: "unknown".to_string(), | ||
| runtime_type: None, | ||
| tokenizer_path: None, | ||
| reasoning_parser: None, | ||
| tool_parser: None, | ||
| chat_template: None, | ||
| bootstrap_port: None, | ||
| metadata: HashMap::new(), | ||
| job_status: Some(status), | ||
| }; |
There was a problem hiding this comment.
This block of code to create a partial WorkerInfo for a worker that is still in the job queue is quite verbose and hard to read.
To improve readability and maintainability, consider extracting this logic into a helper function within this module. For example:
fn create_partial_worker_info_from_status(
worker_id: &crate::core::WorkerId,
worker_url: String,
status: crate::protocols::worker_spec::JobStatus,
) -> crate::protocols::worker_spec::WorkerInfo {
// ... implementation from these lines ...
}Then you could simply call this function here.
| let worker_id = match parse_worker_id(&worker_id_raw) { | ||
| Ok(id) => id, | ||
| Err(msg) => { | ||
| let error = WorkerErrorResponse { | ||
| error: msg, | ||
| code: "BAD_REQUEST".to_string(), | ||
| }; | ||
| return (StatusCode::BAD_REQUEST, Json(error)).into_response(); | ||
| } | ||
| }; | ||
|
|
||
| let Some(url) = state.context.worker_registry.get_url_by_id(&worker_id) else { | ||
| let error = WorkerErrorResponse { | ||
| error: format!("Worker {worker_id_raw} not found"), | ||
| code: "WORKER_NOT_FOUND".to_string(), | ||
| }; | ||
| return (StatusCode::NOT_FOUND, Json(error)).into_response(); | ||
| }; |
There was a problem hiding this comment.
The logic for parsing the worker_id from the path and resolving it to a URL is duplicated across get_worker, delete_worker, and update_worker handlers. This increases code maintenance overhead.
Consider creating a custom Axum extractor to handle this logic. An extractor could parse the worker_id, resolve it to a URL, and handle BAD_REQUEST and NOT_FOUND errors centrally. This would significantly simplify the handlers and improve maintainability.
For example, you could create a ResolvedWorker extractor that provides (WorkerId, String) and your handlers would look like:
async fn delete_worker(
State(state): State<Arc<AppState>>,
ResolvedWorker { id, url }: ResolvedWorker,
) -> Response {
// ... handler logic using id and url
}
Motivation
Previously, we used endpoint instead of UUID of each resource to make it compatible with old delete worker endpoint.
However, URIs are mutable and can be long or ambiguous as identifiers, for example, deleting a worker required calling DELETE
/workers/{url}and URL-encoding the full worker URL in the path, which was cumbersome and error-prone.We're now making a breaking change: router resources are now identified and managed by UUIDs instead of URIs, meanwhile, GET/PUT/DELETE
/workers/{url}methods will also be changed to GET/PUT/DELETE/workers/{worker_id}.Example
For now, GET /workers will show worker's UUID instead of URL.
And we use
/workers/{worker_id}to perform resource manage instead of/workers/{url}:Modifications
Accuracy Tests
Benchmarking and Profiling
Checklist