Skip to content

[model-gateway] Use UUIDs for router-managed worker resources#15540

Merged
slin1237 merged 1 commit intosgl-project:mainfrom
alphabetc1:feature/support_del_worker
Dec 21, 2025
Merged

[model-gateway] Use UUIDs for router-managed worker resources#15540
slin1237 merged 1 commit intosgl-project:mainfrom
alphabetc1:feature/support_del_worker

Conversation

@alphabetc1
Copy link
Contributor

@alphabetc1 alphabetc1 commented Dec 20, 2025

Motivation

Previously, we used endpoint instead of UUID of each resource to make it compatible with old delete worker endpoint.
However, URIs are mutable and can be long or ambiguous as identifiers, for example, deleting a worker required calling DELETE /workers/{url} and URL-encoding the full worker URL in the path, which was cumbersome and error-prone.

We're now making a breaking change: router resources are now identified and managed by UUIDs instead of URIs, meanwhile, GET/PUT/DELETE /workers/{url} methods will also be changed to GET/PUT/DELETE /workers/{worker_id}.

Example

For now, GET /workers will show worker's UUID instead of URL.

curl -sS 'http://127.0.0.1:30000/workers'

{"workers":[{"id":"8769987f-0d0e-4f5e-89bc-c89bf557bb0b","url":"http://127.0.0.1:8000","model_id":"test","priority":50,"cost":1.0,"worker_type":"regular","is_healthy":true,"load":0,"connection_mode":"HTTP","metadata":{"model_path":"/root/models/Qwen3-32B-FP8","model_type":"qwen3","architectures":"[\"Qwen3ForCausalLM\"]","served_model_name":"test"}}

And we use /workers/{worker_id} to perform resource manage instead of /workers/{url}:

curl -X GET "http://localhost:30000/workers/8769987f-0d0e-4f5e-89bc-c89bf557bb0b"
curl -X DELETE "http://localhost:30000/workers/8769987f-0d0e-4f5e-89bc-c89bf557bb0b"

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @alphabetc1, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the usability of the model-gateway by introducing a more intuitive and flexible method for deleting workers. It adds a new DELETE /workers endpoint that can process worker URLs provided either as a query parameter or within a JSON payload. This streamlines the deletion process, making it less prone to errors compared to the previous method of URL-encoding the worker URL in the path, while maintaining full compatibility with the existing endpoint.

Highlights

  • New DELETE /workers Endpoint: Introduces a new DELETE endpoint at /workers for worker deletion, offering a more flexible approach.
  • Flexible Worker Deletion: Workers can now be deleted by providing their URL either as a query parameter (e.g., ?url=http://...) or within a JSON request body (e.g., {"url": "http://..."}).
  • Improved Usability: This change simplifies the operator workflow by eliminating the need for cumbersome URL-encoding of the worker URL directly in the path, which was previously required for the /workers/{url} endpoint.
  • Backward Compatibility: The original DELETE /workers/{url} endpoint remains fully functional, ensuring existing integrations are not broken.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds a more convenient endpoint DELETE /workers for deleting workers, accepting the URL via query parameter or JSON body. This is a nice usability improvement and reuses existing logic well. However, I've found a critical borrow-checker issue in the new handler that will prevent compilation, for which I've provided a fix. I've also included a suggestion to improve code readability.

Comment on lines 601 to 611
let url_from_query = req.uri().query().and_then(|qs| {
url::form_urlencoded::parse(qs.as_bytes())
.find(|(k, _)| k == "url")
.map(|(_, v)| v.into_owned())
});

let url = if let Some(url) = url_from_query {
Some(url)
} else {
// Fallback: JSON body: {"url":"http://..."}
let (_parts, body) = req.into_parts();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The current implementation has a borrow checker error. req.uri().query() borrows req, and this borrow is still active when req.into_parts() is called in the else block. This attempts to move req while it is borrowed, which will cause a compilation failure.

To fix this, you should first extract the query string into an owned String. This releases the borrow on req, allowing it to be safely consumed later.

Suggested change
let url_from_query = req.uri().query().and_then(|qs| {
url::form_urlencoded::parse(qs.as_bytes())
.find(|(k, _)| k == "url")
.map(|(_, v)| v.into_owned())
});
let url = if let Some(url) = url_from_query {
Some(url)
} else {
// Fallback: JSON body: {"url":"http://..."}
let (_parts, body) = req.into_parts();
let query_str = req.uri().query().map(str::to_string);
let url_from_query = query_str.as_deref().and_then(|qs| {
url::form_urlencoded::parse(qs.as_bytes())
.find(|(k, _)| k == "url")
.map(|(_, v)| v.into_owned())
});
let url = if let Some(url) = url_from_query {
Some(url)
} else {
// Fallback: JSON body: {"url":"http://..."}
let (_parts, body) = req.into_parts();

Comment on lines 615 to 619
.and_then(|bytes| {
(!bytes.is_empty())
.then(|| serde_json::from_slice::<DeleteWorkerQuery>(&bytes).ok())
.flatten()
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This chain of then and flatten to handle an empty request body is a bit dense and can be hard to follow. Using a more explicit if check for an empty byte slice would make the logic clearer and improve maintainability.

            .and_then(|bytes| {
                if bytes.is_empty() {
                    None
                } else {
                    serde_json::from_slice::<DeleteWorkerQuery>(&bytes).ok()
                }
            })

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Dec 20, 2025
@slin1237
Copy link
Collaborator

the reason why we used endpoint instead of UUID of each resource is to make it compatible with old delete worker endpoint
perhaps
instead of a new endpoint
shall we make a breaking change to use UUID?

@alphabetc1
Copy link
Contributor Author

alphabetc1 commented Dec 21, 2025

the reason why we used endpoint instead of UUID of each resource is to make it compatible with old delete worker endpoint perhaps instead of a new endpoint shall we make a breaking change to use UUID?

Agree, this is more RESTful and also more convenient to use.
So would it be acceptable for me to make a breaking change to only keep UUID-based deletion?
e.g.

  • The id returned in GET /workers will be a UUID.
{"workers":[{"id":"8769987f-0d0e-4f5e-89bc-c89bf557bb0b","url":"http://127.0.0.1:8000","model_id":"test","priority":50,"cost":1.0,"worker_type":"regular","is_healthy":true,"load":0,"connection_mode":"HTTP","metadata":{"model_path":"/root/models/Qwen3-32B-FP8","model_type":"qwen3","architectures":"[\"Qwen3ForCausalLM\"]","served_model_name":"test"}}
  • DELETE
curl -X DELETE "http://localhost:30000/workers/8769987f-0d0e-4f5e-89bc-c89bf557bb0b"

Meanwhile, GET/PUT /workers/{url} methods will also be changed to GET/PUT /workers/{worker_id}

@slin1237
Copy link
Collaborator

feel free to make that change

  1. create should return UID right away, since it's async
  2. GET/PUT/DELETE should all be using UID instead
    also feel free to only make changes on delete. i can help with other endpoints
    this type of correction should happen earlier rather than later.

@alphabetc1 alphabetc1 force-pushed the feature/support_del_worker branch from 3d9cf19 to 7d8e236 Compare December 21, 2025 05:46
@alphabetc1 alphabetc1 requested a review from key4ng as a code owner December 21, 2025 05:46
@alphabetc1 alphabetc1 changed the title [model-gateway] add simpler worker deletion via /workers endpoint [model-gateway] Use UUIDs for router-managed worker resources Dec 21, 2025
@slin1237
Copy link
Collaborator

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant and beneficial breaking change by switching from URL-based worker identification to UUIDs. This improves the robustness and usability of the worker management API. The changes are well-implemented across the documentation, tests, and core server logic. My review focuses on a few areas for improvement: addressing a race condition in UUID reservation, optimizing a lookup function, and reducing code duplication in the API handlers for better maintainability.

Comment on lines +126 to +131
if let Some(existing_id) = self.url_to_id.get(url) {
return existing_id.clone();
}
let worker_id = WorkerId::new();
self.url_to_id.insert(url.to_string(), worker_id.clone());
worker_id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

There's a race condition in this function. If two threads call reserve_id_for_url for the same new URL concurrently, both can pass the if let Some(...) check, generate a new WorkerId, and insert it into the map. The last one to insert will win, but both threads will return a WorkerId, one of which will be for a mapping that was immediately overwritten. This can lead to inconsistent state.

To fix this and make the operation atomic, you can use the entry API of DashMap.

Suggested change
if let Some(existing_id) = self.url_to_id.get(url) {
return existing_id.clone();
}
let worker_id = WorkerId::new();
self.url_to_id.insert(url.to_string(), worker_id.clone());
worker_id
self.url_to_id
.entry(url.to_string())
.or_insert_with(WorkerId::new)
.clone()

Comment on lines +139 to +141
self.url_to_id
.iter()
.find_map(|entry| (entry.value() == worker_id).then(|| entry.key().clone()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The fallback logic in this function iterates over the url_to_id map to find a URL by its worker ID. This is an O(N) operation and can become inefficient if the number of workers is large.

For better performance, consider adding a reverse mapping, e.g., id_to_url: Arc<DashMap<WorkerId, String>>. This would make the lookup an O(1) operation. You would need to update this new map in reserve_id_for_url and remove functions accordingly.

Comment on lines +578 to +596
let worker_info = WorkerInfo {
id: worker_id.as_str().to_string(),
url: worker_url.clone(),
model_id: "unknown".to_string(),
priority: 0,
cost: 1.0,
worker_type: "unknown".to_string(),
is_healthy: false,
load: 0,
connection_mode: "unknown".to_string(),
runtime_type: None,
tokenizer_path: None,
reasoning_parser: None,
tool_parser: None,
chat_template: None,
bootstrap_port: None,
metadata: HashMap::new(),
job_status: Some(status),
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This block of code to create a partial WorkerInfo for a worker that is still in the job queue is quite verbose and hard to read.

To improve readability and maintainability, consider extracting this logic into a helper function within this module. For example:

fn create_partial_worker_info_from_status(
    worker_id: &crate::core::WorkerId,
    worker_url: String,
    status: crate::protocols::worker_spec::JobStatus,
) -> crate::protocols::worker_spec::WorkerInfo {
    // ... implementation from these lines ...
}

Then you could simply call this function here.

Comment on lines +613 to +630
let worker_id = match parse_worker_id(&worker_id_raw) {
Ok(id) => id,
Err(msg) => {
let error = WorkerErrorResponse {
error: msg,
code: "BAD_REQUEST".to_string(),
};
return (StatusCode::BAD_REQUEST, Json(error)).into_response();
}
};

let Some(url) = state.context.worker_registry.get_url_by_id(&worker_id) else {
let error = WorkerErrorResponse {
error: format!("Worker {worker_id_raw} not found"),
code: "WORKER_NOT_FOUND".to_string(),
};
return (StatusCode::NOT_FOUND, Json(error)).into_response();
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic for parsing the worker_id from the path and resolving it to a URL is duplicated across get_worker, delete_worker, and update_worker handlers. This increases code maintenance overhead.

Consider creating a custom Axum extractor to handle this logic. An extractor could parse the worker_id, resolve it to a URL, and handle BAD_REQUEST and NOT_FOUND errors centrally. This would significantly simplify the handlers and improve maintainability.

For example, you could create a ResolvedWorker extractor that provides (WorkerId, String) and your handlers would look like:

async fn delete_worker(
    State(state): State<Arc<AppState>>,
    ResolvedWorker { id, url }: ResolvedWorker,
) -> Response {
    // ... handler logic using id and url
}

@slin1237 slin1237 merged commit 1d9ba2c into sgl-project:main Dec 21, 2025
69 of 70 checks passed
@alphabetc1 alphabetc1 deleted the feature/support_del_worker branch December 24, 2025 06:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation model-gateway run-ci

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments