Description
Checklist
- If this is not a feature request but a general question, please start a discussion at https://github.com/sgl-project/sglang/discussions. Otherwise, it will be closed.
- Please use English. Otherwise, it will be closed.
Motivation
In RL, we may launch a router to dispatch reward computation requests for both generative rewards (via /generate) and discriminative rewards (via /classify). The "classify" API is not yet implemented in the sglang router (https://docs.sglang.io/advanced_features/router.html#api-surface). I hope a future release will include this feature.
In addition, I want to report a bug: in the latest sglang router, "v1/embeddings" requests fail after the servers sleep and then wake up. Steps to reproduce:
# step 1: Launch multiple workers and one router
# step 2: Sleep the servers and then wake them up
# step 3: Post "v1/embeddings" requests to the router
For sglang-router==0.2.4
curl http://{router_address}/v1/models
{"object":"list","data":[{"id":"unknown","object":"model","owned_by":"local"}]}
The model id is reported as "unknown", and the reward computation also fails.
For sglang-router==0.2.2
the same steps behave correctly.
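A minimal Python sketch for checking the symptom above after the wake-up step. It only uses the /v1/models endpoint and the response shown in this report; the helper names has_unknown_model and fetch_models are hypothetical, and the router address is whatever was used in the curl command.

```python
import json
import urllib.request


def has_unknown_model(models_payload: dict) -> bool:
    """Return True if any entry in a /v1/models response has id "unknown"."""
    return any(
        entry.get("id") == "unknown"
        for entry in models_payload.get("data", [])
    )


def fetch_models(router_address: str, timeout: float = 10.0) -> dict:
    """GET http://{router_address}/v1/models and parse the JSON body.

    router_address is a placeholder such as "host:port"; it is not
    taken from the report.
    """
    url = f"http://{router_address}/v1/models"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Response observed with sglang-router==0.2.4, copied from the report.
observed = json.loads(
    '{"object":"list","data":'
    '[{"id":"unknown","object":"model","owned_by":"local"}]}'
)
print(has_unknown_model(observed))
```

Against a live router, has_unknown_model(fetch_models(router_address)) can be run once before sleeping the servers and once after waking them to compare the two versions.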
Related resources
No response