[model-gateway] return 503 when all workers are circuit-broken#15611

Merged
slin1237 merged 1 commit into main from refactor-n/6
Dec 22, 2025

@slin1237
Collaborator

When all workers for a model are circuit-broken, return SERVICE_UNAVAILABLE (503) instead of model_not_found (404). This provides clearer semantics: 404 means the model doesn't exist, while 503 means the model exists but is temporarily unavailable.
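
The distinction can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the gateway's actual code: the `Worker` record, its `circuit_open` field, and the `select_worker` function are hypothetical names, and the only assumption is that the gateway can tell "no worker registered for this model" apart from "workers registered, but all circuit-broken".

```python
from dataclasses import dataclass
from http import HTTPStatus


@dataclass
class Worker:
    """Hypothetical worker record; field names are illustrative only."""
    url: str
    model_id: str
    circuit_open: bool = False  # True means the circuit breaker has tripped


def select_worker(workers, model_id):
    """Return (status, worker) for a request addressed to `model_id`."""
    candidates = [w for w in workers if w.model_id == model_id]
    if not candidates:
        # No worker is registered for this model: the model does not exist.
        return HTTPStatus.NOT_FOUND, None

    available = [w for w in candidates if not w.circuit_open]
    if not available:
        # The model exists, but every one of its workers is circuit-broken.
        return HTTPStatus.SERVICE_UNAVAILABLE, None

    # At least one healthy worker; pick the first (real routing policy omitted).
    return HTTPStatus.OK, available[0]
```

With this split, a client can treat 503 as retryable (the model may come back once a circuit closes) and 404 as a permanent error for that model name.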

@slin1237 slin1237 merged commit 2142881 into main Dec 22, 2025
71 checks passed
@slin1237 slin1237 deleted the refactor-n/6 branch December 22, 2025 14:06
Liwansi added a commit to iforgetmyname/sglang that referenced this pull request Dec 23, 2025
…n_eagle3_dp

* 'main' of https://github.com/sgl-project/sglang: (208 commits)
  MoE: Skip SiLU/GELU activation for masked experts (sgl-project#15539)
  [GLM-ASR] GLM-ASR Support  (sgl-project#15570)
  Improve engine customization interface (sgl-project#15635)
  chore: bump sgl-kernel version to 0.3.20 (sgl-project#15590)
  bugfix[schedule]: Refactor sort method and add related UT (sgl-project#13576)
  Adjust wrong `mtp` meaning introduce by mimo (sgl-project#15632)
  Tiny add back missing router per attempt response metric (sgl-project#15621)
  Fix router gRPC mode launch error caused by async loading (sgl-project#15368)
  [model-gateway] return 503 when all workers are circuit-broken (sgl-project#15611)
  [Diffusion] Support peak memory record in offline generate and serving (sgl-project#15610)
  [VLM] Tiny: Unify VLM environment variables (sgl-project#15572)
  [diffusion] chore: remove default post-denoising dit offload in local mode (sgl-project#15573)
  Tiny enable soft watchdog in CI for stuck without logs (sgl-project#15616)
  Tiny add stuck simulation (sgl-project#15613)
  Support soft watchdog for tokenizer/detokenizer/dp-controller processes (sgl-project#15607)
  Tiny avoid EnvField misuse (sgl-project#15612)
  add decode round robin policy (sgl-project#15164)
  Add glm-4.6-fp8 with/without mtp in nightly ci (sgl-project#15566)
  Adapt fixture-kit to gsm8k mixin (sgl-project#15599)
  [model-gateway] add retry support to OpenAI router chat endpoint (sgl-project#15589)
  ...