feat: Add SageMaker support by andjsmi · Pull Request #3740 · sgl-project/sglang

andjsmi · 2025-02-21T03:14:50Z

Motivation

SageMaker Endpoints use /ping for health checks and /invocations for invocation payloads. sglang currently doesn't support this invocation pattern, so the package can't be used on SageMaker Endpoints as-is.

Modifications

This pull request adds two endpoints, /ping and /invocations, in http_server.py.

/ping provides the same functionality as /health. At present, /invocations behaves the same as /v1/chat/completions; however, it may be worth expanding it to dispatch to different handlers based on the request content.
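As a rough illustration of the possible expansion mentioned above, content-based dispatch could look something like the sketch below. This is not code from this PR: `pick_route` is a hypothetical helper, and the mapping from payload keys to routes is an assumption about how such dispatch might be keyed.

```python
# Hypothetical sketch: choose a backing route for /invocations from the
# shape of the JSON payload. Neither the function name nor the key-to-route
# mapping comes from this PR; both are assumptions for illustration.
def pick_route(payload: dict) -> str:
    if "messages" in payload:
        # Chat-style request (what /invocations currently always assumes).
        return "/v1/chat/completions"
    if "prompt" in payload:
        # Plain completion-style request.
        return "/v1/completions"
    if "text" in payload or "input_ids" in payload:
        # Native generate-style request.
        return "/generate"
    raise ValueError("unrecognized /invocations payload")
```

With this shape, the current behaviour corresponds to always taking the first branch; the other branches would only matter if /invocations were extended.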

I've included test cases as well and have been able to test on a SageMaker endpoint.

Checklist

Format your code according to the Code Formatting with Pre-Commit.
Add unit tests as outlined in the Running Unit Tests.
Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling and Accuracy Results.
For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.
Please feel free to join our Slack channel at https://slack.sglang.ai to discuss your PR.

zhaochenyang20 · 2025-02-21T08:40:17Z

@andjsmi This should be a nice PR. But did you find someone to test it on SageMaker? We do not have access.

andjsmi · 2025-02-21T11:26:49Z

Hey @zhaochenyang20.

Thanks! Yes, I've tested it on SageMaker myself and have included a screenshot below.

The main requirements are responding with an empty 200 OK from the /ping endpoint and accepting POST requests on /invocations, with the ability to stream a chunked-encoded response back. I have tested all of these use cases on a SageMaker endpoint.

If there's a particular way you'd like me to test further, please let me know.
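For readers without SageMaker access, the contract described in this comment can be approximated locally. The following is a minimal standard-library sketch, not the implementation in http_server.py: the handler class, the `start_server` helper, and the word-by-word "stream" are all assumptions made for illustration.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class SageMakerStyleHandler(BaseHTTPRequestHandler):
    """Illustrative stand-in for the /ping and /invocations contract."""

    protocol_version = "HTTP/1.1"  # chunked transfer encoding requires HTTP/1.1

    def do_GET(self):
        if self.path == "/ping":
            # Health check: empty 200 OK.
            self.send_response(200)
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            self.send_error(404)

    def do_POST(self):
        if self.path != "/invocations":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        self.send_response(200)
        self.send_header("Content-Type", "text/event-stream")
        self.send_header("Transfer-Encoding", "chunked")
        self.end_headers()
        # Assumption: echo the "text" field back one word per chunk,
        # standing in for streamed model output.
        for word in str(payload.get("text", "")).split():
            self._write_chunk(f"data: {word}\n\n".encode())
        self.wfile.write(b"0\r\n\r\n")  # zero-length chunk ends the stream

    def _write_chunk(self, data: bytes) -> None:
        # Chunk framing: hex length, CRLF, data, CRLF.
        self.wfile.write(f"{len(data):X}\r\n".encode() + data + b"\r\n")

    def log_message(self, *args):
        pass  # keep the example quiet


def start_server(port: int = 0) -> ThreadingHTTPServer:
    """Start the sketch server on a background thread (port 0 = any free port)."""
    server = ThreadingHTTPServer(("127.0.0.1", port), SageMakerStyleHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `start_server()` and then issuing `GET /ping` and `POST /invocations` against the returned server's port exercises both halves of the contract without a real endpoint.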

zhyncs · 2025-02-21T11:28:46Z

docker/Dockerfile.sagemaker

@@ -0,0 +1,78 @@
+ARG CUDA_VERSION=12.5.1
+
+FROM nvcr.io/nvidia/tritonserver:24.04-py3-min


Can we use the lmsysorg/sglang:latest as the base image?

zhyncs · 2025-02-21T11:30:56Z

This change looks good overall. I can merge it first, and minor changes can be submitted in a follow-up. Thank you very much for AWS's support!

Added SageMaker support

7856fd1

andjsmi requested review from ByronHsu, Ying1123, hnyls2002, ispobock, merrymercy and zhyncs as code owners February 21, 2025 03:14

andjsmi mentioned this pull request Feb 21, 2025

[Feature] Add SageMaker Support #3739

Closed

2 tasks

andjsmi and others added 4 commits February 21, 2025 03:22

Create sample SageMaker container

40883b2

Added sample SageMaker container

de733a8

Added sample SageMaker container

c7862a1

Merge branch 'main' into dev

85a2357

zhyncs reviewed Feb 21, 2025


zhyncs approved these changes Feb 21, 2025


zhyncs merged commit 1df6eab into sgl-project:main Feb 21, 2025
3 of 18 checks passed

aoshen524 pushed a commit to aoshen524/sglang that referenced this pull request Mar 10, 2025

feat: Add SageMaker support (sgl-project#3740)

f311608

zhyncs merged 5 commits into sgl-project:main from andjsmi:dev

3 participants
