Support async in DeepEP #4610

Merged
zhyncs merged 35 commits into sgl-project:main from fzyzcjy:feat/deepseek_async
Mar 23, 2025

fzyzcjy commented Mar 20, 2025

Motivation

#4068 requires DeepEP to run asynchronously. This PR enables that with minimal changes.

(This is a separate PR because #4068 may take a while to finish; extracting this part first should reduce merge conflicts.)

Related: #4232 (Initial DeepEP support)

When viewing the diff, please subtract the changes from #4608.

To demonstrate that the PR works, I temporarily set async_finish=True. In real-world use, async will probably be enabled only when doing two-batch-overlap. The flag will be handled in #4068. (I can also set it to false in this PR if needed.)
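
To illustrate why an asynchronous finish matters for overlapping two batches, here is a minimal, CPU-only sketch. It is only an analogy under stated assumptions: `FakeEvent`, `fake_dispatch`, and `run_two_batches` are hypothetical names invented for this example and are not DeepEP or SGLang APIs; a worker thread stands in for the communication stream.

```python
import threading
import time


class FakeEvent:
    """Hypothetical completion handle (not a DeepEP class)."""

    def __init__(self):
        self._done = threading.Event()

    def mark_done(self):
        self._done.set()

    def wait(self):
        # Block the caller until the background work has finished.
        self._done.wait()


def fake_dispatch(tokens, async_finish):
    """Hypothetical dispatch: 'sends' tokens on a worker thread.

    With async_finish=False the call blocks until the transfer is done.
    With async_finish=True it returns immediately and the caller must
    wait on the returned event before reading the output.
    """
    received = []
    event = FakeEvent()

    def work():
        time.sleep(0.05)  # simulated communication latency
        received.extend(tokens)
        event.mark_done()

    threading.Thread(target=work).start()
    if not async_finish:
        event.wait()
    return received, event


def run_two_batches():
    # Start communication for batch A without blocking.
    recv_a, event_a = fake_dispatch([1, 2, 3], async_finish=True)
    # Overlap: compute on batch B while A's transfer is in flight.
    result_b = sum(x * x for x in [4, 5, 6])
    # Synchronize before touching batch A's output.
    event_a.wait()
    return recv_a, result_b


if __name__ == "__main__":
    print(run_two_batches())
```

The point of the sketch is the ordering: launch, do unrelated work, then wait. With a blocking finish, the computation on the second batch could not start until the first transfer had completed.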

Modifications

Checklist

fzyzcjy changed the title from "Feat/deepseek async" to "Support async in DeepEP" on Mar 20, 2025
fzyzcjy marked this pull request as ready for review on March 20, 2025 06:37
fzyzcjy requested a review from merrymercy as a code owner on March 20, 2025 06:37

ch-wan commented Mar 23, 2025

For future reference, I ran a simple benchmark of this PR on one 8xH200 machine. Here are the commands:

python3 -m sglang.launch_server --model-path deepseek-ai/DeepSeek-V3 --trust-remote-code --tp 8 --dp 8 --host 0.0.0.0 --port 30000 --enable-dp-attention --enable-deepep-moe --disable-cuda-graph
python3 -m sglang.bench_serving --backend sglang --dataset-name random --num-prompt 512 --random-input 1000 --random-output 1000 --random-range-ratio 1 --host 127.0.0.1 --port 30000 --max-concurrency 128

Results:

| Version | Concurrency | Input (tokens) | Output (tokens) | Num Requests | Input Throughput (tok/s) | Output Throughput (tok/s) | Total Throughput (tok/s) |
|---|---|---|---|---|---|---|---|
| Original | 127.98 | 1000 | 1000 | 512 | 555.48 | 555.48 | 1110.97 |
| Current | 127.96 | 1000 | 1000 | 512 | 612.16 | 612.16 | 1224.33 |
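
For readers who want the relative change, the arithmetic on the reported totals can be sketched as follows. The numbers are copied from the results above; the helper name `relative_gain` is only illustrative.

```python
# Total throughput (tok/s) as reported in the results above.
original_total = 1110.97
current_total = 1224.33


def relative_gain(before, after):
    """Fractional improvement of `after` over `before`."""
    return (after - before) / before


gain = relative_gain(original_total, current_total)
print(f"Total throughput gain: {gain:.1%}")
```

On this single run, that works out to roughly a 10% increase in total throughput.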

fzyzcjy mentioned this pull request on Mar 23, 2025
6 tasks
zhyncs merged commit ca75741 into sgl-project:main on Mar 23, 2025
0 of 18 checks passed
ch-wan mentioned this pull request on Mar 24, 2025
18 tasks