Fix dpsk-r1-fp4 launching crash by Qiaolin-Yu · Pull Request #12063 · sgl-project/sglang

Qiaolin-Yu · 2025-10-24T07:45:45Z

Motivation

Closes #12059, which was introduced by #11795.

This PR partially reverts the changes from #11795 as a quick fix. The backend selection may be refined in the future.


gemini-code-assist · 2025-10-24T07:45:59Z

Summary of Changes

Hello @Qiaolin-Yu, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical launching crash identified in dpsk-r1-fp4 configurations. The fix involves streamlining the selection mechanism for the Mixture of Experts (MoE) runner backend by removing an automatic detection step. Instead, the backend is now explicitly chosen based on a configuration flag, effectively stabilizing the system and preventing the crash.

Highlights

Crash Fix: This pull request resolves a launch crash specifically affecting dpsk-r1-fp4 models, which was introduced by a previous change (PR #11795, "Refactor Triton-kernel MoE runner integration").
MoE Runner Backend Selection: The dynamic auto-detection logic for the Mixture of Experts (MoE) runner backend has been removed. The backend is now assigned directly to either TRITON_KERNELS or TRITON based on the self.use_triton_kernels flag.
Partial Revert: To provide a quick fix for the crash, this PR partially reverts changes from PR #11795, simplifying the backend initialization process.
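As a rough illustration of the flag-based selection described in the highlights, the sketch below reproduces the conditional in a self-contained form. The `MoeRunnerBackend` enum here is a stand-in with only the two members named in this PR (their string values are assumptions), and `select_backend` is a hypothetical helper, not a function from sglang.

```python
from enum import Enum


class MoeRunnerBackend(Enum):
    # Stand-in for the real enum; only the two members named in this PR.
    # The string values are assumptions for illustration.
    TRITON = "triton"
    TRITON_KERNELS = "triton_kernels"


def select_backend(use_triton_kernels: bool) -> MoeRunnerBackend:
    """Choose the backend from a single boolean flag, with no auto-detection."""
    return (
        MoeRunnerBackend.TRITON_KERNELS
        if use_triton_kernels
        else MoeRunnerBackend.TRITON
    )
```

Because the result depends only on the flag, the selection can never yield a backend outside these two options.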


gemini-code-assist

Code Review

This pull request fixes a crash in dpsk-r1-fp4 launching by partially reverting a previous change. The modification hardcodes the MoE runner backend selection in Mxfp4MoEMethod and UnquantizedFusedMoEMethod to either TRITON or TRITON_KERNELS. This prevents a NotImplementedError when an unsupported backend is configured.

While this is an effective quick fix as intended, I've suggested adding comments to the code to document this temporary solution and link to the relevant issue. This will help future developers understand the context and the need for a more robust refactoring.
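The review does not show the runner's dispatch code, so the following is only a hypothetical sketch of the failure mode it describes: a runner that accepts a limited set of backends raises NotImplementedError when handed anything else, and a flag-based choice cannot produce such a value. All names here (`Backend`, `OTHER`, `build_runner`, `pick_from_flag`) are invented for illustration.

```python
from enum import Enum


class Backend(Enum):
    TRITON = "triton"
    TRITON_KERNELS = "triton_kernels"
    OTHER = "other"  # hypothetical backend that this runner cannot build


# Backends the hypothetical runner knows how to construct.
SUPPORTED = {Backend.TRITON, Backend.TRITON_KERNELS}


def build_runner(backend: Backend) -> str:
    """Return a placeholder runner, rejecting backends outside SUPPORTED."""
    if backend not in SUPPORTED:
        raise NotImplementedError(f"Unsupported runner backend: {backend}")
    return f"runner[{backend.value}]"


def pick_from_flag(use_triton_kernels: bool) -> Backend:
    """Flag-based selection: the result is always a member of SUPPORTED."""
    return Backend.TRITON_KERNELS if use_triton_kernels else Backend.TRITON
```

Under these assumptions, `build_runner(pick_from_flag(flag))` succeeds for either flag value, whereas passing a globally configured backend such as `Backend.OTHER` raises.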

gemini-code-assist · 2025-10-24T07:52:31Z

python/sglang/srt/layers/quantization/mxfp4.py

+        backend = (
+            MoeRunnerBackend.TRITON_KERNELS
+            if self.use_triton_kernels
+            else MoeRunnerBackend.TRITON
+        )


This hardcodes the backend selection to avoid a crash with unsupported backends. As noted in the PR description, this is a temporary fix. It would be beneficial to add a comment here to inform future developers about this, and that it should be refactored for more flexible backend selection.

Suggested change

-        backend = (
-            MoeRunnerBackend.TRITON_KERNELS
-            if self.use_triton_kernels
-            else MoeRunnerBackend.TRITON
-        )
+        # TODO(https://github.com/sgl-project/sglang/issues/12059): This is a temporary fix to avoid a crash with unsupported backends.
+        # This should be refactored for more flexible backend selection in the future.
+        backend = (
+            MoeRunnerBackend.TRITON_KERNELS
+            if self.use_triton_kernels
+            else MoeRunnerBackend.TRITON
+        )

gemini-code-assist · 2025-10-24T07:52:31Z

python/sglang/srt/layers/quantization/unquant.py

+        backend = (
+            MoeRunnerBackend.TRITON_KERNELS
+            if self.use_triton_kernels
+            else MoeRunnerBackend.TRITON
+        )


This hardcodes the backend selection to avoid a crash with unsupported backends. As noted in the PR description, this is a temporary fix. It would be beneficial to add a comment here to inform future developers about this, and that it should be refactored for more flexible backend selection.

Suggested change

-        backend = (
-            MoeRunnerBackend.TRITON_KERNELS
-            if self.use_triton_kernels
-            else MoeRunnerBackend.TRITON
-        )
+        # TODO(https://github.com/sgl-project/sglang/issues/12059): This is a temporary fix to avoid a crash with unsupported backends.
+        # This should be refactored for more flexible backend selection in the future.
+        backend = (
+            MoeRunnerBackend.TRITON_KERNELS
+            if self.use_triton_kernels
+            else MoeRunnerBackend.TRITON
+        )

FlamingoPg · 2025-10-24T07:54:01Z

Looks good!

fix

44a9dbe

Qiaolin-Yu requested review from BBuf, Edwardf0t1, FlamingoPg and ch-wan as code owners October 24, 2025 07:45

Qiaolin-Yu assigned JustinTong0323 Oct 24, 2025

sglang-bot added the run-ci label Oct 24, 2025

Qiaolin-Yu assigned ch-wan and Fridge003 Oct 24, 2025

Qiaolin-Yu added the high priority label Oct 24, 2025

JustinTong0323 added the express-lane label ("A PR may be merged without a full CI check") Oct 24, 2025

gemini-code-assist bot reviewed Oct 24, 2025

Qiaolin-Yu assigned ispobock Oct 24, 2025

hnyls2002 merged commit 71d4121 into sgl-project:main Oct 24, 2025
19 of 159 checks passed

hnyls2002 merged 1 commit into sgl-project:main from Qiaolin-Yu:fix_moe

8 participants
