[CPU] add fused_qkvzba_split_reshape_cat kernel for Qwen3-next #12330

Merged: FlamingoPg merged 9 commits into sgl-project:main from blzheng:beilei/q3n_fused_qkvzba_split_reshape_cat on Dec 3, 2025

@blzheng (Contributor) commented on Oct 29, 2025

Motivation

This PR adds a fused_qkvzba_split_reshape_cat kernel for Qwen3-next on CPU.
Reference: https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/models/qwen3_next.py#L378-L394
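For readers unfamiliar with the operation, the following is a minimal, dependency-free sketch of the data movement that the kernel name suggests, shown for a single token. It is not the PR's code. The function name `split_reshape_cat_reference`, its parameters, and the interleaved per-key-head layout (`q | k | v | z` in one projection, `b | a` in the other) are assumptions for illustration; the referenced lines in `qwen3_next.py` are the authoritative definition.

```python
def split_reshape_cat_reference(qkvz_row, ba_row, num_k_heads, num_v_heads,
                                head_k_dim, head_v_dim):
    """Hypothetical single-token reference (assumed layout, not the PR's kernel).

    qkvz_row: flat list, grouped per key head as [q | k | v | z]
    ba_row:   flat list, grouped per key head as [b | a]
    """
    ratio = num_v_heads // num_k_heads            # value heads per key head
    qkvz_group = 2 * head_k_dim + 2 * ratio * head_v_dim
    ba_group = 2 * ratio
    assert len(qkvz_row) == num_k_heads * qkvz_group
    assert len(ba_row) == num_k_heads * ba_group

    q, k, v, z, b, a = [], [], [], [], [], []
    for g in range(num_k_heads):
        # Split: slice this key-head group out of the mixed projection.
        chunk = qkvz_row[g * qkvz_group:(g + 1) * qkvz_group]
        v_start = 2 * head_k_dim
        z_start = v_start + ratio * head_v_dim
        q += chunk[:head_k_dim]
        k += chunk[head_k_dim:v_start]
        v += chunk[v_start:z_start]
        z += chunk[z_start:]

        ba_chunk = ba_row[g * ba_group:(g + 1) * ba_group]
        b += ba_chunk[:ratio]
        a += ba_chunk[ratio:]

    # Cat: q, k and v are flattened and concatenated into one buffer.
    mixed_qkv = q + k + v
    # Reshape: z is regrouped as one row per value head.
    z_heads = [z[i:i + head_v_dim] for i in range(0, len(z), head_v_dim)]
    return mixed_qkv, z_heads, b, a


# Tiny example: 2 key heads, 4 value heads, head dims of 1.
mixed_qkv, z_heads, b, a = split_reshape_cat_reference(
    list(range(12)), list(range(8)),
    num_k_heads=2, num_v_heads=4, head_k_dim=1, head_v_dim=1,
)
```

A fused kernel would presumably produce these outputs in one pass over the input rather than through separate split, reshape, and concatenate steps, which avoids materializing the intermediate tensors.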

Test Plan:
test/srt/cpu/test_qwen3.py

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

@mingfeima mingfeima added cpu cpu backend performance optimization intel sgl-kernel run-ci labels Nov 10, 2025
@mingfeima (Collaborator) left a comment:

LGTM.

@blzheng blzheng marked this pull request as ready for review November 11, 2025 06:56
@FlamingoPg FlamingoPg merged commit 974c562 into sgl-project:main Dec 3, 2025
126 of 131 checks passed
yingluosanqian pushed a commit to yingluosanqian/sglang that referenced this pull request Dec 4, 2025
tonyluj pushed a commit to openanolis/sglang that referenced this pull request Dec 5, 2025

4 participants
