[fix] Handle transposed w13_weight #357

Merged
RuixuanZhang06 merged 1 commit into sgl-project:main from gjsheu:main
Feb 26, 2026

@gjsheu gjsheu commented Feb 2, 2026

A recent PR in the sglang repository transposed layer.w13_weight. This PR adapts sgl-kernel-npu to that change.

@gemini-code-assist

Summary of Changes

Hello @gjsheu, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request provides a critical fix to adapt the sgl_kernel_npu module to a recent change in the sglang repository where the layer.w13_weight tensor was transposed. The changes ensure that the swiglu_oai functions correctly interpret the dimensions of w13_weight, preventing potential runtime errors and maintaining compatibility with the updated tensor structure.

Highlights

  • w13_weight dimension handling: Adjusted the unpacking of layer.w13_weight.size() in swiglu_oai_native to correctly account for a transposed w13_weight, changing E, N, _ to E, _, N.
  • w13_weight shape access: Modified the index used to access layer.w13_weight.shape from [1] to [2] when passing the dimension to swiglu_oai_triton, aligning with the new w13_weight transposition.
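
To make the two highlights concrete, here is a minimal sketch of why the unpacking order matters. It uses plain shape tuples instead of real tensors, and the sizes, the meaning of the middle/last axes, and the helper names are assumptions chosen for illustration; only the `E, N, _` to `E, _, N` change and the `[1]` to `[2]` index change come from this PR.

```python
# Hypothetical sizes, chosen only for illustration.
E, N, K = 4, 6, 8

# Assumed layouts: before the sglang change N sat on axis 1,
# after the transpose it sits on axis 2.
old_shape = (E, N, K)
new_shape = (E, K, N)


def n_with_old_unpacking(shape):
    # Mirrors the previous code: E, N, _ = layer.w13_weight.size()
    _e, n, _ = shape
    return n


def n_with_new_unpacking(shape):
    # Mirrors the updated code: E, _, N = layer.w13_weight.size()
    _e, _, n = shape
    return n


# Each unpacking is correct for the layout it was written for.
assert n_with_old_unpacking(old_shape) == N
assert n_with_new_unpacking(new_shape) == N

# The old unpacking applied to the transposed layout picks up the
# wrong axis without raising, which is the mismatch this PR removes.
assert n_with_old_unpacking(new_shape) == K

# The same reasoning applies to direct indexing: shape[1] becomes shape[2].
assert old_shape[1] == new_shape[2] == N
```

Because both layouts are three-dimensional, the stale unpacking does not fail on its own; the wrong value would only surface later, if at all, as a shape mismatch or incorrect output.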

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly adapts to the transposed w13_weight by updating the dimension indexing in swiglu_oai_native and swiglu_oai. The changes appear correct. I've added a couple of suggestions to improve code clarity and maintainability by handling an unused variable and avoiding a magic number.
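
The review's inline suggestions are not reproduced in this conversation. As a rough illustration of the two practices it names (not the reviewer's actual suggestions), the sketch below binds unused dimensions to underscores and replaces a bare index with a named constant. `W13_N_AXIS` and both helper functions are hypothetical names, and the `(E, K, N)` layout is an assumption.

```python
# Hypothetical constant naming the axis that holds N in the
# transposed layout, so callers do not hard-code the index 2.
W13_N_AXIS = 2


def w13_n(shape):
    """Return N from an (E, K, N)-shaped w13_weight (assumed layout)."""
    return shape[W13_N_AXIS]


def unpack_n_only(shape):
    # Bind dimensions that are not used to underscores so that
    # only the value actually needed gets a name.
    _, _, n = shape
    return n


example_shape = (4, 8, 6)  # illustrative sizes only
assert w13_n(example_shape) == unpack_n_only(example_shape)
```

A named axis constant means a future layout change touches one definition instead of every indexing site.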

@RuixuanZhang06 RuixuanZhang06 merged commit 5d119b5 into sgl-project:main Feb 26, 2026
4 checks passed
1329009851 added a commit to 1329009851/sgl-kernel-npu that referenced this pull request Feb 27, 2026
…-npu into sgl-cmake2

* 'sgl-cmake2' of https://github.com/1329009851/sgl-kernel-npu:
  Fix the bug that the layout kernel crashed when the num of experts is no less than 384 (sgl-project#383)
  adapt sglang (sgl-project#357)
  GLM5 optimize (sgl-project#382)
  Update layernorm_gated.py (sgl-project#378)
  support qwen3.5 (sgl-project#377)
zzx-study pushed a commit to zzx-study/sgl-kernel-npu that referenced this pull request Feb 28, 2026
Co-authored-by: gengjinsong <gengjinsong@huawei.com>

3 participants