[fix] Handle transposed w13_weight #357
Summary of Changes
This pull request provides a critical fix to adapt the …
Code Review
This pull request correctly adapts to the transposed w13_weight by updating the dimension indexing in swiglu_oai_native and swiglu_oai. The changes appear correct. I've added a couple of suggestions to improve code clarity and maintainability by handling an unused variable and avoiding a magic number.
…-npu into sgl-cmake2
* 'sgl-cmake2' of https://github.com/1329009851/sgl-kernel-npu:
  Fix the bug that the layout kernel crashed when the num of experts is no less than 384 (sgl-project#383)
  adapt sglang (sgl-project#357)
  GLM5 optimize (sgl-project#382)
  Update layernorm_gated.py (sgl-project#378)
  support qwen3.5 (sgl-project#377)
Co-authored-by: gengjinsong <gengjinsong@huawei.com>
In a recent PR in the sglang repository, layer.w13_weight was transposed. This PR adapts to that change.
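As a rough illustration of why a transpose forces the dimension indexing to change, here is a minimal sketch. It is not the code from this PR. The helper name `w13_dims`, the `transposed` flag, and both shape layouts are assumptions made for the example: it assumes the fused gate/up weight is stored as (num_experts, 2 * intermediate_size, hidden_size) before the transpose and as (num_experts, hidden_size, 2 * intermediate_size) after it.

```python
from typing import Sequence, Tuple


def w13_dims(shape: Sequence[int], transposed: bool) -> Tuple[int, int, int]:
    """Return (num_experts, hidden_size, intermediate_size) for a fused gate/up weight.

    Hypothetical helper: the two layouts below are assumptions for illustration,
    not a statement of the actual layout used by sglang or sgl-kernel-npu.
    """
    if len(shape) != 3:
        raise ValueError(f"expected a 3-D weight shape, got {tuple(shape)}")

    num_experts = shape[0]
    if transposed:
        # Assumed layout after the transpose: (experts, hidden, 2 * intermediate)
        hidden_size, fused = shape[1], shape[2]
    else:
        # Assumed layout before the transpose: (experts, 2 * intermediate, hidden)
        fused, hidden_size = shape[1], shape[2]

    # The fused dimension holds the gate and up projections side by side,
    # so it must split evenly in two.
    if fused % 2 != 0:
        raise ValueError(f"fused gate/up dimension must be even, got {fused}")

    return num_experts, hidden_size, fused // 2
```

Reading both sizes from one place like this keeps callers from hard-coding an axis index, which is the kind of magic number the review comment above asks to avoid.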