
[quantization] Fix NameError: name 'WNA16_SUPPORTED_BITS' is not defined #11552

Closed
kevin85421 wants to merge 2 commits into sgl-project:main from kevin85421:import-WNA16_SUPPORTED_BITS

Conversation

Collaborator

@kevin85421 kevin85421 commented Oct 13, 2025

Motivation

This PR fixes the NameError mentioned in #11383 by importing WNA16_SUPPORTED_BITS.

  • Run:

    python3 -m sglang.launch_server --model-path cpatonn/Qwen3-30B-A3B-Thinking-2507-AWQ-4bit
  • Before this PR:

    File ".../sglang/python/sglang/srt/layers/quantization/compressed_tensors/compressed_tensors_moe.py", line 369, in __init__
      and self.num_bits in WNA16_SUPPORTED_BITS
                           ^^^^^^^^^^^^^^^^^^^^
    NameError: name 'WNA16_SUPPORTED_BITS' is not defined
    
    [2025-10-13 16:49:19] Received sigquit from a child process. It usually means the child failed.
    [1]    1519263 killed     python3 -m sglang.launch_server --model-path
    
  • After this PR => the NameError is resolved, but the launch still fails for a separate reason. Users can update the configuration to work around it; see the example after this list.

     File ".../sglang/python/sglang/srt/layers/quantization/compressed_tensors/compressed_tensors_moe.py", line 388, in create_weights
       params_dtype == torch.float16
    AssertionError: float16 is required for MoE compressed models. Set dtype=torch.float16
    
    [2025-10-13 16:51:24] Received sigquit from a child process. It usually means the child failed.
    [1]    1519863 killed     python3 -m sglang.launch_server --model-path
    
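For reference, the configuration change hinted at above is passing float16 explicitly, as the assertion message suggests. A minimal example, assuming the standard --dtype server argument:

    python3 -m sglang.launch_server \
        --model-path cpatonn/Qwen3-30B-A3B-Thinking-2507-AWQ-4bit \
        --dtype float16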

Modifications
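The change itself is a one-line import. A minimal sketch of what it looks like, assuming the constant is imported from the vllm module that defines it upstream (the exact import path in the actual diff may differ):

    # compressed_tensors_moe.py: import the missing constant so the
    # `self.num_bits in WNA16_SUPPORTED_BITS` check can resolve the name.
    # The module path below is an assumption based on where vllm defines it.
    from vllm.model_executor.layers.quantization.compressed_tensors.schemes.compressed_tensors_wNa16 import (
        WNA16_SUPPORTED_BITS,
    )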

Accuracy Tests

Benchmarking and Profiling

Checklist

Signed-off-by: Kai-Hsun Chen <khchen@x.ai>
@kevin85421 kevin85421 changed the title [MoE] Fix NameError: name 'WNA16_SUPPORTED_BITS' is not defined [quantization] Fix NameError: name 'WNA16_SUPPORTED_BITS' is not defined Oct 13, 2025
@kevin85421 kevin85421 marked this pull request as ready for review October 13, 2025 16:54
@hebiao064 hebiao064 added the run-ci and express-lane (A PR may be merged without a full CI check) labels Oct 13, 2025
Contributor

@merrymercy merrymercy left a comment


  1. please fix the failed vllm dependency test
  2. please avoid the import from vllm. For AWQ, I think sglang should be able to run it without any import from vllm (one possible approach is sketched below)
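For context, one way to address point 2 would be to define the constant on the sglang side instead of importing it. A sketch, assuming the supported bit-widths mirror what vllm's compressed-tensors WNA16 scheme supports; this is illustrative, not the change any particular PR made:

    # Hypothetical sglang-local definition: the WNA16 (weight-only,
    # N-bit weights / 16-bit activations) path supports 4- and 8-bit weights.
    WNA16_SUPPORTED_BITS = [4, 8]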

@kevin85421
Collaborator Author

Thanks for the review!

please fix the failed vllm dependency test

The error does not seem to be related to this PR.

command=python3 -m sglang.launch_server --model-path hugging-quants/Mixtral-8x7B-Instruct-v0.1-AWQ-INT4 --trust-remote-code --device cuda --host 127.0.0.1 --port 13000
======================================================================
ERROR: setUpClass (__main__.TestAWQ)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/public_sglang_ci/runner-l1f-gpu-23/_work/sglang/sglang/test/srt/quant/test_awq.py", line 20, in setUpClass
    cls.process = popen_launch_server(
  File "/public_sglang_ci/runner-l1f-gpu-23/_work/sglang/sglang/python/sglang/test/test_utils.py", line 641, in popen_launch_server
    raise TimeoutError("Server failed to start within the timeout period.")
TimeoutError: Server failed to start within the timeout period.

----------------------------------------------------------------------
Ran 0 tests in 600.574s

please avoid the import from vllm. For AWQ, i think sglang should be able to run it without any import from vllm

I will take a look.

@hnyls2002 hnyls2002 removed the express-lane (A PR may be merged without a full CI check) label Oct 20, 2025
@Hongbosherlock
Contributor

  1. please fix the failed vllm dependency test
  2. please avoid the import from vllm. For AWQ, i think sglang should be able to run it without any import from vllm

@merrymercy I have already worked on it. It's ready to be merged now: #10750

@kevin85421 kevin85421 closed this Oct 22, 2025