
[quantization] Fix NameError: name 'WNA16_SUPPORTED_BITS' is not defined #11552

Closed
kevin85421 wants to merge 2 commits into sgl-project:main from kevin85421:import-WNA16_SUPPORTED_BITS

Conversation

Collaborator

@kevin85421 kevin85421 commented Oct 13, 2025

Motivation

This PR fixes the NameError mentioned in #11383 by importing WNA16_SUPPORTED_BITS.

  • Run:

    python3 -m sglang.launch_server --model-path cpatonn/Qwen3-30B-A3B-Thinking-2507-AWQ-4bit
  • Before this PR:

    File ".../sglang/python/sglang/srt/layers/quantization/compressed_tensors/compressed_tensors_moe.py", line 369, in __init__
      and self.num_bits in WNA16_SUPPORTED_BITS
                           ^^^^^^^^^^^^^^^^^^^^
    NameError: name 'WNA16_SUPPORTED_BITS' is not defined
    
    [2025-10-13 16:49:19] Received sigquit from a child process. It usually means the child failed.
    [1]    1519263 killed     python3 -m sglang.launch_server --model-path
    
  • After this PR => the NameError is resolved, but the launch still fails for a separate reason. Users can update the configuration to work around it; see the example after this list.

     File ".../sglang/python/sglang/srt/layers/quantization/compressed_tensors/compressed_tensors_moe.py", line 388, in create_weights
       params_dtype == torch.float16
    AssertionError: float16 is required for MoE compressed models. Set dtype=torch.float16
    
    [2025-10-13 16:51:24] Received sigquit from a child process. It usually means the child failed.
    [1]    1519863 killed     python3 -m sglang.launch_server --model-path
    
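For reference, the configuration change hinted at above is passing float16 explicitly, as the assertion message suggests. A minimal example, assuming the standard --dtype server argument:

    python3 -m sglang.launch_server \
        --model-path cpatonn/Qwen3-30B-A3B-Thinking-2507-AWQ-4bit \
        --dtype float16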

Modifications
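The change itself is a one-line import. A minimal sketch of what it looks like, assuming the constant is imported from the vllm module that defines it upstream (the exact import path in the actual diff may differ):

    # compressed_tensors_moe.py: import the missing constant so the
    # `self.num_bits in WNA16_SUPPORTED_BITS` check can resolve the name.
    # The module path below is an assumption based on where vllm defines it.
    from vllm.model_executor.layers.quantization.compressed_tensors.schemes.compressed_tensors_wNa16 import (
        WNA16_SUPPORTED_BITS,
    )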

Accuracy Tests

Benchmarking and Profiling

Checklist

Signed-off-by: Kai-Hsun Chen <khchen@x.ai>
@kevin85421 kevin85421 changed the title [MoE] Fix NameError: name 'WNA16_SUPPORTED_BITS' is not defined [quantization] Fix NameError: name 'WNA16_SUPPORTED_BITS' is not defined Oct 13, 2025
@kevin85421 kevin85421 marked this pull request as ready for review October 13, 2025 16:54
@hebiao064 hebiao064 added the run-ci and express-lane (A PR may be merged without a full CI check) labels Oct 13, 2025
Contributor

@merrymercy merrymercy left a comment


  1. please fix the failed vllm dependency test
  2. please avoid the import from vllm. For AWQ, I think sglang should be able to run it without any import from vllm (one possible approach is sketched below)
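For context, one way to address point 2 would be to define the constant on the sglang side instead of importing it. A sketch, assuming the supported bit-widths mirror what vllm's compressed-tensors WNA16 scheme supports; this is illustrative, not the change any particular PR made:

    # Hypothetical sglang-local definition: the WNA16 (weight-only,
    # N-bit weights / 16-bit activations) path supports 4- and 8-bit weights.
    WNA16_SUPPORTED_BITS = [4, 8]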

@kevin85421
Collaborator Author

Thanks for the review!

please fix the failed vllm dependency test

The error does not seem to be related to this PR.

command=python3 -m sglang.launch_server --model-path hugging-quants/Mixtral-8x7B-Instruct-v0.1-AWQ-INT4 --trust-remote-code --device cuda --host 127.0.0.1 --port 13000
======================================================================
ERROR: setUpClass (__main__.TestAWQ)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/public_sglang_ci/runner-l1f-gpu-23/_work/sglang/sglang/test/srt/quant/test_awq.py", line 20, in setUpClass
    cls.process = popen_launch_server(
  File "/public_sglang_ci/runner-l1f-gpu-23/_work/sglang/sglang/python/sglang/test/test_utils.py", line 641, in popen_launch_server
    raise TimeoutError("Server failed to start within the timeout period.")
TimeoutError: Server failed to start within the timeout period.

----------------------------------------------------------------------
Ran 0 tests in 600.574s

please avoid the import from vllm. For AWQ, i think sglang should be able to run it without any import from vllm

I will take a look.

@hnyls2002 hnyls2002 removed the express-lane (A PR may be merged without a full CI check) label Oct 20, 2025
@Hongbosherlock
Contributor

  1. please fix the failed vllm dependency test
  2. please avoid the import from vllm. For AWQ, i think sglang should be able to run it without any import from vllm

@merrymercy I have already worked on it. It's ready to be merged now: #10750

@kevin85421 kevin85421 closed this Oct 22, 2025