Description
Checklist
- I searched related issues but found no solution.
- The bug persists in the latest version.
- Issues without environment info and a minimal reproducible demo are hard to resolve and may receive no feedback.
- If this is not a bug report but a general question, please start a discussion at https://github.com/sgl-project/sglang/discussions. Otherwise, it will be closed.
- Please use English. Otherwise, it will be closed.
Describe the bug
Start the SGLang server with the following environment variables and parameters:
export SGLANG_SET_CPU_AFFINITY=1
export ENABLE_PROFILING=0
export HCCL_OP_EXPANSION_MODE="AIV"
unset ASCEND_LAUNCH_BLOCKING
export ASCEND_USE_FIA=0
export PYTORCH_NPU_ALLOC_CONF=expandable_segments:True
export STREAMS_PER_DEVICE=32
export HCCL_SOCKET_IFNAME=lo
export GLOO_SOCKET_IFNAME=lo
source /usr/local/Ascend/8.5.0/bisheng_toolkit/set_env.sh
python3 -m sglang.launch_server \
  --device npu \
  --model-path /data/qwen3-32B \
  --dtype bfloat16 \
  --trust-remote-code \
  --attention-backend ascend \
  --disable-radix-cache \
  --mem-fraction-static 0.8 \
  --chunked-prefill-size 32768 \
  --pp-size 2 \
  --host 0.0.0.0 \
  --port 8080
Errors like this then occur:
Reproduction
export SGLANG_SET_CPU_AFFINITY=1
export ENABLE_PROFILING=0
export HCCL_OP_EXPANSION_MODE="AIV"
unset ASCEND_LAUNCH_BLOCKING
export ASCEND_USE_FIA=0
export PYTORCH_NPU_ALLOC_CONF=expandable_segments:True
export STREAMS_PER_DEVICE=32
export HCCL_SOCKET_IFNAME=lo
export GLOO_SOCKET_IFNAME=lo
source /usr/local/Ascend/8.5.0/bisheng_toolkit/set_env.sh
python3 -m sglang.launch_server \
  --device npu \
  --model-path /data/qwen3-32B \
  --dtype bfloat16 \
  --trust-remote-code \
  --attention-backend ascend \
  --disable-radix-cache \
  --mem-fraction-static 0.8 \
  --chunked-prefill-size 32768 \
  --pp-size 2 \
  --host 0.0.0.0 \
  --port 8080
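For convenience, the same reproduction could be kept as a single script. This is only a sketch: it reuses the exact variables and flags listed above, holds the flags in a bash array so that no line-continuation backslashes are needed, and adds a hypothetical `DRY_RUN` switch (not part of sglang) that defaults to printing the command instead of running it. The `ASCEND_ENV` and `LAUNCH_ARGS` names are likewise illustrative.

```shell
#!/usr/bin/env bash
# Sketch of the reproduction as one script (bash, because it uses arrays).
# DRY_RUN is a hypothetical switch added for illustration; it defaults to 1,
# so the launch command is only printed unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

export SGLANG_SET_CPU_AFFINITY=1
export ENABLE_PROFILING=0
export HCCL_OP_EXPANSION_MODE="AIV"
unset ASCEND_LAUNCH_BLOCKING
export ASCEND_USE_FIA=0
export PYTORCH_NPU_ALLOC_CONF=expandable_segments:True
export STREAMS_PER_DEVICE=32
export HCCL_SOCKET_IFNAME=lo
export GLOO_SOCKET_IFNAME=lo

# Toolkit path taken from the report; sourced only if it exists on this machine.
ASCEND_ENV=/usr/local/Ascend/8.5.0/bisheng_toolkit/set_env.sh
if [ -f "$ASCEND_ENV" ]; then
  source "$ASCEND_ENV"
fi

# Same flags as in the report, one array element per word.
LAUNCH_ARGS=(
  --device npu
  --model-path /data/qwen3-32B
  --dtype bfloat16
  --trust-remote-code
  --attention-backend ascend
  --disable-radix-cache
  --mem-fraction-static 0.8
  --chunked-prefill-size 32768
  --pp-size 2
  --host 0.0.0.0
  --port 8080
)

if [ "$DRY_RUN" = "1" ]; then
  # Print the command that would be run.
  echo "python3 -m sglang.launch_server ${LAUNCH_ARGS[*]}"
else
  python3 -m sglang.launch_server "${LAUNCH_ARGS[@]}"
fi
```

Running the script as-is prints the assembled command for inspection; running it with `DRY_RUN=0` performs the actual launch.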
Environment
A2-NPU, Driver: 25.2.2
