ValueError: Cannot get 31 free blocks from the pool #197

@alecngo


I hit this error while my engines were running inference:

(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710] EngineCore encountered a fatal error.
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710] Traceback (most recent call last):
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]   File "/opt/venv/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 701, in run_engine_core
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]     engine_core.run_busy_loop()
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]   File "/opt/venv/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 728, in run_busy_loop
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]     self._process_engine_step()
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]   File "/opt/venv/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 754, in _process_engine_step
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]     outputs, model_executed = self.step_fn()
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]                               ^^^^^^^^^^^^^^
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]   File "/opt/venv/lib/python3.12/site-packages/vllm/v1/engine/core.py", line 283, in step
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]     scheduler_output = self.scheduler.schedule()
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]                        ^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]   File "/opt/venv/lib/python3.12/site-packages/vllm/v1/core/sched/scheduler.py", line 471, in schedule
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]     new_blocks = self.kv_cache_manager.allocate_slots(
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]   File "/opt/venv/lib/python3.12/site-packages/vllm/v1/core/kv_cache_manager.py", line 288, in allocate_slots
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]     new_blocks = self.coordinator.allocate_new_blocks(
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]   File "/opt/venv/lib/python3.12/site-packages/vllm/v1/core/kv_cache_coordinator.py", line 112, in allocate_new_blocks
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]     return tuple(
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]            ^^^^^^
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]   File "/opt/venv/lib/python3.12/site-packages/vllm/v1/core/kv_cache_coordinator.py", line 113, in <genexpr>
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]     manager.allocate_new_blocks(
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]   File "/opt/venv/lib/python3.12/site-packages/vllm/v1/core/single_type_kv_cache_manager.py", line 129, in allocate_new_blocks
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]     new_blocks = self.block_pool.get_new_blocks(num_new_blocks)
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]   File "/tmp/kvcached/kvcached/integration/vllm/patches.py", line 87, in get_new_blocks
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710]     raise ValueError(f"Cannot get {num_blocks} free blocks from the pool")
(EngineCore_DP0 pid=10271) ERROR 10-27 14:16:34 [core.py:710] ValueError: Cannot get 31 free blocks from the pool

I confirmed that I did not set gpu_memory_utilization. I am running qwen7B on a single A100_80GB, with 3 engines hosted on that one GPU. I sometimes also see an issue similar to #191, and I suspect the two share a root cause. @jiarong0907 @ivanium
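
To make the failure mode concrete, here is a minimal sketch of how I understand the last traceback frame. `ToyBlockPool` and `try_allocate` are hypothetical names of my own, not kvcached or vLLM code, and the guard condition is an assumption: the traceback only shows the `raise`, not the check that leads to it.

```python
class ToyBlockPool:
    """Hypothetical stand-in for a KV-cache block pool (not kvcached's code)."""

    def __init__(self, num_blocks: int) -> None:
        self.free_blocks = list(range(num_blocks))

    def get_new_blocks(self, num_blocks: int) -> list[int]:
        # Assumed guard: the real condition in patches.py is not visible
        # in the traceback, only the error message is.
        if num_blocks > len(self.free_blocks):
            raise ValueError(f"Cannot get {num_blocks} free blocks from the pool")
        taken = self.free_blocks[:num_blocks]
        self.free_blocks = self.free_blocks[num_blocks:]
        return taken


def try_allocate(pool: ToyBlockPool, num_blocks: int):
    """Return the allocated block ids, or the error message on failure."""
    try:
        return pool.get_new_blocks(num_blocks)
    except ValueError as exc:
        return str(exc)


pool = ToyBlockPool(40)
first = try_allocate(pool, 20)   # succeeds, 20 blocks remain free
second = try_allocate(pool, 31)  # fails, only 20 blocks are left
print(len(first), second)
```

In the toy version the error is returned as a string and the pool is left untouched. In my run the exception propagates out of scheduler.schedule() and the EngineCore reports a fatal error.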

Labels: bug
