V1 - dont look for bucket we know don't exists #1606

Merged
adobrzyn merged 2 commits into habana_main from adobrzyn/dont_look_for_config_that_isnt_there
Jul 16, 2025
@adobrzyn

No description provided.

Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
@adobrzyn
Author

/run-gaudi-tests

@madamczyk-intel madamczyk-intel requested a review from Copilot July 16, 2025 12:33

Copilot AI left a comment

Pull Request Overview

This PR introduces an early exit in the 2D prompt bucketing logic when the batch size exceeds the maximum allowed, and updates the merge check to handle the new sentinel return values.

  • Added a pre-check in _bucketize_2d_prompt to return (None, None, None) for oversized batches.
  • Updated _can_merge_prefill_contents to treat any None in the bucketing result as a non-mergeable case.
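
The two changes summarized above can be sketched as follows. This is a minimal illustration, not the code from `hpu_model_runner.py`: the class `PrefillBucketingSketch`, the helper `round_up`, the bucket lists, and the meaning assigned to the three tuple slots (batch size, sequence length, context blocks) are all assumptions made for the example. Only the names `_bucketize_2d_prompt`, `_can_merge_prefill_contents`, `max_prefill_batch_size`, and the `(None, None, None)` sentinel come from the review summary.

```python
from bisect import bisect_left
from typing import Optional, Sequence, Tuple

# Assumed layout of the result: (batch size, sequence length, context blocks).
Bucket = Tuple[Optional[int], Optional[int], Optional[int]]


def round_up(value: int, buckets: Sequence[int]) -> Optional[int]:
    """Return the smallest bucket >= value, or None if no bucket fits.

    Hypothetical helper; `buckets` must be sorted in ascending order.
    """
    idx = bisect_left(buckets, value)
    return buckets[idx] if idx < len(buckets) else None


class PrefillBucketingSketch:
    """Hypothetical stand-in for the model runner's prompt bucketing."""

    def __init__(self, max_prefill_batch_size: int,
                 bs_buckets: Sequence[int],
                 seq_buckets: Sequence[int],
                 ctx_buckets: Sequence[int]) -> None:
        self.max_prefill_batch_size = max_prefill_batch_size
        self.bs_buckets = bs_buckets
        self.seq_buckets = seq_buckets
        self.ctx_buckets = ctx_buckets

    def _bucketize_2d_prompt(self, bs: int, seq_len: int,
                             ctx_blocks: int) -> Bucket:
        # Early exit: no bucket can exist for an oversized batch,
        # so skip the lookup and return the sentinel.
        if bs > self.max_prefill_batch_size:
            return (None, None, None)
        return (round_up(bs, self.bs_buckets),
                round_up(seq_len, self.seq_buckets),
                round_up(ctx_blocks, self.ctx_buckets))

    def _can_merge_prefill_contents(self, bs: int, seq_len: int,
                                    ctx_blocks: int) -> bool:
        # Any None in the bucketing result means "not mergeable".
        result = self._bucketize_2d_prompt(bs, seq_len, ctx_blocks)
        return all(dim is not None for dim in result)
```

Comparing against `None` explicitly, rather than relying on truthiness, keeps a legitimate bucket value of `0` from being mistaken for the sentinel.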
Comments suppressed due to low confidence (2)

vllm/v1/worker/hpu_model_runner.py:969

  • [nitpick] Consider renaming the variable bs to batch_size for improved readability and to make its purpose immediately clear.
        if bs > self.max_prefill_batch_size:

vllm/v1/worker/hpu_model_runner.py:969

  • Add a unit test to verify that _bucketize_2d_prompt returns (None, None, None) when the batch size exceeds max_prefill_batch_size, ensuring this new branch is covered.
        if bs > self.max_prefill_batch_size:
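
A test along the lines suggested above might look like the sketch below. It does not exercise the real model runner: `FakeRunner` and its injected `find_bucket` callable are hypothetical stand-ins, and the three-argument signature is an assumption. The point is only to show how the new branch could be covered, including a check that the bucket lookup is never reached for an oversized batch.

```python
import unittest
from unittest.mock import Mock


class FakeRunner:
    """Hypothetical stand-in exposing only the early-exit branch."""

    def __init__(self, max_prefill_batch_size, find_bucket):
        self.max_prefill_batch_size = max_prefill_batch_size
        self.find_bucket = find_bucket  # assumed lookup callable

    def _bucketize_2d_prompt(self, bs, seq_len, ctx_blocks):
        if bs > self.max_prefill_batch_size:
            return (None, None, None)
        return self.find_bucket(bs, seq_len, ctx_blocks)


class OversizedBatchTest(unittest.TestCase):
    def test_oversized_batch_returns_sentinel_and_skips_lookup(self):
        lookup = Mock(return_value=(2, 128, 0))
        runner = FakeRunner(max_prefill_batch_size=2, find_bucket=lookup)
        self.assertEqual(runner._bucketize_2d_prompt(3, 100, 0),
                         (None, None, None))
        lookup.assert_not_called()

    def test_batch_at_limit_still_uses_lookup(self):
        lookup = Mock(return_value=(2, 128, 0))
        runner = FakeRunner(max_prefill_batch_size=2, find_bucket=lookup)
        self.assertEqual(runner._bucketize_2d_prompt(2, 100, 0),
                         (2, 128, 0))
        lookup.assert_called_once_with(2, 100, 0)
```

Saved in a test module, these cases can be run with `python -m unittest`. The second case guards the boundary: a batch exactly equal to `max_prefill_batch_size` must still go through the normal lookup.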

Comment thread vllm/v1/worker/hpu_model_runner.py Outdated
Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
@adobrzyn
Author

/run-gaudi-tests

@adobrzyn adobrzyn merged commit 741e987 into habana_main Jul 16, 2025
53 checks passed
@adobrzyn adobrzyn deleted the adobrzyn/dont_look_for_config_that_isnt_there branch July 16, 2025 15:45
adobrzyn added a commit that referenced this pull request Jul 16, 2025
Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
madamczyk-intel pushed a commit that referenced this pull request Jul 17, 2025
Signed-off-by: Agata Dobrzyniewicz <adobrzyniewicz@habana.ai>
kzawora-intel added a commit to vllm-project/vllm-gaudi that referenced this pull request Jul 17, 2025
ripped from: HabanaAI/vllm-fork#1606, fixes
weird bucketing anomaly where bs=1 prefills would be padded to bs=2 and
trigger a recompilation

Signed-off-by: Konrad Zawora <kzawora@habana.ai>