
Add VLLM_PROFILE_* flags to V1#1203

Merged
kzawora-intel merged 10 commits into habana_main from dev/madamczyk/v1_add_profiling on May 8, 2025

Conversation


@madamczyk-intel madamczyk-intel commented May 6, 2025

Introduces two new environment variables, VLLM_PROFILE_PROMPT and VLLM_PROFILE_DECODE. When either is specified, vLLM will profile a single batch with the given parameters.

VLLM_PROFILE_PROMPT=<BS>,<QUERY_LEN>,<MAX_BLOCKS>
VLLM_PROFILE_DECODE=<BS>,<TOTAL_BLOCKS>

BS - batch size
QUERY_LEN - query length
MAX_BLOCKS - max context blocks used in a batch
TOTAL_BLOCKS - total number of blocks used in a batch

When both variables are present, a single mixed batch (prompt + decode) will be created.
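The flag formats above can be sketched as a small parser. This is a hypothetical helper for illustration only (the function name and returned dict shape are assumptions, not the actual vLLM implementation):

```python
import os

def parse_profile_flags():
    """Parse VLLM_PROFILE_PROMPT / VLLM_PROFILE_DECODE as described in this PR.

    VLLM_PROFILE_PROMPT=<BS>,<QUERY_LEN>,<MAX_BLOCKS>
    VLLM_PROFILE_DECODE=<BS>,<TOTAL_BLOCKS>

    Returns a (prompt_cfg, decode_cfg) pair; either entry is None when the
    corresponding variable is unset. If both configs are non-None, the caller
    would build a single mixed (prompt + decode) batch.
    """
    prompt_cfg = decode_cfg = None

    prompt = os.environ.get("VLLM_PROFILE_PROMPT")
    if prompt:
        bs, query_len, max_blocks = (int(v) for v in prompt.split(","))
        prompt_cfg = {"bs": bs, "query_len": query_len, "max_blocks": max_blocks}

    decode = os.environ.get("VLLM_PROFILE_DECODE")
    if decode:
        bs, total_blocks = (int(v) for v in decode.split(","))
        decode_cfg = {"bs": bs, "total_blocks": total_blocks}

    return prompt_cfg, decode_cfg
```

For example, running with `VLLM_PROFILE_PROMPT=4,1024,128 VLLM_PROFILE_DECODE=32,2048` would yield a prompt config of batch size 4, query length 1024, and at most 128 context blocks, plus a decode config of batch size 32 using 2048 blocks in total.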

Signed-off-by: Michal Adamczyk <madamczyk@habana.ai>
@madamczyk-intel
Author

/run-gaudi-tests

@michalkuligowski

/run-gaudi-tests

@madamczyk-intel
Author

/run-gaudi-tests

@madamczyk-intel
Author

/run-gaudi-tests

@madamczyk-intel
Author

/run-gaudi-tests

@kzawora-intel kzawora-intel merged commit 6c3e443 into habana_main May 8, 2025
43 checks passed
@kzawora-intel kzawora-intel deleted the dev/madamczyk/v1_add_profiling branch May 8, 2025 12:35

3 participants