
Update Force Channel FP8 Check#1561

Merged
yiliu30 merged 2 commits into habana_main from update-force-fp8-check
Jul 11, 2025

Conversation


@yiliu30 yiliu30 commented Jul 10, 2025

Starting with 1.22, INC also supports dynamic quantization. To make things smoother for users, and after discussing with @xuechendi, we plan to update how we handle the force channel FP8 check:

  • If the user provides a QUANT_CONFIG, we assume the intention is to use INC for either dynamic or static quantization.
  • If no QUANT_CONFIG is provided but the user passes an FP8 model, the workflow defaults to the built-in dynamic quantization path without INC.

cc @thuang6 @czhu15 @yangulei

Signed-off-by: yiliu30 <yi4.liu@intel.com>
Comment thread vllm/envs.py
@xuechendi

/run-gaudi-tests

@yiliu30 yiliu30 requested a review from czhu15 July 11, 2025 01:31

@czhu15 czhu15 left a comment


LGTM

Comment thread vllm/envs.py
@yiliu30 yiliu30 merged commit 7205441 into habana_main Jul 11, 2025
53 checks passed
@yiliu30 yiliu30 deleted the update-force-fp8-check branch July 11, 2025 03:32
michalkuligowski pushed a commit that referenced this pull request Jul 15, 2025
Porting #1561

Signed-off-by: yiliu30 <yi4.liu@intel.com>

4 participants