
Fix INC Finalization Check #1230

Merged
yiliu30 merged 4 commits into HabanaAI:habana_main from yiliu30:fix-save-measure
May 9, 2025


@yiliu30 yiliu30 commented May 8, 2025

When re-quantizing DeepSeek R1, we do not pass quantization as inc, because we need to load the FP8 weights.
Instead, the INC finalization condition is updated to check QUANT_CONFIG.
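
For readers who want the condition spelled out, here is a minimal sketch of the kind of check described above. It is not the code from this PR: the function name `should_finalize_inc` and its parameters are hypothetical, and it assumes QUANT_CONFIG is an environment variable whose presence (non-empty) signals that INC is in use.

```python
import os
from typing import Mapping, Optional


def should_finalize_inc(
    quantization: Optional[str],
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Return True if INC finalization should run (hypothetical helper).

    Assumption: INC is considered active either when quantization is
    explicitly "inc", or when QUANT_CONFIG is set to a non-empty value,
    which covers the re-quantization case where quantization is not "inc".
    """
    if env is None:
        env = os.environ
    has_quant_config = bool(env.get("QUANT_CONFIG"))
    return quantization == "inc" or has_quant_config


if __name__ == "__main__":
    # Illustrative values only; the path below is a placeholder.
    print(should_finalize_inc("inc", {}))
    print(should_finalize_inc(None, {"QUANT_CONFIG": "/path/to/config.json"}))
    print(should_finalize_inc(None, {}))
```

Passing the environment as a mapping keeps the check easy to exercise in tests without mutating `os.environ`.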

cc @thuang6 @zhenwei-intel

Signed-off-by: Yi Liu <yiliu4@habana.ai>
Yi4Liu and others added 2 commits May 8, 2025 12:30
Comment thread vllm/worker/hpu_model_runner.py
@michalkuligowski michalkuligowski mentioned this pull request May 8, 2025
@xuechendi

/run-gaudi-tests

@jikunshang

/run-gaudi-tests

@yiliu30 yiliu30 enabled auto-merge May 9, 2025 05:19
@yiliu30 yiliu30 merged commit de96de0 into HabanaAI:habana_main May 9, 2025
41 checks passed


5 participants