[QUESTION]NVFP4 Post SFT Training & Model Accuracy

**Your question**
Hi @Phlip79, this is a follow-up to the NVFP4 issue I created earlier (linked below). As a next step, I am trying to gauge the accuracy of a model trained with SFT using NVFP4. After the NVFP4 SFT run completed, I tried serving the model with the SGLang inference engine, but the output of every inference request appears to be gibberish. Everything seems to work fine without any quantization. Is there anything I am missing that corrupts the model when doing SFT with NVFP4? Thanks.

https://github.com/NVIDIA/Megatron-LM/issues/3470#issue-3955671018

[QUESTION]NVFP4 Post SFT Training & Model Accuracy #3671
