Expose --qparams_algorithm CLI arg for quantization parameter selection #227

Open
rhn19 wants to merge 1 commit into huggingface:main from rhn19:fix/qat-default-qparams-algorithm
rhn19 commented Apr 13, 2026

Fixes #226

Problem

quantize_model_() accepts a qparams_algorithm parameter, but no CLI argument passes it through. As a result, users cannot choose from the command line which algorithm computes the quantization parameters during export.

Changes

  • Add --qparams_algorithm CLI argument (choices=["affine", "hqq_scale_only"])
  • Wire it through all task loaders (causal_lm, masked_lm, asr, multimodal_text_to_text) to quantize_model_()
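
The two changes above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual diff: `build_parser` and `quantize_kwargs` are hypothetical helper names, and forwarding the value only when the flag is set is an assumption about how "no behavior change" could be preserved. Only the flag name and its two choices come from the description.

```python
import argparse

# The two choices listed in the PR description.
QPARAMS_ALGORITHMS = ["affine", "hqq_scale_only"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the export command's argument parser."""
    parser = argparse.ArgumentParser(prog="export-sketch")
    parser.add_argument(
        "--qparams_algorithm",
        type=str,
        choices=QPARAMS_ALGORITHMS,
        default=None,  # assumption: None means "flag not given"
        help="Algorithm used to compute quantization parameters.",
    )
    return parser


def quantize_kwargs(args: argparse.Namespace) -> dict:
    """Build the extra keyword arguments a task loader would forward.

    Assumption: when the flag is omitted, nothing is forwarded, so the
    callee keeps whatever default it already had.
    """
    kwargs = {}
    if args.qparams_algorithm is not None:
        kwargs["qparams_algorithm"] = args.qparams_algorithm
    return kwargs


if __name__ == "__main__":
    args = build_parser().parse_args(["--qparams_algorithm", "affine"])
    print(quantize_kwargs(args))
```

In this sketch each task loader would call something like `quantize_model_(model, **quantize_kwargs(args))`; the positional `model` argument is illustrative only.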

Usage

optimum-cli export executorch \
   --model google/gemma-3-1b-it \
   --task text-generation --recipe xnnpack \
   --qlinear 8da4w --qparams_algorithm affine \
   --output_dir output/

No behavior change when --qparams_algorithm is not specified.

Development

Successfully merging this pull request may close these issues.

[Bug] QAT-trained models produce degraded output after export due to quantization parameter mismatches