
llama-quant : default ftype param Q5_1 --> Q8_0 #20828

Merged
ggerganov merged 3 commits into ggml-org:master from ddh0:default-quant-ftype-unspecified-Q8_0
Apr 25, 2026

Conversation

@ddh0
Contributor

@ddh0 ddh0 commented Mar 20, 2026

Change the default ftype in llama_model_quantize_params from LLAMA_FTYPE_MOSTLY_Q5_1 to LLAMA_FTYPE_MOSTLY_Q8_0.

In case some external program naively uses the default quantization params, we should probably default to a known-good type like Q8_0 rather than Q5_1, which is rather old.


@ddh0 ddh0 requested a review from ggerganov as a code owner March 20, 2026 23:27
@ddh0
Contributor Author

ddh0 commented Mar 24, 2026

Thoughts on this change? @ggerganov

@ddh0
Contributor Author

ddh0 commented Apr 4, 2026

Pinging again for visibility, @ggerganov. In my opinion this change makes sense, but it's not a big deal either way, so feel free to close it. Thank you, in any case.

@ggerganov ggerganov merged commit 9d34231 into ggml-org:master Apr 25, 2026
1 check passed