This reverts commit 948d124.
Is there a script to upgrade the old models to the new format? I don't have the source models because they're huge.
#1384 does not work for NEON because when we remove the
This is the relevant section before this PR:
We were ORing the 5th bit after the
Hello all. Could someone please share how to requantize? Update: it seems the main README.md lacks the non-q4 variants, as well as a table matching the -n value to the quantization of the selected model.
Doesn't seem like it though. I don't see references where file version 2 is set during quantization either. Edit: I'm wrong. It's set in
Check my repos again. I've re-quantised all my GGMLs using the latest code, in q4_0, q5_0, q5_1 and q8_0 variants. So no need to do it yourself unless you want to.


Close #1241
Q4_2 support
Q4 and Q5 (breaking change)
New timings:
Old timings:
Overall, all these numbers seem to have about +/- 10% variability from run to run. Not an ideal benchmark, but I'm not sure what else to do.
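One possible way to deal with that run-to-run noise is to repeat each measurement several times and only treat a difference as real when the means differ by more than the observed spread. The sketch below is not part of this PR; `summarize_runs`, `differs_beyond_noise`, and the 10% `noise` default are hypothetical names and values chosen for illustration, and the timings passed in would have to come from your own repeated runs.

```python
import statistics


def summarize_runs(timings_ms):
    """Return the mean, sample standard deviation, and relative spread
    (stdev / mean) of a list of repeated timings."""
    mean = statistics.mean(timings_ms)
    # stdev needs at least two samples; report 0.0 for a single run.
    stdev = statistics.stdev(timings_ms) if len(timings_ms) > 1 else 0.0
    return {"mean": mean, "stdev": stdev, "rel_spread": stdev / mean}


def differs_beyond_noise(old_runs, new_runs, noise=0.10):
    """Return True if the mean timings differ by more than `noise`
    (a fraction of the old mean). The 0.10 default mirrors the roughly
    +/- 10% variability mentioned above and is an assumption."""
    old_mean = statistics.mean(old_runs)
    new_mean = statistics.mean(new_runs)
    return abs(new_mean - old_mean) / old_mean > noise
```

With this approach, a change that moves the mean by less than the noise threshold would be reported as inconclusive rather than as a speedup or regression.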