Hexagon: Bump HMX Frequency to Max Corner #22334

Merged
max-krasnyansky merged 2 commits into ggml-org:master from qualcomm:tr/bump-hmx-freq
Apr 24, 2026

Conversation

@trivikram-reddy1 (Contributor) commented Apr 24, 2026

Overview

HMX currently runs at the lowest frequency corner. This PR bumps the HMX frequency to the max corner.

Additional information

This should give a nice boost to prefill TPS (prompt-eval tokens per second). Decode speed is essentially unchanged in the results below.

Results from Snapdragon Gen 5

Llama-3.1-8B-Instruct-Q4_0

Before:
common_perf_print: prompt eval time =    2193.53 ms /   256 tokens (    8.57 ms per token,   116.71 tokens per second)
common_perf_print:        eval time =    2314.84 ms /    31 runs   (   74.67 ms per token,    13.39 tokens per second)

After:
common_perf_print: prompt eval time =    1721.29 ms /   256 tokens (    6.72 ms per token,   148.73 tokens per second)
common_perf_print:        eval time =    2327.55 ms /    31 runs   (   75.08 ms per token,    13.32 tokens per second)

Llama-3.2-1B-Instruct-Q4_0

Before:
common_perf_print: prompt eval time =     383.28 ms /   256 tokens (    1.50 ms per token,   667.93 tokens per second)
common_perf_print:        eval time =     489.70 ms /    31 runs   (   15.80 ms per token,    63.30 tokens per second)

After:
common_perf_print: prompt eval time =     287.79 ms /   256 tokens (    1.12 ms per token,   889.54 tokens per second)
common_perf_print:        eval time =     480.15 ms /    31 runs   (   15.49 ms per token,    64.56 tokens per second)
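
To make the size of the change explicit, the prefill numbers above can be reduced to a speedup ratio. The following is a minimal sketch, not part of the PR: the regular expression and the helper names `tokens_per_second` and `speedup` are illustrative assumptions, and the input strings are the `common_perf_print` prompt-eval lines copied from the logs above.

```python
import re

# Matches the timing fields of a `common_perf_print` line as shown in the logs above.
# The pattern is an assumption based only on those four log lines.
PERF_RE = re.compile(
    r"common_perf_print:\s+(prompt eval|eval) time =\s*([\d.]+) ms /\s*(\d+) (?:tokens|runs)"
)


def tokens_per_second(line: str) -> tuple[str, float]:
    """Return (phase, tokens/s), recomputed from elapsed ms and token count."""
    m = PERF_RE.search(line)
    if m is None:
        raise ValueError(f"not a perf line: {line!r}")
    phase, ms, count = m.group(1), float(m.group(2)), int(m.group(3))
    return phase, count / (ms / 1000.0)


def speedup(before: str, after: str) -> float:
    """Ratio of after/before throughput for two lines of the same phase."""
    phase_b, tps_b = tokens_per_second(before)
    phase_a, tps_a = tokens_per_second(after)
    if phase_b != phase_a:
        raise ValueError("lines describe different phases")
    return tps_a / tps_b


# Prompt-eval (prefill) lines copied from the results above.
BEFORE_8B = "common_perf_print: prompt eval time =    2193.53 ms /   256 tokens (    8.57 ms per token,   116.71 tokens per second)"
AFTER_8B = "common_perf_print: prompt eval time =    1721.29 ms /   256 tokens (    6.72 ms per token,   148.73 tokens per second)"
BEFORE_1B = "common_perf_print: prompt eval time =     383.28 ms /   256 tokens (    1.50 ms per token,   667.93 tokens per second)"
AFTER_1B = "common_perf_print: prompt eval time =     287.79 ms /   256 tokens (    1.12 ms per token,   889.54 tokens per second)"

if __name__ == "__main__":
    print(f"Llama-3.1-8B prefill speedup: {speedup(BEFORE_8B, AFTER_8B):.2f}x")  # 1.27x
    print(f"Llama-3.2-1B prefill speedup: {speedup(BEFORE_1B, AFTER_1B):.2f}x")  # 1.33x
```

On these single runs, prefill improves by roughly 27% (8B) and 33% (1B), while the eval lines differ by about 2% or less, which is consistent with the change mainly affecting the prefill path.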

Requirements

@trivikram-reddy1 requested a review from a team as a code owner April 24, 2026 19:53
@github-actions bot added the labels ggml (changes relating to the ggml tensor library for machine learning) and Hexagon Apr 24, 2026
@trivikram-reddy1 (Contributor, Author) commented Apr 24, 2026

@max-krasnyansky @lhez could you please review/approve this?

@max-krasnyansky merged commit 361fe72 into ggml-org:master Apr 24, 2026
47 of 50 checks passed
@trivikram-reddy1 deleted the tr/bump-hmx-freq branch April 24, 2026 20:55
@mediouni-m (Contributor) commented

Documentation says:

> HAP_power_hmx_payload_v2 is supported starting with v75. On chipsets (v75 onwards) without separate HMX clock plan, requests made for target corner or frequency will return AEE_EBADPARM (invalid parameter) error.

How does this apply to this case, especially on v73 platforms?

@trivikram-reddy1 (Contributor, Author) commented

@mediouni-m, thanks for catching the bug. This does not apply to v73 platforms; I will submit a patch shortly.

@max-krasnyansky (Member) commented

> @mediouni-m, thanks for catching the bug. This does not apply to v73 platforms; I will submit a patch shortly.

Yep, I missed this during review, then ran into the issue on my MS Surface with X-Elite (Hexagon v73).
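
The follow-up patch is not shown in this thread. As a rough illustration of the constraint quoted from the documentation, the guard could be modeled as below. This is a Python sketch of the decision logic only, not the backend's C code: `try_bump_hmx`, `send_request`, and `HMX_V2_MIN_ARCH` are hypothetical names, and the numeric error values are assumptions used as placeholders for the SDK's `AEE_SUCCESS` and `AEE_EBADPARM` constants.

```python
from typing import Callable

# Placeholder error codes; the numeric values are assumptions for illustration.
AEE_SUCCESS = 0
AEE_EBADPARM = 14

# Per the quoted documentation, HAP_power_hmx_payload_v2 is supported from v75.
HMX_V2_MIN_ARCH = 75


def try_bump_hmx(arch_version: int, send_request: Callable[[], int]) -> bool:
    """Return True if the HMX max-corner request was applied.

    `send_request` is a hypothetical callable standing in for the power
    request; it returns an AEE-style error code.
    """
    if arch_version < HMX_V2_MIN_ARCH:
        # e.g. Hexagon v73: skip the request entirely instead of failing.
        return False
    err = send_request()
    if err == AEE_SUCCESS:
        return True
    if err == AEE_EBADPARM:
        # v75+ chipset without a separate HMX clock plan: treat as non-fatal.
        return False
    raise RuntimeError(f"HMX power request failed with error {err}")


if __name__ == "__main__":
    print(try_bump_hmx(73, lambda: AEE_SUCCESS))   # False (request never sent)
    print(try_bump_hmx(75, lambda: AEE_SUCCESS))   # True
    print(try_bump_hmx(79, lambda: AEE_EBADPARM))  # False (no separate HMX clock plan)
```

The design choice in this sketch is to treat both unsupported cases as "no boost" rather than as session-setup errors, so that older platforms keep working at their default HMX frequency.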
