Releases: ggml-org/llama.cpp

b7233

02 Dec 18:10
f3a9674

Warning

Release Format Update: Linux releases will soon use .tar.gz archives instead of .zip. Please make the necessary changes to your deployment scripts.

llama : fix signed comparison warning on FreeBSD (#17497)

This ensures correct RLIM_INFINITY handling and compatibility on all platforms (32/64-bit).

warning: comparison of integers of different signs: 'rlim_t' (aka 'long') and 'size_t' (aka 'unsigned long') [-Wsign-compare]
  488 |         if (suggest && (lock_limit.rlim_max > lock_limit.rlim_cur + size)) {
      |                         ~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Adrien Gallouët [email protected]
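
The warning above arises because `rlim_t` is signed on FreeBSD while `size_t` is unsigned. A minimal sketch of one way to write the comparison without mixing signs and with explicit `RLIM_INFINITY` handling follows. The helper name `can_raise_soft_limit` and its exact policy are assumptions for illustration, not the code merged in #17497.

```cpp
#include <sys/resource.h>  // rlim_t, RLIM_INFINITY (POSIX)
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns true when the hard limit `max` leaves more than
// `size` bytes of headroom above the soft limit `cur`.
inline bool can_raise_soft_limit(rlim_t cur, rlim_t max, std::size_t size) {
    if (cur == RLIM_INFINITY) {
        return false;  // soft limit is already unlimited, nothing to raise
    }
    if (max == RLIM_INFINITY) {
        return true;   // no hard cap, any finite increase fits
    }
    if (max <= cur) {
        return false;  // no headroom at all
    }
    // Both values are finite and max > cur, so the difference is positive.
    // Compute it in one unsigned type that is at least as wide as both
    // rlim_t and size_t, which avoids the signed/unsigned comparison and
    // the overflow that `cur + size` could cause.
    const std::uintmax_t headroom =
        static_cast<std::uintmax_t>(max) - static_cast<std::uintmax_t>(cur);
    return headroom > static_cast<std::uintmax_t>(size);
}
```

Comparing the headroom `max - cur` against `size`, rather than `max` against `cur + size`, keeps the arithmetic from wrapping when `cur` is close to the top of its range.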

b7231

02 Dec 17:06
5d6bd84

server: remove default "gpt-3.5-turbo" model name (#17668)

  • server: remove default "gpt-3.5-turbo" model name

  • do not reflect back model name from request

  • fix test
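
In practice, clients should no longer expect the response to echo the `model` value they sent or to contain a hard-coded default. The sketch below shows one way a server could choose the name it reports, assuming a configured alias takes priority and the model path is the fallback. `reported_model_name`, `alias`, and `model_path` are hypothetical names; this is not the llama.cpp server's actual code.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not taken from llama.cpp: picks the model name to put
// in a response. The name sent by the client is deliberately ignored, and no
// built-in default such as "gpt-3.5-turbo" is substituted.
inline std::string reported_model_name(const std::string & alias,
                                       const std::string & model_path,
                                       const std::string & /*requested_name*/) {
    if (!alias.empty()) {
        return alias;      // explicit, operator-configured name
    }
    return model_path;     // otherwise identify the model by where it was loaded from
}
```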

b7230

02 Dec 16:27
fd3abe8

server: fixing naming conflict res_error in server-models.cpp (#17679)

b7229

02 Dec 14:37
682e665

server: explicitly set exec path when create new instance (#17669)

  • Revert "rm unused fn"

This reverts commit f2dbe9c.

  • server: explicitly set exec path when create new instance

  • put back TODO

  • only call get_server_exec_path() once

  • add fallback logic

b7227

02 Dec 11:48
ab6726e

ggml : add fallback definition for HWCAP2_SVE2 (#17683)

This aligns with the other HWCAP2 feature flags.

See #17528

Signed-off-by: Adrien Gallouët [email protected]
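
A fallback definition of this kind usually takes the form of an `#ifndef` guard, so that older system headers that lack the macro still compile. The sketch below illustrates the pattern. The bit value `(1 << 1)` is assumed to match the Linux arm64 UAPI headers, and the helper names `hwcap2_has_sve2` and `cpu_has_sve2` are hypothetical, not ggml's.

```cpp
#include <cassert>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>   // getauxval, AT_HWCAP2
#include <asm/hwcap.h>  // HWCAP2_* bits, when the kernel headers provide them
#endif

// Fallback for toolchains whose headers predate SVE2. The value is assumed to
// be bit 1 of AT_HWCAP2, as in the Linux arm64 UAPI headers.
#ifndef HWCAP2_SVE2
#define HWCAP2_SVE2 (1 << 1)
#endif

// Tests the SVE2 bit in an AT_HWCAP2 word.
inline bool hwcap2_has_sve2(unsigned long hwcap2) {
    return (hwcap2 & HWCAP2_SVE2) != 0;
}

// Runtime query: reads AT_HWCAP2 on Linux/arm64, reports false elsewhere.
inline bool cpu_has_sve2() {
#if defined(__linux__) && defined(__aarch64__)
    return hwcap2_has_sve2(getauxval(AT_HWCAP2));
#else
    return false;
#endif
}
```

Because the guard only defines the macro when it is missing, newer headers that already provide `HWCAP2_SVE2` are left untouched.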

b7225

02 Dec 05:37
ed32089

ggml-cuda: reorder only relevant nodes (#17639)

b7224

02 Dec 04:56
7b6d745

release: fix duplicate libs, store symbolic links (#17299)

b7223

02 Dec 02:05
98bd9ab

enhance argsort for UT (#17573)

Co-authored-by: Neo Zhang <[email protected]>

b7222

02 Dec 00:49
746f9ee

Override SSM_A op for Qwen3 Next to reduce splits (#17587)

* Override SSM_A op for Qwen3 Next to reduce splits

* New tensor mapping SSM_A_NOSCAN for SSM_A used outside of OP_SSM_SCAN context.

* Update src/llama-model.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

b7220

01 Dec 23:07
ecf74a8

mtmd: add mtmd_context_params::warmup option (#17652)

* mtmd: add mtmd_context_params::warmup option

* reuse the common_params::warmup