Releases: ggml-org/llama.cpp
b7233
Warning
Release Format Update: Linux releases will soon use .tar.gz archives instead of .zip. Please make the necessary changes to your deployment scripts.
macOS/iOS:
Linux:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA)
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)
openEuler:
llama : fix signed comparison warning on FreeBSD (#17497)
This ensures correct RLIM_INFINITY handling and compatibility on all platforms (32/64-bit).
warning: comparison of integers of different signs: 'rlim_t' (aka 'long') and 'size_t' (aka 'unsigned long') [-Wsign-compare]
488 | if (suggest && (lock_limit.rlim_max > lock_limit.rlim_cur + size)) {
| ~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Adrien Gallouët [email protected]
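The warning above comes from comparing a signed `rlim_t` (as on FreeBSD) with an unsigned `size_t`. Below is a minimal sketch of one way to write that check without mixing signs, treating `RLIM_INFINITY` explicitly. The helper name `can_raise_lock_limit` is hypothetical and this is not necessarily the change made in #17497; it assumes finite limits are non-negative.

```cpp
#include <sys/resource.h>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not taken from the PR): reports whether the hard limit
// leaves room for `size` more bytes above the soft limit, i.e. the intent of
// `rlim_max > rlim_cur + size`, without a mixed-sign comparison.
inline bool can_raise_lock_limit(rlim_t cur, rlim_t max, std::size_t size) {
    // An unlimited hard limit always leaves room.
    if (max == RLIM_INFINITY) {
        return true;
    }
    // A finite hard limit with an unlimited soft limit leaves nothing to raise.
    if (cur == RLIM_INFINITY) {
        return false;
    }
    // Assumption: finite limits are non-negative, so widening them to an
    // unsigned type is lossless whether rlim_t is signed or unsigned.
    const auto ucur = static_cast<std::uintmax_t>(cur);
    const auto umax = static_cast<std::uintmax_t>(max);
    if (umax <= ucur) {
        return false;
    }
    // Subtracting instead of adding avoids overflow in `cur + size`.
    return umax - ucur > size;
}
```

Subtracting on the left rather than adding `size` on the right keeps every operand unsigned and rules out wrap-around when `size` is large.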
b7231
server: remove default "gpt-3.5-turbo" model name (#17668)
- server: remove default "gpt-3.5-turbo" model name
- do not reflect back model name from request
- fix test
b7230
server: fixing naming conflict res_error in server-models.cpp (#17679)
b7229
b7227
ggml : add fallback definition for HWCAP2_SVE2 (#17683)
This aligns with the other HWCAP2 feature flags.
See #17528
Signed-off-by: Adrien Gallouët [email protected]
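A fallback definition of this kind usually takes the shape sketched below: define the flag only when the system headers do not already provide it. The value `(1 << 1)` is the bit the Linux arm64 UAPI header assigns to `HWCAP2_SVE2`; the helper `cpu_has_sve2` is hypothetical and only illustrates how such a flag is typically queried, it is not taken from the PR.

```cpp
#if defined(__linux__)
#include <sys/auxv.h>  // getauxval, AT_HWCAP2 (and HWCAP2_* on recent arm64 headers)
#endif

// Fallback for older headers that predate the flag. The guard keeps the
// system definition when one is available.
#if !defined(HWCAP2_SVE2)
#define HWCAP2_SVE2 (1 << 1)
#endif

// Hypothetical helper: runtime SVE2 detection on Linux/arm64.
// On every other platform it conservatively reports false.
inline bool cpu_has_sve2() {
#if defined(__linux__) && defined(__aarch64__)
    return (getauxval(AT_HWCAP2) & HWCAP2_SVE2) != 0;
#else
    return false;
#endif
}
```

With the guard in place, the same source builds against both old and new toolchains, and the runtime check still decides whether the feature is actually used.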
b7225
ggml-cuda: reorder only relevant nodes (#17639)
b7224
release: fix duplicate libs, store symbolic links (#17299)
b7223
enhance argsort for UT (#17573)
Co-authored-by: Neo Zhang <[email protected]>
b7222
Override SSM_A op for Qwen3 Next to reduce splits (#17587)
- Override SSM_A op for Qwen3 Next to reduce splits
- New tensor mapping SSM_A_NOSCAN for SSM_A used outside of OP_SSM_SCAN context.
- Update src/llama-model.cpp
- Update src/llama-model.cpp
Co-authored-by: Sigbjørn Skjæret <[email protected]>
b7220
mtmd: add mtmd_context_params::warmup option (#17652)
- mtmd: add mtmd_context_params::warmup option
- reuse the common_params::warmup