Releases: ggml-org/llama.cpp

b7210

30 Nov 22:11
2ba7195

model: LFM2-VL fixes (#17577)

* Adjust to pytorch

* Add antialiasing upscale

* Increase number of patches to 1024

* Handle default marker insertion for LFM2

* Switch to flag

* Reformat

* Cuda implementation of antialias kernel

* Change placement in ops.cpp

* consistent float literals

* Pad only for LFM2

* Address PR feedback

* Rollback default marker placement changes

* Fallback to CPU implementation for antialias implementation of upscale

b7209

30 Nov 17:12
7f8ef50

clip: fix nb calculation for qwen3-vl (#17594)

b7208

30 Nov 16:35
3c136b2

cli: add migration warning (#17620)

b7207

30 Nov 14:04
beb1f0c

common : throttle download progress output to reduce IO flush (#17427)

This change limits progress updates to approximately every 0.1% of the
file size to minimize stdio overhead.

Also fixes compiler warnings regarding __func__ in lambdas.

Signed-off-by: Adrien Gallouët
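
The throttling idea described in this entry can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `progress_throttle`, `should_report`, and `simulate_download` are hypothetical names, and the step is assumed to be one thousandth of the total size (about 0.1%), with the first and final updates always emitted.

```cpp
#include <cstdint>

// Hypothetical helper, not the llama.cpp implementation: decides whether a
// progress line should be printed, allowing roughly one update per 0.1% of
// the total size.
struct progress_throttle {
    uint64_t total;          // total file size in bytes
    uint64_t step;           // minimum bytes between updates (~0.1% of total)
    uint64_t last_reported;  // byte count at the most recent update
    bool     started;        // whether any update has been emitted yet

    explicit progress_throttle(uint64_t total_bytes)
        : total(total_bytes),
          step(total_bytes / 1000 > 0 ? total_bytes / 1000 : 1),
          last_reported(0),
          started(false) {}

    // Returns true if an update should be printed for `now` bytes received.
    bool should_report(uint64_t now) {
        const bool first    = !started;
        const bool finished = now >= total && last_reported < total;
        const bool stepped  = now >= last_reported && now - last_reported >= step;
        if (!(first || finished || stepped)) {
            return false;  // skip the write and the flush that would follow it
        }
        started       = true;
        last_reported = now;
        return true;
    }
};

// Simulates a download delivered in fixed-size chunks and returns how many
// progress updates would have been printed.
inline uint64_t simulate_download(uint64_t total_bytes, uint64_t chunk_bytes) {
    progress_throttle throttle(total_bytes);
    uint64_t reports = 0;
    for (uint64_t now = chunk_bytes; now <= total_bytes; now += chunk_bytes) {
        if (throttle.should_report(now)) {
            ++reports;  // a real caller would print and flush here
        }
    }
    return reports;
}
```

With a 1,000,000-byte file delivered in 100-byte chunks, the transfer callback fires 10,000 times, but only about a thousand of those calls would reach the output stream.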

b7206

30 Nov 12:44
def5404

common: add LLAMA_LOG_FILE env var (#17609)

Signed-off-by: Aaron Teo

b7205

30 Nov 02:53
fa04659

ggml: fix: macOS build with `-DGGML_BACKEND_DL=ON` (#17581)

b7204

30 Nov 02:48
5a6241f

common: update env var name (#17588)

b7203

30 Nov 02:28
c7af376

CUDA: add stream-based concurrency (#16991)

* CUDA: add stream-based concurrency

* HIP: fix hipStreamWaitEvent define and nodiscard warnings

* ggml-cuda: fix fusion inside stream

* ggml-cuda: fix bug w.r.t first stream launch

* ggml-cuda: format

* ggml-cuda: improve assert message

* ggml-cuda: use lambda instead of duplicating code

* ggml-cuda: add some more comments

* ggml-cuda: add more detailed comments about concurrency

* ggml-cuda: rename + remove unused var

* ggml-cuda: fix condition for stream launch

* ggml-cuda: address review comments, add destructor

* common.cuh: add is_valid for concurrent events

* common.cuh: make comment better

* update comment

Co-authored-by: Johannes Gäßler

* update comment

Co-authored-by: Johannes Gäßler

* common.cuh: fix lower_bound condition + remove join_node data from write_ranges

* ggml-cuda: fix overlap condition + shadowing parameter

---------

Co-authored-by: Carl Philipp Klemm
Co-authored-by: Johannes Gäßler

b7202

30 Nov 01:37
00425e2

cuda : add error checking for cudaMemcpyAsync in argsort (#17599)

* cuda : add error checking for cudaMemcpyAsync in argsort (#12836)

* fix indentation

b7201

30 Nov 01:25
385c3da

vulkan : fix FA mask load with bounds check (coopmat2) (#17606)