Name and Version
8870
Operating systems
Mac
Which llama.cpp modules do you know to be affected?
Other (Please specify in the next section)
Command line
On one node out of 4:
GGML_RPC_DEBUG=1 ./build/bin/rpc-server -H 192.168.68.66 -d Vulkan0 -p 50020 -c
The server:
/Users/amir/llama.cpp/build/bin/llama-server \
-m /Users/amir/Library/Caches/llama.cpp/Qwen3.5-27B-UD-Q5_K_XL.gguf \
-mg 0 -t 8 -tb 8 --port 8080 --host 0.0.0.0 -np 1 -ngl 99 \
-fa 0 -c 170000 -n -1 \
--rpc 192.168.68.73:50010,192.168.68.73:50011,192.168.68.66:50021,192.168.68.66:50020 \
-ts 51,7,3,3,23,13 \
-sps 0.1 -to 1200 -b 128 -ub 32 \
--temp 0.6 \
--top-p 0.95 \
--top-k 20 \
--min-p 0.00 \
--repeat-penalty 1.0 \
--cache-ram 40960
Problem description & steps to reproduce
My current working setup is a server with 4 RPC nodes, running Qwen 3.5 27B.
That working setup is on 8334 with the PR20518 edits applied.
With the latest version, my standard prompt produces gibberish (for daily work I use AnythingLLM; the llama.cpp web server is used here only to demonstrate the bug):
Tests on one of the nodes (the oldest one, the trashcan Mac Pro with D700) show that it can run both Qwen3.5-27B-UD-Q5_K_XL.gguf and Llama 3:
Qwen 3.5:
Llama 3:
From one of the rpc nodes:
From one of the iMac Pro's rpc (Vega 56):
[get_tensor] buffer: 0x600003a64300, data: 0xa1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600003a64300, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600003a64300, data: 0xa1000, offset: 0, size: 4
[set_tensor] buffer: 0x600003a64300, data: 0xa1010, offset: 0, size: 16
[set_tensor] buffer: 0x600003a64300, data: 0xa1210, offset: 0, size: 8
[set_tensor] buffer: 0x600003a64300, data: 0xa1310, offset: 0, size: 8192
[set_tensor] buffer: 0x600003a64300, data: 0xe1310, offset: 0, size: 1024
[graph_recompute] device: 0
[get_tensor] buffer: 0x600003a64300, data: 0xa1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600003a64300, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600003a64300, data: 0xa1000, offset: 0, size: 4
[set_tensor] buffer: 0x600003a64300, data: 0xa1010, offset: 0, size: 16
[set_tensor] buffer: 0x600003a64300, data: 0xa1210, offset: 0, size: 8
[set_tensor] buffer: 0x600003a64300, data: 0xa1310, offset: 0, size: 8192
[set_tensor] buffer: 0x600003a64300, data: 0xe1310, offset: 0, size: 1024
[graph_recompute] device: 0
[get_tensor] buffer: 0x600003a64300, data: 0xa1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600003a64300, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600003a64300, data: 0xa1000, offset: 0, size: 4
[set_tensor] buffer: 0x600003a64300, data: 0xa1010, offset: 0, size: 16
[set_tensor] buffer: 0x600003a64300, data: 0xa1210, offset: 0, size: 8
[set_tensor] buffer: 0x600003a64300, data: 0xa1310, offset: 0, size: 8192
[set_tensor] buffer: 0x600003a64300, data: 0xe1310, offset: 0, size: 1024
[graph_recompute] device: 0
[get_tensor] buffer: 0x600003a64300, data: 0xa1000, offset: 0, size: 20480
From one of the D700's:
[set_tensor] buffer: 0x600002ec4000, data: 0xa1000, offset: 0, size: 4
[graph_recompute] device: 0
[get_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0xa1000, offset: 0, size: 4
[graph_recompute] device: 0
[get_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0xa1000, offset: 0, size: 4
[graph_recompute] device: 0
[get_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0xa1000, offset: 0, size: 4
[graph_recompute] device: 0
[get_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0xa1000, offset: 0, size: 4
[graph_recompute] device: 0
[get_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0xa1000, offset: 0, size: 4
[graph_recompute] device: 0
[get_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0xa1000, offset: 0, size: 4
[graph_recompute] device: 0
[get_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0xa1000, offset: 0, size: 4
[graph_recompute] device: 0
[get_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0xa1000, offset: 0, size: 4
[graph_recompute] device: 0
[get_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0xa1000, offset: 0, size: 4
[graph_recompute] device: 0
[get_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0xa1000, offset: 0, size: 4
[graph_recompute] device: 0
[get_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0xa1000, offset: 0, size: 4
[graph_recompute] device: 0
[get_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0xa1000, offset: 0, size: 4
[graph_recompute] device: 0
[get_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0xa1000, offset: 0, size: 4
[graph_recompute] device: 0
[get_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0xa1000, offset: 0, size: 4
[graph_recompute] device: 0
[get_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600002ec4000, data: 0xa1000, offset: 0, size: 4
[graph_recompute] device: 0
[get_tensor] buffer: 0x600002ec4000, data: 0x1000, offset: 0, size: 20480
And the other D700:
From the other Mac Pro (2 D700's, remember):
[graph_recompute] device: 0
[get_tensor] buffer: 0x600000bdc000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600000bdc000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600000bdc000, data: 0xa1000, offset: 0, size: 4
[set_tensor] buffer: 0x600000bdc000, data: 0xa1010, offset: 0, size: 16
[set_tensor] buffer: 0x600000bdc000, data: 0xa1210, offset: 0, size: 8
[set_tensor] buffer: 0x600000bdc000, data: 0xa1310, offset: 0, size: 8192
[set_tensor] buffer: 0x600000bdc000, data: 0xe1310, offset: 0, size: 1024
[graph_recompute] device: 0
[get_tensor] buffer: 0x600000bdc000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600000bdc000, data: 0x1000, offset: 0, size: 20480
[set_tensor] buffer: 0x600000bdc000, data: 0xa1000, offset: 0, size: 4
[set_tensor] buffer: 0x600000bdc000, data: 0xa1010, offset: 0, size: 16
[set_tensor] buffer: 0x600000bdc000, data: 0xa1210, offset: 0, size: 8
[set_tensor] buffer: 0x600000bdc000, data: 0xa1310, offset: 0, size: 8192
[set_tensor] buffer: 0x600000bdc000, data: 0xe1310, offset: 0, size: 1024
[graph_recompute] device: 0
[get_tensor] buffer: 0x600000bdc000, data: 0x1000, offset: 0, size: 20480
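To make the per-node logs above easier to compare, here is a minimal Python sketch that groups the `get_tensor` / `set_tensor` lines into segments delimited by `[graph_recompute]` and counts each distinct sequence of (operation, size). Treating every `[graph_recompute]` line as the end of one cycle is my assumption about the log layout, and the names `cycle_signatures`, `TENSOR_RE` and `RECOMPUTE_RE` are made up for this example; nothing here is part of llama.cpp.

```python
import re
from collections import Counter

# Matches lines such as "[set_tensor] buffer: 0x..., data: 0x..., offset: 0, size: 4".
# The leading "[" is optional because a pasted log can start mid-line.
TENSOR_RE = re.compile(r"^\[?(get_tensor|set_tensor)\].*\bsize: (\d+)\s*$")
RECOMPUTE_RE = re.compile(r"^\[?graph_recompute\]")


def cycle_signatures(log_text):
    """Count the (op, size) sequences found between graph_recompute lines.

    Assumption: each graph_recompute line closes one cycle. The segment
    before the first boundary and the one after the last boundary are
    ignored, since a pasted excerpt may be cut off at either end.
    """
    signatures = Counter()
    current = []
    seen_boundary = False
    for raw in log_text.splitlines():
        line = raw.strip()
        if RECOMPUTE_RE.match(line):
            if seen_boundary:
                signatures[tuple(current)] += 1
            seen_boundary = True
            current = []
            continue
        match = TENSOR_RE.match(line)
        if match:
            current.append((match.group(1), int(match.group(2))))
    return signatures


def describe(signatures):
    """Render each signature as one readable line, most frequent first."""
    lines = []
    for ops, count in signatures.most_common():
        steps = " -> ".join(f"{op}({size})" for op, size in ops)
        lines.append(f"{count}x: {steps}")
    return "\n".join(lines)
```

Usage: save each node's `GGML_RPC_DEBUG=1` output to a file and run `print(describe(cycle_signatures(open(path).read())))` for each one. In the excerpts above, the iMac Pro and the second Mac Pro show seven transfers per segment, while the first D700 shows only three, which is the difference this summary is meant to surface.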
Please advise if you need more information.
First Bad Commit
I'm unable to identify the first bad commit
Relevant log output
Logs