
Eval bug: llama-server fails to pass the image when switching models and regenerating an image prompt. It starts working after hitting regenerate again, though. #22257

@lilblam

Description


Name and Version

ggml_cuda_init: found 1 CUDA devices (Total VRAM: 16310 MiB):
Device 0: NVIDIA GeForce RTX 5060 Ti, compute capability 12.0, VMM: yes, VRAM: 16310 MiB
load_backend: loaded CUDA backend from C:\Chu\LLM\Llamacpp\ggml-cuda.dll
load_backend: loaded RPC backend from C:\Chu\LLM\Llamacpp\ggml-rpc.dll
load_backend: loaded CPU backend from C:\Chu\LLM\Llamacpp\ggml-cpu-zen4.dll
version: 8883 (134d6e5)
built with Clang 19.1.5 for Windows x86_64

Operating systems

Windows

GGML backends

CUDA

Hardware

AMD Ryzen 9 9900X
NVIDIA GeForce RTX 5060 Ti

Models

Reproduced with any two vision-capable LLMs, even within the same family.
Try Qwen3.5-35b and Qwen3.5-27b, for example.

Problem description & steps to reproduce

  1. Load llama-server with any vision model (e.g. Qwen3.5-35b, any gguf).

  2. Send a prompt with an image attached, e.g. "Describe this image".

  3. Close the llama-server back-end to unload the model.

  4. Open a different vision model in llama-server (it doesn't matter which, as long as it's not the same model). Try Gemma or Qwen3.5-27b, for example.

  5. Refresh the llama-server webui in the browser (to make sure the samplers get updated from the new model's launch params).

  6. Click "regenerate prompt" button:

[Screenshot: the "regenerate prompt" button in the webui]

The new model will initially say it doesn't see any image; you can confirm in the llama-server logs that no image was sent.

  7. Click the "regenerate prompt" button again; this time the image will be sent.
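Step 1 above can be sketched as a launch command. A minimal example, assuming local gguf files (the filenames and port here are illustrative, not from the report):

```shell
# Launch llama-server with a vision model plus its multimodal projector.
# Model/mmproj filenames are placeholders; substitute your own paths.
llama-server \
  -m Qwen3.5-35b.gguf \
  --mmproj mmproj-Qwen3.5-35b.gguf \
  --port 8080
```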

So when using the webui to test different models against the same image prompt, the image is not sent the first time you click "regenerate prompt" after loading a new model, but it is sent on the second click. This behavior started a few months ago; I'm not sure in which version. It is not an issue when using the API from Python, so it must be a front-end quirk.
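For comparison, the API path the report says works can be exercised against llama-server's OpenAI-compatible `/v1/chat/completions` endpoint. A minimal sketch (the host, port, and image filename are assumptions):

```python
import base64
import json


def build_image_request(prompt: str, image_bytes: bytes, model: str = "default") -> dict:
    """Build an OpenAI-compatible chat payload embedding the image as a
    base64 data URL, the format llama-server accepts for vision models."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/png;base64,{b64}"},
                    },
                ],
            }
        ],
    }


# To actually send it (server assumed at http://127.0.0.1:8080):
# import urllib.request
# payload = build_image_request("Describe this image", open("test.png", "rb").read())
# req = urllib.request.Request(
#     "http://127.0.0.1:8080/v1/chat/completions",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
```

Re-sending this same payload after swapping models always includes the image, which is why the bug points at the webui rather than the server.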

First Bad Commit

Unfortunately it's been a few months and I did not take note of which release started it.

Relevant log output

Not sure the logs are useful here: they simply show no image processing on the first "regenerate prompt" after loading a different VLM, and then show it on the second attempt.
