This repository was archived by the owner on Jan 28, 2026. It is now read-only.

Docker image update needed to support Kernel 6.18 #13334

@plumlis

Description

After upgrading the host kernel to Linux 6.18, Ollama-based inference using the ipex-llm-inference-cpp-xpu Docker image (oneAPI / SYCL backend) on Intel Xe (Arc) GPUs fails to function correctly.

The same container and model configuration works as expected on an older kernel (6.17.12).

The issue appears to be related to GPU memory detection / allocation via Level Zero after the kernel upgrade.

Additional

1. The issue does not occur on older kernels with the same container image.

2. No changes were made to the Docker image, Ollama version, or model.

3. This suggests a regression or behavior change in kernel 6.18 affecting GPU memory detection/allocation via Level Zero.

Here are the error logs from Ollama running in Docker:

get_memory_info: [warning] ext_intel_free_memory is not supported (export/set ZES_ENABLE_SYSMAN=1 to support), use total memory as free memory
time=2025-12-14T00:04:07.339+08:00 level=INFO source=server.go:652 msg="waiting for server to become available" status="llm server loading model"

[GIN] 2025/12/14 - 00:04:06 | 200 |      23.915µs |       127.0.0.1 | HEAD     "/"
[GIN] 2025/12/14 - 00:04:06 | 200 |  125.603561ms |       127.0.0.1 | POST     "/api/show"
Native API failed. Native API returns: 39 (UR_RESULT_ERROR_OUT_OF_DEVICE_MEMORY)
Exception caught at file:/home/runner/_work/llm.cpp/llm.cpp/ollama-llama-cpp/ggml/src/ggml-sycl/ggml-sycl.cpp, line:405, func:operator()
SYCL error: CHECK_TRY_ERROR(ctx->stream->memset( (char *)tensor->data + original_size, 0, padded_size - original_size).wait()): Exception caught in this line of code.
  in function ggml_backend_sycl_buffer_init_tensor at /home/runner/_work/llm.cpp/llm.cpp/ollama-llama-cpp/ggml/src/ggml-sycl/ggml-sycl.cpp:405
/home/runner/_work/llm.cpp/llm.cpp/ollama-llama-cpp/ggml/src/ggml-sycl/../ggml-sycl/common.hpp:115: SYCL error

It appears that GPU memory cannot be correctly queried after upgrading to Linux kernel 6.18, while the same setup works normally on kernel 6.17. Upgrading to the latest Level Zero runtime and using newer Ollama versions does not resolve the issue, which suggests a regression or behavior change in Intel Xe GPU memory management introduced in kernel 6.18.
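For anyone trying to reproduce or narrow this down: the warning in the log explicitly suggests setting ZES_ENABLE_SYSMAN=1 so that Level Zero Sysman can report free memory instead of falling back to total memory. A minimal sketch of launching the container with that variable set is below; the container name, device mapping, and port are illustrative assumptions, not taken from the report, and may need adjusting for your setup.

```shell
# Sketch: pass ZES_ENABLE_SYSMAN=1 into the container so Level Zero Sysman
# can answer free-memory queries (per the warning in the Ollama log).
# Image tag, --device path, and port are assumptions for illustration.
docker run -d \
  --device /dev/dri \
  -e ZES_ENABLE_SYSMAN=1 \
  -p 11434:11434 \
  --name ipex-llm-ollama \
  intelanalytics/ipex-llm-inference-cpp-xpu:latest
```

Note that per the report this did not resolve the failure on kernel 6.18 (the latest Level Zero runtime was already tried), but it removes the free-memory fallback from the picture and makes the remaining UR_RESULT_ERROR_OUT_OF_DEVICE_MEMORY easier to attribute to the kernel change.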
