
[BUG] Persistent 400 Bad Request on /api/embed using Ollama with LiteLLM #1425

@GratefulDave

Description

I am giving you guys another try. I like your implementation more than openclaw and hermes from a UX perspective. If I can ever get over the intermittent issues, it would be my daily driver. ...

Agent Zero consistently fails with a 400 Bad Request when attempting to access the Ollama /api/embed endpoint via LiteLLM. This occurs during memory similarity searches, specifically within the search_similarity_threshold function of the memory plugin.

The error persists even after clearing the FAISS index (index.faiss/index.pkl) and verifying that the embedding model (nomic-embed-text) is pulled and active. There appears to be a protocol mismatch or malformed request body being sent by LiteLLM to the Ollama API.
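One way to make the suspected protocol mismatch concrete: Ollama's newer /api/embed endpoint and its legacy /api/embeddings endpoint expect differently shaped request bodies, and sending one endpoint's shape to the other is a common cause of a 400. The field names below follow Ollama's published API docs; whether LiteLLM is actually emitting the legacy shape here is an assumption worth verifying against Ollama's server logs (OLLAMA_DEBUG=1).

```python
def embed_payload(model: str, texts: list[str]) -> dict:
    """Body for the newer /api/embed endpoint: uses "input",
    which may be a string or a list of strings."""
    return {"model": model, "input": texts}


def legacy_embeddings_payload(model: str, text: str) -> dict:
    """Body for the older /api/embeddings endpoint: uses "prompt",
    a single string. Sending this shape to /api/embed (or vice
    versa) can produce a 400 Bad Request."""
    return {"model": model, "prompt": text}
```

Comparing the JSON LiteLLM actually posts against these two shapes should pinpoint which side is at fault.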

Additionally, the local utility model (llama3.2:1b) fails to adhere to the strict JSON output constraints in the _51_memorize_solutions.py extension, emitting conversational commentary instead of raw JSON. This suggests that prompt handling, or the model's instruction following, is being disrupted by the underlying connection issues.
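As a stopgap for the chatty-output problem (independent of the 400 error), a tolerant parser can recover the JSON object from a reply that small models wrap in prose. This is a workaround sketch, not part of Agent Zero's codebase:

```python
import json
import re


def extract_json(raw: str):
    """Best-effort: pull the first JSON object out of a model reply
    that may be wrapped in conversational text, e.g.
    'Sure! Here is the JSON: {...}'."""
    # Fast path: the reply is already raw JSON.
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    # Otherwise grab the outermost {...} span and parse that.
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))
```

This does not fix the model's instruction following, but it keeps the memorization extension from crashing on a decorated reply.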

Environment

  • Agent Zero Version: 2026 Build (Dockerized)
  • Ollama Server Version: 0.18.3 (Homebrew)
  • Ollama Client Version: 0.20.0
  • LiteLLM Version: Latest (via Agent Zero venv)
  • Main Model: GLM-4
  • Utility Model: llama3.2:1b
  • Embedding Model: nomic-embed-text
  • OS: macOS (Host)

Traceback

litellm.exceptions.APIConnectionError: litellm.APIConnectionError: OllamaException - Client error '400 Bad Request' for url 'http://host.docker.internal:11434/api/embed'

Traceback (most recent call last):
  File "/opt/venv-a0/lib/python3.12/site-packages/litellm/main.py", line 4476, in embedding
    response = ollama_embeddings_fn(
               ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv-a0/lib/python3.12/site-packages/litellm/llms/ollama/completion/handler.py", line 114, in ollama_embeddings
    response = litellm.module_level_client.post(url=api_base, json=data)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv-a0/lib/python3.12/site-packages/httpx/_models.py", line 829, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Client error '400 Bad Request' for url 'http://host.docker.internal:11434/api/embed'

During handling of the above exception, another exception occurred:

  File "/a0/plugins/_memory/helpers/memory.py", line 340, in search_similarity_threshold
    return await self.db.asearch(
  File "/opt/venv-a0/lib/python3.12/site-packages/langchain_community/vectorstores/faiss.py", line 549, in asimilarity_search_with_score
    embedding = await self._aembed_query(query)
  File "/a0/models.py", line 613, in embed_query
    resp = embedding(model=self.model_name, input=[text], **self.kwargs)
  File "/opt/venv-a0/lib/python3.12/site-packages/litellm/main.py", line 4847, in embedding
    raise exception_type(
litellm.exceptions.APIConnectionError: litellm.APIConnectionError: OllamaException - Client error '400 Bad Request' for url 'http://host.docker.internal:11434/api/embed'

Steps to Reproduce

  1. Configure Agent Zero to use ollama/nomic-embed-text for embeddings.
  2. Configure Agent Zero to use ollama/llama3.2:1b for utility tasks.
  3. Run an agent task that triggers memory retrieval or solution memorization.
  4. Observe that the system crashes when the embedding function is called during the vector search.
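To separate an Ollama-side problem from a LiteLLM-side one, the endpoint can be probed directly from inside the container with a request shaped exactly as Ollama's /api/embed docs specify. This stdlib-only sketch (hypothetical helper, not part of Agent Zero) reports the HTTP status rather than raising:

```python
import json
import urllib.error
import urllib.request


def probe_embed_endpoint(base_url: str, model: str, text: str,
                         timeout: float = 5.0) -> str:
    """POST a documented-shape body to /api/embed and report the result.
    If this succeeds while LiteLLM gets a 400, the payload LiteLLM
    builds is the likely culprit; if this also gets a 400, the
    server/model setup is."""
    payload = json.dumps({"model": model, "input": text}).encode()
    req = urllib.request.Request(
        base_url.rstrip("/") + "/api/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            body = json.loads(resp.read())
            dims = len(body.get("embeddings", [[]])[0])
            return f"ok: HTTP {resp.status}, {dims}-dim embedding"
    except urllib.error.HTTPError as e:
        return f"http error: {e.code}"
    except (urllib.error.URLError, OSError) as e:
        return f"unreachable: {e}"
```

From the Agent Zero container this would be called as `probe_embed_endpoint("http://host.docker.internal:11434", "nomic-embed-text", "hello")`.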

Expected Behavior

The embedding request should be formatted correctly for the Ollama /api/embed endpoint, and the memory index should initialize/search without triggering a 400 status code.
