I am giving you guys another try. I like your implementation more than openclaw and hermes from a UX perspective. If I can ever get over these intermittent issues, it would be my daily driver. ...
Description
Agent Zero consistently fails with a 400 Bad Request when attempting to access the Ollama /api/embed endpoint via LiteLLM. This occurs during memory similarity searches, specifically within the search_similarity_threshold function of the memory plugin.
The error persists even after clearing the FAISS index (index.faiss/index.pkl) and verifying that the embedding model (nomic-embed-text) is pulled and active. There appears to be a protocol mismatch or malformed request body being sent by LiteLLM to the Ollama API.
Additionally, the local utility model (llama3.2:1b) fails to adhere to the strict JSON output constraints in the _51_memorize_solutions.py extension, returning conversational commentary instead of raw JSON. This suggests that prompt handling, or the model's instruction following, is being disrupted by the underlying connection issues.
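As a stopgap for the JSON-adherence problem, a tolerant parser that strips the model's commentary and markdown fences before parsing might help. This is a hedged sketch, not code from Agent Zero; the function name and behavior are my own suggestion:

```python
import json

def extract_json(raw: str) -> dict:
    """Best-effort extraction of a JSON object from chatty model output.

    Small models like llama3.2:1b often wrap their answer in prose or
    ```json fences. Locate the outermost braces and parse only that span.
    Raises ValueError if no object-like span is present.
    """
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    return json.loads(raw[start : end + 1])
```

This would not fix the underlying connection issue, but it would make the memorize-solutions extension resilient to conversational preambles.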
Environment
- Agent Zero Version: 2026 Build (Dockerized)
- Ollama Server Version: 0.18.3 (Homebrew)
- Ollama Client Version: 0.20.0
- LiteLLM Version: Latest (via Agent Zero venv)
- Main Model: GLM-4
- Utility Model: llama3.2:1b
- Embedding Model: nomic-embed-text
- OS: macOS (Host)
Traceback
litellm.exceptions.APIConnectionError: litellm.APIConnectionError: OllamaException - Client error '400 Bad Request' for url 'http://host.docker.internal:11434/api/embed'
Traceback (most recent call last):
File "/opt/venv-a0/lib/python3.12/site-packages/litellm/main.py", line 4476, in embedding
response = ollama_embeddings_fn(
^^^^^^^^^^^^^^^^^^^^^
File "/opt/venv-a0/lib/python3.12/site-packages/litellm/llms/ollama/completion/handler.py", line 114, in ollama_embeddings
response = litellm.module_level_client.post(url=api_base, json=data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/venv-a0/lib/python3.12/site-packages/httpx/_models.py", line 829, in raise_for_status
raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Client error '400 Bad Request' for url 'http://host.docker.internal:11434/api/embed'
During handling of the above exception, another exception occurred:
File "/a0/plugins/_memory/helpers/memory.py", line 340, in search_similarity_threshold
return await self.db.asearch(
File "/opt/venv-a0/lib/python3.12/site-packages/langchain_community/vectorstores/faiss.py", line 549, in asimilarity_search_with_score
embedding = await self._aembed_query(query)
File "/a0/models.py", line 613, in embed_query
resp = embedding(model=self.model_name, input=[text], **self.kwargs)
File "/opt/venv-a0/lib/python3.12/site-packages/litellm/main.py", line 4847, in embedding
raise exception_type(
litellm.exceptions.APIConnectionError: litellm.APIConnectionError: OllamaException - Client error '400 Bad Request' for url 'http://host.docker.internal:11434/api/embed'
Steps to Reproduce
- Configure Agent Zero to use ollama/nomic-embed-text for embeddings.
- Configure Agent Zero to use ollama/llama3.2:1b for utility tasks.
- Run an agent task that triggers memory retrieval or solution memorization.
- The system crashes when calling the embedding function during a vector search.
Expected Behavior
The embedding request should be formatted correctly for the Ollama /api/embed endpoint, and the memory index should initialize/search without triggering a 400 status code.
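For anyone triaging this: one plausible cause of a 400 here is a field-name mismatch between Ollama's newer /api/embed endpoint (which documents an "input" field accepting a string or list of strings) and the legacy /api/embeddings endpoint (which expects a single "prompt" string). A minimal sketch of the two documented request shapes, assuming the mismatch is in the payload rather than the URL:

```python
def build_embed_payload(model: str, texts: list[str]) -> dict:
    # Shape documented for POST /api/embed: "input" may be a string
    # or a list of strings; the response carries "embeddings".
    return {"model": model, "input": texts}

def build_legacy_embeddings_payload(model: str, text: str) -> dict:
    # Shape documented for the legacy POST /api/embeddings: a single
    # "prompt" string; the response carries "embedding". Sending this
    # body to /api/embed (or "input" to /api/embeddings) could trigger
    # a 400 like the one in the traceback above.
    return {"model": model, "prompt": text}
```

Comparing the body LiteLLM actually posts (e.g. via Ollama's server log or a proxy) against these shapes should confirm or rule out a malformed request body.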