
fix: use configured VECTOR_STORE.DIMENSIONS instead of hardcoded 1536#491

Open
voider1 wants to merge 1 commit into plastic-labs:main from voider1:fix/vector-dimensions-from-config

Conversation


@voider1 voider1 commented Apr 4, 2026

Summary

  • Replace hardcoded Vector(1536) in the SQLAlchemy models (Document, MessageEmbedding) with Vector(settings.VECTOR_STORE.DIMENSIONS) so the dimension is read from config
  • Replace hardcoded "output_dimensionality": 1536 in Gemini embedding calls with the same config value
  • This allows using embedding models with non-1536 dimensions (e.g. 768-dim nomic-embed-text) without patching source
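The core of the change can be sketched with a stand-in settings object (the real code imports `settings` from src.config; the SimpleNamespace here only mimics its nested attribute access):

```python
from types import SimpleNamespace

# Stand-in for the project's settings object; the real code imports
# `settings` from src.config, where VECTOR_STORE.DIMENSIONS defaults to 1536.
settings = SimpleNamespace(VECTOR_STORE=SimpleNamespace(DIMENSIONS=768))

# Before the fix the column type was fixed at import time:
#   embedding = mapped_column(Vector(1536), nullable=True)
# After the fix the dimension flows from config:
#   embedding = mapped_column(Vector(settings.VECTOR_STORE.DIMENSIONS), nullable=True)
dim = settings.VECTOR_STORE.DIMENSIONS
print(f"Vector({dim})")  # prints: Vector(768)
```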

Context

When self-hosting Honcho with a local embedding model (e.g. nomic-embed-text-v1.5 via llama.cpp, which outputs 768 dimensions), the VECTOR_STORE.DIMENSIONS config setting is ignored because src/models.py hardcodes Vector(1536). This causes pgvector to reject embeddings at query time with ValueError: expected 1536 dimensions, not 768.
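The mismatch can be reproduced in miniature with a hypothetical stand-in for pgvector's dimension check (the real validation happens inside Postgres / the pgvector adapter, which raises a similar error when an inserted vector's length differs from the column's declared dimension):

```python
def check_dimensions(expected: int, embedding: list[float]) -> None:
    # Hypothetical helper mimicking pgvector's dimension validation;
    # the real check is performed by Postgres, not application code.
    if len(embedding) != expected:
        raise ValueError(f"expected {expected} dimensions, not {len(embedding)}")

vec = [0.0] * 768  # e.g. output of nomic-embed-text-v1.5

try:
    check_dimensions(1536, vec)  # column hardcoded at Vector(1536)
except ValueError as exc:
    print(exc)  # prints: expected 1536 dimensions, not 768

check_dimensions(768, vec)  # passes once the column matches the config
```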

The config setting already exists and defaults to 1536, so this is a no-op for existing deployments.

Test plan

  • Verify existing deployments with 1536-dim embeddings are unaffected (default config)
  • Verify self-hosted setup with VECTOR_STORE.DIMENSIONS = 768 works without source patching
  • Verify no circular import issues (src.config imported in src.models)

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • Refactor
    • Embedding vector dimensions are now configurable through application settings instead of hardcoded values, supporting vector stores with non-default dimensions.
    • Database models and embedding services now consistently use the configured dimension.

The pgvector embedding dimensions were hardcoded to 1536 in both the
SQLAlchemy models and the Gemini embedding client, ignoring the
VECTOR_STORE.DIMENSIONS config setting. This made it impossible to use
embedding models with different dimensions (e.g., 768-dim models like
nomic-embed-text) without patching source code.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai bot commented Apr 4, 2026

Walkthrough

This PR replaces hardcoded embedding dimensionality (1536) with a configurable settings.VECTOR_STORE.DIMENSIONS value across the embedding client and database models, enabling dynamic vector dimension configuration.

Changes

  • Embedding Client (src/embedding_client.py): Three methods (embed, simple_batch_embed, _process_batch) now use settings.VECTOR_STORE.DIMENSIONS instead of the hardcoded 1536 when configuring Gemini embedding output dimensionality.
  • Database Models (src/models.py): SQLAlchemy Vector column types in the MessageEmbedding and Document models updated to use settings.VECTOR_STORE.DIMENSIONS instead of the hardcoded 1536; adds a settings import from src.config.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

Poem

🐰 A rabbit hops through dimensions so fine,
Trading constants for configs divine—
Embeddings flexible, no longer bound,
In vector space, freedom is found! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: coverage is 66.67%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Description Check ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title check ✅ Passed: the title accurately describes the main change, replacing hardcoded 1536 with configured VECTOR_STORE.DIMENSIONS across embedding-related code.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
src/embedding_client.py (1)

90-94: Add dimensions parameter to OpenAI/OpenRouter embedding calls for consistency with configured vector size.

The Gemini provider respects settings.VECTOR_STORE.DIMENSIONS via output_dimensionality, but the OpenAI and OpenRouter providers don't pass dimensions to the embeddings API. If VECTOR_STORE.DIMENSIONS is configured to a non-default value (e.g., 768), the OpenAI embeddings will still return 1536-dimensional vectors, causing a mismatch with the database schema.

♻️ Suggested fix
         else:  # openai
             response = await self.client.embeddings.create(
-                model=self.model, input=query
+                model=self.model,
+                input=query,
+                dimensions=settings.VECTOR_STORE.DIMENSIONS,
             )
             return response.data[0].embedding

Apply similar changes to lines 126-130 and 266-268.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/embedding_client.py` around lines 90 - 94, The OpenAI/OpenRouter
embedding calls (the branches that call self.client.embeddings.create in the
embedding client) must pass the configured vector size so returned vectors match
settings.VECTOR_STORE.DIMENSIONS; update those calls to include
dimensions=settings.VECTOR_STORE.DIMENSIONS (similar to how Gemini uses
output_dimensionality) in the OpenAI and OpenRouter branches, and apply the same
change to the other two embedding.create call sites that currently omit
dimensions (the other OpenAI/OpenRouter branches referenced in the review).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 47328559-6761-46bd-8e06-84d888d9cb23

📥 Commits

Reviewing files that changed from the base of the PR and between e487358 and 7c01b4b.

📒 Files selected for processing (2)
  • src/embedding_client.py
  • src/models.py

src/models.py (context for this comment):

  )
  content: Mapped[str] = mapped_column(TEXT)
- embedding: MappedColumn[Any] = mapped_column(Vector(1536), nullable=True)
+ embedding: MappedColumn[Any] = mapped_column(Vector(settings.VECTOR_STORE.DIMENSIONS), nullable=True)

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Scripts executed (repository: plastic-labs/honcho):

#!/bin/bash
# Verify current migration hardcodes the dimension
rg -n "Vector\(1536\)" --glob '*.py' migrations/

# Check lines 282 and 390 in src/models.py
echo "=== Checking src/models.py lines 282 and 390 ==="
sed -n '280,284p' src/models.py
echo "---"
sed -n '388,392p' src/models.py

# List all migration files chronologically to understand scope
echo "=== All migrations in versions directory ==="
ls -1 migrations/versions/ | sort
echo "=== Count of Vector(1536) across all files ==="
rg "Vector\(1536\)" --glob '*.py' migrations/ -c --sort path

# Check the latest migration(s) to see if they address dimension config
echo "=== Latest migration content (last 50 lines) ==="
latest=$(ls -1 migrations/versions/*.py | sort | tail -1)
echo "File: $latest"
tail -50 "$latest"

# Identify which models have embedding columns
echo "=== Classes containing embedding columns ==="
rg -B 5 "embedding: MappedColumn" src/models.py | grep "^class"

# Check the support_external_embeddings migration to see if it attempted a fix
echo "=== support_external_embeddings migration (context around Vector changes) ==="
sed -n '40,65p' migrations/versions/119a52b73c60_support_external_embeddings.py

# Count total affected columns across all migrations
echo "=== Tables and columns affected by hardcoded Vector(1536) ==="
rg "Vector\(1536\)" -B 2 -A 0 --glob '*.py' migrations/ | grep -E "(op\.|sa\.Column)" | head -20


Database migration required for non-default vector dimensions.

The SQLAlchemy models now declare Vector(settings.VECTOR_STORE.DIMENSIONS) in documents and message_embeddings tables (src/models.py lines 282 and 390), but migrations still hardcode Vector(1536):

  • a1b2c3d4e5f6_initial_schema.py (documents, line 366)
  • 917195d9b5e9_add_messageembedding_table.py (message_embeddings, line 31)
  • 119a52b73c60_support_external_embeddings.py (referenced in alter operations, lines 45 and 53)

For deployments using VECTOR_STORE.DIMENSIONS != 1536:

  • Existing databases: Columns remain Vector(1536), causing insertion errors when the model attempts different dimensions.
  • New databases: Migrations create Vector(1536) regardless of configuration.

The 119a52b73c60_support_external_embeddings.py migration only made these columns nullable; it did not address the dimension mismatch.

Recommended actions:

  1. Add a new migration to alter both documents.embedding and message_embeddings.embedding columns to match the configured dimension.
  2. Document that non-1536 deployments must manually run ALTER TABLE statements or apply the new migration before using custom dimensions.
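Such a migration could boil down to two ALTER TABLE statements. The helper below is a sketch only (table and column names are taken from the review above; everything else is hypothetical), where a real Alembic revision would pass each string to op.execute():

```python
# Sketch: builds the ALTER statements a new Alembic migration could run;
# a real revision would call op.execute() on each of these strings.
EMBEDDING_COLUMNS = [
    ("documents", "embedding"),
    ("message_embeddings", "embedding"),
]

def alter_statements(dimensions: int) -> list[str]:
    # USING ...::vector(n) asks Postgres to cast existing values; rows whose
    # stored dimension differs will still fail, so non-1536 deployments
    # should apply this before inserting any embeddings.
    return [
        f"ALTER TABLE {table} ALTER COLUMN {column} "
        f"TYPE vector({dimensions}) USING {column}::vector({dimensions})"
        for table, column in EMBEDDING_COLUMNS
    ]

for stmt in alter_statements(768):
    print(stmt)
```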
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/models.py` at line 282, The models now use
Vector(settings.VECTOR_STORE.DIMENSIONS) for the embedding columns but existing
migrations hardcode Vector(1536); create a new Alembic migration that alters
documents.embedding and message_embeddings.embedding to the configured dimension
(use the same Vector type with settings.VECTOR_STORE.DIMENSIONS) and include a
safe downgrade path; ensure the migration is idempotent (check current dimension
before altering or use SQL that works across Postgres versions), reference the
Vector type change for columns in the migration, and add a short note in the
migration message/README that deployments with VECTOR_STORE.DIMENSIONS != 1536
must run the new migration (or manually ALTER TABLE ... TYPE ...) before
inserting embeddings.
