Create a model that caches generated embeddings #4290

@akolson

Overview

As part of optimizing the recommendations app’s performance, we need to implement a Django model that serves as a cache for embeddings generated by the model(s). Embeddings are computationally intensive and time-consuming to generate, so caching them improves response times and avoids redundant computation. This task involves designing and implementing the model and ensuring proper cache management.

Description and outcomes

  1. Create the Embeddings model in /contentcuration/contentcuration/models.py in the contentcuration app.
  2. Define the fields (content_id, embedding) with their corresponding data types (UUID, VectorField).
  3. Generate the migration to create the Embeddings model.
  4. Include the addition of the pgvector extension to the migration.
  5. Apply the migration to update the database schema and create the Embeddings table.
  6. Write unit tests to verify the correctness of the Embeddings model's creation and migration.
  7. Verify that the migration includes the expected changes.

Accessibility requirements

Not applicable

Acceptance criteria

  1. The Embeddings model is implemented with two fields: content_id (UUID) and embedding (VectorField).
  2. A migration is generated to create the Embeddings model.
  3. The pgvector extension is added to the migration.
  4. The migration is applied successfully, resulting in the creation of the Embeddings table and pgvector extension.
  5. Unit tests validate the correct implementation of the Embeddings model and migration.
  6. Documentation is updated to provide information about the Embeddings model.

Resources

Embeddings
pgvector
pgvector-python
