The txtai index format currently has a number of different components that support persistence, as follows:
| Component | Description |
|---|---|
| ANN | Approximate Nearest Neighbor indexes |
| Database | Content storage |
| Embeddings | Semantic search engine. Integrates the other components. Also stores configuration and index ids. |
| Graph | Graph networks |
| Scoring | Sparse/keyword indexes |
In most cases, an underlying library dictates the storage format. For example, Faiss has its own index format, as does SQLite.
There are cases in the current code base where Python-specific pickle serialization is being used to save content. While the pickle format is fine for local data, it's well documented that loading pickle data from untrusted sources is not recommended, since unpickling can execute arbitrary code.
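As a minimal sketch of the risk (the class below is purely hypothetical, not anything in txtai), any object's `__reduce__` hook can embed a callable in the pickled data that runs at load time:

```python
import os
import pickle

class Malicious:
    """Hypothetical payload: pickle records the callable returned by
    __reduce__, and that callable is invoked when the data is loaded."""

    def __reduce__(self):
        # Executed by pickle.loads(), not by pickle.dumps()
        return (os.system, ("echo arbitrary code executed at load time",))

payload = pickle.dumps(Malicious())

# Loading the payload runs the embedded command
pickle.loads(payload)
```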
The majority of txtai's use cases involve building local indexes, although indexes can also be synced to cloud storage (object storage, the Hugging Face Hub, etc.). Given that, it's best not to use pickle serialization except when working with local and/or temporary data.
The following issues will handle migrating Python-specific pickle serialization to other methods (a sketch of the replacement approach follows the list).
- Add serialization package for handling supported data serialization methods #770
- Add MessagePack serialization as a top level dependency #771
- Modify NumPy and Torch ANN components to use np.load/np.save #772
- Persist Embeddings index ids (only used when content storage is disabled) with MessagePack #773
- Persist Reducer component with skops library #774
- Persist NetworkX graph component with MessagePack #775
- Persist Scoring component metadata with MessagePack #776
- Modify vector transforms to load/save data using np.load/np.save #777
- Refactor embeddings configuration into separate component #778
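To make the general direction concrete, below is a minimal sketch (not txtai's actual API) of persisting data without pickle: MessagePack for plain Python structures and np.save/np.load for NumPy arrays. All function names, file paths and sample data here are illustrative.

```python
import msgpack
import numpy as np

def save_metadata(config, path):
    """Persist a plain dict with MessagePack instead of pickle."""
    with open(path, "wb") as f:
        f.write(msgpack.packb(config))

def load_metadata(path):
    """Load MessagePack data; no arbitrary code can run at load time."""
    with open(path, "rb") as f:
        return msgpack.unpackb(f.read())

def save_vectors(array, path):
    """Persist a plain numeric NumPy array; no pickle needed."""
    np.save(path, array)

def load_vectors(path):
    """Load arrays with allow_pickle=False to reject pickled objects."""
    return np.load(path, allow_pickle=False)

# Illustrative usage
save_metadata({"ids": [0, 1, 2], "dimensions": 384}, "config.msgpack")
print(load_metadata("config.msgpack"))

save_vectors(np.random.rand(3, 384).astype(np.float32), "vectors.npy")
print(load_vectors("vectors.npy").shape)
```

For the scikit-learn based Reducer component (#774), the skops library serves the same purpose, persisting fitted estimators without relying on pickle's load-time code execution.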