Description
Context
The current architecture creates one Elasticsearch index per user and per collection.
This choice was initially made to remain functionally equivalent to Qdrant, the technology previously used.
With increased usage and load, this model is showing its limits and does not comply with best practices recommended by Elastic.
AIA creates one collection per document and never deletes it, which leads to saturation of our Elasticsearch database.
Security
Separating data by index does not provide any additional security in our architecture.
Indeed:
- Elasticsearch is not directly exposed
- All access goes through our API
- Isolation (user, collection) is enforced at the application level
In the event of an API bug, unauthorized access is possible regardless of the number of indices.
Security therefore relies on the API, not on the index structure.
Index-level isolation is only relevant if Elasticsearch is:
- used by several distinct services
- or directly exposed to clients
➡️ Neither applies in our case.
Limitations of the Current Model
- The multiplication of indices leads to an excessive number of shards
- This degrades cluster performance and stability
- Elastic explicitly discourages this model
Moreover, physical separation by business criteria (e.g., organization) would greatly complicate:
- organizational changes
- data migrations
- overall system maintenance
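To make the shard multiplication concrete, here is a rough back-of-the-envelope sketch. The index counts and the one-primary / one-replica settings are illustrative assumptions, not measurements from our cluster; `total_shards` is a hypothetical helper.

```python
def total_shards(index_count: int, primaries: int = 1, replicas: int = 1) -> int:
    """Total shard copies held by the cluster for a given number of indices.

    Each index has `primaries` primary shards, and every primary has
    `replicas` replica copies, so an index costs primaries * (1 + replicas).
    """
    return index_count * primaries * (1 + replicas)


# Hypothetical figures, for illustration only.
per_collection_model = total_shards(index_count=10_000)  # one index per collection
consolidated_model = total_shards(index_count=1)         # single global index

print(per_collection_model, consolidated_model)
```

Under these assumptions the shard count grows linearly with the number of collections in the current model, while it stays constant after consolidation.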
Planned Actions
💡
The goal is not to change AIA’s implementation, but to introduce constraints that will allow us to ensure the long-term sustainability of the infrastructure.
Even if AIA eventually migrates to another system, these actions are necessary to maintain the RAG system.
Consolidate Elasticsearch indices (4 days)
- Move from one index per user / collection to a single global index (or very few indices)
- Isolation ensured through fields (user_id, collection_id) and mandatory application-level filters
- Objectives:
  - drastically reduce the number of shards
  - improve stability and scalability
  - align with Elastic’s recommendations
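A minimal sketch of what the mandatory application-level filters could look like, assuming `user_id` and `collection_id` are indexed as exact-match (keyword) fields. The `build_scoped_query` helper and the `content` field name are hypothetical and only serve the illustration.

```python
from typing import Any


def build_scoped_query(user_id: str, collection_id: str, text: str) -> dict[str, Any]:
    """Wrap a full-text search in mandatory user/collection filters."""
    if not user_id or not collection_id:
        # Refuse to build an unscoped query: isolation relies on these filters.
        raise ValueError("user_id and collection_id are required")
    return {
        "query": {
            "bool": {
                # Scored part of the query ("content" is an assumed field name).
                "must": [{"match": {"content": text}}],
                # Non-scored, exact-match isolation filters.
                "filter": [
                    {"term": {"user_id": user_id}},
                    {"term": {"collection_id": collection_id}},
                ],
            }
        }
    }
```

Routing every search through a single helper of this kind would mean that no code path can query the global index without both filters set.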
Limit document volume per user (1 day)
- Define a configurable total volume cap per user, recommended: 2 GB
- Block uploads when the threshold is reached
- Objectives:
  - prevent abuse or uncontrolled usage
  - ensure fairness between users
  - protect the infrastructure
→ For next milestone
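The volume cap could be enforced with a check along these lines before accepting an upload. This is a sketch under assumptions: the names (`QuotaExceeded`, `check_upload_allowed`) are hypothetical, 2 GB is read as 2 GiB, and the per-user usage is assumed to be tracked elsewhere.

```python
# Recommended default from this issue; interpreted here as 2 GiB (assumption).
DEFAULT_MAX_BYTES_PER_USER = 2 * 1024**3


class QuotaExceeded(Exception):
    """Raised when an upload would push a user past their volume cap."""


def check_upload_allowed(
    current_usage_bytes: int,
    upload_size_bytes: int,
    max_bytes: int = DEFAULT_MAX_BYTES_PER_USER,
) -> None:
    """Raise QuotaExceeded if the upload does not fit in the remaining quota."""
    if current_usage_bytes + upload_size_bytes > max_bytes:
        raise QuotaExceeded(
            f"upload of {upload_size_bytes} bytes exceeds the cap of {max_bytes} bytes"
        )
```

This variant rejects an upload as soon as it would exceed the cap, rather than only once the cap has already been reached; either reading of the rule is possible and the choice is left open here.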
Implement a TTL per collection (2 days)
- Define a configurable retention period per collection
- Apply this TTL by default to all new collections
- Allow users to choose not to define a TTL
- Automatically delete documents once they expire
- Objectives:
  - control data volume
  - reduce costs
  - align with usage patterns (temporary vs. long-lived documents)
- Possible implementation via:
  - an expires_at field
  - an
→ For next milestone
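For the expires_at option, a minimal sketch might look like the following. The helper names are hypothetical, and the idea of a scheduled job sending the resulting body to Elasticsearch's delete-by-query endpoint is an assumption, not a decided design.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional


def compute_expires_at(created_at: datetime, ttl: Optional[timedelta]) -> Optional[datetime]:
    """Return the expiry timestamp, or None when the user opted out of a TTL."""
    if ttl is None:
        return None
    return created_at + ttl


def expired_documents_query(now: datetime) -> dict[str, Any]:
    """Query body matching documents whose expires_at is in the past.

    Documents without an expires_at value do not match a range query,
    so collections without a TTL are left untouched.
    """
    return {"query": {"range": {"expires_at": {"lte": now.isoformat()}}}}


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(compute_expires_at(created, timedelta(days=30)))  # TTL applied
print(compute_expires_at(created, None))                # no TTL chosen
```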
Limit the number of collections per user (1 day)
- Define a configurable cap on the number of collections per user, recommended: 100
- Block creation once the threshold is reached
- Objectives:
  - limit abuse (e.g., massive creation of empty collections)
  - prevent pathological usage patterns
  - simplify product governance
  - simplify technical evaluations
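The collection cap reduces to a simple guard at creation time. A sketch, assuming the API can count a user's existing collections; `can_create_collection` is a hypothetical name.

```python
# Recommended default from this issue; meant to be configurable.
DEFAULT_MAX_COLLECTIONS_PER_USER = 100


def can_create_collection(
    existing_count: int,
    max_collections: int = DEFAULT_MAX_COLLECTIONS_PER_USER,
) -> bool:
    """Return False once the user has reached the cap."""
    return existing_count < max_collections
```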
Risks / Drawbacks
- Requires AIA to revise their implementation
→ For next milestone
- + issue 618