[data] Fix ES scaling #643

@leoguillaume

Context

The current architecture creates one Elasticsearch index per user and per collection.
This choice was initially made to remain functionally equivalent to Qdrant, the technology previously used.

With increased usage and load, this model is showing its limits and does not comply with best practices recommended by Elastic.

AIA creates one collection per document and never deletes it, so indices accumulate without bound and saturate our Elasticsearch cluster.


Security

Separating data by index does not provide any additional security in our architecture.

Indeed:

  • Elasticsearch is not directly exposed
  • All access goes through our API
  • Isolation (user, collection) is enforced at the application level

In the event of an API bug, unauthorized access is possible regardless of the number of indices.
Security therefore relies on the API, not on the index structure.

Index-level isolation is only relevant if Elasticsearch is:

  • used by several distinct services
  • or directly exposed to clients

➡️ This is not our case.

Limitations of the Current Model

  • The multiplication of indices leads to an excessive number of shards
  • This degrades cluster performance and stability
  • Elastic explicitly discourages this model

Moreover, physical separation by business criteria (e.g., organization) would greatly complicate:

  • organizational changes
  • data migrations
  • overall system maintenance

Planned Actions

💡 The goal is not to change AIA’s implementation, but to introduce constraints that will allow us to ensure the long-term sustainability of the infrastructure.

Even if AIA eventually migrates to another system, these actions are necessary to maintain the RAG system.

Consolidate Elasticsearch indices (4 days)

  • Move from one index per user / collection to a single global index (or very few indices)

  • Isolation is enforced through fields (user_id, collection_id) and mandatory application-level filters

  • Objectives:

    • drastically reduce the number of shards
    • improve stability and scalability
    • align with Elastic’s recommendations
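
To illustrate what "mandatory application-level filters" could look like against a single shared index, here is a minimal sketch that builds an Elasticsearch Query DSL body. The helper name `build_scoped_query` and the `content` field are hypothetical; only `user_id` and `collection_id` come from this proposal. The point is that the API refuses to build a query that is not scoped to a user and a collection.

```python
from typing import Any


def build_scoped_query(user_id: str, collection_id: str, text: str) -> dict[str, Any]:
    """Build a search body that is always scoped to one user and one collection."""
    if not user_id or not collection_id:
        # Refuse to build an unscoped query: isolation depends on these filters.
        raise ValueError("user_id and collection_id are required")
    return {
        "query": {
            "bool": {
                # "content" is an assumed field name, used here for illustration only.
                "must": [{"match": {"content": text}}],
                # Filter clauses do not affect scoring and restrict the result set.
                "filter": [
                    {"term": {"user_id": user_id}},
                    {"term": {"collection_id": collection_id}},
                ],
            }
        }
    }
```

Routing every search through one such helper keeps the isolation rule in a single place, which makes it easier to test than filters scattered across endpoints.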

Limit document volume per user (1 day)

  • Define a configurable total volume cap per user, recommended: 2 GB

  • Block uploads when the threshold is reached

  • Objectives:

    • prevent abuse or uncontrolled usage
    • ensure fairness between users
    • protect the infrastructure

→ For next milestone
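
A possible shape for the volume check, as a sketch only. The names `check_upload_allowed` and `QuotaExceeded` are hypothetical, the 2 GB default is interpreted as decimal gigabytes, and the choice to reject an upload that would push the user above the cap (rather than only once the cap is already reached) is an assumption.

```python
# 2 GB recommended cap, interpreted here as decimal gigabytes (assumption).
DEFAULT_MAX_BYTES_PER_USER = 2 * 10**9


class QuotaExceeded(Exception):
    """Raised when an upload would exceed the per-user volume cap."""


def check_upload_allowed(
    current_bytes: int,
    upload_bytes: int,
    max_bytes: int = DEFAULT_MAX_BYTES_PER_USER,
) -> None:
    """Raise QuotaExceeded if the upload would push the user above the cap."""
    if current_bytes + upload_bytes > max_bytes:
        raise QuotaExceeded(
            f"upload of {upload_bytes} bytes would exceed the cap of {max_bytes} bytes"
        )
```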

Implement a TTL per collection (2 days)

  • Define a configurable retention period per collection

  • Apply this TTL by default to all new collections

  • Allow users to choose not to define a TTL

  • Automatically delete documents once they expire

  • Objectives:

    • control data volume
    • reduce costs
    • align with usage patterns (temporary vs. long-lived documents)
  • Possible implementation via:

    • an expires_at field

→ For next milestone
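
One way the `expires_at` approach could work, sketched under assumptions: the expiry is computed when a document is written, and a periodic job sends a range query to Elasticsearch's delete-by-query API. The function names and the `ttl_days` parameter are hypothetical. A document without `expires_at` (user opted out of the TTL) is never matched by the range query, so it is kept.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional


def compute_expires_at(created_at: datetime, ttl_days: Optional[int]) -> Optional[str]:
    """Return the expiry as an ISO 8601 string, or None when no TTL is defined."""
    if ttl_days is None:
        return None
    return (created_at + timedelta(days=ttl_days)).isoformat()


def build_expired_documents_query(now: datetime) -> dict[str, Any]:
    """Build a delete-by-query body selecting documents whose expires_at has passed."""
    return {"query": {"range": {"expires_at": {"lte": now.isoformat()}}}}


# Example: a document created on 1 January 2025 with a 30-day TTL.
created = datetime(2025, 1, 1, tzinfo=timezone.utc)
expiry = compute_expires_at(created, 30)
```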

Limit the number of collections per user (1 day)

  • Define a configurable cap on the number of collections per user, recommended: 100

  • Block creation once the threshold is reached

  • Objectives:

    • limit abuse (e.g., massive creation of empty collections)
    • prevent pathological usage patterns
    • simplify product governance
    • simplify technical evaluations
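
The collection cap can be a single guard at creation time. A minimal sketch, where `can_create_collection` is a hypothetical name and the default of 100 is the value recommended above:

```python
DEFAULT_MAX_COLLECTIONS_PER_USER = 100  # recommended cap from this proposal


def can_create_collection(
    existing_count: int,
    max_collections: int = DEFAULT_MAX_COLLECTIONS_PER_USER,
) -> bool:
    """Return False once the user has reached the cap, so creation is blocked."""
    return existing_count < max_collections
```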

Risks / Drawbacks

  • Requires AIA to revise their implementation

→ For next milestone

  • + issue 618
