[Feature Request] Disable System Ingest Pipeline Cache When Memory Usage is High #18251

@bzhangam

Description

Is your feature request related to a problem? Please describe

This is a follow-up item for the system ingest pipeline feature. The system ingest pipeline lets plugins systematically add ingest processors based on the index mapping. We cache the index -> pipeline mapping at the node level to avoid recreating the pipeline for each index request. However, the cache carries a risk of using too much memory. By default we currently don't limit how many processors can be added to an ingest pipeline, so if a plugin bug adds a large number of processors to a system ingest pipeline and we simply cache it, the cache may use too much memory.

Describe the solution you'd like

We propose adding logic to check memory usage before caching a new pipeline, to make the system more resilient. When we execute a pipeline and cannot find it in the cache, we will simply create one at runtime and release it after execution. This can add some latency to indexing, but it keeps the node up instead of letting it run out of memory.
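A minimal sketch of the idea, to make the proposal concrete. Everything here is an assumption for illustration: the class name `MemoryAwarePipelineCache`, the injectable heap-usage supplier, and the threshold value are hypothetical and do not reflect existing OpenSearch code. It uses only the standard `Runtime` heap counters as a rough memory signal.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.DoubleSupplier;
import java.util.function.Function;

/** Hypothetical sketch: caches a pipeline per index only while heap usage is below a threshold. */
class MemoryAwarePipelineCache<P> {
    private final Map<String, P> cache = new ConcurrentHashMap<>();
    private final DoubleSupplier heapUsage; // fraction of max heap in use, 0.0 to 1.0
    private final double threshold;         // assumed cutoff, e.g. 0.85

    MemoryAwarePipelineCache(DoubleSupplier heapUsage, double threshold) {
        this.heapUsage = heapUsage;
        this.threshold = threshold;
    }

    /** Rough heap usage estimate from the JVM's own counters. */
    static double jvmHeapUsage() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) used / rt.maxMemory();
    }

    /** Returns the cached pipeline, or builds one and caches it only if memory allows. */
    P getOrCreate(String index, Function<String, P> factory) {
        P cached = cache.get(index);
        if (cached != null) {
            return cached;
        }
        P created = factory.apply(index);
        if (heapUsage.getAsDouble() < threshold) {
            // Memory is fine: keep the pipeline for later requests.
            P existing = cache.putIfAbsent(index, created);
            return existing != null ? existing : created;
        }
        // Memory is high: hand back an uncached pipeline that can be
        // garbage collected once the caller finishes executing it.
        return created;
    }

    int size() {
        return cache.size();
    }
}
```

A caller could wire it up as `new MemoryAwarePipelineCache<>(MemoryAwarePipelineCache::jvmHeapUsage, 0.85)`. Injecting the usage supplier keeps the check easy to test and leaves room to swap in a different memory signal. The sketch does not evict entries that were cached before memory pressure started.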

Related component

Indexing

Describe alternatives you've considered

No response

Additional context

The system ingest pipeline was introduced in #17817.

For now there is only one system ingest processor, and it is supported by the neural plugin.

Metadata

Assignees

No one assigned

Labels

Indexing, enhancement
