Description
Is your feature request related to a problem? Please describe
This is a follow-up item for the system ingest pipeline feature. The system ingest pipeline lets plugins systematically add ingest processors based on the index mapping. We cache the index -> pipeline mapping at the node level to avoid recreating the pipeline for each index request. However, the cache risks using too much memory. By default we currently don't limit how many processors can be added to an ingest pipeline, so if a plugin has a bug that adds a large number of processors to a system ingest pipeline and we simply cache it, the cache may use too much memory.
Describe the solution you'd like
We propose adding logic that checks memory usage before caching a new pipeline, to make the system more resilient. When we execute a pipeline and cannot find it in the cache, we simply create one at runtime and release it after execution. This can add some latency to indexing, but it keeps the node up instead of letting it run out of memory.
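A minimal sketch of the idea, to make the proposed behavior concrete. It is not the OpenSearch implementation: the class name `MemoryBoundedPipelineCache`, the byte budget, and the caller-supplied size estimator are all hypothetical, and how the size of a real pipeline would be estimated is left open.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;
import java.util.function.ToLongFunction;

/**
 * Hypothetical sketch of a node-level index -> pipeline cache with a memory budget.
 * P stands in for the pipeline type; none of these names come from OpenSearch.
 */
class MemoryBoundedPipelineCache<P> {
    private final Map<String, P> cache = new ConcurrentHashMap<>();
    private final AtomicLong usedBytes = new AtomicLong();
    private final long maxBytes;                   // assumed budget for the whole cache
    private final ToLongFunction<P> sizeEstimator; // assumed way to estimate a pipeline's size

    MemoryBoundedPipelineCache(long maxBytes, ToLongFunction<P> sizeEstimator) {
        this.maxBytes = maxBytes;
        this.sizeEstimator = sizeEstimator;
    }

    /**
     * Returns the cached pipeline for the index, or builds one. A newly built pipeline
     * is cached only if it fits in the remaining budget; otherwise it is returned
     * uncached, so the caller uses it once and lets it be garbage collected.
     */
    P getOrCreate(String index, Function<String, P> factory) {
        P cached = cache.get(index);
        if (cached != null) {
            return cached;
        }
        P created = factory.apply(index);
        long size = sizeEstimator.applyAsLong(created);
        // Reserve budget atomically before caching; give up if it does not fit.
        while (true) {
            long current = usedBytes.get();
            if (current + size > maxBytes) {
                return created; // over budget: run without caching
            }
            if (usedBytes.compareAndSet(current, current + size)) {
                break;
            }
        }
        P previous = cache.putIfAbsent(index, created);
        if (previous != null) {
            usedBytes.addAndGet(-size); // another thread cached first; release our reservation
            return previous;
        }
        return created;
    }

    /** Drops the cached pipeline for the index, if any, and releases its budget. */
    void invalidate(String index) {
        P removed = cache.remove(index);
        if (removed != null) {
            usedBytes.addAndGet(-sizeEstimator.applyAsLong(removed));
        }
    }

    boolean isCached(String index) {
        return cache.containsKey(index);
    }

    long usedBytes() {
        return usedBytes.get();
    }
}
```

In this sketch a cache miss over budget costs one extra pipeline construction per request, which is the indexing latency trade-off described above, while the cache itself never grows past the configured budget.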
Related component
Indexing
Describe alternatives you've considered
No response
Additional context
The system ingest pipeline was introduced in #17817.
For now there is only one system ingest processor, which is supported by the neural plugin.