Description
Is your feature request related to a problem? Please describe
FieldDataCache is a node-level cache that uses a composite key (IndexFieldCache + shardId + IndexReader.CacheKey) to uniquely identify an item. IndexFieldCache in turn holds the fieldName and index-level details.
As of today, every item in FieldDataCache is removed in a blocking (synchronous) manner. This happens in the following scenarios:
- Invalidation: During a refresh of any IndexShard, the affected key is invalidated immediately and synchronously.
- Index deletion: When an index is removed, all field data entries belonging to its shards are removed, also synchronously. To do this we iterate over ALL cache keys on the node, check whether each key belongs to this index, and delete it if it does. This is highly inefficient.
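To make the cost difference concrete, here is a minimal sketch of the two ways to clear one index from a node-level cache. `FieldDataCacheSketch`, its `Key` class, and the `keysByIndex` secondary map are hypothetical stand-ins for illustration, not the actual OpenSearch classes, and the sketch ignores races between concurrent `put` and clear calls.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for a node-level field data cache.
final class FieldDataCacheSketch {
    // Simplified composite key: index + field + shard id + reader (segment) key.
    static final class Key {
        final String index; final String field; final int shardId; final Object readerKey;
        Key(String index, String field, int shardId, Object readerKey) {
            this.index = index; this.field = field; this.shardId = shardId; this.readerKey = readerKey;
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return shardId == k.shardId && index.equals(k.index)
                && field.equals(k.field) && readerKey.equals(k.readerKey);
        }
        @Override public int hashCode() { return Objects.hash(index, field, shardId, readerKey); }
    }

    private final Map<Key, long[]> entries = new ConcurrentHashMap<>();
    // Assumed secondary map: lets us find the keys of one index without a full scan.
    private final Map<String, Set<Key>> keysByIndex = new ConcurrentHashMap<>();

    void put(Key key, long[] value) {
        entries.put(key, value);
        keysByIndex.computeIfAbsent(key.index, k -> ConcurrentHashMap.newKeySet()).add(key);
    }

    // Full-scan approach: visits every key on the node. Returns the number of keys visited.
    int clearIndexByFullScan(String index) {
        int visited = 0;
        for (Key key : entries.keySet()) {
            visited++;
            if (key.index.equals(index)) {
                entries.remove(key);
            }
        }
        keysByIndex.remove(index);
        return visited;
    }

    // Targeted approach: visits only the keys recorded for this index.
    int clearIndexByLookup(String index) {
        Set<Key> keys = keysByIndex.remove(index);
        if (keys == null) {
            return 0;
        }
        int visited = 0;
        for (Key key : keys) {
            visited++;
            entries.remove(key);
        }
        return visited;
    }

    int size() { return entries.size(); }
}
```

With 100 entries for one index and 1 entry for another, clearing the small index by full scan visits all 101 keys, while the lookup variant visits 1. The trade-off is the extra memory and bookkeeping for the secondary map.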
Problem
We have already had issues where, during index removal, a data node dropped out of the cluster because too much time and CPU was spent clearing the FieldDataCache.
Scenario (observed in production): The cluster manager node sends a cluster state update task to the data node on index removal. The data node starts clearing the FieldDataCache on the same cluster state applier thread (clusterApplierService#updateTask). Because of the large cache size and the inefficient traversal of all keys, this takes a long time, and the data node cannot acknowledge back to the cluster manager node. This eventually resulted in the data node being removed from the cluster.
Sample hot thread dump observed
100.4% (501.8ms out of 500ms) cpu usage by thread 'opensearch[46a4b13b820a8bcf60ac8f1de15cee14][clusterApplierService#updateTask][T#1]'
10/10 snapshots sharing following 22 elements
app//org.opensearch.indices.fielddata.cache.IndicesFieldDataCache$IndexFieldCache.clear(IndicesFieldDataCache.java:219)
app//org.opensearch.index.fielddata.IndexFieldDataService.clear(IndexFieldDataService.java:107)
app//org.opensearch.index.fielddata.IndexFieldDataService.close(IndexFieldDataService.java:179)
app//org.opensearch.core.internal.io.IOUtils.close(IOUtils.java:87)
app//org.opensearch.core.internal.io.IOUtils.close(IOUtils.java:129)
app//org.opensearch.core.internal.io.IOUtils.close(IOUtils.java:79)
app//org.opensearch.index.IndexService.close(IndexService.java:362)
app//org.opensearch.indices.IndicesService.removeIndex(IndicesService.java:887)
app//org.opensearch.indices.cluster.IndicesClusterStateService.removeIndices(IndicesClusterStateService.java:410)
app//org.opensearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:255)
app//org.opensearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:591)
app//org.opensearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:578)
app//org.opensearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:546)
app//org.opensearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:469)
app//org.opensearch.cluster.service.ClusterApplierService.access$000(ClusterApplierService.java:81)
app//org.opensearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:180)
Describe the solution you'd like
As a solution, I suggest the following:
- Cache clear/invalidation logic should run asynchronously, as we already do in RequestCache. On index removal, we would keep the stale indices in a list and clean them up later on a separate thread at some interval X.
- Whenever an item is removed from the cache, multiple removal listeners are invoked synchronously to update stats. We should also try to do this asynchronously on a different thread.
- We should avoid going through all the entries in the cache even though we need to delete entries for only one or a few indices. This can be done either by deleting the specific entries for an index, calling invalidateAll for a set of keys, or by delaying the cleanup until a later point when we want to delete entries for multiple indices at once.
- Consider integrating FieldDataCache with TieredCache (heap + disk)/DiskCache, given that the current default onHeapCache size is unlimited and is only controlled by the field data circuit breakers. This would free up a lot of heap.
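The first and third points above could look roughly like the sketch below: index removal only records the index as stale, and a background task later sweeps the cache once for all stale indices. `DeferredFieldDataCleaner` and all of its methods are hypothetical names for illustration; this is not the RequestCache implementation, and a real version would presumably key on the index UUID rather than the name so that a re-created index is not affected.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of deferred cleanup; not an OpenSearch class.
final class DeferredFieldDataCleaner {
    // Cache key -> owning index. Cached values are omitted to keep the sketch short.
    private final Map<String, String> entries = new ConcurrentHashMap<>();
    // Indices whose entries are waiting to be removed.
    private final Set<String> staleIndices = ConcurrentHashMap.newKeySet();

    void put(String cacheKey, String index) {
        entries.put(cacheKey, index);
    }

    // Cheap enough to call from the cluster state applier thread: no cache traversal.
    void markIndexStale(String index) {
        staleIndices.add(index);
    }

    // One pass over the cache removes the entries of all stale indices seen so far.
    // Returns the number of entries removed.
    int cleanStaleEntries() {
        Set<String> snapshot = new HashSet<>(staleIndices);
        if (snapshot.isEmpty()) {
            return 0;
        }
        int removed = 0;
        Iterator<Map.Entry<String, String>> it = entries.entrySet().iterator();
        while (it.hasNext()) {
            if (snapshot.contains(it.next().getValue())) {
                it.remove();
                removed++;
            }
        }
        // Only forget the indices handled in this pass; later additions stay queued.
        staleIndices.removeAll(snapshot);
        return removed;
    }

    // Runs the sweep on the given scheduler thread every intervalMillis milliseconds.
    void start(ScheduledExecutorService scheduler, long intervalMillis) {
        scheduler.scheduleWithFixedDelay(this::cleanStaleEntries, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
    }

    int size() { return entries.size(); }
}
```

The sweep still traverses the whole cache, but it does so off the applier thread and at most once per interval, no matter how many indices were removed in between. Until the sweep runs, stale entries keep occupying heap, so the interval trades memory for applier-thread latency.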
Related component
Search:Performance
Describe alternatives you've considered
No response
Additional context
No response