[Searchable Snapshot] Design file caching mechanism for block based files #4964

@andrross

Currently, searchable snapshots download Lucene files in chunks so that only the data needed to service a query is fetched. The feature should add a node-level LRU cache that uses up to a configurable amount of local disk space to avoid re-downloading the same parts of frequently accessed files. All shards on the node should share the same logical cache, meaning that if only one shard is queried, it can use up to the entire cache space configured for the node.
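
To make the shared-cache requirement concrete, here is a minimal sketch of a byte-bounded LRU index keyed by (shard, file, block). It is only an illustration under stated assumptions: `BlockCache`, `BlockKey`, `touch`, and `put` are hypothetical names, not existing OpenSearch APIs, and the sketch tracks block sizes in memory only, leaving out the actual on-disk reads, writes, and deletes.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: a node-level, byte-bounded LRU index of cached file blocks. */
final class BlockCache {
    /** Identifies one block of one file in one shard (illustrative, not an OpenSearch type). */
    record BlockKey(String shardId, String fileName, long blockIndex) {}

    private final long capacityBytes;
    private long usedBytes = 0;
    // accessOrder=true: iteration runs from least to most recently used entry.
    private final LinkedHashMap<BlockKey, Long> blockSizes =
        new LinkedHashMap<>(16, 0.75f, true);

    BlockCache(long capacityBytes) {
        if (capacityBytes <= 0) {
            throw new IllegalArgumentException("capacityBytes must be positive");
        }
        this.capacityBytes = capacityBytes;
    }

    /** Returns true on a cache hit and marks the block as most recently used. */
    synchronized boolean touch(BlockKey key) {
        return blockSizes.get(key) != null;
    }

    /** Records a downloaded block, evicting least recently used blocks of any shard until it fits. */
    synchronized void put(BlockKey key, long sizeBytes) {
        if (sizeBytes < 0 || sizeBytes > capacityBytes) {
            return; // assumption: blocks that can never fit are simply not cached
        }
        Long previous = blockSizes.put(key, sizeBytes);
        if (previous != null) {
            usedBytes -= previous;
        }
        usedBytes += sizeBytes;
        Iterator<Map.Entry<BlockKey, Long>> it = blockSizes.entrySet().iterator();
        while (usedBytes > capacityBytes && it.hasNext()) {
            Map.Entry<BlockKey, Long> eldest = it.next();
            usedBytes -= eldest.getValue();
            it.remove(); // a real implementation would also delete the block from disk here
        }
    }

    synchronized long usedBytes() {
        return usedBytes;
    }
}
```

Because the sketch keeps a single capacity counter and no per-shard quota, a shard that is queried exclusively can gradually occupy the whole configured space, while blocks of idle shards are the first to be evicted.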

Open questions:

  • How is the cache size configured for a node? Is there a reasonable default if no configuration is provided?
  • Where on disk should the data be cached? For example, should it live inside the same directory structure as the rest of the index data, or are there use cases where the cache should sit on a dedicated disk or mount, which would require a separate top-level directory?
  • How should a node report cache statistics and/or utilization?

Labels

  • Indexing & Search
  • discuss: Issues intended to help drive brainstorming and decision making
  • enhancement: Enhancement or improvement to existing feature or request

Status: Done
