Address ignored source read performance issue. #145077

Merged
martijnvg merged 1 commit into elastic:main from martijnvg:ignored_source_performance_bug
Mar 27, 2026

@martijnvg (Member) commented Mar 27, 2026

With the new DOC_VALUES_IGNORED_SOURCE ignored source format, the binary doc values instance (fetched via MultiValuedSortedBinaryDocValues) gets re-initialized for each docid that needs to read ignored source.

The binary doc values instance keeps the current uncompressed block around, so that subsequent docids can be read without decompressing again if they fall in the current block. Because the binary doc values instance gets re-initialized each time ignored source is read for a docid, the binary doc values implementation never gets a chance to reuse the current decompressed block. As a result, every docid read decompresses a compressed block.
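
To illustrate the effect described above, here is a minimal, self-contained sketch. It does not use the real Elasticsearch or Lucene classes: `BlockReader`, `DOCS_PER_BLOCK`, and both helper methods are hypothetical stand-ins, and the "decompression" is only a counter. It assumes docids are read in increasing order and that a fixed number of docs share one compressed block.

```java
public class BlockReuseSketch {
    // Hypothetical block size: how many docids share one compressed block.
    static final int DOCS_PER_BLOCK = 4;

    /** Hypothetical stand-in for a block-compressed binary doc values reader. */
    static final class BlockReader {
        private final int[] decompressions; // shared counter, index 0 holds the count
        private int currentBlock = -1;      // no block decompressed yet

        BlockReader(int[] decompressions) {
            this.decompressions = decompressions;
        }

        String read(int docId) {
            int block = docId / DOCS_PER_BLOCK;
            if (block != currentBlock) {
                // Simulated decompression: only happens when the cached block does not match.
                decompressions[0]++;
                currentBlock = block;
            }
            return "value-" + docId;
        }
    }

    /** Mimics the bug: a new reader per docid, so the cached block is always lost. */
    static int readWithFreshReaderPerDoc(int docCount) {
        int[] counter = new int[1];
        for (int doc = 0; doc < docCount; doc++) {
            new BlockReader(counter).read(doc);
        }
        return counter[0];
    }

    /** Mimics the fix: one reader reused across docids, so the cached block is reused. */
    static int readWithSharedReader(int docCount) {
        int[] counter = new int[1];
        BlockReader reader = new BlockReader(counter);
        for (int doc = 0; doc < docCount; doc++) {
            reader.read(doc);
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println("fresh reader per doc: " + readWithFreshReaderPerDoc(8));
        System.out.println("shared reader:        " + readWithSharedReader(8));
    }
}
```

In this toy model, reading 8 consecutive docids costs 8 simulated decompressions with a fresh reader per docid, versus 2 with a shared reader (one per block). The real savings depend on the actual block size and access pattern.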

Marking as non-issue, since this bug hasn't been released in a stateful release yet.

See attached flamegraph for how this issue manifests:
rally.html

@martijnvg added the >non-issue and :StorageEngine/Mapping labels Mar 27, 2026
@martijnvg martijnvg marked this pull request as ready for review March 27, 2026 11:18
@elasticsearchmachine (Collaborator)

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@martijnvg martijnvg enabled auto-merge (squash) March 27, 2026 12:12
@jordan-powers (Contributor) left a comment

Good catch, LGTM!

@martijnvg martijnvg merged commit 7857dc2 into elastic:main Mar 27, 2026
36 checks passed
mamazzol pushed a commit to mamazzol/elasticsearch that referenced this pull request Mar 30, 2026