Improve performance for approximated match_all sort queries #18206

@prudhvigodithi

Description

Describe the bug

  • Update the shortcutTotalHitCount logic so that it identifies the query as a MatchAllDocsQuery (MatchAllDocsQuery.class).

  • Today, with approximation, a match_all query is converted to a range query. As a result, the totalHitsThreshold used by the Lucene TopFieldCollector changes to 10k.

  • For match_all, the threshold should be 10 (the numHits value), which comes from TopDocsCollectorContext in OpenSearch.

  • With totalHitsThreshold at 10k, the large threshold delays the updateCompetitiveIterator step of the Lucene NumericComparator and forces the collector to compare all 10k docs.

  • With the default of 10, the competitive iterator would be updated early and could eliminate some of those 10k docs.

  • This also fixes the total hit count inconsistency: the count now correctly includes all documents that a true match_all query would match, even when the query has been optimized into a range query on the sort field.

  • This should fix the [AUTOCUT] Gradle Check Flaky Test Report for SimpleSearchIT #16851.
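
The threshold effect described in the bullets above can be illustrated with a toy model. This is a sketch in plain Java, not Lucene or OpenSearch code: the class ThresholdSketch and the method docsVisited are hypothetical names, and the skipping rule is a simplified stand-in for what updateCompetitiveIterator enables. It only assumes that skipping of non-competitive docs cannot begin until the number of counted hits reaches totalHitsThreshold.

```java
import java.util.PriorityQueue;
import java.util.Random;

// Toy model only: hypothetical names, no Lucene or OpenSearch classes involved.
class ThresholdSketch {

    /** Returns how many docs the toy collector had to visit one by one (ascending sort). */
    static int docsVisited(long[] values, int numHits, int totalHitsThreshold) {
        // Max-heap holding the numHits smallest values seen so far.
        PriorityQueue<Long> top = new PriorityQueue<>((a, b) -> Long.compare(b, a));
        int visited = 0;
        for (long v : values) {
            // Skipping is allowed only once the hit count has reached the threshold.
            boolean canSkip = visited >= totalHitsThreshold && top.size() == numHits;
            if (canSkip && v >= top.peek()) {
                continue; // stand-in for the competitive iterator skipping this doc
            }
            visited++;
            if (top.size() < numHits) {
                top.add(v);
            } else if (v < top.peek()) {
                top.poll();
                top.add(v);
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        long[] values = new Random(42).longs(100_000).toArray();
        System.out.println("threshold=10:    " + docsVisited(values, 10, 10));
        System.out.println("threshold=10000: " + docsVisited(values, 10, 10_000));
    }
}
```

In this model the run with a threshold of 10,000 must visit at least the first 10,000 docs before any skipping starts, while the run with a threshold of 10 can start skipping as soon as the top 10 queue is full.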

Related component

Search:Performance

To Reproduce

N/A

Expected behavior

This should improve performance for match_all queries that use approximation, because the Lucene competitive iterator would trigger early. Benchmark results: #18189 (comment).

This change should also bring the behavior in line with what users expect when running a match_all query with sorting: the total hit count includes the documents that are missing the sort field.
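
The hit count difference can be shown with a small plain-Java sketch. It is an illustration under stated assumptions, not the OpenSearch implementation: HitCountSketch, rangeQueryHits, and matchAllHits are hypothetical names, and each document is modeled only by an optional value for the sort field.

```java
import java.util.List;
import java.util.OptionalLong;

// Toy model only: hypothetical names, each doc is just its optional sort-field value.
class HitCountSketch {

    /** Hits for a range query on the sort field: docs without the field never match. */
    static long rangeQueryHits(List<OptionalLong> sortValues, long min, long max) {
        return sortValues.stream()
            .filter(OptionalLong::isPresent)
            .mapToLong(OptionalLong::getAsLong)
            .filter(v -> v >= min && v <= max)
            .count();
    }

    /** Hits for a true match_all: every doc counts, with or without the sort field. */
    static long matchAllHits(List<OptionalLong> sortValues) {
        return sortValues.size();
    }
}
```

With five docs of which two lack the sort field, rangeQueryHits over the full long range counts three while matchAllHits counts five, which is the gap the shortcut for match_all is meant to close.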


Status

✅ Done
