Skip to content

search_after query optimization with missing value comparison#14852

Open
bugmakerrrrrr wants to merge 5 commits intoopensearch-project:mainfrom
bugmakerrrrrr:null_search_after
Open

search_after query optimization with missing value comparison#14852
bugmakerrrrrr wants to merge 5 commits intoopensearch-project:mainfrom
bugmakerrrrrr:null_search_after

Conversation

@bugmakerrrrrr
Copy link
Copy Markdown
Contributor

Description

With #8391 we skip search_after short cutting in case of missing is specified, but actually no matter user specifies missing in request or not, the default missing value is _last. This PR takes missing value into consideration of search_after query optimization, for example, if we sort result in asc order and even though min/max value is out of range, the target shard can match in case of missing value is larger than search_after value.

Related Issues

Resolves #14824 #14825

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for b8bbcc7: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

❌ Gradle check result for 47f5073: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@bugmakerrrrrr bugmakerrrrrr requested a review from jed326 as a code owner August 7, 2024 10:49
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Aug 7, 2024

❌ Gradle check result for 3a8d66e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@opensearch-trigger-bot
Copy link
Copy Markdown
Contributor

This PR is stalled because it has been open for 30 days with no activity.

@bugmakerrrrrr
Copy link
Copy Markdown
Contributor Author

@msfroh @reta would you mind taking a look at this if you get a chance? If this doesn't make sense to you, I will close this PR.

@jsvachon2
Copy link
Copy Markdown

I'm having this exact same issue.. is the fix available anywhere?

@msfroh
Copy link
Copy Markdown
Contributor

msfroh commented May 13, 2025

Thank you @jsvachon2 for commenting on this issue, which let me know that I had missed it!

@bugmakerrrrrr -- does this fix still make sense? Are you able to update it to resolve the conflicts that have accrued in the past year (almost). Conceptually, I agree with the nature of the change and am happy to review it.

@jsvachon2
Copy link
Copy Markdown

jsvachon2 commented May 13, 2025

After some experimentation on AWS Opensearch (which I use), setting track_total_hits=true seems to work for now but it's slower. I didn't realize at first that this repo was a fork from OpenSearch (unless I mis-interpret something I saw) but they do have this issue as well

@msfroh
Copy link
Copy Markdown
Contributor

msfroh commented May 13, 2025

I didn't realize at first that this repo was a fork from OpenSearch (unless I mis-interpret something I saw) but they do have this issue as well.

Amazon OpenSearch Service deploys the code that's developed in this repo. So, in general, bugs get fixed here and then those fixes work their way into AWS. In this case, we can + should incorporate @bugmakerrrrrr's fix.

Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: panguixin <panguixin@bytedance.com>
@bugmakerrrrrr bugmakerrrrrr requested review from a team and cwperks as code owners May 20, 2025 09:00
@github-actions
Copy link
Copy Markdown
Contributor

✅ Gradle check result for 5092bd8: SUCCESS

@codecov
Copy link
Copy Markdown

codecov bot commented May 20, 2025

Codecov Report

Attention: Patch coverage is 75.36232% with 17 lines in your changes missing coverage. Please review.

Project coverage is 72.51%. Comparing base (54af445) to head (9f7c449).
Report is 4 commits behind head on main.

Files with missing lines Patch % Lines
...main/java/org/opensearch/search/SearchService.java 75.86% 4 Missing and 3 partials ⚠️
...ensearch/search/internal/ContextIndexSearcher.java 0.00% 4 Missing ⚠️
...nsearch/search/searchafter/SearchAfterBuilder.java 0.00% 3 Missing ⚠️
...a/org/opensearch/search/sort/FieldSortBuilder.java 86.36% 1 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #14852      +/-   ##
============================================
- Coverage     72.59%   72.51%   -0.09%     
+ Complexity    67447    67427      -20     
============================================
  Files          5492     5493       +1     
  Lines        311122   311150      +28     
  Branches      45220    45231      +11     
============================================
- Hits         225857   225624     -233     
- Misses        66878    67113     +235     
- Partials      18387    18413      +26     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Signed-off-by: panguixin <panguixin@bytedance.com>
@github-actions
Copy link
Copy Markdown
Contributor

✅ Gradle check result for 9f7c449: SUCCESS

@bugmakerrrrrr
Copy link
Copy Markdown
Contributor Author

After some experimentation on AWS Opensearch (which I use), setting track_total_hits=true seems to work for now but it's slower. I didn't realize at first that this repo was a fork from OpenSearch (unless I mis-interpret something I saw) but they do have this issue as well

@jsvachon2 you can specify missing explicitly to work around it

@bugmakerrrrrr
Copy link
Copy Markdown
Contributor Author

@msfroh I have updated this, please take a look when you get a chance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working Search Search query, autocomplete ...etc stalled Issues that have stalled

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Incomplete results with search_after and multiple shards

3 participants