search_after query optimization with missing value comparison#14852
search_after query optimization with missing value comparison#14852bugmakerrrrrr wants to merge 5 commits intoopensearch-project:mainfrom
Conversation
|
❌ Gradle check result for b8bbcc7: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for 47f5073: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
❌ Gradle check result for 3a8d66e: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
|
This PR is stalled because it has been open for 30 days with no activity. |
|
I'm having this exact same issue.. is the fix available anywhere? |
|
Thank you @jsvachon2 for commenting on this issue, which let me know that I had missed it! @bugmakerrrrrr -- does this fix still make sense? Are you able to update it to resolve the conflicts that have accrued in the past year (almost). Conceptually, I agree with the nature of the change and am happy to review it. |
|
After some experimentation on AWS Opensearch (which I use), setting track_total_hits=true seems to work for now but it's slower. I didn't realize at first that this repo was a fork from OpenSearch (unless I mis-interpret something I saw) but they do have this issue as well |
Amazon OpenSearch Service deploys the code that's developed in this repo. So, in general, bugs get fixed here and then those fixes work their way into AWS. In this case, we can + should incorporate @bugmakerrrrrr's fix. |
Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: panguixin <panguixin@bytedance.com>
Signed-off-by: panguixin <panguixin@bytedance.com>
3a8d66e to
5092bd8
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #14852 +/- ##
============================================
- Coverage 72.59% 72.51% -0.09%
+ Complexity 67447 67427 -20
============================================
Files 5492 5493 +1
Lines 311122 311150 +28
Branches 45220 45231 +11
============================================
- Hits 225857 225624 -233
- Misses 66878 67113 +235
- Partials 18387 18413 +26 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Signed-off-by: panguixin <panguixin@bytedance.com>
3e6253f to
9f7c449
Compare
@jsvachon2 you can specify |
|
@msfroh I have updated this, please take a look when you get a chance. |
Description
With #8391 we skip search_after short cutting in case of missing is specified, but actually no matter user specifies missing in request or not, the default missing value is
_last. This PR takes missing value into consideration of search_after query optimization, for example, if we sort result in asc order and even though min/max value is out of range, the target shard can match in case of missing value is larger than search_after value.Related Issues
Resolves #14824 #14825
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.