search_after query optimization with shard/segment level short cutting#7453
search_after query optimization with shard/segment level short cutting#7453dblock merged 22 commits intoopensearch-project:mainfrom
Conversation
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
server/src/main/java/org/opensearch/search/internal/ContextIndexSearcher.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/search/internal/ContextIndexSearcher.java
Outdated
Show resolved
Hide resolved
1a93d43 to
20b8dcc
Compare
Gradle Check (Jenkins) Run Completed with:
|
Codecov Report
@@ Coverage Diff @@
## main #7453 +/- ##
============================================
- Coverage 70.68% 70.62% -0.06%
- Complexity 56095 59846 +3751
============================================
Files 4680 4897 +217
Lines 266076 286967 +20891
Branches 39065 41366 +2301
============================================
+ Hits 188068 202668 +14600
- Misses 62101 67561 +5460
- Partials 15907 16738 +831
|
20b8dcc to
f510edc
Compare
Gradle Check (Jenkins) Run Completed with:
|
f510edc to
1e2c79a
Compare
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
|
Opened an issue to fix test |
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
|
The backport to To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-7453-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 c659d04956e300740a7626f952bb7958492ed843
# Push it to GitHub
git push --set-upstream origin backport/backport-7453-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.xThen, create a pull request where the |
opensearch-project#7453) * search_after query optimization with shard/segment level short cutting Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Correcting logical && operator Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Addressig review comments Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Correcting NPE Signed-off-by: gashutos <gashutos@amazon.com> * Fixing failed integ Signed-off-by: gashutos <gashutos@amazon.com> * Addressing review comments and fixing some more integ Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Addressing review comments and fixing some more integ Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Refactoring a bit to aggressive null checks Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Adding unit tests for MinAndMax methods Signed-off-by: gashutos <gashutos@amazon.com> --------- Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> Signed-off-by: gashutos <gashutos@amazon.com> Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com>
#7453) (#7727) * search_after query optimization with shard/segment level short cutting * Correcting logical && operator * Addressig review comments * Correcting NPE * Fixing failed integ * Addressing review comments and fixing some more integ * Addressing review comments and fixing some more integ * Refactoring a bit to aggressive null checks * Adding unit tests for MinAndMax methods --------- Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> Signed-off-by: gashutos <gashutos@amazon.com> Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com>
I'm just getting around to this... @gashutos can you give an example query from the http_logs benchmark above? Did you |
Hey @nknize , Below is the exact http_logs query for which data is present only in few indices & segmetns. For profiling purpose, I am querying only on one shard as mentioned below. (No data after date 1999-06-10 for asc order) below is the output, The above query takes |
|
if I strictly speak about vanilla http_logs improvements, This is the query, same is true for desc order too. |
opensearch-project#7453) * search_after query optimization with shard/segment level short cutting Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Correcting logical && operator Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Addressig review comments Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Correcting NPE Signed-off-by: gashutos <gashutos@amazon.com> * Fixing failed integ Signed-off-by: gashutos <gashutos@amazon.com> * Addressing review comments and fixing some more integ Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Addressing review comments and fixing some more integ Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Refactoring a bit to aggressive null checks Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Adding unit tests for MinAndMax methods Signed-off-by: gashutos <gashutos@amazon.com> --------- Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> Signed-off-by: gashutos <gashutos@amazon.com> Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com>
opensearch-project#7453) * search_after query optimization with shard/segment level short cutting Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Correcting logical && operator Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Addressig review comments Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Correcting NPE Signed-off-by: gashutos <gashutos@amazon.com> * Fixing failed integ Signed-off-by: gashutos <gashutos@amazon.com> * Addressing review comments and fixing some more integ Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Addressing review comments and fixing some more integ Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Refactoring a bit to aggressive null checks Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Adding unit tests for MinAndMax methods Signed-off-by: gashutos <gashutos@amazon.com> --------- Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> Signed-off-by: gashutos <gashutos@amazon.com> Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com>
opensearch-project#7453) * search_after query optimization with shard/segment level short cutting Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Correcting logical && operator Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Addressig review comments Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Correcting NPE Signed-off-by: gashutos <gashutos@amazon.com> * Fixing failed integ Signed-off-by: gashutos <gashutos@amazon.com> * Addressing review comments and fixing some more integ Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Addressing review comments and fixing some more integ Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Refactoring a bit to aggressive null checks Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Adding unit tests for MinAndMax methods Signed-off-by: gashutos <gashutos@amazon.com> --------- Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> Signed-off-by: gashutos <gashutos@amazon.com> Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com>
|
Looks like this may have introduced a regression between 2.7 and 2.8, #8212. |
opensearch-project#7453) * search_after query optimization with shard/segment level short cutting Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Correcting logical && operator Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Addressig review comments Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Correcting NPE Signed-off-by: gashutos <gashutos@amazon.com> * Fixing failed integ Signed-off-by: gashutos <gashutos@amazon.com> * Addressing review comments and fixing some more integ Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Addressing review comments and fixing some more integ Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Refactoring a bit to aggressive null checks Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> * Adding unit tests for MinAndMax methods Signed-off-by: gashutos <gashutos@amazon.com> --------- Signed-off-by: Chaitanya Gohel <gashutos@amazon.com> Signed-off-by: gashutos <gashutos@amazon.com> Signed-off-by: Chaitanya Gohel <104654647+gashutos@users.noreply.github.com> Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Description
search_afterqueries are significantly slower on asc/desc order sort. Like for http_logs OSB workload, vanillaASCorder sort queries takes like11 mswhile same query just withsearch_afterparameter takes1703 ms.eg. below query takes 1703 ms, but once we remove
search_after, it just takes 11 ms.Reason
The reason is, in time series kind of workloads, there will be segments or shards, which doesnt contain any result hits comparing to given search_after time parameter. There is a bug in Lucene, where if entire segment doesnt contain any competitive hit given search_after parameter, it is not able to skip any hit, since it is not able to update its competitive score.
Because TopFieldCollector.java#250, the
queuefullvariable will never able to set true because we didnt collect any competitive hit. I will file seperate issue on lucene and planning to fix it on lucene side as well.But from OpenSearch perspective, it makes sense to not to issue any call to Segment read to Lucene in cases we determine if its min/max value for primary sort field is not within range of given
search_afterparamter. Similar logic stands true at shard level as well.Changes in this CR
canMatchshort cutting if MinMax for primary sort field is out of range forsearch_aftercanMatchshort cutting if MinMax for primary sort field is out of range forsearch_aftershouldReverseSegmentReadOrderfunction toContextIndexSearchersince now we are passingDefaultSearchContextobject toContextIndexSearcher.Performance gains
I am observing, we are gaining 10x to 60x depending on dataset with this change for
ASCorder sorting and around 3x forDESCorder sorting. We will end up saving lot of unnecessary computation with short cutting shard/segments in canMatch phase itself. Again, the optimization multiplier will depend on data set , but this change is for sure optimize something if not equal performance, but definitely wont regress. I verified on nyc_taxy and other type of workloads.This work load is used (http_logs)
The operation was performed on single index size of 12.9 G with 4 shards with 136 segments on it.
Testing
canMatchSearchAfter.Related Issues
6596
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.