Implementation for enabling intra-segment search #19704
prudhvigodithi merged 63 commits into opensearch-project:main
Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com>
❌ Gradle check result for 73fba5f: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
big5 data distribution
This is on the big5 workload when the configured/default slice count is less than the partitions requested (logs 1).
Creation of slices with partitions, distributed to meet Lucene constraints (no fallback, as we have enough slices to take all the partitions) (logs 2).
Parallel execution of slices with segments and partitions (logs 3).
IMO this should be a good start for testing the queries and benchmarks. In parallel, I will update the PR once I have #18851 sorted. Either way, as posted in the description, even though we have auto partition and slice logic with #18851, we still honor user-configured settings.
I ended up using the noaa dataset to run the term query. Weirdly enough, I did not notice much of an improvement.
Without intra-segment concurrent search:
With intra-segment concurrent search:
❌ Gradle check result for 5ffd097: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❌ Gradle check result for 019470b: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
{"run-benchmark-test": "id_3"}
The Jenkins job URL is https://build.ci.opensearch.org/job/benchmark-pull-request/4847/. Final results will be published once the job is completed.
{"run-benchmark-test": "id_3"}
The Jenkins job URL is https://build.ci.opensearch.org/job/benchmark-pull-request/4848/. Final results will be published once the job is completed.
❌ Gradle check result for 49fd872: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Benchmark Results
Benchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-pull-request/4848/
Benchmark Baseline Comparison Results
Benchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-compare/171/
{"run-benchmark-test": "id_6"}
The Jenkins job URL is https://build.ci.opensearch.org/job/benchmark-pull-request/4859/. Final results will be published once the job is completed.
Benchmark Results
Benchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-pull-request/5875/
Benchmark Baseline Comparison Results
Benchmark Results for Job: https://build.ci.opensearch.org/job/benchmark-compare/238/
…19704) * Initial commit intra segment Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * intra segment Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * intra segment, update default values Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * intra segment, update default values Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * comment info logger Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Fix SubSearchContext Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Default enable test Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Test auto partition logic Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Update LPT logic Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Intra segment Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * disable logs Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Intra segment, code cleanup Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Update cluster settings Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Update cluster settings Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Update cluster settings Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Update cluster settings Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Update cluster settings Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * code cleanup Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Onboard intra segment decider and queries Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Onboard intra segment decider and queries Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Add java docs Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Fix tests Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Fix tests Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Default Enable Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Default Enable Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Support global agg Signed-off-by: Prudhvi Godithi 
<pgodithi@amazon.com> * disable for start tree Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Spotless fix Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Add partition strategy Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Upstream fetch Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * code cleanup Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * code cleanup Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Fix tests Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Initial code for enabling intra Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Initial code for enabling intra Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Initial code for enabling intra Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * code refactor Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * code remove Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * code remove Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * code remove Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Add tests Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Update changelog Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Upstream fetch Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * use balanced as default Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * address comments Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Upstream fetch Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Change from none to segment Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Address PR comments Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Address PR comments Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Fix conflict Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Fix conflict Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> * Fix tests Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com> --------- Signed-off-by: Prudhvi Godithi <pgodithi@amazon.com>
Description
Initial WIP implementation
Creates segment partitions and distributes them across slices. Ensures the Lucene constraint is satisfied: each partition from a segment goes to a different slice, using round-robin assignment.
Partitions are based on the min_segment_size and partitions_per_segment settings. I'm working closely on implementing a better automatic partition-to-slice mechanism in [Intra-SegmentConcurrentSearch] Slicing mechanism #18851. Even with this auto-partition logic, the idea is to still honor the user-passed min_segment_size and partitions_per_segment settings.
Falls back to the default behavior (search by full segment) when there are insufficient slices for partitions_per_segment. This ensures that partitions of the same segment go to different slices. In this case, the idea is to increase max_slice_count and not rely on the computeDefaultSliceCount() method during cluster start-up.
More details on the default Lucene partition-to-slice mechanism are in [Intra-SegmentConcurrentSearch] Slicing mechanism #18851 (comment).
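The partitioning, round-robin assignment, and fallback described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the class PartitionSlicerSketch, its Partition type, the even doc-ID range split, and the reading of min_segment_size as a document-count threshold are all hypothetical. Only the setting names and the rule that partitions of one segment must land in different slices come from the description.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only: hypothetical names, not the classes added in this PR. */
final class PartitionSlicerSketch {

    /** A doc-ID range [minDoc, maxDoc) inside one segment. */
    static final class Partition {
        final int segment;
        final int minDoc;
        final int maxDoc;

        Partition(int segment, int minDoc, int maxDoc) {
            this.segment = segment;
            this.minDoc = minDoc;
            this.maxDoc = maxDoc;
        }
    }

    /**
     * Splits segments into partitions and assigns them to slices round robin.
     * segmentDocCounts[i] is the doc count of segment i. minSegmentSize is
     * assumed here to be a doc-count threshold below which a segment stays whole.
     */
    static List<List<Partition>> assign(int[] segmentDocCounts, int minSegmentSize,
                                        int partitionsPerSegment, int sliceCount) {
        if (sliceCount < 1 || partitionsPerSegment < 1) {
            throw new IllegalArgumentException("sliceCount and partitionsPerSegment must be >= 1");
        }
        List<List<Partition>> slices = new ArrayList<>();
        for (int i = 0; i < sliceCount; i++) {
            slices.add(new ArrayList<>());
        }
        // Fallback: with fewer slices than partitions per segment, two partitions of
        // the same segment would have to share a slice, so search whole segments instead.
        boolean canPartition = sliceCount >= partitionsPerSegment;
        int next = 0; // round-robin cursor over the slices
        for (int seg = 0; seg < segmentDocCounts.length; seg++) {
            int docs = segmentDocCounts[seg];
            int parts = (canPartition && docs >= minSegmentSize) ? partitionsPerSegment : 1;
            for (int p = 0; p < parts; p++) {
                int minDoc = (int) ((long) docs * p / parts);
                int maxDoc = (int) ((long) docs * (p + 1) / parts);
                // Consecutive cursor positions are distinct slices because parts <= sliceCount.
                slices.get(next).add(new Partition(seg, minDoc, maxDoc));
                next = (next + 1) % sliceCount;
            }
        }
        return slices;
    }
}
```

Because the cursor advances by one slice per partition and a segment never produces more partitions than there are slices, no slice receives two partitions of the same segment in this sketch. Raising the slice count is what lets a larger partitions_per_segment take effect instead of triggering the whole-segment fallback.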
To test the benchmarks, we can disable the logger.info lines, which only print the partition and assigned-slice distribution. See Implementation for enabling intra-segment search #19704 (comment).
Dec 18 2025 (Latest Updated changes)
I have updated the PR with the following changes. This PR introduces configuration settings and a decision framework for intra-segment search.
Updated settings:
Prerequisites:
supportsIntraSegmentSearch to return true.
If either is disabled → intra-segment disabled, skip further evaluation.
Query Evaluation: Walks query tree, checks support for intra-segment and concurrent segment search:
Aggregation Evaluation: Checks AggregatorFactory for intra segment search support:
Final Decision:
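The evaluation order outlined above (prerequisites first, then the query tree, then aggregations) can be pictured as a short-circuiting check. The sketch below is hypothetical throughout: the class IntraSegmentDecisionSketch, the QueryNode stand-in, the otherPrerequisite flag (standing in for the prerequisite not spelled out here), and especially the final rule that every query node and every aggregation must report support are assumptions for illustration, not the decider classes or the final-decision logic of this PR.

```java
import java.util.List;

/** Hypothetical sketch of a short-circuiting decision flow; not the PR's decider. */
final class IntraSegmentDecisionSketch {

    /** Minimal stand-in for one node of a query tree (hypothetical). */
    static final class QueryNode {
        final boolean supportsIntraSegment;
        final List<QueryNode> children;

        QueryNode(boolean supportsIntraSegment, List<QueryNode> children) {
            this.supportsIntraSegment = supportsIntraSegment;
            this.children = children;
        }
    }

    /** Walks the tree depth first and stops at the first unsupported node. */
    static boolean queryTreeSupported(QueryNode node) {
        if (!node.supportsIntraSegment) {
            return false;
        }
        for (QueryNode child : node.children) {
            if (!queryTreeSupported(child)) {
                return false;
            }
        }
        return true;
    }

    /**
     * aggregationSupport holds one flag per aggregation, as a stand-in for asking
     * each aggregator factory whether it supports intra-segment search.
     */
    static boolean decide(boolean otherPrerequisite, boolean supportsIntraSegmentSearch,
                          QueryNode root, List<Boolean> aggregationSupport) {
        // Prerequisites: if either is disabled, skip all further evaluation.
        if (!otherPrerequisite || !supportsIntraSegmentSearch) {
            return false;
        }
        if (!queryTreeSupported(root)) {
            return false;
        }
        for (boolean supported : aggregationSupport) {
            if (!supported) {
                return false;
            }
        }
        // Assumed final rule: enable only when nothing objected.
        return true;
    }
}
```

Checking the cheap prerequisites before walking the query tree keeps the common disabled case inexpensive, which matches the "skip further evaluation" note in the description.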
Related Issues
Related to #19694 and part of #18851.
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.