BooleanQuery rewrite for must_not RangeQuery clauses #17655
Merged
msfroh merged 14 commits into opensearch-project:main on Jun 4, 2025
added 2 commits
March 20, 2025 09:55
Signed-off-by: Peter Alfonsi <petealft@amazon.com>
added 2 commits
March 21, 2025 13:51
Contributor
❌ Gradle check result for d9eee10: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Signed-off-by: Peter Alfonsi <petealft@amazon.com>
Signed-off-by: Peter Alfonsi <petealft@amazon.com>
Author
Hey @msfroh, just bumping on this.
msfroh approved these changes on Jun 4, 2025
Gagan6164
pushed a commit
to Gagan6164/OpenSearch
that referenced
this pull request
Jun 8, 2025
…ect#17655) --------- Signed-off-by: Peter Alfonsi <petealft@amazon.com> Signed-off-by: Peter Alfonsi <peter.alfonsi@gmail.com> Co-authored-by: Peter Alfonsi <petealft@amazon.com>
atris
added a commit
to atris/OpenSearch
that referenced
this pull request
Aug 18, 2025
…rewriting infrastructure
This commit migrates two existing query optimizations from BoolQueryBuilder to the new query rewriting infrastructure:
1. **MustToFilterRewriter**: Moves non scoring queries (range, geo, numeric term/terms/match) from must to filter clauses to avoid unnecessary scoring calculations (from PR opensearch-project#18541)
2. **MustNotToShouldRewriter**: Transforms negative queries into positive complements for better performance on single valued numeric fields (from PRs opensearch-project#17655 and opensearch-project#18498)

Changes:
- Add MustToFilterRewriter with priority 150 (runs after boolean flattening)
- Add MustNotToShouldRewriter with priority 175 (runs after must to filter)
- Register both rewriters in QueryRewriterRegistry
- Add comprehensive test suites (15 tests for must to filter, 14 for must not to should)
- Disable legacy implementations in BoolQueryBuilder
- Comment out BoolQueryBuilder tests that relied on the old implementations

The new rewriters maintain full backward compatibility while providing:
- Better separation of concerns
- Recursive rewriting for nested boolean queries
- Proper error handling and logging
- Consistent priority based execution order
Signed-off-by: Atri Sharma <atri.jiit@gmail.com>
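The priority ordering described in this commit message can be pictured with a small sketch. This is only an illustration under stated assumptions: the `Rewriter` and `Registry` classes below are hypothetical stand-ins, not the actual `QueryRewriterRegistry` API, and the priority of 100 for boolean flattening is an assumed value (the message only says it runs before priority 150). The priorities 150 and 175 come from the message itself, and the sketch assumes lower numbers run first, as the "runs after" wording suggests.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; these classes are NOT the OpenSearch QueryRewriterRegistry API.
final class PriorityRewriterSketch {

    /** A named rewriting step with a numeric priority (assumed: lower runs first). */
    static final class Rewriter {
        final String name;
        final int priority;

        Rewriter(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }
    }

    /** Keeps registered rewriters sorted by ascending priority. */
    static final class Registry {
        private final List<Rewriter> rewriters = new ArrayList<>();

        void register(Rewriter rewriter) {
            rewriters.add(rewriter);
            rewriters.sort(Comparator.comparingInt(x -> x.priority));
        }

        /** Names of the rewriters in the order they would be applied. */
        List<String> executionOrder() {
            List<String> names = new ArrayList<>();
            for (Rewriter rewriter : rewriters) {
                names.add(rewriter.name);
            }
            return names;
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        // Registration order is deliberately scrambled; priority decides execution order.
        registry.register(new Rewriter("must_not_to_should", 175)); // from the commit message
        registry.register(new Rewriter("boolean_flattening", 100)); // assumed priority
        registry.register(new Rewriter("must_to_filter", 150));     // from the commit message
        System.out.println(registry.executionOrder());
    }
}
```

Sorting on registration keeps the execution order independent of the order in which rewriters are registered, which is one plausible reading of "consistent priority based execution order".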
rishabhmaurya
pushed a commit
that referenced
this pull request
Aug 27, 2025
* Add query rewriting infrastructure to reduce query complexity
Implements three query optimizations that work together:
- Boolean flattening: removes unnecessary nested boolean queries
- Terms merging: combines multiple term queries on same field in filter/should contexts
- Match-all removal: eliminates redundant match_all queries
Key features:
- 60-70% reduction in query nodes for typical filtered queries
- Feature flag: search.query_rewriting.enabled (default: true)
- Preserves exact query semantics and results
Signed-off-by: Atri Sharma <atri.jiit@gmail.com>
* Fix forbidden api issues
Signed-off-by: Atri Sharma <atri.jiit@gmail.com>
* Update writers and get tests to pass
Signed-off-by: Atri Sharma <atri.jiit@gmail.com>
* Update per CI
Signed-off-by: Atri Sharma <atri.jiit@gmail.com>
* Fix term merging threshold and update comments
Signed-off-by: Atri Sharma <atri.jiit@gmail.com>
* Expose setting and update per comments
Signed-off-by: Atri Sharma <atri.jiit@gmail.com>
* Update CHANGELOG
Signed-off-by: Atri Sharma <atri.jiit@gmail.com>
* Fix tests and ensure scoring MATCH ALL query is preserved
Signed-off-by: Atri Sharma <atri.jiit@gmail.com>
* Migrate must to filter and must not to should optimizations to query rewriting infrastructure
Signed-off-by: Atri Sharma <atri.jiit@gmail.com>
* Handle fields with missing fields
Signed-off-by: Atri Sharma <atri.jiit@gmail.com>
---------
Signed-off-by: Atri Sharma <atri.jiit@gmail.com>
RS146BIJAY reviewed on Jan 5, 2026

    createIndex(
        "test",
        Settings.EMPTY,
        "{\"properties\":{\"int_field\":{\"type\": \"integer\"},\"term_field_1\":{\"type\": \"keyword\"},\"term_field_2\":{\"type\": \"keyword\"}}}}"

The JSON mapping passed here seems incorrect: the JSON string has an extra closing brace.
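The observation can be checked mechanically by counting braces outside string literals. The sketch below is a hypothetical helper, not part of the PR or of OpenSearch; it only counts `{` and `}` and is not a full JSON validator. The mapping string is copied from the diff excerpt above.

```java
// Hypothetical helper for illustration; not part of the PR.
final class BraceBalanceSketch {

    /** Mapping string exactly as it appears in the reviewed diff. */
    static final String MAPPING_FROM_DIFF =
        "{\"properties\":{\"int_field\":{\"type\": \"integer\"},"
            + "\"term_field_1\":{\"type\": \"keyword\"},"
            + "\"term_field_2\":{\"type\": \"keyword\"}}}}";

    /**
     * Returns (number of '{') minus (number of '}'), ignoring braces that
     * appear inside JSON string literals. Zero means the braces balance;
     * a negative value means there are surplus closing braces.
     */
    static int braceBalance(String json) {
        int balance = 0;
        boolean inString = false;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{') {
                balance++;
            } else if (c == '}') {
                balance--;
            }
        }
        return balance;
    }

    public static void main(String[] args) {
        // Five opening braces and six closing braces, so this prints -1.
        System.out.println(braceBalance(MAPPING_FROM_DIFF));
        // Dropping the final brace balances the string, so this prints 0.
        String trimmed = MAPPING_FROM_DIFF.substring(0, MAPPING_FROM_DIFF.length() - 1);
        System.out.println(braceBalance(trimmed));
    }
}
```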
Description
This PR automatically rewrites boolean queries which have a must_not RangeQuery clause to instead use a should clause of the complement of that range. This can be 2-30x faster depending on the query. See #17586 where this is described in more detail.
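To make the idea concrete, here is a minimal sketch of the complement logic on plain numbers. It is not the code from this PR: the `Range` class and the `complement` and `matchesAny` methods are hypothetical names, and the sketch assumes every document has exactly one value for the field (documents with several values, or none, would need separate handling). Under that assumption, excluding a range matches the same values as matching at least one of the two ranges on either side of it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative and are not OpenSearch APIs.
final class MustNotRangeComplementSketch {

    /** An interval over long values; a null bound means unbounded on that side. */
    static final class Range {
        final Long lower;
        final boolean includeLower;
        final Long upper;
        final boolean includeUpper;

        Range(Long lower, boolean includeLower, Long upper, boolean includeUpper) {
            this.lower = lower;
            this.includeLower = includeLower;
            this.upper = upper;
            this.includeUpper = includeUpper;
        }

        boolean contains(long value) {
            boolean aboveLower = lower == null || (includeLower ? value >= lower : value > lower);
            boolean belowUpper = upper == null || (includeUpper ? value <= upper : value < upper);
            return aboveLower && belowUpper;
        }
    }

    /**
     * Returns the ranges whose union is the complement of the excluded range.
     * Inclusivity flips at each shared bound: excluding [10, 20) leaves
     * (-inf, 10) and [20, +inf).
     */
    static List<Range> complement(Range excluded) {
        List<Range> result = new ArrayList<>();
        if (excluded.lower != null) {
            result.add(new Range(null, false, excluded.lower, !excluded.includeLower));
        }
        if (excluded.upper != null) {
            result.add(new Range(excluded.upper, !excluded.includeUpper, null, false));
        }
        return result;
    }

    /** Mimics "at least one should clause matches". */
    static boolean matchesAny(List<Range> ranges, long value) {
        for (Range range : ranges) {
            if (range.contains(value)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Range excluded = new Range(10L, true, 20L, false); // [10, 20)
        List<Range> shouldClauses = complement(excluded);
        for (long v = 0; v <= 30; v++) {
            boolean viaMustNot = !excluded.contains(v);
            boolean viaShould = matchesAny(shouldClauses, v);
            if (viaMustNot != viaShould) {
                throw new AssertionError("mismatch at " + v);
            }
        }
        System.out.println("complement has " + shouldClauses.size() + " range(s)");
    }
}
```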
Example original query (on nyc_taxis):
Rewritten query:
Some benchmark numbers from http_logs and nyc_taxis (excluded ranges are on the `@timestamp` and `dropoff_datetime` fields respectively). "Originally written as" means whether the query was sent to OpenSearch with a `must_not` clause, or if it was sent already rewritten with `should` clauses. Ideally, after the changes are applied, these p50s should be the same.

I believe the small differences between runs (for example, 7/1-9/1 `should` going from 427 -> 405 ms, when we'd expect no change) are just due to variation between different runs/instances. This is expected from what I've seen in tiered caching benchmarks. I've done a few runs and the direction/magnitude of the changes vary.

Related Issues
Part of #17586
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.