QL: Reduce nesting of same bool queries#96265
Conversation
Improve transpilation of chained OR and AND queries by replacing boolQuery wrapping with extension of the queries inside the bool clause Fix elastic#96236
|
Pinging @elastic/es-ql (Team:QL) |
|
Hi @costin, I've created a changelog YAML for you. |
luigidellaquila
left a comment
There was a problem hiding this comment.
LGTM
I'd just add one or two unit tests with multiple conditions and mixed AND/OR
| ; | ||
| {"bool":{"must":[{"wildcard":{"process_path":{"wildcard":"*x","boost":1.0}}}, | ||
| {"bool":{"must_not":[{"wildcard":{"process_path":{"wildcard":"*yx","boost":1.0}}}],"boost":1.0}}],"boost":1.0}} | ||
| {"bool":{"must_not":[{"wildcard":{"process_path":{"wildcard":"*yx","boost":1.0}}}],"boost":1.0}} |
There was a problem hiding this comment.
I'm wondering if this change is not affecting the score of a query that uses bools and returns the SCORE() as well.
There was a problem hiding this comment.
Confirmed with the search team that the two types of queries are equivalent as Lucene normalizes them to the same query (the one created above) when minimum_should_query is the default (1 - which it is).
| private final boolean isAnd; | ||
| private final Query left; | ||
| private final Query right; | ||
| private final List<Query> queries; |
There was a problem hiding this comment.
Imo, this change is a bit forced, conceptually speaking. A boolean query by definition is a boolean operation between two expressions, just like a math operation is. Having multiple expressions ALL linked together by either OR or AND seems like a specialized boolean query, MultiBoolQuery if you want. The BoolQueryBuilder in ES is built similarly, but is generalized: a list of must, a list of should, a list of filters etc.
I would have expected a solution around "flattening" the bool statements, but not touching the BoolQuery in this way.
There was a problem hiding this comment.
Balancing the tree would have been another alternative but I opted for the above because it simplifies the Lucene query which is what triggers the exception in the first place.
Based on napkin math a balanced tree has n-2 intermediate nodes (3 ORs - 1 intermediate node, 4 ORs - 2, 5 ORs - 3, etc..). This implementation truly flattens the OR to just 1 node regardless of how many clauses there are.
| } | ||
|
|
||
| private static BoolQuery mutate(BoolQuery query) { | ||
| List<Function<BoolQuery, BoolQuery>> options = Arrays.asList( |
There was a problem hiding this comment.
This test doesn't fully reflect the changes in this PR, but I'm ok with it.
I've added two more tests - OR OR AND OR OR plus (OR OR) AND (OR OR OR) |
💚 Backport successful
|
Improve transpilation of chained OR and AND queries by replacing boolQuery wrapping with extension of the queries inside the bool clause Fixes elastic#96236
Improve transpilation of chained OR and AND queries by replacing
boolQuery wrapping with extension of the queries inside the bool clause
Fix #96236