SQL: Reduce number of ranges generated for comparisons#30267
SQL: Reduce number of ranges generated for comparisons#30267costin merged 6 commits intoelastic:masterfrom
Conversation
Rewrote optimization rule for combining ranges by improving the detection of binary comparisons in a tree to better combine them in a range, regardless of their place inside an expression. Additionally, improve the comparisons of Numbers of different types Also, improve reassembly of conjunction/disjunction into balanced trees. Fix elastic#30017
|
Pinging @elastic/es-search-aggs |
nik9000
left a comment
There was a problem hiding this comment.
I left a few things. The actual range optimizer code makes my head spin a bit so I'm going to have to come back to it.
| } | ||
| } catch (ClassCastException cce) { | ||
| // types are not compatible | ||
| // return null |
There was a problem hiding this comment.
What are the side effects on not catching this and letting it bubble out? Or even wrapping it? I think it'd be nice to have a big comment here about why returning null is ok.
Also, can you move the try into the if statement so it looks smaller? That'd just make me a bit more comfortable. If returning null is right here then I think you should return it from the catch rather than fall out. That feels a little safer even if it isn't.
There was a problem hiding this comment.
CCE is thrown when the types are not compatible; I've expanded the comment and returns the null directly from catch.
| } | ||
|
|
||
| /** | ||
| * Build a binary 'pyramid' - while a bit longer this method creates a balanced tree as oppose to a plain |
There was a problem hiding this comment.
Maybe
Builds a "pyramid" out of all clauses by combining them pairwise. So
{@code combine(List(a, b, c, d, e), AND)` becomes
<pre><code>
AND
|---/ \---|
AND AND
/ \ / \
e AND b c
/ \
a b
</pre></code>
The picture would make me feel better.
| return null; | ||
| } | ||
|
|
||
| List<Expression> result = new ArrayList<>(exps); |
There was a problem hiding this comment.
I wonder if this should be Queue<Expression> work = new ArrayDeque<>(exps); and then you do work.addLast(combiner.apply(work.removeFirst(), work.removeFirst())); The for loop and remove together scare me.
There was a problem hiding this comment.
I talked with @costin earlier about this - he wants to keep the order the same and my proposal doesn't.
What about this?
while (result.size() > 1) {
ListIterator<Expression> itr = result.iterator();
while (itr.hasNext()) {
itr.add(combiner.apply(itr.remove(), itr.remove()));
}
}
Your version works but for (int i = 0; i < result.size() - 1; i++) { make me think it'll be a normal loop and then you remove and add and I'm confused. Using the ListIterator forces the reader to think.
There was a problem hiding this comment.
If you'd like we can follow this up in a different PR.
The approach above doesn't work since remove can only be called once per next/previous.
I've added a comment to the loop explaining that it updates the list (and thus why it uses the temporary variable).
Considering the loops are fairly contained I think that conveys the message without the gymnastics of iterator.
P.S. The list is created just above as a copy for this specific reason.
There was a problem hiding this comment.
I'm fine with leaving it, yeah. I did want a prettier one but if this is what we can do, it'll do.
| public boolean foldable() { | ||
| return value.foldable() && lower.foldable() && upper.foldable(); | ||
| if (lower.foldable() && upper.foldable()) { | ||
| return (excludingBoundaries() || value.foldable()); |
There was a problem hiding this comment.
Can you remove the extra ( and )? I don't think they help here.
| return lowerComparsion && upperComparsion; | ||
| } | ||
|
|
||
| private boolean excludingBoundaries() { |
There was a problem hiding this comment.
Can you move the explanation to javadoc? I do like that you catch this case. I might rename this method. I'd call it isDegenerate but I think that isn't a great name either. matchesNothing?
| static class CombineComparisonsIntoRange extends OptimizerExpressionRule { | ||
| /** | ||
| * Propagate Equals to eliminate conjuncted Ranges. | ||
| * When encountering a different Equals or exclusive {@link Range}, the conjunction becomes false. |
There was a problem hiding this comment.
Can you give an example, maybe in terms of SQL? Like,
converts {@code SELECT * FROM a < 10 OR a == 10} into {@code SELECT * FROM a <= 10}??
There was a problem hiding this comment.
I see. This one converts {@code SELECT * FROM a < 10 AND a = 11} into {@SELECT * FROM FALSE}.
| if (ex instanceof Range) { | ||
| ranges.add((Range) ex); | ||
| } else if (ex instanceof Equals) { | ||
| Equals equ = (Equals) ex; |
There was a problem hiding this comment.
Can you name equ and eq a little more differently? They sort of blur together to me when I read this.
| } | ||
|
|
||
| // | ||
| // Constant folding |
There was a problem hiding this comment.
Maybe in a follow up move each of these chunks into separate tests? OptimizerConstantFoldingTests and OptimizerLogicalSimplificationsTests and OptimizerRangeOptimizationTests or something.
There was a problem hiding this comment.
I'd like to extract the optimizer and the tests and have the rules for BinaryComparisons separate.
|
I added the |
|
Definitely this goes to 6.x, I typically put them in after I merge the code it to avoid race conditions (typically applies for 6.3 though). |
nik9000
left a comment
There was a problem hiding this comment.
Looks like an improvement to me. I'd love a nicer loop to build the tree and I left a suggestion - I'm not 100% sure it'll work, but it is a thing. Also, maybe a unit test for that method too? Otherwise, I think this is good.
Do not promote BinaryComparisons to Ranges since it introduces NULL boundaries and thus a corner-case that needs too much handling Compare BinaryComparisons directly between themselves and to Ranges
…nary-comparison-combination
|
retest this |
|
run tests |
|
test this please |
Rewrote optimization rule for combining ranges by improving the detection of binary comparisons in a tree to better combine them in a range, regardless of their place inside an expression. Additionally, improve the comparisons of Numbers of different types Also, improve reassembly of conjunction/disjunction into balanced trees. Do not promote BinaryComparisons to Ranges since it introduces NULL boundaries and thus a corner-case that needs too much handling Compare BinaryComparisons directly between themselves and to Ranges Fix elastic#30017
* master: Set the new lucene version for 6.4.0 [ML][TEST] Clean up jobs in ModelPlotIT Upgrade to 7.4.0-snapshot-1ed95c097b (#30357) Watcher: Ensure trigger service pauses execution (#30363) [DOCS] Added coming qualifiers in changelog [DOCS] Commented out empty sections in the changelog to fix the doc build. (#30372) Security: reduce garbage during index resolution (#30180) Make RepositoriesMetaData contents unmodifiable (#30361) Change quad tree max levels to 29. Closes #21191 (#29663) Test: use trial license in qa tests with security [ML] Add integration test for model plots (#30359) SQL: Fix bug caused by empty composites (#30343) [ML] Account for gaps in data counts after job is reopened (#30294) InternalEngineTests.testConcurrentOutOfOrderDocsOnReplica should use two documents (#30121) Change signature of Get Repositories Response (#30333) Tests: Use different watch ids per test in smoke test (#30331) [Docs] Add term query with normalizer example Adds Eclipse config for xpack licence headers (#30299) Watcher: Make start/stop cycle more predictable and synchronous (#30118) [test] add debug logging for packaging test [DOCS] Removed X-Pack Breaking Changes [DOCS] Fixes link to TLS LDAP info Update versions for start_trial after backport (#30218) Packaging: Set elasticsearch user to have non-existent homedir (#29007) [DOCS] Fixes broken links to bootstrap user (#30349) Fix NPE when CumulativeSum agg encounters null/empty bucket (#29641) Make licensing FIPS-140 compliant (#30251) [DOCS] Reorganizes authentication details in Stack Overview (#30280) Network: Remove http.enabled setting (#29601) Fix merging logic of Suggester Options (#29514) [DOCS] Adds LDAP realm configuration details (#30214) [DOCS] Adds native realm configuration details (#30215) ReplicationTracker.markAllocationIdAsInSync may hang if allocation is cancelled (#30316) [DOCS] Enables edit links for X-Pack pages (#30278) Packaging: Unmark systemd service file as a config file (#29004) SQL: Reduce number of ranges generated for comparisons (#30267) Tests: Simplify VersionUtils released version splitting (#30322) Cancelling a peer recovery on the source can leak a primary permit (#30318) Added changelog entry for deb prerelease version change (#30184) Convert server javadoc to html5 (#30279) Create default ES_TMPDIR on Windows (#30325) [Docs] Clarify `fuzzy_like_this` redirect (#30183) Post backport of #29658. Fix docs of the `_ignored` meta field. Remove MapperService#types(). (#29617) Remove useless version checks in REST tests. (#30165) Add a new `_ignored` meta field. (#29658) Move repository-azure fixture test to QA project (#30253) # Conflicts: # buildSrc/version.properties # server/src/test/java/org/elasticsearch/index/engine/InternalEngineTests.java
* 6.x: Stop forking javac (#30462) Fix tribe tests Docs: Use task_id in examples of tasks (#30436) Security: Rename IndexLifecycleManager to SecurityIndexManager (#30442) Packaging: Set elasticsearch user to have non-existent homedir (#29007) [Docs] Fix typo in cardinality-aggregation.asciidoc (#30434) Avoid NPE in `more_like_this` when field has zero tokens (#30365) Build: Switch to building javadoc with html5 (#30440) Add a quick tour of the project to CONTRIBUTING (#30187) Add stricter geohash parsing (#30376) Reindex: Use request flavored methods (#30317) Silence SplitIndexIT.testSplitIndexPrimaryTerm test failure. (#30432) Auto-expand replicas when adding or removing nodes (#30423) Silence IndexUpgradeIT test failures. (#30430) Fix line length violation in cache tests Add failing test for core cache deadlock [DOCS] convert forcemerge snippet Update forcemerge.asciidoc (#30113) Added zentity to the list of API extension plugins (#29143) Fix the search request default operation behavior doc (#29302) (#29405) Watcher: Mark watcher as started only after loading watches (#30403) Correct wording in log message (#30336) Do not fail snapshot when deleting a missing snapshotted file (#30332) AwaitsFix testCreateShrinkIndexToN DOCS: Correct mapping tags in put-template api DOCS: Fix broken link in the put index template api Add put index template api to high level rest client (#30400) [Docs] Add snippets for POS stop tags default value Remove entry inadvertently picked into changelog Move respect accept header on no handler to 6.3.1 Respect accept header on no handler (#30383) [Test] Add analysis-nori plugin to the vagrant tests [Rollup] Validate timezone in range queries (#30338) [Docs] Fix bad link [Docs] Fix end of section in the korean plugin docs add the Korean nori plugin to the change logs Expose the Lucene Korean analyzer module in a plugin (#30397) Docs: remove transport_client from CCS role example (#30263) Test: remove cluster permission from CCS user (#30262) Watcher: Remove unneeded index deletion in tests fix docs branch version fix lucene snapshot version Upgrade to 7.4.0-snapshot-1ed95c097b (#30357) [ML][TEST] Clean up jobs in ModelPlotIT Watcher: Ensure trigger service pauses execution (#30363) [DOCS] Fixes ordering of changelog sections [DOCS] Commented out empty sections in the changelog to fix the doc build. (#30372) Make RepositoriesMetaData contents unmodifiable (#30361) Change signature of Get Repositories Response (#30333) 6.x Backport: Terms query validate bug (#30319) InternalEngineTests.testConcurrentOutOfOrderDocsOnReplica should use two documents (#30121) Security: reduce garbage during index resolution (#30180) Test: use trial license in qa tests with security [ML] Add integration test for model plots (#30359) SQL: Fix bug caused by empty composites (#30343) [ML] Account for gaps in data counts after job is reopened (#30294) [ML] Refactor DataStreamDiagnostics to use array (#30129) Make licensing FIPS-140 compliant (#30251) Do not load global state when deleting a snapshot (#29278) Don't load global state when only restoring indices (#29239) Tests: Use different watch ids per test in smoke test (#30331) Watcher: Make start/stop cycle more predictable and synchronous (#30118) [Docs] Add term query with normalizer example Adds Eclipse config for xpack licence headers (#30299) Fix message content in users tool (#30293) [DOCS] Removed X-Pack breaking changes page [DOCS] Added security breaking change [DOCS] Fixes link to TLS LDAP info [DOCS] Merges X-Pack release notes into changelog (#30350) [DOCS] Fixes broken links to bootstrap user (#30349) [Docs] Remove errant changelog line Fix NPE when CumulativeSum agg encounters null/empty bucket (#29641) [DOCS] Reorganizes authentication details in Stack Overview (#30280) Tests: Simplify VersionUtils released version splitting (#30322) Fix merging logic of Suggester Options (#29514) ReplicationTracker.markAllocationIdAsInSync may hang if allocation is cancelled (#30316) [DOCS] Adds LDAP realm configuration details (#30214) [DOCS] Adds native realm configuration details (#30215) Disable SSL on testing old BWC nodes (#30337) [DOCS] Enables edit links for X-Pack pages Cancelling a peer recovery on the source can leak a primary permit (#30318) SQL: Reduce number of ranges generated for comparisons (#30267) [DOCS] Adds links to changelog sections Convert server javadoc to html5 (#30279) REST Client: Add Request object flavored methods (#29623) Create default ES_TMPDIR on Windows (#30325) [Docs] Clarify `fuzzy_like_this` redirect (#30183) Fix docs of the `_ignored` meta field. Add a new `_ignored` meta field. (#29658) Move repository-azure fixture test to QA project (#30253)
Rewrote optimization rule for combining ranges by improving the
detection of binary comparisons in a tree to better combine
them in a range, regardless of their place inside an expression.
Additionally, improve the comparisons of Numbers of different types
Also, improve reassembly of conjunction/disjunction into balanced
trees.
Fix #30017
Closes #30019