fix(query): handle repeated % in LIKE folding #19590
sundy-li wants to merge 5 commits into databendlabs:main
tests/sqllogictests/suites/query/functions/02_0005_function_compare.test
Blocking issue from review: the PR head still fails its own new sqllogictest in the real query path. CI run 23498244372 reports |
Blocking issue: |
sundy-li
left a comment
Blocking issue: normalize_simple_pattern() now rewrites patterns like abab%%%%% to LikePattern::EndOfPercent("abab"), but calc_like_domain() in src/query/functions/src/scalars/comparison.rs:1007 still derives its prefix from the raw pattern string and only strips a single trailing %.
That means the planner/domain folder reasons about the prefix abab%%%% instead of abab, so constant-folded queries like SELECT 'ababac' LIKE 'abab%%%%%' still fold to false even though the runtime LIKE matcher now returns true. This matches the failing query sqllogic jobs on the PR head.
The normalization change in src/query/expression/src/filter/like.rs needs a matching update in calc_like_domain() so planner folding derives prefixes from the normalized LikePattern::EndOfPercent(v) (or otherwise collapses repeated trailing %).
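The mismatch described above can be illustrated with a minimal sketch. The function names below are hypothetical stand-ins, not Databend's actual code; they only assume what the comment states, namely that one path strips a single trailing % while the other collapses all of them.

```rust
// Hypothetical helpers for illustration only; not Databend's real API.

/// Mimics the behavior described in the review: remove at most ONE trailing '%'.
fn prefix_strip_one(pattern: &str) -> &str {
    pattern.strip_suffix('%').unwrap_or(pattern)
}

/// Collapses EVERY trailing '%', so the prefix agrees with a normalized
/// "literal followed by %" pattern.
fn prefix_collapse_all(pattern: &str) -> &str {
    pattern.trim_end_matches('%')
}

/// Prefix-only LIKE check. Only valid for patterns of the form
/// `literal%...%` with no other wildcards or escape characters.
fn like_by_prefix(value: &str, prefix: &str) -> bool {
    value.starts_with(prefix)
}
```

With the pattern abab%%%%% and the value ababac, the strip-one prefix is abab%%%%, which the value does not start with, while the collapsed prefix is abab, which it does. That is the same false-versus-true disagreement the review reports between folding and runtime evaluation.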
sundy-li
left a comment
Blocking issue: generate_like_pattern() now normalizes repeated trailing % into LikePattern::EndOfPercent(_), but calc_like_domain() still derives its prefix from the raw SQL pattern string and only removes one trailing %.
That means planner/domain folding can still disagree with runtime LIKE evaluation for repeated-% literals. A case like SELECT 'ababac' LIKE 'abab%%%%%' can still be folded using the prefix abab%%%% instead of abab, so the end-to-end SQL result stays wrong even though the matcher itself was fixed.
calc_like_domain() needs to derive the prefix from the normalized pattern variant (or collapse repeated trailing % there as well) so planner folding and runtime evaluation stay consistent.
Confirmed blocker: repeated trailing This needs a matching update in |
Blocking issue on current head: That keeps planner/domain folding inconsistent with runtime LIKE evaluation for literals such as
sundy-li
left a comment
Blocking issue: normalize_simple_pattern() now rewrites all-% patterns like '%%%%%' to LikePattern::Constant(true) in src/query/expression/src/filter/like.rs:312, but calc_like_domain() in src/query/functions/src/scalars/comparison.rs:1008 still only treats LikePattern::StartOfPercent(v) if v.is_empty() as the always-true case.
That leaves the zero-segment folding path inconsistent with the existing % folding: LIKE '%' still drops the filter entirely (see tests/sqllogictests/suites/mode/standalone/explain/explain_like.test:18), while LIKE '%%%%%' now normalizes to the same semantics at runtime but won’t fold in the planner because calc_like_domain() falls through to None.
This PR is specifically about repeated-% LIKE folding, so the Constant(true) branch needs the same ALL_TRUE_DOMAIN handling, and it should get a regression around EXPLAIN ... LIKE '%%%%%', before I’d call it complete.
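A minimal sketch of the requested handling follows. The enum is a hypothetical, simplified mirror of the variant names mentioned in this comment, and fold_all_true is an invented helper; the real calc_like_domain() works on domains and has a different signature. Returning Some(true) stands in for the ALL_TRUE_DOMAIN case, and None stands in for "no folding".

```rust
// Hypothetical simplification; the real LikePattern has more variants
// and the real folding code returns a domain rather than Option<bool>.
#[derive(Debug, PartialEq)]
enum LikePattern {
    Constant(bool),
    StartOfPercent(String),
    EndOfPercent(String),
}

/// Returns Some(true) when the pattern matches every non-NULL string,
/// and None when this sketch cannot fold the filter.
fn fold_all_true(pattern: &LikePattern) -> Option<bool> {
    match pattern {
        // Existing always-true case described in the review: LIKE '%'.
        LikePattern::StartOfPercent(v) if v.is_empty() => Some(true),
        // Requested addition: an all-'%' pattern normalized to Constant(true).
        LikePattern::Constant(true) => Some(true),
        _ => None,
    }
}
```

Under these assumptions, both the single-% form and the normalized all-% form fold the same way, while a pattern such as EndOfPercent("abab") falls through to None.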
Blocking issue on the current head: That means the zero-segment repeated- Since this PR is specifically about repeated- |
Updated the planner/domain fold for normalized all- |
Blocking issue in
That broadens optimizer behavior beyond the repeated- |
Scoped the repeated all- |
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/
Summary
LIKE constant folding panics on patterns with repeated % #19561
LIKE simple patterns that collapse to zero or one literal segment when repeated % is constant-folded
% patterns
Tests
Type of change
Validation
cargo test -p databend-common-expression
cargo clippy -p databend-common-expression --lib --tests -- -D warnings
cargo fmt --all --check
tests/sqllogictests/suites/query/functions/02_0005_function_compare.test (not run locally in this environment)