EQL: Fix cidrMatch function fails to match when used in scripts#56246
EQL: Fix cidrMatch function fails to match when used in scripts#56246aleksmaus merged 10 commits intoelastic:masterfrom
Conversation
|
Pinging @elastic/es-ql (:Query Languages/EQL) |
| Object o = field.fold(); | ||
| ArrayList<Object> arr = new ArrayList<>(addresses.size()); | ||
| for (Expression address : addresses) { | ||
| final Equals eq = new Equals(source(), field, address); | ||
| func = (func == null) ? eq : new Or(source(), func, eq); | ||
| arr.add(address.fold()); | ||
| } | ||
| return doProcess(o, arr); |
There was a problem hiding this comment.
| Object o = field.fold(); | |
| ArrayList<Object> arr = new ArrayList<>(addresses.size()); | |
| for (Expression address : addresses) { | |
| final Equals eq = new Equals(source(), field, address); | |
| func = (func == null) ? eq : new Or(source(), func, eq); | |
| arr.add(address.fold()); | |
| } | |
| return doProcess(o, arr); | |
| return doProcess(field.fold(), Expressions.fold(addresses)); |
| public ScriptTemplate asScript() { | ||
| ScriptTemplate leftScript = asScript(field); | ||
|
|
||
| List<Object> values = new ArrayList<>(new LinkedHashSet<>(foldListOfValues(addresses, field.dataType()))); |
There was a problem hiding this comment.
| List<Object> values = new ArrayList<>(new LinkedHashSet<>(foldListOfValues(addresses, field.dataType()))); | |
| List<Object> values = Expressions.fold(addresses) |
There was a problem hiding this comment.
this will not eliminate duplicated values as far as I see
| process where cidrMatch(source_address, "10.0.0.0/8") | ||
| ; | ||
| "term":{"source_address":{"value":"10.0.0.0/8" | ||
| "script":{"source":"InternalQlScriptUtils.nullSafeFilter(InternalEqlScriptUtils.cidrMatch( |
There was a problem hiding this comment.
Should we add an optimizer rule to handle this so that we don't lose the more performant term query?
There was a problem hiding this comment.
Will probably have to do another pass for that after @astefan merges his PR that also will affect the query planning optimization for simple equals. Can adjust this PR if that change merges sooner, sometimes this week.
|
Updated PR to use term query for cidrMatch when possible otherwise fallback to script. |
| if (cm.field() instanceof FieldAttribute && Expressions.foldable(cm.addresses())) { | ||
| String targetFieldName = handler.nameOf(((FieldAttribute) cm.field()).exactAttribute()); | ||
|
|
||
| Set<Object> set = new LinkedHashSet<>(CollectionUtils.mapSize(cm.addresses().size())); |
There was a problem hiding this comment.
Is this worthwhile to add as a method? It's also called within CidrMatch.asScript
There was a problem hiding this comment.
Not exactly, needed Set here for TermsQuery constructor and List for the scripting func
| new ExpressionTranslators.StringQueries(), | ||
| new ExpressionTranslators.Matches(), | ||
| new ExpressionTranslators.MultiMatches(), | ||
| new Scalars() |
There was a problem hiding this comment.
preferred without static importing everything here, more visible QL translators vs EQL.
There was a problem hiding this comment.
A relevant aspect here: you might want to update your branch from master, to incorporate the StartsWith changes from #56404. In that PR, the Scalars from QL was updated to have startsWith use a PrefixQuery whenever possible. And for EQL and its QueryTranslator that you added, it will probably need the Scalars from QL, as well.
There was a problem hiding this comment.
Updated, and adjusted Scalar translation in EQ to be able to reuse it for startsWith function
astefan
left a comment
There was a problem hiding this comment.
Looks good in general, but I have some comments as well.
|
|
||
| public static boolean isInRange(String address, String... cidrAddresses) { | ||
| // Check if address is parsable first | ||
| byte[] addr = InetAddresses.forString(address).getAddress(); |
There was a problem hiding this comment.
Since this code is in EQL, maybe we should handle the IllegalArgumentException here and wrap it in EqlIllegalArgumentException?
| new ExpressionTranslators.StringQueries(), | ||
| new ExpressionTranslators.Matches(), | ||
| new ExpressionTranslators.MultiMatches(), | ||
| new Scalars() |
There was a problem hiding this comment.
A relevant aspect here: you might want to update your branch from master, to incorporate the StartsWith changes from #56404. In that PR, the Scalars from QL was updated to have startsWith use a PrefixQuery whenever possible. And for EQL and its QueryTranslator that you added, it will probably need the Scalars from QL, as well.
| import org.elasticsearch.xpack.ql.type.DataType; | ||
| import org.elasticsearch.xpack.ql.type.DataTypeConverter; | ||
|
|
||
| public class EqlTranslatorHandler implements TranslatorHandler { |
There was a problem hiding this comment.
This class is the same as QlTranslatorHandler except asQuery method. Maybe you could reuse most of the code from QlTranslatorHandler and only change the relevant bit in EqlTranslatorHandler. For example, why not extending QlTranslatorHandler and override asQuery only?
| {"terms":{"source_address":["10.0.0.0/8"] | ||
| ; | ||
|
|
||
| cidrMatchFunctionTwo |
There was a problem hiding this comment.
Please, add one more test where there are two cidrMatch function calls in the same query: where cidrMatch(whatever) or cidrMatch(another_whatever).
| process where cidrMatch(source_address, "10.0.0.0/8") or cidrMatch(source_address, "192.168.0.0/16") | ||
| ; | ||
| {"bool":{"must":[{"term":{"event.category":{"value":"process" | ||
| {"terms":{"source_address":["10.0.0.0/8"],"boost":1.0}},{"terms":{"source_address":["192.168.0.0/16"] |
There was a problem hiding this comment.
Indeed, it is relevant to have both terms queries in there and the must for event.category, but there has to be a bool should as well.
The query should translate as a bool with two must statements:
- one is
termonevent.category - the other is a
boolwith twoshouldstatements wheretermson thesource_addressis used.
Replaced the previous implementation with script function.
Addresses #55709
@astefan I think this will conflict with your changes on query folding with equal operator optimization, it generates a different script with function now.