EQL: Filter out null join keys in sequence queries#78195

Closed
astefan wants to merge 3 commits into elastic:master from astefan:non_null_keys_joining

@astefan astefan commented Sep 22, 2021

Joining on null keys in sequences can lead to a large number of queries and matches in a cluster or CCS multi-cluster scenario where some indices have mappings that contain the join key fields while others don't. This change removes support for joining on null values by proactively filtering them out with an exists filter.
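
As a rough illustration of the intended effect (not code from this PR), the sketch below applies the same rule to in-memory events: an event is kept only if every join key is present and non-null. The class name `NullJoinKeyFilter` and the method `withNonNullKeys` are hypothetical, and events are modelled as plain maps purely for the sake of the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not code from the PR or from Elasticsearch.
public class NullJoinKeyFilter {

    // Keeps only the events where every join key is present and non-null,
    // which is the effect one "exists" filter per join key is meant to have.
    public static List<Map<String, Object>> withNonNullKeys(
            List<Map<String, Object>> events, List<String> keyNames) {
        List<Map<String, Object>> kept = new ArrayList<>();
        for (Map<String, Object> event : events) {
            boolean allPresent = true;
            for (String keyName : keyNames) {
                // Map.get returns null for a missing field and for an explicit null value
                if (event.get(keyName) == null) {
                    allPresent = false;
                    break;
                }
            }
            if (allPresent) {
                kept.add(event);
            }
        }
        return kept;
    }
}
```

With join keys `user` and `host`, an event that lacks `user` (for example because its index has no such field in the mapping) would be dropped before any joining takes place.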

@elasticmachine
Collaborator

Pinging @elastic/es-ql (Team:QL)

Member

@costin costin left a comment

LGTM

        // do not join on null values
        if (keyNames.isEmpty() == false) {
            BoolQueryBuilder nullValuesFilter = boolQuery();
            for (int keyIndex = 0; keyIndex < keyNames.size(); keyIndex++) {

Member

Use foreach instead:

        for (String keyName : keyNames) {
            
        }

            BoolQueryBuilder nullValuesFilter = boolQuery();
            for (int keyIndex = 0; keyIndex < keyNames.size(); keyIndex++) {
                // add an "exists" query for each join key to filter out any non-existent values
                nullValuesFilter.must(existsQuery(keyNames.get(keyIndex)));

Member

No need to score, just use filter. That removes the need to wrap it in a bool query; simply use RuntimeUtils.addFilter for every existsQuery.
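
To make the filter-versus-must distinction concrete, here is a minimal sketch that only renders the query DSL as text. It assumes the per-key exists clauses end up in the `filter` array of a bool query, where they restrict matches without contributing to the score. The class `ExistsFilterSketch` and the method `existsFilters` are hypothetical and are not part of Elasticsearch or of this PR.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper; it only renders query DSL text and is not an Elasticsearch API.
public class ExistsFilterSketch {

    // Renders one "exists" clause per join key inside the "filter" array of a bool query.
    // Assumes field names need no JSON escaping.
    public static String existsFilters(List<String> keyNames) {
        StringJoiner clauses = new StringJoiner(",", "[", "]");
        for (String keyName : keyNames) {
            clauses.add("{\"exists\":{\"field\":\"" + keyName + "\"}}");
        }
        return "{\"bool\":{\"filter\":" + clauses + "}}";
    }
}
```

The loop uses the foreach form suggested earlier in this review, since the index is never needed.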

Contributor Author

astefan commented Sep 23, 2021

@elasticmachine run elasticsearch-ci/bwc elasticsearch-ci/part-2

Contributor Author

astefan commented Sep 23, 2021

@elasticmachine run elasticsearch-ci/part-2

Comment on lines +47 to +52
            .setHttpClientConfigCallback(new HttpClientConfigCallback() {
                @Override
                public HttpAsyncClientBuilder customizeHttpClient(HttpAsyncClientBuilder httpClientBuilder) {
                    return httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider);
                }
            })

Contributor

I think this generally makes testing easier, so it's good to have, but I'm curious whether it was added for a specific reason.

Contributor Author

Yes, there is a better reason: since #70114 security is enabled in this test cluster.

Contributor

Thank you.
(teaching me a lesson about reading PRs chronologically...)

Contributor

@bpintea bpintea left a comment

Lgtm.

Contributor Author

astefan commented Oct 22, 2021

Superseded by #79677.
