Skip to content

Allow a number of broadcast transport actions to resolve data streams#55726

Merged
martijnvg merged 11 commits intoelastic:masterfrom
martijnvg:data_stream_resolvability_BroadcastRequest
May 11, 2020
Merged

Allow a number of broadcast transport actions to resolve data streams#55726
martijnvg merged 11 commits intoelastic:masterfrom
martijnvg:data_stream_resolvability_BroadcastRequest

Conversation

@martijnvg
Copy link
Copy Markdown
Member

@martijnvg martijnvg commented Apr 24, 2020

Change TransportBroadcastByNodeAction and TransportBroadcastReplicationAction
to be able to resolve data streams by default. Implementations can change this ability.

This change allows to following APIs to resolve data streams: flush,
refresh (already supported data streams), force merge, clear indices cache,
indices stats, segments, upgrade stats, upgrade, validate query,
searchable snapshots stats, clear searchable snapshots cache and
reload analyzers APIs.

Relates to #53100

…tream

By changing to the default indices options of BroadcastRequest
the following apis can resolve data streams: flush,
refresh (already supported data streams), force merge, clear indices cache,
indices stats, segments, upgrade stats, upgrade, validate query,
searchable snapshots stats, clear searchable snapshots cache and
reload analyzers APIs.

The following APIs extend BroadcastRequest but overwrite default
indices options and therefor don't yet support data streams:
recovery api

This change also adds `strictIncludeDataStreamsExpand()` to `IndicesOptions`
class. (an alternative to `strictExpand()`) This is because
`InternalClusterInfoService` overwrites the default indices options
for the indices stats call.

Relates to elastic#53100
@elasticmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-core-features (:Core/Features/Data streams)

@martijnvg martijnvg marked this pull request as draft April 24, 2020 15:05
@martijnvg martijnvg marked this pull request as ready for review April 24, 2020 15:40
Copy link
Copy Markdown
Contributor

@danhermann danhermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Left one minor comment

Comment on lines +453 to +454
* @return indices option that requires every specified index to exist, expands wildcards to both open and closed
* indices and allows that no indices are resolved from wildcard expressions (not returning an error).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add that data streams are included to distinguish it from ::strictExpand()

@henningandersen
Copy link
Copy Markdown
Contributor

I think ForgetFollowerAction.Request also cannot meaningfully resolve data streams.

I am also a bit worried that we here are introducing support for data streams in a wide array of APIs without any tests for it. I think it would be better to tackle them one by one, changing the indices options in each specific request and adding the necessary tests for it?

@martijnvg
Copy link
Copy Markdown
Member Author

I think ForgetFollowerAction.Request also cannot meaningfully resolve data streams.

True and it doesn't support aliases and wildcard expressions too. The corresponding transport action uses the leader and follower indices as provided in the api. Which why this PR doesn't add data stream support to this api either. The ForgetFollowerAction.Request supplies the leader index to the base class (BroadcastRequest) and that is then used to find the leader shards, but in the transport action's shardOperation() method it then uses the leader index as is provided via api. Using using anything other than the concrete leader index results in failure. I think this behaviour is intentional.

I think it would be better to tackle them one by one, changing the indices options in each specific request and adding the necessary tests for it?

I've added tests for all the APIs that have a request that extends from BroadcastRequest and that support data streams. So that should be ok? Or do you prefer that I revert the change to BroadcastRequest and make the change to all the concrete request classes that should support data streams? (similar to the refresh request class prior to this PR)

martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Apr 29, 2020
to include data streams by default.

If a wildcard expression (which is `*` here) matches a data stream
but the indices options don't include data streams then an error is returned.

This change was spawned of from elastic#55726 which changes the default indices options
of the indices stats api.

This will avoid cluster info service from failing to refresh the stats:

```
[2020-04-29T17:16:00,218][WARN ][o.e.c.InternalClusterInfoService] [runTask-0] Failed to execute IndicesStatsAction for ClusterInfoUpdateJob
java.lang.IllegalArgumentException: The provided expression [[_all]] matches a data stream, specify the corresponding concrete indices instead.
        at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.dataStreamsNotSupportedException(IndexNameExpressionResolver.java:278) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:699) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:159) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:138) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:71) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.<init>(TransportBroadcastByNodeAction.java:252) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction.doExecute(TransportBroadcastByNodeAction.java:225) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction.doExecute(TransportBroadcastByNodeAction.java:76) ~[elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:88) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.action.support.ActionFilter$Simple.apply(ActionFilter.java:53) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:86) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.lambda$apply$0(SecurityActionFilter.java:86) [x-pack-security-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.lambda$authorizeRequest$4(SecurityActionFilter.java:172) [x-pack-security-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.security.authz.AuthorizationService.authorizeSystemUser(AuthorizationService.java:388) [x-pack-security-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.security.authz.AuthorizationService.authorize(AuthorizationService.java:192) [x-pack-security-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.authorizeRequest(SecurityActionFilter.java:172) [x-pack-security-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.lambda$applyInternal$3(SecurityActionFilter.java:158) [x-pack-security-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$authenticateAsync$2(AuthenticationService.java:323) [x-pack-security-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lambda$lookForExistingAuthentication$6(AuthenticationService.java:385) [x-pack-security-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.lookForExistingAuthentication(AuthenticationService.java:396) [x-pack-security-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.security.authc.AuthenticationService$Authenticator.authenticateAsync(AuthenticationService.java:320) [x-pack-security-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.security.authc.AuthenticationService.authenticate(AuthenticationService.java:156) [x-pack-security-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.applyInternal(SecurityActionFilter.java:155) [x-pack-security-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.lambda$apply$1(SecurityActionFilter.java:92) [x-pack-security-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.core.security.SecurityContext.executeAsUser(SecurityContext.java:127) [x-pack-core-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.apply(SecurityActionFilter.java:90) [x-pack-security-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:86) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:64) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.tasks.TaskManager.registerAndExecute(TaskManager.java:151) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:90) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:79) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:380) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.client.support.AbstractClient$IndicesAdmin.execute(AbstractClient.java:1210) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.client.support.AbstractClient$IndicesAdmin.stats(AbstractClient.java:1465) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
        at org.elasticsearch.cluster.InternalClusterInfoService.updateIndicesStats(InternalClusterInfoService.java:264) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
```
@henningandersen
Copy link
Copy Markdown
Contributor

About ForgetFollowerAction, I think it will result in an NPE

        final Index leaderIndex = clusterService.state().metadata().index(request.leaderIndex()).getIndex();

I think we should rather use an indicesOptions that does not support data streams. I think the investigation work you are doing will likely address this?

I think the tests look fine (though the forget follower action is missing). Thanks for pointing that out.

I wonder if I should review this now or wait until you concluded the experiment on extracting the data stream option?

@martijnvg
Copy link
Copy Markdown
Member Author

I wonder if I should review this now or wait until you concluded the experiment on extracting the data stream option?

Yes, I think we should wait with this pr and see how #56034 unfolds.

@rjernst rjernst added the Team:Data Management (obsolete) DO NOT USE. This team no longer exists. label May 4, 2020
@martijnvg martijnvg changed the title Change the BroadcastRequest default indices options to resolve data stream Allow a number of broadcast transport actions to resolve data streams May 8, 2020
@martijnvg
Copy link
Copy Markdown
Member Author

I've updated this PR based on the changes that came with #56034.

Copy link
Copy Markdown
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@martijnvg martijnvg merged commit 77aa236 into elastic:master May 11, 2020
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request May 11, 2020
…elastic#55726)

Change TransportBroadcastByNodeAction and TransportBroadcastReplicationAction
to be able to resolve data streams by default. Implementations can change this ability.

This change allows to following APIs to resolve data streams: flush,
refresh (already supported data streams), force merge, clear indices cache,
indices stats (already supported data streams), segments, upgrade stats, 
upgrade, validate query, searchable snapshots stats, clear searchable snapshots cache and
reload analyzers APIs.

Relates to elastic#53100
martijnvg added a commit that referenced this pull request May 11, 2020
…#55726) (#56502)

Change TransportBroadcastByNodeAction and TransportBroadcastReplicationAction
to be able to resolve data streams by default. Implementations can change this ability.

This change allows to following APIs to resolve data streams: flush,
refresh (already supported data streams), force merge, clear indices cache,
indices stats (already supported data streams), segments, upgrade stats, 
upgrade, validate query, searchable snapshots stats, clear searchable snapshots cache and
reload analyzers APIs.

Relates to #53100
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>non-issue :StorageEngine/Data streams Data streams and their lifecycles Team:Data Management (obsolete) DO NOT USE. This team no longer exists. v7.9.0 v8.0.0-alpha1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants