[ClusterManager - Read Actions] Leverage the ClusterState from Publish phase for serving Admin Read API's #15414

@rajiv-kv

Description

Is your feature request related to a problem? Please describe

Read requests are served from the committed ClusterState on the node. If the applied cluster-state is not in sync with the cluster-manager, the node fetches the entire cluster-state from the cluster-manager. In large clusters, appliers can take significant time to commit the cluster-state.

Until the appliers finish, read requests on the node fetch the cluster-state from the cluster-manager, because the local term-version does not match.
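
To make the problem concrete, here is a minimal Java sketch of the read path described above. It is not OpenSearch code: `AppliedStateReadSketch`, `TermVersion`, `State`, and `resolve` are hypothetical names, and the cluster-state is reduced to an opaque payload.

```java
import java.util.function.Supplier;

/** Hypothetical illustration of the existing behaviour; not the OpenSearch implementation. */
final class AppliedStateReadSketch {

    /** (term, version) pair that identifies one cluster-state. */
    record TermVersion(long term, long version) {}

    /** Simplified stand-in for a cluster-state: its identity plus an opaque payload. */
    record State(TermVersion termVersion, String payload) {}

    /**
     * Serves the read from the locally applied state only when its term-version
     * equals the one committed on the cluster-manager. Otherwise the whole
     * cluster-state is fetched remotely, which is the expensive path.
     */
    static State resolve(State applied, TermVersion committedOnManager, Supplier<State> fullRemoteFetch) {
        if (applied != null && applied.termVersion().equals(committedOnManager)) {
            return applied; // local copy is current
        }
        return fullRemoteFetch.get(); // local copy lags behind: full fetch
    }
}
```

While the appliers are still running, `applied` lags behind `committedOnManager`, so every read in that window takes the remote-fetch branch.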

Describe the solution you'd like

Cluster Manager propagates cluster-state updates from the leader to followers in two phases: publish and commit.
During the publish phase, the updated cluster-state is published to all nodes in the cluster, and each node caches the published cluster-state locally. Transport handlers of the admin read APIs fetch the cluster-wide committed term and version from the cluster-manager. If that term-version matches the cached cluster-state, the node can serve the read from the cached state even though it has been published but not yet committed locally.
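
The proposed lookup order could be sketched as follows. This is a sketch under stated assumptions, not the actual design: the class `PublishedStateCacheSketch`, its `onPublish`/`onApplied` hooks, and the `Source` enum are hypothetical names introduced for illustration, and the cache is assumed to hold only the most recently published state.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical illustration of the proposal; names and structure are assumptions. */
final class PublishedStateCacheSketch {

    /** (term, version) pair that identifies one cluster-state. */
    record TermVersion(long term, long version) {}

    /** Simplified stand-in for a cluster-state. */
    record State(TermVersion termVersion, String payload) {}

    /** Where the state used to serve a read came from. */
    enum Source { APPLIED, PUBLISHED_CACHE, REMOTE_FETCH }

    /** Outcome of a lookup; state is empty when a remote fetch is still required. */
    record Resolution(Source source, Optional<State> state) {}

    private final AtomicReference<State> lastApplied = new AtomicReference<>();
    private final AtomicReference<State> lastPublished = new AtomicReference<>();

    /** Assumed hook: called when the publish phase delivers a new state to this node. */
    void onPublish(State state) { lastPublished.set(state); }

    /** Assumed hook: called once the appliers have committed a state locally. */
    void onApplied(State state) { lastApplied.set(state); }

    /** Picks a local state whose term-version equals the committed one, if any. */
    Resolution resolve(TermVersion committedOnManager) {
        State applied = lastApplied.get();
        if (applied != null && applied.termVersion().equals(committedOnManager)) {
            return new Resolution(Source.APPLIED, Optional.of(applied));
        }
        State published = lastPublished.get();
        if (published != null && published.termVersion().equals(committedOnManager)) {
            // Published but not yet applied locally: still safe to serve, because
            // the cluster-manager reports this exact term-version as committed.
            return new Resolution(Source.PUBLISHED_CACHE, Optional.of(published));
        }
        return new Resolution(Source.REMOTE_FETCH, Optional.empty());
    }
}
```

The cached state is used only on an exact term-version match, so a node that missed a publication still falls back to fetching the full cluster-state from the cluster-manager.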

This will help especially in large clusters, where the delay between publication and commit of the cluster-state is on the order of seconds (~18s in a 1000-node cluster).

Related component

Cluster Manager

Describe alternatives you've considered

No response

Additional context

No response

Metadata

Assignees

No one assigned

    Labels

    Cluster Manager, enhancement (Enhancement or improvement to existing feature or request)

    Type

    No type

    Projects

    Status

    ✅ Done

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
