[BUG] Cluster Stats Request performs shard level aggregations on coordinator node instead of individual data nodes #14714

@Pranshu-S

Describe the bug

The current approach to generating the cluster stats indices section incurs significant overhead in large clusters with many shards distributed across nodes. The overhead comes from the following steps:

  1. Fetching shard-level stats: Each node fetches a map containing the statistics of its individual shards.
  2. Accumulating on the request node: The node that receives the REST request (the request node) gathers these maps from all participating nodes.
  3. Inefficient data structures: Building hash maps from the StreamInput and then iterating over them to accumulate the stats is computationally expensive.

Related component

Other

To Reproduce

In TransportClusterStatsAction, each node returns a map of ShardStats, which ClusterStatsIndices then iterates over on the coordinator node. This iteration visits every shard stats entry in every node response, so the work done on the coordinator grows with the total number of shards in the cluster.
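
The pattern described above can be illustrated with a minimal, self-contained sketch. The types `ShardEntry`, `NodeResponse` and `IndexTotals` are hypothetical simplified stand-ins invented for this illustration; they are not the actual OpenSearch classes, and the real ShardStats carries far more data. The point is only that the coordinator touches one entry per shard per node.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: simplified stand-ins, NOT the real OpenSearch classes.
final class CoordinatorSideAggregation {

    /** Hypothetical per-shard payload that a node sends to the coordinator. */
    record ShardEntry(String index, long docCount, long storeBytes) {}

    /** Hypothetical node response carrying one entry per local shard. */
    record NodeResponse(String nodeId, List<ShardEntry> shards) {}

    /** Mutable per-index totals accumulated on the coordinator. */
    static final class IndexTotals {
        int shardCount;
        long docCount;
        long storeBytes;
    }

    /** Coordinator-side accumulation: visits every shard entry of every node response. */
    static Map<String, IndexTotals> aggregateOnCoordinator(List<NodeResponse> responses) {
        Map<String, IndexTotals> totals = new HashMap<>();
        for (NodeResponse response : responses) {
            for (ShardEntry shard : response.shards()) {
                IndexTotals t = totals.computeIfAbsent(shard.index(), k -> new IndexTotals());
                t.shardCount++;
                t.docCount += shard.docCount();
                t.storeBytes += shard.storeBytes();
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        List<NodeResponse> responses = List.of(
            new NodeResponse("node-1", List.of(
                new ShardEntry("logs", 10, 100), new ShardEntry("metrics", 5, 50))),
            new NodeResponse("node-2", List.of(new ShardEntry("logs", 20, 200))));
        aggregateOnCoordinator(responses).forEach((index, t) ->
            System.out.println(index + ": shards=" + t.shardCount + " docs=" + t.docCount));
    }
}
```

In this sketch both the payload size and the coordinator's loop scale with the number of shards, which is the overhead this issue describes.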

Expected behavior

The shard-level stats should be pre-computed (aggregated) on each data node before they are sent back to the coordinator node.
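
One possible shape of this behavior is sketched below, assuming that per-index sums are sufficient for the final response. `ShardEntry`, `IndexSummary`, `summarizeOnNode` and `mergeOnCoordinator` are hypothetical names made up for this example, not OpenSearch APIs, and an actual fix may aggregate differently.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: hypothetical types, NOT the real OpenSearch classes.
final class NodeSideAggregation {

    /** Hypothetical per-shard stats, which stay local to the data node. */
    record ShardEntry(String index, long docCount, long storeBytes) {}

    /** Hypothetical compact per-index summary sent over the wire instead of per-shard entries. */
    record IndexSummary(int shardCount, long docCount, long storeBytes) {
        IndexSummary plus(IndexSummary other) {
            return new IndexSummary(
                shardCount + other.shardCount,
                docCount + other.docCount,
                storeBytes + other.storeBytes);
        }
    }

    /** Runs on each data node: reduces the local shards to one summary per index. */
    static Map<String, IndexSummary> summarizeOnNode(List<ShardEntry> localShards) {
        Map<String, IndexSummary> summary = new HashMap<>();
        for (ShardEntry s : localShards) {
            summary.merge(s.index(),
                new IndexSummary(1, s.docCount(), s.storeBytes()), IndexSummary::plus);
        }
        return summary;
    }

    /** Runs on the coordinator: merges one small summary map per node. */
    static Map<String, IndexSummary> mergeOnCoordinator(List<Map<String, IndexSummary>> nodeSummaries) {
        Map<String, IndexSummary> totals = new HashMap<>();
        for (Map<String, IndexSummary> nodeSummary : nodeSummaries) {
            nodeSummary.forEach((index, s) -> totals.merge(index, s, IndexSummary::plus));
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, IndexSummary> node1 = summarizeOnNode(List.of(
            new ShardEntry("logs", 10, 100), new ShardEntry("logs", 15, 150)));
        Map<String, IndexSummary> node2 = summarizeOnNode(List.of(
            new ShardEntry("logs", 20, 200), new ShardEntry("metrics", 5, 50)));
        System.out.println(mergeOnCoordinator(List.of(node1, node2)));
    }
}
```

With this split, the coordinator in the sketch merges at most one entry per index per node rather than one entry per shard, and the per-shard iteration is spread across the data nodes.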

Metadata

    Labels

    Cluster Manager; bug (Something isn't working); v2.16.0 (Issues and PRs related to version 2.16.0)

    Projects

    Status

    ✅ Done
