[BUG] Offline calculation of total shard across all node and caching it for weight calculation inside LocalShardBalancer #15108

@RS146BIJAY

Describe the bug

Description

When selecting a node on which a shard will be allocated, OpenSearch calculates the weight of that shard on every node. The weight compares the number of shards on a node to the average number of shards each node should hold (taking into account both the cluster as a whole and the shards per index). Calculating the average shards per node during weight calculation is a resource-intensive operation: we sum the shard counts on all nodes by iterating through the metadata of every node, then divide this sum by the total number of nodes. Since this computation is performed for each node during shard allocation, it becomes computationally expensive. Because a single thread on the master node handles all of these operations, including running the deciders and making allocation decisions, allocation decider execution can keep this thread blocked, which may prevent execution of high-priority tasks such as applying/sending cluster state updates, index creation, etc.
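To illustrate why this gets expensive, here is a minimal sketch of the pattern described above. It is not the OpenSearch code: `NaiveWeightSketch`, `avgShardsPerNode`, `weight`, and `pickNode` are hypothetical names, the weight is reduced to a single cluster-wide term, and the `nodeVisits` counter exists only to make the cost visible.

```java
import java.util.List;

// Hypothetical sketch; none of these names come from the OpenSearch code base.
final class NaiveWeightSketch {
    // Counts how many per-node lookups the averaging step performs.
    static long nodeVisits = 0;

    // Average shards per node, recomputed from scratch on every call: O(nodes).
    static float avgShardsPerNode(List<Integer> shardsPerNode) {
        long total = 0;
        for (int count : shardsPerNode) {
            total += count;
            nodeVisits++;
        }
        return (float) total / shardsPerNode.size();
    }

    // Simplified weight: how far one node's shard count is from the cluster average.
    // The per-index term mentioned in the description is omitted for brevity.
    static float weight(List<Integer> shardsPerNode, int nodeIndex) {
        return shardsPerNode.get(nodeIndex) - avgShardsPerNode(shardsPerNode);
    }

    // Picks the lowest-weight node for one shard. Each candidate node triggers
    // a full pass over all nodes, so one decision costs O(nodes^2) lookups.
    static int pickNode(List<Integer> shardsPerNode) {
        int best = 0;
        float bestWeight = Float.POSITIVE_INFINITY;
        for (int i = 0; i < shardsPerNode.size(); i++) {
            float w = weight(shardsPerNode, i);
            if (w < bestWeight) {
                bestWeight = w;
                best = i;
            }
        }
        return best;
    }
}
```

In this sketch, choosing a node among N candidates performs N × N per-node lookups for a single shard, which is the kind of repeated work the profile below points at.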

[Image: Screenshot 2024-07-08 at 5 48 10 PM]

As the graph shows, about 50% of the time spent relocating 6k empty shards from 100 source nodes to 100 destination nodes is attributed to the average shard calculation during weight determination.

Related component

Indexing:Replication

To Reproduce

Create 500k shards on a setup with 1000 data nodes and 3 master nodes.

Expected behavior

Do an offline calculation of the total shards across all nodes and cache it, so that LocalShardBalancer does not need to traverse the metadata of all nodes for weight calculation.
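One possible shape for this, as a rough sketch only: keep a running total that is updated whenever a shard is added or removed, so the average becomes a constant-time lookup. `CachedShardTotalSketch` and its methods are hypothetical and are not the LocalShardBalancer API; the sketch also ignores the per-index term and any thread-safety concerns.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; this is not the LocalShardBalancer API.
final class CachedShardTotalSketch {
    private final Map<String, Integer> shardsPerNode = new HashMap<>();
    // Cached sum of all shard counts, kept in step with shardsPerNode.
    private long totalShards = 0;

    void addNode(String nodeId) {
        shardsPerNode.putIfAbsent(nodeId, 0);
    }

    void addShard(String nodeId) {
        shardsPerNode.merge(nodeId, 1, Integer::sum);
        totalShards++;
    }

    void removeShard(String nodeId) {
        Integer count = shardsPerNode.get(nodeId);
        if (count == null || count == 0) {
            throw new IllegalStateException("no shard on " + nodeId);
        }
        shardsPerNode.put(nodeId, count - 1);
        totalShards--;
    }

    // O(1): uses the cached total instead of traversing every node.
    float avgShardsPerNode() {
        return shardsPerNode.isEmpty() ? 0f : (float) totalShards / shardsPerNode.size();
    }

    // Simplified weight: distance of one node's shard count from the average.
    float weight(String nodeId) {
        return shardsPerNode.getOrDefault(nodeId, 0) - avgShardsPerNode();
    }
}
```

With a cached total, computing the weight for every candidate node costs O(nodes) per shard rather than O(nodes^2). The trade-off is that every code path that adds, removes, or moves a shard has to keep the cached value consistent.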

Labels: Indexing:Replication, bug, untriaged
