[BUG] Node concurrent recoveries settings not being honoured. #13702
SwethaGuptha pushed a commit to SwethaGuptha/OpenSearch that referenced this issue on May 29, 2024 (…arch-project#13702). Signed-off-by: Swetha Guptha <[email protected]>
github-project-automation bot moved this from 👀 In review to ✅ Done in Cluster Manager Project Board on Jun 13, 2024.
Describe the bug
The default/updated concurrent recovery settings (node_concurrent_recoveries, node_initial_primaries_recoveries) are not being honored and have no effect on recovery speed for clusters with batch mode enabled.
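For reference, these are dynamic cluster settings, normally applied through the cluster settings API with a request body like the following (the values shown are illustrative, not recommendations):

```json
{
  "persistent": {
    "cluster.routing.allocation.node_concurrent_recoveries": 2,
    "cluster.routing.allocation.node_initial_primaries_recoveries": 4
  }
}
```

With batch mode enabled, updating these values has no observable effect on the number of simultaneous recoveries per node, which is the bug described below.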
This is happening because of the way we allocate unassigned shards in a batch. For a batch:
OpenSearch/server/src/main/java/org/opensearch/gateway/BaseGatewayShardAllocator.java
Lines 89 to 113 in da3ab92
Because the decider execution and the shard-status update do not happen together for each shard, the cluster state does not change while the deciders run over the unassigned shards. ThrottlingAllocationDecider reads the cluster state to decide whether a shard recovery can be started on a node, comparing the ongoing recoveries on that node against the configured recovery settings (node_concurrent_recoveries, node_initial_primaries_recoveries). So when the allocation decision runs for all shards in a batch at once, the decider does not account for the decisions already made for other shards in the same batch, and all shards end up being initialized at once.
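The effect of evaluating every shard against the same stale cluster state can be sketched with a small, self-contained simulation. This is not OpenSearch code; the class, method names, and the two-counter model are illustrative assumptions that mimic the throttling check described above:

```java
// Hypothetical simulation of the throttling behavior, not actual OpenSearch code.
public class ThrottleSim {
    // Assumed stand-in for cluster.routing.allocation.node_concurrent_recoveries.
    static final int NODE_CONCURRENT_RECOVERIES = 2;

    // Modeled after the decider's check: throttle once the node has reached
    // the configured number of ongoing recoveries.
    static boolean canRecover(int ongoingRecoveriesOnNode) {
        return ongoingRecoveriesOnNode < NODE_CONCURRENT_RECOVERIES;
    }

    // Batch path (the bug): every shard is evaluated against the same
    // snapshot of cluster state, so no decision sees the earlier ones.
    static int allocateBatch(int shards) {
        int snapshotOngoing = 0; // state is not updated between decisions
        int initialized = 0;
        for (int i = 0; i < shards; i++) {
            if (canRecover(snapshotOngoing)) {
                initialized++;
            }
        }
        return initialized;
    }

    // Per-shard path (expected): each decision observes the shards
    // already initialized before it.
    static int allocateSequential(int shards) {
        int ongoing = 0;
        int initialized = 0;
        for (int i = 0; i < shards; i++) {
            if (canRecover(ongoing)) {
                ongoing++;
                initialized++;
            }
        }
        return initialized;
    }

    public static void main(String[] args) {
        // Batch mode initializes all 5 shards despite the limit of 2;
        // the per-shard path respects the limit.
        System.out.println("batch=" + allocateBatch(5));
        System.out.println("sequential=" + allocateSequential(5));
    }
}
```

With a limit of 2 and 5 unassigned shards, the batch path initializes all 5 while the per-shard path stops at 2, matching the observed behavior.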
Logs indicating the same:
Related component
Cluster Manager
To Reproduce
```java
logger.debug(
    "ThrottlingAllocationDecider decision, throttle: [{}] primary recovery limit [{}],"
        + " primaries in recovery [{}] invoked for [{}] on node [{}]",
    primariesInRecovery >= primariesInitialRecoveries,
    primariesInitialRecoveries,
    primariesInRecovery,
    shardRouting,
    node.node()
);
```
Expected behavior
The number of ongoing shard recoveries on a node should adhere to the node concurrent recovery settings.
Additional Details
OpenSearch Version: 2.14