[Feature Request] [Segment Replication] Balanced primary count across all nodes during rebalancing #12250
@dreamer-89 @ashking94 please provide your inputs on the issue. Will be adding more details around how the existing relocation algorithm works and what improvements we can make. Thanks!
The current imbalance originates primarily because we do not consider the overall per-node primary count during the rebalance. With segment replication this causes more issues, since primaries do the majority of the heavy lifting. Rather than doing another round of rebalancing as discussed in #6642, @dreamer-89 I'm thinking of the following:
```diff
  if (preferPrimaryBalance == true
      && shard.primary()
-     && maxNode.numPrimaryShards(shard.getIndexName()) - minNode.numPrimaryShards(shard.getIndexName()) < 2) {
+     && maxNode.numPrimaryShards() - minNode.numPrimaryShards() < 2) {
      continue;
  }
```
I think this will reduce the total relocations, since we will be considering both constraints in the same iteration of rebalanceByWeights and handling the relocation of primary and replica shards together. Thoughts @dreamer-89 @ashking94 ?
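For reference, the `preferPrimaryBalance` flag in the snippet above presumably maps to the existing opt-in allocation setting; below is a minimal sketch of enabling it, assuming the setting key `cluster.routing.allocation.balance.prefer_primary` and the standard `Settings` builder (illustration only, not part of the proposed change):

```java
import org.opensearch.common.settings.Settings;

// Minimal sketch; the setting key is assumed to be the existing
// "prefer primary balance" toggle and is shown only for illustration.
public class PreferPrimaryBalanceExample {
    public static void main(String[] args) {
        Settings settings = Settings.builder()
            .put("cluster.routing.allocation.balance.prefer_primary", true)
            .build();
        System.out.println(settings.getAsBoolean("cluster.routing.allocation.balance.prefer_primary", false));
    }
}
```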
Did a POC on the above idea. Here are the initial results. For simplicity, let's take a 4-node cluster with 4 indices, each having 4 primary and 4 replica shards.
Now, we drop one of the nodes, and this is what the shard distribution in the cluster looks like, as per the current algorithm:
With the changes mentioned above (only part 1):
Thanks @Arpit-Bandejiya for putting this up. Did we check if we can reuse the allocation constraints mechanism to achieve this - elastic/elasticsearch#43350?
Yes, @imRishN. This approach extends the same allocation constraints mechanism to rank the nodes during rebalancing.
@Arpit-Bandejiya Thanks for the POC. This looks promising. Around the change, I believe there are 2 settings that limit shard placement per node: one per index and one for total shards per node across the cluster. So it looks like this change should be less intrusive. We should also be ensuring the following things -
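For reference, the two existing per-node placement caps mentioned above are presumably the `total_shards_per_node` settings; a small illustrative snippet (not part of the proposed change):

```java
import org.opensearch.common.settings.Settings;

// Illustrative only: the cluster-level and index-level caps on shards per node.
public class ShardsPerNodeCapsExample {
    public static void main(String[] args) {
        // Cluster-wide cap on shards per node.
        Settings clusterSettings = Settings.builder()
            .put("cluster.routing.allocation.total_shards_per_node", 100)
            .build();
        // Per-index cap on shards per node (applied in the index settings).
        Settings indexSettings = Settings.builder()
            .put("index.routing.allocation.total_shards_per_node", 4)
            .build();
        System.out.println(clusterSettings + "\n" + indexSettings);
    }
}
```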
Before discussing how many shard relocations happen, we need to understand how the shards are assigned in the initial state. For example, let's assume we have a 3-node cluster with 3 indices, each having 3 primaries and 3 replicas. The shard-level assignment looks like this:
Index - 1
Index - 2
Index - 3
Now let's assume node N1 goes down. As can be checked above, the replicas of the primary shards that were assigned on N1 get promoted to primaries.
Now, with the above distribution, the cluster goes into the rebalancing phase. Since the primaries are already skewed, we need more relocations to balance the primary shards. So for the above case, when we rebalance with the existing logic:
Rebalancing with existing logic (shard balance): 0
For the case when we have 4 nodes with 4 indices, each having 4 primaries and 4 replicas, we got the following:
Rebalancing with existing logic (shard balance): 2
Initial state:
Intermediate state
Final state (current approach of shard balance):
Total relocations: 2
Final state (with primary shard balance):
Total relocations: 6
As can be seen above, if we try to rebalance the shards based on the primary shard count across the cluster, we need to come up with a better allocation strategy for shards. Currently we pick the node for an unassigned shard in decideAllocateUnassigned. In case of a tie between weighted nodes, we make sure that we follow a fixed pattern. This can sometimes cause the primary skew we have seen above. To avoid this, we try to randomly select among the nodes which have the minimum weight. Added the logic given below for a quick POC:
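A minimal, self-contained sketch of the described tie-breaking idea (using an illustrative `NodeCandidate` type; this is not the actual `decideAllocateUnassigned` code nor the author's exact POC diff):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch: when allocating an unassigned shard, collect every candidate node whose
// weight equals the minimum and pick one of them at random, instead of always
// following a fixed node order on ties.
public class RandomMinWeightTieBreaker {

    // Illustrative candidate type; the real balancer uses its own node model.
    record NodeCandidate(String nodeId, float weight) {}

    private final Random random = new Random();

    /** Picks one of the minimum-weight nodes, breaking ties randomly. */
    NodeCandidate pickNodeForUnassignedShard(List<NodeCandidate> nodes) {
        if (nodes.isEmpty()) {
            return null;
        }
        // Find the minimum weight across all candidate nodes.
        float minWeight = Float.MAX_VALUE;
        for (NodeCandidate node : nodes) {
            minWeight = Math.min(minWeight, node.weight());
        }
        // Collect all nodes tied at the minimum weight.
        List<NodeCandidate> minWeightNodes = new ArrayList<>();
        for (NodeCandidate node : nodes) {
            if (node.weight() == minWeight) {
                minWeightNodes.add(node);
            }
        }
        // Random selection among equally weighted nodes avoids the deterministic
        // pattern that can pile primaries onto the same nodes.
        return minWeightNodes.get(random.nextInt(minWeightNodes.size()));
    }

    public static void main(String[] args) {
        RandomMinWeightTieBreaker tieBreaker = new RandomMinWeightTieBreaker();
        List<NodeCandidate> nodes = List.of(
            new NodeCandidate("n1", 1.0f),
            new NodeCandidate("n2", 1.0f),
            new NodeCandidate("n3", 2.0f));
        // Either n1 or n2 may be returned; n3 never is.
        System.out.println(tieBreaker.pickNodeForUnassignedShard(nodes));
    }
}
```

The key difference from a deterministic tie-break is the random pick among equally weighted nodes, which spreads newly assigned primaries instead of repeatedly favouring the same node order.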
When we allocate the shards based on the above logic and then try to rebalance the primary shards, we see that the number of relocations is reduced in general. For example, in the above case of a 4-node cluster with 4 indices, each having 4 primaries and 4 replicas, we saw the following from the test:
Is your feature request related to a problem? Please describe
Currently in our system, we only have a constraint for primary shard rebalancing at the index level, which was introduced in #6422. However, there are cases where we need to consider the overall primary shard count on the nodes. For example:
Initial configuration
Let's assume we have a 5-node setup and 5 indices, each with 5 primaries and 1 replica.
Case 1:
When we drop one node, the new distribution looks like this:
A better distribution could be:
In case we add another node to the initial configuration, the node distribution looks like this:
Case 2:
Similarly, for this case a better distribution could be:
We can clearly see that the primary shards are skewed in both of these distributions, and we could achieve a better distribution across the nodes during rebalancing.
Describe the solution you'd like
Related component
Storage:Performance
Describe alternatives you've considered
No response
Additional context
No response