[cluster-bus] Send a MEET packet to a node if there is no inbound link #1307

Open
pieturin wants to merge 3 commits into base: unstable

pieturin
Contributor

In some cases, when meeting a new node, if the handshake times out, we can end up with an inconsistent view of the cluster: the new node knows about all the nodes in the cluster, but the cluster does not know about the new node (or vice versa).
To detect this inconsistency, we now check whether a node has an outbound link but no inbound link. If so, that node probably does not know us, and we (re-)send a MEET packet to it to start a new handshake.

This fixes the bug described in #1251.
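
A minimal sketch of the check described above, as standalone C. The struct, the flag values, and the helper names (`sketch_node`, `needs_meet_retry`, `schedule_meet_if_needed`) are hypothetical and only model the idea; they are not the code in src/cluster_legacy.c. The sketch also assumes that flagging a node for MEET is enough for a MEET packet to go out on the existing outbound link.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values, chosen only for this sketch. */
#define SK_NODE_HANDSHAKE (1 << 0) /* handshake still in progress */
#define SK_NODE_MEET      (1 << 1) /* a MEET is already scheduled */
#define SK_NODE_PFAIL     (1 << 2) /* node suspected unreachable (timed out) */

/* Simplified stand-in for a cluster node as seen from "myself". */
typedef struct {
    void *link;         /* outbound link: connection we opened to the node */
    void *inbound_link; /* inbound link: connection the node opened to us */
    int flags;
} sketch_node;

/* True when we can reach the node (outbound link up, no handshake or MEET
 * pending, not timed out) but it never connected back to us, which
 * suggests that it does not know us. */
static bool needs_meet_retry(const sketch_node *n) {
    assert(n != NULL);
    if (n->link == NULL) return false;         /* nothing to send on yet */
    if (n->inbound_link != NULL) return false; /* the node talks to us */
    if (n->flags & SK_NODE_HANDSHAKE) return false;
    if (n->flags & SK_NODE_MEET) return false;
    if (n->flags & SK_NODE_PFAIL) return false;
    return true;
}

/* Flags the node so that the next packet sent to it is a MEET.
 * Returns true only when the flag was newly set. */
static bool schedule_meet_if_needed(sketch_node *n) {
    if (!needs_meet_retry(n)) return false;
    n->flags |= SK_NODE_MEET;
    return true;
}
```

Because the MEET flag is part of the condition, calling `schedule_meet_if_needed()` again on the same node is a no-op until the flag is cleared, so a periodic check would not flood the peer with MEET packets.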

Signed-off-by: Pierre Turin <[email protected]>

codecov bot commented Nov 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.67%. Comparing base (32f7541) to head (6c67d41).

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #1307      +/-   ##
============================================
- Coverage     70.69%   70.67%   -0.02%     
============================================
  Files           115      115              
  Lines         63153    63163      +10     
============================================
  Hits          44643    44643              
- Misses        18510    18520      +10     
Files with missing lines Coverage Δ
src/cluster_legacy.c 86.20% <100.00%> (+0.01%) ⬆️

... and 13 files with indirect coverage changes

Contributor

hpatro left a comment

What if we disconnect the outbound link if inbound link is not available? I think it would lead to the same reconnection flow. Would that give us simpler code and one unified flow? I'm not sure whether it would perform the MEET operation, though.

Comment on lines -3227 to +3241
-    }
-
-    /* If this is a MEET packet from an unknown node, we still process
-     * the gossip section here since we have to trust the sender because
-     * of the message type. */
-    if (!sender && type == CLUSTERMSG_TYPE_MEET) clusterProcessGossipSection(hdr, link);
+        /* If this is a MEET packet from an unknown node, we still process
+         * the gossip section here since we have to trust the sender because
+         * of the message type. */
+        clusterProcessGossipSection(hdr, link);
+    }
Contributor

We don't need this change.

Contributor Author

True, but this double if with the same condition was driving me crazy.

Contributor

I don't mind this. But in general we avoid changes to the lines of code not related to the PR.

@@ -60,6 +60,7 @@ typedef struct clusterLink {
#define nodeIsPrimary(n) ((n)->flags & CLUSTER_NODE_PRIMARY)
#define nodeIsReplica(n) ((n)->flags & CLUSTER_NODE_REPLICA)
#define nodeInHandshake(n) ((n)->flags & CLUSTER_NODE_HANDSHAKE)
#define nodeIsMeeting(n) ((n)->flags & CLUSTER_NODE_MEET)
Contributor

How about nodeInMeetProcess / nodeInMeetState ?

Contributor Author

Ah yes, I do prefer nodeInMeetState().

clusterDelNode(node);
return 1;
}
if (node->link != NULL && node->inbound_link == NULL &&
!nodeInHandshake(node) && !nodeIsMeeting(node) && !nodeTimedOut(node) &&
Contributor

Should we create a macro for this node state check? The condition is hard to read as written.

Contributor Author

nodeInNormalState()?

Contributor

nodeInHealthyState() ?
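
For illustration, the negated flag checks in that condition could fold into a single mask test. This is a sketch only: the `SK_` flag values and the `sk`-prefixed names are hypothetical, and since the condition quoted above is truncated, the real macro may cover more states than the three modeled here.

```c
#include <assert.h>

/* Hypothetical flag bits for this sketch only. */
#define SK_NODE_PFAIL     (1 << 0) /* node timed out */
#define SK_NODE_HANDSHAKE (1 << 1) /* handshake in progress */
#define SK_NODE_MEET      (1 << 2) /* MEET pending */

typedef struct {
    int flags;
} sk_node;

/* Individual state checks, in the style of the existing macros. */
#define skNodeTimedOut(n)    ((n)->flags & SK_NODE_PFAIL)
#define skNodeInHandshake(n) ((n)->flags & SK_NODE_HANDSHAKE)
#define skNodeInMeetState(n) ((n)->flags & SK_NODE_MEET)

/* One mask test replacing the three negated checks. */
#define skNodeInNormalState(n) \
    (!((n)->flags & (SK_NODE_PFAIL | SK_NODE_HANDSHAKE | SK_NODE_MEET)))

/* Confirms, for every combination of the three bits, that the combined
 * macro agrees with the long-form condition. */
static void sk_check_equivalence(void) {
    for (int f = 0; f < 8; f++) {
        sk_node t = { f };
        int long_form = !skNodeInHandshake(&t) && !skNodeInMeetState(&t) &&
                        !skNodeTimedOut(&t);
        assert(skNodeInNormalState(&t) == long_form);
    }
}
```

Whichever name is chosen, the call site would then read as a link check plus one state check.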

src/cluster_legacy.c (outdated, resolved)
tests/unit/cluster/cluster-reliable-meet.tcl (resolved)
Comment on lines 231 to 233
[llength [R 0 CLUSTER NODES]] == 26 &&
[llength [R 1 CLUSTER NODES]] == 26 &&
[llength [R 2 CLUSTER NODES]] == 26
Contributor

Could we match a specific value in the string output instead? I don't like this magic-number comparison, which could change in the near future.

tests/unit/cluster/cluster-reliable-meet.tcl (resolved)
Contributor Author

pieturin commented Nov 14, 2024

> What if we disconnect the outbound link if inbound link is not available?

In this case we would just re-open an outbound connection, which the other node will accept, but it won't force the other node to recognize us as being part of the cluster if it doesn't trust us yet. The only way to force the other node to add us to its cluster view is for us to send a MEET packet.
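
The distinction can be modeled with a toy receiver, sketched below under stated assumptions: the `sk_view` type, the message enum, and `sk_receive()` are hypothetical names, and the model ignores gossip, which is another way nodes can learn about each other. It only captures the point made here: a PING from an unknown sender is accepted on the wire but does not add the sender, while a MEET does.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum sk_msg_type { SK_MSG_PING, SK_MSG_MEET };

#define SK_MAX_NODES 8
#define SK_ID_LEN 41 /* room for a 40-character node ID plus terminator */

/* The receiver's view of the cluster: the node IDs it trusts. */
typedef struct {
    char known[SK_MAX_NODES][SK_ID_LEN];
    int count;
} sk_view;

static bool sk_knows(const sk_view *v, const char *id) {
    for (int i = 0; i < v->count; i++) {
        if (strcmp(v->known[i], id) == 0) return true;
    }
    return false;
}

/* Handles one packet. Returns true if the sender is part of the
 * receiver's view after processing. */
static bool sk_receive(sk_view *v, enum sk_msg_type type, const char *sender_id) {
    assert(strlen(sender_id) < SK_ID_LEN);
    if (sk_knows(v, sender_id)) return true;
    if (type == SK_MSG_MEET && v->count < SK_MAX_NODES) {
        /* MEET is trusted because of its type: add the unknown sender. */
        strcpy(v->known[v->count++], sender_id);
        return true;
    }
    /* PING from a stranger: the connection is accepted, the view is not
     * changed. Reconnecting and pinging again gives the same result. */
    return false;
}
```

In this model, no number of reconnects followed by PINGs changes the receiver's view, which is why the retry has to be a MEET.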

Update test to check node IDs instead of relying on number of words.
Rename nodeIsMeeting() to nodeInMeetState().
Introduce nodeInNormalState() macro.

Signed-off-by: Pierre Turin <[email protected]>
Contributor

hpatro commented Nov 14, 2024

> What if we disconnect the outbound link if inbound link is not available?

> In this case we would just re-open an outbound connection, which the other node will accept, but it won't force the other node to recognize us as being part of the cluster if it doesn't trust us yet. The only way to force the other node to add us to its cluster view is for us to send a MEET packet.

CLUSTER MEET is an admin operation, but I guess we are fine with re-initiating it when the operation wasn't successful in the first place.

Comment on lines +91 to +114
proc cluster_3_nodes_all_know_each_other {} {
    set node0_id [dict get [get_myself 0] id]
    set node1_id [dict get [get_myself 1] id]
    set node2_id [dict get [get_myself 2] id]

    if {
        [cluster_get_node_by_id 0 $node0_id] != {} &&
        [cluster_get_node_by_id 0 $node1_id] != {} &&
        [cluster_get_node_by_id 0 $node2_id] != {} &&
        [cluster_get_node_by_id 1 $node0_id] != {} &&
        [cluster_get_node_by_id 1 $node1_id] != {} &&
        [cluster_get_node_by_id 1 $node2_id] != {} &&
        [cluster_get_node_by_id 2 $node0_id] != {} &&
        [cluster_get_node_by_id 2 $node1_id] != {} &&
        [cluster_get_node_by_id 2 $node2_id] != {} &&
        [llength [R 0 CLUSTER LINKS]] == 4 &&
        [llength [R 1 CLUSTER LINKS]] == 4 &&
        [llength [R 2 CLUSTER LINKS]] == 4
    } {
        return 1
    } else {
        return 0
    }
}
Contributor

From ChatGPT:

Suggested change
- proc cluster_3_nodes_all_know_each_other {} {
-     set node0_id [dict get [get_myself 0] id]
-     set node1_id [dict get [get_myself 1] id]
-     set node2_id [dict get [get_myself 2] id]
-     if {
-         [cluster_get_node_by_id 0 $node0_id] != {} &&
-         [cluster_get_node_by_id 0 $node1_id] != {} &&
-         [cluster_get_node_by_id 0 $node2_id] != {} &&
-         [cluster_get_node_by_id 1 $node0_id] != {} &&
-         [cluster_get_node_by_id 1 $node1_id] != {} &&
-         [cluster_get_node_by_id 1 $node2_id] != {} &&
-         [cluster_get_node_by_id 2 $node0_id] != {} &&
-         [cluster_get_node_by_id 2 $node1_id] != {} &&
-         [cluster_get_node_by_id 2 $node2_id] != {} &&
-         [llength [R 0 CLUSTER LINKS]] == 4 &&
-         [llength [R 1 CLUSTER LINKS]] == 4 &&
-         [llength [R 2 CLUSTER LINKS]] == 4
-     } {
-         return 1
-     } else {
-         return 0
-     }
- }
+ proc cluster_nodes_all_know_each_other {num_nodes} {
+     # Collect node IDs dynamically
+     set node_ids {}
+     for {set i 0} {$i < $num_nodes} {incr i} {
+         lappend node_ids [dict get [get_myself $i] id]
+     }
+     # Check if all nodes know each other
+     foreach node_id $node_ids {
+         foreach check_node_id $node_ids {
+             for {set node_index 0} {$node_index < $num_nodes} {incr node_index} {
+                 if {[cluster_get_node_by_id $node_index $check_node_id] == {}} {
+                     return 0
+                 }
+             }
+         }
+     }
+     # Verify cluster link counts for each node
+     set expected_links [expr {2 * ($num_nodes - 1)}]
+     for {set i 0} {$i < $num_nodes} {incr i} {
+         if {[llength [R $i CLUSTER LINKS]] != $expected_links} {
+             return 0
+         }
+     }
+     return 1
+ }
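
As a cross-check of the arithmetic, here is a small C model of the same two conditions. It assumes a fully meshed cluster in which every node holds one outbound and one inbound link per peer, which is where `2 * (num_nodes - 1)` comes from and why the 3-node test expects 4. The function names and the flat `knows` matrix are hypothetical, not part of the test suite.

```c
#include <assert.h>
#include <stdbool.h>

/* Links one node should report in a full mesh of num_nodes nodes:
 * one outbound plus one inbound link for each of the other nodes. */
static int expected_cluster_links(int num_nodes) {
    assert(num_nodes >= 1);
    return 2 * (num_nodes - 1);
}

/* knows[i * n + j] is true when node i has an entry for node j.
 * Each node's entry for itself is included, mirroring the test helper,
 * which also looks up every node's own ID. A single pass over all
 * (observer, target) pairs is enough. */
static bool all_know_each_other(const bool *knows, int n) {
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            if (!knows[i * n + j]) return false;
        }
    }
    return true;
}
```

The failure mode this PR targets corresponds to a matrix where one column is missing outside its own row: the new node lists everyone, but nobody lists the new node.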

Member

enjoy-binbin left a comment

Can we do something like #461? That is, only clear the CLUSTER_NODE_MEET flag when myself receives an "ack" (not a plain PONG but a strong ack, confirming that the sender has already met myself). I haven't thought about it carefully, but I feel it would be more reliable.

madolson
Member

> only clear the CLUSTER_NODE_MEET flag when myself receives an "ack" (not a plain PONG but a strong ack, confirming that the sender has already met myself)

Do you mean something like adding a new flag? I think the concern is that we could still end up in the inverse state, where the node that received the "strong" ack puts the other node online but might then go offline.

My original thought was that as long as one node believes the other is part of the cluster, it should try to have the other node join. It's sort of like an "enhanced" version of how we build up the mesh when two disjoint clusters meet each other.

4 participants