-
Notifications
You must be signed in to change notification settings - Fork 155
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor: Send heartbeat with dedicated workers #1225
Conversation
033f717
to
27acd6d
Compare
C: RaftTypeConfig, | ||
N: RaftNetworkV2<C>, | ||
{ | ||
pub(crate) id: C::NodeId, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is id
needed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For logging purpose only :(
27acd6d
to
2411bd7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I can say, looks good. Just see the comment - not sure about semantics change.
Reviewed 5 of 16 files at r1, 13 of 13 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @ariesdevil and @drmingdrmer)
openraft/src/core/heartbeat/worker.rs
line 87 at r3 (raw file):
let payload = AppendEntriesRequest { vote: *heartbeat.session_id.leader_vote.deref(), prev_log_id: None,
Since it's now sending empty AppendEntriesRequest
with prev_log_id == None
, this is a bit of a semantics change.
Maybe document on append entries somewhere that the prev_log_id
is uninteresting when the entries are empty?
I believe in our implementation we are checking prev_log_id
(currently unconditionally).
openraft/src/engine/handler/leader_handler/send_heartbeat_test.rs
line 60 at r3 (raw file):
assert_eq!( vec![ //
Why those empty comments (also below)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 16 files at r1.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @ariesdevil and @schreter)
openraft/src/core/heartbeat/worker.rs
line 87 at r3 (raw file):
Previously, schreter wrote…
Since it's now sending empty
AppendEntriesRequest
withprev_log_id == None
, this is a bit of a semantics change.Maybe document on append entries somewhere that the
prev_log_id
is uninteresting when the entries are empty?I believe in our implementation we are checking
prev_log_id
(currently unconditionally).
Semantically, prev_log_id
is used to ensure that the log entries in AppendEntries.entries
are consecutive with the entries on the leader by asserting that prev_log_id
is equal. Raft ensures that a matching log ID implies all preceding logs match. If entries
is empty, it is considered consecutive at any index on the follower, and prev_log_id=None
matches the very first position None
.
In short AppendEntries with prev_log_id=None && entries.is_empty()
means to append nothing at the very beginning, which is always valid.
openraft/src/engine/handler/leader_handler/send_heartbeat_test.rs
line 60 at r3 (raw file):
Previously, schreter wrote…
Why those empty comments (also below)?
To force line break and minimize git diff changes :(
accb0ac
to
c1eeca9
Compare
Heavy AppendEntries traffic can block heartbeat messages. For example, future AppendEntries in stream RPC may not receive a response indicating a follower is alive. In such cases, the leader might time out to extend its lease, and be considered partitioned from the cluster. This commit moves heartbeat broadcasting to separate tasks that won't be blocked by AppendEntries. This ensures the leader can always be acknowledged with the liveness of followers. Separate log progress notification and clock progress notification: When ReplicationCore successfully finished one RPC to Follower/Learner, it informs the RaftCore to update log progress and clock(heartbeat) progress. This commit split these two informations into two `Notification` variants, in order to make progress handling more clear. Another improvement is to ignore a heartbeat progress if it is sent with an older cluster membership config. Because a follower can be removed and re-added, the obsolete heartbeat progress is invalid. This check is done by remembering the membership log id in the `HeartbeatEvent`. `HigherVote` can be sent directly to Notification channel. replication::Response does not need `HigherVote` variant any more. And `Response` is renamed to `Progress`
c1eeca9
to
2f87402
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ariesdevil)
openraft/src/core/heartbeat/worker.rs
line 87 at r3 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
Semantically,
prev_log_id
is used to ensure that the log entries inAppendEntries.entries
are consecutive with the entries on the leader by asserting thatprev_log_id
is equal. Raft ensures that a matching log ID implies all preceding logs match. Ifentries
is empty, it is considered consecutive at any index on the follower, andprev_log_id=None
matches the very first positionNone
.In short AppendEntries with
prev_log_id=None && entries.is_empty()
means to append nothing at the very beginning, which is always valid.
Strictly speaking, this was previously undefined on the API level. Now it is defined, thanks.
Perhaps we should include this change in the architecture doc? |
Make sense. I'm gonna do this |
And some notes need to be mentioned in the doc. E.g, the node disk may have been damaged, but the heartbeat has been normal. |
Changelog
Refactor: Send heartbeat with dedicated workers
Heavy AppendEntries traffic can block heartbeat messages. For example,
future AppendEntries in stream RPC may not receive a response indicating
a follower is alive. In such cases, the leader might time out to extend
its lease, and be considered partitioned from the cluster.
This commit moves heartbeat broadcasting to separate tasks that won't be
blocked by AppendEntries. This ensures the leader can always be
acknowledged with the liveness of followers.
Separate log progress notification and clock progress notification:
When ReplicationCore successfully finished one RPC to Follower/Learner,
it informs the RaftCore to update log progress and clock(heartbeat) progress.
This commit split these two informations into two
Notification
variants, in order to make progress handling more clear.
Another improvement is to ignore a heartbeat progress if it is sent with
an older cluster membership config. Because a follower can be removed
and re-added, the obsolete heartbeat progress is invalid. This check is
done by remembering the membership log id in the
HeartbeatEvent
.HigherVote
can be sent directly to Notification channel.replication::Response does not need
HigherVote
variant any more.And
Response
is renamed toProgress
This change is