-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
raft: allow inflights size to dynamically increase #135225
Conversation
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @kvoli)
pkg/raft/tracker/inflights.go
line 81 at r1 (raw file):
next -= size } if next >= len(in.buffer) {
Is this branch ever reached at all?
When we get to the last entry (say size=100
):
next := 0 + 100
size := 100
if 100 >= 100 {
next -= 100
}
// next == 0
pkg/raft/tracker/inflights.go
line 102 at r1 (raw file):
} newBuffer := make([]inflight, newSize) copy(newBuffer, in.buffer)
This will never shrink now? Potentially concerning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @kvoli)
pkg/raft/tracker/inflights.go
line 77 at r1 (raw file):
func (in *Inflights) Add(index, bytes uint64) { next := in.start + in.count size := in.size
I am very confused by the logic that manages in.buffer
. It seems a very peculiar implementation of a circular buffer.
The modulus logic here uses in.size
and not len(in.buffer)
. Which suggests that say in.size
was 100 and the stable number of entries were only 2 (due to Add alternating with FreeLE), we would still grow the buffer to 100. Which seems to go against the claim:
We grow on demand
// instead of preallocating to inflights.size to handle systems which have
// thousands of Raft groups per process.
It seems to me that the original logic should have been to grow up to in.size
when there were actually that many in-flight entries.
Also, this suggests this PR is not quite right since the modulus logic is incorrect (the buffer could be bigger than size
). It should do:
if in.count == len(in.buffer) {
// grow
}
Then compute next
using the current length of the buffer.
As a quick fix I think it is ok to grow and never shrink. But I think we should in the future move rac2.CircularBuffer
into a util package and rewrite this code to use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @kvoli and @pav-kv)
pkg/raft/tracker/inflights.go
line 102 at r1 (raw file):
Previously, pav-kv (Pavel Kalinnikov) wrote…
This will never shrink now? Potentially concerning.
When were we ever shrinking before?
51dfa68
to
6ebc78e
Compare
`Inflights.Add()` could encounter an OOB panic if more than the configured `Inflights.size` entries were added via `Add()`, despite stating support for this behavior: ```go // [..] However, Add() is still valid and allowed, for cases when this pacing // is implemented at the higher app level. The tracker correctly tracks all the // in-flight entries. ``` Allow the buffer to grow dynamically larger than `Inflights.size`, as it should remain bounded by the `RangeController` in cases where `Full()` is not already checked (such as Raft's on flow control machinery). Fixes: cockroachdb#135223 Release note: None
6ebc78e
to
6fa1f7c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @pav-kv and @sumeerbhola)
pkg/raft/tracker/inflights.go
line 77 at r1 (raw file):
Also, this suggests this PR is not quite right since the modulus logic is incorrect (the buffer could be bigger than
size
). It should do:
I'm not following how in.count >= in.size
with this change (but I'm also questioning many things about this change myself), the only modifications to in.size
and len(in.buffer)
are made in grow
outside of init, and the method there explicitly sets in.size
to be equal to len(in.buffer)
should it ever be smaller (or the newSize
really).
pkg/raft/tracker/inflights.go
line 81 at r1 (raw file):
Previously, pav-kv (Pavel Kalinnikov) wrote…
Is this branch ever reached at all?
When we get to the last entry (say
size=100
):next := 0 + 100 size := 100 if 100 >= 100 { next -= 100 } // next == 0
Doesn't this assume len(in.buffer) == in.size
? That doesn't appear to hold, rather len(in.buffer) <= in.size
.
pkg/raft/tracker/inflights.go
line 102 at r1 (raw file):
Previously, sumeerbhola wrote…
When were we ever shrinking before?
Don't think so from looking around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @kvoli and @pav-kv)
pkg/raft/tracker/inflights.go
line 77 at r1 (raw file):
Which suggests that say
in.size
was 100 and the stable number of entries were only 2 (due to Add alternating with FreeLE), we would still grow the buffer to 100.
I think the above would not happen only if it shrank to 0.
if in.count == 0 {
// inflights is empty, reset the start index so that we don't grow the
// buffer unnecessarily.
in.start = 0
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @kvoli and @pav-kv)
pkg/raft/tracker/inflights.go
line 77 at r1 (raw file):
I'm not following how
in.count >= in.size
with this change
Maybe I am misunderstanding the question: in.size
is the limit we no longer respect in lazy replication mode, so in.count
can exceed it. Isn't that attempt what caused the panic.
Closing in favor of #135237. |
Inflights.Add()
could encounter an OOB panic if more than the configuredInflights.size
entries were added viaAdd()
, despite stating support for this behavior:Allow the buffer to grow dynamically larger than
Inflights.size
, as it should remain bounded by theRangeController
in cases whereFull()
is not already checked (such as Raft's on flow control machinery).Fixes: #135223
Release note: None