-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add wal write system call metrics observation (main) #17618
Conversation
Hi @Akiqqqqqqq. Thanks for your PR. I'm waiting for a etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - Thanks @Akiqqqqqqq
@fuweid master, can help approve merge this PR? |
Also update the metrics in etcd/server/storage/wal/encoder.go Lines 119 to 120 in 1348062
|
will do |
2f64140
to
18ed11c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Thanks
cc @serathius |
server/storage/wal/encoder.go
Outdated
nv, err := w.Write(buf) | ||
walWriteSec.Observe(time.Since(start).Seconds()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be two separate calls to Observe? I think those are part of a single write. I think it would be better to have single call with time measured before writeUint64
.
start := time.Now()
if err = writeUint64(e.bw, lenField, e.uint64buf); err != nil {
return err
}
if padBytes != 0 {
data = append(data, make([]byte, padBytes)...)
}
n, err = e.bw.Write(data)
walWriteSec.Observe(time.Since(start).Seconds())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see you followed the same flow as for walWriteBytes
, however the difference is that number of calls to cumulative .Add
doesn't matter while .Observe
on histogram does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Observing a write of writeUint64
as separate call would skew the histogram, as it's expected to write much smaller than size of data, then your 50%ile would be just writeUint64
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, that make sense, will fix later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix measurement of write so there is only 1 call to .Observe
.
Signed-off-by: Qiuyu Wu <[email protected]>
18ed11c
to
97efc2a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Please also add a changelog for 3.6 if you don't backport the fix to 3.5; otherwise add a changelog for 3.5 after backporting. |
will do later |
I backported to 3.5 and raised a changelog: #17728 |
xref. #17615
add new metrics: wal_write_duration_seconds
wal_write_duration_seconds represent total delay of: