Fix duration measurement issues in the client #24344
base: master
@bazel-io fork 8.0.0
Even with `CLOCK_MONOTONIC`, we are still seeing Bazel servers fail to start up occasionally due to start time being larger than end time. Make this non-fatal by truncating to 0 and emitting a warning with start and end time to facilitate further investigation. Also flip the conditions for command and extraction wait time, which previously were only included if *not* known.
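A minimal sketch of the non-fatal handling described above, assuming millisecond timestamps read from a monotonic clock. The helper name `ClampedDurationMillis` and the warning text are hypothetical and only illustrate the idea; they are not the code in this PR.

```cpp
#include <cstdint>
#include <iostream>

// Hypothetical helper: returns end_ms - start_ms, or 0 (after printing a
// warning with both readings to stderr) when start is later than end.
uint64_t ClampedDurationMillis(uint64_t start_ms, uint64_t end_ms) {
  if (end_ms < start_ms) {
    std::cerr << "WARNING: start time (" << start_ms
              << " ms) is after end time (" << end_ms
              << " ms); reporting a duration of 0 ms\n";
    return 0;
  }
  return end_ms - start_ms;
}
```

Truncating to 0 keeps startup going, and logging both raw readings preserves the data needed to investigate why the clock appeared to run backwards.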
High-level comment: I realize that I'm asking for a larger refactor, but I'm wondering if we could simply use `std::chrono::duration` for `DurationMillis`, and `std::optional<std::chrono::duration>` for `ExtractionDurationMillis` (because we either extracted and have a known duration, or we didn't extract and don't know it; the other two combinations don't appear to be useful). WDYT?
Ignore. My suggestion doesn't really gel with what you're trying to do. See my other comments instead.
@@ -557,14 +557,14 @@ static void AddLoggingArgs(const LoggingInfo &logging_info,

 // The time in ms a command had to wait on a busy Blaze server process.
 // This is part of startup_time.
-if (command_wait_duration_ms.IsUnknown()) {
+if (!command_wait_duration_ms.IsUnknown()) {
In the spirit of avoiding double negatives (which I suspect might have contributed to the bug), can we call the method `IsKnown()` instead?
Thinking some more about it: if the only consequence of `IsUnknown()` is that we don't set a `--command_wait_time` flag, why not set it to 0 in that case? The flags library can't distinguish between 0 and unset anyway.
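To make the two options in this thread concrete, here is a rough sketch. Only the `millis` field, the 0-means-unknown convention, and the `--command_wait_time` flag name come from the diff and comments; the struct body and both helper functions are hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the client's DurationMillis. A value of 0 is
// treated as "unknown", mirroring kUnknownDuration in the diff.
struct DurationMillis {
  uint64_t millis = 0;
  bool IsKnown() const { return millis != 0; }
};

// Option A: emit the flag only when the duration is known.
void AddWaitTimeIfKnown(const DurationMillis& d,
                        std::vector<std::string>* args) {
  if (d.IsKnown()) {
    args->push_back("--command_wait_time=" + std::to_string(d.millis));
  }
}

// Option B: always emit the flag; an unknown duration is sent as 0.
void AddWaitTimeAlways(const DurationMillis& d,
                       std::vector<std::string>* args) {
  args->push_back("--command_wait_time=" + std::to_string(d.millis));
}
```

Option A reads positively with `IsKnown()`, while option B removes the branch entirely, which is only safe if the receiving side treats 0 and unset identically, as the comment above suggests.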
  // Value representing that a timing event never occurred or is unknown.
  static constexpr uint64_t kUnknownDuration = 0;
};

// DurationMillis that tracks if an archive was extracted.
struct ExtractionDurationMillis : DurationMillis {
I wonder if we really need this class: we either did an extraction and know the time it took, or we didn't and we don't; we never inspect `archive_extracted`, except in tests. Wouldn't a `DurationMillis` suffice, with "unknown" signifying "not extracted"?
@@ -731,7 +732,7 @@ LockHandle AcquireLock(const std::string& name, const blaze_util::Path& path,
   }
 }

-*wait_time = elapsed_time;
+*wait_time = elapsed_time->millis;
Returning a `DurationMillis` through an out-parameter seems less awkward.
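One way to read this suggestion, sketched under assumptions: the out-parameter carries the whole `DurationMillis` rather than a raw `uint64_t`, so the caller never unwraps `->millis`. `AcquireLockSketch` and its timestamp arguments are hypothetical stand-ins, not the real `AcquireLock` signature.

```cpp
#include <cstdint>

// Hypothetical stand-in for the client's DurationMillis.
struct DurationMillis {
  uint64_t millis = 0;
};

// Hypothetical lock routine: hands the elapsed wait back as a
// DurationMillis through the out-parameter. Always reports success here.
bool AcquireLockSketch(uint64_t start_ms, uint64_t end_ms,
                       DurationMillis* wait_time) {
  DurationMillis elapsed;
  // Clamp to 0 if the readings are inverted, as in the PR description.
  elapsed.millis = end_ms >= start_ms ? end_ms - start_ms : 0;
  *wait_time = elapsed;
  return true;
}
```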