-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial commit for cpu stats #63
base: master
Are you sure you want to change the base?
Conversation
Codecov Report
@@ Coverage Diff @@
## master #63 +/- ##
==========================================
- Coverage 53.08% 51.58% -1.51%
==========================================
Files 70 71 +1
Lines 6403 6477 +74
==========================================
- Hits 3399 3341 -58
- Misses 2724 2857 +133
+ Partials 280 279 -1
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite neat. I have some nitpicks about variable names but otherwise, it looks good.
metrics/replicacpumem.go
Outdated
} | ||
|
||
// CPUMemStat measures CPU usage and Memory usage and record in the metric logs. | ||
type CPUMemStat struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think renaming it to just CPUMem
would be better.
metrics/replicacpumem.go
Outdated
c.tick(event.(types.TickEvent)) | ||
}) | ||
c.mods.Logger().Info("CPU-Memory stats metric enabled") | ||
// Percent with 0 interval returns 0 usage when called first time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment should be more descriptive. Maybe:
The cpu.Percent function returns the CPU usage since the last call when called with an interval of 0.
This initial call ensures that the first measurement of the CPU usage is nonzero.
metrics/replicacpumem.go
Outdated
} | ||
|
||
// getCPUsage Method returns the average CPU per core and the number of cores, including logical ones. | ||
func (c *CPUMemStat) getCPUsage() (float64, uint32) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename this method to getCPUPercentage
.
metrics/replicacpumem.go
Outdated
// tick method is invoked periodically based on the configured measuring interval of metrics | ||
func (c *CPUMemStat) tick(_ types.TickEvent) { | ||
now := time.Now() | ||
cpusage, cores := c.getCPUsage() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure to use camelCase
for variable names. Change to cpuUsage
, availableMem
, memUsage
metrics/types/types.proto
Outdated
|
||
message CPUMemoryStats { | ||
Event Event = 1; | ||
double CPUsagePercentage = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either change CPUsagePercentage to CPUUsagePercentage or drop "Usage" from both.
I.E. CPUPercentage
and MemoryPercentage
@@ -39,3 +39,11 @@ message ViewTimeouts { | |||
// Number of view timeouts. | |||
uint64 Timeouts = 3; | |||
} | |||
|
|||
message CPUMemoryStats { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest renaming this to CPUMemoryMeasurement
to align with the Throughput
and Latency
variants.
}) | ||
} | ||
|
||
// CPUMem measures CPU usage and Memory usage and record in the metric logs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CPUMem measures CPU and memory usage and records these in the metrics logs.
"github.com/shirou/gopsutil/v3/mem" | ||
) | ||
|
||
// CPUMem metics measures the percentage of cpu and memory utilization on the node. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is kind of important info for a user. As is, it will not be shown in the package documentation or elsewhere except in the source code. Please move this doc to the metrics
package level, and integrate it with other documentation for the different metrics supported. Currently, the metrics/doc.go
file contains general information about the package, but it could be expanded with more paragraphs on the specific metrics supported by the package.
Note: Some typos in this text: metic(s)
-> metric(s)
@@ -0,0 +1,84 @@ | |||
package metrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this file covers both replica
and client
CPU and memory metrics, I suggest renaming the file to cpumem.go
instead.
} | ||
} | ||
|
||
// getCPUPercentage Method returns the average CPU per core and the number of cores, including logical ones. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// getCPUPercentage returns the average CPU per core and the number of logical cores.
// tick method is invoked periodically based on the configured measuring interval of metrics | ||
func (c *CPUMem) tick(_ types.TickEvent) { | ||
now := time.Now() | ||
cpu, cores := c.getCPUPercentage() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't need the two helper methods getCPUPercentage()
and getMemoryPercentage()
. The code would be easier to read if they were just inlined here.
|
||
// tick method is invoked periodically based on the configured measuring interval of metrics | ||
func (c *CPUMem) tick(_ types.TickEvent) { | ||
now := time.Now() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now
is only used on line 77. You can just call time.Now()
there.
c.mods.EventLoop().RegisterObserver(types.TickEvent{}, func(event interface{}) { | ||
c.tick(event.(types.TickEvent)) | ||
}) | ||
c.mods.Logger().Info("CPU-Memory stats metric enabled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CPU-Memory metric enabled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been here for a long time, and I finally reviewed it... 👍🏼
Anyway, I think it is better to split this into two separate metrics, one for memory and another for CPU. While they might be used together, they require two different "calls" to the runtime API, and thus it doesn't seem logical to me to combine them into one. Also, it prevents you from enabling only one of them. It also makes naming things more awkward.
Could we drop the dependency in favor of the new metrics API in stdlib. See runtime/metrics. Or maybe this makes the whole PR obsolete? |
Arian wanted to see the amount of CPU and memory saved if some of the replicas chose not to participate in the Handel. I thought maybe it is interesting in general hotstuff experiments to understand the CPU and memory usage.
Did not clean up the code, and please let me know if the direction is fine.