Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial commit for cpu stats #63

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Initial commit for cpu stats #63

wants to merge 5 commits into from

Conversation

hanish520
Copy link
Contributor

Arian wanted to see the amount of CPU and memory saved if some of the replicas chose not to participate in the Handel. I thought maybe it is interesting in general hotstuff experiments to understand the CPU and memory usage.
Did not clean up the code, and please let me know if the direction is fine.

@hanish520 hanish520 requested review from meling and johningve June 5, 2022 19:26
@codecov-commenter
Copy link

codecov-commenter commented Jun 5, 2022

Codecov Report

Merging #63 (035df7d) into master (ce18e82) will decrease coverage by 1.50%.
The diff coverage is 2.04%.

@@            Coverage Diff             @@
##           master      #63      +/-   ##
==========================================
- Coverage   53.08%   51.58%   -1.51%     
==========================================
  Files          70       71       +1     
  Lines        6403     6477      +74     
==========================================
- Hits         3399     3341      -58     
- Misses       2724     2857     +133     
+ Partials      280      279       -1     
Flag Coverage Δ
unittests 51.58% <2.04%> (-1.51%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
metrics/replicacpumem.go 2.04% <2.04%> (ø)
internal/proto/hotstuffpb/convert.go 76.47% <0.00%> (-16.58%) ⬇️
backend/config.go 48.96% <0.00%> (-15.04%) ⬇️
modules/registry.go 42.85% <0.00%> (-5.59%) ⬇️
backend/server.go 40.74% <0.00%> (-5.56%) ⬇️
consensus/types.go 62.12% <0.00%> (-5.31%) ⬇️
twins/network.go 81.31% <0.00%> (-4.48%) ⬇️
synchronizer/viewduration.go 27.11% <0.00%> (-3.39%) ⬇️
blockchain/blockchain.go 88.77% <0.00%> (-2.05%) ⬇️
crypto/bls12/bls12.go 47.29% <0.00%> (-1.81%) ⬇️
... and 12 more

Help us with your feedback. Take ten seconds to tell us how you rate us.

Copy link
Member

@johningve johningve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite neat. I have some nitpicks about variable names but otherwise, it looks good.

}

// CPUMemStat measures CPU usage and Memory usage and record in the metric logs.
type CPUMemStat struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think renaming it to just CPUMem would be better.

c.tick(event.(types.TickEvent))
})
c.mods.Logger().Info("CPU-Memory stats metric enabled")
// Percent with 0 interval returns 0 usage when called first time.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment should be more descriptive. Maybe:

The cpu.Percent function returns the CPU usage since the last call when called with an interval of 0.
This initial call ensures that the first measurement of the CPU usage is nonzero.

}

// getCPUsage Method returns the average CPU per core and the number of cores, including logical ones.
func (c *CPUMemStat) getCPUsage() (float64, uint32) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rename this method to getCPUPercentage.

// tick method is invoked periodically based on the configured measuring interval of metrics
func (c *CPUMemStat) tick(_ types.TickEvent) {
now := time.Now()
cpusage, cores := c.getCPUsage()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure to use camelCase for variable names. Change to cpuUsage, availableMem, memUsage


message CPUMemoryStats {
Event Event = 1;
double CPUsagePercentage = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either change CPUsagePercentage to CPUUsagePercentage or drop "Usage" from both.
I.E. CPUPercentage and MemoryPercentage

@@ -39,3 +39,11 @@ message ViewTimeouts {
// Number of view timeouts.
uint64 Timeouts = 3;
}

message CPUMemoryStats {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest renaming this to CPUMemoryMeasurement to align with the Throughput and Latency variants.

})
}

// CPUMem measures CPU usage and Memory usage and record in the metric logs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CPUMem measures CPU and memory usage and records these in the metrics logs.

"github.com/shirou/gopsutil/v3/mem"
)

// CPUMem metics measures the percentage of cpu and memory utilization on the node.
Copy link
Member

@meling meling Sep 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of important info for a user. As is, it will not be shown in the package documentation or elsewhere except in the source code. Please move this doc to the metrics package level, and integrate it with other documentation for the different metrics supported. Currently, the metrics/doc.go file contains general information about the package, but it could be expanded with more paragraphs on the specific metrics supported by the package.

Note: Some typos in this text: metic(s) -> metric(s)

@@ -0,0 +1,84 @@
package metrics
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this file covers both replica and client CPU and memory metrics, I suggest renaming the file to cpumem.go instead.

}
}

// getCPUPercentage Method returns the average CPU per core and the number of cores, including logical ones.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// getCPUPercentage returns the average CPU per core and the number of logical cores.

// tick method is invoked periodically based on the configured measuring interval of metrics
func (c *CPUMem) tick(_ types.TickEvent) {
now := time.Now()
cpu, cores := c.getCPUPercentage()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need the two helper methods getCPUPercentage() and getMemoryPercentage(). The code would be easier to read if they were just inlined here.


// tick method is invoked periodically based on the configured measuring interval of metrics
func (c *CPUMem) tick(_ types.TickEvent) {
now := time.Now()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now is only used on line 77. You can just call time.Now() there.

c.mods.EventLoop().RegisterObserver(types.TickEvent{}, func(event interface{}) {
c.tick(event.(types.TickEvent))
})
c.mods.Logger().Info("CPU-Memory stats metric enabled")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CPU-Memory metric enabled

Copy link
Member

@meling meling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been here for a long time, and I finally reviewed it... 👍🏼

Anyway, I think it is better to split this into two separate metrics, one for memory and another for CPU. While they might be used together, they require two different "calls" to the runtime API, and thus it doesn't seem logical to me to combine them into one. Also, it prevents you from enabling only one of them. It also makes naming things more awkward.

@meling
Copy link
Member

meling commented Oct 6, 2022

Could we drop the dependency in favor of the new metrics API in stdlib. See runtime/metrics. Or maybe this makes the whole PR obsolete?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants