Added -R to check-process.rb for RSS checks #46

Closed
wants to merge 6 commits into from
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,9 @@ This project adheres to [Semantic Versioning](http://semver.org/).
This CHANGELOG follows the format listed at [Keep A Changelog](http://keepachangelog.com/)

## [Unreleased]
+### Added
+- check-process.rb: added `-R` to allow checking for processes whose RSS exceeds some value

## [2.3.0] - 2017-05-29
### Changed
- metrics-per-process.py: Use memory_info() in psutil versions greater than 4.0.0, as memory_info_ex() was deprecated.
3 changes: 1 addition & 2 deletions Rakefile
@@ -29,8 +29,7 @@ desc 'Test for binstubs'
task :check_binstubs do
bin_list = Gem::Specification.load('sensu-plugins-process-checks.gemspec').executables
bin_list.each do |b|
-`which #{ b }`
-unless $CHILD_STATUS.success?
+unless system("which #{b} > /dev/null")
puts "#{b} was not a binstub"
exit
end
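The one-line replacement in the Rakefile relies on the return value of `Kernel#system`. A minimal sketch of that behaviour follows; it uses the running Ruby interpreter as the child command so that it does not depend on `which` being installed, and the made-up command name in the last call is purely illustrative:

```ruby
require 'rbconfig'

# Kernel#system returns true when the command exits with status 0,
# false when it exits with a non-zero status, and nil when the command
# could not be executed at all.
succeeded = system(RbConfig.ruby, '-e', 'exit 0')
failed    = system(RbConfig.ruby, '-e', 'exit 3')
missing   = system('no-such-command-for-this-sketch') # hypothetical name

p succeeded
p failed
p missing
```

Because `unless` treats both `false` and `nil` as falsy, a missing or failing command takes the same branch in the Rakefile task.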
15 changes: 14 additions & 1 deletion bin/check-process.rb
@@ -117,9 +117,17 @@ class CheckProcess < Sensu::Plugin::Check::CLI
description: 'Trigger on a Virtual Memory size is bigger than this',
proc: proc(&:to_i)

+# Alert if RSS is under this value
option :rss,
short: '-r RSS',
long: '--resident-set-size RSS',
description: 'Trigger on a Resident Set size is smaller than this',
proc: proc(&:to_i)

+# Alert if RSS is over this value
+option :over_rss,
+short: '-R RSS',
+long: '--over-resident-set-size RSS',
+description: 'Trigger on a Resident Set size is bigger than this',
+proc: proc(&:to_i)

@@ -209,7 +217,7 @@ def line_to_hash(line, *cols)
#
def on_cygwin?
# #YELLOW
-`ps -W 2>&1`; $CHILD_STATUS.exitstatus == 0 # rubocop:disable Semicolon
+`ps -W 2>&1`; $CHILD_STATUS.exitstatus.zero? # rubocop:disable Semicolon
end

# Acquire all the proceeses on a system for further analysis
@@ -249,6 +257,7 @@ def cputime_to_csec(time)

# The main function
#
+# rubocop:disable Metrics/AbcSize
def run
procs = acquire_procs

@@ -261,7 +270,10 @@ def run
procs.reject! { |p| p[:command] =~ /#{config[:exclude_pat]}/ } if config[:exclude_pat]
procs.reject! { |p| p[:command] !~ /#{config[:cmd_pat]}/ } if config[:cmd_pat]
procs.select! { |p| p[:vsz].to_f > config[:vsz] } if config[:vsz]
+# Ensure RSS is over this value
procs.select! { |p| p[:rss].to_f > config[:rss] } if config[:rss]
+# Ensure RSS is under this value
Member

This comment is wrong; swap it with the comment above.

Author

I can clarify (or remove entirely) the comments I added there, but I'm not sure they are incorrect. Looking at both RSS-related selects:

# Ensure RSS is over this value
procs.select! { |p| p[:rss].to_f > config[:rss] } if config[:rss]
# Ensure RSS is under this value
procs.select! { |p| p[:over_rss].to_f < config[:over_rss] } if config[:over_rss]

The first (and original) selects only those processes (p) whose reported RSS (:rss) is greater than the supplied value in config (thus my comment of # Ensure RSS is over this value).

The second selects only those processes whose reported RSS is less than the supplied value in the config (thus my comment of # Ensure RSS is under this value). Admittedly, pairing the word "under" with the config key :over_rss is not very clear, but my intention with the new -R is to alert when the RSS goes over the value you specify on the command line, so I think it still fits.

Author

That said, my last comment made me see that the process (p) key should be :rss for that comparison. You'll see another commit in just a second.
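To make the key mix-up concrete, here is a minimal sketch with made-up process rows (the `procs` data and the threshold are illustrative and not taken from the PR). Process rows have no `:over_rss` key, so `p[:over_rss]` is `nil` and `nil.to_f` is `0.0`; the first version of the filter therefore keeps every process for any positive threshold:

```ruby
# Hypothetical sample data standing in for the rows returned by acquire_procs.
procs = [
  { command: 'small', rss: '1000' },
  { command: 'large', rss: '900000' }
]
config = { over_rss: 5000 } # illustrative threshold

# First version from the PR: reads a key that process rows do not have.
# p[:over_rss] is nil, nil.to_f is 0.0, and 0.0 < 5000 is always true.
wrong_key = procs.select { |p| p[:over_rss].to_f < config[:over_rss] }

# Corrected version: compare the reported RSS against the -R threshold.
right_key = procs.select { |p| p[:rss].to_f < config[:over_rss] }

puts wrong_key.map { |p| p[:command] }.inspect
puts right_key.map { |p| p[:command] }.inspect
```

With the wrong key both rows survive the filter; with `:rss` only the row below the threshold does.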

Member @majormoses, Jun 26, 2017

That must have been what threw me off; it makes perfect sense now.

Member @majormoses, Jun 28, 2017

This still looks wrong to me:

# Alert if RSS is over this value
 +  option :over_rss,
+    # Ensure RSS is over this value
      procs.select! { |p| p[:rss].to_f > config[:rss] } if config[:rss]
+    # Ensure RSS is under this value
+    procs.select! { |p| p[:rss].to_f < config[:over_rss] } if config[:over_rss]

I would write it like this, which makes more sense to me:

# Alert if RSS is under this value
 +  option :rss_under,
# Alert if RSS is over this value
 +  option :rss_over,
+    # Ensure RSS is over this value
      procs.select! { |p| p[:rss].to_f > config[:rss_over] } if config[:rss_over]
+    # Ensure RSS is under this value
+    procs.select! { |p| p[:rss].to_f < config[:rss_under] } if config[:rss_under]
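As a rough illustration of the naming proposed above, the following sketch wraps the two filters in a helper. The helper name `filter_by_rss` and the sample rows are hypothetical; only the `:rss_over` and `:rss_under` keys and the two comparisons come from the suggestion:

```ruby
# Hypothetical helper, not part of check-process.rb: each option name states
# the direction of the comparison it applies.
def filter_by_rss(procs, config)
  procs = procs.select { |p| p[:rss].to_f > config[:rss_over] } if config[:rss_over]
  procs = procs.select { |p| p[:rss].to_f < config[:rss_under] } if config[:rss_under]
  procs
end

# Made-up process rows; RSS values are illustrative.
sample = [
  { command: 'a', rss: '100' },
  { command: 'b', rss: '5000' },
  { command: 'c', rss: '90000' }
]

over  = filter_by_rss(sample, { rss_over: 1000 })
under = filter_by_rss(sample, { rss_under: 1000 })
both  = filter_by_rss(sample, { rss_over: 1000, rss_under: 50_000 })

puts over.map { |p| p[:command] }.inspect
puts under.map { |p| p[:command] }.inspect
puts both.map { |p| p[:command] }.inspect
```

Setting both keys selects a band between the two values, which the `:rss` / `:over_rss` pairing makes harder to read.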

Member

I am wondering if we should just always log a message saying something to this effect:

INFO: checking if there are (greater|less) than (warn|crit) thresholds
CheckProcess OK: Found 6 matching processes; cmd /google/; rss > 368381
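The second line of that sample output could be assembled the same way `run` builds `msg`. This is only a sketch: the `summary` helper is hypothetical, and the `cmd` fragment keyed on `:cmd_pat` is an assumption based on the sample line rather than on code shown in this diff:

```ruby
# Hypothetical helper: appends one fragment per active filter, each spelling
# out the direction of its comparison.
def summary(count, config)
  msg = "Found #{count} matching processes"
  msg += "; cmd /#{config[:cmd_pat]}/" if config[:cmd_pat] # assumed fragment
  msg += "; vsz > #{config[:vsz]}" if config[:vsz]
  msg += "; rss > #{config[:rss]}" if config[:rss]
  msg += "; rss < #{config[:over_rss]}" if config[:over_rss]
  msg
end

line = "CheckProcess OK: #{summary(6, { cmd_pat: 'google', rss: 368_381 })}"
puts line
```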

Member

Actually, I am almost tempted to say there should be no defaults for any of the crit/warn thresholds if we want to stay flexible and keep it all in the same check. Let the user tell us what they want to check: from what I can see, the two scenarios are mutually exclusive, so we can't have sane defaults for both.

Member

This would obviously be a breaking change. @eheydrick, thoughts?

Author

I'm thankful that I'm not the only one who finds this check's options less than intuitive. I'm happy to postpone or cancel this PR until this discussion is concluded. I'll probably just write a really simple check to do just what I need for now. I'm also happy to help contribute to a library if that would be valuable.

Member

You can do what you need with the existing features as outlined above; you just need to set -w and -c to what they should be. Let's close out this PR and continue the discussion in the issue. I really do want to improve this, but what is proposed is already available, so unless we make things more intuitive I don't see anything that should be addressed here.
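A minimal sketch of that idea, under a stated assumption: `-w` and `-c` are treated here as upper bounds on the number of processes that survive the filters (the thread does not spell out their semantics). The `status_for` helper and the sample rows are hypothetical:

```ruby
# Hypothetical stand-in for the check's count thresholds.
# Assumption: -w and -c are upper bounds on the number of matching processes.
def status_for(count, warn_over: nil, crit_over: nil)
  return :critical if crit_over && count > crit_over
  return :warning if warn_over && count > warn_over

  :ok
end

# Made-up process rows; RSS values are illustrative.
procs = [
  { command: 'a', rss: '100' },
  { command: 'b', rss: '400000' },
  { command: 'c', rss: '900000' }
]

# Same comparison as the existing :rss filter: keep processes above the value.
matching = procs.select { |p| p[:rss].to_f > 368_381 }

puts status_for(matching.size, warn_over: 0, crit_over: 1) # two matches exceed both bounds
puts status_for(0, warn_over: 0, crit_over: 1)             # no process above the RSS value
```

Under that assumption, filtering for processes above an RSS value and alerting when any match gives the "RSS over N" behaviour without a new option.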

+procs.select! { |p| p[:rss].to_f < config[:over_rss] } if config[:over_rss]
procs.select! { |p| p[:cpu].to_f > config[:cpu_utilization] } if config[:cpu_utilization]
procs.select! { |p| p[:thcount].to_i > config[:thcount] } if config[:thcount]
procs.reject! { |p| etime_to_esec(p[:etime]) >= config[:esec_under] } if config[:esec_under]
@@ -277,6 +289,7 @@ def run
msg += "; user #{config[:user].join(',')}" if config[:user]
msg += "; vsz > #{config[:vsz]}" if config[:vsz]
msg += "; rss > #{config[:rss]}" if config[:rss]
+msg += "; rss < #{config[:over_rss]}" if config[:over_rss]
msg += "; cpu > #{config[:cpu_utilization]}" if config[:cpu_utilization]
msg += "; thcount > #{config[:thcount]}" if config[:thcount]
msg += "; esec < #{config[:esec_under]}" if config[:esec_under]