This repository has been archived by the owner on Apr 30, 2024. It is now read-only.
Mauricio Teixeira edited this page Sep 21, 2018 · 6 revisions

Welcome to the tower-nagios-integration wiki!

This repository will contain various documentation and scripts that help integrate Ansible Tower with Nagios.

tower_handler.py

A script to be used as a Nagios event handler to trigger jobs in Ansible Tower.

Software requirements

  • Python 2.7
  • Nagios 3.5 or higher
  • Ansible Tower 3.2 or higher

Configuration requirements

By the time you arrive here, you may already have everything you need to run this script. The requirements are listed below, but this document does not explain how to meet them. Please refer to the documentation of each technology involved.

  • Ansible Tower
    • Username/password to be used by Nagios.
    • At least one inventory and one job template.
    • It is highly advisable, though not required, that your job template has the "Prompt on launch" check box marked for the inventory.
  • Nagios
    • tower-cli installed and configured with the proper credentials.
      • HINT: On RHEL7 you can install python2-ansible-tower-cli from EPEL

Installation

Copy tower_handler.py into the directory where your event handler scripts should run (as defined by your configuration).

Test your environment

First of all, make sure tower-cli is working properly. The minimum viable test is this:

# tower-cli job list
===== ============ ======================== ========== ======= 
 id   job_template         created            status   elapsed 
===== ============ ======================== ========== ======= 
    1           1  2018-10-03T18:30:00.000Z successful  42.000
===== ============ ======================== ========== =======
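
The same check can be scripted, for example to run it before enabling the handler. The following is a minimal sketch, assuming that tower-cli exits with a non-zero status when it cannot reach Tower or is misconfigured; `command_succeeds` is a hypothetical helper name, not part of tower_handler.py.

```python
import subprocess


def command_succeeds(argv):
    """Run argv and report whether it exited with status 0."""
    try:
        return subprocess.call(argv) == 0
    except OSError:
        # The executable was not found or could not be started.
        return False


# Assumed usage: the argument list mirrors the "tower-cli job list" check above.
# tower_cli_ok = command_succeeds(["tower-cli", "job", "list"])
```

A `False` result only indicates that the command failed or is missing; running `tower-cli job list` by hand will show the actual error message.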

To confirm that the handler itself is working, you can trigger a job from the command line:

# /path/to/tower_handler.py --template <my_template> --inventory <my_inventory> --attempt 2

If successful, the script produces no output, but you will see a new job in the Jobs tab of Ansible Tower (or in the job list, if you repeat the tower-cli command above).

Command line options

Although this script was written to be used as a Nagios event handler, it can also be run from the command line (which is a little more complicated than using tower-cli directly).

It is important to know all the available command line options, because you will need them to define your own Nagios handlers. How you combine those options determines how easy or hard the handler is to use.

# /path/to/tower_handler.py --help
usage: tower_handler.py [-h] --template TEMPLATE --inventory INVENTORY
                        [--playbook PLAYBOOK] [--extra_vars EXTRA_VARS]
                        [--limit LIMIT] [--state STATE] [--attempt ATTEMPT]
                        [--downtime DOWNTIME] [--service SERVICE]
                        [--hostname HOSTNAME] [--warning]

optional arguments:
  -h, --help            show this help message and exit
  --template TEMPLATE   Job template (number or name)
  --inventory INVENTORY
                        Inventory (number or name)
  --playbook PLAYBOOK   Playbook to run (yaml file inside template)
  --extra_vars EXTRA_VARS
                        Extra variables (JSON)
  --limit LIMIT         Limit run to these hosts (group name, or comma
                        separated hosts)
  --state STATE         Nagios check state
  --attempt ATTEMPT     Nagios check attempt
  --downtime DOWNTIME   Nagios downtime check
  --service SERVICE     Nagios alerting service
  --hostname HOSTNAME   Nagios alerting hostname
  --warning             Trigger on WARNING (otherwise just CRITICAL and
                        UNKNOWN)
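
The help output above is in the format produced by Python's argparse module. The following sketch reconstructs a parser that would accept the same options. It is inferred from the help text only and is not the script's actual source: the function name `build_parser` is hypothetical, and leaving every value as a string (including `--attempt`) is an assumption.

```python
import argparse


def build_parser():
    """Build a parser matching the options listed in the help output."""
    parser = argparse.ArgumentParser(prog="tower_handler.py")
    # The usage line shows these two without brackets, so they are required.
    parser.add_argument("--template", required=True,
                        help="Job template (number or name)")
    parser.add_argument("--inventory", required=True,
                        help="Inventory (number or name)")
    parser.add_argument("--playbook",
                        help="Playbook to run (yaml file inside template)")
    parser.add_argument("--extra_vars", help="Extra variables (JSON)")
    parser.add_argument("--limit",
                        help="Limit run to these hosts (group name, or "
                             "comma separated hosts)")
    parser.add_argument("--state", help="Nagios check state")
    parser.add_argument("--attempt", help="Nagios check attempt")
    parser.add_argument("--downtime", help="Nagios downtime check")
    parser.add_argument("--service", help="Nagios alerting service")
    parser.add_argument("--hostname", help="Nagios alerting hostname")
    # A flag without a value: present means True, absent means False.
    parser.add_argument("--warning", action="store_true",
                        help="Trigger on WARNING (otherwise just CRITICAL "
                             "and UNKNOWN)")
    return parser


# Same arguments as the command line test shown earlier.
args = build_parser().parse_args(
    ["--template", "My Template", "--inventory", "My Inventory",
     "--attempt", "2"])
```

Options that are not passed come back as `None` (or `False` for `--warning`), which is why a Nagios command definition can omit the ones it does not need.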

Nagios configuration

There are many ways to configure Nagios to use this script. Here are some suggestions.

Example 1 - short call to the handler, wide impact

This triggers a job run against all the hosts in the specified inventory.

/etc/nagios/conf.d/eventhandlers.cfg
define command {
    command_name        tower-handler-min
    # when playbook does not require extra_vars, and you want to run on full inventory
    command_line        $HANDLERS$/tower_handler.py --state '$SERVICESTATE$' --attempt '$SERVICEATTEMPT$' --downtime '$SERVICEDOWNTIME$' --service '$SERVICEDESC$' --hostname '$HOSTADDRESS$' --template '$ARG1$' --inventory '$ARG2$'
}

/etc/nagios/hosts.d/server01.example.com.cfg
define service {
    use                         generic-service
    host_name                   server01.example.com
    service_description         MyAppService
    contact_groups              it-production
    check_command               check_myappservice
    event_handler               tower-handler-min!My Template!My Inventory
}
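
In the event_handler line, Nagios treats the text before the first `!` as the command name and each following `!`-separated field as `$ARG1$`, `$ARG2$`, and so on. The sketch below imitates that substitution to show what the handler ends up receiving. It is a simplified illustration, not Nagios code: `expand_args` is a hypothetical name, and escaping of `!` and the handling of unset macros are ignored.

```python
def expand_args(event_handler, command_line):
    """Substitute $ARGn$ macros the way the examples on this page rely on."""
    parts = event_handler.split("!")
    command_name, arguments = parts[0], parts[1:]
    for index, value in enumerate(arguments, start=1):
        # $ARG1$ is the first field after the command name, and so on.
        command_line = command_line.replace("$ARG%d$" % index, value)
    return command_name, command_line


name, expanded = expand_args(
    "tower-handler-min!My Template!My Inventory",
    "tower_handler.py --template '$ARG1$' --inventory '$ARG2$'")
# name     -> "tower-handler-min"
# expanded -> "tower_handler.py --template 'My Template' --inventory 'My Inventory'"
```

The single quotes around each macro in the command definition keep values that contain spaces, such as "My Template", together as one argument.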

Example 2 - longer call to the handler, more precise action

This passes all parameters in the handler call, which provides more information to the job template and allows for more precise action.

/etc/nagios/conf.d/eventhandlers.cfg
define command {
    command_name        tower-handler-full
    command_line        $HANDLERS$/tower_handler.py --state '$SERVICESTATE$' --attempt '$SERVICEATTEMPT$' --downtime '$SERVICEDOWNTIME$' --service '$SERVICEDESC$' --hostname '$HOSTADDRESS$' --template '$ARG1$' --inventory '$ARG2$' --extra_vars '$ARG3$' --limit '$ARG4$'
}

/etc/nagios/hosts.d/server01.example.com.cfg
define service {
    use                         generic-service
    host_name                   server01.example.com
    service_description         MyAppService
    contact_groups              it-production
    check_command               check_myappservice
    event_handler               tower-handler-full!My Template!My Inventory!my_variable: value!<fqdn>
}

Note: in this case, <fqdn> can be either the host itself, or a totally different host, as long as it exists in the inventory.

Useful variations

  • Run against the host itself. By adding --limit '$HOSTADDRESS$' to the command definition, the job will run only against the host which called the handler.
  • Run in WARNING state. By default, the script only runs when the alert is in CRITICAL or UNKNOWN state. Adding --warning to the command definition allows it to trigger during a WARNING state as well.
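
The state rule described in the last item can be sketched as follows. This only illustrates the documented behaviour of `--state` and `--warning`; `should_trigger` is a hypothetical name, and any additional checks the script may perform on `--attempt` or `--downtime` are not covered here.

```python
def should_trigger(state, warning=False):
    """Return True if a job should be launched for this Nagios state."""
    # Default behaviour: only CRITICAL and UNKNOWN launch a job.
    triggering_states = {"CRITICAL", "UNKNOWN"}
    if warning:
        # --warning adds WARNING to the set of triggering states.
        triggering_states.add("WARNING")
    return state in triggering_states
```

With this rule, `should_trigger("WARNING")` is `False`, while `should_trigger("WARNING", warning=True)` is `True`; an `OK` state never launches a job.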