Send notifications when something goes wrong in Rancher
- Will kick your ass when a service goes down and send a message when it recovers
- Various notification mechanisms
  - slack
  - email
  - please create an issue if you need more
- Configure notification mechanisms globally or on a per-service level (per-service configuration is supported only in the .json config setup for now)
- Customize your notification messages
rancher-alarms:
  image: ndelitski/rancher-alarms
  environment:
    ALARM_SLACK_WEBHOOK_URL: https://hooks.slack.com/services/:UUID
  labels:
    io.rancher.container.create_agent: true
    io.rancher.container.agent.role: environment
How to create a Slack Webhook URL
NOTE: Including the Rancher agent labels is crucial; otherwise you need to provide Rancher credentials manually with the RANCHER_* variables
docker run \
-d \
-e RANCHER_ADDRESS=rancher.yourdomain.com \
-e RANCHER_ACCESS_KEY=ACCESS-KEY \
-e RANCHER_SECRET_KEY=SECRET-KEY \
-e RANCHER_PROJECT_ID=1a8 \
-e ALARM_SLACK_WEBHOOK_URL=https://hooks.slack.com/services/YOUR_SLACK_UUID \
--name rancher-alarms \
ndelitski/rancher-alarms
On startup, rancher-alarms gets the list of services and instantiates a healthcheck monitor for each service that is in a running state. Removed, purged, and similar services are ignored.
The list of healthcheck monitors is refreshed every pollServicesInterval milliseconds. When a service is removed, it is no longer monitored.
When a service transitions to a degraded state, all targets are invoked to process the notification(s).
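The monitoring behavior described above can be sketched roughly as follows. This is an illustrative Python model, not the project's code: the `HealthMonitor` class and its method names are hypothetical, and it assumes a service is reported as degraded after `unhealthy_threshold` consecutive failed checks and as recovered after `healthy_threshold` consecutive good ones, mirroring the healthyThreshold and unhealthyThreshold config options.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class HealthMonitor:
    """Hypothetical per-service monitor; all names are illustrative only."""
    service_name: str
    healthy_threshold: int = 2
    unhealthy_threshold: int = 3
    state: str = "healthy"
    _streak: int = 0  # consecutive checks that disagree with `state`
    targets: List[Callable[[str, str], None]] = field(default_factory=list)

    def observe(self, is_healthy: bool) -> None:
        """Feed one poll result; notify every target on a state change."""
        agrees = is_healthy == (self.state == "healthy")
        if agrees:
            # The check confirms the current state, so reset the streak.
            self._streak = 0
            return
        self._streak += 1
        needed = self.healthy_threshold if is_healthy else self.unhealthy_threshold
        if self._streak >= needed:
            self.state = "healthy" if is_healthy else "degraded"
            self._streak = 0
            for notify in self.targets:
                notify(self.service_name, self.state)


# Usage: collect notifications in a list instead of sending them anywhere.
sent = []
monitor = HealthMonitor("frontend", targets=[lambda name, state: sent.append((name, state))])
for result in [True, False, False, False, True, True]:
    monitor.observe(result)
```

With the default thresholds, three failed checks in a row produce one "degraded" notification, and two good checks afterwards produce one recovery notification.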
rancher-alarms:
  image: ndelitski/rancher-alarms
  environment:
    RANCHER_ADDRESS: your-rancher.com
    ALARM_SLACK_WEBHOOK_URL: https://hooks.slack.com/services/...
For more docker-compose examples, see the examples folder
The RANCHER_* variables can be omitted if you are running inside a Rancher environment (the service must be started as a Rancher agent, though)
RANCHER_ADDRESS
RANCHER_PROJECT_ID
RANCHER_ACCESS_KEY
RANCHER_SECRET_KEY
ALARM_POLL_INTERVAL
ALARM_MONITOR_INTERVAL
ALARM_MONITOR_HEALTHY_THRESHOLD
ALARM_MONITOR_UNHEALTHY_THRESHOLD
ALARM_FILTER
ALARM_EMAIL_ADDRESSES
ALARM_EMAIL_USER
ALARM_EMAIL_PASS
ALARM_EMAIL_SSL
ALARM_EMAIL_SMTP_HOST
ALARM_EMAIL_SMTP_PORT
ALARM_EMAIL_FROM
ALARM_EMAIL_SUBJECT
ALARM_EMAIL_TEMPLATE
ALARM_EMAIL_TEMPLATE_FILE
ALARM_SLACK_WEBHOOK_URL
ALARM_SLACK_CHANNEL
ALARM_SLACK_BOTNAME
ALARM_SLACK_TEMPLATE
ALARM_SLACK_TEMPLATE_FILE
See the examples of environment-based configuration in the docker-compose files
{
  "rancher": {
    "address": "rancher-host:port",
    "auth": {
      "accessKey": "<ACCESS_KEY>",
      "secretKey": "<KEEP_YOUR_SECRETS_SAFE>"
    },
    "projectId": "1a5"
  },
  "pollServicesInterval": 10000,
  "filter": [
    "app/*"
  ],
  "notifications": {
    "*": {
      "targets": {
        "email": {
          "recipients": [
            "[email protected]"
          ]
        }
      },
      "healthcheck": {
        "pollInterval": 5000,
        "healthyThreshold": 2,
        "unhealthyThreshold": 3
      }
    },
    "frontend": {
      "targets": {
        "email": {
          "recipients": [
            "[email protected]"
          ]
        }
      }
    }
  },
  "targets": {
    "email": {
      "smtp": {
        "from": "<Alarm Service> [email protected]",
        "auth": {
          "user": "[email protected]",
          "password": "Str0ngPa$$"
        },
        "host": "smtp.gmail.com",
        "secureConnection": true,
        "port": 465
      }
    },
    "slack": {
      "webhookUrl": "https://hooks.slack.com/services/YOUR_SLACK_UUID",
      "botName": "rancher-alarm",
      "channel": "#devops"
    }
  }
}
- rancher: Rancher API settings. Required.
- pollServicesInterval: interval in ms between fetches of the list of services. Required.
- filter: whitelist filter for stack/service names in the environment. A list of string values. Every string is a RegExp expression, so you can use something like frontend/* to match all services in a stack. Optional.
- notifications: per-service notification settings. The wildcard (*) means any service. Required.
  - healthcheck: monitoring state options. Optional. Defaults are: { pollInterval: 5000, healthyThreshold: 2, unhealthyThreshold: 3 }
  - targets: which notification targets to use. Overrides the base target settings in the root targets section. Currently each target must be an Object value. If you have nothing to override from the base settings, just use {} as the value. Optional.
- targets: base settings for each notification target. Required.
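The filter and targets options described above could behave roughly like the sketch below. It is a Python illustration under stated assumptions, not the project's implementation: `is_monitored` and `effective_targets` are hypothetical helper names, filter patterns are assumed to be applied as unanchored regular expressions to a "stack/service" string, the override merge is assumed to be shallow, and the example.com addresses are placeholders.

```python
import re
from typing import Any, Dict, List


def is_monitored(full_name: str, filters: List[str]) -> bool:
    """Return True when no filter is set or any pattern matches the name.

    Assumption: patterns are unanchored regular expressions applied
    to a "stack/service" string.
    """
    if not filters:
        return True
    return any(re.search(pattern, full_name) for pattern in filters)


def effective_targets(base: Dict[str, Dict[str, Any]],
                      overrides: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Merge per-service target overrides on top of the root targets section.

    Only targets named in `overrides` are used; {} means "use the base as is".
    Assumption: the merge is shallow (top-level keys only).
    """
    return {name: {**base.get(name, {}), **override}
            for name, override in overrides.items()}


# Placeholder settings for illustration only.
base_targets = {
    "email": {"recipients": ["ops@example.com"]},
    "slack": {"channel": "#devops", "botName": "rancher-alarm"},
}
service_targets = {"email": {"recipients": ["frontend@example.com"]}, "slack": {}}

resolved = effective_targets(base_targets, service_targets)
```

Here `resolved["slack"]` keeps the base Slack settings because its override is `{}`, while `resolved["email"]` uses the service-specific recipients. Note that under the unanchored-regex assumption, a pattern such as `app/*` is not a glob: it matches any name that contains `app`.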
- healthyState: HEALTHY or UNHEALTHY
- state: service state as it is named in the Rancher API
- prevMonitorState: rancher-alarms previous service state name
- monitorState: rancher-alarms service state name, e.g. always degraded for unhealthy
- serviceName: name of the service in Rancher
- serviceUrl: URL of the running service in the Rancher UI
- stackUrl: URL of the stack in the Rancher UI
- stackName: name of the stack in Rancher
- environmentName: name of the environment in Rancher
- environmentUrl: URL of the environment in the Rancher UI
Hey buddy! Your service #{serviceName} become #{healthyState}, direct link to the service #{serviceUrl}
More detailed examples can be found in the examples folder
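To make the #{variable} template syntax concrete, here is a minimal Python sketch of how such placeholders could be substituted. The `render` function is hypothetical and only illustrates the idea; the project's actual templating may differ, for example in how it treats unknown variables. The URL used below is made up.

```python
import re
from typing import Mapping

# Matches placeholders such as #{serviceName}.
PLACEHOLDER = re.compile(r"#\{(\w+)\}")


def render(template: str, variables: Mapping[str, str]) -> str:
    """Replace each #{name} with its value; unknown names are left untouched."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return str(variables.get(name, match.group(0)))
    return PLACEHOLDER.sub(substitute, template)


message = render(
    "Your service #{serviceName} became #{healthyState}: #{serviceUrl}",
    {
        "serviceName": "frontend",
        "healthyState": "UNHEALTHY",
        # Hypothetical URL, for illustration only.
        "serviceUrl": "https://rancher.example.com/services/frontend",
    },
)
```

Leaving unknown placeholders in place, rather than dropping them, makes typos in a template easy to spot in the delivered message.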
- [] Simplify configuration.
- [] Make more use of Rancher labels and metadata. Alternative configuration through Rancher labels/metadata (can be used in conjunction with the initial config).
- [] Run in a Rancher environment as an agent with a new label agent: true. No need to specify keys anymore!
- [] More notification mechanisms: AWS SNS, http, sms
- Support templating
- [] Test coverage. Setup drone.io
- Notify when all services operate normally after some of them were in a degraded state
- [] Refactor code
- Shrinking image size with alpine linux