The Alert API defines how events are filtered by severity and involved object, and which provider
to use for dispatching.
The following is an example of how to send alerts to Slack when Flux fails to reconcile the
flux-system namespace.
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: slack-bot
  namespace: flux-system
spec:
  type: slack
  channel: general
  address: https://slack.com/api/chat.postMessage
  secretRef:
    name: slack-bot-token
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: slack
  namespace: flux-system
spec:
  summary: "Cluster addons impacted in us-east-2"
  providerRef:
    name: slack-bot
  eventSeverity: error
  eventSources:
    - kind: GitRepository
      name: '*'
    - kind: Kustomization
      name: '*'
In the above example:
- A Provider named slack-bot is created, indicated by the Provider.metadata.name field.
- An Alert named slack is created, indicated by the Alert.metadata.name field.
- The Alert references the slack-bot provider, indicated by the Alert.spec.providerRef field.
- The notification-controller starts listening for events sent for all GitRepositories and
  Kustomizations in the flux-system namespace.
- When an event with severity error is received, the controller posts a message on the Slack
  channel from .spec.channel, containing the summary text and the reconciliation error.
You can run this example by saving the manifests into slack-alerts.yaml.
- First create a secret with the Slack bot token:
  kubectl -n flux-system create secret generic slack-bot-token --from-literal=token=xoxb-YOUR-TOKEN
- Apply the resources on the cluster:
  kubectl -n flux-system apply --server-side -f slack-alerts.yaml
As with all other Kubernetes config, an Alert needs apiVersion, kind, and metadata fields.
The name of an Alert object must be a valid DNS subdomain name.
An Alert also needs a .spec section.
.spec.summary is an optional field to specify a short description of the
impact and affected cluster.
The summary must not exceed 255 characters.
.spec.providerRef.name is a required field to specify a name reference to a
Provider in the same namespace as the Alert.
.spec.eventSources is a required field to specify a list of references to
Flux objects for which events are forwarded to the alert provider API.
To select events issued by Flux objects, each entry in the .spec.eventSources list
must contain the following fields:
- kind is the Flux Custom Resource Kind, such as GitRepository, HelmRelease, Kustomization, etc.
- name is the Flux Custom Resource .metadata.name, or it can be set to the * wildcard.
- namespace is the Flux Custom Resource .metadata.namespace. When not specified, the Alert
  .metadata.namespace is used instead.
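The namespace default described above can be sketched as follows. This is an illustrative fragment only: the Alert name apps-alert and the Kustomization name webapp are hypothetical, and a Provider named slack-bot is assumed to exist in the apps namespace.

```yaml
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: apps-alert        # hypothetical name
  namespace: apps
spec:
  providerRef:
    name: slack-bot       # assumed to exist in the apps namespace
  eventSources:
    # namespace is omitted here, so the Alert's own namespace (apps) is used
    - kind: Kustomization
      name: webapp        # hypothetical object name
```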
To select events issued by a single Flux object, set the kind, name and namespace:
eventSources:
  - kind: GitRepository
    name: webapp
    namespace: apps
The * wildcard can be used to select events issued by all Flux objects of a particular kind
in a namespace:
eventSources:
  - kind: HelmRelease
    name: '*'
    namespace: apps
To select events issued by all Flux objects of a particular kind with specific labels:
eventSources:
  - kind: HelmRelease
    name: '*'
    namespace: apps
    matchLabels:
      team: app-dev
Note: On multi-tenant clusters, platform admins can disable cross-namespace references by
starting the controller with the --no-cross-namespace-refs=true flag.
When this flag is set, alerts can only refer to event sources in the same namespace as the alert object,
preventing tenants from subscribing to another tenant's events.
.spec.eventMetadata is an optional field for adding metadata to events dispatched by
the controller. This can be used for enhancing the context of the event. If a field
would override one already present on the original event as generated by the emitter,
the override does not happen: the original value is preserved and an info log is printed.
Add metadata fields to successful HelmRelease events:
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: <name>
spec:
  eventSources:
    - kind: HelmRelease
      name: '*'
  inclusionList:
    - ".*succeeded.*"
  eventMetadata:
    app.kubernetes.io/env: "production"
    app.kubernetes.io/cluster: "my-cluster"
    app.kubernetes.io/region: "us-east-1"
.spec.eventSeverity is an optional field to filter events based on severity. When not specified,
or when the value is set to info, all events are forwarded to the alert provider API, including errors.
To receive alerts only on errors, set the field value to error.
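As a minimal sketch of the error-only case, the fragment below forwards only error events issued by Kustomizations. The Alert name errors-only is a hypothetical placeholder, and a Provider named slack-bot is assumed to exist in the same namespace as the Alert.

```yaml
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: errors-only       # hypothetical name
spec:
  providerRef:
    name: slack-bot       # assumed to exist in the Alert's namespace
  # info (or omitting the field) would forward all events, errors included
  eventSeverity: error
  eventSources:
    - kind: Kustomization
      name: '*'
```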
.spec.exclusionList is an optional field to specify a list of regex expressions to filter
events based on message content. The event will be excluded if the message matches at least
one of the expressions in the list.
Skip alerting if the message matches a Go regex from the exclusion list:
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: <name>
spec:
  eventSources:
    - kind: GitRepository
      name: '*'
  exclusionList:
    - "waiting.*socket"
The above definition will not send alerts for transient Git clone errors like:
unable to clone 'ssh://[email protected]/v3/...', error: SSH could not read data: Error waiting on socket
.spec.inclusionList is an optional field to specify a list of regex expressions to filter
events based on message content. The event will be sent if the message matches at least one
of the expressions in the list, and discarded otherwise. If the message matches one of the
expressions in the inclusion list but also matches one of the expressions in the exclusion
list, then the event is still discarded (exclusion is stronger than inclusion).
Alert if the message matches a Go regex from the inclusion list:
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: <name>
spec:
  eventSources:
    - kind: HelmRelease
      name: '*'
  inclusionList:
    - ".*succeeded.*"
  exclusionList:
    - ".*uninstall.*"
    - ".*test.*"
The above definition will send alerts for successful Helm installs, upgrades and rollbacks, but not uninstalls and tests.
.spec.suspend is an optional field to suspend the alerting.
When set to true, the controller will stop processing events.
When the field is set to false or removed, it will resume.
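For illustration, a suspended Alert might look like the following sketch. The event source shown is only a placeholder, and a Provider named slack-bot is assumed to exist in the same namespace as the Alert.

```yaml
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: <name>
spec:
  # set to false, or remove the field, to resume event processing
  suspend: true
  providerRef:
    name: slack-bot       # assumed to exist in the Alert's namespace
  eventSources:
    - kind: GitRepository
      name: '*'
```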