Velero exposes Metrics for monitoring, that are available for scraping with a tool such as Prometheus.
On Azure, You don't need to setup your own Prometheus server: Azure Container Insights knows how to scrape Prometheus Metrics, and you can export your metrics to Azure seamlessly.
Read the documentation on Prometheus Integration: https://learn.microsoft.com/azure/azure-monitor/containers/container-insights-prometheus-integration
To enable scraping with Container insights, you simply need to deploy a ConfigMap related to metrics & data collection.
You can find an example, in this repository for testing purposes (we recommend to download the latest version from the documentation link above).
In this example (because Velero already exports the monitoring metrics), we simply enable monitoring_kubernetes_pods in the ConfigMap, and that's it !
monitor_kubernetes_pods = true
## Restricts Kubernetes monitoring to namespaces for pods that have annotations set and are scraped using the monitor_kubernetes_pods setting.
## This will take effect when monitor_kubernetes_pods is set to true
# ex. monitor_kubernetes_pods_namespaces = ["velero"]
- Run the following command to deploy the configMap to the AKS cluster:
``` bash
kubectl apply -f container-azm-ms-agentconfig.yaml
```
You can access Insights on Velero Metrics, by viewing the InsightsMetrics of your AKS cluster.
You can simply filter the name for the metric: velero_backup_failure_total
You can create an Alert When the value of velero_backup_failure_total exceeds a threshold (example >=1)
- You will need to setup the Log Analytics workspaces for both the source AKS (where backups are taken), and the target (where they are restored).
- If you plan to you Azure Container Insights, You'll need to setup the ConfigMap (monitoring) configuration on both clusters.