Grafana and Prometheus Monitoring Setup
This documentation provides a comprehensive guide on setting up Grafana and Prometheus for monitoring. It covers installation steps, configuration details, instructions for accessing and managing dashboards, and troubleshooting tips for maintaining the monitoring setup.
- Install and Configure Prometheus and Grafana: Set up Prometheus for metric collection and Grafana for data visualization.
- Create and Configure Grafana Dashboards: Develop dashboards to visualize metrics.
- Set Up Alerts Based on Collected Metrics: Configure alerting to notify the team of potential issues.
- Ensure Proper Data Retention and Access Control: Manage data storage and user access.
- Installing Prometheus and Grafana
- Configuring the Monitoring Dashboards
- Configuring Alerting
- Data Retention and Access Control
- Download Prometheus: Obtain the latest version from the Prometheus website.
- Extract the Archive:
tar xvfz prometheus-*.tar.gz
- Move to Installation Directory:
sudo mv prometheus-*/ /usr/local/prometheus
- Create a Prometheus User:
sudo useradd --no-create-home --shell /bin/false prometheus
- Set Up Directories and Permissions:
sudo mkdir /etc/prometheus
sudo mkdir /var/lib/prometheus
sudo chown prometheus:prometheus /etc/prometheus
sudo chown prometheus:prometheus /var/lib/prometheus
- Copy Configuration Files:
sudo cp /usr/local/prometheus/prometheus.yml /etc/prometheus/
sudo chown prometheus:prometheus /etc/prometheus/prometheus.yml
- Create a Systemd Service File (/etc/systemd/system/prometheus.service):
[Unit]
Description=Prometheus
After=network.target
[Service]
User=prometheus
Group=prometheus
ExecStart=/usr/local/prometheus/prometheus --config.file=/etc/prometheus/prometheus.yml --storage.tsdb.path=/var/lib/prometheus/
[Install]
WantedBy=multi-user.target
- Enable and Start Prometheus:
sudo systemctl daemon-reload
sudo systemctl enable prometheus
sudo systemctl start prometheus
Install Grafana (Debian/Ubuntu):
sudo apt-get install -y apt-transport-https software-properties-common
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
sudo add-apt-repository "deb https://packages.grafana.com/oss/deb stable main"
sudo apt-get update
sudo apt-get install grafana
sudo systemctl enable grafana-server
sudo systemctl start grafana-server
Create a Grafana User:
sudo useradd --no-create-home --shell /bin/false grafana
Set Up Directories and Permissions:
sudo chown grafana:grafana /usr/local/grafana
Create a Systemd Service File:
The grafana APT package installs its own grafana-server.service unit, so no unit file has to be written by hand for Grafana. The Prometheus unit file is created by the script later in this guide.
Enable and Start Grafana:
sudo systemctl daemon-reload
sudo systemctl enable grafana-server
sudo systemctl start grafana-server
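The data source steps later in this guide mention that Grafana was moved from its default port 3000 to 3050. The snippet below is a minimal sketch of that change, assuming the packaged configuration file lives at /etc/grafana/grafana.ini and keeps the port under its [server] section. It edits a scratch copy so it can be tried without root; WORK_DIR and the sample file contents are illustrative assumptions, not part of the original setup.

```shell
#!/bin/sh
# Sketch: set Grafana's HTTP port to 3050 in a scratch copy of grafana.ini.
# WORK_DIR and the sample contents below are assumptions for illustration.
WORK_DIR="${WORK_DIR:-$(mktemp -d)}"
GRAFANA_INI="$WORK_DIR/grafana.ini"

# Stand-in for the packaged file, where the port line ships commented out.
cat > "$GRAFANA_INI" <<'EOF'
[server]
;http_port = 3000
EOF

# Uncomment the http_port line (if needed) and rewrite its value.
sed 's/^;\{0,1\}http_port = .*/http_port = 3050/' "$GRAFANA_INI" > "$GRAFANA_INI.new"
mv "$GRAFANA_INI.new" "$GRAFANA_INI"

cat "$GRAFANA_INI"
```

To apply the same change for real, back up /etc/grafana/grafana.ini, run the sed command against it with sudo, and restart the service with sudo systemctl restart grafana-server.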
Node Exporter is essential for collecting server metrics such as CPU usage, memory usage, disk I/O, and network traffic.
wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz
tar xvfz node_exporter-*.tar.gz
cd node_exporter-*
sudo mv node_exporter /usr/local/bin/
Create a Systemd Service File (/etc/systemd/system/node_exporter.service):
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target
[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter
[Install]
WantedBy=multi-user.target
Enable and Start Node Exporter:
sudo systemctl daemon-reload
sudo systemctl enable node_exporter
sudo systemctl start node_exporter
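Prometheus only collects from targets listed in prometheus.yml, and this guide does not show that file. The sketch below writes one scrape job per component, assuming every service runs on the same host with its upstream default port (Prometheus 9090, Node Exporter 9100, cAdvisor 8080). WORK_DIR is a hypothetical scratch directory standing in for /etc/prometheus, and the 15s interval is an illustrative choice.

```shell
#!/bin/sh
# Sketch: write a prometheus.yml with one scrape job per component.
# Ports are upstream defaults; WORK_DIR is an assumed scratch directory
# used instead of /etc/prometheus so nothing live is overwritten.
WORK_DIR="${WORK_DIR:-$(mktemp -d)}"
PROM_CONFIG="$WORK_DIR/prometheus.yml"

cat > "$PROM_CONFIG" <<'EOF'
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: node_exporter
    static_configs:
      - targets: ["localhost:9100"]
  - job_name: cadvisor
    static_configs:
      - targets: ["localhost:8080"]
EOF

echo "Wrote $PROM_CONFIG"
```

The file can be validated with promtool check config (promtool ships in the Prometheus archive) before it is copied to /etc/prometheus/ and Prometheus is restarted.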
cAdvisor Installation:
VERSION=v0.47.0 # Pin the cAdvisor release you want to install
wget https://github.com/google/cadvisor/releases/download/$VERSION/cadvisor-$VERSION-linux-amd64
chmod +x cadvisor-$VERSION-linux-amd64
sudo mv cadvisor-$VERSION-linux-amd64 /usr/local/bin/cadvisor
cAdvisor (/etc/systemd/system/cadvisor.service):
[Unit]
Description=cAdvisor
Wants=network-online.target
After=network-online.target
[Service]
User=cadvisor
Group=cadvisor
Type=simple
ExecStart=/usr/local/bin/cadvisor
[Install]
WantedBy=multi-user.target
After creating these files, run the following commands:
sudo systemctl daemon-reload
sudo systemctl enable prometheus node_exporter cadvisor alertmanager
sudo systemctl start prometheus node_exporter cadvisor alertmanager
Note: You'll need to create the appropriate users and groups (prometheus, node_exporter, cadvisor, alertmanager) and ensure proper permissions for directories and files.
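The accounts mentioned in the note can be created the same way as the prometheus user earlier in this guide. The following is a sketch rather than a required procedure: RUN is a hypothetical helper variable that defaults to echo, so the script only prints the commands unless it is invoked with RUN="". It assumes a distribution where useradd also creates a group with the same name, as Debian and Ubuntu do by default.

```shell
#!/bin/sh
# Sketch: create one unprivileged account per service.
# RUN defaults to "echo", so commands are printed, not executed (dry run).
# Invoke with RUN="" to actually run them. RUN is an illustrative helper.
RUN="${RUN-echo}"

create_service_user() {
  # Same flags as the prometheus user created earlier in this guide.
  $RUN sudo useradd --no-create-home --shell /bin/false "$1"
}

for svc in node_exporter cadvisor alertmanager; do
  create_service_user "$svc"
done
```

The dry run makes it easy to review the exact commands first. Directories such as /etc/alertmanager and /var/lib/alertmanager still need to be created and chown-ed separately.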
The following script writes all four service files, then enables and starts the services:
#!/bin/bash
# Check if script is run as root
if [ "$EUID" -ne 0 ]; then
  echo "Please run as root"
  exit 1
fi
# Create systemd service file for Prometheus
cat > /etc/systemd/system/prometheus.service <<EOL
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
--config.file /etc/prometheus/prometheus.yml \
--storage.tsdb.path /var/lib/prometheus/ \
--web.console.templates=/etc/prometheus/consoles \
--web.console.libraries=/etc/prometheus/console_libraries
[Install]
WantedBy=multi-user.target
EOL
# Create systemd service file for Node Exporter
cat > /etc/systemd/system/node_exporter.service <<EOL
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target
[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter
[Install]
WantedBy=multi-user.target
EOL
# Create systemd service file for cAdvisor
cat > /etc/systemd/system/cadvisor.service <<EOL
[Unit]
Description=cAdvisor
Wants=network-online.target
After=network-online.target
[Service]
User=cadvisor
Group=cadvisor
Type=simple
ExecStart=/usr/local/bin/cadvisor
[Install]
WantedBy=multi-user.target
EOL
# Create systemd service file for Alertmanager
cat > /etc/systemd/system/alertmanager.service <<EOL
[Unit]
Description=Alertmanager
Wants=network-online.target
After=network-online.target
[Service]
User=alertmanager
Group=alertmanager
Type=simple
ExecStart=/usr/local/bin/alertmanager \
--config.file=/etc/alertmanager/alertmanager.yml \
--storage.path=/var/lib/alertmanager
[Install]
WantedBy=multi-user.target
EOL
# Reload systemd to recognize new service files
systemctl daemon-reload
# Enable and start services
systemctl enable prometheus node_exporter cadvisor alertmanager
systemctl start prometheus node_exporter cadvisor alertmanager
echo "Systemd service files have been created and services have been enabled and started."
echo "Please ensure that you have created the necessary users and groups,"
echo "and that the binary files and directories exist with proper permissions."
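Once the services are started, a quick probe of each HTTP endpoint shows whether they actually came up. This is a sketch that assumes curl is installed and that every service listens on localhost with its upstream default port (Prometheus 9090, Node Exporter 9100, cAdvisor 8080, Alertmanager 9093). The function names check_endpoint and check_stack are made up for this example.

```shell
#!/bin/sh
# Sketch: probe each service's HTTP endpoint after startup.
# Ports are upstream defaults; adjust them if your services listen elsewhere.

check_endpoint() {
  # --fail makes curl return non-zero on HTTP errors as well as on
  # connection failures; --max-time bounds the wait per endpoint.
  if curl --silent --fail --max-time 5 --output /dev/null "$1"; then
    echo "OK   $1"
  else
    echo "FAIL $1"
    return 1
  fi
}

check_stack() {
  status=0
  for url in \
    http://localhost:9090/-/healthy \
    http://localhost:9100/metrics \
    http://localhost:8080/healthz \
    http://localhost:9093/-/healthy
  do
    check_endpoint "$url" || status=1
  done
  return "$status"
}
```

Calling check_stack prints one OK or FAIL line per service and returns non-zero if any endpoint is unreachable. When an endpoint fails, journalctl -u with the service name is the next place to look.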
Configure Grafana to Use Prometheus as a Data Source:
- Access Grafana in your web browser. The default address is http://localhost:3000/; because another application already uses port 3000 in our setup, we configured Grafana to listen on port 3050 instead (http://localhost:3050/).
- Navigate to Configuration > Data Sources > Add data source.
- Select Prometheus and set the URL to http://localhost:9090. Click Save & Test.
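The same data source can also be declared in a file instead of through the UI, using Grafana's provisioning mechanism. The sketch below writes such a file to a scratch directory. WORK_DIR and the file name are assumptions; on a package install the file would normally go under /etc/grafana/provisioning/datasources/.

```shell
#!/bin/sh
# Sketch: declare the Prometheus data source as a Grafana provisioning file.
# WORK_DIR and the file name are illustrative; the URL matches the one
# entered in the UI steps above.
WORK_DIR="${WORK_DIR:-$(mktemp -d)}"
DS_FILE="$WORK_DIR/prometheus-datasource.yaml"

cat > "$DS_FILE" <<'EOF'
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
EOF

echo "Wrote $DS_FILE"
```

Grafana reads provisioning files at startup, so the data source appears after sudo systemctl restart grafana-server. Keeping it in a file makes the setup reproducible on a fresh host.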
Import Node Exporter Dashboard:
- In Grafana, go to the Dashboards page.
- Click the “+” icon and select “Import”.
- Enter Dashboard ID 1860 and click “Load”.
- Choose the Prometheus data source and click “Import”.
Using cAdvisor:
- In Grafana, click the “+” icon and select “Dashboard”.
- Click “Import”, then add the dashboard JSON or ID.
Configuring Dynamic Variables:
- Open the dashboard you want to add a dynamic variable to, click the settings icon, open the Variables tab, and create the variable.
- We based our dynamic variable on the environments of our containerized applications (e.g., dev, staging, prod).
Testing the Dynamic Variable:
- In our setup, the dynamic variable lists the containers by environment.
Create and Configure Alerts in Grafana:
- Open the desired panel and click the Edit button.
- Navigate to the Alert tab and click “Create Alert”.
- Define conditions based on Prometheus queries (e.g., rate(myapp_request_count[1m]) > 100).
Configure Alert Evaluation and Frequency:
- Set the evaluation interval (e.g., every minute) and the duration for which the alert condition must be met (e.g., 5 minutes).
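As an alternative to the Grafana UI flow, the same condition and timing can be written as a Prometheus alerting rule. This is a different mechanism from the Grafana alert described above, shown only as a sketch. The group name, alert name, severity label, and WORK_DIR are hypothetical; the expression, the one-minute evaluation interval, and the five-minute duration come from the examples in this section.

```shell
#!/bin/sh
# Sketch: the example alert expressed as a Prometheus rule file.
# Names (myapp, HighRequestRate, severity) and WORK_DIR are assumptions.
WORK_DIR="${WORK_DIR:-$(mktemp -d)}"
RULES_FILE="$WORK_DIR/myapp-alerts.yml"

cat > "$RULES_FILE" <<'EOF'
groups:
  - name: myapp
    # Evaluate the rules in this group every minute.
    interval: 1m
    rules:
      - alert: HighRequestRate
        expr: rate(myapp_request_count[1m]) > 100
        # The condition must hold for 5 minutes before the alert fires.
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Request rate above 100/s for 5 minutes"
EOF

echo "Wrote $RULES_FILE"
```

Prometheus loads such a file only if it is listed under rule_files in prometheus.yml. It can be validated beforehand with promtool check rules.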
Set Up Notification Channels:
- Navigate to Alerting > Notification channels > New Channel.
- Configure the notification channel (e.g., Slack) with your webhook URL.
Link Alerts to Notification Channels:
- In the alert configuration, associate the alert with the notification channel created.
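The Alertmanager service defined earlier in this guide reads /etc/alertmanager/alertmanager.yml, but the guide does not show that file. A minimal sketch that routes every alert to Slack might look like the following. The receiver name, the channel, and the webhook URL are placeholders, and WORK_DIR is a hypothetical scratch directory.

```shell
#!/bin/sh
# Sketch: minimal alertmanager.yml sending all alerts to one Slack channel.
# Receiver name, channel, and webhook URL are placeholders to replace.
WORK_DIR="${WORK_DIR:-$(mktemp -d)}"
AM_CONFIG="$WORK_DIR/alertmanager.yml"

cat > "$AM_CONFIG" <<'EOF'
route:
  receiver: slack-notifications

receivers:
  - name: slack-notifications
    slack_configs:
      - api_url: "https://hooks.slack.com/services/REPLACE/WITH/YOURS"
        channel: "#alerts"
        send_resolved: true
EOF

echo "Wrote $AM_CONFIG"
```

This applies to alerts raised by Prometheus rules, not to alerts created in the Grafana UI. Prometheus also needs an alerting section in prometheus.yml that points at Alertmanager (localhost:9093 by default). The file can be checked with amtool check-config.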
- Data Retention and Access Control
- Configuring Data Retention
Set Data Retention in Prometheus:
- Specify the data retention period in the Prometheus configuration or startup parameters (e.g., --storage.tsdb.retention.time=15d).
Update Prometheus Service File:
- Ensure that the retention flag is included in the Prometheus service configuration.
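One way to add the retention flag without editing the main unit file is a systemd drop-in override. The sketch below writes one to a scratch directory. It assumes the binary path /usr/local/bin/prometheus used by the service script in this guide and the 15d value from the example above. WORK_DIR is hypothetical; the real location would be /etc/systemd/system/prometheus.service.d/override.conf.

```shell
#!/bin/sh
# Sketch: systemd drop-in that restarts Prometheus with a retention flag.
# The binary path and 15d value are taken from this guide's examples;
# WORK_DIR stands in for /etc/systemd/system/prometheus.service.d.
WORK_DIR="${WORK_DIR:-$(mktemp -d)}"
OVERRIDE="$WORK_DIR/override.conf"

# The quoted 'EOF' keeps the trailing backslashes in the written file.
cat > "$OVERRIDE" <<'EOF'
[Service]
# An empty ExecStart= clears the command inherited from the main unit.
ExecStart=
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus/ \
  --storage.tsdb.retention.time=15d
EOF

echo "Wrote $OVERRIDE"
```

After the file is in place, run sudo systemctl daemon-reload and sudo systemctl restart prometheus. The effective command line can be confirmed with systemctl cat prometheus.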
- Configuring Access Control
Manage User Roles and Permissions in Grafana:
- Log in to Grafana and navigate
Made with ❤️ by Ravencodes | AugustHottie | CodeReaper0 | bySegunMoses | Suesue | DrInTech22 courtesy of @HNG-Internship