7. Monitoring with Prometheus and Grafana
Grafana and Prometheus are critical, complementary tools for modern infrastructure and software monitoring. Prometheus, an open-source system, collects and stores metrics from various sources in a time-series database, with a default 15-day retention period. While Prometheus offers basic querying capabilities, its primary function is data collection and storage. Grafana, on the other hand, provides a user-friendly web interface for visualizing this data. It connects to Prometheus, uses PromQL (Prometheus Query Language) to query the stored metrics, and presents the information in customizable dashboards. Together, they form a powerful solution for comprehensive monitoring and data visualization in software environments.
This documentation covers installation, configuration, dashboard creation, alerting, and best practices for maintaining this monitoring setup.
- Installing and configuring Prometheus and Grafana
- Creating and configuring Grafana dashboards
- Setting up alerts
- Ensuring proper data retention and access control
- Step 1: Download Prometheus
PROMETHEUS_VERSION="2.53.1"
wget https://github.com/prometheus/prometheus/releases/download/v${PROMETHEUS_VERSION}/prometheus-${PROMETHEUS_VERSION}.linux-amd64.tar.gz
- Check for the latest version on the Prometheus website.
- The first command sets the Prometheus version to install.
- The second command downloads the Prometheus binary for the Linux AMD64 architecture.
- Step 2: Extract the archive
tar xvfz prometheus*.tar.gz
- Step 3: Setup Prometheus directories and user
sudo mkdir -p /opt/prometheus /etc/prometheus /var/lib/prometheus
sudo useradd --no-create-home --shell /bin/false prometheus || true
sudo chown -R prometheus:prometheus /etc/prometheus /var/lib/prometheus || true
- Step 4: Move Prometheus files
sudo mv prometheus-${PROMETHEUS_VERSION}.linux-amd64/* /opt/prometheus/
- Step 5: Copy Prometheus binaries
sudo cp /opt/prometheus/prometheus /usr/local/bin/
sudo cp /opt/prometheus/promtool /usr/local/bin/
sudo cp -r /opt/prometheus/consoles /etc/prometheus
sudo cp -r /opt/prometheus/console_libraries /etc/prometheus
sudo cp /opt/prometheus/prometheus.yml /etc/prometheus
- Step 6: Set ownership for Prometheus binaries
sudo chown prometheus:prometheus /usr/local/bin/prometheus
sudo chown prometheus:prometheus /usr/local/bin/promtool
sudo chown prometheus:prometheus /etc/prometheus
sudo chown -R prometheus:prometheus /etc/prometheus/consoles
sudo chown -R prometheus:prometheus /etc/prometheus/console_libraries
sudo chown -R prometheus:prometheus /var/lib/prometheus
- Step 7: Create Prometheus configuration
cat << EOF | sudo tee /etc/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["91.229.239.213:9090"]
  - job_name: "node exporter"
    static_configs:
      - targets: ["91.229.239.213:9100"]
  - job_name: "postgres-exporter"
    static_configs:
      - targets: ["91.229.239.213:9187"]
EOF
- Step 8: Create a Prometheus systemd service file
cat << EOF | sudo tee /etc/systemd/system/prometheus.service
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
--config.file /etc/prometheus/prometheus.yml \
--storage.tsdb.path /var/lib/prometheus/ \
--web.console.templates=/etc/prometheus/consoles \
--web.console.libraries=/etc/prometheus/console_libraries
[Install]
WantedBy=multi-user.target
EOF
- Step 9: Enable and start Prometheus
sudo systemctl daemon-reload
sudo systemctl enable prometheus
sudo systemctl start prometheus
- Step 10: Allow port 9090 on your firewall for Prometheus
sudo ufw allow 9090/tcp
This allows incoming TCP traffic on port 9090, which is Prometheus' default port.
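Once Step 9 has run, it can help to confirm that Prometheus is actually serving requests before moving on. The sketch below is a small retry helper that re-runs any command until it succeeds or the attempts run out. The name `wait_for_ok` is illustrative and not part of Prometheus.

```shell
# wait_for_ok ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it exits 0 or ATTEMPTS is exhausted.
# Hypothetical helper for illustration; not shipped with Prometheus.
wait_for_ok() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    # Discard the command's output; only its exit status matters here.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep only when another attempt will follow.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

For example, `wait_for_ok 10 2 curl -fsS http://localhost:9090/-/ready` polls the Prometheus readiness endpoint for up to roughly 20 seconds, assuming Prometheus listens on localhost:9090. Separately, `promtool check config /etc/prometheus/prometheus.yml` validates the file written in Step 7.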
Before installing Grafana, install the necessary dependencies on your Ubuntu system. Without these packages, you might encounter errors during the later stages of the installation process.
- Step 1: Install dependencies
sudo apt-get install -y apt-transport-https software-properties-common wget
- wget is a utility for non-interactive download of files from the web. It's used in the next steps to download the Grafana GPG key, which is essential for verifying the authenticity of the Grafana packages.
- apt-transport-https allows the apt package manager to retrieve packages over HTTPS. It's crucial for security, ensuring that package downloads are encrypted and protected from tampering.
- software-properties-common provides scripts for managing software repositories. It's particularly important for adding PPAs (Personal Package Archives) and other third-party repositories, which might be necessary for some Grafana configurations or plugins.
- Step 2: Add Grafana GPG key
sudo mkdir -p /etc/apt/keyrings/
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
Adding the Grafana GPG key is a critical security measure, not a formality. It lets apt verify that the Grafana packages you install are genuine and unaltered, both now and for future updates.
- Step 3: Add Grafana repository
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com/ stable main" | sudo tee -a /etc/apt/sources.list.d/grafana.list
- Step 4: Update package lists again
sudo apt-get update
- Step 5: Install Grafana
sudo apt-get install -y grafana
echo "You can access Grafana at http://your_server_ip:3000/"
echo "Default login: admin / admin"
echo "Grafana is set to start automatically on system boot."
This step downloads and installs all necessary Grafana components, including binaries, configuration files, and service scripts. The echo statements show how to access Grafana: the URL format and the default login credentials.
- Step 6: Start the Grafana server
To start and enable the Grafana server, run the commands below.
sudo systemctl start grafana-server
sudo systemctl enable grafana-server
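The dashboards later in this guide expect a Prometheus data source to exist in Grafana. Besides adding it through the UI, Grafana can load data sources from provisioning files. The sketch below prints such a file. The function name and the default URL `http://localhost:9090` are assumptions for illustration, so adjust them to wherever your Prometheus actually listens.

```shell
# Print a Grafana data source provisioning file for Prometheus.
# Hypothetical helper; the URL argument defaults to a local Prometheus.
render_prometheus_datasource() {
  local url="${1:-http://localhost:9090}"
  cat <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: ${url}
    isDefault: true
EOF
}
```

On a package-based install, the output would typically be written with `render_prometheus_datasource | sudo tee /etc/grafana/provisioning/datasources/prometheus.yaml`, followed by `sudo systemctl restart grafana-server` so Grafana picks it up.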
- Step 1: Download Node Exporter
NODE_EXPORTER_VERSION="1.8.2"
wget https://github.com/prometheus/node_exporter/releases/download/v${NODE_EXPORTER_VERSION}/node_exporter-${NODE_EXPORTER_VERSION}.linux-amd64.tar.gz
- Step 2: Extract Node Exporter
tar xvfz node_exporter-*.tar.gz
- Step 3: Move Node Exporter binary
sudo mv node_exporter-${NODE_EXPORTER_VERSION}.linux-amd64/node_exporter /usr/local/bin/
- Step 4: Create Node Exporter user
sudo useradd --no-create-home --shell /bin/false node_exporter || true
sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter || true
- Step 5: Create systemd service for Node Exporter
cat << EOF | sudo tee /etc/systemd/system/node_exporter.service
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target
[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter
[Install]
WantedBy=multi-user.target
EOF
- Step 6: Start and enable Node Exporter service
sudo systemctl daemon-reload
sudo systemctl start node_exporter
sudo systemctl enable node_exporter
echo "Node Exporter installation completed"
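Node Exporter listens on port 9100 by default, which matches the scrape target in the Prometheus configuration above. As a quick sanity check, the sketch below counts exposed samples by metric-name prefix. `count_samples` is a hypothetical helper that reads the Prometheus text format on standard input.

```shell
# count_samples PREFIX
# Reads Prometheus text-format metrics on stdin and prints how many
# sample lines (comments excluded) start with PREFIX.
count_samples() {
  grep -v '^#' | grep -c "^$1"
}
```

A possible use is `curl -s http://localhost:9100/metrics | count_samples node_`. A non-zero count suggests the exporter is up and exposing host metrics.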
- Step 1: Create a System User for Postgres Exporter
echo "Creating system user for Postgres Exporter"
sudo groupadd --system postgres_exporter || true
sudo useradd -s /sbin/nologin --system -g postgres_exporter postgres_exporter || true
- Step 2: Download Postgres Exporter
POSTGRES_EXPORTER_VERSION="0.15.0"
wget https://github.com/prometheus-community/postgres_exporter/releases/download/v${POSTGRES_EXPORTER_VERSION}/postgres_exporter-${POSTGRES_EXPORTER_VERSION}.linux-amd64.tar.gz
- Step 3: Extract Postgres Exporter
tar xvfz postgres_exporter-*.linux-amd64.tar.gz
- Step 4: Set up the Postgres Exporter directory and user
sudo mkdir -p /opt/postgres_exporter
sudo useradd --no-create-home --shell /bin/false postgres_exporter || true
sudo chown postgres_exporter:postgres_exporter /opt/postgres_exporter/
- Step 5: Move the Postgres Exporter binary and make it executable
sudo mv postgres_exporter-${POSTGRES_EXPORTER_VERSION}.linux-amd64/* /opt/postgres_exporter/
sudo chmod 755 /opt/postgres_exporter/postgres_exporter
- Step 6: Set up the environment file
cat << EOF | sudo tee /opt/postgres_exporter/.env
DATA_SOURCE_NAME="postgresql://admin:<password>@<host>:5432/postgres?sslmode=disable"
EOF
- Step 7: Set ownership and permissions for the environment file
sudo chown postgres_exporter:postgres_exporter /opt/postgres_exporter/.env
sudo chmod 600 /opt/postgres_exporter/.env
- Step 8: Create a systemd service for Postgres Exporter
cat << EOF | sudo tee /etc/systemd/system/postgres_exporter.service
[Unit]
Description=Prometheus exporter for Postgresql
Wants=network-online.target
After=network-online.target
[Service]
User=postgres_exporter
Group=postgres_exporter
Type=simple
WorkingDirectory=/opt/postgres_exporter
EnvironmentFile=/opt/postgres_exporter/.env
ExecStart=/opt/postgres_exporter/postgres_exporter --web.listen-address=:9187 --web.telemetry-path=/metrics
Restart=always
[Install]
WantedBy=multi-user.target
EOF
echo "Postgres Exporter installation completed"
- Step 9: Enable and start Postgres Exporter
sudo systemctl daemon-reload
sudo systemctl start postgres_exporter
sudo systemctl enable postgres_exporter
- Step 10: Database Check
echo "Checking metrics..."
curl -s http://127.0.0.1:9187/metrics | grep pg_up
This checks the health of the PostgreSQL database by querying the Prometheus exporter running on the local machine, looking for the pg_up metric, which indicates the database's operational status.
- Step 11: Set-up firewall
sudo ufw allow 9187/tcp
sudo ufw reload
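The grep in Step 10 prints the pg_up line but leaves the interpretation to the reader. The sketch below turns it into an exit status, assuming the exporter emits an unlabelled `pg_up` sample where 1 means the database is reachable. `pg_is_up` is a made-up name for illustration.

```shell
# pg_is_up
# Reads exporter metrics on stdin. Exits 0 only if a "pg_up" sample
# is present and its value is 1; exits 1 otherwise.
pg_is_up() {
  awk '$1 == "pg_up" { found = 1; up = ($2 == 1) }
       END { exit !(found && up) }'
}
```

It could be combined with the curl call from Step 10, for example `curl -s http://127.0.0.1:9187/metrics | pg_is_up && echo "database reachable"`.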
To set up a Java monitoring dashboard, follow the steps below:
- You can choose to configure a dashboard from scratch or use a preconfigured one. For a preconfigured dashboard, copy this dashboard ID: 4701
- In Grafana's dashboard tab, click on New > New Dashboard.
- Click on the import dashboard card.
- Paste the dashboard ID in the "Find and import dashboards" box and click on the "Load" button.
- Give the dashboard a name, select or create a folder for it, select Prometheus as the data source, then load the dashboard.
To set up a Node Exporter monitoring dashboard to monitor your server, follow the steps below:
- You can configure a dashboard from scratch or use a preconfigured one. For a preconfigured dashboard, copy this dashboard ID: 1860
- In Grafana's dashboard tab, click on New > New Dashboard.
- Click on the import dashboard card.
- Paste the dashboard ID in the "Find and import dashboards" box and click on the "Load" button.
- Give the dashboard a name, select or create a folder for it, select Prometheus as the data source, then load the dashboard.
To set up a Postgres Exporter monitoring dashboard, follow the steps below:
- You can configure a dashboard from scratch or use a preconfigured one. For a preconfigured dashboard, copy this dashboard ID: 12485
- In Grafana's dashboard tab, click on New > New Dashboard.
- Click on the import dashboard card.
- Paste the dashboard ID in the "Find and import dashboards" box and click on the "Load" button.
- Give the dashboard a name, select or create a folder for it, select Prometheus as the data source, then load the dashboard.
This guide walks you through setting up alerts in Grafana, using an instance disk space alert as the example.
- Step 1: Log in and set up a dashboard
Log in to your Grafana dashboard, click on new dashboard to create a new one, and navigate to the Alerting section.
- Step 2: Click Alert to set an alert for the metrics you are monitoring.
- Step 3: Create a New Alert Rule
Click on "Add new alert rule" to begin setting up your alert.
- Step 4: Switch to Code View
In the upper right corner, switch to the "Code" view for more detailed configuration options.
- Step 5: Enter the Query
Use the following PromQL query to monitor disk space usage:
max(100 - ((node_filesystem_avail_bytes * 100) / node_filesystem_size_bytes)) by (instance)
- Step 6: Review Query Results
This query returns the percentage of disk space used by each instance. You should see results for different instances in your environment.
- Step 7: Set Alert Threshold
Specify the disk space usage percentage at which you want to receive an alert.
- Step 8: Save and Configure Notifications
Save the rule and set up the contact point where you would like to receive notifications for alerts (e.g., Slack).
Note: Adjust the threshold percentage based on your specific needs and infrastructure requirements.
- Step 9: Set up a contact point for Slack
Fill in a Name, an Integration (in this case, Slack), and the Webhook URL from Slack. After that, test and save.
- Step 10: Set the alert rule to use your Slack contact point
Choose the contact point on the specific alert rule's configuration page, then save and exit.
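To make the Step 5 query concrete, here is the same arithmetic for a single filesystem, written as a shell function. `disk_used_percent` is illustrative only and takes available and total size in the same unit. With 25 GB available out of 100 GB, usage is 100 - (25 * 100 / 100) = 75%.

```shell
# disk_used_percent AVAILABLE SIZE
# Mirrors the PromQL expression
#   100 - ((node_filesystem_avail_bytes * 100) / node_filesystem_size_bytes)
# for one filesystem. Hypothetical helper for illustration.
disk_used_percent() {
  awk -v avail="$1" -v size="$2" \
    'BEGIN { printf "%.1f\n", 100 - (avail * 100) / size }'
}

disk_used_percent 25 100   # prints 75.0
```

In the real query, the `max(...) by (instance)` wrapper then keeps the highest usage among all filesystems of each instance, so the alert fires on the fullest one.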
- Click "Manage contact points" to configure who receives notifications and how they are sent.
- Click "Add contact point" to set up a new one.
- Set up the contact point using the right information:
  - Enter the name of the alert.
  - Select the channel to receive the alert.
  - For Slack, paste the webhook URL: https://hooks.slack.com/services/AAAAAAAA/BBBBBBBBB/CCCCCCCCCCCCCCCCCCCCCCCC
- Test it and, if successful,
- Save the contact point.
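Before relying on Grafana's built-in test button, the webhook itself can be exercised from the command line. Slack incoming webhooks accept a JSON body with a `text` field. The sketch below only builds that body; `slack_payload` is a hypothetical helper, and it escapes backslashes and double quotes but does not handle newlines.

```shell
# slack_payload MESSAGE
# Prints a minimal JSON body for a Slack incoming webhook.
# Escapes backslashes and double quotes in MESSAGE; newlines are not handled.
slack_payload() {
  local escaped
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"text":"%s"}\n' "$escaped"
}
```

One way to send it, assuming the webhook URL is stored in an environment variable named `SLACK_WEBHOOK_URL` (an assumption, not something Grafana sets): `curl -X POST -H 'Content-type: application/json' --data "$(slack_payload 'Test alert')" "$SLACK_WEBHOOK_URL"`.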
To access historical data for capacity planning, compliance, auditing, and enhanced debugging, it is best to set up data retention for Prometheus.
- Step 1: In the earlier Prometheus installation, Prometheus was configured as a service. Open the prometheus.service config:
sudo nano /etc/systemd/system/prometheus.service
- Step 2: Add the flag below to the prometheus.service file if it is not already there. If it is, skip this step.
--storage.tsdb.path /var/lib/prometheus/
This flag specifies the directory where Prometheus will store its time-series database (TSDB).
Breakdown:
--storage.tsdb.path: This indicates that we are setting the path for the TSDB storage.
/var/lib/prometheus/: This is the actual directory path where Prometheus will create the necessary files and folders to store your time-series data.
- Step 3: For data retention, add the flag below at the end of the ExecStart line:
--storage.tsdb.retention.time=60d
This overrides the default 15d storage for Prometheus and tells it to retain data for 60 days.
Note: Set the retention time based on how long you need the data stored. The duration should match your project requirements and compliance needs.
The file should look like this:
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
ExecStart=/usr/local/bin/prometheus \
--config.file /etc/prometheus/prometheus.yml \
--storage.tsdb.path /var/lib/prometheus/ \
--web.console.templates=/etc/prometheus/consoles \
--web.console.libraries=/etc/prometheus/console_libraries \
--storage.tsdb.retention.time=60d
[Install]
WantedBy=multi-user.target
- Step 4: Reload the daemon to apply the changes
sudo systemctl daemon-reload
- Step 5: Restart Prometheus for the changes to be applied.
sudo systemctl restart prometheus
In your Prometheus UI, under the Status tab, select 'Command-line Flags' and search for --storage.tsdb.retention.time to verify the retention time is set to 60d.
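Longer retention needs more disk. The Prometheus storage documentation gives a rough capacity estimate: retention time in seconds, times ingested samples per second, times bytes per sample, with roughly 1 to 2 bytes per sample being typical. The sketch below applies that formula; the function name is hypothetical and the default of 2 bytes is the pessimistic end of that range.

```shell
# estimate_tsdb_bytes RETENTION_DAYS SAMPLES_PER_SECOND [BYTES_PER_SAMPLE]
# Rough disk estimate: retention_seconds * samples_per_second * bytes_per_sample.
# BYTES_PER_SAMPLE defaults to 2 (an assumed upper-end figure).
estimate_tsdb_bytes() {
  local retention_days="$1" samples_per_second="$2" bytes_per_sample="${3:-2}"
  echo $(( retention_days * 86400 * samples_per_second * bytes_per_sample ))
}
```

For instance, `estimate_tsdb_bytes 60 1000` gives 10368000000 bytes, or about 10.4 GB, for 60 days at 1000 samples per second. The actual ingestion rate can be read from Prometheus with `rate(prometheus_tsdb_head_samples_appended_total[1h])`.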
To ensure only specific users can access your Grafana dashboards, it is important to configure user roles and permissions.
- Step 1: Log in to Grafana as an admin user. Navigate to "Administration" tab (gear icon) > "Users and access".
- Step 2: In the "Users and access" page, click on "Users" to add a user.
Then click on the "New User" button.
- Step 3: Set the user's name, email, username and password. Then hit the "Create user" button.
Verify the user details and assign the user to an organization with a role (Viewer, Editor, or Admin).
After the user has been created, permissions can be set on what dashboards the user can access and what the user can do.
- Step 4: Navigate to the Dashboards tab. Select the dashboard to which you want to assign the user.
- Step 5: In the top right part of the dashboard you will find a gear icon. Click on it to access the dashboard settings.
- Step 6: In the dashboard settings page, click on the "Add a permission" button and select the user, username, and role for that user.
- To check the status of all services including services being monitored
sudo systemctl status prometheus.service
sudo systemctl status grafana-server
sudo systemctl status java_app.service
sudo systemctl status postgresql
sudo systemctl status postgres_exporter.service
sudo systemctl status node_exporter.service
- To check logs of services in case of failures
sudo journalctl -u java_app.service -n 50 --no-pager
sudo journalctl -u postgresql -n 50 --no-pager
sudo journalctl -u prometheus.service -n 50 --no-pager
sudo journalctl -u postgres_exporter.service -n 50 --no-pager
sudo journalctl -u node_exporter.service -n 50 --no-pager
- To verify data retention was successfully set for Prometheus
ps -eo args | grep -- '--storage.tsdb.retention.time'
- To verify Prometheus is scraping metrics from your targets
- Visit the Prometheus UI you set up.
- Click on the 'Status' dropdown and select 'Targets'.
- In the Targets page you can see the services actively being monitored.
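The individual status commands above can be folded into one pass. The sketch below is a hypothetical helper that prints one line per service and returns non-zero if any of them is not active. It uses `systemctl is-active --quiet` by default; the `SERVICE_CHECKER` variable is an invented override that exists only so the function can be tried without systemd.

```shell
# report_services SERVICE...
# Prints "OK <name>" or "FAIL <name>" for each service and returns 1
# if at least one check failed. Hypothetical helper for illustration.
report_services() {
  local checker="${SERVICE_CHECKER:-systemctl is-active --quiet}"
  local svc failed=0
  for svc in "$@"; do
    # $checker is intentionally unquoted so its words split into a command.
    if $checker "$svc"; then
      echo "OK $svc"
    else
      echo "FAIL $svc"
      failed=1
    fi
  done
  return "$failed"
}
```

With the services from this guide, a call might look like `report_services prometheus grafana-server node_exporter postgres_exporter`. Any FAIL line is a cue to check that service's logs with journalctl as shown above.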
- Set up monitoring and alerting for your critical services.
- Enforce the principle of least privilege by ensuring only certain users can access the monitoring dashboards.
- Use the latest software version to avoid errors.
Made by Dhee ‖ Sudobro ‖ Stephennwachukwu ‖ Dominic-source