This is a Prometheus exporter that extracts metrics from a server using the Redfish API. The hostname of the server has to be passed as the target parameter in the HTTP call.
It has been tested with the following server models:
Cisco UCS C480M5, working properly since BMC FW 4.1(1d)
Cisco UCS C240M4
Cisco UCS C240M5
Cisco UCS C220M4
Cisco UCS C220M5
Cisco BMC firmware below 4.x has flaws in its Redfish API implementation. Hence, I recommend updating to at least 4.0(1c).
Dell PowerEdge R640
Dell PowerEdge R730
Dell PowerEdge R740
Dell PowerEdge R840
Lenovo ThinkSystem SR950
HPE DL360 Gen10
HPE DL560 Gen10
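If you are logged into the pod running the exporter, you can call it as follows. The URL is quoted so that the shell does not interpret the & as a background operator:
curl "http://localhost:9200/redfish?target=server1.example.com&job=redfish-myjob"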
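The same call can be made from Python, which avoids shell quoting issues with the & in the query string. This is a minimal sketch using only the standard library; the function names build_scrape_url and scrape are made up for illustration, and the default base URL simply mirrors the curl example above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_scrape_url(target, job=None, base="http://localhost:9200/redfish"):
    """Return the exporter URL for one target server (hypothetical helper)."""
    params = {"target": target}
    if job is not None:
        # The job parameter is optional in the call.
        params["job"] = job
    return base + "?" + urlencode(params)


def scrape(target, job=None, timeout=60):
    """Fetch the metrics text for one target; needs a running exporter."""
    with urlopen(build_scrape_url(target, job), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    print(build_scrape_url("server1.example.com", "redfish-myjob"))
```

Calling scrape("server1.example.com", "redfish-myjob") would return the metrics as text, provided the exporter is reachable on localhost:9200.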
The exporter was written for Python 3.6 or newer. To install all required modules, run the following command:
pip3 install --no-cache-dir -r requirements.txt
A Dockerfile is also available to build a Docker image for running the exporter.
-l <logfile> - all output is written to a logfile.
-d - switches on debugging mode.
-c <config file> - you can specify the path to the config file, default is config.yml.
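The three options above could be parsed with Python's standard argparse module. The following is only a sketch of that idea, not the exporter's actual code; the function name build_parser and the attribute names logfile, debug and config are assumptions chosen for illustration.

```python
import argparse


def build_parser():
    """Build a parser for the -l, -d and -c options (illustrative sketch)."""
    parser = argparse.ArgumentParser(description="Redfish exporter options (sketch)")
    parser.add_argument("-l", dest="logfile", metavar="LOGFILE", default=None,
                        help="write all output to this logfile")
    parser.add_argument("-d", dest="debug", action="store_true",
                        help="switch on debugging mode")
    parser.add_argument("-c", dest="config", metavar="CONFIG_FILE",
                        default="config.yml",
                        help="path to the config file")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.logfile, args.debug, args.config)
```

With no arguments, the sketch falls back to config.yml and leaves debugging off, matching the defaults described above.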
- The listen_port parameter sets the port on which the exporter listens for incoming calls. It is overridden by the environment variable LISTEN_PORT.
- The credentials for logging in to the servers can either be added to the config.yaml file or passed via environment variables. The environment variables take precedence over the entries in the config.yaml file. The mapping of job names to environment variables follows a schema: REDFISH_JOB1_USERNAME and REDFISH_JOB1_PASSWORD would be, for example, the variables for a job called redfish/job1. A slash is replaced by an underscore and everything is converted to uppercase.
- The timeout parameter specifies how long to wait for an answer from the server. This can also be provided via the TIMEOUT environment variable.
- The job parameter specifies the Prometheus job that is passed as a label if no job was handed over in the API call.
listen_port: 9200
username: <your username>
password: <your password>
timeout: 40
job: 'redfish-myjob'
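The precedence rules and the job-name mapping described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the exporter's implementation: the helper names env_prefix, resolve_credentials and resolve_setting are hypothetical, the config is assumed to be a flat dictionary like the example above, and numeric settings are assumed to be integers. Only the slash replacement and uppercasing are taken from the description; how other characters such as hyphens are handled is not specified there.

```python
import os


def env_prefix(job):
    """Map a job name to its variable prefix: '/' becomes '_', then uppercase."""
    return job.replace("/", "_").upper()


def resolve_credentials(job, config, environ=os.environ):
    """Return (username, password); environment variables win over the config."""
    prefix = env_prefix(job)
    username = environ.get(prefix + "_USERNAME", config.get("username"))
    password = environ.get(prefix + "_PASSWORD", config.get("password"))
    return username, password


def resolve_setting(name, config, environ=os.environ, cast=int):
    """Return a setting such as listen_port or timeout.

    The upper-cased name (LISTEN_PORT, TIMEOUT) is looked up in the
    environment first; the config value is the fallback.
    """
    value = environ.get(name.upper(), config.get(name))
    return cast(value) if value is not None else None


if __name__ == "__main__":
    config = {"listen_port": 9200, "username": "user", "password": "secret",
              "timeout": 40, "job": "redfish-myjob"}
    print(env_prefix("redfish/job1"))
    print(resolve_setting("listen_port", config))
```

Passing the environment as a parameter keeps the helpers easy to test without touching the real process environment.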
All metrics returned by the redfish exporter are gauge metrics.
Indicates whether the Redfish API returned useful data (== 1) or not (== 0).
Shows the health information of hardware parts such as processors, memory, storage controllers, disks, fans, power and chassis, if available.
Shows the count of errors per DIMM.
Cisco servers do not seem to provide this kind of information via Redfish. Dell PowerEdge servers provide it only for certain DIMM manufacturers (Micron Technology and Hynix Semiconductor do, Samsung does not).
Shows the power state of the server.
The duration of the server's first response to a call to /redfish/v1.
Indicates whether there was a valid Redfish response when calling /redfish/v1.
Total duration of scraping all data from the server.
A collection of firmware version data stored in the labels. The value is always 1.
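A metric whose data is stored in the labels and whose value is always 1 looks like the sample produced below in the Prometheus text exposition format. This is a sketch for illustration only: the metric name example_firmware_info and the label names name and version are invented here and are not the exporter's actual names.

```python
def escape_label_value(value):
    """Escape backslash, newline and double quote for the text format."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace("\n", "\\n")
        .replace('"', '\\"')
    )


def format_info_sample(metric_name, labels):
    """Render one sample that carries its data in labels and has the value 1."""
    rendered = ",".join(
        '{}="{}"'.format(key, escape_label_value(val))
        for key, val in sorted(labels.items())
    )
    return "{}{{{}}} 1".format(metric_name, rendered)


if __name__ == "__main__":
    # Hypothetical metric and label names, chosen only for this example.
    print(format_info_sample("example_firmware_info",
                             {"name": "BIOS", "version": "4.1(1d)"}))
```

Because the value never changes, such a metric is typically joined against other series in queries to attach the version labels to them.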