[JENKINS-72434] Add metrics for failure causes on builds #176
+68
−4
Fixes https://issues.jenkins.io/browse/JENKINS-72434
This adds a new `jenkins_bfa*` metric for failure causes found in a specific job build, named using the convention `jenkins_bfa.job.@<JOB_NAME>@.number.@<JOB_BUILD_NUMBER>@.cause.@<FAILURE_CAUSE_NAME>`.
Used in conjunction with the Prometheus metrics plugin and a few additional `metric_relabel_configs`, this results in metrics like:
jenkins_bfa{build_number="14", cause="job_exits_1", instance="host.docker.internal:8080", jenkins_job="jake_test_job", job="jenkins"}
jenkins_bfa{build_number="15", cause="no_matching_cause", instance="host.docker.internal:8080", jenkins_job="jake_test_job", job="jenkins"}
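To illustrate the naming convention, here is a minimal, self-contained Java sketch of how a name in that form could be assembled and then split back into the three labels. The class `BfaMetricName`, its methods, and the regular expression are hypothetical and are not taken from this PR or from the actual `metric_relabel_configs`. The name exported by the Prometheus plugin may also differ from the dotted form used here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of the plugin.
class BfaMetricName {
    // Assumed pattern mirroring the convention
    // jenkins_bfa.job.@<JOB_NAME>@.number.@<JOB_BUILD_NUMBER>@.cause.@<FAILURE_CAUSE_NAME>
    private static final Pattern NAME = Pattern.compile(
            "^jenkins_bfa\\.job\\.@(.+)@\\.number\\.@(\\d+)@\\.cause\\.@(.+)$");

    // Assemble a metric name from its three components.
    static String build(String jobName, int buildNumber, String causeName) {
        return "jenkins_bfa.job.@" + jobName
                + "@.number.@" + buildNumber
                + "@.cause.@" + causeName;
    }

    // Split a metric name back into the labels a relabeling step would produce.
    static Map<String, String> parse(String metricName) {
        Matcher m = NAME.matcher(metricName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a jenkins_bfa metric name: " + metricName);
        }
        Map<String, String> labels = new LinkedHashMap<>();
        labels.put("jenkins_job", m.group(1));
        labels.put("build_number", m.group(2));
        labels.put("cause", m.group(3));
        return labels;
    }

    public static void main(String[] args) {
        String name = build("jake_test_job", 14, "job_exits_1");
        System.out.println(name);
        System.out.println(parse(name));
    }
}
```

Encoding the job, build number, and cause in the name is what makes the later relabeling possible: each component sits between fixed delimiters, so a single regular expression with three capture groups can recover them.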
This allows `jenkins_bfa` metrics to be joined to other metrics such as `default_jenkins_builds_build_result_ordinal` via the common labels `jenkins_job` and `build_number`/`number`.
I would have implemented this a bit differently, but unfortunately metrics-core does not seem to support "tags" (what Prometheus refers to as labels), hence the need to relabel the Prometheus metrics. My understanding, based on discussions in https://github.com/dropwizard/metrics, is that tags will be supported in 5.x, which is not officially released yet anyway.
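As a rough picture of what joining on the common labels means, the following Java sketch matches failure-cause series to build results in memory. It assumes the failure-cause series carries `build_number` while the other metric carries `number`. The class `BfaJoinSketch`, the `#` key separator, and the result ordinal value `2` are placeholders for illustration, not values or code from this PR. In practice the join would be written as a query in Prometheus rather than in Java.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory illustration of a label-based join.
class BfaJoinSketch {
    // Join key: job name plus build number, read from whichever
    // label name ("build_number" or "number") the series uses.
    static String key(Map<String, String> labels) {
        String build = labels.containsKey("build_number")
                ? labels.get("build_number")
                : labels.get("number");
        return labels.get("jenkins_job") + "#" + build;
    }

    // For each failure-cause series, look up the build result with the same key.
    static List<String> join(List<Map<String, String>> causes,
                             Map<String, Integer> resultOrdinalByKey) {
        List<String> rows = new ArrayList<>();
        for (Map<String, String> cause : causes) {
            Integer ordinal = resultOrdinalByKey.get(key(cause));
            if (ordinal != null) {
                rows.add(key(cause) + " cause=" + cause.get("cause")
                        + " result_ordinal=" + ordinal);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Map<String, String>> causes = List.of(
                Map.of("jenkins_job", "jake_test_job", "build_number", "14", "cause", "job_exits_1"),
                Map.of("jenkins_job", "jake_test_job", "build_number", "15", "cause", "no_matching_cause"));

        // Placeholder build results, keyed the same way but using the "number" label.
        Map<String, Integer> results = new HashMap<>();
        results.put(key(Map.of("jenkins_job", "jake_test_job", "number", "14")), 2);
        results.put(key(Map.of("jenkins_job", "jake_test_job", "number", "15")), 2);

        join(causes, results).forEach(System.out::println);
    }
}
```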
Testing done
The metrics mentioned above and the screenshot below were gathered as follows:
Using `mvn -DskipTests clean hpi:run` in my Eclipse Run Configuration, as well as `docker run --rm -p 9090:9090 -v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus` (prometheus.yml is pasted below, along with a diff of pom.xml that adds 2 dependencies in order to pull in the Prometheus plugin for testing Prometheus metrics).
pom.xml
prometheus.yml
Create a freestyle job that simply runs `set +x` and `exit 1` to generate metrics for build failures with no matching causes.
Create a new failure cause with an indication matching the "exit 1" in the log to generate metrics for build failures with the matching cause.
Prometheus screenshot below:
Submitter checklist