Add support for Percentile Stats lines #11

aleroyer · 2023-02-18T18:57:21Z

This is a first attempt in supporting Percentile Stats.

Not entirely satisfied for now as the exporter doesn't support multiple labels per time series (this will need bigger changes) but this should be sufficient to fix #10.

Samples:

rsyslog_percentile_global{counter="host_statistics.new_metric_add"} 1
rsyslog_percentile_global{counter="host_statistics.ops_overflow"} 0
rsyslog_percentile_global{counter="my_stats.new_metric_add"} 0
rsyslog_percentile_global{counter="my_stats.ops_overflow"} 0

rsyslog_host_statistics_percentile_bucket{bucket="test_stat|p50"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|p95"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|p99"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|window_count"} 6
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|window_max"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|window_min"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|window_sum"} 0

Signed-off-by: Antoine Leroyer <[email protected]>

matthiasr · 2023-02-18T20:08:46Z

The data provided reasonably resembles the shape of a Prometheus summary, I think the exporter should present them as such. Some tallying and conversion will be required but I don't see a blocker right now

matthiasr · 2023-02-18T20:11:02Z

Also as a general terminology note, "bucket" is strongly associated with histograms, not quantiles. I would like to avoid mixing these to avoid confusion

aleroyer · 2023-02-19T14:39:07Z

Thanks for the feedback @matthiasr.

I think the exporter should present them as such. Some tallying and conversion will be required but I don't see a blocker right now

I agree with summary, forgot about this type!
For the conversion part, it shouldn't be hard. I think I might get issues with the render of such metrics as this exporter is using a point struct that only support 1 label. Having a summary means having more than 1 label (quantile, name and stat maybe). I will check how to implement this gradually.

For example, I can rename this

rsyslog_host_statistics_percentile_bucket{bucket="test_stat|p50"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|p95"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|p99"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|window_count"} 6
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|window_max"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|window_min"} 0
rsyslog_host_statistics_percentile_bucket{bucket="test_stat|window_sum"} 0

into this

# summary type
rsyslog_percentile_stats{name="host_statistics", quantile="0.50", stat="test_stat"} 0
rsyslog_percentile_stats{name="host_statistics", quantile="0.95", stat="test_stat"} 0
rsyslog_percentile_stats{name="host_statistics", quantile="0.99", stat="test_stat"} 0
rsyslog_percentile_stats_count{name="host_statistics", stat="test_stat"} 6
rsyslog_percentile_stats_sum{name="host_statistics", stat="test_stat"} 0
# gauge type? I don't think they can be included in summary
rsyslog_percentile_stats_min{name="host_statistics", stat="test_stat"} 0
rsyslog_percentile_stats_max{name="host_statistics", stat="test_stat"} 0

However, count and sum here might be wrong and will not respect Prometheus summary.

As stated in Prometheus documentation:

While it also provides a total count of observations and a sum of all observed values, it calculates configurable quantiles over a sliding time window.
[...]

the total sum of all observed values, exposed as _sum

the count of events that have been observed, exposed as _count

But rsyslog here is exposing the sum and count for the window size specified by the user in rsyslog.conf. Not all observed values during the lifetime of the process.

msg_per_host|window_sum: sum of recorded values in the statistical population window.
msg_per_host|window_count: count of recorded values in the statistical population window.

Will this be an issue? If yes, I will need to do the sum/count part in the exporter but I don't think I can obtain something accurate.

matthiasr · 2023-02-20T07:53:47Z

Yes they need to be cumulative. You can create a complete summary with NewConstSummary, but you will need the aggregated sum and count for that. The way I am used to doing that is to hold the values in a separate (map of) struct(s), then implement the custom collector interface to only construct the actual metric objects at scrape time

How does the window relate to the reporting interval? If we get the count for the window every time at the end of the window, summing them all up would be "good enough". If somehow an increment gets lost, nothing we can do about it. OTOH if the exporter restarts and the count starts from zero again, that's OK as long as it doesn't happen too frequently.

aleroyer added 2 commits February 18, 2023 19:58

Support percentile statistics counters

ed32dbe

Signed-off-by: Antoine Leroyer <[email protected]>

Handle lines with percentile statistics counters and buckets

e579ec0

Signed-off-by: Antoine Leroyer <[email protected]>

aleroyer force-pushed the feat-percentilestat branch from 4e17db7 to e579ec0 Compare February 18, 2023 18:58

Fix the type for percentile.bucket

dd59941

Signed-off-by: Antoine Leroyer <[email protected]>

aleroyer marked this pull request as ready for review February 18, 2023 19:42

aleroyer added the enhancement New feature or request label Feb 18, 2023

aleroyer marked this pull request as draft February 21, 2023 20:21

vilhelmprytz mentioned this pull request Feb 21, 2024

error handling stats line - after update to rsyslog-8.2204 #10

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for Percentile Stats lines #11

Add support for Percentile Stats lines #11

aleroyer commented Feb 18, 2023 •

edited

Loading

matthiasr commented Feb 18, 2023

matthiasr commented Feb 18, 2023

aleroyer commented Feb 19, 2023

matthiasr commented Feb 20, 2023

Add support for Percentile Stats lines #11

Are you sure you want to change the base?

Add support for Percentile Stats lines #11

Conversation

aleroyer commented Feb 18, 2023 • edited Loading

matthiasr commented Feb 18, 2023

matthiasr commented Feb 18, 2023

aleroyer commented Feb 19, 2023

matthiasr commented Feb 20, 2023

aleroyer commented Feb 18, 2023 •

edited

Loading