Agreed. This is the reason we have not made the change.
It's also the reason explicitly called out in the specification:
SDKs SHOULD use the default value when boundaries are not explicitly provided, unless they have good reasons to use something different (e.g. for backward compatibility reasons in a stable SDK release).
This does not look like a proposal we plan to accept.
Problem Statement
Histograms are commonly used for recording latencies. The default values for bucket boundaries are
[]float64{0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000}
(code). This works well for values in milliseconds; however, the Prometheus documentation recommends using seconds rather than milliseconds as the unit. When latency is recorded in seconds with the default buckets, the vast majority of timings land in the single bucket between 0 and 5 seconds. This results in inaccurate histogram quantile calculations, since a quantile estimate cannot be more precise than the bucket it falls in.
This is very similar to this issue in the .NET repo: open-telemetry/opentelemetry-dotnet#4797
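To make the problem concrete, here is a minimal standard-library sketch that tallies a few latencies against the default boundaries quoted above. The sample latencies and the helper names (bucketIndex, countBuckets) are hypothetical, and the bucketing follows the usual convention of treating each boundary as an inclusive upper bound; it is an illustration, not the SDK's actual aggregation code.

```go
package main

import (
	"fmt"
	"sort"
)

// defaultBounds are the default bucket boundaries quoted in this issue.
var defaultBounds = []float64{0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000}

// bucketIndex returns the index of the bucket that v falls into, treating
// each boundary as an inclusive upper bound. Index len(bounds) is the
// overflow bucket for values above the last boundary.
func bucketIndex(bounds []float64, v float64) int {
	return sort.SearchFloat64s(bounds, v)
}

// countBuckets tallies how many values fall into each bucket.
func countBuckets(bounds, values []float64) []int {
	counts := make([]int, len(bounds)+1)
	for _, v := range values {
		counts[bucketIndex(bounds, v)]++
	}
	return counts
}

func main() {
	// Hypothetical request latencies, recorded in seconds.
	latencies := []float64{0.004, 0.012, 0.035, 0.120, 0.450, 1.8}

	counts := countBuckets(defaultBounds, latencies)

	// Bucket 1 covers (0, 5]; every sample above ends up there.
	fmt.Printf("%d of %d samples landed in the (0, 5] bucket\n", counts[1], len(latencies))
}
```

With every sample in one bucket, a 4 ms request and a 1.8 s request are indistinguishable to any quantile computed from the histogram.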
Proposed Solution
opentelemetry-go could use a different set of default buckets when the histogram units are known to be seconds.
This was implemented in the .NET library here: open-telemetry/opentelemetry-dotnet#4820
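As a rough illustration of the idea, unit-aware defaults could be sketched as below. Everything here is hypothetical: defaultBoundsForUnit is not part of opentelemetry-go, and the seconds boundaries are simply the current defaults divided by 1000, which is an assumption for this sketch rather than the values chosen in the .NET change.

```go
package main

import "fmt"

// millisecondBounds are the current default boundaries quoted in this issue.
var millisecondBounds = []float64{0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000}

// secondBounds is a hypothetical default set for seconds: the millisecond
// defaults scaled by 1/1000. These values are an assumption for this sketch.
var secondBounds = []float64{0, 0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10}

// defaultBoundsForUnit is a hypothetical helper that picks default
// boundaries from the instrument's unit. It is not an opentelemetry-go API.
func defaultBoundsForUnit(unit string) []float64 {
	if unit == "s" {
		return secondBounds
	}
	return millisecondBounds
}

func main() {
	fmt.Println(defaultBoundsForUnit("s"))
	fmt.Println(defaultBoundsForUnit("ms"))
}
```

A real implementation would also have to decide what to do when no unit is set, which is where the compatibility concern comes in.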
Alternatives
The current workaround is to use the WithExplicitBucketBoundaries option on all histograms dealing in seconds.
Prior Art
.NET issue: open-telemetry/opentelemetry-dotnet#4797
.NET solution: open-telemetry/opentelemetry-dotnet#4820
Additional Context
This would likely be a breaking change.