The help text for the unbound_response_time_seconds histogram says: "Query response time in seconds"
I thought this meant it would measure the time unbound takes to respond to every client query. However, it does not seem to include queries served from cache.
I'm not sure it's possible in Prometheus to do a histogram quantile calculation over a histogram plus another stray series interpreted as an extra bucket. Perhaps unbound_response_time_seconds should include cache hits in the lowest bucket? At the very least, this should be documented.
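To make the "extra bucket" idea concrete, here is a minimal Python sketch of what such a quantile calculation would look like if cache hits were folded into the lowest bucket. It assumes cache hits are always faster than the lowest bucket bound, and it uses the same kind of linear interpolation within a bucket that Prometheus's histogram_quantile uses. The function name and the sample numbers are hypothetical, not taken from unbound or the exporter.

```python
import math

def quantile_with_cache_hits(q, buckets, cache_hits):
    """Estimate the q-quantile of response time, counting cache hits.

    buckets: list of (upper_bound_seconds, cumulative_count), sorted by
             upper bound and ending with the +Inf bucket.
    cache_hits: number of queries answered from cache (assumption: all of
                them are faster than the lowest bucket bound).
    """
    # Buckets are cumulative, so adding the hits to the lowest bucket means
    # adding them to every bucket.
    shifted = [(le, count + cache_hits) for le, count in buckets]
    total = shifted[-1][1]
    if total == 0:
        return math.nan

    rank = q * total
    prev_le, prev_count = 0.0, 0.0
    for le, count in shifted:
        if count >= rank:
            if math.isinf(le) or count == prev_count:
                # Nothing to interpolate over: fall back to the lower bound.
                return prev_le
            # Linear interpolation inside the bucket that contains the rank.
            return prev_le + (le - prev_le) * (rank - prev_count) / (count - prev_count)
        prev_le, prev_count = le, count
    return prev_le

# Hypothetical numbers: 20 cache misses in the histogram, 80 cache hits.
example_buckets = [(0.001, 0), (0.01, 10), (0.1, 20), (math.inf, 20)]
median_misses_only = quantile_with_cache_hits(0.5, example_buckets, 0)
median_with_hits = quantile_with_cache_hits(0.5, example_buckets, 80)
```

With these made-up numbers the median drops from the 10 ms bucket bound to well under 1 ms once cache hits are counted, which is the gap between "recursion time" and "what end users see" that this issue is about.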
An interesting question! Cache hits and cache misses will have a completely different distribution, so it's probably hard to represent them nicely in a single set of histogram buckets. We could add a label cache="hit" vs cache="miss" but the buckets would still be suboptimal for one or the other situation.
I can also see, though, why you would be interested in the question of "what is the performance my end-users see, covering both hits and misses."
> it does not seem to include queries served from cache
Can I ask what you're basing this on? I don't know one way or the other what the answer is.
> I can also see, though, why you would be interested in the question of "what is the performance my end-users see, covering both hits and misses."
That's exactly it 🙂
> Can I ask what you're basing this on?
It was a guess based on some surprising results I was seeing on my dashboard, reinforced by checking out the munin setup, and then experimentation confirmed my guess.
I started a new unbound server and repeated the same query a few times, checking unbound-control stats_noreset after each query, and found that the first answer was counted in one of the buckets and subsequent answers were not. I also found through experimentation that background "prefetch" queries don't seem to be counted in the histogram either. I thought maybe the histogram measured outgoing recursion time, regardless of whether it is user-facing or not.
Caveat emptor: I didn't check local authority zones, forward zones, etc., so I can't say whether those are counted or not.
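For anyone who wants to repeat the experiment, the comparison can be scripted roughly as follows. This is a sketch, not a definitive check: it assumes the output of unbound-control stats_noreset is a list of key=value lines that includes total.num.queries, total.num.cachehits, total.num.cachemiss and histogram.* entries, and the SAMPLE text below is made-up data for illustration, not real output.

```python
def parse_stats(text):
    """Parse key=value lines into a dict of floats, skipping other lines."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:
            stats[key] = float(value)
    return stats

def histogram_coverage(stats):
    """Compare the histogram's total count with the query counters."""
    histogram_total = sum(
        value for key, value in stats.items() if key.startswith("histogram.")
    )
    return {
        "queries": stats.get("total.num.queries", 0.0),
        "cache_hits": stats.get("total.num.cachehits", 0.0),
        "cache_misses": stats.get("total.num.cachemiss", 0.0),
        "histogram_total": histogram_total,
    }

# Hypothetical sample: the same name queried four times on a fresh server.
SAMPLE = """\
total.num.queries=4
total.num.cachehits=3
total.num.cachemiss=1
histogram.000000.016384.to.000000.032768=1
histogram.000000.032768.to.000000.065536=0
"""

coverage = histogram_coverage(parse_stats(SAMPLE))
print(coverage)
```

If the histogram total tracks cache_misses rather than queries as more queries are sent, that matches the behaviour described above. On a live server the text would come from running unbound-control stats_noreset instead of the SAMPLE string.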
The munin plugin plots total cache hits along with the histogram, putting them under the lowest histogram bucket.