tsdb: SeriesIterator.Seek() is vaguely defined #5871
Comments
OK, I figured out the confusing unit test: the But questions (2), (3) and (4) still stand. Thanks.
Hi, thanks for taking a look! You've made some really valid points!
Also, we should re-use the iterator if it's the same chunk; we are decompressing the chunk from the beginning every single Finally, if you want to add Thanks for volunteering to do the work. I look forward to reviewing the PRs; let me know if you need help.
PR coming up soon (still have some documentation to go through). Regarding question (4) though, my example might not have been the best. What I am worried about is that one might request a More importantly, I believe that requesting the last 10 minutes and then seeking back beyond that is very likely to be a bug, and it is more likely to go unnoticed if a value is returned than if it consistently fails.
Oh, and about Currently the way the instant value of a series at time It is simply an optimization for retrieving the instant value of a series without always iterating over the previous 5 minutes. The
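To make the "instant value" idea in this comment concrete, here is a minimal Go sketch of looking up the most recent sample at or before a time `t` within a 5-minute lookback window, using a binary search instead of iterating over the whole window. It is not the tsdb code: the `sample` type, the `valueAt` function, the `lookbackMs` constant, and the choice of a left-open window `(t - 5m, t]` are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// sample is a hypothetical (timestamp, value) pair; timestamps are in milliseconds.
type sample struct {
	t int64
	v float64
}

// lookbackMs is assumed to be 5 minutes, the window mentioned in the thread.
const lookbackMs = 5 * 60 * 1000

// valueAt returns the most recent sample whose timestamp lies in
// (t-lookbackMs, t], or false if no such sample exists.
// samples must be sorted by ascending timestamp.
// The left-open boundary is an assumption of this sketch.
func valueAt(samples []sample, t int64) (sample, bool) {
	// Binary search for the index of the first sample strictly after t.
	i := sort.Search(len(samples), func(i int) bool { return samples[i].t > t })
	if i == 0 {
		return sample{}, false // every sample is after t
	}
	s := samples[i-1]
	if s.t <= t-lookbackMs {
		return sample{}, false // the closest earlier sample is too old
	}
	return s, true
}

func main() {
	samples := []sample{{0, 1}, {60000, 2}, {120000, 3}}
	s, ok := valueAt(samples, 90000)
	fmt.Println(s.t, s.v, ok)
	_, ok = valueAt(samples, 120000+lookbackMs)
	fmt.Println(ok)
}
```

The binary search only works on an already decoded slice; on a compressed chunk the samples still have to be decoded sequentially, which is the cost the proposed method is trying to avoid.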
Signed-off-by: Alin Sinpalean <[email protected]>
Anyway, created PR prometheus-junkyard/tsdb#329 which attempts to fix (1), (2) and (3). I also see (4) as an issue, as well as what
I added more info to I believe this PR fixes this issue.
Is there anything left to do here?
Hello from the bug scrub.
We take three years of silence as "no". Closing.
Context: I was looking into cleaning up the iterator code, as I'm trying to add a separate `SeekBefore()` method that would return the value at time `t` as defined by Prometheus without always parsing and iterating over a whole chunk covering the previous 5 minutes (when the looked-up value is toward the beginning of a chunk).

However, I ran into some issues regarding how chunked series and iterators should actually work. There are a couple of gray areas that are only partially covered by comments or test code, making it impossible to figure out what the expected correct behavior is:

1. `blockQuerier` and the resulting `chunkedSeries` and `chunkSeriesIterator` all have `mint` and `maxt` fields, which (along with `(*chunkSeriesIterator).Seek()` implementation details) suggest that iterators should only return samples with timestamps `mint <= t <= maxt`. However, there are bugs (such as the one I'm attempting to fix with prometheus-junkyard/tsdb#327, "Seek() shouldn't return true when past maxt. Also add some tests to s…") and even confusing/confused unit tests, which appear to create an iterator with `mint = 5`, `maxt = 8` and expect `(*chunkSeriesIterator).Seek()` to return a value with timestamp `3`. Are `mint` and `maxt` supposed to be "hard" limits on the respective iterators or only on which chunks are selected?
2. `SeriesIterator.Seek()` is only supposed to seek forward (which doesn't work as expected, as it will also seek backward within the same chunk). Is this assumption correct?
3. Is `Seek()` supposed to always advance the iterator, or should a second `Seek()` call with the same timestamp as the immediately preceding one leave the iterator positioned where it was before?
4. Is `Seek()` with a timestamp `t < mint` to ever succeed? If so, a poorly defined series, e.g. `[now - 10m, now]`, will "successfully" produce a value of the series at `now - 1y` equal to the first value after `now - 10m`.

I'm glad to do the work (including properly documenting the code) regardless of what the answers are; I'd just like to get some "authoritative" input. Thanks.
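The four questions above can be pinned down with a small Go sketch of one possible set of answers. It is not the tsdb's `chunkSeriesIterator`: the `boundedIterator` type, its constructor, and the `sample` type are hypothetical, and the choices made here (hard `mint`/`maxt` limits, forward-only `Seek()`, a repeated `Seek()` leaving the position unchanged, and `t < mint` being clamped to `mint` rather than rejected) are assumptions for illustration only. The alternative for question (4), failing when `t < mint`, would replace the clamping branch with `return false`.

```go
package main

import "fmt"

// sample is a hypothetical (timestamp, value) pair.
type sample struct {
	t int64
	v float64
}

// boundedIterator is a hypothetical iterator over sorted samples that
// treats mint and maxt as hard limits (one answer to question 1).
type boundedIterator struct {
	samples    []sample // sorted by ascending timestamp
	mint, maxt int64
	cur        int // index of the current sample; -1 before the first Seek
}

func newBoundedIterator(samples []sample, mint, maxt int64) *boundedIterator {
	return &boundedIterator{samples: samples, mint: mint, maxt: maxt, cur: -1}
}

// Seek positions the iterator at the first sample with timestamp >= t
// inside [mint, maxt] and reports whether such a sample exists.
func (it *boundedIterator) Seek(t int64) bool {
	if t < it.mint {
		t = it.mint // question 4: clamp instead of failing (an assumption)
	}
	// Questions 2 and 3: start from the current position, so Seek never
	// moves backward and repeating the same Seek leaves the position as is.
	i := it.cur
	if i < 0 {
		i = 0
	}
	for i < len(it.samples) && it.samples[i].t < t {
		i++
	}
	it.cur = i
	return i < len(it.samples) && it.samples[i].t <= it.maxt
}

// At returns the current sample; it is only valid after Seek returned true.
func (it *boundedIterator) At() sample { return it.samples[it.cur] }

func main() {
	samples := []sample{{1, 1}, {3, 3}, {5, 5}, {6, 6}, {8, 8}, {9, 9}}
	it := newBoundedIterator(samples, 5, 8)
	fmt.Println(it.Seek(3), it.At().t) // never yields the sample at t=3
	fmt.Println(it.Seek(6), it.At().t)
	fmt.Println(it.Seek(4), it.At().t) // backward seek keeps the position
	fmt.Println(it.Seek(9))            // past maxt
}
```

With `mint = 5` and `maxt = 8`, this sketch never returns the sample with timestamp `3`, which is the behavior the unit test mentioned in question (1) appeared to contradict.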