-
Notifications
You must be signed in to change notification settings - Fork 179
Add ChunksIterator method to Series interface. #665
base: master
Are you sure you want to change the base?
Conversation
7feb231
to
291ab8e
Compare
Signed-off-by: Bartek Plotka <[email protected]>
291ab8e
to
64eeed9
Compare
Addressed comments @codesome PTAL |
Signed-off-by: Bartek Plotka <[email protected]>
… method. Signed-off-by: Bartek Plotka <[email protected]>
…o far. TODO: tests & splitting into 2. Signed-off-by: Bartek Plotka <[email protected]>
Added bit yolo implementation of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The approach looks fine to me.
querier.go
Outdated
} | ||
|
||
func (it *noSeekSeriesIterator) Seek(t int64) bool { | ||
it.err = errors.New("not implemented: Seek method invoked for noSeekSeriesIterator") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather prefer having an inefficient implementation as @brian-brazil said instead of an error here.
TODO: Add tests. Signed-off-by: Bartek Plotka <[email protected]>
Signed-off-by: Bartek Plotka <[email protected]>
@brian-brazil very true. But we need to split chunks from time to time as mentioned here: https://github.com/prometheus/tsdb/issues/670 so I would reencode in place here. Changed the code to reuse more stuff around, let me know how that looks. |
Signed-off-by: Bartek Plotka <[email protected]>
CHANGELOG.md
Outdated
## 0.10.0 | ||
|
||
- [FEATURE] Added `DBReadOnly` to allow opening a database in read only mode. | ||
- `DBReadOnly.Blocks()` exposes a slice of `BlockReader`s. | ||
- `BlockReader` interface - removed MinTime/MaxTime methods and now exposes the full block meta via `Meta()`. | ||
- [FEATURE] `chunckenc.Chunk.Iterator` method now takes a `chunckenc.Iterator` interface as an argument for reuse. | ||
- [CHANGE] `Series` interface allows return chunk iterator that allows iterating over encoded chunks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrong paragraph?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
failed rebase
return false | ||
} | ||
|
||
for it.Next() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think when the current time is the requested time calling Seek should not advance the iterator.
for it.Next() { | |
if atT, _ := it.At(); t >= atT { | |
return true | |
} | |
for it.Next() { |
the same for all places.
I am basing this behavior basd on @gouthamve's comment in https://github.com/prometheus/tsdb/issues/328#issuecomment-387739347
Should also add this to ensure the behavior in the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! I will add decisions there to Seek interface comment and add tests for this.
return newVerticalMergeSeriesIterator(s.series...) | ||
} | ||
|
||
// verticalMergeSeriesIterator implements a series iterater over a list | ||
func (s *verticalChainedSeries) ChunkIterator() ChunkIterator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be useful to also add a test for this.
It will help visualize what output is expected with a given input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a test that covers this?
} | ||
return &verticalMergeChunkIterator{ | ||
a: s[0].ChunkIterator(), | ||
b: newVerticalMergeChunkIterator(s[1:]...), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this really hard to understand. I know it is used in the verticalMergeSeriesIterator
as well, but will be really happy if we simplify this at some point.
Not a blocker for this PR thought. I just thought I would mention it 😈
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The recursive approach is sometimes hard to follow indeed. Let's postpone cleaning this though, as we do this everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did run a debug session and the final struct is really bizarre, sort of deeply nested.
But yeah a conversation for another PR.
Thanks @krasi-georgiev for review. Let me add few unit tests to cover all iterators I added (: |
Signed-off-by: Bartek Plotka <[email protected]>
Signed-off-by: Bartek Plotka <[email protected]>
// Err returns the current error. | ||
Err() error | ||
func (s *chunkSeries) ChunkIterator() ChunkIterator { | ||
return &chunkIterator{chunks: s.chunks} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newChunkSeriesIterator
accepts mint and maxt. Should we do the same here? @brian-brazil @krasi-georgiev ? I think we should.
However not sure what mint maxt means here in detail. Should we be strict and allow mint and maxt in terms of samples inside chunks as well?
By the looks of this: https://github.com/prometheus/tsdb/blob/78e9e360581153d73b1c69d31fd6e2d89e060a82/querier.go#L796 we just check if they overlap with required min or maxt here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then, we pass to newChunkSeriesIterator
with mint maxt so I am pretty sure we need to be strict here as well in ChunkIterator
. E.g when we query block [0, 20)with querier restricted to
[10, 40]we need to return chunk
[10,19]even if the block has
[0, 19]on this
ChunkIterator` level, I assume.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newChunkSeriesIterator accepts mint and maxt. Should we do the same here? @brian-brazil @krasi-georgiev ? I think we should.
I think it is fine to start with with the current interface if it covers the streaming use case. Don't see a problem to refactor when/if needed.
* Simplified; merged seek and non seek cases together. Added explicit min/max only for chunk series iterator, where it is relevant. * Adjusted all seek implementation to match edge case requirement (double seek, failed seek + next). Signed-off-by: Bartek Plotka <[email protected]>
d60075a
to
b84c439
Compare
@@ -60,6 +60,19 @@ type Series interface { | |||
ChunkIterator() ChunkIterator | |||
} | |||
|
|||
// ChunkIterator iterates over the chunk of a time series. | |||
type ChunkIterator interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to be near Series
interface.
// Do binary search between current position and end. | ||
it.idx = sort.Search(len(it.list)-it.idx, func(i int) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Problem here was that sort.Search
on not found gives the index to put the value in. This makes sense but seems like we forgot that we operate on shorter len(it.list)-it.idx
list (:
With new changes this PR also fixes https://github.com/prometheus/tsdb/issues/328 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this cover all changes from #329 so I can close it in favor of this one?
// Err returns the current error. | ||
Err() error | ||
func (s *chunkSeries) ChunkIterator() ChunkIterator { | ||
return &chunkIterator{chunks: s.chunks} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newChunkSeriesIterator accepts mint and maxt. Should we do the same here? @brian-brazil @krasi-georgiev ? I think we should.
I think it is fine to start with with the current interface if it covers the streaming use case. Don't see a problem to refactor when/if needed.
return newVerticalMergeSeriesIterator(s.series...) | ||
} | ||
|
||
// verticalMergeSeriesIterator implements a series iterater over a list | ||
func (s *verticalChainedSeries) ChunkIterator() ChunkIterator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a test that covers this?
} | ||
return &verticalMergeChunkIterator{ | ||
a: s[0].ChunkIterator(), | ||
b: newVerticalMergeChunkIterator(s[1:]...), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did run a debug session and the final struct is really bizarre, sort of deeply nested.
But yeah a conversation for another PR.
cc @free if you are happy with that. I am touching this place as well. |
// Seek advances the iterator forward to the given timestamp. | ||
// It advances to the chunk with min time at t or first chunk with min time after t. | ||
Seek(t int64) bool | ||
// At returns the meta. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"current value". This makes it sounds like it's dropping memes :)
This is alternative to the #659 I feel like iterators is the way to go here as we want to iterate & stream chunks instead of buffering all.
Both PRs are for same use case: Allowing
Querier
to return raw chunks if asked. This will help massively on prometheus/prometheus#5703This iterators implementation is easy here but will be more important & tricky for
head
in Prometheus.fixes https://github.com/prometheus/tsdb/issues/328
Signed-off-by: Bartek Plotka [email protected]