Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iterate from minimum value #286

Closed
grelner opened this issue Aug 5, 2024 · 10 comments
Closed

Iterate from minimum value #286

grelner opened this issue Aug 5, 2024 · 10 comments

Comments

@grelner
Copy link
Contributor

grelner commented Aug 5, 2024

I have a use case where I need to iterate all values greater than a specific number. The bitmaps I'm working with are huge, so using .iter().skipWhile is too slow. I've had a look at the contains method is implemented, and a sound approach to achieve what I need would probably be to do a binary search to find the starting container, and iterate from there, introducing a new method RoaringBitmap.iter_from(&self, min_value: u32)

I just wanted to check if anyone have any insights or opinions on this before I start implementing, or if there is an api to achieve this already that I have somehow missed?

@Kerollmops
Copy link
Member

Kerollmops commented Aug 5, 2024

Hey @grelner 👋

I am wondering if it would be better to implement a new method directly on the RoaringBitmap iterator instead of having another iterator constructor.

The method signature would be Iter::skip_to(&self, n: u32) -> SkipTo<Self> working exactly like the SkipWhile iterator adapter but being much more optimized. A call to SkipTo::next would make the iterator skip all the useless internal collections and start returning numbers from n included.

I am wondering if we want to introduce four methods on the four iterators (bitmap and treemap with the ref and owned ones) or if we want to declare a new trait 🤔

What do you think?

@grelner
Copy link
Contributor Author

grelner commented Aug 5, 2024

Hi @Kerollmops
Implementing it on the iterators rather than a new constructor on RoaringBitmap is a great idea! It would probably be much cleaner to implement. I've looked into the constructor way, and it would require adding support functions in the entire call chain (RoaringBitmap, Container, ArrayStore, BitmapStore). Using your suggestion we can avoid most of that.

As for the trait, why not? It may be useful for someone at some point

@Kerollmops
Copy link
Member

@grelner, do you plan to implement it and propose a PR? I will be happy to review one and you should probably start without the trait, we can introduce it later in the same PR.

@grelner
Copy link
Contributor Author

grelner commented Aug 5, 2024

@Kerollmops yes, working on it right now 😅

@Dr-Emann
Copy link
Member

Dr-Emann commented Aug 5, 2024

In the CRoaring bindings, I used the name reset_at_or_after, though that's because the c implementation (roaring_uint32_iterator_move_equalorlarger) allows moving both forward or backward.

@grelner
Copy link
Contributor Author

grelner commented Aug 5, 2024

I've run into a rust quirk I haven't encountered before in my implementation. I want to preserve the DoubleEndedIterator after the skip_to call, so that you can still both go backwards and forwards from the point you skipped to.
To do this I need something like slice::Iter in ArrayStore that can be moved to a specific offset. I could pretty easily implement this myself, but slice::Iter is highly optimized and uses a few unstable calls so I can't really copy the code and modify it. Any tips on how to achieve this efficiently?

@Kerollmops
Copy link
Member

Kerollmops commented Aug 5, 2024

Hey @grelner 👋

You should take inspiration from the SkipWhile iterator combinator and as you can see you it doesn't support DoubleEndedIterator.

@Dr-Emann
Copy link
Member

Dr-Emann commented Aug 7, 2024

Think it's worth looking at what the lowest level abstraction would be. I'd argue (similar to advance_by unstablely in std) that the more fundamental operation is advancing an iterator in place until right before the desired location.

@Kerollmops
Copy link
Member

Hey @grelner 👋

After thinking a little bit more about this issue I am wondering if implementing a non-lazy advance_to method couldn't be easier, indeed.

Like you just remove the Flatten combinator and move forward until you find the right container iterator and then skip the items from this one until you reach the number.

What do you think?

@grelner
Copy link
Contributor Author

grelner commented Aug 9, 2024

Hi @Kerollmops , that is essentially what I have implemented in #287 . There are a couple of things I'm not happy with in that PR tho, but I'll clean it up hopefully at some point in the next days, and mark it ready for review

@grelner grelner closed this as completed Nov 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants