
fixes #295: Use a semaphore to protect the Batcher from unbounded memory growth. #304

Merged: 1 commit merged into Kuadrant:main on May 15, 2024

Conversation

@chirino (Contributor) commented Apr 30, 2024

No description provided.

@@ -141,10 +143,11 @@ impl Batcher {
notifier: Default::default(),
interval: period,
priority_flush: AtomicBool::new(false),
limiter: Semaphore::new(1000),
chirino (Contributor, PR author):

we may want to make this configurable later. Is there a better default we can use for now?

Member:

I did the same for the batch size, made it 100... but it all really is a function of what the Limits backing these counters look like, i.e. ones with many Conditions or large variable names will just consume more memory. And these limits change during a Limitador instance's "lifetime"... So other than delegating the issue to the user, I don't really see anything. We could, as I did for the in-memory one, do some broken maths on the memory available on the pod... but again, it's a huge guesstimate :(

Member:

Or we'd need to monitor memory usage and do it all conditional on that... Not sure I want to start going down that rabbit hole tho 🙈

Member:

Maybe we could, for now, use the same max_cached_counters value. It gives the user some control, while avoiding leaking too many "implementation details" just quite yet, wdyt?

chirino (Contributor, PR author):

Sounds good. Updated.

@chirino force-pushed the fix-295 branch 2 times, most recently from 3488c3a to bb56676, on Apr 30, 2024 at 17:49
@alexsnaps (Member) left a comment:

We can't use Semaphore that easily, I fear. So either we move to a different notification model, or we do the proper math (tho that'd be racy) on adding permits back, based off the size of the pending writes and the permits remaining.

.remove_if(counter, |_, v| v.no_pending_writes())
.is_some()
{
self.limiter.add_permits(1);
Member:

I think the example in the doc is somewhat oversimplified...
Reading the docs, .add_permits doesn't account for the initial permits provided (i.e. doesn't treat it as a max), see here.
So we need to do all the math ourselves, as this wouldn't account for re-adding permits back when counters expire, for instance. I'd expect this would run out of permits after a while.
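A std-only sketch of the behavior being discussed, using a hypothetical `Sem` type as a synchronous stand-in for tokio's `Semaphore` (all names here are illustrative, not from the PR): `add_permits` unconditionally increases the available count and does not treat the value passed to the constructor as a ceiling, so any bookkeeping that keeps permits from drifting upward is entirely on the caller.

```rust
use std::sync::{Condvar, Mutex};

// Minimal counting semaphore: a stand-in for tokio::sync::Semaphore,
// close enough to show the `add_permits` semantics in question.
struct Sem {
    permits: Mutex<usize>,
    cv: Condvar,
}

impl Sem {
    fn new(initial: usize) -> Self {
        Sem { permits: Mutex::new(initial), cv: Condvar::new() }
    }

    // Block until a permit is available, then take it.
    fn acquire(&self) {
        let mut p = self.permits.lock().unwrap();
        while *p == 0 {
            p = self.cv.wait(p).unwrap();
        }
        *p -= 1;
    }

    // Like tokio's `add_permits`: unconditionally adds; there is no
    // cap at the initial count, so over-releasing inflates the limit.
    fn add_permits(&self, n: usize) {
        let mut p = self.permits.lock().unwrap();
        *p += n;
        self.cv.notify_all();
    }

    fn available(&self) -> usize {
        *self.permits.lock().unwrap()
    }
}

fn main() {
    let sem = Sem::new(2);
    sem.acquire();       // one permit left
    sem.add_permits(1);  // back to the initial two
    sem.add_permits(1);  // three: already above the initial count
    assert_eq!(sem.available(), 3);
}
```

This is why the release side has to be paired exactly one-to-one with the acquire side: a stray extra release quietly raises the effective bound.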

Member:

Actually... thinking about it, it might be good enough, as we don't compact the queue anywhere 🤔
So that, if the entry is passed to the consumer, even if there are no pending writes for this entry (e.g. because it expired), this would also remove it from self.updates and return the permit...
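The accounting discussed in this thread can be sketched with a small, synchronous stand-in for the batcher (names like `Batcher` and `updates` echo the PR diff, but this struct is illustrative; `permits` is a plain counter standing in for the semaphore). The invariant it demonstrates: available permits plus queued entries always equals the capacity, because the permit is returned exactly when an entry is removed, whether it was flushed or had expired.

```rust
use std::collections::HashMap;

// Illustrative bounded batcher: one permit is consumed per *entry* in
// `updates`, and returned only when that entry is removed.
struct Batcher {
    capacity: usize,
    permits: usize,
    updates: HashMap<String, u64>,
}

impl Batcher {
    fn new(capacity: usize) -> Self {
        Batcher { capacity, permits: capacity, updates: HashMap::new() }
    }

    // Record a pending write. A *new* counter consumes a permit; the
    // real, semaphore-based code would await here instead of failing.
    fn add_update(&mut self, counter: &str, delta: u64) -> bool {
        if !self.updates.contains_key(counter) {
            if self.permits == 0 {
                return false; // semaphore would block the producer
            }
            self.permits -= 1;
        }
        *self.updates.entry(counter.to_string()).or_insert(0) += delta;
        true
    }

    // Consume (flush) an entry. Removal returns the permit, mirroring
    // `self.limiter.add_permits(1)` in the diff above, and this also
    // covers expired entries as long as they are still dequeued.
    fn consume(&mut self, counter: &str) {
        if self.updates.remove(counter).is_some() {
            self.permits += 1;
        }
    }
}

fn main() {
    let mut b = Batcher::new(2);
    assert!(b.add_update("a", 1));
    assert!(b.add_update("a", 1)); // same counter: no extra permit used
    assert!(b.add_update("b", 1));
    assert!(!b.add_update("c", 1)); // full: producer would wait here
    b.consume("a");
    assert!(b.add_update("c", 1)); // permit returned, room again
    assert_eq!(b.permits + b.updates.len(), b.capacity);
}
```

Releasing on removal (rather than on "no pending writes") is what keeps the invariant intact even when an entry expires before being flushed, which is the point the comment above is making.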

@alexsnaps (Member) left a comment:

Let's merge this, but we need to start adding tests sooner rather than later.

@chirino chirino merged commit 37bb70b into Kuadrant:main May 15, 2024
9 checks passed