
feat: CachedRedisStorage will now default to flushing a batch ASAP to… #294

Closed
wants to merge 2 commits

Conversation

chirino
Contributor

@chirino chirino commented Apr 24, 2024

… reduce the length of time the cache is out of sync with Redis counters.

Setting a flushing_period is still supported to add a delay before flushing, which can be useful to reduce the load on the Redis server in exchange for staler counters in limitador.


Signed-off-by: Hiram Chirino <[email protected]>
Comment on lines 258 to 265
// Wait for a new flush request,
flush_rx.changed().await.unwrap();

if flushing_period != Duration::ZERO {
    // Setting a flushing_period reduces the load on Redis; the downside is
    // that the cached counters will not be as accurate as when it's zero.
    tokio::time::sleep(flushing_period).await
}
Member

Have you looked at tokio's select!? We could either wait for the period to elapse or wait to be notified... so that you'd flush every n seconds or flush when notified. Because this would have the flush task wait after being notified.

Contributor Author

Waiting after the first notification is a feature: it acts as a debounce timeout, helping you batch and dedup counter updates during a short spurt of activity.

Member

See here... I don't know how much added value that notification really brings... What do you think we're addressing by introducing this behavior? Other than looping as fast as possible with the 0 flushing period... this "just" changes what the starting point of a flushing period is... 😕 What am I missing?

Contributor

The tokio::select! would be useful if we want to flush on whichever happens first, either the notify() or the flushing_period.
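
For illustration, a minimal sketch of that select!-based approach (hypothetical names such as flush_notify and flush_batch; not the PR's actual code):

```rust
use std::{sync::Arc, time::Duration};
use tokio::sync::Notify;

// Sketch only: flush on whichever happens first, a notification from the
// write path or the flushing_period elapsing.
async fn flush_loop(flush_notify: Arc<Notify>, flushing_period: Duration) {
    loop {
        tokio::select! {
            // a writer called flush_notify.notify_one() after batching an increment
            _ = flush_notify.notified() => {}
            // or the period elapsed without any notification
            _ = tokio::time::sleep(flushing_period) => {}
        }
        // flush_batch().await; // placeholder for pushing the batch to Redis
    }
}
```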

@@ -44,6 +44,7 @@ pub struct CachedRedisStorage {
async_redis_storage: AsyncRedisStorage,
redis_conn_manager: ConnectionManager,
partitioned: Arc<AtomicBool>,
flush_tx: tokio::sync::watch::Sender<()>,
Member

@alexsnaps alexsnaps Apr 25, 2024

Wondering if a Notify wouldn't be more appropriate, correct me if I'm wrong, but this will enqueue every single change and have the loop on "the other side" redo all the work... while it'd race against updates in the shared DS...

Contributor Author

Yeah it does look simpler. Watch channels don't enqueue, they only store the last value.
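
A tiny standalone illustration of that behavior (not from the PR; just tokio's watch channel semantics):

```rust
use tokio::sync::watch;

#[tokio::main]
async fn main() {
    let (tx, mut rx) = watch::channel(0u32);

    // Three sends before the receiver looks: nothing is queued,
    // the channel only keeps the most recent value.
    tx.send(1).unwrap();
    tx.send(2).unwrap();
    tx.send(3).unwrap();

    rx.changed().await.unwrap();
    assert_eq!(*rx.borrow(), 3); // the receiver only ever sees the last value
}
```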

Member

Ah, right, they do. And Notify sort of behaves similarly. So now... with the timeout being 0 this will probably loop like crazy unless the system is idle. The batch will be of the size of the race. So I'm thinking we should notify "smartly"... wdyt?

Contributor Author

Updated the PR to use Notify.
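
Presumably the updated flusher looks roughly like this (a sketch under that assumption, with placeholder names; not the exact diff):

```rust
use std::{sync::Arc, time::Duration};
use tokio::sync::Notify;

// Sketch: wait until the write path signals pending increments, optionally
// debounce by flushing_period, then flush the batch.
async fn flusher(flush_notify: Arc<Notify>, flushing_period: Duration) {
    loop {
        flush_notify.notified().await;
        if flushing_period != Duration::ZERO {
            tokio::time::sleep(flushing_period).await;
        }
        // flush_batch().await; // placeholder for pushing the batch to Redis
    }
}
```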

Contributor Author

> with the timeout being 0 this will probably loop like crazy, other than the system being idle.

Correct: it will keep flushing ASAP while there is a pending increment. This will converge the cache with the global state quicker.

@@ -174,6 +175,9 @@ impl AsyncCounterStorage for CachedRedisStorage {
}
}

// ask the flusher to flush the batch
self.flush_tx.send(()).unwrap();
Member

I like this a lot, but would make it conditional on ... being within a certain threshold of the limit? e.g. close to expiry and/or "reaching" the limit?
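
A hedged sketch of what that condition might look like (hypothetical helper and arbitrary thresholds; not something in the PR):

```rust
use std::time::Duration;

// Hypothetical: only wake the flusher when a cached counter is close to its
// limit or close to expiring, instead of on every single increment.
fn should_flush(remaining: i64, limit: i64, ttl: Duration) -> bool {
    let near_limit = remaining <= limit / 10;             // within ~10% of the limit
    let near_expiry = ttl <= Duration::from_millis(500);  // about to expire
    near_limit || near_expiry
}

// In the write path, after updating the local batch (pseudo-usage):
// if should_flush(remaining, limit, ttl) {
//     self.flush_notify.notify_one();
// }
```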

Contributor Author

What's the benefit of delaying the flush? I can see the following benefits:

  1. Reduce the number of Redis API calls, since you group multiple calls into one batch
  2. Can dedupe multiple increments of the same counter

Both of the above will likely reduce CPU load on Redis. But that's not the main problem global rate limiters are solving for. The main problem is high-latency calls to Redis. Adding a delay only increases that latency further.
More latency means the local cache counters will not have the correct global value.

I do think that if the load is large enough, CPU load on Redis will become a problem, so keeping the flushing_period option is good, since it introduces a delay that can help reduce the load on Redis. But again, I don't think that's the main problem we are solving for.

@alexsnaps
Member

I think we can close this one, right?

@chirino
Contributor Author

chirino commented May 1, 2024

Yep, not needed any more.

@chirino chirino closed this May 1, 2024