
Support pending writes within CachedCounterValue #299

Merged · 10 commits merged from write-behind into main on Apr 30, 2024
Conversation

@alexsnaps alexsnaps commented Apr 25, 2024

Alright, this is a shot at unifying the "write-behind" and the cache we use in front of Redis.
It does so in a few ways:

  • The "pending increments" are recorded directly in the CachedCounterValue
  • Both the cache and the "write-behind queue" (still a Map) point to the same instance of a CachedCounterValue (within an Arc) for the same Counter "key"
  • All entries added to the Cache should be atomic and only then added to the Batcher (our write-behind "queue")
  • We always look up in the Batcher if a CachedCounterValue is mapped to key, in case it got evicted from the cache
  • We only add CachedCounterValue to the Batcher with an .pending_writes() > 0
  • We only remove atomically from the Batched if the .pending_writes() == 0
  • If we ever added two different CachedCounterValue instances (there is a race when inserting into the cache and being evicted before we made it into the Batcher, where we then merge the pending writes from one CachedCounterValue into the other one
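The points above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual type: names, the atomics used, and the `merge` helper are assumptions, and the real CachedCounterValue also tracks expiry.

```rust
use std::sync::atomic::{AtomicI64, AtomicU64, Ordering};

// Illustrative sketch: a cached counter value that folds its own pending
// (unflushed) increments, so the cache and the write-behind queue can
// share one Arc'd instance of it.
struct CachedCounterValue {
    value: AtomicI64,          // the locally known counter value
    pending_writes: AtomicU64, // increments not yet flushed to Redis
}

impl CachedCounterValue {
    fn new(initial: i64) -> Self {
        Self {
            value: AtomicI64::new(initial),
            pending_writes: AtomicU64::new(0),
        }
    }

    // Record an increment locally and remember it as unflushed.
    fn delta(&self, delta: u64) {
        self.value.fetch_add(delta as i64, Ordering::SeqCst);
        self.pending_writes.fetch_add(delta, Ordering::SeqCst);
    }

    // How much is waiting to be written behind.
    fn pending_writes(&self) -> u64 {
        self.pending_writes.load(Ordering::SeqCst)
    }

    // Atomically claim all pending writes for a flush to Redis.
    fn take_pending_writes(&self) -> u64 {
        self.pending_writes.swap(0, Ordering::SeqCst)
    }

    fn no_pending_writes(&self) -> bool {
        self.pending_writes() == 0
    }

    // Fold another instance's pending writes into this one — the recovery
    // path for the racy double-insert mentioned in the last bullet.
    fn merge(&self, other: &CachedCounterValue) {
        let d = other.take_pending_writes();
        self.delta(d);
    }
}
```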

This was achieved by doing:

  • folding the pending writes within the CachedCounterValue
  • changing CachedRedisStorage.batcher_counter_updates to point to the same Arc as within CountersCache.cache
  • using the CachedCounterValue to populate the batch to Redis (and to update the values back)
  • introducing a proper Batcher type back, exposing .add() and .consume(u64).await
  • having the Batcher awaken on either the flush period elapsing or the batch size being reached
  • flushing whenever "important" writes require it
  • having .consume always return the most "important" updates first
  • looking up the queue on cache misses
  • addressing Use 0 as a default on CachedRedisStorage cache misses #298
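The Batcher behavior described above can be sketched like this. The real type is async (a DashMap plus an awaitable notifier); this self-contained, blocking stand-in uses `Mutex<HashMap>` and a `Condvar` only to make the two wake-up conditions and the "most important first" ordering visible. All names (`PendingWrite`, `batch_size`, the `String` key) are illustrative, not the PR's API.

```rust
use std::collections::HashMap;
use std::sync::{Condvar, Mutex};
use std::time::{Duration, Instant};

struct PendingWrite {
    delta: u64,
    expires_at: Instant, // entries closest to expiry are "important"
}

struct Batcher {
    updates: Mutex<HashMap<String, PendingWrite>>,
    wakeup: Condvar,
    batch_size: usize,
    interval: Duration,
}

impl Batcher {
    fn new(batch_size: usize, interval: Duration) -> Self {
        Self {
            updates: Mutex::new(HashMap::new()),
            wakeup: Condvar::new(),
            batch_size,
            interval,
        }
    }

    fn add(&self, key: String, write: PendingWrite) {
        let mut map = self.updates.lock().unwrap();
        // Merge if a racing insert already queued this counter.
        map.entry(key)
            .and_modify(|w| w.delta += write.delta)
            .or_insert(write);
        if map.len() >= self.batch_size {
            self.wakeup.notify_all(); // batch full: flush now
        }
    }

    // Drain up to `max` entries, most urgent (soonest expiry) first.
    fn consume(&self, max: usize) -> Vec<(String, PendingWrite)> {
        let mut map = self.updates.lock().unwrap();
        while map.len() < self.batch_size {
            let (guard, timeout) = self.wakeup.wait_timeout(map, self.interval).unwrap();
            map = guard;
            if timeout.timed_out() {
                break; // flush period elapsed
            }
        }
        let mut keys: Vec<(String, Instant)> =
            map.iter().map(|(k, w)| (k.clone(), w.expires_at)).collect();
        keys.sort_by_key(|(_, e)| *e);
        keys.into_iter()
            .take(max)
            .filter_map(|(k, _)| map.remove(&k).map(|w| (k, w)))
            .collect()
    }
}
```

Note that `consume` only removes the entries it returns, so anything past `max` stays queued for the next flush.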

This still lacks tests 🤦, yet I still think this is better than what we had before... So, I think we should merge this and iterate, but only if my reasoning makes sense and "seems to be implemented" in this PR.

@alexsnaps alexsnaps force-pushed the write-behind branch 3 times, most recently from e770469 to d4eed7b Compare April 25, 2024 20:08
limitador/Cargo.toml (outdated, resolved)
Comment on lines +85 to +88
for counter in &batch {
self.updates
.remove_if(counter, |_, v| v.no_pending_writes());
}
alexsnaps (Member, Author):
If, after having flushed to Redis, no additional pending writes were added, we can remove these entries from the queue.
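The invariant behind that `remove_if` can be shown with a plain map: after a flush, an entry is dropped from the queue only if nothing accrued on it in the meantime (DashMap's `remove_if` checks this atomically; the sketch below is single-threaded and purely illustrative — `Pending` and `cleanup` are made-up names).

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Pending write count shared between the cache and the queue.
struct Pending(AtomicU64);

// Drop flushed entries from the write-behind queue, keeping any that
// picked up new pending writes since the flush started.
fn cleanup(updates: &mut HashMap<String, Arc<Pending>>, batch: &[String]) {
    for key in batch {
        let keep = updates
            .get(key)
            .map(|p| p.0.load(Ordering::SeqCst) > 0)
            .unwrap_or(false);
        if !keep {
            updates.remove(key);
        }
    }
}
```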

pub fn add(&self, counter: Counter, value: Arc<CachedCounterValue>, priority: bool) {
    let priority = priority || value.value.ttl() < self.interval;
    self.updates.entry(counter).or_insert(value);
    if priority {
Contributor:
if we start constraining the batch size sent to Redis, we may want to add the priority flag in the value too so that the first batch we execute contains the priority items first.

alexsnaps (Member, Author) · Apr 26, 2024:
Yeah, I do this "hack" where I check the TTL on entry… to decide the things that are "important", but I think I should maybe just waste the additional bit… 
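"Wasting the additional bit" could look like this: the queued value carries a sticky priority flag set at enqueue time, instead of re-deriving importance from the TTL on every consume. A hypothetical sketch only — `QueuedValue` and its methods are not the PR's API.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

struct QueuedValue {
    delta: u64,
    priority: AtomicBool, // the "extra bit": sticky importance flag
}

impl QueuedValue {
    fn new(delta: u64) -> Self {
        Self { delta, priority: AtomicBool::new(false) }
    }

    // Once marked important (e.g. low TTL observed at enqueue time),
    // it stays important until flushed.
    fn mark_priority(&self) {
        self.priority.store(true, Ordering::Release);
    }

    fn is_priority(&self) -> bool {
        self.priority.load(Ordering::Acquire)
    }
}
```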

limitador/src/storage/redis/redis_cached.rs (outdated, resolved)
    expiry: AtomicExpiryTime,
}

pub struct Batcher {
    updates: DashMap<Counter, Arc<CachedCounterValue>>,
Contributor:
DashMap makes it much cleaner and easier to consume <3

@alexsnaps alexsnaps requested review from chirino, didierofrivia and adam-cattermole and removed request for chirino April 29, 2024 21:58
@alexsnaps alexsnaps marked this pull request as ready for review April 29, 2024 21:58
didierofrivia (Contributor) left a comment:
The part that I find particularly hard/complex is the consume method: it first iterates to get the hot counters that need to flush ASAP, then iterates again to find a remainder and push it to the same batch (?), and then iterates once more to build the final batch we send to the batch update fn. I only wonder whether it's possible to simplify the collection we send to update... maybe we can review this in the near future. Great work! 👍🏼

@didierofrivia didierofrivia merged commit 9f54a24 into main Apr 30, 2024
20 checks passed
@alexsnaps alexsnaps deleted the write-behind branch April 30, 2024 11:26