Reduce in memory queue limit by 16x (restatedev#2455)
This limit previously applied across all partitions, but since 1.1 it is
per partition, and it amounts to 350M per partition. Those entries are not
used initially, but as you scale to 1M invocations per partition, all the
memory pages in the queue's ring buffer are dirtied and contribute to
RSS. This leads to 9G of usage on a 24-partition node.

This PR reduces the limit by 16x to 21M per partition, or 562M on a
24-partition node, which it will reach after 1.5 million invocations. That
is a more manageable figure, even if it still appears as a 'leak' until
that amount is reached.
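
The entry counts behind these figures can be checked with a few lines of arithmetic. The sketch below uses the old and new defaults from the diff in this commit and the 24-partition node from the message; the constant and function names are illustrative only and do not appear in the codebase.

```rust
/// Default in-memory queue limit before this commit (entries per partition).
const OLD_LIMIT: usize = 1_056_784;
/// Default in-memory queue limit after this commit (entries per partition).
const NEW_LIMIT: usize = 66_049;

/// Entries that can be held in memory across a whole node, given that the
/// limit applies to each partition separately.
fn node_wide_entries(per_partition_limit: usize, partitions: usize) -> usize {
    per_partition_limit * partitions
}

fn main() {
    // The reduction is exactly 16x.
    assert_eq!(OLD_LIMIT, NEW_LIMIT * 16);

    // On a 24-partition node the new limit is reached at roughly
    // 1.5 million queued invocations.
    let partitions = 24;
    assert_eq!(node_wide_entries(NEW_LIMIT, partitions), 1_585_176);
    assert_eq!(node_wide_entries(OLD_LIMIT, partitions), 25_362_816);

    println!(
        "node-wide in-memory entries: old = {}, new = {}",
        node_wide_entries(OLD_LIMIT, partitions),
        node_wide_entries(NEW_LIMIT, partitions)
    );
}
```

Because ring-buffer pages only count towards RSS once they have been written, the memory ceiling scales with these entry counts rather than with the number of invocations currently queued.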
jackkleeman authored Dec 23, 2024
1 parent 25a89c0 commit 1ac1f70
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions crates/types/src/config/worker.rs
@@ -179,7 +179,7 @@ pub struct InvokerOptions {
 ///
 /// Defines the threshold after which queues invocations will spill to disk at
 /// the path defined in `tmp-dir`. In other words, this is the number of invocations
-/// that can be kept in memory before spilling to disk.
+/// that can be kept in memory before spilling to disk. This is a per-partition limit.
 in_memory_queue_length_limit: NonZeroUsize,

 /// # Limit number of concurrent invocations from this node
@@ -223,7 +223,7 @@ impl Default for InvokerOptions {
 None,
 Some(Duration::from_secs(10)),
 ),
-in_memory_queue_length_limit: NonZeroUsize::new(1_056_784).unwrap(),
+in_memory_queue_length_limit: NonZeroUsize::new(66_049).unwrap(),
 inactivity_timeout: Duration::from_secs(60).into(),
 abort_timeout: Duration::from_secs(60).into(),
 message_size_warning: NonZeroUsize::new(10_000_000).unwrap(), // 10MB
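
To illustrate what a spill threshold like `in_memory_queue_length_limit` means in practice, here is a minimal sketch of a FIFO queue that keeps at most `limit` items in memory and diverts the rest to overflow storage. This is not the invoker's actual queue: `SpillingQueue` and its methods are hypothetical names, and a second `VecDeque` stands in for the on-disk segment that the real implementation keeps under `tmp-dir`.

```rust
use std::collections::VecDeque;
use std::num::NonZeroUsize;

/// Hypothetical queue that spills once the in-memory limit is reached.
struct SpillingQueue<T> {
    limit: NonZeroUsize,
    in_memory: VecDeque<T>,
    /// Stand-in for the on-disk overflow (assumption for this sketch).
    spilled: VecDeque<T>,
}

impl<T> SpillingQueue<T> {
    fn new(limit: NonZeroUsize) -> Self {
        Self { limit, in_memory: VecDeque::new(), spilled: VecDeque::new() }
    }

    fn push(&mut self, item: T) {
        // To preserve FIFO order, anything arriving after a spill has
        // started must queue behind the spilled items.
        if self.spilled.is_empty() && self.in_memory.len() < self.limit.get() {
            self.in_memory.push_back(item);
        } else {
            self.spilled.push_back(item);
        }
    }

    fn pop(&mut self) -> Option<T> {
        let item = self.in_memory.pop_front();
        // Refill the freed in-memory slot from the overflow, if any.
        if let Some(next) = self.spilled.pop_front() {
            self.in_memory.push_back(next);
        }
        item
    }

    fn in_memory_len(&self) -> usize {
        self.in_memory.len()
    }

    fn spilled_len(&self) -> usize {
        self.spilled.len()
    }
}

fn main() {
    let mut queue = SpillingQueue::new(NonZeroUsize::new(2).unwrap());
    for id in 1..=3 {
        queue.push(id);
    }
    // Two items fit in memory; the third has spilled.
    assert_eq!((queue.in_memory_len(), queue.spilled_len()), (2, 1));
    assert_eq!(queue.pop(), Some(1));
    assert_eq!((queue.in_memory_len(), queue.spilled_len()), (2, 0));
}
```

With a per-partition limit, each partition would own one such queue, so the node-wide in-memory ceiling is the limit multiplied by the partition count.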
