
Make InvokerOptions.in_memory_queue_length_limit respect global memory limit of a node #2402

Open
tillrohrmann opened this issue Dec 10, 2024 · 2 comments

@tillrohrmann (Contributor)

Right now, the invoker uses the SegmentQueue to queue up invocations it cannot execute immediately. The number of in-memory invocations is controlled via InvokerOptions.in_memory_queue_length_limit. With the change to run a dedicated invoker per PartitionProcessor, the total memory consumption of all SegmentQueues can grow to #partitions * in_memory_queue_length_limit * size of a queued invocation, which with the default settings amounts to roughly 7.7 GB of memory. Ideally, we integrate this memory consumption into the global memory limit of a node (which still needs to be added).
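
For intuition, here is a back-of-the-envelope sketch of the worst case; the concrete numbers are illustrative assumptions to show the order of magnitude, not the actual defaults:

```rust
/// Worst-case memory held by all in-memory SegmentQueues on one node.
fn worst_case_queue_memory(
    partitions: usize,                   // partition processors on the node
    in_memory_queue_length_limit: usize, // queued invocations kept in memory per invoker
    entry_size_bytes: usize,             // approximate size of one queued invocation
) -> usize {
    partitions * in_memory_queue_length_limit * entry_size_bytes
}

fn main() {
    // Hypothetical values, chosen only to illustrate how the product reaches
    // this scale; they are not the actual configuration defaults.
    let bytes = worst_case_queue_memory(24, 100_000, 3_200);
    println!("worst case: {:.1} GB", bytes as f64 / 1e9); // ~7.7 GB
}
```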

@slinkydeveloper (Contributor) commented Dec 11, 2024

I'm not sure this makes that much sense: isn't there a risk that a single partition uses up the whole in-memory limit, so that all the other invokers constantly find the limit exhausted and write to disk even though they have capacity?

Maybe the story here is that both in_memory_queue_length_limit and the invoker concurrency limit should be per node (which also makes some sense on its own: the invoker concurrency limit is bound to open files, which is an OS resource), essentially returning to the behavior before the invoker was made per partition.
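
For illustration, a minimal sketch of what such a node-wide concurrency limit could look like if the per-partition invokers shared a single tokio semaphore (the type and its shape are hypothetical, not existing code):

```rust
use std::sync::Arc;
use tokio::sync::{OwnedSemaphorePermit, Semaphore};

/// Hypothetical node-wide limiter shared by all per-partition invokers.
#[derive(Clone)]
struct NodeConcurrencyLimit {
    permits: Arc<Semaphore>,
}

impl NodeConcurrencyLimit {
    fn new(max_concurrent_invocations: usize) -> Self {
        Self {
            permits: Arc::new(Semaphore::new(max_concurrent_invocations)),
        }
    }

    /// Every invoker awaits a permit before starting an invocation;
    /// dropping the returned guard releases the permit back to the node.
    async fn acquire(&self) -> OwnedSemaphorePermit {
        self.permits
            .clone()
            .acquire_owned()
            .await
            .expect("semaphore is never closed")
    }
}
```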

I think this brings back the deadlock situation between partition processors, but we have that solved by the inactivity timeout that kicks in.

@tillrohrmann (Contributor, Author)

Changing the concurrency limit to be per node also makes sense to me, for the reasons you gave.

One way to make the memory consumption configurable per node is to divide the configured memory by the total number of partition processors. This is not ideal if one partition needs more memory than others, and it will require reconfiguration support once we allow changing the number of running partition processors per node, but at least it would let users set an upper bound. A somewhat more elaborate approach is to reserve a minimum amount of memory for every partition and then have a free-floating budget that is taken on a first-come, first-served basis.
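
A minimal sketch of that second idea, assuming a shared atomic counter as the floating pool (all names and the spill-to-disk signalling are hypothetical):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical per-partition view of a node-wide memory budget: every
/// partition may always use up to `reserved` bytes, and anything beyond
/// that is taken from a shared floating pool, first come, first served.
struct PartitionBudget {
    reserved: usize,
    used: usize,
    floating: Arc<AtomicUsize>, // shared pool of remaining bytes
}

impl PartitionBudget {
    fn new(reserved: usize, floating: Arc<AtomicUsize>) -> Self {
        Self { reserved, used: 0, floating }
    }

    /// Account for `bytes` more queued data. Returns false when the partition
    /// is over its reservation and the floating pool cannot cover the rest,
    /// signalling the invoker to spill to disk instead.
    fn try_reserve(&mut self, bytes: usize) -> bool {
        let new_used = self.used + bytes;
        // How much of the new total spills over this partition's reservation.
        let need = new_used.saturating_sub(self.reserved)
            - self.used.saturating_sub(self.reserved);
        if need > 0 {
            let mut avail = self.floating.load(Ordering::Acquire);
            loop {
                if avail < need {
                    return false; // pool exhausted: spill to disk
                }
                match self.floating.compare_exchange_weak(
                    avail,
                    avail - need,
                    Ordering::AcqRel,
                    Ordering::Acquire,
                ) {
                    Ok(_) => break,
                    Err(actual) => avail = actual,
                }
            }
        }
        self.used = new_used;
        true
    }

    /// Return memory to the budget when queued invocations drain.
    fn release(&mut self, bytes: usize) {
        let new_used = self.used.saturating_sub(bytes);
        let give_back = self.used.saturating_sub(self.reserved)
            - new_used.saturating_sub(self.reserved);
        if give_back > 0 {
            self.floating.fetch_add(give_back, Ordering::AcqRel);
        }
        self.used = new_used;
    }
}
```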

A completely different approach would be to get rid of the SegmentQueue entirely and do the queuing of ready invocations in the partition processor whenever the concurrency limit is reached.
