Since f50997f, partitions can deadlock.
This happens when the aperiodic process tries to send something to the hypervisor via the "syscall" UnixDatagram, but the hypervisor freezes the partition at that very moment. When the next partition time window comes around, the periodic process is scheduled (unfrozen) first, but it never computes. We think this is because the aperiodic process was frozen during a critical section of the send, which locks the entire process: since f50997f, processes within a partition are actually threads of a single process. This would explain why a freeze during a critical section in the aperiodic process can lock the periodic process out of executing. The result is a deadlock, because the aperiodic process is only scheduled after the periodic process finishes its work in this partition time window, and since the periodic process cannot execute, this never happens.
This class of errors/deadlocks can be avoided by moving intra-partition scheduling into each partition.
Here is one possible solution:
We can spawn a new thread for each partition on partition start. Let's call this thread the Manager Thread (MT).
The MT will then perform all critical operations on behalf of the partition's processes.
The MT should also always run in the background whenever a process from its partition is running.
When a process (which runs in another thread, but in the same address space as the MT) encounters an operation that could cause a deadlock if the partition were frozen in the middle of it, the process asks the manager thread to execute that operation instead. It does this by sending a closure through an mpsc::channel to the manager thread, along with another channel that is used for receiving the return value.
The logic inside the manager thread would be pretty straightforward, as it can just do a blocking receive call on the channel.
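To make the idea concrete, here is a minimal sketch of the closure-over-channel mechanism using only `std`. The names `Manager`, `Job`, `spawn` and `call` are made up for illustration and are not part of the existing code base, and the closure here just does some arithmetic in place of the real "syscall" send. It only shows the message flow; it does not deal with freezing or scheduling.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread;

// A job is a boxed closure. It owns the reply sender it needs,
// so the manager thread does not have to know the return type.
type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical handle that processes use to reach the manager thread (MT).
#[derive(Clone)]
struct Manager {
    jobs: Sender<Job>,
}

impl Manager {
    /// Spawn the MT. It blocks on the channel and runs each job it receives.
    fn spawn() -> (Manager, thread::JoinHandle<()>) {
        let (jobs, inbox) = mpsc::channel::<Job>();
        let handle = thread::spawn(move || {
            // Iterating a Receiver is a blocking receive loop; it ends
            // once every Sender has been dropped.
            for job in inbox {
                job();
            }
        });
        (Manager { jobs }, handle)
    }

    /// Run `op` on the MT and block until its return value arrives.
    fn call<T, F>(&self, op: F) -> T
    where
        T: Send + 'static,
        F: FnOnce() -> T + Send + 'static,
    {
        // Second channel, used only for the return value of this call.
        let (reply_tx, reply_rx) = mpsc::channel::<T>();
        let job: Job = Box::new(move || {
            // The caller may be gone already; ignore a failed send.
            let _ = reply_tx.send(op());
        });
        self.jobs.send(job).expect("manager thread has exited");
        reply_rx.recv().expect("manager thread dropped the reply channel")
    }
}

fn main() {
    let (manager, handle) = Manager::spawn();

    // A "process" thread delegates its operation to the MT.
    let process = {
        let manager = manager.clone();
        thread::spawn(move || manager.call(|| 40 + 2))
    };
    let result = process.join().unwrap();
    println!("result computed on the manager thread: {}", result);

    // Dropping the last handle closes the channel and stops the MT.
    drop(manager);
    handle.join().unwrap();
}
```

Boxing the closure keeps the MT loop trivial, since it never needs to know what an operation returns. The caller still blocks in `recv()`, so whether this actually avoids the deadlock depends on the MT never being frozen in the middle of a job, which this sketch does not address.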