In one Haskell course, it was taught that you should never use `modifyTVar` because it leaks memory, and should always use `modifyTVar'` instead. For this reason, Universum provides only the latter.
But after gaining a bit of experience with concurrent primitives, I started doubting whether this distinction is justified. I'm still looking for the truth on this question, but I want to write down some thoughts.
First of all, note that STM has a rather special model: there are cases it suits well and cases it does not. I'm not well acquainted with how it is implemented internally, but the following statements are probably close to reality:
- If an STM computation retries, it has to run again from the start, until the next retry or success.
- When an STM variable changes, all computations that involve that variable and are waiting in `retry` wake up simultaneously and try to run on the available cores. This is somewhat similar to `notifyAll` in Java.
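
To make the "waiting in retry" behaviour concrete, here is a minimal sketch using the `stm` package. The helper name `waitFor` and the choice of four waiter threads are my own, purely for illustration. It only shows that several transactions blocked on the same `TVar` all get to proceed after a single write; it says nothing about how the runtime schedules the wake-ups.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM (TVar, atomically, check, newTVarIO, readTVar, writeTVar)
import Control.Monad (forM_, replicateM_)

-- Hypothetical helper: blocks until the flag becomes True.
-- `check False` calls `retry`, so the whole transaction is re-run
-- once a TVar it has read (here: `flag`) is written to.
waitFor :: TVar Bool -> IO ()
waitFor flag = atomically $ readTVar flag >>= check

main :: IO ()
main = do
  flag <- newTVarIO False
  done <- newEmptyMVar
  -- Four waiters, all blocked on the same TVar.
  forM_ [1 .. 4 :: Int] $ \_ -> forkIO $ waitFor flag >> putMVar done ()
  -- One write makes every waiting transaction eligible to re-run.
  atomically $ writeTVar flag True
  replicateM_ 4 (takeMVar done)
  putStrLn "all waiters woke up"
```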
Apparently, in some cases STM will be significantly less efficient than IO-based primitives (locks, or CAS plus locks as in the `unagi-chan` package). We like STM because it provides much more control, but I have witnessed an attempt to use `TQueue` under high load that caused an excessive number of STM retries (far more than the number of successful completions), leading to enormous CPU consumption.
So note the problem with `modifyTVar'`: it performs the evaluation during the transaction, which lengthens the transaction and so increases the chance of a concurrent modification and a subsequent retry. Instead, we could only store a thunk during the transaction (very quick) and perform the evaluation right before the transaction, right after it, or even in a separate spark.
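
A minimal sketch of the two approaches, assuming the `stm` package. The names `bumpStrict` and `bumpLazyThenForce` are hypothetical, and `(+ 1)` stands in for a more expensive update function. The second variant stores only a thunk inside the transaction and forces it immediately after the commit, so thunks should not pile up between updates.

```haskell
import Control.Concurrent.STM
  (TVar, atomically, modifyTVar, modifyTVar', newTVarIO, readTVar, readTVarIO)
import Control.Exception (evaluate)
import Control.Monad (replicateM_, void)

-- Strict variant: the new value is forced to WHNF inside the transaction.
bumpStrict :: TVar Int -> IO ()
bumpStrict var = atomically (modifyTVar' var (+ 1))

-- Lazy variant (hypothetical helper): the transaction only writes a thunk,
-- and the evaluation happens right after the commit, outside of STM.
bumpLazyThenForce :: TVar Int -> IO ()
bumpLazyThenForce var = do
  new <- atomically $ do
    modifyTVar var (+ 1)
    readTVar var          -- returns the thunk without forcing it
  void (evaluate new)     -- force to WHNF after the transaction

main :: IO ()
main = do
  a <- newTVarIO 0
  b <- newTVarIO 0
  replicateM_ 10000 (bumpStrict a)
  replicateM_ 10000 (bumpLazyThenForce b)
  x <- readTVarIO a
  y <- readTVarIO b
  print (x, y)  -- prints (10000,10000)
```

Note that `evaluate` only forces to weak head normal form. For an `Int` that is the whole value, but a nested structure would need something stronger, which is outside the scope of this sketch.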
Regarding the spark option, the literature says that evaluating a thunk creates a black hole, and any other thread that tries to evaluate that thunk in parallel blocks, but the first thread, which performs the evaluation, gains high priority in its Capability and so is likely to finish the computation soon. This option seems to make sense when the computation is large and updates are sporadic.
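
The spark option could look roughly like this. It is a sketch, not a recommendation: `updateWithSpark` is a hypothetical helper, and `par` is taken from `GHC.Conc` in `base`. A spark is only a hint. As far as I understand, it is only picked up by an idle capability when the program is built with `-threaded` and run on more than one core, and otherwise the thunk is simply evaluated by whoever reads the value first.

```haskell
import Control.Concurrent.STM (TVar, atomically, newTVarIO, readTVar, readTVarIO, writeTVar)
import GHC.Conc (par)

-- Hypothetical helper: write an unevaluated thunk inside the transaction,
-- then create a spark for it once the transaction has committed.
updateWithSpark :: (a -> a) -> TVar a -> IO ()
updateWithSpark f var = do
  new <- atomically $ do
    old <- readTVar var
    let new = f old       -- thunk, not evaluated inside the transaction
    writeTVar var new
    pure new
  new `par` pure ()       -- hint that `new` may be evaluated in parallel

main :: IO ()
main = do
  var <- newTVarIO (0 :: Integer)
  updateWithSpark (\n -> n + sum [1 .. 100000]) var
  -- Printing forces the value here if no spark has evaluated it yet.
  readTVarIO var >>= print  -- prints 5000050000
```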
Also, it's worth thinking about which use cases justify STM. It looks like queues are something you never want to have in STM under high load; better to use the `unagi-chan` package. For control structures that are updated only a few times during the application lifetime, neither performance nor memory leaks matter. Hopefully, practice will show what is better to use here.