Poor performance with default request type queue size #12
Hi @mp49, that is interesting and your proposal for a PR sounds good. I am puzzled, however, by some benchmarks I did back in 2017. They are shown in this slide from the EPICS meeting. It seems like I was getting 1.2 GB/s when using 1 pvaDriver. The images were large, so it was only about 100 frames/s. Is this consistent with what you were seeing? My tests were done on a single machine, so the PVA traffic was not going across a physical wire.
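(For scale, 1.2 GB/s at roughly 100 frames/s works out to about 12 MB per image, so those benchmark frames were much larger than the 0.5MB and 128x128 images discussed elsewhere in this issue.)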
Thanks, I'll work on that PR. I'll also do more testing next week, with different image sizes and frame rates. I was testing on a RHEL8 VM, which only has 2 cores, and the sim detector IOC and the pvaDriver IOC were running on the same VM. I couldn't find any information on the 'queueSize' parameter for the request in the PVA documentation, so I'm not fully sure what the default size is, but grepping the source code leads me to think it is only 1 or 2.
I did find a document here that says the default is 2, but this does not look like the location of official documentation: https://mrkraimer.github.io/website/developerGuide/pvRequest/pvRequest.html (search for the word "queue").
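For reference, a minimal sketch of how such a request string is typically turned into a pvRequest structure, assuming the createRequest() helper in pvDataCPP's pv/createRequest.h (the pvaDriver code itself may build the request differently):

#include <pv/createRequest.h>

// Ask the server-side monitor for a deeper queue; plain "field()" would
// fall back to the library default (reportedly 2).
static epics::pvData::PVStructure::shared_pointer makeMonitorRequest()
{
    return epics::pvData::createRequest("record[queueSize=100]field()");
}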
I'm still running tests and I'll post some results here. However, I think we can't rely on this code snippet in pvaDriver to tell us how many images we lost:
It assumes that we only lost 1 image if the overrunBitSet is not empty, but I think it really only tells us we lost 1 or more images. Could we use the NTNDArray uniqueId field instead?
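To illustrate the point (this is a hedged sketch of the kind of check being discussed, not the actual pvaDriver snippet, and it assumes the MonitorElement API from pvDataCPP's pv/monitor.h):

#include <pv/monitor.h>

// If any bit in overrunBitSet is set, at least one intermediate update was
// overwritten in the monitor queue, but the bitset does not say how many.
static void checkForOverrun(const epics::pvData::MonitorElementPtr& element,
                            int& lostImages)
{
    if (!element->overrunBitSet->isEmpty())
        lostImages++;   // undercounts: one *or more* images were dropped here
}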
I don't think you can use the uniqueId field, since there is no guarantee that the source of the NTNDArrays is sending you all of them, or that they will arrive in the correct order. Mark
That makes sense. In the NDPluginScatter case that you pasted above, the separate pvaDrivers would each only get a subset of the images. I think I'll just compare the total sent and the total received, and not rely on the overrun counter. There may be another way: perhaps NDPluginPva could make use of the userTag in the timestamp:

time_t dataTimeStamp 2022-06-21 18:19:54.828

and then the pvaDriver would always expect that number to increment by 1.
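A minimal sketch of that idea, with hypothetical names (nothing here is existing NDPluginPva or pvaDriver code): the sender stamps each NTNDArray's dataTimeStamp userTag with a counter that increments by 1 per frame, and the receiver measures any gap.

#include <stdint.h>

// lastTag: userTag of the previously received NTNDArray;
// currentTag: userTag of the one just received.
// Returns how many frames were skipped in between (0 if none).
static int64_t framesMissed(int32_t lastTag, int32_t currentTag)
{
    int64_t gap = static_cast<int64_t>(currentTag) -
                  static_cast<int64_t>(lastTag) - 1;
    return gap > 0 ? gap : 0;
}

As noted above, this only works when a single sender feeds a single receiver in order; with NDPluginScatter fanning frames out to several pvaDrivers, each receiver would see gaps by design.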
I've made a PR here: #13 The default is the same as before. This just deals with setting a different queueSize for the PVA request, which improves performance at high frame rates or on machines that are heavily loaded. For example, with the standard queueSize on an underpowered VM (2 cores), I run into problems at 100Hz frame rates even for tiny 128x128 UInt8 images, but with queueSize=100 I can safely run at 700Hz (which is the maximum rate at which I can generate images on the same VM). On a more powerful machine (8 cores), using the default queueSize, I only saw a few dropped images (out of several thousand) when running at 1700Hz (the maximum the simulation driver was able to run at). However, when using queueSize=100 I did not see any dropped images.
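A sketch of the configurable-request idea, under the assumption that the queue size simply gets folded into the request string (the actual PR's argument names and plumbing may differ):

#include <sstream>
#include <string>

// Build the PVA monitor request for a given queue size; a value <= 0 falls
// back to the previous default request with no explicit queueSize.
static std::string buildRequest(int queueSize)
{
    if (queueSize <= 0)
        return "field()";
    std::ostringstream req;
    req << "record[queueSize=" << queueSize << "]field()";
    return req.str();
}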
Hi,
I've been playing around with pvaDriver today, transporting 0.5MB images between two areaDetector IOCs with and without LZ4 compression.
I saw poor performance with the pvaDriver even at a low frame rate of 100Hz (only 50MB/s), in that I was dropping arrays every few seconds.
Then I made this change in pvaDriver:
diff --git a/pvaDriverApp/src/pvaDriver.cpp b/pvaDriverApp/src/pvaDriver.cpp
index 02b5d4a..278c21f 100644
--- a/pvaDriverApp/src/pvaDriver.cpp
+++ b/pvaDriverApp/src/pvaDriver.cpp
@@ -23,8 +23,8 @@
#include <epicsExport.h>
#include "pvaDriver.h"
-//#define DEFAULT_REQUEST "record[queueSize=100]field()"
-#define DEFAULT_REQUEST "field()"
+#define DEFAULT_REQUEST "record[queueSize=100]field()"
+//#define DEFAULT_REQUEST "field()"
And that worked wonders. I was able to reliably run at 100Hz, 800Hz and even 1500Hz without dropping frames.
It seems like the driver was used with queueSize=100 at some point, but that line was commented out.
I think this parameter could be made configurable as an argument to pvaDriverConfig(). Does that sound reasonable? If so, I can make a pull request and test it.
Matt