"Deferred readback" mode #9
Re-tagging as "funding needed", since the cost of this feature will likely need to be split among multiple companies.
I recently completed a contract that involved implementing a form of deferred readback in a proprietary image transport plugin that uses NvIFR to read back and compress the rendered frames as an H.264 video stream. As a result of this work, VGL 2.6.4 will include image transport plugin API enhancements that make it possible for an image transport plugin to implement deferred readback using its own preferred brand of GPU-based buffer. (With NvIFR specifically, it proved necessary to use RBOs rather than PBOs.) Basically, that effort eliminated half of the argument in favor of this feature, since GPU-based compression, post-processing, and deferred readback can now be implemented in image transport plugins with no modifications to VirtualGL proper. The remaining argument is that deferred readback, if implemented for the built-in image transports as well, would reduce GPU and bus usage by avoiding a GPU-to-CPU pixel transfer for spoiled frames. However, it would triple the amount of GPU memory required per user, so a GPU shared among 50 users could require about 1.3 gigabytes of GPU memory to accommodate all of those users if each were using a 3D application with a 1920x1200 window. That is not an issue with high-end modern GPUs, which can contain 48 GB or more of GPU memory, but in order to support older GPUs, the deferred readback feature would need to be optional and perhaps not even enabled by default. That would greatly increase the complexity of VGL. The burning question in my mind is: for the high-end GPUs that have enough GPU memory to spare, does the overhead of reading back spoiled frames even matter? In order to proceed with this feature, I'm going to need to work closely with an organization that has a concrete use case that might benefit from it.
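For anyone who wants to check the memory estimate above, here is a rough back-of-the-envelope sketch. It assumes 4 bytes per pixel, three GPU-side buffers per user, and one full-window buffer pool per user; none of those values are stated in the comment, and `pool_bytes` is a hypothetical helper, not part of VirtualGL.

```python
def pool_bytes(width, height, users, buffers_per_user=3, bytes_per_pixel=4):
    """Estimate total GPU memory (in bytes) for all users' buffer pools.

    Assumptions (not taken from VirtualGL itself): each buffer holds one
    full window at 4 bytes per pixel, and each user needs 3 such buffers.
    """
    return width * height * bytes_per_pixel * buffers_per_user * users


# 50 users, each with a 1920x1200 window
total = pool_bytes(1920, 1200, users=50)
print(f"{total} bytes = {total / 2**30:.2f} GiB")
```

Under those assumptions the total comes to 1,382,400,000 bytes, which is roughly 1.3 GiB and in line with the figure quoted above.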
This would basically move VirtualGL's buffer pool to the GPU (using PBOs) so that it would not be necessary to read back every frame that is rendered. Only the frames that actually make it to the image transport (i.e. the frames that are not spoiled) would be read back. This would also eliminate the memory copy that currently has to occur when transferring the pixels from the PBO to one of the buffers in the buffer pool, and it would reduce the overhead on the GPU caused by reading back frames that are never displayed. Furthermore, it would give transport plugins the option of performing additional processing on the pixels using the GPU prior to transferring them. The disadvantage would be increased GPU memory usage (it would probably be necessary to maintain 3 PBOs for each OpenGL context).
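To illustrate the difference in GPU-to-CPU transfers, here is a toy model in plain Python with no OpenGL involved. It is only a sketch: the function names and the per-frame `transport_ready` flag are hypothetical, frame spoiling is reduced to "the frame is dropped if the transport is busy", and the 3-buffer GPU pool is not modelled.

```python
def eager_readback(frames, transport_ready):
    """Model of reading back every rendered frame, then dropping spoiled ones."""
    delivered, transfers = [], 0
    for frame, ready in zip(frames, transport_ready):
        transfers += 1              # GPU-to-CPU transfer happens unconditionally
        if ready:
            delivered.append(frame)  # frame reaches the image transport
        # otherwise the frame is spoiled and its transfer was wasted
    return delivered, transfers


def deferred_readback(frames, transport_ready):
    """Model of keeping frames on the GPU and reading back only unspoiled ones."""
    delivered, transfers = [], 0
    for frame, ready in zip(frames, transport_ready):
        if ready:
            transfers += 1          # transfer only when the frame will be sent
            delivered.append(frame)
        # a spoiled frame never leaves the (simulated) GPU-side buffer
    return delivered, transfers


frames = list(range(6))
ready = [True, False, False, True, False, True]
print(eager_readback(frames, ready))
print(deferred_readback(frames, ready))
```

Both functions deliver the same frames to the transport; the only difference is how many transfers are counted, which is the saving this issue is about.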