What would async transfers look like for the SDL_GPU backend? Are they possible at all? #209

kg · 2024-08-30T21:17:25Z

Thinking of two common scenarios here and whether FNA3D + SDL_GPU can support them cleanly, if at all:

- I want good load times and smooth loading w/minimal dropped frames, so I want to do asynchronous texture uploads (maybe VBs/IBs too, but I'd bet money that for virtually all games using FNA3D, texture data dominates buffer data).
- I need to do readback during gameplay, e.g. luminance histograms for HDR, hit testing data for click selection/drag selection, whole frames for screen recording or screenshots.
- I want to do the above without calling synchronous FNA(3D) APIs from a thread pool (doing this is already a bad idea for multiple reasons).

I feel like the baseline solution here is a set of EXT APIs that only work on backends that support them (I would suggest the only backend that supports them is SDL_GPU) and fail-fast if not supported.

The narrowest possible scope is probably SetDataPointerAsyncEXT and GetDataPointerAsyncEXT on FNA textures, with the accompanying FNA3D APIs. Unfortunately, this seems like it would impose a lot of work on the FNA3D side to manage transfer buffers, so I'm wary of suggesting it. There's already a lot of code to handle transfer/staging buffers behind the scenes and we've found bugs in it before.

So what this might look like is:

- FNA3D API to create a transfer buffer. FNA3D_Buffer * FNA3D_GenTransferBufferEXT? Transfer buffers can only be mapped or transferred to/from. Creating on unsupported backends will fail.
- FNA API for transfer buffers, i.e. class TransferBufferEXT. Narrowest possible interface to make it functional. Constructing one on unsupported backends will throw.
- FNA3D APIs for async transfers. As a starting point, one for TransferBuffer->Texture and one for RenderTarget->TransferBuffer.
- FNA APIs for async transfers. One for TransferBufferEXT->Texture2D and one for RenderTarget2D->TransferBufferEXT.
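To make the shape of that list a bit more concrete, here is a rough C sketch of the FNA3D side. It is only an illustration: every `Mock_` name is hypothetical, nothing here exists in FNA3D, and the "GPU" is simulated with plain CPU memory so the copies complete immediately. The one thing it tries to show is the fail-fast behaviour on unsupported backends and the two transfer directions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are real FNA3D types. */
typedef struct {
    int supportsAsyncTransfers; /* nonzero only for an SDL_GPU-style backend */
} Mock_Device;

typedef struct {
    uint8_t *data;
    size_t size;
} Mock_TransferBuffer;

typedef struct {
    uint8_t *pixels;
    size_t size;
} Mock_Texture;

/* Loosely mirrors the proposed FNA3D_GenTransferBufferEXT:
 * returns NULL (fail-fast) when the backend cannot do async transfers. */
Mock_TransferBuffer *Mock_GenTransferBufferEXT(const Mock_Device *device, size_t size)
{
    assert(device != NULL);
    if (!device->supportsAsyncTransfers || size == 0) {
        return NULL;
    }
    Mock_TransferBuffer *buffer = malloc(sizeof *buffer);
    if (buffer == NULL) {
        return NULL;
    }
    buffer->data = calloc(size, 1);
    if (buffer->data == NULL) {
        free(buffer);
        return NULL;
    }
    buffer->size = size;
    return buffer;
}

void Mock_DisposeTransferBufferEXT(Mock_TransferBuffer *buffer)
{
    if (buffer != NULL) {
        free(buffer->data);
        free(buffer);
    }
}

/* TransferBuffer -> Texture. A real backend would record a GPU copy here;
 * the mock copies right away. Returns 1 on success, 0 on failure. */
int Mock_TransferToTextureEXT(const Mock_TransferBuffer *src, Mock_Texture *dst)
{
    if (src == NULL || dst == NULL || src->size > dst->size) {
        return 0;
    }
    memcpy(dst->pixels, src->data, src->size);
    return 1;
}

/* RenderTarget -> TransferBuffer, the readback direction. */
int Mock_TransferFromRenderTargetEXT(const Mock_Texture *src, Mock_TransferBuffer *dst)
{
    if (src == NULL || dst == NULL || src->size > dst->size) {
        return 0;
    }
    memcpy(dst->data, src->pixels, src->size);
    return 1;
}
```

The FNA-side `TransferBufferEXT` class would presumably be a thin wrapper over calls like these, throwing where the C function returns NULL.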

I would propose a very simple synchronization model here: no particular guarantees are made about completion order, and we don't expose any way to wait for completion. We just say 'once the frame is done, all your async transfers have finished'. Maybe that's still too aggressive, in which case we'd need a way to signal completion of async transfers.
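As a toy model of that 'done by end of frame' contract, the sketch below queues copies and only performs them when the frame ends. All names (`FrameTransfers`, `QueueAsyncCopy`, `EndFrame`) are made up for illustration, and the copies are plain `memcpy` calls rather than GPU work. The queue is deliberately drained in reverse to underline that callers get no ordering guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PENDING 16

/* One queued copy; stands in for a GPU upload or download. */
typedef struct {
    const unsigned char *src;
    unsigned char *dst;
    size_t size;
} PendingCopy;

typedef struct {
    PendingCopy pending[MAX_PENDING];
    size_t count;
} FrameTransfers;

/* Queue a copy for this frame. Returns 1 on success, 0 if the queue is full.
 * The destination must not be read until EndFrame has run. */
int QueueAsyncCopy(FrameTransfers *ft, const unsigned char *src,
                   unsigned char *dst, size_t size)
{
    assert(ft != NULL && src != NULL && dst != NULL);
    if (ft->count == MAX_PENDING) {
        return 0;
    }
    ft->pending[ft->count].src = src;
    ft->pending[ft->count].dst = dst;
    ft->pending[ft->count].size = size;
    ft->count++;
    return 1;
}

/* End of frame: every queued transfer is finished by the time this returns.
 * Draining in reverse makes the lack of an ordering guarantee visible. */
void EndFrame(FrameTransfers *ft)
{
    assert(ft != NULL);
    while (ft->count > 0) {
        PendingCopy *copy = &ft->pending[--ft->count];
        memcpy(copy->dst, copy->src, copy->size);
    }
}
```

With this model the consumer never polls anything; it just treats the frame boundary as the point where results become valid.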

For my workloads, I would easily be able to consume any completion model for GPU->CPU transfers - I can just keep the old luminance histogram and hit testing maps around until the new ones are ready, and a little latency won't hurt me. For uploads I can also consume pretty much any model, but 'it finishes this frame' is much easier.

To signal completion of async transfers we'd want to keep the overhead low, i.e. the transfer operation doesn't return an object with a lifetime or invoke a callback; both of those seem too hairy. Maybe the transfer target has an 'isAsyncTransferTarget' flag and the transfer source has an 'isAsyncTransferSource' flag, and any attempt to perform an async transfer sets the appropriate flags (if they are already set, the async operation fails). The backend then clears the flags when the async transfer is finished, so the API consumer can just check the 'busy flag' on their transfer target.
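The busy-flag idea can be sketched in a few lines of C. The flag names come from the paragraph above; the struct and function names (`AsyncResource`, `TryBeginAsyncTransfer`, `CompleteAsyncTransfer`, `IsBusy`) are hypothetical. This is single-threaded for clarity; a backend that clears flags from another thread would presumably need atomics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-resource state (texture, render target, transfer buffer). */
typedef struct {
    bool isAsyncTransferSource;
    bool isAsyncTransferTarget;
} AsyncResource;

/* Start an async transfer. Fails, leaving both resources untouched,
 * if the source or the target is already part of an in-flight transfer. */
bool TryBeginAsyncTransfer(AsyncResource *src, AsyncResource *dst)
{
    assert(src != NULL && dst != NULL);
    if (src == dst) {
        return false;
    }
    if (src->isAsyncTransferSource || dst->isAsyncTransferTarget) {
        return false;
    }
    src->isAsyncTransferSource = true;
    dst->isAsyncTransferTarget = true;
    return true;
}

/* Called by the backend once the transfer has finished. */
void CompleteAsyncTransfer(AsyncResource *src, AsyncResource *dst)
{
    assert(src != NULL && dst != NULL);
    src->isAsyncTransferSource = false;
    dst->isAsyncTransferTarget = false;
}

/* What the API consumer polls: is this resource still being written to? */
bool IsBusy(const AsyncResource *resource)
{
    assert(resource != NULL);
    return resource->isAsyncTransferTarget;
}
```

The appeal is that nothing with a lifetime is handed back to the caller; the cost is that each resource can take part in only one transfer per direction at a time.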


thatcosmonaut · 2024-08-30T22:23:57Z

I honestly have to say that I'm against this. I don't know how to reasonably expose async transfer goodies without exposing a fence concept, and at that point things are getting hairy.

I really don't want to leak SDL_GPU concepts into FNA3D. This is formally a preservation project. I think anything that goes too far outside the conceptual model of XNA is likely to get us into MonoGame territory. Once we bring in async readback, why not bring in compute, and storage buffers, and indirect rendering, etc.? These kinds of abstractions are almost trivial to set up directly in SDL_GPU, but grafting them into XNA opens up all kinds of edge cases that don't fit the original model.

At some point people are gonna have to rip the bandaid off and write new renderers if they want modern GPU features. We might honestly be making things harder for people in the long run if we let FNA3D get too muddy.

flibitijibibo · 2024-08-30T22:42:52Z

I will likely make the call once SDL3 is tagged, but now that we have 3.0 and all the stuff we added to it, I will likely get even more strict with additions since we went to the effort of making 3.0 to begin with. I know migrating to new libraries is a nonzero amount of work, but maintaining never-ending mutations is much worse. If you like SDL3, use SDL3; we are actively encouraging migrating away from FNA for this purpose.

kg · 2024-08-30T22:58:33Z

So is the answer that people using FNA shouldn't be doing readback at all, even though it worked better in XNA? Threaded uploads worked better too. I'm OK with that as an answer. Right now they're only remotely adequate in D3D11, don't work at all in OpenGL, and are broken in Vulkan (this isn't your fault). I can re-evaluate once the SDL_GPU backend is working though, it might be totally satisfactory there without async transfers.
