Future work for GPU device class #284

Ivan-Velickovic · 2024-11-12T05:12:35Z

Thanks to @erichchan999 we have an initial design/implementation for 2D graphics.

There are still unresolved design decisions; this issue will track them. Edit by @erichchan999:

Device class future work

Implement a cursor queue. This is an optimisation for desktop environments that allows more responsive cursor movement.
Consider using scatter-gather for resource memory allocations. This would need a more thorough look at use cases to see whether it is necessary.
Remove legacy 2D resources and only support blob resources. The current design keeps 2D resources because they map more cleanly to virtio-gpu. Blob resources can potentially allow zero-copy, depending on whether the system uses a unified memory architecture (UMA). Dedicated GPUs with VRAM are typically not UMA, so they still need one copy from main memory to device memory. For compatibility with base virtio-gpu without blob support, we should write a translation layer so that blob resources can still plug into 2D resources.
Consider requiring all memory backings to be page-aligned as part of the GPU protocol, since they are most likely to be implemented as DMA transfers, which typically must be page-aligned.
The virtIO driver does not work with virtio_gpu when virgl features are enabled in QEMU, despite the base set of 2D commands being the same. The protocol seems to be interpreted differently with virgl enabled. We need to investigate why.
Some assumptions about request failure have been made to simplify the current GPU virtualiser logic. Bookkeeping requests from multiple clients in the GPU virtualiser is complex, and becomes more complex still when requests fail, because the request/response queues are asynchronous. The assumption is that, other than requests which create a resource, requests should never fail under normal circumstances. If they do, something is catastrophically wrong with the driver or device, which would render recovery of bookkeeping state meaningless. This simplifies the virtualiser drastically by avoiding complex recovery logic on request failure. The only exception is when requests are rendered stale by display info events, which, on careful inspection of the logic, does not require any complex recovery either. However, it is worth exhaustively investigating the possible failure conditions of each request to validate these assumptions.
Move towards a 3D GPU protocol. This interface should ideally be compatible with 3D graphics API libraries (OpenGL, Vulkan).
Discussion about blob resources and 3D: Blob resources do not have private memory; the GPU scans out directly from the memory attached to the resource, allowing zero-copy communication. This holds if the system has integrated graphics without its own VRAM; for a dedicated GPU there is still one necessary copy from main memory to device memory. 2D resources typically have their private memory in device memory, thus requiring a transfer operation from the attached backing, which the client has access to, to that private memory. This is not an inefficiency for dedicated GPUs, where a copy is necessary anyway, but it is an inefficiency for environments without dedicated VRAM, which is typical of integrated GPUs. Note that, technically, we could create blob resources in device memory and introduce memory map and unmap requests for clients, allowing zero-copy for dedicated GPUs. VirtIO allows this, but for some reason only if 3D operations are supported. Supporting this feature in sDDF GPU, which only supports 2D, would make interfacing with virtIO more difficult. This is not the only reason: it is also trickier to implement on Microkit, which only allows static mapping of memory regions, meaning unwanted static mappings into VRAM from each client.
Modern 3D GPUs function much like a state machine, so drivers need to validate and/or pre-process and post-process client command stream data before it can be passed to the device. This design means that a 3D GPU driver must have access to the client's data and thus needs to be trusted. For 2D GPU drivers, there is no command stream and so nothing to validate, and framebuffer objects are standardised with pixel formats, so 2D GPU drivers do not need the client data mapped in.
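
To make the copy trade-off in the blob resource discussion concrete, here is a minimal sketch that encodes it as a lookup. All names here (`example_resource_kind`, `example_copies_before_scanout`) are hypothetical and are not part of sDDF or virtio-gpu; the copy counts simply restate the discussion above rather than measure any real device.

```c
#include <stdbool.h>

/* Hypothetical resource kinds, named for illustration only. */
enum example_resource_kind {
    EXAMPLE_RESOURCE_2D,          /* has private memory, filled by a transfer from its backing */
    EXAMPLE_RESOURCE_BLOB_GUEST,  /* blob whose memory is the client's (main) memory */
    EXAMPLE_RESOURCE_BLOB_DEVICE, /* blob in device memory, mapped into the client */
};

/*
 * Number of copies needed before the GPU can scan out client data,
 * following the reasoning in the discussion above.
 * Returns -1 for an unknown resource kind.
 */
static inline int example_copies_before_scanout(enum example_resource_kind kind,
                                                bool has_dedicated_vram)
{
    switch (kind) {
    case EXAMPLE_RESOURCE_2D:
        /* Always one transfer from the backing into private memory. */
        return 1;
    case EXAMPLE_RESOURCE_BLOB_GUEST:
        /* Zero-copy on integrated GPUs; one copy to VRAM otherwise. */
        return has_dedicated_vram ? 1 : 0;
    case EXAMPLE_RESOURCE_BLOB_DEVICE:
        /* The client writes straight into the mapped device memory.
         * Only meaningful for dedicated GPUs, and not supported by the
         * 2D-only sDDF GPU protocol described in this issue. */
        return 0;
    }
    return -1;
}
```

Read this way, 2D resources only cost an avoidable copy on systems without dedicated VRAM, which is the case blob resources are meant to improve.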

Example improvements

The Zig build is missing some of the things the Makefile can do. See the README in the example for details.
For QEMU, udmabuf expects each entry in the scatter-gather list to be page-aligned. QEMU only warns when you create a blob resource with an unaligned memory size, and otherwise lets the request succeed. I believe this is not the correct behaviour: it should fail the request when it fails to create a proper memory backing. Currently, the virtio driver modifies the client's request to be page-aligned.
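
As a rough illustration of the page-alignment workaround mentioned above, the sketch below rounds a requested blob size up to the next page boundary. The 4 KiB page size and the names `EXAMPLE_PAGE_SIZE`, `example_is_page_aligned` and `example_page_align_up` are assumptions made for this example; it is not the code the virtio driver actually uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size for this sketch; the real value is platform-dependent. */
#define EXAMPLE_PAGE_SIZE 0x1000ULL

/* True if size is a multiple of the assumed page size. */
static inline bool example_is_page_aligned(uint64_t size)
{
    return (size & (EXAMPLE_PAGE_SIZE - 1)) == 0;
}

/*
 * Round size up to the next multiple of the assumed page size.
 * Writes the result to *aligned and returns true, or returns false
 * without touching *aligned if rounding up would overflow.
 */
static inline bool example_page_align_up(uint64_t size, uint64_t *aligned)
{
    if (size > UINT64_MAX - (EXAMPLE_PAGE_SIZE - 1)) {
        return false;
    }
    *aligned = (size + EXAMPLE_PAGE_SIZE - 1) & ~(EXAMPLE_PAGE_SIZE - 1);
    return true;
}
```

If page alignment became part of the GPU protocol, a check like `example_is_page_aligned` could instead be used to reject unaligned requests rather than silently adjusting them.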
