Owned pixel buffer for no-copy presentation #65

ids1024 · 2023-01-11T00:10:19Z

WIP implementation of #40.

Currently this only has Wayland. It can be run with cargo run --no-default-features --features wayland --example winit.

We should do some testing to see what impact it actually has.

This is based on the API that will be used for no-copy presentation. But wraps it in `set_buffer`. This also fixes the wayland buffer code to set `self.width` and `self.height` on resize, and set the length of the shared memory file when the buffer is created.

Currently only for `wayland` and the `winit` example.

ids1024 · 2023-01-11T00:41:27Z

Benchmarking is always subtle and easy to mess up, but comparing the time set_buffer takes against resize/buffer_mut/present, it seems like in the winit example this has about an order of magnitude decrease in the overhead softbuffer adds? (very roughly 1ms to 100µs, for instance, depending on window size).

It makes sense that this would be pretty efficient. In the winit example we shouldn't ever wait on buffer releases since we only draw when the server requests a draw, and Wayland is generally an asynchronous protocol, so while there may be a few context switches we shouldn't actually need to wait on the compositor for anything. So copying the buffer was the main overhead here.

ids1024 · 2023-01-11T02:44:38Z

Added code for Windows, using BitBlt. Looks like the performance is better there, but not by as much?

I guess it would be possible to write some portable automated tests winit::Window::set_inner_size to test at a particular size and test resizing. Maybe a timer to try redrawing periodically without a resize (or rust-windowing/winit#2412 if that was implemented). And something to calculate average/min/max/etc. times for each function call and the total.

src/wayland/buffer.rs

notgull · 2023-01-11T17:17:44Z

I added the X11 implementation for this. While we're rewriting the API, do we want to make these functions return Results? On X11 at least there are a few failure states that can happen during the buffering process, and I either log or panic on them right now.

src/wayland/buffer.rs

src/lib.rs

ids1024 · 2023-01-11T18:15:15Z

I added the X11 implementation for this. While we're rewriting the API, do we want to make these functions return Results? On X11 at least there are a few failure states that can happen during the buffering process, and I either log or panic on them right now.

Yeah, the question is whether every method should return a Result, or if some are expected to never fail. Like resize is clearly infallible as implemented on Wayland now (just setting member variables). But that could fail if it actually performed resizing. Or maybe if the u32 is out of range for i32, which Wayland uses (should never happen, so maybe better to just panic if anything).

For the public API, we should generally document what blocking, panicking, and errors are expected in each call. Including platform specific concerns if relevant.

notgull · 2023-01-11T18:18:16Z

Yeah, the question is whether every method should return a Result, or if some are expected to never fail.

At the very least, present() should return a Result, because that's usually where the most failure-prone API calls happen. I'm neutral on resize() and buffer_mut(). The way I have them implemented right now, errors can happen, it's just that unless there's an underlying I/O error or another call failed, it's unlikely to happen.

Also, I added support to the web backend for this PR.

src/win32.rs

ids1024 · 2023-01-11T21:26:17Z

So a lot of the questions here are tied to the design of the public API:

Surface sizes

Should we use u32 or NonZeroU32 for window sizes?
resize() should possibly panic if the value is somehow too large. (Too large for an i32 on Wayland; which would be an utterly absurd size. May be limited to 16-bit on some backends.)
If a size of 0 is allowed, how does this behave? Including resizing from a larger size.
Should Surface::new take width/height, or do we initialize to 0 or 1 and expect resize to be called later?

Thread-safety

softbuffer::Surface likely needs to be !Send/!Sync on some backends. This should be done explicitly and not backend-dependent.
Is it necessary/desirable for softbuffer::Context to be !Send/!Sync too?

Error handling

present() should definitely return a Result. I guess resize() and buffer_mut() should be too, but whether they actually fail will depend on backend?
Or resize() can be non-blocking and never fail, and defer any buffer allocating/resizing to the next buffer_mut() call.
If we want to not panic even on things like disconnect from the display server, we need to return a Result or silently ignore errors in any call that might communicate with the display server. (A later call like present may still get an error.)

Documentation

We should document what errors are expected on functions returning Result. And what may cause panics, if applicable.
The public API should document what assumptions are required for soundness. Like buffer_mut() trusts the display server not to mutate shared memory on backends using shared memory.
What calls block? I guess we assume things like buffer_mut() won't block for long if the display server is well-behaved.

notgull · 2023-01-11T23:03:33Z

My opinions, in no particular order:

resize() should possibly panic if the value is somehow too large. (Too large for an i32 on Wayland; which would be an utterly absurd size. May be limited to 16-bit on some backends.)

We should return an error instead; the maximum size varies between backends, so this is a legitimate failure state rather than an unexpected issue.

present() should definitely return a Result. I guess resize() and buffer_mut() should be too, but whether they actually fail will depend on backend?

present() and resize() should return a Result. buffer_mut() shouldn't and should just panic on failure. The case where is actually panics on failure (on X11 at least) is startingly rare.

Or resize() can be non-blocking and never fail, and defer any buffer allocating/resizing to the next buffer_mut() call.

I don't think this is a good idea; buffer_mut() should only block in the event that the SHM buffer is still being flushed out.

If we want to not panic even on things like disconnect from the display server, we need to return a Result or silently ignore errors in any call that might communicate with the display server. (A later call like present may still get an error.)

Silently ignoring errors is a recipe for disaster, I disagree with this on principle.

Should we use u32 or NonZeroU32 for window sizes?

NonZeroU32 should be used, IMO.

softbuffer::Surface likely needs to be !Send/!Sync on some backends. This should be done explicitly and not backend-dependent.

Yes, we should use PhantomData<*const ()> or something to avoid surprises.

Is it necessary/desirable for softbuffer::Context to be !Send/!Sync too?

Yes, to future-proof it.

Should Surface::new take width/height, or do we initialize to 0 or 1 and expect resize to be called later?

buffer_mut() should panic if it's called before resize().

ids1024 · 2023-01-12T00:10:18Z

We should return an error instead; the maximum size varies between backends, so this is a legitimate failure state rather than an unexpected issue.

If the size is simply passed along from winit we know it must already be in range for the backend. (Are there expected uses where it might pass something out of range?). But returning an error is easy enough if we're making that function return a Result anyway. This would more be a consideration for if we didn't.

buffer_mut() shouldn't and should just panic on failure. The case where is actually panics on failure (on X11 at least) is startingly rare.
buffer_mut() should only block in the event that the SHM buffer is still being flushed out.

In the Wayland backend, buffer_mut currently also allocates and resizes the buffers. But with Wayland none of that should block (or I guess it would block of the socket write if the buffer was full, until the compositor reads?). And the only fallible part of that is allocating and mapping shared memory, which should only really happen when out of file descriptors / memory / virtual memory. So panics make sense.

That could be moved to resize. But regardless, both methods need to block on buffer release. Waiting for buffer release could block on the compositor for any amount of time but probably shouldn't block for long. And wayland-client could return a DispatchError. So that needs to be turned into a Err return value, ignored, or turned into a panic.

Silently ignoring errors is a recipe for disaster

Well, specifically if the error is permanent loss of connection to the display server, it would be fine since we'd be guaranteed to get the exact same error when present is called. But yeah, ignoring other errors would be problematic.

Yes, to future-proof it.

It also lets use use Rc/RefCell/etc instead of Arc/Mutex.

buffer_mut() should panic if it's called before resize().

Hm. Hadn't thought of that option. I guess a default doesn't entirely make sense. Either that isn't what you wanted, and a panic would be clearer, or if you really want that you can just call .resize(1, 1).

Currently in the winit example I have it calling .resize() and .buffer_mut() every draw. So another option would be to combine them into a single call, statically preventing this panic.

But if we add an option for alpha and even other (potentially not 32-bit) pixel formats, that also impacts the API here.

i509VCB · 2023-01-12T00:26:24Z

We should probably have a test to ensure that the Send and Sync impls are correct.

I believe the static_assertions crate can do that

ids1024 · 2023-01-13T17:05:41Z

A simple macOS implementation for this API could just store a Vec that can be written into, then turned into a CGDataProvider and CGImage. So exactly like the current version but saving a copy. Swapping IOSurface front/back buffers may be viable as a more optimal solution (And these APIs seem to exist on iOS too).

@jackpot51 I'm not sure how this would be handled in the Orbital implementation. Unlike the others it fetches the current size of the window and maps it, then iterates over y copying each line. Rather than mapping a buffer for a exactly the width/height passed. (Also does this handle synchronization in some implicit way, or is there a possible race if the window is resized between getting the size and mapping, or while mapped?)

Still probably not the most efficient implementation.

ids1024 · 2023-01-14T01:47:14Z

Oh, I guess the size of the window on orbital only changes when the client sends a call to change it.

So that makes sense but clashes somewhat with the way the API here works. Probably the size should match, but we don't want to panic when a size that doesn't match the window size is passed.

madsmtm · 2023-01-15T02:02:43Z

We should probably have a test to ensure that the Send and Sync impls are correct.

I believe the static_assertions crate can do that

You can't really do that, since you'd need to specify a negative bound. However, you can do:

/// ```compile_fail
/// use softbuffer::Context;
/// fn needs_send<T: Send>() {}
/// needs_send::<Context>();
/// ```
/// ```compile_fail
/// use softbuffer::Surface;
/// fn needs_send<T: Send>() {}
/// needs_send::<Surface>();
/// ```
extern "C" {}

Should Surface::new take width/height, or do we initialize to 0 or 1 and expect resize to be called later?

Surface::new already takes HasRawWindowHandle, can't it just figure out the initial height itself from that? (At least, I know it can on macOS).

softbuffer::Surface likely needs to be !Send/!Sync on some backends. This should be done explicitly and not backend-dependent.

While we're at it, Surface::new on macOS is only safe to call from the main thread, so perhaps we could enforce that as well? Though I guess I could also work around that by submitting work to the main thread queue instead.

i509VCB · 2023-01-15T02:04:55Z

assert_not_impl_all is in static_assertions and should work.

madsmtm · 2023-01-15T02:29:32Z

assert_not_impl_all is in static_assertions and should work.

Huh, cool!

ids1024 · 2023-01-17T17:37:11Z

While we're at it, Surface::new on macOS is only safe to call from the main thread, so perhaps we could enforce that as well? Though I guess I could also work around that by submitting work to the main thread queue instead.

Perhaps that should just be documented as a requirement of the unsafe new() functions. The caller is responsible for making sure the handles are valid as long as the Context/Surface exist. It make sense that would require they are called on the right thread if that is a requirement on the platform.

It would be nice if there could be an entirely safe API rather than the unsafely and unenforced lifetimes inherent in raw-window-handle... but I'm not sure how that could work across platforms.

ids1024 · 2023-03-16T15:39:11Z

There are some other things I'd like to work on in Softbuffer, but I wouldn't really want to work on any non-trivial changes until this is merged.

kchibisov

Few notes about Wayland impl.

src/cg.rs

src/wayland/mod.rs

daxpedda

The web implementation is LGTM.

Sorry to drop my 2c here a bit late to the party, but I was wondering why the original Surface::set_buffer() method was completely replaced instead of just amending buffer_mut() to Surface.

The issue is that if a user is unable to avoid not copying the buffer, for example they receive the buffer already as a Vec from somewhere else, if a platform doesn't support "no-copy" the user will be copying twice now: once from their buffer to the softbuffer buffer, then softbuffer will copy from it's buffer to the OS buffer.

This affects at least two Tier 1 platforms: X11 (sometimes) and MacOS. Additionally it affects Orbital (sometimes) and Web (in a weird way).

I propose to keep Surface::set_buffer() and maybe even go a step further, return an error in buffer_mut() if the surface can not be represented in a no-copy manner, making sure the user is aware and doesn't get the misconception that this is a no-copy operation.

This is off-topic for this PR, but originally I would have liked to introduce a new function in an extension trait for the Web target to be able to set the buffer in the correct format (RGBA vs 0RGB), avoiding the extra copying. This wouldn't play well with the missing Surface::set_buffer().

I saw in #17 that maybe there is the intention to introduce multiple Surface formats, which would solve that problem in a different, probably better, way.

daxpedda · 2023-03-29T11:08:00Z

src/orbital.rs

        {
            // Map window buffer
            let window_map =
-                unsafe { OrbitalMap::new(window_fd, window_width * window_height * 4) }
+                unsafe { OrbitalMap::new(self.window_fd(), window_width * window_height * 4) }
                    .expect("failed to map orbital window");

            // Window buffer is u32 color data in 0xAABBGGRR format


@jackpot51 sorry for my naivety here, but this comment suggests that the buffer taken and returned is partly reversed compared to what is expected by the rest of the library, ABGR vs 0RGB. Isn't that a problem?

Hm, good point. In my testing it seemed like the colors where the same on all platforms, but something isn't right with the documentation here, at least.

In any case the existing implementation isn't swapping components and this comment is already here, so this isn't something introduced in this PR. (Though if the format didn't match that might impact the possibility of no-copy on Orbital.)

src/lib.rs

ids1024 · 2023-03-29T17:00:59Z

I was wondering why the original Surface::set_buffer() method was completely replaced instead of just amending buffer_mut() to Surface.

Basically there are two advantages to removing it:

Supporting both adds a bit of complexity to the library and all backends. Unless set_buffer is implemented using buffer_mut, which addresses any perceived "convince" but is problematic for performance, and deceptive if it results in 2 copies.
If users of the library can easily use buffer_mut without just copying into it from another buffer, they should.

Arguably the crate is more intuitive with only one way to do things.

This affects at least two Tier 1 platforms: X11 (sometimes) and MacOS. Additionally it affects Orbital (sometimes) and Web (in a weird way).

Usually XShm should be available so the no-copy path can be used. If you use X over a network that is likely more of a bottleneck than the copy anyway, so that shouldn't be too much of an issue. I believe macOS should be able to support no-copy too, but that isn't implemented yet. The handling of Orbital is awkward, but generally correct use of the library should have a buffer size that matches window size.

For web... yeah, that's awkward if it can't support the same format that is used everywhere else.

I propose to keep Surface::set_buffer() and maybe even go a step further, return an error in buffer_mut() if the surface can not be represented in a no-copy manner, making sure the user is aware and doesn't get the misconception that this is a no-copy operation.

I definitely don't agree on making buffer_mut error if it isn't no-copy. Unless I'm missing something, that should be at least as efficient as a set_buffer method on all backends when the application is rendering into the buffer. And it just makes things more difficult for an application to have to handle this explicitly. If some applications will want to behave differently if no-copy is supported than if it isn't (when would that be applicable?), there could be a method returning a bool to query that.

In my opinion set_buffer doesn't seem that useful for normal application windows, since you'll want to dynamically render the window contents based on window size, etc. Though maybe for some niche applications, and for surfaces other than ordinary windows (like a drag-and-drop icon).

We could support both. It adds some complexity to the implementation, and we'll need to document to favor buffer_mut unless that could only be used by copying from another buffer.

I saw in #17 that maybe there is the intention to introduce multiple Surface formats, which would solve that problem in a different, probably better, way.

We definitely at least need a way to chose between alpha and non-alpha modes. There are a lot of formats that could be supported, depending on backend. So what is worth supporting here and how much complexity is worth adding is up for debate.

daxpedda · 2023-03-29T20:49:17Z

Usually XShm should be available so the no-copy path can be used. If you use X over a network that is likely more of a bottleneck than the copy anyway, so that shouldn't be too much of an issue. I believe macOS should be able to support no-copy too, but that isn't implemented yet.

I wasn't aware of this. My motivation was the (potential) overhead when no-copy isn't available, so that's moot as soon as MacOS gets implemented.

I think for Web we need a different solution anyway, which we can address later.

ids1024 · 2023-03-29T21:01:35Z

In particular, macOS (and iOS) has something called IOSurface which I think we could use with the right format, memory map, using a pair to swap as front/back buffers? But I have absolutely no familiarity with this API, so if anyone more familiar with Apple platforms knows anything about this, it would be good to hear how this sounds.

We can also add back set_buffer, or something like it, later if it seems necessary.

ids1024 · 2023-04-04T02:39:12Z

Updated to use NonZeroU32. This would be more convenient if winit returned that instead of u32, but it's not too bad. We seem to be in agreement to leave out an API like set_buffer() for now, and change that later if it seems necessary.

kchibisov

Use .get() on NonZero types, instead of u32::from.

src/cg.rs

notgull · 2023-04-06T13:16:17Z

Great job, everyone!

* On MacOS, the contents scale is updated when set_buffer() is called, to adapt when the window is on a new screen (#68). * **Breaking:** Split the `GraphicsContext` type into `Context` and `Surface` (#64). * On Web, cache the document in the `Context` type (#66). * **Breaking:** Introduce a new "owned buffer" for no-copy presentation (#65). * Enable support for multi-threaded WASM (#77). * Fix buffer resizing on X11 (#69). * Add a set of functions for handling buffer damage (#99). * Add a `fetch()` function for getting the window contents (#104). * Bump MSRV to 1.64 (#81).

ids1024 added 2 commits January 10, 2023 16:06

wayland: Use mmap

c12cba8

This is based on the API that will be used for no-copy presentation. But wraps it in `set_buffer`. This also fixes the wayland buffer code to set `self.width` and `self.height` on resize, and set the length of the shared memory file when the buffer is created.

WIP no-copy in public API

8577275

Currently only for `wayland` and the `winit` example.

win32: Use owned buffer and Gdi:BitBlt

7a5f61d

ids1024 force-pushed the owned-buffer branch from 017875c to 7a5f61d Compare January 11, 2023 02:16

Drakulix reviewed Jan 11, 2023

View reviewed changes

src/wayland/buffer.rs Show resolved Hide resolved

notgull added 2 commits January 11, 2023 09:13

Implement X11 owned buffer

de2f37d

fmt

ce59528

i509VCB reviewed Jan 11, 2023

View reviewed changes

src/wayland/buffer.rs Show resolved Hide resolved

src/lib.rs Outdated Show resolved Hide resolved

src/lib.rs Outdated Show resolved Hide resolved

Add owned buffer for web

579656a

i509VCB reviewed Jan 11, 2023

View reviewed changes

src/win32.rs Show resolved Hide resolved

ids1024 added 3 commits January 13, 2023 15:37

cg: Port to owned buffer

256ff9e

Still probably not the most efficient implementation.

win32: Fix length of buffer slice

8c3275c

Port more examples to owned buffers

b16e00e

ids1024 added 2 commits January 18, 2023 13:18

Merge remote-tracking branch 'origin/master' into owned-buffer

39cad4e

orbital: Implement new API (but with copy)

5e8a09d

kchibisov requested changes Mar 28, 2023

View reviewed changes

src/cg.rs Outdated Show resolved Hide resolved

src/wayland/mod.rs Show resolved Hide resolved

src/wayland/mod.rs Show resolved Hide resolved

src/wayland/mod.rs Show resolved Hide resolved

john01dav mentioned this pull request Mar 28, 2023

buffer width and height are only u16 #13

Closed

daxpedda approved these changes Mar 29, 2023

View reviewed changes

ids1024 mentioned this pull request Mar 31, 2023

Investigate more optimal way to implement CoreGraphics backend #83

Open

Updates to documentation for Surface

20ab5fd

ids1024 force-pushed the owned-buffer branch from 6e3d41e to 872c66a Compare April 4, 2023 02:31

ids1024 requested review from john01dav and kchibisov April 5, 2023 17:46

ids1024 mentioned this pull request Apr 5, 2023

set_buffer is slow #18

Open

kchibisov requested changes Apr 5, 2023

View reviewed changes

ids1024 force-pushed the owned-buffer branch from 872c66a to 26304be Compare April 5, 2023 22:31

ids1024 requested a review from kchibisov April 5, 2023 22:38

kchibisov reviewed Apr 5, 2023

View reviewed changes

src/cg.rs Outdated Show resolved Hide resolved

Use NonZeroU32 for arguments to Surface::resize

58e277d

ids1024 force-pushed the owned-buffer branch from 26304be to 58e277d Compare April 5, 2023 22:43

kchibisov approved these changes Apr 5, 2023

View reviewed changes

kchibisov merged commit a09e4cf into master Apr 6, 2023

kchibisov deleted the owned-buffer branch April 6, 2023 07:31

daxpedda mentioned this pull request Apr 6, 2023

Add from_canvas() to Surface for Wasm #76

Merged

notgull mentioned this pull request Apr 6, 2023

Make softbuffer window contexts own the pixel buffers. #40

Closed

ids1024 mentioned this pull request Apr 6, 2023

Damage in presentation #39

Closed

ids1024 mentioned this pull request May 13, 2023

Initial Window Size #106

Open

notgull mentioned this pull request Jun 4, 2023

v0.3.0 #118

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Owned pixel buffer for no-copy presentation #65

Owned pixel buffer for no-copy presentation #65

ids1024 commented Jan 11, 2023

ids1024 commented Jan 11, 2023

ids1024 commented Jan 11, 2023

notgull commented Jan 11, 2023

ids1024 commented Jan 11, 2023

notgull commented Jan 11, 2023

ids1024 commented Jan 11, 2023

notgull commented Jan 11, 2023

ids1024 commented Jan 12, 2023

i509VCB commented Jan 12, 2023

ids1024 commented Jan 13, 2023

ids1024 commented Jan 14, 2023

madsmtm commented Jan 15, 2023 •

edited

Loading

i509VCB commented Jan 15, 2023

madsmtm commented Jan 15, 2023

ids1024 commented Jan 17, 2023

ids1024 commented Mar 16, 2023

kchibisov left a comment

daxpedda left a comment •

edited

Loading

daxpedda Mar 29, 2023 •

edited

Loading

ids1024 Mar 29, 2023

ids1024 Apr 1, 2023

ids1024 commented Mar 29, 2023

daxpedda commented Mar 29, 2023

ids1024 commented Mar 29, 2023

ids1024 commented Apr 4, 2023

kchibisov left a comment

notgull commented Apr 6, 2023

Owned pixel buffer for no-copy presentation #65

Owned pixel buffer for no-copy presentation #65

Conversation

ids1024 commented Jan 11, 2023

ids1024 commented Jan 11, 2023

ids1024 commented Jan 11, 2023

notgull commented Jan 11, 2023

ids1024 commented Jan 11, 2023

notgull commented Jan 11, 2023

ids1024 commented Jan 11, 2023

Surface sizes

Thread-safety

Error handling

Documentation

notgull commented Jan 11, 2023

ids1024 commented Jan 12, 2023

i509VCB commented Jan 12, 2023

ids1024 commented Jan 13, 2023

ids1024 commented Jan 14, 2023

madsmtm commented Jan 15, 2023 • edited Loading

i509VCB commented Jan 15, 2023

madsmtm commented Jan 15, 2023

ids1024 commented Jan 17, 2023

ids1024 commented Mar 16, 2023

kchibisov left a comment

Choose a reason for hiding this comment

daxpedda left a comment • edited Loading

Choose a reason for hiding this comment

daxpedda Mar 29, 2023 • edited Loading

Choose a reason for hiding this comment

ids1024 Mar 29, 2023

Choose a reason for hiding this comment

ids1024 Apr 1, 2023

Choose a reason for hiding this comment

ids1024 commented Mar 29, 2023

daxpedda commented Mar 29, 2023

ids1024 commented Mar 29, 2023

ids1024 commented Apr 4, 2023

kchibisov left a comment

Choose a reason for hiding this comment

notgull commented Apr 6, 2023

madsmtm commented Jan 15, 2023 •

edited

Loading

daxpedda left a comment •

edited

Loading

daxpedda Mar 29, 2023 •

edited

Loading