RGB images get converted to RGBA images before getting uploaded to the GPU, which is causing a huge performance drop #2306

Open
zrezke opened this issue Jun 2, 2023 · 3 comments
Labels
🚀 performance Optimization, memory use, etc 🔺 re_renderer affects re_renderer itself 📺 re_viewer affects re_viewer itself

Comments

@zrezke
Contributor

zrezke commented Jun 2, 2023

Description

Displaying an RGB image for the first time is very expensive, because the image gets converted to RGBA before being sent to the GPU.
I did some profiling with a 4k video stream, where this issue really becomes apparent:
RGB image profiling

Suggestion to resolve this issue

Instead of using TextureFormat::Rgba8Unorm, maybe we could use a single-channel format like TextureFormat::R8Unorm, upload the RGB data to the GPU unchanged inside that texture, and then do some extra work in the shader to correctly sample the R8Unorm texture as an RGB texture.

If this performance issue gets resolved, it would be really useful, especially for live-streaming data, where you really want to minimize the time spent loading and converting the data.
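To make the suggested lookup concrete, here is a minimal CPU-side sketch of the index math such a shader would need. It assumes the tightly packed RGB bytes are uploaded as a single-channel texture that is three texels wide per pixel. The function name `fetch_rgb_from_r8` is hypothetical and not part of re_renderer or wgpu.

```rust
/// Hypothetical CPU-side model of the proposed shader lookup: the tightly
/// packed RGB bytes are treated as a single-channel (R8) texture that is
/// three texels wide per image pixel.
fn fetch_rgb_from_r8(r8_texels: &[u8], image_width: usize, x: usize, y: usize) -> [u8; 3] {
    let r8_row_width = image_width * 3; // width of the R8 texture in texels
    let base = y * r8_row_width + x * 3; // texel holding the red channel of (x, y)
    [r8_texels[base], r8_texels[base + 1], r8_texels[base + 2]]
}

fn main() {
    // A 2x2 RGB image, used as-is with no padding to RGBA.
    let rgb: [u8; 12] = [
        255, 0, 0, 0, 255, 0, // row 0
        0, 0, 255, 10, 20, 30, // row 1
    ];
    let pixel = fetch_rgb_from_r8(&rgb, 2, 1, 1);
    println!("{pixel:?}"); // prints [10, 20, 30]
}
```

In an actual shader this would become three texel loads per output pixel instead of one, which is the "extra work" mentioned above.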

@zrezke zrezke added other Generated by the "Other" issue template 👀 needs triage This issue needs to be triaged by the Rerun team labels Jun 2, 2023
@Wumpf Wumpf added 🔺 re_renderer affects re_renderer itself 📺 re_viewer affects re_viewer itself 🚀 performance Optimization, memory use, etc and removed other Generated by the "Other" issue template 👀 needs triage This issue needs to be triaged by the Rerun team labels Jun 2, 2023
@emilk
Member

emilk commented Jun 9, 2023

This looks surprisingly slow to me - is this a debug build?

@zrezke
Contributor Author

zrezke commented Jun 9, 2023

Not sure what I was running at the time. I tried it again today with a fresh pip install:

rerun --version
rerun_py 0.6.0 [rustc 1.69.0 (84c898d65 2023-04-16), LLVM 15.0.7] x86_64-unknown-linux-gnu release-0.6 643dea9, built 2023-05-25T20:35:24Z

The times for pad_rgb_to_rgba range from ~6.5 ms to ~13 ms.

Screenshot from 2023-06-09 12-36-09

Screenshot from 2023-06-09 12-35-26
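For a sense of how much data a padding step like this touches per frame, here is a rough sketch. It assumes "4k" means 3840x2160 at 8 bits per channel. `pad_rgb_to_rgba_sketch` is an illustrative stand-in, not the viewer's actual `pad_rgb_to_rgba`.

```rust
/// Illustrative stand-in for a CPU-side RGB -> RGBA padding step.
/// NOT the viewer's actual `pad_rgb_to_rgba`; it only shows the kind of work involved.
fn pad_rgb_to_rgba_sketch(rgb: &[u8], alpha: u8) -> Vec<u8> {
    assert_eq!(rgb.len() % 3, 0, "input must be tightly packed RGB");
    let mut rgba = Vec::with_capacity(rgb.len() / 3 * 4);
    for px in rgb.chunks_exact(3) {
        // Copy the three color channels and append a constant alpha.
        rgba.extend_from_slice(&[px[0], px[1], px[2], alpha]);
    }
    rgba
}

fn main() {
    // Assumption: "4k" means 3840x2160 with 8 bits per channel.
    let (width, height) = (3840usize, 2160usize);
    let rgb = vec![0u8; width * height * 3];
    let rgba = pad_rgb_to_rgba_sketch(&rgb, 255);
    // prints "read 24883200 bytes, wrote 33177600 bytes"
    println!("read {} bytes, wrote {} bytes", rgb.len(), rgba.len());
}
```

Under that assumption, every frame means reading about 24.9 MB and writing about 33.2 MB on the CPU before any upload happens.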

Wumpf pushed a commit that referenced this issue Jun 9, 2023
…2345)

### What
Helps #2306 a bit

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)

<!-- This line will get updated when the PR build summary job finishes.
-->
PR Build Summary: https://build.rerun.io/pr/2345

<!-- pr-link-docs:start -->
Docs preview: https://rerun.io/preview/7bed2a1/docs
Examples preview: https://rerun.io/preview/7bed2a1/examples
<!-- pr-link-docs:end -->
emilk added a commit that referenced this issue Jun 15, 2023
…2345)

Helps #2306 a bit

* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)

<!-- This line will get updated when the PR build summary job finishes.
-->
PR Build Summary: https://build.rerun.io/pr/2345

<!-- pr-link-docs:start -->
Docs preview: https://rerun.io/preview/7bed2a1/docs
Examples preview: https://rerun.io/preview/7bed2a1/examples
<!-- pr-link-docs:end -->
@Wumpf
Member

Wumpf commented Nov 13, 2024

We moved all the image conversion to re_renderer now, since some of those conversions are GPU-driven (see #7700).
We should try to make this one GPU-driven as well:
This is absolutely trivial if we support random buffer access (which is tied to compute shader support and as such is not available on WebGL): just upload the buffers and write everything out in a fragment shader (yes, it could be a compute shader, but why bother! A fragment shader is likely faster for this operation).
Doing this with textures all the way is possible, but not only does it likely have more overhead, we may also hit the max texture size limit much earlier, since we'd need to upload the data as an R8/R16/R32(u/i/f) texture that has 3x the width.
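To put the texture size concern in numbers, here is a small sketch of the width arithmetic. The limits used are assumed example values, not values queried from any device, and `max_rgb_width` is a hypothetical helper.

```rust
/// Widest RGB image that fits when every pixel occupies three
/// single-channel texels in a row.
fn max_rgb_width(max_texture_dimension: u32) -> u32 {
    max_texture_dimension / 3
}

fn main() {
    // Assumed example limits for illustration; real code would query the device.
    for limit in [2048u32, 8192, 16384] {
        println!(
            "limit {limit}: RGBA up to {limit} px wide, RGB-as-R8 up to {} px wide",
            max_rgb_width(limit)
        );
    }
}
```

With an assumed limit of 8192, the widest image that still fits is 2730 pixels, so a 3840-pixel-wide frame (11520 texels per row) would not fit in a single texture.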
