
Gamma correctness phase one #1302

Closed

Conversation

@AnyOldName3 (Contributor, Author) commented Oct 14, 2024:

As discussed in #1291, there are quite a few gamma-correctness issues and annoying-to-work-around obstacles in the VSG. Between this and the related PRs vsg-dev/vsgXchange#204 and vsg-dev/vsgExamples#323, things should be more manageable.

This PR:

  • Changes the default surface image format from a UNORM one to an SRGB one, enabling automatic linear-to-sRGB conversions when data is written to the framebuffer. We're already telling the driver that we're writing sRGB data to the surface as we're giving it an sRGB colour space. This should avoid things like alpha blending issues caused by doing the sRGB conversion first. The vsgExamples PR includes an update to the PBR shader to remove the now-redundant conversion it was doing at the end.
  • Fixes the conversion functions in color.h as they did the opposite of what they said. The call site I found seemed to be calling the one with the right implementation but wrong name, so I switched it over.
  • Adds extra fields to vsg::Options to control when vsgXchange converts colour spaces in loaded data.

Anything with the potential to break existing applications can be opted out of: you can still explicitly ask for a UNORM surface format, and you can set the source and destination colour spaces via vsg::Options.

We were already telling the implementation to interpret it as if it were sRGB via VK_COLOR_SPACE_SRGB_NONLINEAR_KHR, but with the image format as UNORM rather than SRGB, we'd have to explicitly convert before emitting the colour from the shader, even though that makes blending wrong.
This should allow greater control when loading data that played it fast and loose with gamma correctness, which is unfortunately pretty common.
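
For applications that need the old behaviour, here's a minimal sketch of opting back into a UNORM surface format, using the WindowTraits::swapchainPreferences members visible in the Window.cpp diff below (treat this as illustrative rather than canonical):

```cpp
#include <vsg/all.h>

// Explicitly request the old UNORM surface format; the colour space still
// tells the presentation engine to treat the stored bytes as sRGB, so
// shaders are then responsible for writing sRGB values themselves.
auto traits = vsg::WindowTraits::create();
traits->swapchainPreferences.surfaceFormat =
    {VK_FORMAT_B8G8R8A8_UNORM, VK_COLOR_SPACE_SRGB_NONLINEAR_KHR};
auto window = vsg::Window::create(traits);
```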
```diff
@@ -41,7 +41,7 @@ Window::Window(ref_ptr<WindowTraits> traits) :
 {
     if (_traits && (_traits->swapchainPreferences.surfaceFormat.format == VK_FORMAT_B8G8R8A8_SRGB || _traits->swapchainPreferences.surfaceFormat.format == VK_FORMAT_B8G8R8_SRGB))
     {
-        _clearColor = linear_to_sRGB(_clearColor);
+        _clearColor = sRGB_to_linear(_clearColor);
```
robertosfield (Collaborator):

If the swapchain is sRGB, then why would one need to convert from sRGB to linear?

The default _clearColor has been linear up to this point.

AnyOldName3 (Contributor, Author):

The image format was previously _UNORM with the colour space set to sRGB, so there was no automatic from-sRGB conversion happening during the clear before this PR, and the value of _clearColor was what would end up on an sRGB screen. As linear_to_sRGB was really doing an sRGB-to-linear conversion before this PR, that meant you'd end up with a nonsense colour, but it looked fine because it's not as if the precise maths used to derive the clear colour particularly matters.

When the framebuffer image format was changed to _SRGB, Vulkan started doing an automatic linear-to-sRGB conversion during the clear, which, combined with the fix to make the colour space conversion functions match their names, made the clear colour really dark due to a double linear-to-sRGB conversion.

As people typically type out colours in perceptually uniform sRGB space (e.g. they'd have chosen it with a colour picker in an image editor, which nearly always uses sRGB by default), and Vulkan was now expecting a linear colour, it seemed most reasonable to do an sRGB-to-linear conversion here.

robertosfield (Collaborator):

I get that sRGB_to_linear may have been wrong, but if the default _clearColor is set in linear and you want it to be sRGB, then the appropriate function to call is linear_to_sRGB(..).

The fact that you've flipped it to sRGB_to_linear() doesn't make sense in this context. It suggests to me that something somewhere is going wrong: the original function usage read sensibly, and now it doesn't, even if it gives you the "right" visual result.

robertosfield (Collaborator):

I am now inclined to merge this PR as a branch and start testing. I think the changes to Window.cpp really need an explanation, as it currently looks like something perverse is being done.

@AnyOldName3 (Contributor, Author) commented Oct 15, 2024:

EDIT: Some of the text below is bogus, as I'd missed something in the original code; there's a more correct explanation below. I'm not using strikethrough on this post because it still explains my why-I-thought-the-colour-literal-was-in-sRGB thought process.

The original function usage didn't make sense. The original _clearColor value was being interpreted as sRGB due to the _UNORM image format paired with _SRGB_NONLINEAR colour space, and yet the {0.2f, 0.2f, 0.4f, 1.0f} was going through an operation claiming to be an sRGB-to-linear conversion. Obviously, you don't convert something from the colour space it's about to be interpreted as to a different one.

If {0.2f, 0.2f, 0.4f, 1.0f} is linear, then the correct original behaviour would be a linear-to-sRGB conversion (which is what the function call said it was doing, but it was instead doing the opposite) and the new correct behaviour would be to do no conversion at all.

If {0.2f, 0.2f, 0.4f, 1.0f} is sRGB, then the correct original behaviour would be to do no conversion at all, and the new correct behaviour would be to do an sRGB-to-linear conversion.

If the displayed colour on the screen was correct, then that would suggest that {0.2f, 0.2f, 0.4f, 1.0f} was in a made-up colour space with a 4.84 gamma value (as it became sRGB when undergoing an sRGB-to-linear conversion), and the sensible thing to do would be to write it out in a real colour space (whether sRGB or linear) instead.
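
To spell out where 4.84 comes from, using the gamma-2.2 approximation from color.h: the sRGB-to-linear step raises each channel to the power 2.2, and an sRGB display's decode applies another power of 2.2, so the two compose as

$$\left(c^{2.2}\right)^{2.2} = c^{2.2 \times 2.2} = c^{4.84}$$

meaning a colour that looks right after both steps must have been authored with an effective 4.84 gamma.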

As it seemed unlikely that you'd knowingly have used a mad colour space, called sRGB_to_linear on a colour you thought was already linear, or ignored a clear colour that was wildly different to the one you thought you'd specified, it seemed most plausible that {0.2f, 0.2f, 0.4f, 1.0f} was sRGB, so the code now does what's most appropriate for an sRGB colour.

AnyOldName3 (Contributor, Author):

Okay, it seems something's slightly off in my explanation. I made a test image:

[test image]

Both with and without this PR, the clear colour is displayed like the middle row on my machine. That makes sense with this PR, but as it was previously doing an sRGB-to-linear conversion (mislabelled as linear-to-sRGB) and then supposedly no further conversions, it doesn't make sense without this PR. I'm digging further to see what's going on.

AnyOldName3 (Contributor, Author):

Well that was dumb. I'd somehow missed that the conversion was happening in an if block predicated on the swapchain format. I'll take another look and revise some of what I said.

AnyOldName3 (Contributor, Author):

Okay, the new explanation is thankfully much simpler:

  • Before this PR, as nearly everything used a _UNORM image format, the _SRGB-image-format branch typically wasn't taken.

    • When it wasn't, the colour would be sent to Vulkan as {0.2f, 0.2f, 0.4f, 1.0f} and used to clear a _UNORM framebuffer, so it would be written as {0.2f, 0.2f, 0.4f, 1.0f}, and interpreted as _SRGB_NONLINEAR, so displayed on sRGB displays as {0.2f, 0.2f, 0.4f, 1.0f}.
    • When it was, the colour would go through an sRGB-to-linear conversion (mislabelled as linear-to-sRGB) to become {0.033f, 0.033f, 0.133f, 1.0f} and be used to clear an _SRGB framebuffer, which does a conversion, so it would be written as {0.2f, 0.2f, 0.4f, 1.0f}, then interpreted as _SRGB_NONLINEAR, so displayed on sRGB displays as {0.2f, 0.2f, 0.4f, 1.0f}.

    Everything worked fine on account of all the problems cancelling each other out:

    • It looks like you thought you should write linear colours to a _UNORM framebuffer and sRGB ones to an _SRGB one when they're both used with the _SRGB_NONLINEAR colour space, but it's the other way around: the image format tells you the target format colours will be converted to, which is why it's ill-advised to pair a _UNORM image format with an _SRGB_NONLINEAR colour space, as they don't match.
    • You thought the colour was already linear, so didn't convert it when you thought you wanted linear colour.
    • The conversion was backwards, so it converted the colour from sRGB to linear when you thought you had linear and wanted sRGB.
  • With this PR, nearly everything has an _SRGB image format, so the branch usually is taken.

    • When it isn't, just like before, the colour will be sent to Vulkan as {0.2f, 0.2f, 0.4f, 1.0f} and used to clear a _UNORM framebuffer, so it will be written as {0.2f, 0.2f, 0.4f, 1.0f}, and interpreted as _SRGB_NONLINEAR, so displayed on sRGB displays as {0.2f, 0.2f, 0.4f, 1.0f}.
    • When it is, the colour will go through a (correctly-labelled) sRGB-to-linear conversion to become {0.033f, 0.033f, 0.133f, 1.0f} and be used to clear an _SRGB framebuffer, which does a conversion, so it will be written as {0.2f, 0.2f, 0.4f, 1.0f}, then interpreted as _SRGB_NONLINEAR, so displayed on sRGB displays as {0.2f, 0.2f, 0.4f, 1.0f}.

    Everything works fine because we're aware we should write linear colours to an _SRGB framebuffer to benefit from the implicit to-sRGB conversion on store, and we're telling the device to interpret the resulting sRGB data as sRGB thanks to _SRGB_NONLINEAR being the colour space. We're also aware that when we've used a _UNORM framebuffer to bypass the automatic conversion, and still told the device to interpret the data as sRGB, we need to make sure we write sRGB ourselves.
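
Condensed into code (a sketch mirroring the Window.cpp diff above, not a verbatim copy):

```cpp
// Post-PR behaviour: _clearColor is assumed to be authored in sRGB.
if (_traits->swapchainPreferences.surfaceFormat.format == VK_FORMAT_B8G8R8A8_SRGB ||
    _traits->swapchainPreferences.surfaceFormat.format == VK_FORMAT_B8G8R8_SRGB)
{
    // The _SRGB framebuffer applies an implicit linear-to-sRGB encode on
    // clear, so pre-converting to linear makes the round trip reproduce
    // the authored value.
    _clearColor = sRGB_to_linear(_clearColor);
}
// else: a _UNORM framebuffer stores the value verbatim, and the
// _SRGB_NONLINEAR colour space presents it as sRGB, so no conversion.
```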

robertosfield (Collaborator):

Thanks for trying to come up with a coherent explanation.

The clearColor values used are simply what worked for the OSG, and I wanted to keep the background consistent to avoid too much of a jump for users going between the OSG and VSG. The implicit assumption was that the clear colour was in linear RGB colour space, and that the frame buffer and final display would also be in linear RGB colour space.

From my reading of what you've written, the assumption that the final display would honour the linear RGB colour space of the frame buffer is misplaced: monitors treat the linear RGB frame buffer as straight input without conversion and display it as sRGB, but these monitors are assuming the frame buffers are already in sRGB rather than linear RGB. This whole chain of conflicting assumptions is what we are in essence trying to untangle.

AnyOldName3 (Contributor, Author):

Vulkan's default is for everything internal to use linear colour, as lots of maths only works easily with linear colour. sRGB variants of image formats have an implicit sRGB-to-linear conversion when they're read, and an implicit linear-to-sRGB conversion when they're stored to, so all the maths the GPU does can happen in linear space, but the data stored in VRAM is still sRGB. During presentation, the driver takes the bytes from the swapchain image and interprets them as if they're in the colour space set with its surface format, which might mean just flinging them down the HDMI cable, converting them to the monitor's native colour space, or telling the monitor to switch its colour space, depending on how the system's configured and what the monitor's capable of.
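
As a minimal raw-Vulkan sketch of that contract (chooseSurfaceFormat is a hypothetical helper, not VSG API):

```cpp
#include <vector>
#include <vulkan/vulkan.h>

// Prefer an sRGB image format so stores to the swapchain get an implicit
// linear-to-sRGB encode, while the colour space tells the presentation
// engine that the stored bytes are sRGB.
VkSurfaceFormatKHR chooseSurfaceFormat(const std::vector<VkSurfaceFormatKHR>& available)
{
    for (const auto& candidate : available)
    {
        if (candidate.format == VK_FORMAT_B8G8R8A8_SRGB &&
            candidate.colorSpace == VK_COLOR_SPACE_SRGB_NONLINEAR_KHR)
        {
            return candidate; // shaders write linear; hardware encodes on store
        }
    }
    return available.front(); // fall back to the surface's first offering
}
```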

As we were previously setting up the swapchain with a UNORM image format, Vulkan wasn't converting the shader output and clear colour from what it assumed to be linear, and was instead storing them directly; and as we set the swapchain's colour space to sRGB with VK_COLOR_SPACE_SRGB_NONLINEAR_KHR, the data would be interpreted as sRGB. That meant that on an sRGB monitor, the clear colour we set was being used directly. If it's used directly on an sRGB monitor and looks right, then it's sRGB.

That's pretty much the same as what the OSG did by default. In OpenGL, if you don't glEnable(GL_FRAMEBUFFER_SRGB), there's no linear-to-sRGB conversion when the framebuffer is written to, so the shader output and clear colour are used directly, and the image will be assumed to be in the OS's native colour space, which will almost always be sRGB. The clear colour would therefore end up displayed unaltered on an sRGB monitor, so it would need to already be sRGB.
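
For comparison, the OpenGL opt-in looks like this (a real GL call, issued inside an active context with a loader providing GL_FRAMEBUFFER_SRGB; the OSG-era default is simply to leave it disabled):

```cpp
// When GL_FRAMEBUFFER_SRGB is disabled (the default), shader outputs and
// the clear colour are stored verbatim, so they need to already be sRGB.
// When enabled, GL encodes linear output to sRGB on writes to sRGB-capable
// framebuffers, matching Vulkan's behaviour with _SRGB image formats.
glEnable(GL_FRAMEBUFFER_SRGB);
```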

```diff
@@ -99,28 +99,28 @@ namespace vsg
     template<typename T>
     constexpr t_vec4<T> linear_to_sRGB(const t_vec4<T>& src)
     {
-        const T exponent = static_cast<T>(2.2);
+        const T exponent = static_cast<T>(1.0 / 2.2);
```
robertosfield (Collaborator):

What testing did you use to determine that the conversion was wrong? You've flipped the vsg::Window constructor and this local code, so the Window.cpp code is now doing exactly the same as before, but with flipped meaning.

AnyOldName3 (Contributor, Author):

I was feeding sRGB_to_linear sRGB colours with the vsgXchange PR, and getting nonsensical results. When I switched them, I got correct results. It's pretty easy to find sources saying these functions should work the way this PR makes them work. For a start, the conversion functions in the standard VSG PBR shader work this way.

robertosfield (Collaborator):

I noticed that the shaders use the same exponents as your PR.

The changes you've made to Window.cpp still seem odd. If the frame buffer is sRGB, why would you convert from sRGB to linear?

AnyOldName3 (Contributor, Author):

I've left a comment on that thread about it. When the framebuffer has an sRGB format, it means Vulkan expects to be fed linear colour and will then do a to-sRGB conversion any time anything is written to the framebuffer.

robertosfield (Collaborator):

Um.... OK, so that kinda explains the perverse conversion in Window.cpp on the _clearColor, but it's not an sRGB_to_linear conversion; it's really a pre-conversion of a linear colour to something that remains the appropriate colour when the framebuffer operation converts the linear colour to sRGB. Mathematically I can see that it's the same as sRGB_to_linear, but it's not actually doing that conversion.

@robertosfield (Collaborator):

I'm happy to see us migrate to sRGB frame buffers, but some of these changes have left me scratching my head, so I have added comments to the code changes that will hopefully highlight why I'm confused by them.

I think it would be useful to have a test example in vsgExamples to test the conversions between linear and sRGB.

```cpp
    linearRGB
};

/// Color space to use for vertex colors for scene graph
```
robertosfield (Collaborator):

Is this essentially the target colour space that users want to create their scene graphs in, so that they match the framebuffer colour space settings?

AnyOldName3 (Contributor, Author):

It's not necessarily there to match the framebuffer's image format (it's probably more common to have it as the opposite of the framebuffer's image format); it's to control the colour space that users want their scenegraph to use internally. Users who are doing explicit manual conversions in their shaders, or are being intentionally gamma-incorrect (e.g. to faithfully draw legacy content), will probably want sRGB vertex and material colours. Most users, though, will want linear vertex and material colours, will have their shaders operate entirely with linear colours, and will rely on the automatic conversions on store that come with using an sRGB image format for the framebuffer.
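
A hypothetical usage sketch (the exact names of the new vsg::Options members aren't shown in this excerpt, so sourceColorSpace and targetColorSpace below are illustrative placeholders):

```cpp
#include <vsg/all.h>

// Illustrative only: declare what colour space the incoming data is in and
// what colour space the scene graph should use, so the loaders know whether
// a conversion is needed on load.
auto options = vsg::Options::create();
options->sourceColorSpace = vsg::ColorSpace::sRGB;      // hypothetical member
options->targetColorSpace = vsg::ColorSpace::linearRGB; // hypothetical member
auto model = vsg::read_cast<vsg::Node>("model.vsgt", options);
```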

@robertosfield (Collaborator):

Do we need to start putting the colour space info into the scene graph so that when we save and then reload, the application can decide whether the vertex and texture data needs converting?

The VSG doesn't presently record colour space in the scene graph, so existing .vsgb/.vsgt files will implicitly be linearRGB. Does the vsg::Data::Properties struct now need a colour space hint? Adding to Data::Properties has to be done carefully, as we don't want to bump over a word boundary and have its size increase.
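
One hypothetical shape for such a hint (entirely illustrative and not part of this PR; whether a spare byte exists without crossing a word boundary would need checking against the real vsg::Data::Properties layout):

```cpp
#include <cstdint>

// Hypothetical one-byte hint so serialised data can declare its colour
// space; existing .vsgb/.vsgt files would default to linearRGB as above.
enum class ColorSpaceHint : uint8_t
{
    linearRGB = 0, // default for existing files
    sRGB = 1
};

// e.g. tucked into existing padding in vsg::Data::Properties:
// ColorSpaceHint colorSpaceHint = ColorSpaceHint::linearRGB;
```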

@robertosfield (Collaborator):

I have merged this PR as a branch:

https://github.com/vsg-dev/VulkanSceneGraph/tree/AnyOldName3-gamma-correctness-phase-one

I have also merged the vsgXchange and vsgExamples PRs as branches, then ran make build_ShaderSets to make sure all the built-in shaders are updated too.

@AnyOldName3 (Contributor, Author):

> Do we need to start putting the colour space info into the scene graph so that when we save and then reload, the application can decide whether the vertex and texture data needs converting?
>
> The VSG doesn't presently record colour space in the scene graph, so existing .vsgb/.vsgt files will implicitly be linearRGB. Does the vsg::Data::Properties struct now need a colour space hint? Adding to Data::Properties has to be done carefully, as we don't want to bump over a word boundary and have its size increase.

Maybe. The other potentially sensible option would be to assume everything that stores colours as floats in .vsgt/.vsgb is in linear space, and to make sure that apps knowingly doing something else have the power to say so explicitly, by making the serialisers and deserialisers for those formats respect the new members of vsg::Options.

It's pretty common for files that have a field to declare the colour space to have the wrong value where sRGB versus linear is concerned. For example, most DDS files in the wild containing S3TC data declare the format as the UNORM/SNORM variant rather than the sRGB one, despite obviously holding an sRGB image, so applications typically ignore the declaration. You can usually trust other colour spaces, though, as people typically don't go out of their way to mark something as Adobe RGB or Dolby Vision 8.4 if they don't know what colour space they're working in.

We're probably going to end up needing to guess what colour space data uses at some point, whether that's only when loading old files and when users specify colours in code, or every time we load anything. If we're sufficiently good at guessing (whether because other people make mistakes, or because we successfully minimise the use of all but one approach except by people who know to provide hints), then we'd get better results by always guessing.
