[NVIDIA] frames stop being presented (and appear to hang) sporadically when using the WSI layer and VK_KHR_present_wait #1592

Open
4 of 6 tasks
matte-schwartz opened this issue Oct 22, 2024 · 30 comments

Comments

@matte-schwartz

matte-schwartz commented Oct 22, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Are you using any gamescope patches or a forked version of gamescope?

  • The issue occurs on upstream gamescope without any modifications

Current Behavior

Some games, especially Sony PlayStation ports, seem to hang and stop presenting frames while game audio continues in the background. The most consistent repro I have now is God of War: Ragnarok with the launch option:
gamescope --hdr-enabled -e -f -h 2160 -w 7680 -r 120 -- %command%

Around 50% of the time, the game hangs at some point within the first 20-30 seconds. Screenshots of what this looks like and the in-game settings are included below.

Steps To Reproduce

  1. Launch God of War: Ragnarok with gamescope --hdr-enabled -e -f -h 2160 -w 7680 -r 120 -- %command%
  2. Attempt to load the game
  3. If gamescope hangs on one frame, while otherwise remaining responsive, you've hit the issue

If you don't freeze on launch and load-in, try quitting and re-launching 1-2 more times.

Hardware information

- Distro: Arch Linux
- CPU: 13th Gen Intel(R) Core(TM) i9-13900K
- GPU: NVIDIA GeForce RTX 4090
- Driver Version: NVIDIA 565.57.01

Software information

- Desktop environment: Plasma 6.2.1
- Session type: Wayland
- Gamescope version: 3.15.13
- Gamescope launch command(s): `gamescope --hdr-enabled -e -f -h 2160 -w 7680 -r 120 -- %command%`

Which gamescope backends have the issue you are reporting?

  • Wayland (default for nested gamescope)
  • DRM (default for embedded gamescope, i.e. gamescope-session)
  • SDL
  • OpenVR

Logging, screenshots, or anything else

After gamescope hangs:
image

in-game settings from a successful launch (no changes from the hang):
Screenshot_20241022_150829

bug report immediately after a hang:
nvidia-bug-report.log.gz

Proton log from a hang:
steam-2322010.tar.gz

@matte-schwartz
Author

@cubanismo I was still able to repro the WSI hang in God of War: Ragnarok, so here's a new issue report, as I couldn't find any existing matches.

@pbasov

pbasov commented Oct 22, 2024

Can confirm the same issue with Elden Ring on 565.57.01 with launch command:
gamescope -w 3840 -h 2160 --adaptive-sync --hdr-enabled --fullscreen -- %command%

The game launches and HDR works. After 3-5 minutes of gameplay it usually hangs, with sound still running in the background.

@cubanismo

Thank you for the reports and log files. I've filed NVIDIA internal bug 4924590 to track our investigation of this issue.

One other question so far: Is my memory correct that this issue occurred with prior NVIDIA drivers (560 or 555-series) as well?

@cubanismo

cubanismo commented Oct 23, 2024

One other question: If you grab a screenshot from within gamescope (Super + S) after the hang, does it show more up-to-date content, or is it the same as the "stuck" visible image? I'm not clear how the attached screenshots were captured, so just checking to be sure.

@matte-schwartz
Author

@cubanismo yes, as far as I know it has gone back to the 555 series, but I also did not really use nvidia with gamescope much prior to that. I can try to test some older releases to confirm if you'd like.

Those screenshots above were taken with Spectacle on KDE via the PRTSC button on my keyboard. I just tested using `gamescopectl screenshot ~/Pictures/god-of-war-freeze.png` and that also shows the same stuck visible image, which it captures once I re-focus the gamescope window after alt-tabbing to execute the command.
god-of-war-freeze

The game was frozen prior to me alt-tabbing fwiw

@cubanismo

Thanks for testing. If it goes back to at least 555-series drivers, that's good enough. No need to check older drivers at the moment. Gamescope support in general gets pretty spotty prior to that.

@tlneondo

I think this may be specifically something with egl-wayland. Since gamescope was having this issue, I tried the wine-wayland driver in Baldur's Gate 3, and I'm getting the exact same kind of freezing shortly after launch.

@valgusk

valgusk commented Nov 15, 2024

Did you try disabling vk_khr_present_wait? I don't know where the bug originates (either vkd3d or NVIDIA), but disabling it usually helps with HDR + Wayland.

@matte-schwartz
Author

Not seeing much of a difference with vk_khr_present_wait disabled, either at compile time or via an env var.

@adamdmoss

adamdmoss commented Nov 19, 2024

I can also repro this pretty easily (usually within seconds or minutes) if it helps. Not sure if the game matters, but I can repro with Silent Hill 2, Ratchet & Clank, Death Stranding and some random others.

Mostly been trying driver 565.57.01 but saw the same with 560.35.03. (550.135 was a non-starter in casual testing, some drm failure but I didn't look deeply.) GPU is a RTX 4080.

I've mostly tried gamescope master / 3.15.9+ but had the same problem with 3.14.24+ (or more specifically 5e8fddf )

Note, I can repro the same issue (just seems to take longer) without HDR. I've also tried with and without --immediate-flips and --adaptive-sync (and just about any other gamescope options I've been desperate enough to try).

I'm only using the drm backend of gamescope because HDR is what I'm actually interested in achieving with gamescope, but I'll see if this reproduces in a nested window.

@adamdmoss

I can alt-tab in and out of gamescope and the same frozen frame is displayed, so technically I think gamescope hasn't hung (maybe the compositor has though, since I don't get a wiggly mouse-cursor).

The game itself continues to take CPU time on some threads, but no longer seems to be responding to input or (of course) visibly outputting new frames.

Maximal logging on gamescope shows nothing unusually suspicious.
PROTON_LOG on the game(+proton) itself shows nothing unusually suspicious.
dmesg shows nothing unusual from the NV drivers etc.

@adamdmoss

vk_khr_present_wait=false is also not helping for me.

@adamdmoss

Well! VKD3D_DISABLE_EXTENSIONS=VK_KHR_present_wait did help, however. Many hours of stability now.
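
For anyone wanting to try the same thing, here is a minimal sketch of how the variable could be placed in a Steam launch option. Pairing it with the gamescope flags from the original report is an assumption for illustration; only the variable name and value come from this thread, and it only affects games that go through vkd3d-proton.

```shell
# Workaround reported in this thread: stop vkd3d-proton from using the extension.
WORKAROUND_ENV="VKD3D_DISABLE_EXTENSIONS=VK_KHR_present_wait"

# Assumption: the gamescope flags from the original report, reused for illustration.
GAMESCOPE_ARGS="--hdr-enabled -e -f -h 2160 -w 7680 -r 120"

# Steam substitutes %command% with the game's own command line.
launch_option="$WORKAROUND_ENV gamescope $GAMESCOPE_ARGS -- %command%"

# Print the string to paste into the game's launch options in Steam.
printf '%s\n' "$launch_option"
```

The variable is set in front of gamescope so that the game process started inside it inherits the setting.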

@matte-schwartz
Author

Disabling the extension via VKD3D_DISABLE_EXTENSIONS=VK_KHR_present_wait does seem to be helping here in God of War: Ragnarok so far, will keep testing a few other games just to make sure.

@jarrard

jarrard commented Nov 22, 2024

> Disabling the extension via VKD3D_DISABLE_EXTENSIONS=VK_KHR_present_wait does seem to be helping here in God of War: Ragnarok so far, will keep testing a few other games just to make sure.

I'm going to test this with Stalker 2, which appears to have a very similar random instability/crash issue when gamescope and HDR are enabled (RTX 4090).

UPDATE: Sadly no go. Stalker 2 with HDR just silently crashes to desktop. I'll probably need to keep waiting for more people to encounter this bug before a solution is found. It seems almost nobody is using HDR at the moment.

@jtellm

jtellm commented Nov 22, 2024

It seems VKD3D_DISABLE_EXTENSIONS=VK_KHR_present_wait fixes the freezing issue for me too (Fedora 41, GNOME, gamescope 3.15.13, NVIDIA 565.57.01 driver), although --force-grab-cursor was necessary in some cases to avoid crashing.

I am noticing some strange flickering with HDR enabled but I'm not sure if this is related to gamescope.

Update: Haven't found a way to disable the extension for DXVK so still getting the freezing issue for some DX11 games.

@matte-schwartz
Author

> UPDATE: Sadly no go, stalker2 and hdr just silent ctd. I'll probably need to keep waiting for more people to encounter this bug before a solution is found. Seems almost nobody is using HDR atm.

The issue described here isn't a crash but rather a freeze, so you are likely hitting a different issue.

@jarrard

jarrard commented Nov 22, 2024

True I opened up a separate issue for this problem.

#1633

UPDATE: I turned HDR off and now I get a freeze issue... hmmm

@postcert

> Well! VKD3D_DISABLE_EXTENSIONS=VK_KHR_present_wait did help, however. Many hours of stability now.

This fixed my Stalker 2 issues on gamescope when using --force-grab-cursor (fine without it but unplayable input wise)
Had a similar issue with a single rendered frame and gameplay continuing in the "background"

@matte-schwartz changed the title from "[NVIDIA] frames stop being presented (and appear to hang) sporadically when using the WSI layer and HDR" to "[NVIDIA] frames stop being presented (and appear to hang) sporadically when using the WSI layer and VK_KHR_present_wait" on Nov 24, 2024
@tlneondo

tlneondo commented Nov 28, 2024

Can confirm here, disabling VK_KHR_present_wait seems to get me further in game into Armored Core 6 than usual. But that's not a solution for DX11 titles.

@matte-schwartz
Author

> Can confirm here, disabling VK_KHR_present_wait seems to get me further in game into Armored Core 6 than usual. But that's not a solution for DX11 titles.

Do you know of any specific DX11 titles offhand that are broken on NV w/ gamescope in a similar way?

@adamdmoss

> > Can confirm here, disabling VK_KHR_present_wait seems to get me further in game into Armored Core 6 than usual. But that's not a solution for DX11 titles.
>
> Do you know of any specific DX11 titles offhand that are broken on NV w/ gamescope in a similar way?

IIRC: Alone In The Dark, Blacksad, Uncharted(?). Probably more.

@matte-schwartz
Author

matte-schwartz commented Dec 1, 2024

Thanks, Blacksad: Under The Skin hangs for me under gamescope.

EDIT: aaaaand of course now it doesn't hang after trying enough times and hasn't frozen again, making this very hard to repro

@matte-schwartz
Author

I found some DirectX 11 games in my library that froze reliably on NVIDIA with HDR (Disco Elysium, Ori and the Will of the Wisps) using gamescope's DRM backend, and was finally able to find a way to work around the freezing in-game without sacrificing HDR in both vkd3d-proton and DXVK, at least locally on my setup.

Since there is no way to disable Vulkan extensions at runtime like there is with vkd3d-proton, you will have to rebuild DXVK with a patch that essentially just removes the Vulkan capability for present wait if it detects NVIDIA drivers in-use. I was actually able to find workarounds that were used in the past within both repos to do exactly the same thing because of some problematic NVIDIA driver versions, so re-tooling those for debugging this issue was relatively straightforward.

You can use this combined patch that I created to re-compile Proton: follow the README.md to obtain the latest Proton sources, check out bleeding-edge with `git checkout bleeding-edge`, apply the patch with `patch -p1 < disable-nvidia-vk-khr-present-wait.patch` (or whatever you saved the contents of that gist as), and then proceed to rebuild Proton. The advantage of this is that you can keep an entire Proton version separate from Valve's standard Proton versions, to use just for problematic games in gamescope.

If you'd prefer to only re-build the individual components of vkd3d-proton and dxvk (much lighter to compile than a whole Proton build) then you can apply this patch to vkd3d-proton and this patch to dxvk while following their READMEs.

There's still one freeze that occurs without any game running, in the Steam Deck UI of gamescope-session when HDR is already enabled in the session, and I'm still trying to track down its source. It would be good to know whether anyone ends up trying these two patches and whether they help or hurt specifically while in-game.

@adamdmoss

I haven't tried your specific patch but FWIW I also hacked VK_KHR_present_wait out of DXVK and it also solved my DX11 hangs.

@tlneondo

tlneondo commented Dec 9, 2024

Still seeing issues on 565.77. Hope we get some good news when 570 finally drops.

@danielgormly

danielgormly commented Dec 16, 2024

Yes, 565.77 with an RTX 2060 Mobile on Arch. I've tried both INSIDE and Dark Souls Remastered and experienced the same freezing/non-responsiveness while the audio continues.

INSIDE lasts anywhere from a few seconds to 25 minutes. I can consistently get it to freeze within a few seconds by leaving immediate mode off in Hyprland and switching windows quickly. It has no vsync option, and I couldn't get any other hack working, e.g. VK_KHR_present_wait.

Dark Souls Remastered on the other hand, with vsync off, runs with perfect stability.

To launch gamescope in the first place, I have to launch it with the Intel GPU, but then nvidia takes over for the game. It dumps core otherwise, like in issue 497. That might put me in a special category.

@matte-schwartz
Author

In reply to: #1592 (comment)

INSIDE looks like D3D11, which means that VKD3D_DISABLE_EXTENSIONS has no effect; you'd have to rebuild DXVK as outlined here: #1592 (comment). Or is it still freezing even with a DXVK that has VK_KHR_present_wait patched out for NVIDIA?

@Arcitec

Arcitec commented Dec 19, 2024

DXVK used to blacklist VK_KHR_present_wait and one other Vulkan extension on NVIDIA GPUs. But it was "fixed" like a year ago, so they unblacklisted it (you can see an NVIDIA driver version check in DXVK which enables it for all "fixed" drivers).

Then another bug was found: If you use X11 and enable NVIDIA's "Force Full Compositing Pipeline" thing which forces vsync (no screen tearing) on X11, then the API broke again. But I think NVIDIA fixed that bug.

And now a 3rd bug is found: Using that VK_KHR_present_wait API on Wayland still freezes NVIDIA. At least when the game is running inside Gamescope.

So yeah that seems like a fun and endlessly buggy API.


If the game uses Direct3D 12 or newer (VKD3D), you can use VKD3D_DISABLE_EXTENSIONS=VK_KHR_present_wait.

If it uses Direct3D 11 or older (DXVK) you can't do anything except forcing the game to disable Vsync. For example by launching it without Gamescope, disabling Vsync, and then launching it with Gamescope.

@brain-anti-freeze

Thanks @matte-schwartz for logging this! I've been having this issue for a very long time. Like @adamdmoss, I'm using the DRM backend (to a Samsung TV over HDMI), and I get this issue when I enable either HDR, the WSI layer, or both. With HDR and WSI both disabled it works OK.

It also sounds like it might be more repeatable in my setup than you are seeing - I always get the freeze in any dx11 or dx12 game using proton in the first 30-90 seconds. I'm using the gamescope-session setup where steam is run separately from gamescope with the DISPLAY and GAMESCOPE_WAYLAND_DISPLAY environment variables set. I'm pretty sure I also used to be able to easily recreate this by running gamescope --hdr-enabled vkcube in a tty, but I tried that again now and it seems to work OK.

Curious whether @cubanismo has been able to root-cause this any further? I noticed this comment in the layer about forcing VK_PRESENT_MODE_MAILBOX_KHR to the driver:

// Always send MAILBOX as the mode to the driver, as we implement FIFO ourselves -- using the

However, the output of vulkaninfo shows that the NVIDIA driver doesn't support that mode:

Present Modes: count = 2
                PRESENT_MODE_FIFO_KHR
                PRESENT_MODE_IMMEDIATE_KHR

(I also saw this post that suggests that sending that mode to the NVIDIA driver regardless doesn't cause an error but perhaps doesn't perform correctly.) I was wondering whether the driver might be stalled waiting for commits it submitted but which have been dropped and so will never be presented, or whether there is some other incompatibility with using an unsupported mode.
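
To check the same thing on another setup, here is a rough sketch that scans vulkaninfo-style text for a given present mode. The helper name `has_present_mode` is made up for this example, and the sample text is just the output pasted above; on a real system you would pipe `vulkaninfo` into the function instead, assuming it is installed.

```shell
# Hypothetical helper: reads vulkaninfo-style text on stdin and succeeds
# if PRESENT_MODE_<name>_KHR appears in it.
has_present_mode() {
    grep -qF "PRESENT_MODE_${1}_KHR"
}

# Sample copied from the output quoted in this comment.
sample='Present Modes: count = 2
                PRESENT_MODE_FIFO_KHR
                PRESENT_MODE_IMMEDIATE_KHR'

if printf '%s\n' "$sample" | has_present_mode MAILBOX; then
    echo "MAILBOX is listed"
else
    echo "MAILBOX is not listed"
fi

# Real use (assumes vulkaninfo is available):
#   vulkaninfo | has_present_mode MAILBOX && echo "MAILBOX is listed"
```

With the sample above this prints "MAILBOX is not listed", matching what I see from the driver.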

I've found this to be a successful workaround - #1671 - so far no hangs and HDR looks great (on 565.77, it was washed out in earlier versions).

antheas pushed a commit to antheas/gamescope that referenced this issue Jan 2, 2025
This adds a workaround for ValveSoftware#1592 which removes the
VkPhysicalDevicePresentWaitFeaturesKHR extension in
the layer if the environment variable
GAMESCOPE_WSI_HIDE_PRESENT_WAIT_EXT is set.

This resolves the current freezing issue on nVidia
in dx12 (without having to set
VKD3D_DISABLE_EXTENSIONS), dx11 (without having
to patch DXVK not to use the extension) and in
native vulkan games.
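
A minimal sketch of how this variable might be used, under two assumptions: that the build in use includes the referenced antheas/gamescope commit, and that any non-empty value counts as "set" (the commit message does not name a value, so `1` is illustrative). The gamescope flags are reused from commands earlier in this thread.

```shell
# Assumption: "1" is an illustrative value; the commit message only says "set".
HIDE_EXT_ENV="GAMESCOPE_WSI_HIDE_PRESENT_WAIT_EXT=1"

# Nested gamescope: put the variable in front of the whole launch option.
nested_launch_option="$HIDE_EXT_ENV gamescope --hdr-enabled -e -f -- %command%"

# gamescope-session: gamescope is already running, so only the game needs it.
session_launch_option="$HIDE_EXT_ENV %command%"

# Print both forms for pasting into Steam's launch options.
printf '%s\n' "$nested_launch_option" "$session_launch_option"
```

Either way the point is that the variable reaches the game process, since that is where the WSI layer is loaded.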
antheas pushed a commit to antheas/gamescope that referenced this issue Jan 3, 2025