Test GPU (AMD Radeon RX 6700 XT) #222
Comments
A few notes on drivers from the Twitterverse:
So yeah... this one could be interesting, and I think my first attempts will be a bit faltering. We'll see. |
|
|
While compiling on kernel version 5.10 from the raspberrypi/linux tree, I noticed an error:
|
Looks like it was missed in raspberrypi/linux@6bd4634 which removed |
2nd Attempt:
Rebooting... |
Without the card plugged in, a
Now trying with the card plugged in... |
Good news! The Pi doesn't completely lock up and halt now... it errors out then goes back to letting me debug. Makes test cycles oh-so-much-simpler: In one terminal:
And in the other:
|
Hmm... See more: Radeon RX 6700 XT "Navy Flounder" Microcode Lands In Linux-Firmware.Git, and the commit where firmware was added. (Good ol' Phoronix) |
First time doing this (grabbing newer firmware from the
And now trying again... |
Okay, earlier firmware bug gave me false hope. We're still crashing and burning:
|
It does seem like it's running out of address space for a large BAR:
But that doesn't seem to be the issue here. |
Added a few debug lines, and things were a little different!
Not sure what |
Tried:
And the green ACT light on the IO board just stays lit green. |
Trying a few more times, with various debug statements. I can definitely get to Anyways, sometimes I get back to:
|
Another run with some more debugging:
(Hit that same Pi firmware issue, but system is still hard locked up.) Looks like it might be failing somewhere in here:
    static int psp_sw_init(void *handle)
    ...
        if (mem_training_ctx->enable_mem_training) {
            ret = psp_memory_training_init(psp);
            if (ret) {
                DRM_ERROR("Failed to initialize memory training!\n");
                return ret;
            }
            ret = psp_mem_training(psp, PSP_MEM_TRAIN_COLD_BOOT);
            if (ret) {
                DRM_ERROR("Failed to process memory training!\n");
                return ret;
            }
        }
|
Opened an issue on the 'official' tracker: Freedesktop GitLab - Can't get RX 6700 XT running on Raspberry Pi CM4. |
The way I read this log is that the actual panic occurs when the Raspberry Pi itself is setting some clockspeed (PCIE bus? its own CPU? But why would that fail…) through a firmware call that times out. I think that’s why we’re not seeing that DRM error about failed memory training being printed, which leads me to believe we’re seeing the crashes occur at random points again? Smells familiar… |
Indeed, I'm running through a few more tests just to see if I can get consistent results (with tons of .5s delays mixed in). I just checked before I was going to load
And looking back, those same two errors occurred 10 seconds into the boot cycle. PCIe bus seems to not be up either on this boot:
But a reboot brings it right back. |
I'm also adding .5s delays with two lines like the following:
    printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);
    msleep(500);
And it looks like I can very consistently reach:
The next block of code, which does not run, is:
|
Debugging
It looks like it's hitting this portion of code:
    static int psp_v11_0_memory_training(struct psp_context *psp, uint32_t ops)
    ...
        if (drm_dev_enter(&adev->ddev, &idx)) {
            memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
            ret = psp_v11_0_memory_training_send_msg(psp, PSP_BL__DRAM_LONG_TRAIN);
            if (ret) {
                DRM_ERROR("Send long training msg failed.\n");
                vfree(buf);
                drm_dev_exit(idx);
                return ret;
            }
Edit: It seems like every time, with debug statements around it, the system halts on the line: memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz); |
Maybe it's time for me to read through the entire Linux Device Drivers book on PCIe memory access? |
Trimming down the debug to just before the
    static int psp_v11_0_memory_training(struct psp_context *psp, uint32_t ops)
    ...
        if (drm_dev_enter(&adev->ddev, &idx)) {
            printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);
            printk(KERN_ALERT "DEBUG: addr %p, value %u, count %d \n",buf,adev->mman.aper_base_kaddr,sz);
            msleep(500);
            memcpy_fromio(buf, adev->mman.aper_base_kaddr, sz);
I see:
|
Can you send the rest of the Oops? Most importantly the stack trace and link register. |
dmesg attached. |
Just realized btop's AMD GPU metrics support requires ROCm, which as I noted is not available on arm64 as of right now. Also, looks like a rogue
|
Can you give it a try with this patch applied? Interestingly enough, with your card, it seems to only need an 8GB BAR for 8GB of VRAM, while it's requesting 24GB for the 12GB of VRAM on the 6700xt. |
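As a rough illustration of the BAR sizes mentioned above: PCI BAR sizes are always powers of two, so a BAR that exposes all of VRAM has to be at least the next power of two at or above the VRAM size. The sketch below uses a made-up helper name, min_full_vram_bar, which is not taken from the amdgpu driver. It gives 8 GB for an 8 GB card and 16 GB for a 12 GB card, so the rounding on its own does not account for the 24 GB request reported for the 6700 XT.

```c
#include <assert.h>
#include <stdint.h>

#define GIB (1ULL << 30)

/* Hypothetical helper: smallest power-of-two size that covers vram_bytes. */
static uint64_t min_full_vram_bar(uint64_t vram_bytes)
{
    uint64_t size = 1;

    /* Anything above 2^63 cannot be rounded up within 64 bits. */
    assert(vram_bytes <= (1ULL << 63));

    while (size < vram_bytes)
        size <<= 1;
    return size;
}
```

Calling min_full_vram_bar(12 * GIB) returns 16 * GIB, while min_full_vram_bar(8 * GIB) returns 8 * GIB unchanged.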
Nice fix, now it boots up and shows things via HDMI. I ran glmark2-es2 windowed
I also have an RX 7700 XT here, and will try that too...
I just tested with the RX 7700 XT. It definitely failed. dmesg_rx7700xt.txt
|
The 7000 series has gfx11, which currently doesn't have any fixes, but I'd expect it to not really be much different from gfx10. Those firmware load issues are odd though. Can you check if those files actually exist on your system? Error -2 would suggest they don't. |
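For reference, kernel functions report failures as negative errno values, so the -2 in those firmware messages corresponds to ENOENT ("No such file or directory") on Linux. A small userspace sketch of that lookup follows; the helper name describe_kernel_error is purely illustrative and not part of any kernel or libc API.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* On Linux, ENOENT has the value 2, which is where "error -2" comes from. */
static_assert(ENOENT == 2, "ENOENT is expected to be 2 on Linux");

/* Hypothetical helper: turn a kernel-style return code into a message.
 * Kernel code returns -errno, so flip the sign before looking it up. */
static const char *describe_kernel_error(int ret)
{
    return strerror(ret < 0 ? -ret : ret);
}
```

Passing -2 to describe_kernel_error yields the same text as strerror(ENOENT), which is a quick way to decode other negative codes that show up in dmesg.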
I did a bit of cleanup and added a proper config option for the alignment trap under |
I downloaded the necessary firmware binaries manually
Now those missing-file errors are gone. Full dmesg here: dmesg_rx7700.txt |
@Coreforge - I just played Doom Eternal in 4K for about 20 minutes, was switching between 4K, 720p, 1080p, and trying some different graphics settings. It ran between 15-24 fps, but I didn't have any instability (this was with your patch as of a week ago, it took the weekend to get the game downloaded lol). Interestingly, lower resolutions ran slower than 4K, as I think the CPU became a greater bottleneck? Or at least lower resolutions weren't any faster. The GPU never got over like 60% utilization: 480p: 4K: I may try a 3.0 GHz overclock and see if that makes a big difference or not on my setup. |
Was this with write-combining enabled or disabled? If you didn't enable the write-combining option I added, can you check dmesg if it's using a large BAR? I've noticed that the 6700xt in my PC seems to be fine with a 16GB bar, so maybe some cards can get a large BAR on the pi, while others can't? Maybe the 3GHz overclock also isn't quite stable on my pi and I just need to lower the clock speed a little bit. |
@Coreforge - I don't think you had that option ready yet in the commit I was running: Coreforge/linux@7fa79e5 I'm assuming you're referring to the Regarding BAR space:
|
Then either write-combining isn't entirely stable (which is very possible, there's a reason it's generally disabled for arm in DRM), or the version of mesa I'm using has some bug. |
@Coreforge - It should be noted Crysis was less stable. It crashed out after a while on every game session I started (4K, 1080p, whatever). Also, OC'ing the Pi to 3.0 GHz increased frame rates a little, but even Doom Eternal was less stable at that clock (it never crashed at 2.4, but crashed after a few minutes at 3.0). Interestingly, it can play in 'Can it run Crysis', it just takes a loooong time to get all the assets loaded in :D |
@martinx72 - FYI, we can maybe continue discussion of the 7xxx series cards over in #680, since it looks like you're having the same issues that I'm hitting with the Radeon Pro W7700 |
Since I mentioned it yesterday in #680, I did a run of
That test was with the pi overclocked to 3.0GHz, without overclocking, I got It was still mostly limited by the SDMA block according to |
I also saw an issue from back in 2021 about CPU bottlenecking with glmark2... I wonder if there's a better GPU benchmark we can run that is almost entirely transparent to the CPU? So far only a few games like Doom Eternal even got the GPU up past 70% utilization, and only sometimes (the CPU was usually chugging along with one or two cores at 99-100%, blocking rendering). |
GravityMark has been quite light on the CPU, especially the vulkan version (OpenGL not quite as much) |
Not exactly a benchmark, but I like using the game Veloren to test Vulkan performance. It's available in flatpak (net.veloren.airshipper is the launcher package). Sadly, v3d/vc4 doesn't seem to support some feature or other that the game needs. |
@Coreforge - Indeed, GravityMark is the only thing I've found that will max out the GPU even more than SuperTuxKart. CPU isn't doing a thing, while GPU is rendering out 200+ fps with 200k asteroids, at 150W (the limit for the W7700). Nice! |
Is the @Coreforge memcpy lib still a requirement? I'm asking because it's not mentioned in @geerlingguy's recent article regarding eGPUs. |
@Srandista - nope! Not anymore. |
Is there a performance advantage to using the library? If not I'll remove it from my system too! :-) |
@jamesfmackenzie - After my latest round of testing, I don't think so... It seems like things like gravitymark and glmark2-es2 have similar results with and without. But more testing would be good — before you remove it, maybe run a couple benchmarks. Then remove it, run them again and make sure :D |
Technically, there could be a performance advantage in specific scenarios, but I don't know how much it transfers into the real world. |
@Coreforge can you please rebase your patch against the latest changes in RPi kernel? Patch can't be cleanly merged since mid-December. |
Just received an OEM AMD Radeon RX 6700 XT in the mail. I was able to get it at MSRP+Shipping, which is something of a miracle these days:
I will be interested in seeing what, if anything, the card does when powered up and plugged into the Compute Module 4 IO Board!
The following issues are closely related:
Current steps to get this card working with Pi OS Bookworm
Last updated: 2025-01-03
Build a kernel from the 6.6.y kernel tree with Coreforge's GPU-enablement patch (or just check out Coreforge's branch directly). Run make menuconfig and select the options:
1. Kernel Features > Page Size > 4 KB (for Box86 compatibility)
2. Kernel Features > Kernel support for 32-bit EL0 > Fix up misaligned multi-word loads and stores in user space
3. Kernel Features > Fix up misaligned loads and stores from userspace for 64bit code
4. Device Drivers > Graphics support > AMD GPU (optionally SI/CIK support too)
5. Device Drivers > Graphics support > Direct Rendering Manager (XFree86 4.1.0 and higher DRI support) > Force Architecture can write-combine memory
Install the AMD firmware: sudo apt install -y firmware-amd-graphics
Confirm everything is working by plugging a monitor into the graphics card; then confirm the card's GPU is in use by running glxinfo -B (part of the mesa-utils package), for example:
(Prepend DISPLAY=:0 if running commands over SSH.)
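The glxinfo check can also be scripted. The sketch below pulls the "OpenGL renderer string:" line out of captured glxinfo -B output and does a loose match on the vendor name. Both helper names are hypothetical, and the assumption that the renderer string for this card contains "AMD" or "Radeon" should be verified against your own output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the value of the "OpenGL renderer string:" line
 * from captured glxinfo output into out. Returns 1 if found, 0 otherwise. */
static int get_renderer(const char *glxinfo_text, char *out, size_t out_len)
{
    static const char key[] = "OpenGL renderer string:";
    const char *p, *end;
    size_t n;

    assert(glxinfo_text != NULL && out != NULL && out_len > 0);

    p = strstr(glxinfo_text, key);
    if (p == NULL)
        return 0;

    p += sizeof(key) - 1;               /* skip past the key itself */
    while (*p == ' ' || *p == '\t')     /* skip padding before the value */
        p++;

    end = strchr(p, '\n');
    n = end ? (size_t)(end - p) : strlen(p);
    if (n >= out_len)                   /* truncate to fit the buffer */
        n = out_len - 1;
    memcpy(out, p, n);
    out[n] = '\0';
    return 1;
}

/* Assumption: the AMD driver's renderer string mentions "AMD" or "Radeon". */
static int looks_like_amd(const char *renderer)
{
    return strstr(renderer, "AMD") != NULL ||
           strstr(renderer, "Radeon") != NULL;
}
```

In practice the text would come from running glxinfo -B and capturing its output, for example via popen. If looks_like_amd returns 0, the desktop is most likely still rendering on a different device.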