Add -v /dev:/dev to X11 default argument to fix libGL error: MESA-LOADER #258

Closed
wants to merge 1 commit into from

woensug-choi
Contributor

On a freshly installed Ubuntu 22.04 (Jammy) LTS, without any other setup,
I installed rocker with:

pip3 install rocker
pip3 install --force-reinstall git+https://github.com/osrf/rocker.git@main
rocker --version
# rocker 0.2.12

and ran the example from the README:

rocker --nvidia --x11 osrf/ros:noetic-desktop-full gazebo

and got this error:

libGL error: MESA-LOADER: failed to retrieve device information

I was able to fix the problem by adding --volume /dev:/dev to the rocker arguments, which adds -v /dev:/dev to the docker arguments:

rocker --volume /dev:/dev --nvidia --x11 osrf/ros:noetic-desktop-full gazebo

I believe the right place to add -v /dev:/dev is the default arguments of the --x11 option, since it wouldn't break even if /dev doesn't exist.

Related articles
#257
#206
kinu-garage/hut_10sqft#819

@tfoote
Collaborator

tfoote commented Nov 29, 2023

That's an improperly broad solution for getting NVIDIA to work. In particular, you say this is on a fresh installation. Do you have your NVIDIA drivers set up, and docker-nvidia set up as well? The fact that it's trying to use MESA suggests that it's not detecting NVIDIA. The fix for a non-NVIDIA GPU is usually to use --device /dev/dri/card0. I would suggest that you try that instead of mounting the full /dev. If that doesn't fix it, you should find out the specific device that you need.
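
As a starting point for finding the specific device, here is a minimal sketch. It assumes a Linux host where sysfs lists DRM nodes under /sys/class/drm and where each node's device/driver symlink names the bound kernel driver (for example i915, amdgpu, or nvidia). The function name list_drm_drivers and its optional root-directory argument are hypothetical, introduced only for illustration; nothing here is part of rocker.

```shell
#!/bin/sh
# Print each DRM node and the kernel driver bound to it.
# The optional first argument (hypothetical, for testing) overrides the sysfs root.
list_drm_drivers() {
    root="${1:-/sys/class/drm}"
    for node in "$root"/card[0-9]* "$root"/renderD[0-9]*; do
        # Skip globs that matched nothing.
        [ -e "$node" ] || continue
        name=$(basename "$node")
        # Skip connector entries such as card0-HDMI-A-1.
        case "$name" in *-*) continue ;; esac
        if [ -e "$node/device/driver" ]; then
            # The symlink target's last path component is the driver name.
            driver=$(basename "$(readlink -f "$node/device/driver")")
        else
            driver="unknown"
        fi
        printf '/dev/dri/%s %s\n' "$name" "$driver"
    done
}

list_drm_drivers
```

On a hybrid laptop this should show which of the card and renderD nodes belongs to the integrated GPU and which to the discrete one, which narrows down what to pass with --device.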

Collaborator

@tfoote tfoote left a comment

I have double-checked my system and cannot reproduce this. And if something is missing, this is far too broad a solution to be appropriate.

@woensug-choi
Contributor Author

> That's an improperly broad solution for getting NVIDIA to work. In particular, you say this is on a fresh installation. Do you have your NVIDIA drivers set up, and docker-nvidia set up as well? The fact that it's trying to use MESA suggests that it's not detecting NVIDIA. The fix for a non-NVIDIA GPU is usually to use --device /dev/dri/card0. I would suggest that you try that instead of mounting the full /dev. If that doesn't fix it, you should find out the specific device that you need.

I will try --device /dev/dri/card0.
Meanwhile, I did install the NVIDIA driver and toolkit. nvidia-smi output looks fine, glxinfo shows no particular error message, and glxgears runs fast.

@woensug-choi
Contributor Author

woensug-choi commented Nov 29, 2023

Yes, --device /dev/dri/card0 solves the problem! But without it I get the same error. Does this mean that we still need a PR to add --device /dev/dri/card0? I thought --gpus all would deal with this.

@woensug-choi
Contributor Author

Hmm, some more notes.

Both with and without --device /dev/dri/card0, running nvidia-smi inside the container reports the correct NVIDIA GPU. But opening gazebo doesn't work without --device /dev/dri/card0.

Tested with:

rocker --nvidia --x11 osrf/ros:noetic-desktop-full /bin/bash
nvidia-smi

which prints:

Wed Nov 29 20:51:02 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Quadro RTX 3000 with Max...    Off | 00000000:01:00.0 Off |                  N/A |
| N/A   53C    P0              22W /  65W |      5MiB /  6144MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

and when running gazebo:

gazebo
# libGL error: MESA-LOADER: failed to retrieve device information

It's a fresh install of Ubuntu 22.04 with the recommended NVIDIA driver, which was selected automatically during the installation process.

@tfoote
Collaborator

tfoote commented Nov 29, 2023

> I thought --gpus all would deal with this.

My understanding is that --gpus all only catches discrete GPUs, I believe. /dev/dri/card0 is the integrated Intel graphics. If you're using that, I don't believe that you're using the NVIDIA card.

> with recommended NVIDIA driver installation

This may be a generic Ubuntu recommendation, not necessarily what you want.

I see that you're using the 535 driver. I have only ever tested up to the 470 NVIDIA driver. (See the README) There's a level of required compatibility between the internal and external drivers and going to the

Are you using the 535-open drivers? https://forums.linuxmint.com/viewtopic.php?t=401149

I see issues with 535 reported here too: NVIDIA/nvidia-docker#1767

@woensug-choi
Contributor Author

woensug-choi commented Nov 30, 2023

The 535 NVIDIA driver I was using was not the open version; Additional Drivers in Ubuntu 22.04 lists the open variants separately. I also tested with 470.223.02, the proprietary NVIDIA driver (I did reboot, by the way), but that didn't work either.
Screenshot from 2023-11-30 13-16-55

@tfoote
Collaborator

tfoote commented Dec 5, 2023

I'm not sure what I can do to help you. I can't reproduce your issue. Does Gazebo run on the host machine with NVIDIA support?
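
One quick way to check which GL vendor is actually in use, on the host or inside the container, is to read the vendor line from glxinfo (mentioned earlier in the thread). The sketch below assumes glxinfo is installed and that its -B output contains an "OpenGL vendor string:" line; the helper name gl_vendor is hypothetical.

```shell
#!/bin/sh
# Read `glxinfo -B` output on stdin and print only the OpenGL vendor value.
gl_vendor() {
    sed -n 's/^OpenGL vendor string: *//p'
}

# Only query the display when glxinfo is available.
if command -v glxinfo >/dev/null 2>&1; then
    glxinfo -B | gl_vendor
fi
```

A value naming NVIDIA would suggest the NVIDIA GL stack is in use; a Mesa or Intel value would suggest rendering is going through the integrated GPU or a software path instead.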

@tfoote
Collaborator

tfoote commented Mar 1, 2024

containers/podman#7801 (comment)

I found another pattern of potential devices that might be a solution instead of /dev/dri/card0: /dev/dri/renderD128. Also, I see that the p16s can also have NVIDIA chips, so that might mount up in a different location. Because this is too generic, I don't want to merge it as proposed, so I'm going to close this. But if there's a more specific device that we can detect and mount if needed, that would be a good reason to reopen this.
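
A minimal sketch of the "detect and mount if needed" idea, under stated assumptions: it checks a list of candidate nodes and prints a docker --device flag for each one that exists as a character device. It only prints flags and does not start a container. The helper name device_flags is hypothetical and not part of rocker; the two candidate paths are the ones named in this thread.

```shell
#!/bin/sh
# Print a docker --device flag for every candidate that is a character device.
device_flags() {
    flags=""
    for dev in "$@"; do
        # -c is true only for existing character device nodes.
        if [ -c "$dev" ]; then
            flags="$flags --device $dev"
        fi
    done
    # Drop the leading space left by the first append.
    printf '%s\n' "${flags# }"
}

device_flags /dev/dri/card0 /dev/dri/renderD128
```

On a host without either node the output is an empty line, so nothing extra would be passed to docker. Which node a given GPU shows up as can differ between machines, so the candidate list is the part that would need to become more specific.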
