Nvidia: using container toolkit results in EGL exception #124
Comments
Sorry this has taken me so long to reply; I've been on holiday for the past week 😅 There have been a couple of similar reports on Discord, but I can't really pinpoint the underlying issue. A quick workaround that seems to work reliably is to use the old manual instructions from the quickstart guide instead of the Nvidia Container Toolkit. I'd like to get a bit more info, though, and possibly start collecting it here. Could you please report the following?
Thanks! [EDIT] |
Hi, This is my current run command:
1. As you can see, I have already added the DEBUG environment variables, which have indeed enriched the output. Also, since I started experimenting with your code I have managed to get docker-steam-headless running, and it works as expected: every feature, including playing Satisfactory in 4K@60fps with AV1 encoding. But I like your approach more, so I will keep testing :)
Regarding your EDIT about the kernel module, I have this added to the GRUB defaults: Moving forward, I will test out the Manual section in the quickstart. |
Thanks for the very quick reply! The odd part is that GStreamer is able to create the encoders
which means that the drivers and the Nvidia libraries are in place. In the other logs on Discord that wasn't the case: it was failing to create the HW encoders. Could you try running the following:
I'm specifically looking for @Drakulix any idea on what else could it be? |
Hmm the first one does not seem to be there, second yes. Let me see if I can install it somehow
|
Re-reading the old Container Toolkit discussion linked above, I wonder if it's dependent on how you've installed the drivers. Have you used |
See: #52 (comment) |
I have installed the missing libnvidia-egl-gbm. Everything I installed was using apt, I came to hate the binary installer.
However, with this dependency installed I'm still getting the same error
Later this week I might still try the .run installer, but I'd rather not mess up the server; I use it for all sorts of CUDA runners 😅 |
That's what the manual installation in the quickstart guide does for you without having to mess with the host. It'll basically download the .run file, install the library files into a docker volume, and then mount that in the right places. I feel that the issue here is with old packages, especially given that you are running Ubuntu 22.04, which is fairly ancient in the Nvidia+Wayland world. If you could give the manual driver volume a try, I'd be very interested to know what the results are! |
Well, I just recently did a driver upgrade from 535 to 550, but I get your point. I have also installed the libnvidia-gl-550-server package and now things have started happening, but it's not there yet. The log output is definitely better, but I still get a black screen.
|
This looks good! |
I have no idea which combination of environment variables and extra installed packages I used 😅 Will sleep now and see tomorrow if I can figure it out 💤 |
No worries, and thanks for sticking around! At least we've got a good trail to follow now. |
Hi, coming over from #127 as requested.
Requested info
OS Version:
nvidia-smi output:
Nvidia driver installation method:
Logs from linked issue (to keep everything in one place):
I'll try installing the Nvidia drivers using the manual method now and see what happens. Thanks again for all the help! |
Thanks for all the info, could you try installing the following apt packages
and see if that fixes it? |
I'd already tried the Nvidia (Manual) installation method before I saw your reply, and got that working with both Steam and Lutris. To help with troubleshooting, I've removed the Wolf containers and stack and started fresh.

Installing using the Nvidia (Container Toolkit) Docker Compose Method
This gave me the same Moonlight error as before:
The Wolf logs show that it couldn't get hardware encoding working:
Full debug logs:

Installing apt Packages
When I tried installing these packages using apt:
Both said they were already installed:
Trying to launch Steam in Wolf then gave the same error.

Installing using Nvidia (Manual) Method
After running these commands:
and changing the definition of my Wolf stack in Portainer to the one in the Nvidia (Manual) section of the Quickstart page, my Wolf container logs now show it using hardware encoding: Wolf-Logs-New-Installation-Nvidia-Manual.txt
Launching Steam in Wolf then worked. I could also launch and use Lutris.

So, thanks very much for the guidance. Really appreciate your time and knowledge!

(I do have a problem with Steam - when I select my ethernet connection during setup it flashes up a message too quickly to read then goes back to the network selection screen. I'm not sure how to make the Steam container logs more verbose as the container is created and deleted as required. If anyone has a quick answer that'd be great, otherwise I'll keep poking around). |
Re my problem setting up Steam, looks like it's a known issue with a workaround: So everything's working now, awesome! Thanks @ABeltramo. |
@TransparentDuck thanks for all the info, that's very helpful. |
For me the fix was to change this in the Docker Compose file:

deploy:
  resources:
    reservations:
      devices:
        - capabilities: [gpu]

to an entry that names the nvidia driver explicitly:

deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu] |
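The Compose change in the comment above can be checked mechanically. Below is a minimal Python sketch, assuming the Compose file has already been parsed into a plain dictionary (for example with a YAML library). The function name `devices_missing_driver` and the service names `wolf` and `other` are hypothetical and used only for illustration; the sketch only flags GPU device reservations that omit `driver`, it is not part of Wolf or Docker.

```python
def devices_missing_driver(compose: dict) -> list[str]:
    """Return names of services with a GPU device reservation lacking `driver`."""
    flagged = []
    for name, service in (compose.get("services") or {}).items():
        # Walk deploy -> resources -> reservations -> devices, tolerating gaps.
        devices = (
            (service or {})
            .get("deploy", {})
            .get("resources", {})
            .get("reservations", {})
            .get("devices", [])
        )
        for device in devices:
            if "gpu" in device.get("capabilities", []) and "driver" not in device:
                flagged.append(name)
                break  # one finding per service is enough
    return flagged


def _service(device: dict) -> dict:
    """Build a hypothetical service definition around one device entry."""
    return {"deploy": {"resources": {"reservations": {"devices": [device]}}}}


# Hypothetical parsed Compose data mirroring the two variants in the comment.
before = {"services": {"wolf": _service({"capabilities": ["gpu"]})}}
after = {
    "services": {
        "wolf": _service({"driver": "nvidia", "count": 1, "capabilities": ["gpu"]}),
        "other": {"image": "example"},  # no GPU reservation at all
    }
}

print(devices_missing_driver(before))  # ['wolf']
print(devices_missing_driver(after))   # []
```

Whether the missing `driver` key is actually the cause on a given host is not established in this thread; the check only reports the difference between the two variants.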
I also had this problem briefly when using the Nvidia (Manual) configuration method. It turns out that if the video drivers on the host system have changed in any way, you should rebuild |
This helped me as well. Running Ubuntu 22.04.
|
Hi,
I have an issue that seems to be similar to #96
Please let me know what further debug output may help.