Nvidia: using container toolkit results in EGL exception #124

Open
kosztyua opened this issue Sep 15, 2024 · 20 comments
Labels
bug (Something isn't working), cannot reproduce (I can't reproduce it reliably, needs more info), nvidia

Comments

@kosztyua

Hi,
I have an issue that seems to be similar to #96.
Please let me know what further debug output may help.

root@docker-cuda:/mnt/sdb# docker run \
    --name wolf \
    --network=host \
    -e XDG_RUNTIME_DIR=/tmp/sockets \
    -v /tmp/sockets:/tmp/sockets:rw \
    -e HOST_APPS_STATE_FOLDER=/etc/wolf \
    -v /mnt/sdb/wolf:/etc/wolf:rw \
    -e NVIDIA_DRIVER_CAPABILITIES=all \
    -e NVIDIA_VISIBLE_DEVICES=all \
    -e WOLF_ENCODER_NODE=/dev/dri/renderD128 \
    -e WOLF_RENDER_NODE=/dev/dri/renderD128 \
    -v /var/run/docker.sock:/var/run/docker.sock:rw \
    --gpus=all \
    --device /dev/dri/ \
    --device /dev/uinput \
    --device /dev/uhid \
    -v /dev/:/dev/:rw \
    -v /run/udev:/run/udev:rw \
    --device-cgroup-rule "c 13:* rmw" \
    ghcr.io/games-on-whales/wolf:stable
[2024-09-15 18:11:46]
[2024-09-15 18:11:46] [ /etc/cont-init.d/10-setup_user.sh: executing... ]
[2024-09-15 18:11:46] **** Configure default user ****
[2024-09-15 18:11:46] Container running as root. Nothing to do.
[2024-09-15 18:11:46] DONE
[2024-09-15 18:11:46]
[2024-09-15 18:11:46] [ /etc/cont-init.d/15-setup_devices.sh: executing... ]
[2024-09-15 18:11:46] **** Configure devices ****
[2024-09-15 18:11:46] Exec device groups
[2024-09-15 18:11:46] Adding user 'root' to groups: gow-gid-107,root
[2024-09-15 18:11:46] DONE
[2024-09-15 18:11:46]
[2024-09-15 18:11:46] [ /etc/cont-init.d/30-nvidia.sh: executing... ]
[2024-09-15 18:11:46]
[2024-09-15 18:11:46] [ /etc/cont-init.d/init-gamescope.sh: executing... ]
[2024-09-15 18:11:46] Launching the container's startup script as user 'root'
Stack trace (most recent call first):
#0  0x00005586a4299f76 in shutdown_handler(int) at /wolf/src/moonlight-server/./exceptions/exceptions.h:69:5
#1  0x00007f2f13f3831f at /usr/lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f2f13f91b1b at /usr/lib/x86_64-linux-gnu/libc.so.6
#3  0x00007f2f13f3826d at /usr/lib/x86_64-linux-gnu/libc.so.6
#4  0x00007f2f13f1b8fe at /usr/lib/x86_64-linux-gnu/libc.so.6
#5  0x00007f2f16fb7d99 at /usr/local/lib/liblibgstwaylanddisplay.so.0.3
#6  0x00007f2f16fb51be at /usr/local/lib/liblibgstwaylanddisplay.so.0.3
#7  0x00007f2f16fb4ee1 at /usr/local/lib/liblibgstwaylanddisplay.so.0.3
#8  0x00007f2f16fb4c13 at /usr/local/lib/liblibgstwaylanddisplay.so.0.3
#9  0x00007f2f16fb3a38 at /usr/local/lib/liblibgstwaylanddisplay.so.0.3
#10 0x00007f2f16fb4946 at /usr/local/lib/liblibgstwaylanddisplay.so.0.3
#11 0x00007f2f16d9df92 at /usr/local/lib/liblibgstwaylanddisplay.so.0.3
#12 0x00007f2f16d9e425 at /usr/local/lib/liblibgstwaylanddisplay.so.0.3
#13 0x00007f2f16da50a2 at /usr/local/lib/liblibgstwaylanddisplay.so.0.3
#14 0x00007f2f16da6749 at /usr/local/lib/liblibgstwaylanddisplay.so.0.3
#15 0x00005586a44a7e73 in wolf::core::virtual_display::create_wayland_display(immer::array<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, immer::memory_policy<immer::free_list_heap_policy<immer::cpp_heap, 1024ul>, immer::refcount_policy, immer::spinlock_policy, immer::no_transience_policy, false, true> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /wolf/src/core/src/platforms/linux/virtual-display/wayland-display.cpp:30:36
#16 0x00005586a42a1970 in setup_sessions_handlers(immer::box<state::AppState, immer::memory_policy<immer::free_list_heap_policy<immer::cpp_heap, 1024ul>, immer::refcount_policy, immer::spinlock_policy, immer::no_transience_policy, false, true> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::optional<AudioServer> const&)::$_2::operator()(immer::box<state::StreamSession, immer::memory_policy<immer::free_list_heap_policy<immer::cpp_heap, 1024ul>, immer::refcount_policy, immer::spinlock_policy, immer::no_transience_policy, false, true> > const&) const::{lambda()#1}::operator()() const at /wolf/src/moonlight-server/wolf.cpp:225:29
#17 0x00007f2f14307bb3 at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#18 0x00007f2f13f8fa93 at /usr/lib/x86_64-linux-gnu/libc.so.6
#19 0x00007f2f1401ca33 at /usr/lib/x86_64-linux-gnu/libc.so.6
0:00:00.010775826    93 0x562d80131a20 WARN                 default gstvaapi.c:229:plugin_init: Cannot create a VA display
0:00:00.019510167    93 0x562d80131a20 WARN          adaptivedemux2 gstadaptivedemuxelement.c:41:adaptivedemux2_base_element_init: Failed to load libsoup library
0:00:00.019635326    93 0x562d80131a20 WARN          adaptivedemux2 gstadaptivedemuxelement.c:41:adaptivedemux2_base_element_init: Failed to load libsoup library
0:00:00.019743691    93 0x562d80131a20 WARN          adaptivedemux2 gstadaptivedemuxelement.c:41:adaptivedemux2_base_element_init: Failed to load libsoup library
0:00:00.067963981    93 0x562d80131a20 WARN               vadisplay gstvadisplay.c:401:gst_va_display_initialize:<vadisplaydrm0> vaInitialize: unknown libva error
0:00:00.074398098    93 0x562d80131a20 WARN               vadisplay gstvadisplay.c:401:gst_va_display_initialize:<vadisplaydrm1> vaInitialize: unknown libva error
0:00:00.076398891    93 0x562d80131a20 WARN                 default ges-meta-container.c:236:_set_value:<GESAsset@0x562d805b5e10> Could not set value on item: format-version
0:00:00.076417557    93 0x562d80131a20 WARN                 default ges-meta-container.c:236:_set_value:<GESAsset@0x562d805b6690> Could not set value on item: format-version
0:00:00.076432858    93 0x562d80131a20 WARN                 default ges-meta-container.c:236:_set_value:<GESAsset@0x562d805b6de0> Could not set value on item: format-version
0:00:00.076841919    93 0x562d80131a20 WARN               structure gststructure.c:2375:priv_gst_structure_parse_fields: Failed to find delimiter, r=mimetype
0:00:00.085014384    93 0x562d80131a20 WARN               cudanvrtc gstcudanvrtc.cpp:177:gst_cuda_nvrtc_load_library_once: Failed to load 'nvrtcGetCUBINSize', 'nvrtcGetCUBINSize': /usr/local/nvidia/lib/libnvrtc.so: undefined symbol: nvrtcGetCUBINSize
0:00:01.185902469    93 0x562d80131a20 WARN      GST_PLUGIN_LOADING gstplugin.c:542:gst_plugin_register_func: plugin "/usr/local/lib/x86_64-linux-gnu/gstreamer-1.0/validate/libgstvalidatessim.so" failed to initialise
0:00:01.209897704    93 0x562d80131a20 WARN      GST_PLUGIN_LOADING gstplugin.c:542:gst_plugin_register_func: plugin "/usr/local/lib/x86_64-linux-gnu/gstreamer-1.0/libgstvalidatessim.so" failed to initialise
0:00:01.216224003     1 0x55db60547be0 WARN            GST_REGISTRY gstregistry.c:457:gst_registry_add_plugin:<registry0> Not replacing plugin because new one (/usr/local/lib/x86_64-linux-gnu/gstreamer-1.0/libgstvalidatessim.so) is blacklisted but for a different location than existing one (/usr/local/lib/x86_64-linux-gnu/gstreamer-1.0/validate/libgstvalidatessim.so)
0:00:01.214666410    93 0x562d80131a20 WARN      GST_PLUGIN_LOADING gstplugin.c:542:gst_plugin_register_func: plugin "/usr/local/lib/x86_64-linux-gnu/gstreamer-1.0/validate/libgstvalidatessim.so" failed to initialise
0:00:01.217481103    93 0x562d80131a20 WARN      GST_PLUGIN_LOADING gstplugin.c:542:gst_plugin_register_func: plugin "/usr/local/lib/x86_64-linux-gnu/gstreamer-1.0/libgstvalidatessim.so" failed to initialise
0:00:01.225714109     1 0x55db60547be0 WARN            GST_REGISTRY gstregistry.c:457:gst_registry_add_plugin:<registry0> Not replacing plugin because new one (/usr/local/lib/x86_64-linux-gnu/gstreamer-1.0/libgstvalidatessim.so) is blacklisted but for a different location than existing one (/usr/local/lib/x86_64-linux-gnu/gstreamer-1.0/validate/libgstvalidatessim.so)
17:11:48.187286889 INFO  | Gstreamer version: 1.25.0-1
17:11:48.188352335 INFO  | Reading config file from: /etc/wolf/cfg/config.toml
0:00:01.296783594     1 0x55db60547be0 WARN               cudanvrtc gstcudanvrtc.cpp:177:gst_cuda_nvrtc_load_library_once: Failed to load 'nvrtcGetCUBINSize', 'nvrtcGetCUBINSize': /usr/local/nvidia/lib/libnvrtc.so: undefined symbol: nvrtcGetCUBINSize
17:11:49.214213660 INFO  | Using H264 encoder: nvcodec
17:11:49.214473298 INFO  | Using HEVC encoder: nvcodec
17:11:49.214672225 INFO  | Using AV1 encoder: nvcodec
17:11:49.223825014 INFO  | RTSP server started on port: 48010
17:11:49.223859981 INFO  | HTTP server listening on port: 47989
17:11:49.223945489 INFO  | Control server started on port: 47999
XDG_RUNTIME_DIR (/tmp/sockets) is not owned by us (uid 0), but by uid 1000! (This could e.g. happen if you try to connect to a non-root PulseAudio as a root user, over the native protocol. Don't do that.)
17:11:49.224206167 WARN  | [PULSE] Unable to connect, Access denied
17:11:49.224277100 INFO  | Starting PulseAudio docker container
17:11:49.225600036 WARN  | [DOCKER] Container WolfPulseAudio already present, removing first
17:11:49.225734042 INFO  | HTTPS server listening on port: 47984
17:11:59.178961723 INFO  | RTP server started on port: 48200
17:11:59.178969047 INFO  | RTP server started on port: 48100
libEGL warning: egl: failed to create dri2 screen
2024-09-15T17:11:59.201040Z ERROR smithay::backend::egl::ffi: [EGL] 0x3001 (NOT_INITIALIZED) eglInitialize: DRI2: failed to create screen
libEGL warning: egl: failed to create dri2 screen
2024-09-15T17:11:59.205039Z ERROR smithay::backend::egl::ffi: [EGL] 0x3001 (NOT_INITIALIZED) eglInitialize: DRI2: failed to create screen
libEGL warning: egl: failed to create dri2 screen
2024-09-15T17:11:59.230218Z ERROR smithay::backend::egl::ffi: [EGL] 0x3001 (NOT_INITIALIZED) eglInitialize: DRI2: failed to create screen
2024-09-15T17:11:59.230235Z ERROR smithay::backend::egl::ffi: [EGL] 0x3001 (NOT_INITIALIZED) eglInitialize: eglInitialize
thread '<unnamed>' panicked at wayland-display-core/src/comp/mod.rs:189:60:
Failed to create EGLDisplay: InitFailed(NotInitialized)
stack backtrace:
   0:     0x7f6bb0823575 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hd736fd5964392270
   1:     0x7f6bb08470fb - core::fmt::write::hc6043626647b98ea
   2:     0x7f6bb08207df - std::io::Write::write_fmt::h0d24b3e0473045db
   3:     0x7f6bb082334e - std::sys_common::backtrace::print::h45eb8174d25a1e76
   4:     0x7f6bb0824899 - std::panicking::default_hook::{{closure}}::haf3f0170eb4f3b53
   5:     0x7f6bb082463a - std::panicking::default_hook::hb5d3b27aa9f6dcda
   6:     0x7f6bb0824d33 - std::panicking::rust_panic_with_hook::h6b49d59f86ee588c
   7:     0x7f6bb0824c14 - std::panicking::begin_panic_handler::{{closure}}::hd4c2f7ed79b82b70
   8:     0x7f6bb0823a39 - std::sys_common::backtrace::__rust_end_short_backtrace::h2946d6d32d7ea1ad
   9:     0x7f6bb0824947 - rust_begin_unwind
  10:     0x7f6bb060df93 - core::panicking::panic_fmt::ha02418e5cd774672
  11:     0x7f6bb060e426 - core::result::unwrap_failed::h55f86ada3ace5ed2
  12:     0x7f6bb06712f5 - waylanddisplaycore::comp::init::h0646e1a9d0f59f64
  13:     0x7f6bb062bcd0 - std::sys_common::backtrace::__rust_begin_short_backtrace::h3e12ecad4fbffac1
  14:     0x7f6bb062fa7e - core::ops::function::FnOnce::call_once{{vtable.shim}}::h66c8be57d5da151d
  15:     0x7f6bb082774b - std::sys::pal::unix::thread::Thread::new::thread_start::hb85dbfa54ba503d6
  16:     0x7f6bad7ffa94 - <unknown>
  17:     0x7f6bad88ca34 - __clone
thread '<unnamed>' panicked at /tmp/gst-wayland-display/wayland-display-core/src/lib.rs:87:41:
called `Result::unwrap()` on an `Err` value: RecvError
stack backtrace:
   0:     0x7f6bb0823575 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hd736fd5964392270
   1:     0x7f6bb08470fb - core::fmt::write::hc6043626647b98ea
   2:     0x7f6bb08207df - std::io::Write::write_fmt::h0d24b3e0473045db
   3:     0x7f6bb082334e - std::sys_common::backtrace::print::h45eb8174d25a1e76
   4:     0x7f6bb0824899 - std::panicking::default_hook::{{closure}}::haf3f0170eb4f3b53
   5:     0x7f6bb082463a - std::panicking::default_hook::hb5d3b27aa9f6dcda
2024-09-15T17:11:59.231787Z ERROR waylanddisplaycore: Compositor thread panic'ed! err=Any { .. }
   6:     0x7f6bb0824d33 - std::panicking::rust_panic_with_hook::h6b49d59f86ee588c
   7:     0x7f6bb0824c14 - std::panicking::begin_panic_handler::{{closure}}::hd4c2f7ed79b82b70
   8:     0x7f6bb0823a39 - std::sys_common::backtrace::__rust_end_short_backtrace::h2946d6d32d7ea1ad
   9:     0x7f6bb0824947 - rust_begin_unwind
  10:     0x7f6bb060df93 - core::panicking::panic_fmt::ha02418e5cd774672
  11:     0x7f6bb060e426 - core::result::unwrap_failed::h55f86ada3ace5ed2
  12:     0x7f6bb06150a3 - waylanddisplaycore::MaybeRecv<T>::get::hfdc57fb8b9f1c5c8
  13:     0x7f6bb061674a - display_get_devices_len
  14:     0x55db5a9f7e74 - _ZN4wolf4core15virtual_display22create_wayland_displayERKN5immer5arrayINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS2_13memory_policyINS2_21free_list_heap_policyINS2_8cpp_heapELm1024EEENS2_15refcount_policyENS2_15spinlock_policyENS2_20no_transience_policyELb0ELb1EEEEERKS9_
                               at /wolf/src/core/src/platforms/linux/virtual-display/wayland-display.cpp:30:36
  15:     0x55db5a7f1971 - _ZZZ23setup_sessions_handlersRKN5immer3boxIN5state8AppStateENS_13memory_policyINS_21free_list_heap_policyINS_8cpp_heapELm1024EEENS_15refcount_policyENS_15spinlock_policyENS_20no_transience_policyELb0ELb1EEEEERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt8optionalI11AudioServerEENK3$_2clERKNS0_INS1_13StreamSessionESA_EEENKUlvE_clEv
                               at /wolf/src/moonlight-server/wolf.cpp:225:29
  16:     0x7f6badb77bb4 - <unknown>
  17:     0x7f6bad7ffa94 - <unknown>
  18:     0x7f6bad88ca34 - __clone
fatal runtime error: failed to initiate panic, error 5
root@docker-cuda:/mnt/sdb# sudo nvidia-container-cli -V
cli-version: 1.16.1
lib-version: 1.16.1
build date: 2024-07-23T14:57+00:00
build revision: 4c2494f16573b585788a42e9c7bee76ecd48c73d
build compiler: x86_64-linux-gnu-gcc-7 7.5.0
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fplan9-extensions -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections
root@docker-cuda:/mnt/sdb# sudo cat /sys/module/nvidia_drm/parameters/modeset
Y
root@docker-cuda:/mnt/sdb# nvidia-smi
Sun Sep 15 19:18:43 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.07              Driver Version: 550.90.07      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+

@ABeltramo
Member

ABeltramo commented Sep 23, 2024

Sorry this has taken me so long to reply, I've been on holiday the past week 😅

There have been a couple of similar reports on Discord, but I can't really pinpoint the underlying issue. A quick workaround that seems to work reliably is to use the old manual instructions from the quickstart guide instead of the Nvidia Container Toolkit.

I'd like to get a bit more info, though, and possibly start collecting it here. Could you please report the following?

  1. A full run of Wolf up until the crash, with the following env variables set: WOLF_LOG_LEVEL=DEBUG and RUST_LOG=DEBUG
  2. Which OS have you installed on the host, and do you have any DE installed or is it headless?
  3. How many GPUs have you got installed, and which model?
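A rough sketch of how the host details in items 2 and 3 could be gathered in one go. The function names `os_pretty_name` and `host_report` are made up for illustration; `nvidia-smi -L` lists the installed GPUs, and `XDG_CURRENT_DESKTOP` only reflects the current login session, so treat the desktop line as a hint rather than proof.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Wolf.

# Print the PRETTY_NAME value from an os-release style file
# (defaults to /etc/os-release when no path is given).
os_pretty_name() {
    file=${1:-/etc/os-release}
    sed -n 's/^PRETTY_NAME="\{0,1\}\([^"]*\)"\{0,1\}$/\1/p' "$file"
}

# Summarise the OS, desktop session and GPU details.
host_report() {
    printf 'OS: %s\n' "$(os_pretty_name "$@")"
    # Usually unset on a headless host or inside a plain SSH session.
    printf 'Desktop environment: %s\n' "${XDG_CURRENT_DESKTOP:-none detected}"
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi -L    # one line per GPU, including the model name
    else
        printf 'GPUs: nvidia-smi not found\n'
    fi
}
```

Calling `host_report` on the host prints a short summary that can be pasted into the issue next to the DEBUG log from item 1.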

Thanks!

[EDIT]
I wonder if it might be some kernel module that doesn't get loaded up on your OS automatically (see: https://wiki.archlinux.org/title/NVIDIA#Early_loading)
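To check the kernel module theory, something along these lines might help. It is only a sketch: the function name `missing_nvidia_modules` is made up, and the module list (`nvidia`, `nvidia_modeset`, `nvidia_uvm`, `nvidia_drm`) is the set named in the Arch wiki section linked above, so it is an assumption that these are the ones that matter here.

```shell
#!/bin/sh
# Hypothetical helper, not part of Wolf.
# Reads an lsmod-style listing on stdin (first column = module name)
# and prints every expected NVIDIA module that is NOT in it.
missing_nvidia_modules() {
    listing=$(cat)
    for mod in nvidia nvidia_modeset nvidia_uvm nvidia_drm; do
        # awk exits 0 only if some line has the module as its first field.
        if ! printf '%s\n' "$listing" |
            awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }'; then
            printf '%s\n' "$mod"
        fi
    done
}

# On the host, check the live module list:
#   missing_nvidia_modules < /proc/modules
```

Empty output means all four modules are loaded. Any name printed could be loaded by hand with `sudo modprobe <name>` before starting the container again.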

ABeltramo added the bug and cannot reproduce labels on Sep 23, 2024
@kosztyua
Author

Hi,
No worries, rest is important :) Thank you for getting back to me.

This is my current run command:

docker run \
    --name wolf \
    --network=host \
    -e XDG_RUNTIME_DIR=/tmp/sockets \
    -v /tmp/sockets:/tmp/sockets:rw \
    -e HOST_APPS_STATE_FOLDER=/etc/wolf \
    -v /mnt/sdb/wolf:/etc/wolf:rw \
    -e NVIDIA_DRIVER_CAPABILITIES=all \
    -e NVIDIA_VISIBLE_DEVICES=all \
    -e WOLF_ENCODER_NODE=/dev/dri/renderD128 \
    -e WOLF_RENDER_NODE=/dev/dri/renderD128 \
    -e WOLF_LOG_LEVEL=DEBUG \
    -e RUST_LOG=DEBUG \
    -v /var/run/docker.sock:/var/run/docker.sock:rw \
    --gpus=all \
    --device /dev/dri/ \
    --device /dev/uinput \
    --device /dev/uhid \
    -v /dev/:/dev/:rw \
    -v /run/udev:/run/udev:rw \
    --device-cgroup-rule "c 13:* rmw" \
    ghcr.io/games-on-whales/wolf:stable

1. As you can see, I have already added the DEBUG environment variables, which has indeed enriched the output.
2. The host OS is Ubuntu 22.04.5 LTS, fully updated. I'm not sure what you mean by "DE"; this is a fully headless system, although just today I received dummy HDMI plugs and added one.
3. 1 GPU: NVIDIA GeForce RTX 4070 Super

Also, since I started experimenting with your code, I have managed to get docker-steam-headless running, and it works as expected. Every feature works: I am playing Satisfactory in 4k@60fps with AV1 encoding. But I like your approach more, so I will keep testing :)

[2024-09-23 20:42:26]
[2024-09-23 20:42:26] [ /etc/cont-init.d/10-setup_user.sh: executing... ]
[2024-09-23 20:42:26] **** Configure default user ****
[2024-09-23 20:42:26] Container running as root. Nothing to do.
[2024-09-23 20:42:26] DONE
[2024-09-23 20:42:26]
[2024-09-23 20:42:26] [ /etc/cont-init.d/15-setup_devices.sh: executing... ]
[2024-09-23 20:42:26] **** Configure devices ****
[2024-09-23 20:42:26] Exec device groups
[2024-09-23 20:42:27] Adding user 'root' to groups: gow-gid-107,root
[2024-09-23 20:42:27] DONE
[2024-09-23 20:42:27]
[2024-09-23 20:42:27] [ /etc/cont-init.d/30-nvidia.sh: executing... ]
[2024-09-23 20:42:27]
[2024-09-23 20:42:27] [ /etc/cont-init.d/init-gamescope.sh: executing... ]
[2024-09-23 20:42:27] Launching the container's startup script as user 'root'
Stack trace (most recent call first):
#0  0x0000563925dd5f76 in shutdown_handler(int) at /wolf/src/moonlight-server/./exceptions/exceptions.h:69:5
#1  0x00007fa3a63b131f at /usr/lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fa3a640ab1b at /usr/lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fa3a63b126d at /usr/lib/x86_64-linux-gnu/libc.so.6
#4  0x00007fa3a63948fe at /usr/lib/x86_64-linux-gnu/libc.so.6
#5  0x00007fa3a673ba34 at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007fa3a673ba48 at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007fa3a6751127 at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x0000563925e38ed9 in void boost::throw_exception<boost::lock_error>(boost::lock_error const&) at /usr/include/boost/throw_exception.hpp:165:5
#9  0x0000563925e38f64 in boost::mutex::lock() at /usr/include/boost/thread/pthread/mutex.hpp:64:17
#10 0x0000563925e38e2e in boost::unique_lock<boost::mutex>::lock() at /usr/include/boost/thread/lock_types.hpp:346:10
#11 0x0000563925e39eaf in boost::shared_mutex::lock_shared() at /usr/include/boost/thread/pthread/shared_mutex.hpp:173:46
#12 0x0000563925e37412 in boost::log::v2s_mt_posix::record boost::log::v2s_mt_posix::sources::basic_composite_logger<char, logs::my_logger_mt, boost::log::v2s_mt_posix::sources::multi_thread_model<boost::shared_mutex>, boost::log::v2s_mt_posix::sources::features<boost::log::v2s_mt_posix::sources::severity<boost::log::v2s_mt_posix::trivial::severity_level>, boost::log::v2s_mt_posix::sources::exception_handler> >::open_record<boost::parameter::aux::tagged_argument_list_of_1<boost::parameter::aux::tagged_argument<boost::log::v2s_mt_posix::keywords::tag::severity, boost::log::v2s_mt_posix::trivial::severity_level> > >(boost::parameter::aux::tagged_argument_list_of_1<boost::parameter::aux::tagged_argument<boost::log::v2s_mt_posix::keywords::tag::severity, boost::log::v2s_mt_posix::trivial::severity_level> > const&) at /usr/include/boost/log/sources/basic_logger.hpp:462:50
#13 0x0000563925ec03db in void logs::log<char [51], char const*, unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(boost::log::v2s_mt_posix::trivial::severity_level, char const (&) [51], char const* const&, unsigned long const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /wolf/src/core/src/platforms/all/helpers/./helpers/logger.hpp:131:3
#14 0x0000563925ebe3c8 in control::run_control(int, std::shared_ptr<immer::atom<immer::vector<state::StreamSession, immer::memory_policy<immer::free_list_heap_policy<immer::cpp_heap, 1024ul>, immer::refcount_policy, immer::spinlock_policy, immer::no_transience_policy, false, true>, 5u, 0u>, immer::memory_policy<immer::free_list_heap_policy<immer::cpp_heap, 1024ul>, immer::refcount_policy, immer::spinlock_policy, immer::no_transience_policy, false, true> > > const&, std::shared_ptr<dp::event_bus> const&, int, std::chrono::duration<long, std::ratio<1l, 1000l> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at /wolf/src/moonlight-server/control/control.cpp:150:11
#15 0x0000563925de3085 in run()::$_3::operator()() const at /wolf/src/moonlight-server/wolf.cpp:463:5
#16 0x00007fa3a6780bb3 at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#17 0x00007fa3a6408a93 at /usr/lib/x86_64-linux-gnu/libc.so.6
#18 0x00007fa3a6495a33 at /usr/lib/x86_64-linux-gnu/libc.so.6
0:00:00.010059263    93 0x5598c027ba20 WARN                 default gstvaapi.c:229:plugin_init: Cannot create a VA display
0:00:00.018083742    93 0x5598c027ba20 WARN          adaptivedemux2 gstadaptivedemuxelement.c:41:adaptivedemux2_base_element_init: Failed to load libsoup library
0:00:00.018223979    93 0x5598c027ba20 WARN          adaptivedemux2 gstadaptivedemuxelement.c:41:adaptivedemux2_base_element_init: Failed to load libsoup library
0:00:00.018386222    93 0x5598c027ba20 WARN          adaptivedemux2 gstadaptivedemuxelement.c:41:adaptivedemux2_base_element_init: Failed to load libsoup library
0:00:00.065126062    93 0x5598c027ba20 WARN               vadisplay gstvadisplay.c:401:gst_va_display_initialize:<vadisplaydrm0> vaInitialize: unknown libva error
0:00:00.072268875    93 0x5598c027ba20 WARN               vadisplay gstvadisplay.c:401:gst_va_display_initialize:<vadisplaydrm1> vaInitialize: unknown libva error
0:00:00.074428711    93 0x5598c027ba20 WARN                 default ges-meta-container.c:236:_set_value:<GESAsset@0x5598c06ffe10> Could not set value on item: format-version
0:00:00.074460686    93 0x5598c027ba20 WARN                 default ges-meta-container.c:236:_set_value:<GESAsset@0x5598c0700690> Could not set value on item: format-version
0:00:00.074472039    93 0x5598c027ba20 WARN                 default ges-meta-container.c:236:_set_value:<GESAsset@0x5598c0700de0> Could not set value on item: format-version
0:00:00.074820216    93 0x5598c027ba20 WARN               structure gststructure.c:2375:priv_gst_structure_parse_fields: Failed to find delimiter, r=mimetype
0:00:00.082643633    93 0x5598c027ba20 WARN               cudanvrtc gstcudanvrtc.cpp:177:gst_cuda_nvrtc_load_library_once: Failed to load 'nvrtcGetCUBINSize', 'nvrtcGetCUBINSize': /usr/local/nvidia/lib/libnvrtc.so: undefined symbol: nvrtcGetCUBINSize
0:00:01.047425470    93 0x5598c027ba20 WARN      GST_PLUGIN_LOADING gstplugin.c:542:gst_plugin_register_func: plugin "/usr/local/lib/x86_64-linux-gnu/gstreamer-1.0/validate/libgstvalidatessim.so" failed to initialise
0:00:01.068886441    93 0x5598c027ba20 WARN      GST_PLUGIN_LOADING gstplugin.c:542:gst_plugin_register_func: plugin "/usr/local/lib/x86_64-linux-gnu/gstreamer-1.0/libgstvalidatessim.so" failed to initialise
0:00:01.076256361     1 0x558647e5c8e0 WARN            GST_REGISTRY gstregistry.c:457:gst_registry_add_plugin:<registry0> Not replacing plugin because new one (/usr/local/lib/x86_64-linux-gnu/gstreamer-1.0/libgstvalidatessim.so) is blacklisted but for a different location than existing one (/usr/local/lib/x86_64-linux-gnu/gstreamer-1.0/validate/libgstvalidatessim.so)
19:42:28.631920799 INFO  | Gstreamer version: 1.25.0-1
19:42:28.633023072 DEBUG | XDG_RUNTIME_DIR=/tmp/sockets
19:42:28.633035442 INFO  | Reading config file from: /etc/wolf/cfg/config.toml
19:42:28.715499021 DEBUG | /dev/dri/renderD128 vendor: NVIDIA Corporation
0:00:01.179763010     1 0x558647e5c8e0 WARN               cudanvrtc gstcudanvrtc.cpp:177:gst_cuda_nvrtc_load_library_once: Failed to load 'nvrtcGetCUBINSize', 'nvrtcGetCUBINSize': /usr/local/nvidia/lib/libnvrtc.so: undefined symbol: nvrtcGetCUBINSize
19:42:29.659572537 INFO  | Using H264 encoder: nvcodec
19:42:29.659781718 INFO  | Using HEVC encoder: nvcodec
19:42:29.659888963 INFO  | Using AV1 encoder: nvcodec
19:42:29.664770731 DEBUG | Loading server certificates from disk: /etc/wolf/cfg/cert.pem /etc/wolf/cfg/key.pem
19:42:29.666174394 INFO  | RTSP server started on port: 48010
19:42:29.666181861 INFO  | HTTP server listening on port: 47989
19:42:29.666274666 INFO  | Control server started on port: 47999
XDG_RUNTIME_DIR (/tmp/sockets) is not owned by us (uid 0), but by uid 1000! (This could e.g. happen if you try to connect to a non-root PulseAudio as a root user, over the native protocol. Don't do that.)
19:42:29.666350248 DEBUG | [PULSE] Connecting...
19:42:29.666395927 DEBUG | [PULSE] Context failed
19:42:29.666408134 WARN  | [PULSE] Unable to connect, Access denied
19:42:29.666450242 INFO  | Starting PulseAudio docker container
19:42:29.667247237 WARN  | [DOCKER] Container WolfPulseAudio already present, removing first
19:42:29.667577929 INFO  | HTTPS server listening on port: 47984
19:42:30.096407288 DEBUG | 192.168.1.4 [GET] HTTP://192.168.1.34/serverinfo
19:42:30.142504717 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/serverinfo
19:42:30.161967145 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/applist
19:42:32.103077496 DEBUG | 192.168.1.4 [GET] HTTP://192.168.1.34/serverinfo
19:42:32.139644367 DEBUG | 192.168.1.4 [GET] HTTP://192.168.1.34/serverinfo
19:42:32.148639809 DEBUG | [PULSE] Connecting...
19:42:32.150871114 DEBUG | [PULSE] Pulse connection ready
19:42:33.186693032 DEBUG | 192.168.1.4 [GET] HTTP://192.168.1.34/serverinfo
19:42:33.208697526 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/serverinfo
19:42:34.184512629 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/appasset
19:42:34.186539174 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/appasset
19:42:34.188513626 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/appasset
19:42:34.188777702 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/appasset
19:42:34.216629978 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/appasset
19:42:34.216831914 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/appasset
19:42:34.218446852 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/appasset
19:42:34.218697273 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/appasset
19:42:34.243667953 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/appasset
19:42:34.245754793 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/appasset
19:42:34.246778569 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/appasset
19:42:34.264934097 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/appasset
19:42:34.265159740 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/appasset
19:42:34.267062835 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/appasset
19:42:36.239916730 DEBUG | 192.168.1.4 [GET] HTTP://192.168.1.34/serverinfo
19:42:36.269662852 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/serverinfo
19:42:37.162127592 DEBUG | 192.168.1.4 [GET] HTTP://192.168.1.34/serverinfo
19:42:37.192023942 DEBUG | 192.168.1.4 [GET] HTTP://192.168.1.34/serverinfo
19:42:37.648650820 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/launch
19:42:37.649054620 DEBUG | Host app state folder: /etc/wolf/15238196489126631726/Firefox, creating paths
19:42:37.649540852 DEBUG | [STREAM_SESSION] Create virtual audio sink
19:42:37.649972548 DEBUG | [STREAM_SESSION] Create wayland compositor
19:42:37.650063338 INFO  | RTP server started on port: 48100
19:42:37.650106770 DEBUG | [WAYLAND] Creating wayland display
19:42:37.650108746 INFO  | RTP server started on port: 48200
19:42:37.651355550 DEBUG | [PULSE] Created virtual sink: 3
2024-09-23T19:42:37.656729Z DEBUG smithay::backend::egl::display: Supported EGL client extensions: ["EGL_EXT_device_base", "EGL_EXT_device_enumeration", "EGL_EXT_device_query", "EGL_EXT_platform_base", "EGL_KHR_client_get_all_proc_addresses", "EGL_EXT_client_extensions", "EGL_KHR_debug", "EGL_EXT_platform_device", "EGL_EXT_explicit_device", "EGL_EXT_platform_wayland", "EGL_KHR_platform_wayland", "EGL_EXT_platform_x11", "EGL_KHR_platform_x11", "EGL_EXT_platform_xcb", "EGL_MESA_platform_gbm", "EGL_KHR_platform_gbm", "EGL_MESA_platform_surfaceless"]
2024-09-23T19:42:37.656742Z DEBUG smithay::backend::egl::display: Trying EGL platform: PLATFORM_DEVICE_EXT
2024-09-23T19:42:37.656752Z  INFO smithay::backend::egl::display: Successfully selected EGL platform: PLATFORM_DEVICE_EXT
libEGL warning: egl: failed to create dri2 screen
2024-09-23T19:42:37.676207Z ERROR smithay::backend::egl::ffi: [EGL] 0x3001 (NOT_INITIALIZED) eglInitialize: DRI2: failed to create screen
19:42:37.677188488 DEBUG | [RTSP] received command OPTIONS
libEGL warning: egl: failed to create dri2 screen
2024-09-23T19:42:37.680237Z ERROR smithay::backend::egl::ffi: [EGL] 0x3001 (NOT_INITIALIZED) eglInitialize: DRI2: failed to create screen
libEGL warning: egl: failed to create dri2 screen
19:42:37.692020734 DEBUG | [RTSP] received command DESCRIBE
2024-09-23T19:42:37.706877Z ERROR smithay::backend::egl::ffi: [EGL] 0x3001 (NOT_INITIALIZED) eglInitialize: DRI2: failed to create screen
2024-09-23T19:42:37.706892Z ERROR smithay::backend::egl::ffi: [EGL] 0x3001 (NOT_INITIALIZED) eglInitialize: eglInitialize
thread '<unnamed>' panicked at wayland-display-core/src/comp/mod.rs:189:60:
Failed to create EGLDisplay: InitFailed(NotInitialized)
stack backtrace:
19:42:37.707033001 DEBUG | [RTSP] received command SETUP
   0:     0x7f18e10fe575 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hd736fd5964392270
   1:     0x7f18e11220fb - core::fmt::write::hc6043626647b98ea
   2:     0x7f18e10fb7df - std::io::Write::write_fmt::h0d24b3e0473045db
   3:     0x7f18e10fe34e - std::sys_common::backtrace::print::h45eb8174d25a1e76
   4:     0x7f18e10ff899 - std::panicking::default_hook::{{closure}}::haf3f0170eb4f3b53
   5:     0x7f18e10ff63a - std::panicking::default_hook::hb5d3b27aa9f6dcda
   6:     0x7f18e10ffd33 - std::panicking::rust_panic_with_hook::h6b49d59f86ee588c
   7:     0x7f18e10ffc14 - std::panicking::begin_panic_handler::{{closure}}::hd4c2f7ed79b82b70
   8:     0x7f18e10fea39 - std::sys_common::backtrace::__rust_end_short_backtrace::h2946d6d32d7ea1ad
   9:     0x7f18e10ff947 - rust_begin_unwind
  10:     0x7f18e0ee8f93 - core::panicking::panic_fmt::ha02418e5cd774672
  11:     0x7f18e0ee9426 - core::result::unwrap_failed::h55f86ada3ace5ed2
  12:     0x7f18e0f4c2f5 - waylanddisplaycore::comp::init::h0646e1a9d0f59f64
  13:     0x7f18e0f06cd0 - std::sys_common::backtrace::__rust_begin_short_backtrace::h3e12ecad4fbffac1
  14:     0x7f18e0f0aa7e - core::ops::function::FnOnce::call_once{{vtable.shim}}::h66c8be57d5da151d
  15:     0x7f18e110274b - std::sys::pal::unix::thread::Thread::new::thread_start::hb85dbfa54ba503d6
  16:     0x7f18de0daa94 - <unknown>
  17:     0x7f18de167a34 - __clone
thread '<unnamed>' panicked at /tmp/gst-wayland-display/wayland-display-core/src/lib.rs:87:41:
called `Result::unwrap()` on an `Err` value: RecvError
stack backtrace:
   0:     0x7f18e10fe575 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hd736fd5964392270
2024-09-23T19:42:37.708431Z ERROR waylanddisplaycore: Compositor thread panic'ed! err=Any { .. }
   1:     0x7f18e11220fb - core::fmt::write::hc6043626647b98ea
   2:     0x7f18e10fb7df - std::io::Write::write_fmt::h0d24b3e0473045db
   3:     0x7f18e10fe34e - std::sys_common::backtrace::print::h45eb8174d25a1e76
   4:     0x7f18e10ff899 - std::panicking::default_hook::{{closure}}::haf3f0170eb4f3b53
   5:     0x7f18e10ff63a - std::panicking::default_hook::hb5d3b27aa9f6dcda
   6:     0x7f18e10ffd33 - std::panicking::rust_panic_with_hook::h6b49d59f86ee588c
   7:     0x7f18e10ffc14 - std::panicking::begin_panic_handler::{{closure}}::hd4c2f7ed79b82b70
   8:     0x7f18e10fea39 - std::sys_common::backtrace::__rust_end_short_backtrace::h2946d6d32d7ea1ad
   9:     0x7f18e10ff947 - rust_begin_unwind
  10:     0x7f18e0ee8f93 - core::panicking::panic_fmt::ha02418e5cd774672
  11:     0x7f18e0ee9426 - core::result::unwrap_failed::h55f86ada3ace5ed2
  12:     0x7f18e0ef00a3 - waylanddisplaycore::MaybeRecv<T>::get::hfdc57fb8b9f1c5c8
  13:     0x7f18e0ef174a - display_get_devices_len
  14:     0x558640c9be74 - _ZN4wolf4core15virtual_display22create_wayland_displayERKN5immer5arrayINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS2_13memory_policyINS2_21free_list_heap_policyINS2_8cpp_heapELm1024EEENS2_15refcount_policyENS2_15spinlock_policyENS2_20no_transience_policyELb0ELb1EEEEERKS9_
                               at /wolf/src/core/src/platforms/linux/virtual-display/wayland-display.cpp:30:36
19:42:37.722334685 DEBUG | [RTSP] received command SETUP
  15:     0x558640a95971 - _ZZZ23setup_sessions_handlersRKN5immer3boxIN5state8AppStateENS_13memory_policyINS_21free_list_heap_policyINS_8cpp_heapELm1024EEENS_15refcount_policyENS_15spinlock_policyENS_20no_transience_policyELb0ELb1EEEEERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt8optionalI11AudioServerEENK3$_2clERKNS0_INS1_13StreamSessionESA_EEENKUlvE_clEv
                               at /wolf/src/moonlight-server/wolf.cpp:225:29
  16:     0x7f18de452bb4 - <unknown>
  17:     0x7f18de0daa94 - <unknown>
  18:     0x7f18de167a34 - __clone
fatal runtime error: failed to initiate panic, error 5
19:42:37.738400570 DEBUG | [RTSP] received command SETUP
19:42:37.754545072 DEBUG | [RTSP] received command ANNOUNCE
19:42:37.754632471 DEBUG | [RTSP] Moonlight requested video format HEVC
19:42:37.754645385 DEBUG | [RTSP] Adjusted video bitrate to 31308 Kbps
19:42:37.754739853 DEBUG | Video session 15238196489126631726, waiting for PING...
19:42:37.754781765 DEBUG | Audio session 15238196489126631726, waiting for PING...
19:42:37.769385951 DEBUG | [RTSP] received command PLAY
19:42:37.770628588 DEBUG | [ENET] connected client: 192.168.1.4:34503

Regarding your EDIT about the kernel module, I have this added to the GRUB defaults:
GRUB_CMDLINE_LINUX_DEFAULT="quiet nvidia-drm.modeset=1"

Moving forward, I will test out the Manual section in the quickstart.
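
A `GRUB_CMDLINE_LINUX_DEFAULT` change like the one above only takes effect once the GRUB config has been regenerated and the host rebooted, so it can be worth confirming that the running kernel module actually has modesetting enabled. The following is a minimal sketch, not something from the Wolf docs: the function name `nvidia_modeset_status` is made up for illustration, and it assumes the usual sysfs location for module parameters.

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustration only): reports whether nvidia-drm
# modesetting is active by reading the module parameter from sysfs.
# The optional first argument overrides the path to read.
nvidia_modeset_status() {
    local param_file="${1:-/sys/module/nvidia_drm/parameters/modeset}"
    if [ ! -r "$param_file" ]; then
        echo "unknown (cannot read $param_file; is nvidia_drm loaded?)"
        return 2
    fi
    # Boolean module parameters are shown as Y or N in sysfs.
    case "$(cat "$param_file")" in
        Y|1) echo "enabled" ;;
        *)   echo "disabled"; return 1 ;;
    esac
}

# Example usage on the host:
# nvidia_modeset_status
```

If this reports `disabled` even though the GRUB line is in place, the setting has probably not been applied to the running kernel yet.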

@ABeltramo
Member

Thanks for the very quick reply! docker-steam-headless uses Xorg, whilst we use Wayland, and Nvidia+Wayland is still a bit unstable at the moment, but we are getting there.

The odd part is that GStreamer is able to create the encoders:

19:42:29.659572537 INFO  | Using H264 encoder: nvcodec
19:42:29.659781718 INFO  | Using HEVC encoder: nvcodec
19:42:29.659888963 INFO  | Using AV1 encoder: nvcodec

which means that the drivers and the Nvidia libraries are in place. In the other logs on Discord that wasn't the case, and it was failing to create the HW encoders. Could you try running the following:

docker run --rm -it --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all  -e NVIDIA_DRIVER_CAPABILITIES=all ubuntu ls /usr/lib/x86_64-linux-gnu/libnvidia*

I'm specifically looking for libnvidia-egl-gbm and libnvidia-egl-wayland. I wonder if those are still missing, as was the case back when we first looked into the Nvidia Container Toolkit: #52 (comment)
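
As a narrower version of that check, here is a sketch that only reports whether the two EGL libraries show up in a listing. The function name `check_egl_libs` is hypothetical and not part of Wolf or the Nvidia Container Toolkit. It reads the listing from stdin so it can be fed from the same `docker run` command. Note that the usage example wraps the `ls` in `sh -c '...'`: an unquoted `libnvidia*` glob is expanded by the host shell before Docker starts, which is likely why such a listing can include `cannot access` errors for files that exist only on the host.

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustration only): reads a file listing on stdin
# and reports whether the two EGL platform libraries appear in it.
# Returns 0 if both are present, 1 if at least one is missing.
check_egl_libs() {
    local listing missing=0 lib
    listing="$(cat)"
    for lib in libnvidia-egl-gbm libnvidia-egl-wayland; do
        if printf '%s\n' "$listing" | grep -q -- "${lib}\.so"; then
            echo "found:   ${lib}"
        else
            echo "missing: ${lib}"
            missing=1
        fi
    done
    return "$missing"
}

# Example usage (same image and environment variables as the command above;
# -it is dropped because the output is piped, and the glob is quoted so it
# is expanded inside the container rather than on the host):
# docker run --rm --runtime=nvidia \
#     -e NVIDIA_VISIBLE_DEVICES=all -e NVIDIA_DRIVER_CAPABILITIES=all \
#     ubuntu sh -c 'ls /usr/lib/x86_64-linux-gnu/libnvidia*' | check_egl_libs
```

A `missing:` line only says the file was not mounted into the container; it does not say which host package provides it.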

@Drakulix any idea what else it could be?

@kosztyua
Author

Hmm, the first one does not seem to be there; the second one is. Let me see if I can install it somehow.

 docker run --rm -it --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all  -e NVIDIA_DRIVER_CAPABILITIES=all ubuntu ls /usr/lib/x86_64-linux-gnu/libnvidia*
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so': No such file or directory
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-container-go.so.1': No such file or directory
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-container-go.so.1.16.1': No such file or directory
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-container.so.1': No such file or directory
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-container.so.1.16.1': No such file or directory
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-encode.so': No such file or directory
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-fbc.so': No such file or directory
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-ml.so': No such file or directory
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so': No such file or directory
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so': No such file or directory
/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.1              /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.550.90.07     /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so
/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.550.90.07      /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1              /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-egl-wayland.so.1      /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.550.90.07      /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.550.90.07
/usr/lib/x86_64-linux-gnu/libnvidia-egl-wayland.so.1.1.9  /usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.4            /usr/lib/x86_64-linux-gnu/libnvidia-pkcs11-openssl3.so.550.90.07
/usr/lib/x86_64-linux-gnu/libnvidia-encode.so.1           /usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.550.90.07    /usr/lib/x86_64-linux-gnu/libnvidia-pkcs11.so.550.90.07
/usr/lib/x86_64-linux-gnu/libnvidia-encode.so.550.90.07   /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1          /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.1              /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.550.90.07  /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.550.90.07

@ABeltramo
Member

Re-reading the old Container Toolkit discussion linked above, I wonder if it's dependent on how you've installed the drivers. Have you used apt or the .run file?

@ABeltramo
Member

See: #52 (comment)

@kosztyua
Author

I have installed the missing libnvidia-egl-gbm. Everything was installed using apt; I have come to hate the binary installer.

 docker run --rm -it --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all  -e NVIDIA_DRIVER_CAPABILITIES=all ubuntu ls /usr/lib/x86_64-linux-gnu/libnvidia*
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so': No such file or directory
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-container-go.so.1': No such file or directory
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-container-go.so.1.16.1': No such file or directory
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-container.so.1': No such file or directory
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-container.so.1.16.1': No such file or directory
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-encode.so': No such file or directory
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-fbc.so': No such file or directory
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-ml.so': No such file or directory
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so': No such file or directory
ls: cannot access '/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so': No such file or directory
/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.1          /usr/lib/x86_64-linux-gnu/libnvidia-egl-wayland.so.1.1.9  /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1            /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.550.90.07           /usr/lib/x86_64-linux-gnu/libnvidia-pkcs11.so.550.90.07
/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.550.90.07  /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.1           /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.550.90.07    /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so                /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-egl-gbm.so.1      /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.550.90.07   /usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.4          /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.1              /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.550.90.07
/usr/lib/x86_64-linux-gnu/libnvidia-egl-gbm.so.1.1.0  /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.1              /usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.550.90.07  /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.550.90.07
/usr/lib/x86_64-linux-gnu/libnvidia-egl-wayland.so.1  /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.550.90.07      /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1        /usr/lib/x86_64-linux-gnu/libnvidia-pkcs11-openssl3.so.550.90.07

However, with this dependency installed, I am still getting the same error:

21:07:41.333911448 DEBUG | [ENET] connected client: 192.168.1.4:50400
libEGL warning: egl: failed to create dri2 screen
21:07:41.351557667 WARN  | Received MOUSE_MOVE_REL_PACKET but no mouse device is present
2024-09-23T21:07:41.352561Z ERROR smithay::backend::egl::ffi: [EGL] 0x3001 (NOT_INITIALIZED) eglInitialize: DRI2: failed to create screen
libEGL warning: egl: failed to create dri2 screen
2024-09-23T21:07:41.357364Z ERROR smithay::backend::egl::ffi: [EGL] 0x3001 (NOT_INITIALIZED) eglInitialize: DRI2: failed to create screen
libEGL warning: egl: failed to create dri2 screen
21:07:41.362186320 WARN  | Received MOUSE_MOVE_REL_PACKET but no mouse device is present
2024-09-23T21:07:41.390564Z ERROR smithay::backend::egl::ffi: [EGL] 0x3001 (NOT_INITIALIZED) eglInitialize: DRI2: failed to create screen
2024-09-23T21:07:41.390578Z ERROR smithay::backend::egl::ffi: [EGL] 0x3001 (NOT_INITIALIZED) eglInitialize: eglInitialize
thread '<unnamed>' panicked at wayland-display-core/src/comp/mod.rs:189:60:
Failed to create EGLDisplay: InitFailed(NotInitialized)
stack backtrace:
   0:     0x7f0d25763575 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hd736fd5964392270
   1:     0x7f0d257870fb - core::fmt::write::hc6043626647b98ea
   2:     0x7f0d257607df - std::io::Write::write_fmt::h0d24b3e0473045db
   3:     0x7f0d2576334e - std::sys_common::backtrace::print::h45eb8174d25a1e76
   4:     0x7f0d25764899 - std::panicking::default_hook::{{closure}}::haf3f0170eb4f3b53
   5:     0x7f0d2576463a - std::panicking::default_hook::hb5d3b27aa9f6dcda
   6:     0x7f0d25764d33 - std::panicking::rust_panic_with_hook::h6b49d59f86ee588c
   7:     0x7f0d25764c14 - std::panicking::begin_panic_handler::{{closure}}::hd4c2f7ed79b82b70
   8:     0x7f0d25763a39 - std::sys_common::backtrace::__rust_end_short_backtrace::h2946d6d32d7ea1ad
   9:     0x7f0d25764947 - rust_begin_unwind
  10:     0x7f0d2554df93 - core::panicking::panic_fmt::ha02418e5cd774672
  11:     0x7f0d2554e426 - core::result::unwrap_failed::h55f86ada3ace5ed2
  12:     0x7f0d255b12f5 - waylanddisplaycore::comp::init::h0646e1a9d0f59f64
  13:     0x7f0d2556bcd0 - std::sys_common::backtrace::__rust_begin_short_backtrace::h3e12ecad4fbffac1
  14:     0x7f0d2556fa7e - core::ops::function::FnOnce::call_once{{vtable.shim}}::h66c8be57d5da151d
  15:     0x7f0d2576774b - std::sys::pal::unix::thread::Thread::new::thread_start::hb85dbfa54ba503d6
  16:     0x7f0d2273fa94 - <unknown>
  17:     0x7f0d227cca34 - __clone
thread '<unnamed>' panicked at /tmp/gst-wayland-display/wayland-display-core/src/lib.rs:87:41:
called `Result::unwrap()` on an `Err` value: RecvError
stack backtrace:
   0:     0x7f0d25763575 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hd736fd5964392270
2024-09-23T21:07:41.394619Z ERROR waylanddisplaycore: Compositor thread panic'ed! err=Any { .. }
   1:     0x7f0d257870fb - core::fmt::write::hc6043626647b98ea
   2:     0x7f0d257607df - std::io::Write::write_fmt::h0d24b3e0473045db
   3:     0x7f0d2576334e - std::sys_common::backtrace::print::h45eb8174d25a1e76
   4:     0x7f0d25764899 - std::panicking::default_hook::{{closure}}::haf3f0170eb4f3b53
   5:     0x7f0d2576463a - std::panicking::default_hook::hb5d3b27aa9f6dcda
   6:     0x7f0d25764d33 - std::panicking::rust_panic_with_hook::h6b49d59f86ee588c
   7:     0x7f0d25764c14 - std::panicking::begin_panic_handler::{{closure}}::hd4c2f7ed79b82b70
   8:     0x7f0d25763a39 - std::sys_common::backtrace::__rust_end_short_backtrace::h2946d6d32d7ea1ad
   9:     0x7f0d25764947 - rust_begin_unwind
  10:     0x7f0d2554df93 - core::panicking::panic_fmt::ha02418e5cd774672
  11:     0x7f0d2554e426 - core::result::unwrap_failed::h55f86ada3ace5ed2
  12:     0x7f0d255550a3 - waylanddisplaycore::MaybeRecv<T>::get::hfdc57fb8b9f1c5c8
  13:     0x7f0d2555674a - display_get_devices_len
  14:     0x55f32b94ee74 - _ZN4wolf4core15virtual_display22create_wayland_displayERKN5immer5arrayINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS2_13memory_policyINS2_21free_list_heap_policyINS2_8cpp_heapELm1024EEENS2_15refcount_policyENS2_15spinlock_policyENS2_20no_transience_policyELb0ELb1EEEEERKS9_
                               at /wolf/src/core/src/platforms/linux/virtual-display/wayland-display.cpp:30:36
  15:     0x55f32b748971 - _ZZZ23setup_sessions_handlersRKN5immer3boxIN5state8AppStateENS_13memory_policyINS_21free_list_heap_policyINS_8cpp_heapELm1024EEENS_15refcount_policyENS_15spinlock_policyENS_20no_transience_policyELb0ELb1EEEEERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt8optionalI11AudioServerEENK3$_2clERKNS0_INS1_13StreamSessionESA_EEENKUlvE_clEv
                               at /wolf/src/moonlight-server/wolf.cpp:225:29
  16:     0x7f0d22ab7bb4 - <unknown>
  17:     0x7f0d2273fa94 - <unknown>
  18:     0x7f0d227cca34 - __clone
fatal runtime error: failed to initiate panic, error 5

Later this week I might still try the .run installer, but I'd rather not mess up the server; I use it for all sorts of CUDA runners 😅

@ABeltramo
Member

ABeltramo commented Sep 23, 2024

Later this week I might still try the .run installer, but I'd rather not mess up the server; I use it for all sorts of CUDA runners

That's what the manual installation in the quickstart guide does for you without having to mess with the host. It'll basically download the .run file, install the library files into a docker volume, and then mount that in the right places.

I feel that the issue here is with old packages, especially given that you are running Ubuntu 22.04, which is fairly ancient in the Nvidia+Wayland world. If you could give the manual driver volume a try, I'd be very interested to know what the results are!

@kosztyua
Author

Well, I just recently did a driver upgrade from 535 to 550, but I get your point. I have also installed the libnvidia-gl-550-server package and now things have started happening, but it's not there yet. The log output is definitely better, but I still get a black screen.
I will stop adding more info here until I have tested with the .run installer (and maybe a dist upgrade).

21:41:22.811455463 DEBUG | 192.168.1.4 [GET] HTTPS://192.168.1.34/launch
21:41:22.811884908 DEBUG | Host app state folder: /etc/wolf/15238196489126631726/Steam, creating paths
21:41:22.812229863 DEBUG | [STREAM_SESSION] Create virtual audio sink
21:41:22.812470573 INFO  | RTP server started on port: 48100
21:41:22.812573776 INFO  | RTP server started on port: 48200
21:41:22.812473702 DEBUG | [STREAM_SESSION] Create wayland compositor
21:41:22.812722221 DEBUG | [WAYLAND] Creating wayland display
2024-09-23T21:41:22.815700Z DEBUG smithay::backend::egl::display: Supported EGL client extensions: ["EGL_EXT_platform_base", "EGL_EXT_device_base", "EGL_EXT_device_enumeration", "EGL_EXT_device_query", "EGL_KHR_client_get_all_proc_addresses", "EGL_EXT_client_extensions", "EGL_KHR_debug", "EGL_KHR_platform_x11", "EGL_EXT_platform_x11", "EGL_EXT_platform_device", "EGL_MESA_platform_surfaceless", "EGL_EXT_explicit_device", "EGL_KHR_platform_wayland", "EGL_EXT_platform_wayland", "EGL_KHR_platform_gbm", "EGL_MESA_platform_gbm", "EGL_EXT_platform_xcb"]
2024-09-23T21:41:22.815733Z DEBUG smithay::backend::egl::display: Trying EGL platform: PLATFORM_DEVICE_EXT
2024-09-23T21:41:22.815841Z  INFO smithay::backend::egl::display: Successfully selected EGL platform: PLATFORM_DEVICE_EXT
2024-09-23T21:41:22.816335Z  INFO smithay::backend::egl::display: EGL Initialized
2024-09-23T21:41:22.816357Z  INFO smithay::backend::egl::display: EGL Version: (1, 5)
2024-09-23T21:41:22.816427Z  INFO smithay::backend::egl::display: Supported EGL display extensions: ["EGL_ANDROID_native_fence_sync", "EGL_EXT_buffer_age", "EGL_EXT_client_sync", "EGL_EXT_create_context_robustness", "EGL_EXT_image_dma_buf_import", "EGL_EXT_image_dma_buf_import_modifiers", "EGL_MESA_image_dma_buf_export", "EGL_EXT_output_base", "EGL_EXT_output_drm", "EGL_EXT_protected_content", "EGL_EXT_stream_consumer_egloutput", "EGL_EXT_stream_acquire_mode", "EGL_EXT_sync_reuse", "EGL_IMG_context_priority", "EGL_KHR_config_attribs", "EGL_KHR_create_context_no_error", "EGL_KHR_context_flush_control", "EGL_KHR_create_context", "EGL_KHR_fence_sync", "EGL_KHR_get_all_proc_addresses", "EGL_KHR_partial_update", "EGL_KHR_swap_buffers_with_damage", "EGL_KHR_no_config_context", "EGL_KHR_gl_colorspace", "EGL_KHR_gl_renderbuffer_image", "EGL_KHR_gl_texture_2D_image", "EGL_KHR_gl_texture_3D_image", "EGL_KHR_gl_texture_cubemap_image", "EGL_KHR_image", "EGL_KHR_image_base", "EGL_KHR_reusable_sync", "EGL_KHR_stream", "EGL_KHR_stream_attrib", "EGL_KHR_stream_consumer_gltexture", "EGL_KHR_stream_cross_process_fd", "EGL_KHR_stream_fifo", "EGL_KHR_stream_producer_eglsurface", "EGL_KHR_surfaceless_context", "EGL_KHR_wait_sync", "EGL_NV_nvrm_fence_sync", "EGL_NV_quadruple_buffer", "EGL_NV_stream_consumer_eglimage", "EGL_NV_stream_cross_display", "EGL_NV_stream_cross_object", "EGL_NV_stream_cross_process", "EGL_NV_stream_cross_system", "EGL_NV_stream_dma", "EGL_NV_stream_flush", "EGL_NV_stream_metadata", "EGL_NV_stream_remote", "EGL_NV_stream_reset", "EGL_NV_stream_socket", "EGL_NV_stream_socket_inet", "EGL_NV_stream_socket_unix", "EGL_NV_stream_sync", "EGL_NV_stream_fifo_next", "EGL_NV_stream_fifo_synchronous", "EGL_NV_stream_consumer_gltexture_yuv", "EGL_NV_stream_attrib", "EGL_NV_stream_origin", "EGL_NV_system_time", "EGL_NV_output_drm_flip_event", "EGL_NV_triple_buffer", "EGL_NV_robustness_video_memory_purge", "EGL_EXT_present_opaque", "EGL_WL_bind_wayland_display", 
"EGL_WL_wayland_eglstream"]
2024-09-23T21:41:22.818673Z  INFO egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}: smithay::backend::egl::context: EGL context created priority=Some(Medium)
21:41:22.822904059 DEBUG | [PULSE] Created virtual sink: 5
21:41:22.847466179 DEBUG | [RTSP] received command OPTIONS
2024-09-23T21:41:22.849562Z DEBUG smithay::backend::renderer::gles: Instancing is supported
2024-09-23T21:41:22.849573Z DEBUG smithay::backend::renderer::gles: Rgba8 Renderbuffers are supported
2024-09-23T21:41:22.849575Z DEBUG smithay::backend::renderer::gles: Blitting is supported
2024-09-23T21:41:22.849577Z DEBUG smithay::backend::renderer::gles: Color Transformations are supported
2024-09-23T21:41:22.849578Z DEBUG smithay::backend::renderer::gles: Fencing is supported
2024-09-23T21:41:22.849580Z DEBUG smithay::backend::renderer::gles: GL Debug is supported
2024-09-23T21:41:22.849773Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: Instancing is supported
2024-09-23T21:41:22.849782Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: Rgba8 Renderbuffers are supported
2024-09-23T21:41:22.849786Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: Blitting is supported
2024-09-23T21:41:22.849789Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: Color Transformations are supported
2024-09-23T21:41:22.849793Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: Fencing is supported
2024-09-23T21:41:22.849796Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: GL Debug is supported
2024-09-23T21:41:22.850049Z  INFO egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: Initializing OpenGL ES Renderer
2024-09-23T21:41:22.850061Z  INFO egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: GL Version: "OpenGL ES 3.2 NVIDIA 550.90.07"
2024-09-23T21:41:22.850070Z  INFO egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: GL Vendor: "NVIDIA Corporation"
2024-09-23T21:41:22.850073Z  INFO egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: GL Renderer: "NVIDIA GeForce RTX 4070 SUPER/PCIe/SSE2"
2024-09-23T21:41:22.850077Z  INFO egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: Supported GL Extensions: ["GL_EXT_base_instance", "GL_EXT_blend_func_extended", "GL_EXT_blend_minmax", "GL_EXT_buffer_storage", "GL_EXT_clear_texture", "GL_EXT_clip_control", "GL_EXT_clip_cull_distance", "GL_EXT_color_buffer_float", "GL_EXT_color_buffer_half_float", "GL_EXT_conservative_depth", "GL_EXT_copy_image", "GL_EXT_depth_clamp", "GL_EXT_debug_label", "GL_EXT_discard_framebuffer", "GL_EXT_disjoint_timer_query", "GL_EXT_draw_buffers_indexed", "GL_EXT_draw_elements_base_vertex", "GL_EXT_EGL_image_array", "GL_EXT_EGL_image_storage", "GL_EXT_EGL_image_external_wrap_modes", "GL_EXT_float_blend", "GL_EXT_frag_depth", "GL_EXT_geometry_point_size", "GL_EXT_geometry_shader", "GL_EXT_gpu_shader5", "GL_EXT_map_buffer_range", "GL_EXT_multi_draw_indirect", "GL_EXT_multisample_compatibility", "GL_EXT_multisampled_render_to_texture", "GL_EXT_multisampled_render_to_texture2", "GL_EXT_multiview_texture_multisample", "GL_EXT_multiview_timer_query", "GL_EXT_occlusion_query_boolean", "GL_EXT_polygon_offset_clamp", "GL_EXT_post_depth_coverage", "GL_EXT_primitive_bounding_box", "GL_EXT_raster_multisample", "GL_EXT_render_snorm", "GL_EXT_robustness", "GL_EXT_separate_shader_objects", "GL_EXT_shader_group_vote", "GL_EXT_shader_implicit_conversions", "GL_EXT_shader_integer_mix", "GL_EXT_shader_io_blocks", "GL_EXT_shader_non_constant_global_initializers", "GL_EXT_shader_texture_lod", "GL_EXT_shadow_samplers", "GL_EXT_sparse_texture", "GL_EXT_sparse_texture2", "GL_EXT_sRGB", "GL_EXT_sRGB_write_control", "GL_EXT_tessellation_point_size", "GL_EXT_tessellation_shader", "GL_EXT_texture_border_clamp", "GL_EXT_texture_buffer", "GL_EXT_texture_compression_bptc", "GL_EXT_texture_compression_dxt1", "GL_EXT_texture_compression_rgtc", "GL_EXT_texture_compression_s3tc", "GL_EXT_texture_cube_map_array", 
"GL_EXT_texture_filter_anisotropic", "GL_EXT_texture_filter_minmax", "GL_EXT_texture_format_BGRA8888", "GL_EXT_texture_mirror_clamp_to_edge", "GL_EXT_texture_norm16", "GL_EXT_texture_query_lod", "GL_EXT_texture_rg", "GL_EXT_texture_shadow_lod", "GL_EXT_texture_sRGB_R8", "GL_EXT_texture_sRGB_decode", "GL_EXT_texture_storage", "GL_EXT_texture_view", "GL_EXT_draw_transform_feedback", "GL_EXT_unpack_subimage", "GL_EXT_window_rectangles", "GL_KHR_context_flush_control", "GL_KHR_debug", "GL_EXT_memory_object", "GL_EXT_memory_object_fd", "GL_NV_memory_object_sparse", "GL_KHR_parallel_shader_compile", "GL_KHR_no_error", "GL_KHR_robust_buffer_access_behavior", "GL_KHR_robustness", "GL_EXT_semaphore", "GL_EXT_semaphore_fd", "GL_NV_timeline_semaphore", "GL_KHR_shader_subgroup", "GL_KHR_texture_compression_astc_ldr", "GL_KHR_texture_compression_astc_sliced_3d", "GL_KHR_texture_compression_astc_hdr", "GL_NV_bgr", "GL_NV_bindless_texture", "GL_NV_blend_equation_advanced", "GL_NV_blend_equation_advanced_coherent", "GL_NVX_blend_equation_advanced_multi_draw_buffers", "GL_NV_blend_minmax_factor", "GL_NV_clip_space_w_scaling", "GL_NV_compute_shader_derivatives", "GL_NV_conditional_render", "GL_NV_conservative_raster", "GL_NV_conservative_raster_pre_snap_triangles", "GL_NV_copy_buffer", "GL_NV_copy_image", "GL_NV_draw_buffers", "GL_NV_draw_instanced", "GL_NV_draw_texture", "GL_NV_draw_vulkan_image", "GL_NV_EGL_stream_consumer_external", "GL_NV_explicit_attrib_location", "GL_NV_fbo_color_attachments", "GL_NV_fill_rectangle", "GL_NV_fragment_coverage_to_color", "GL_NV_fragment_shader_barycentric", "GL_NV_fragment_shader_interlock", "GL_NV_framebuffer_blit", "GL_NV_framebuffer_mixed_samples", "GL_NV_framebuffer_multisample", "GL_NV_generate_mipmap_sRGB", "GL_NV_geometry_shader_passthrough", "GL_NV_instanced_arrays", "GL_NV_internalformat_sample_query", "GL_NV_gpu_shader5", "GL_NV_image_formats", "GL_NV_memory_attachment", "GL_NV_mesh_shader", "GL_NV_occlusion_query_samples", 
"GL_NV_non_square_matrices", "GL_NV_pack_subimage", "GL_NV_packed_float", "GL_NV_packed_float_linear", "GL_NV_path_rendering", "GL_NV_path_rendering_shared_edge", "GL_NV_pixel_buffer_object", "GL_NV_polygon_mode", "GL_NV_primitive_shading_rate", "GL_NV_read_buffer", "GL_NV_read_depth", "GL_NV_read_depth_stencil", "GL_NV_read_stencil", "GL_NV_representative_fragment_test", "GL_NV_sample_locations", "GL_NV_sample_mask_override_coverage", "GL_NV_scissor_exclusive", "GL_NV_shader_atomic_fp16_vector", "GL_NV_shader_noperspective_interpolation", "GL_NV_shader_subgroup_partitioned", "GL_NV_shader_texture_footprint", "GL_NV_shading_rate_image", "GL_NV_shadow_samplers_array", "GL_NV_shadow_samplers_cube", "GL_NV_sRGB_formats", "GL_NV_stereo_view_rendering", "GL_NV_texture_array", "GL_NV_texture_barrier", "GL_NV_texture_border_clamp", "GL_NV_texture_compression_latc", "GL_NV_texture_compression_s3tc", "GL_NV_texture_compression_s3tc_update", "GL_NV_texture_dirty_tile_map", "GL_NV_timer_query", "GL_NV_viewport_array", "GL_NV_viewport_array2", "GL_NV_viewport_swizzle", "GL_KHR_blend_equation_advanced", "GL_KHR_blend_equation_advanced_coherent", "GL_OES_compressed_ETC1_RGB8_texture", "GL_EXT_compressed_ETC1_RGB8_sub_texture", "GL_OES_depth24", "GL_OES_depth32", "GL_OES_depth_texture", "GL_OES_depth_texture_cube_map", "GL_OES_copy_image", "GL_OES_draw_buffers_indexed", "GL_OES_draw_elements_base_vertex", "GL_OES_texture_border_clamp", "GL_OES_tessellation_point_size", "GL_OES_tessellation_shader", "GL_OES_texture_buffer", "GL_OES_geometry_point_size", "GL_OES_geometry_shader", "GL_OES_gpu_shader5", "GL_OES_shader_io_blocks", "GL_OES_texture_view", "GL_OES_primitive_bounding_box", "GL_OES_EGL_image", "GL_OES_EGL_image_external", "GL_OES_EGL_image_external_essl3", "GL_OES_EGL_sync", "GL_OES_element_index_uint", "GL_OES_fbo_render_mipmap", "GL_OES_get_program_binary", "GL_OES_mapbuffer", "GL_OES_packed_depth_stencil", "GL_OES_rgb8_rgba8", "GL_EXT_read_format_bgra", 
"GL_OES_sample_shading", "GL_OES_sample_variables", "GL_OES_shader_image_atomic", "GL_OES_shader_multisample_interpolation", "GL_OES_standard_derivatives", "GL_OES_surfaceless_context", "GL_OES_texture_cube_map_array", "GL_OES_texture_npot", "GL_OES_texture_float", "GL_OES_texture_float_linear", "GL_OES_texture_half_float", "GL_OES_texture_half_float_linear", "GL_OES_texture_stencil8", "GL_OES_texture_storage_multisample_2d_array", "GL_OES_vertex_array_object", "GL_OES_vertex_half_float", "GL_OES_viewport_array", "GL_OVR_multiview", "GL_OVR_multiview2", "GL_OVR_multiview_multisampled_render_to_texture", "GL_ANDROID_extension_pack_es31a", ""]
2024-09-23T21:41:22.851758Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: [GL] Buffer detailed info: Buffer object 1 (bound to GL_ARRAY_BUFFER_ARB, usage hint is GL_STATIC_DRAW) will use VIDEO memory as the source for buffer object operations.
2024-09-23T21:41:22.851776Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: [GL] Buffer detailed info: Buffer object 3 (bound to GL_ARRAY_BUFFER_ARB, usage hint is GL_STATIC_DRAW) will use VIDEO memory as the source for buffer object operations.
2024-09-23T21:41:22.853141Z  INFO backend_libinput: smithay::backend::libinput: Initializing a libinput backend
2024-09-23T21:41:22.853157Z  INFO input_seat{name="seat-0"}:add_keyboard{xkb_config=XkbConfig { rules: "", model: "", layout: "", variant: "", options: None } repeat_delay=200 repeat_rate=25}:input_keyboard: smithay::input::keyboard: Initializing a xkbcommon handler with keymap query
2024-09-23T21:41:22.856126Z  INFO input_seat{name="seat-0"}:add_keyboard{xkb_config=XkbConfig { rules: "", model: "", layout: "", variant: "", options: None } repeat_delay=200 repeat_rate=25}:input_keyboard: smithay::input::keyboard: Loaded Keymap name="English (US)"
2024-09-23T21:41:22.857076Z  INFO smithay::wayland::socket: Created new socket name=Some("wayland-1")
2024-09-23T21:41:22.857084Z  INFO waylanddisplaycore::comp: Listening on wayland socket. socket_name="wayland-1"
2024-09-23T21:41:22.857173Z DEBUG waylanddisplaycore::comp: Requested video format: RGBx .to_fourcc() = 0
2024-09-23T21:41:22.857182Z  INFO new{name="HEADLESS-1" physical=PhysicalProperties { size: Size<smithay::utils::geometry::Raw> { w: 0, h: 0 }, subpixel: Unknown, make: "Virtual", model: "Wolf" }}: smithay::output: Creating new Output name="HEADLESS-1"
2024-09-23T21:41:22.857190Z  INFO smithay::wayland::output: Creating new wl_output output="HEADLESS-1"
2024-09-23T21:41:22.857230Z DEBUG desktop_space{id=2}: smithay::desktop::space: Mapping output at Point<smithay::utils::geometry::Logical> { x: 0, y: 0 } output="HEADLESS-1"
21:41:22.857200684 DEBUG | WAYLAND_DISPLAY=/tmp/sockets/wayland-1
21:41:22.863577880 DEBUG | [RTSP] received command DESCRIBE
21:41:22.864172369 DEBUG | /dev/dri/renderD128 vendor: NVIDIA Corporation
21:41:22.864215258 DEBUG | [DOCKER] Using fake-udev, creating /etc/wolf/15238196489126631726/Steam/udev/data
21:41:22.869322751 DEBUG | /dev/dri/renderD128 vendor: NVIDIA Corporation
21:41:22.869337868 INFO  | NVIDIA_DRIVER_VOLUME_NAME not set, assuming nvidia driver toolkit is installed..
21:41:22.878201985 DEBUG | [RTSP] received command SETUP
21:41:22.893189142 DEBUG | [RTSP] received command SETUP
21:41:22.908154081 DEBUG | [RTSP] received command SETUP
21:41:22.926085723 DEBUG | [RTSP] received command ANNOUNCE
21:41:22.926205068 DEBUG | [RTSP] Moonlight requested video format HEVC
21:41:22.926230625 DEBUG | [RTSP] Adjusted video bitrate to 31308 Kbps
21:41:22.926382219 DEBUG | Video session 15238196489126631726, waiting for PING...
21:41:22.926390188 DEBUG | Audio session 15238196489126631726, waiting for PING...
21:41:22.942257537 DEBUG | [RTSP] received command PLAY
21:41:22.943566132 DEBUG | [ENET] connected client: 192.168.1.4:30197
21:41:22.944167056 DEBUG | Starting video pipeline:
appsrc name=wolf_wayland_source is-live=true block=false format=3 stream-type=0 !
queue !
cudaupload !
cudaconvertscale !
video/x-raw(memory:CUDAMemory), width=2560, height=1440, chroma-site=mpeg2, format=NV12, colorimetry=bt601, pixel-aspect-ratio=1/1 !
nvh265enc gop-size=-1 bitrate=31308 aud=false rc-mode=cbr zerolatency=true preset=p1 tune=ultra-low-latency multi-pass=two-pass-quarter !
h265parse !
video/x-h265, profile=main, stream-format=byte-stream !
rtpmoonlightpay_video name=moonlight_pay payload_size=1392 fec_percentage=20 min_required_fec_packets=2 !
udpsink bind-port=48100 host=192.168.1.4 port=62838 sync=true
21:41:23.013612051 DEBUG | Setting up wolf_wayland_source
21:41:23.013706980 DEBUG | [WAYLAND] Start feeding app-src
2024-09-23T21:41:23.013675Z DEBUG waylanddisplaycore::comp: Requested video format: RGBx .to_fourcc() = 0
0:02:28.441951128     1 0x7fa27c001380 WARN        cudaconvertscale gstcudaconvertscale.c:1396:gst_cuda_base_convert_set_info:<cudaconvertscale5> Can't calculate borders
0:02:28.442027922     1 0x7fa27c001380 WARN           cudaconverter gstcudaconverter.c:2104:gst_cuda_converter_setup:<cudaconverter2> Couldn't compile to cubin, trying ptx
2024-09-23T21:41:23.014823Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: [GL] Buffer detailed info: Buffer object 2 (bound to GL_ARRAY_BUFFER_ARB, usage hint is GL_STATIC_DRAW) will use VIDEO memory as the source for buffer object operations.
2024-09-23T21:41:23.014931Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: [GL] Buffer detailed info: Buffer object 2 (bound to GL_VERTEX_ATTRIB_ARRAY_BUFFER_BINDING_ARB (1), and GL_ARRAY_BUFFER_ARB, usage hint is GL_STREAM_DRAW) will use VIDEO memory as the source for buffer object operations.
2024-09-23T21:41:23.015657Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: [GL] Buffer detailed info: Buffer object 4 (bound to GL_PIXEL_PACK_BUFFER_ARB, usage hint is GL_STREAM_READ) will use DMA CACHED memory as the source for buffer object operations.
2024-09-23T21:41:23.031123Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: [GL] Program/shader state performance warning: Vertex shader in program 10 is being recompiled based on GL state.
2024-09-23T21:41:23.031257Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: [GL] Buffer detailed info: Buffer object 2 (bound to GL_VERTEX_ATTRIB_ARRAY_BUFFER_BINDING_ARB (1), and GL_ARRAY_BUFFER_ARB, usage hint is GL_STREAM_DRAW) will use VIDEO memory as the source for buffer object operations.
2024-09-23T21:41:23.031381Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: [GL] Buffer detailed info: Buffer object 4 (bound to GL_PIXEL_PACK_BUFFER_ARB, usage hint is GL_STREAM_READ) will use DMA CACHED memory as the source for buffer object operations.
2024-09-23T21:41:23.047747Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: [GL] Buffer detailed info: Buffer object 2 (bound to GL_VERTEX_ATTRIB_ARRAY_BUFFER_BINDING_ARB (1), and GL_ARRAY_BUFFER_ARB, usage hint is GL_STREAM_DRAW) will use VIDEO memory as the source for buffer object operations.
2024-09-23T21:41:23.047847Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: [GL] Buffer detailed info: Buffer object 4 (bound to GL_PIXEL_PACK_BUFFER_ARB, usage hint is GL_STREAM_READ) will use DMA CACHED memory as the source for buffer object operations.
2024-09-23T21:41:23.064470Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: [GL] Buffer detailed info: Buffer object 2 (bound to GL_VERTEX_ATTRIB_ARRAY_BUFFER_BINDING_ARB (1), and GL_ARRAY_BUFFER_ARB, usage hint is GL_STREAM_DRAW) will use VIDEO memory as the source for buffer object operations.
2024-09-23T21:41:23.064592Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: [GL] Buffer detailed info: Buffer object 4 (bound to GL_PIXEL_PACK_BUFFER_ARB, usage hint is GL_STREAM_READ) will use DMA CACHED memory as the source for buffer object operations.
0:02:28.498603687     1 0x7fa28c82cc60 WARN                  appsrc gstappsrc.c:2469:gst_app_src_push_internal:<wolf_wayland_source> Dropping old item buffer: 0x7fa2987f78b0, pts 0:00:00.033333332, dts 0:00:00.033333332, dur 0:00:00.016666666, size 14745600, offset none, offset_end none, flags 0x0
21:41:23.140404133 DEBUG | [WAYLAND] Start feeding app-src
2024-09-23T21:41:23.141144Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: [GL] Buffer detailed info: Buffer object 2 (bound to GL_VERTEX_ATTRIB_ARRAY_BUFFER_BINDING_ARB (1), and GL_ARRAY_BUFFER_ARB, usage hint is GL_STREAM_DRAW) will use VIDEO memory as the source for buffer object operations.
2024-09-23T21:41:23.141252Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: [GL] Buffer detailed info: Buffer object 4 (bound to GL_PIXEL_PACK_BUFFER_ARB, usage hint is GL_STREAM_READ) will use DMA CACHED memory as the source for buffer object operations.
2024-09-23T21:41:23.157731Z DEBUG egl{native="/dev/dri/renderD128" platform="PLATFORM_DEVICE_EXT" version=(1, 5)}:egl_context{ptr=140335909659681}:renderer_gles2: smithay::backend::renderer::gles: [GL] Buffer detailed info: Buffer object 2 (bound to GL_VERTEX_ATTRIB_ARRAY_BUFFER_BINDING_ARB (1), and GL_ARRAY_BUFFER_ARB, usage hint is GL_STREAM_DRAW) will use VIDEO memory as the source for buffer object operations.

@ABeltramo
Member

This looks good!
If it's not crashing, it might just be slow because of all the logging. Try removing RUST_LOG=DEBUG, and if it still fails to display anything, paste the full Wolf logs again.

@kosztyua
Author

I have no idea which combination of environment variables and extra installed packages I used 😅 Will sleep now and see tomorrow if I can figure it out 💤

@ABeltramo
Member

No worries, and thanks for sticking around! At least we've got a good trail to follow now.

@TransparentDuck

TransparentDuck commented Sep 25, 2024

Hi, coming over from #127 as requested.

Requested info

OS Version:
Debian GNU/Linux 12 (bookworm)
Kernel: Linux 6.1.0-25-amd64

nvidia-smi output:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3080 Ti     On  | 00000000:04:00.0  On |                  N/A |
| 35%   45C    P0             100W / 350W |   2862MiB / 12288MiB |      4%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1423      G   /usr/lib/xorg/Xorg                           62MiB |
|    0   N/A  N/A      2125      G   /usr/bin/kwalletd5                            3MiB |
|    0   N/A  N/A      2581      G   /usr/bin/kwin_wayland                       583MiB |
|    0   N/A  N/A      2603      G   /usr/bin/Xwayland                             4MiB |
|    0   N/A  N/A      2646      G   /usr/bin/ksmserver                            3MiB |
|    0   N/A  N/A      2661      G   /usr/bin/kded5                                3MiB |
|    0   N/A  N/A      2693      G   /usr/bin/plasmashell                        237MiB |
|    0   N/A  N/A      2715      G   ...c/polkit-kde-authentication-agent-1        3MiB |
|    0   N/A  N/A      2966      G   ...86_64-linux-gnu/libexec/kdeconnectd        3MiB |
|    0   N/A  N/A      2982      G   /usr/bin/kaccess                              3MiB |
|    0   N/A  N/A      2990      G   ...-linux-gnu/libexec/DiscoverNotifier        3MiB |
|    0   N/A  N/A      2992      G   /usr/bin/kalendarac                           3MiB |
|    0   N/A  N/A      3221      G   /app/lib/thunderbird/thunderbird              7MiB |
|    0   N/A  N/A      3225      G   /app/lib/firefox/firefox                   1542MiB |
|    0   N/A  N/A      3226      G   SyncThingy                                    3MiB |
|    0   N/A  N/A      3242      G   ...-gnu/libexec/xdg-desktop-portal-kde        3MiB |
|    0   N/A  N/A     15125      G   /usr/bin/dolphin                              3MiB |
|    0   N/A  N/A     31700      G   qbittorrent                                   3MiB |
|    0   N/A  N/A     33829      G   ...lib/libreoffice/program/soffice.bin        3MiB |
|    0   N/A  N/A     43624      G   /usr/bin/kate                                 3MiB |
|    0   N/A  N/A    156260      G   ...erProcess --variations-seed-version       60MiB |
|    0   N/A  N/A    375738      G   /usr/bin/dolphin                              3MiB |
|    0   N/A  N/A    375917      G   .../x86_64-linux-gnu/libexec/kf5/kiod5        3MiB |
|    0   N/A  N/A    381674      G   /usr/bin/konsole                              3MiB |
+---------------------------------------------------------------------------------------+

Nvidia driver installation method:
apt
(apt install nvidia-driver firmware-misc-nonfree)

Logs from linked issue (to keep everything in one place):
_wolf-wolf-1_logs-1.txt

I'll try installing the Nvidia drivers using the manual method now and see what happens.

Thanks again for all the help!

@ABeltramo
Member

Thanks for all the info. Could you try installing the following apt packages:

libnvidia-egl-gbm1
libnvidia-egl-wayland1

and see if that fixes it?

@TransparentDuck

I'd already tried the Nvidia (Manual) installation method before I saw your reply, and got that working with both Steam and Lutris.

To help with troubleshooting, I've removed the Wolf containers and stack and started fresh.

Installing using the Nvidia (Container Toolkit) Docker Compose Method

This gave me the same Moonlight error as before:
Connection terminated
Error code: -1

The Wolf logs show that it couldn't get hardware encoding working:
08:17:32.364419538 WARN | Software H264 encoder detected ... 08:17:32.366224983 WARN | Software HEVC encoder detected ... 08:17:32.366886443 WARN | Software AV1 encoder detected
(tangent - how do I put linebreaks in inline code sections?)

Full debug logs:
Wolf-Logs-New-Installation-Nvidia-Container-Toolkit-Debug.txt

Installing apt Packages

When I tried installing these packages using apt:
libnvidia-egl-gbm1
libnvidia-egl-wayland1

Both said they were already installed:
libnvidia-egl-gbm1 is already the newest version (1.1.0-2).
libnvidia-egl-wayland1 is already the newest version (1:1.1.10-1).

Trying to launch Steam in Wolf then gave the same error.

Installing using Nvidia (Manual) Method

After running these commands:
curl https://raw.githubusercontent.com/games-on-whales/gow/master/images/nvidia-driver/Dockerfile | docker build -t gow/nvidia-driver:latest -f - --build-arg NV_VERSION=$(cat /sys/module/nvidia/version) .
docker create --rm --mount source=nvidia-driver-vol,destination=/usr/nvidia gow/nvidia-driver:latest sh
docker volume ls | grep nvidia-driver
sudo cat /sys/module/nvidia_drm/parameters/modeset

and changing the definition of my Wolf stack in Portainer to the one in the Nvidia (Manual) section of the Quickstart page, my Wolf container logs now show it using hardware encoding:
08:39:56.373138263 INFO | Using H264 encoder: nvcodec
etc.

Wolf-Logs-New-Installation-Nvidia-Manual.txt
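
For anyone comparing their own logs against the two outcomes above, here is a minimal sketch that classifies Wolf log text by the encoder lines quoted in this comment. The function name `wolf_encoder_mode` is made up for illustration, and the patterns assume the other codecs log in the same shape as the H264 line, which I have not verified.

```shell
#!/bin/sh
# Classify Wolf log text read from stdin (hypothetical helper, not part of Wolf).
# Prints "hardware" if an nvcodec encoder line is present, "software" if only
# software-encoder warnings are present, and "unknown" otherwise.
wolf_encoder_mode() {
    log=$(cat)
    if printf '%s\n' "$log" | grep -Eq 'Using [A-Za-z0-9]+ encoder: nvcodec'; then
        echo hardware
    elif printf '%s\n' "$log" | grep -Eq 'Software [A-Za-z0-9]+ encoder detected'; then
        echo software
    else
        echo unknown
    fi
}
```

Usage would be something like `docker logs <wolf-container> 2>&1 | wolf_encoder_mode`, where the container name depends on your setup.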

Launching Steam in Wolf then worked. I could also launch and use Lutris.

So, thanks very much for the guidance. Really appreciate your time and knowledge!

(I do have a problem with Steam - when I select my ethernet connection during setup it flashes up a message too quickly to read then goes back to the network selection screen. I'm not sure how to make the Steam container logs more verbose as the container is created and deleted as required. If anyone has a quick answer that'd be great, otherwise I'll keep poking around).

@TransparentDuck

Re my problem setting up Steam, looks like it's a known issue with a workaround:
#120

So everything's working now, awesome! Thanks @ABeltramo.

@ABeltramo
Member

@TransparentDuck thanks for all the info, that's very helpful.
I'm glad that the manual volume driver works; I just need to figure out which package is missing when installing the drivers using a package manager.. 🤔

@kerail

kerail commented Oct 28, 2024

For me, the fix was to change this in Docker Compose:

    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]

to name the nvidia driver explicitly:

    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
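
A quick way to check whether a Compose file already names the driver is sketched below. It is a line-based heuristic, not a YAML parser, so a `driver: nvidia` entry anywhere in the file will satisfy it; the function name `compose_names_nvidia_driver` is made up for illustration.

```shell
#!/bin/sh
# Heuristic check (hypothetical helper): does a Compose file that reserves a
# GPU also name the nvidia driver?
# Returns 0 if "driver: nvidia" is present, 1 if a GPU reservation exists
# without it, and 2 if the file has no GPU reservation at all.
compose_names_nvidia_driver() {
    file="$1"
    # No "capabilities: [gpu]" style line: nothing to check.
    grep -Eq 'capabilities:.*gpu' "$file" || return 2
    # Match "driver: nvidia" with or without the leading list dash.
    grep -Eq '^[[:space:]]*(-[[:space:]]*)?driver:[[:space:]]*"?nvidia"?[[:space:]]*$' "$file"
}
```

For example, `compose_names_nvidia_driver docker-compose.yml || echo "no explicit nvidia driver"`, assuming that is the name of your Compose file.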

@csrednicki

csrednicki commented Oct 30, 2024

I also had this problem briefly when using the Nvidia (Manual) configuration method. It turns out that if the video drivers on the host system have changed in any way, you need to rebuild the nvidia-driver-vol volume. After removing the old nvidia-driver-vol volume and building a new one, everything started to work without any errors.
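
A rough sketch of that rebuild, reusing the build and create commands posted earlier in this thread. The function name `rebuild_nvidia_driver_volume` and the `DRY_RUN` switch are made up for illustration; with `DRY_RUN=1` it only prints what it would run.

```shell
#!/bin/sh
# Sketch: recreate nvidia-driver-vol after a host driver change.
# rebuild_nvidia_driver_volume and DRY_RUN are hypothetical names.
DOCKERFILE_URL="https://raw.githubusercontent.com/games-on-whales/gow/master/images/nvidia-driver/Dockerfile"

rebuild_nvidia_driver_volume() {
    # Driver version: first argument, else what the loaded kernel module reports.
    nv_version="${1:-$(cat /sys/module/nvidia/version)}"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "docker volume rm nvidia-driver-vol"
        echo "curl $DOCKERFILE_URL | docker build -t gow/nvidia-driver:latest -f - --build-arg NV_VERSION=$nv_version ."
        echo "docker create --rm --mount source=nvidia-driver-vol,destination=/usr/nvidia gow/nvidia-driver:latest sh"
        return 0
    fi

    # Remove the stale volume; this fails while a container still references it.
    docker volume rm nvidia-driver-vol || return 1
    # Rebuild the driver image against the current driver version.
    curl "$DOCKERFILE_URL" \
        | docker build -t gow/nvidia-driver:latest -f - --build-arg NV_VERSION="$nv_version" . \
        || return 1
    # Creating a container with this mount is the step that fills the new volume.
    docker create --rm --mount source=nvidia-driver-vol,destination=/usr/nvidia gow/nvidia-driver:latest sh
}
```

Try `DRY_RUN=1; rebuild_nvidia_driver_volume` first to see the commands. Containers that still reference the old volume (Wolf itself, and likely the helper container left behind by the previous `docker create`) probably need to be removed before `docker volume rm` will succeed.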

@hitchfred

For me, the fix was to change this in Docker Compose:

    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]

to name the nvidia driver explicitly:

    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

This helped me as well. Running Ubuntu 22.04.

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.216.01             Driver Version: 535.216.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1050 Ti     Off | 00000000:08:00.0 Off |                  N/A |
| 30%   50C    P0              N/A /  75W |   1236MiB /  4096MiB |     51%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2768    C+G   /wolf/wolf                                  105MiB |
|    0   N/A  N/A      5872      G   sway                                          2MiB |
|    0   N/A  N/A      8561      G   Xwayland                                     67MiB |
|    0   N/A  N/A     11208      G   ./steamwebhelper                             22MiB |
|    0   N/A  N/A     11234    C+G   ...ebian-installation/logs/cef_log.txt        4MiB |
|    0   N/A  N/A     13588    C+G   ... the First Sin\Game\DarkSoulsII.exe      985MiB |
+---------------------------------------------------------------------------------------+

@ABeltramo changed the title from "libEGL warning: egl: failed to create dri2 screen" to "Nvidia: using container toolkit results in EGL exception" on Jan 5, 2025