Manual setup of nvidia-docker2 for Debian 11? #1537
Comments
@DenizUgur is there a reason that the existing debian packages cannot be used? In many cases (e.g. ubuntu18.04 or greater) the packages for later releases are the same as the earlier packages. |
Hi! I have the same problem. Any news about a Debian 11 package? |
I was able to create debian11 packages myself by adding |
@wentasah which changes were required to the Makefile apart from adding it to the targets list? If this was all that was required, the existing debian10 packages should not differ significantly from the generated debian11 packages and could be used. The setting you specified is out of scope of |
I'm not very familiar with the NVIDIA container stack; I just needed GPU access from within the containers. I didn't succeed with the debian10 packages, but that was perhaps due to cgroup v2 (I'm not sure now). I just added
diff --git a/mk/Dockerfile.debian b/mk/Dockerfile.debian
index 8e8a560..8de0b19 100644
--- a/mk/Dockerfile.debian
+++ b/mk/Dockerfile.debian
@@ -36,7 +36,8 @@ ENV WITH_LIBELF=${WITH_LIBELF}
ENV WITH_TIRPC=${WITH_TIRPC}
ENV WITH_SECCOMP=${WITH_SECCOMP}
-RUN make distclean && make -j"$(nproc)"
+RUN make distclean && make -j"$(nproc)" || mv -v 'deps/src/elftoolchain-0.7.1/libelf/name libelf.so.1' deps/src/elftoolchain-0.7.1/libelf/libelf.so.1
+RUN make install
ENV DIST_DIR /dist
VOLUME $DIST_DIR
diff --git a/mk/docker.mk b/mk/docker.mk
index efcfaed..24ae10c 100644
--- a/mk/docker.mk
+++ b/mk/docker.mk
@@ -27,14 +27,14 @@ DIST_DIR ?= $(CURDIR)/dist
MAKE_DIR ?= $(CURDIR)/mk
# Supported OSs by architecture
-AMD64_TARGETS := ubuntu20.04 ubuntu18.04 ubuntu16.04 debian10 debian9
+AMD64_TARGETS := ubuntu20.04 ubuntu18.04 ubuntu16.04 debian11 debian10 debian9
X86_64_TARGETS := centos7 centos8 rhel7 rhel8 amazonlinux1 amazonlinux2 opensuse-leap15.1
PPC64LE_TARGETS := ubuntu18.04 ubuntu16.04 centos7 centos8 rhel7 rhel8
ARM64_TARGETS := ubuntu18.04
It would be better if it worked with cgroup v2, but that's not critical for me (at least for now). |
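Since the comment above is unsure whether cgroup v2 was the cause, it can help to confirm which cgroup hierarchy the host actually mounts. The following is a minimal sketch, not part of the original thread: it assumes GNU `stat` is available, and the helper name `cgroup_version` is hypothetical. On a unified (v2) host, `/sys/fs/cgroup` is typically reported as `cgroup2fs`; on a v1 or hybrid host it is typically `tmpfs`.

```shell
#!/bin/sh
# Hypothetical helper: map a filesystem type string (as printed by
# `stat -fc %T /sys/fs/cgroup`) to a cgroup hierarchy label.
cgroup_version() {
  case "$1" in
    cgroup2fs) echo "v2" ;;            # unified hierarchy
    tmpfs)     echo "v1-or-hybrid" ;;  # legacy or hybrid layout
    *)         echo "unknown" ;;       # anything else, including an empty string
  esac
}

# Query the host; fall back to an empty string if stat fails.
fs_type="$(stat -fc %T /sys/fs/cgroup 2>/dev/null || true)"
echo "cgroup hierarchy: $(cgroup_version "$fs_type")"
```

If this reports `v2`, package versions that predate cgroup v2 support are a plausible reason for the failure described above.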
I've tested what @wentasah suggested but it didn't work as expected. |
Not sure why it doesn't work, but this is what I see on my system:
user@server:~$ cat /etc/debian_version
11.0
user@server:~$ docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
Mon Aug 30 11:24:21 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.91.03    Driver Version: 460.91.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:18:00.0 Off |                  N/A |
| 26%   35C    P0    19W / 250W |      0MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
user@server:~$ dlocate libnvidia-ml.so
libnvidia-ml1:amd64: /usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-ml.so.460.91.03
libnvidia-ml1:amd64: /usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-ml.so
libnvidia-ml1:amd64: /usr/lib/x86_64-linux-gnu/nvidia/current/libnvidia-ml.so.1
nvidia-cuda-dev:amd64: /usr/lib/x86_64-linux-gnu/stubs/libnvidia-ml.so
|
Nice! But do you know anything about an official Debian package? |
Please see this comment for an update regarding |
FWIW: We switched over to LXC (Ubuntu 20.04) and it works flawlessly out of the box with systemd.unified_cgroup_hierarchy=1 (i.e. w/o all the docker junk/bloatware, and we got much more flexibility + the freedom to do what we want ;-) ). |
We now have an RC of libnvidia-container out that adds support for If you would like to try it out, make sure and add the
For DEBs
For RPMs
|
Note: This does not directly add |
Release notes here: |
Debian 11 support has now been added such that running the following should now work as expected:
|
I'm using Pop!_OS 21.10. How can I force it to install the latest version? With the experimental repo I can only find 2.8.0, which is not compatible with cgroupv2. |
@johnnync13 specifying the distribution explicitly as either
|
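For context on why a derivative distribution may need an explicit override: the build targets quoted earlier in this thread use strings such as `debian11` and `ubuntu20.04`. A minimal sketch of how such a string can be derived from an os-release file follows. It assumes the string is the concatenation of the `ID` and `VERSION_ID` fields, which is an assumption here rather than something stated in the thread, and the helper name `distribution_string` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print ID followed by VERSION_ID from an os-release
# style file (pass an absolute path). The file is sourced in a subshell so
# its variables do not leak into the calling shell.
distribution_string() {
  ( . "$1" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# On Debian 11 this is expected to print "debian11". A derivative
# distribution prints its own ID instead, which may not match any
# supported target name.
if [ -r /etc/os-release ]; then
  distribution_string /etc/os-release
fi
```

When the derived string does not match a supported target, setting the value by hand to the closest supported upstream release is the kind of override being discussed here.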
I fixed it with this:
PS: https://gist.github.com/kuang-da/2796a792ced96deaf466fdfb7651aa2e |
Is it possible to set up nvidia-docker2 before any stable release? Or are there any plans in the near future to bring support to Debian 11?