
apptainer: unbreak --nv #279235

Closed

Conversation

SomeoneSerge (Contributor)

@SomeoneSerge SomeoneSerge commented Jan 6, 2024

Description of changes

This is a work in progress.

Shortcut to the first error:

❯ APPTAINER_MESSAGELEVEL=100000 NVIDIA_VISIBLE_DEVICES=all nix run --arg config '{ allowUnfree = true; }' -I nixpkgs=flake:github:SomeoneSerge/nixpkgs/feat/nixos-apptainer-nosuid -f '<nixpkgs>' apptainer.gpuChecks.saxpy.unwrapped
...
VERBOSE [U=1001,P=1319708] addCwdMount()                 /home/ss/Sources/nixpkgs/.worktree/apptainer found within container
DEBUG   [U=1001,P=1319708] create()                      nvidia-container-cli
DEBUG   [U=1001,P=1319732] findOnPath()                  Found "nvidia-container-cli" at "/nix/store/j9raldl3nsv797ydmc7v8kxgf8vbkks3-libnvidia-container-1.14.3/bin/nvidia-container-cli"
DEBUG   [U=1001,P=1319732] findOnPath()                  Found "ldconfig" at "/run/current-system/sw/bin/ldconfig"
DEBUG   [U=1001,P=1319732] NVCLIConfigure()              nvidia-container-cli binary: "/nix/store/j9raldl3nsv797ydmc7v8kxgf8vbkks3-libnvidia-container-1.14.3/bin/nvidia-container-cli" args: ["--debug" "--user" "configure" "--no-cgroups" "--device=all" "--compute" "--utility" "--ldconfig=@/run/current-system/sw/bin/ldconfig" "/nix/store/v1mwsgk404dryqfbffyafybyab1snzfp-apptainer-1.2.5/var/lib/apptainer/mnt/session/final"]
DEBUG   [U=1001,P=1319732] NVCLIConfigure()              Running nvidia-container-cli in user namespace
DEBUG   [U=1001,P=1319708] create()                      Chroot into /nix/store/v1mwsgk404dryqfbffyafybyab1snzfp-apptainer-1.2.5/var/lib/apptainer/mnt/session/final
DEBUG   [U=1001,P=1319732] Chroot()                      Hold reference to host / directory
DEBUG   [U=1001,P=1319732] Chroot()                      Called pivot_root on /nix/store/v1mwsgk404dryqfbffyafybyab1snzfp-apptainer-1.2.5/var/lib/apptainer/mnt/session/final
DEBUG   [U=1001,P=1319732] Chroot()                      Change current directory to host / directory
...
CUDA error at cudaRuntimeGetVersion(&rtVersion): CUDA driver version is insufficient for CUDA runtime version

Progress so far

  • Updated libnvidia-container, nvidia-docker, and nvidia-container-cli, and fixed their build-time errors.
  • Patched out several undocumented aborts in apptainer and libnvidia-docker; I also had to extend libnvidia-docker's error messages.
  • Verified nvidia-container-cli list and nvidia-container-cli info

Still suffering from:

  • apptainer --nv --nvccli still doesn't mount any of the libraries from /run/opengl-driver/lib, even though nvidia-container-cli is being called, and its ldcache_resolve does hit them

  • the nvliblist method is broken because we still haven't patched out the ldconfig abuse

  • Running nvidia-container-cli --debug --user configure ... from the shell (outside singularity) still fails, e.g.:

    ❯ "/nix/store/scsh1pr4n520c4mklmwai3bclb3hipj0-libnvidia-container-1.14.3/bin/nvidia-container-cli" "--debug" "--user" "configure" "--no-cgroups" "--device=all" "--compute" "--utility" "--ldconfig=@/nix/store/bh4lz3c2n3qfbm2hhwjhnqcaxcjs2sm8-glibc-2.38-27-bin/bin/ldconfig" "/var/lib/apptainer/mnt/session/final"
    nvidia-container-cli: permission error: /build/source/src/utils.c:1055: perm_set_capabilities(CAP_PERIMTTED, ...): capability change failed: operation not permitted
    

    It's still unclear to me whether nvidia-container-cli was ever meant to be used by unprivileged users.

  • libnvc's logs (warnx, etc.) aren't visible. I don't think apptainer eats them; I suspect they're never actually printed. The NVC_DEBUG_FILE is also ignored. So far, error_setx is the only way I've found to get any logs out of nvidia-container-cli
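As a side note on the perm_set_capabilities() failure above: a quick way to see what capability state the call starts from is to read /proc/self/status. The helper below is a hypothetical diagnostic of mine, not part of this PR; an all-zero CapEff for an unprivileged user would be consistent with the "operation not permitted" error.

```shell
#!/bin/sh
# Hypothetical diagnostic: print the capability masks of the current process,
# i.e. the state perm_set_capabilities() would be trying to change.
caps() {
  # Each Cap* line in /proc/self/status holds a 64-bit hex capability mask.
  awk '/^Cap(Inh|Prm|Eff|Bnd):/ { print $1, $2 }' /proc/self/status
}
caps
```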

Motivation

Apptainer's and Singularity's GPU support is (mostly) broken again. Note that Singularity with setuid used to work (with the fake ldconfig). Recap:

  • Both default to abusing glibc internals (ld.so.cache and/or ldconfig -p) in the hope of finding the sonames listed in etc/nvliblist.conf. This isn't new, but we should add a patch making them support at least NixOS (ideally we need a generic POSIX solution accepted upstream); tracked upstream in apptainer/apptainer#1894 ("--nv: make ldconfig optional in the nvliblist.conf")
  • Singularity only works with setuid
  • Apptainer refuses to use --nvccli unless the setuid bit is unset
  • Apptainer's --nv aborts unless the image is writable, but doesn't explain why it needs the write access
  • --nvccli uses nvidia-container-toolkit, which abuses ld.so.cache, fails obscurely when trying to manage capabilities, and also silently fails to copy or mount the host libraries into the container (even when ldcache_resolve succeeds in locating them)
  • Nixpkgs' libnvidia-container and nvidia-container-toolkit are outdated
  • Several packages in Nixpkgs still use the upstream-deprecated nvidia-docker
  • This is also related to the podman/kubernetes/containerd issues brought up on discourse
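To make the ldconfig dependence concrete: the nvliblist code path effectively resolves each soname from etc/nvliblist.conf against the output of `ldconfig -p`. A rough sketch (the function names are mine, not apptainer's), alongside the kind of glibc-free fallback NixOS would need, scanning /run/opengl-driver/lib directly:

```shell
#!/bin/sh
# Sketch of the soname resolution that --nv's nvliblist path relies on.
# resolve_from_ldcache parses `ldconfig -p`-style lines from stdin;
# resolve_from_dir is a hypothetical cache-free alternative.
resolve_from_ldcache() {
  # Lines look like: "<tab>libcuda.so.1 (libc6,x86-64) => /path/libcuda.so.1"
  awk -v s="$1" '$1 == s && $(NF-1) == "=>" { print $NF; exit }'
}
resolve_from_dir() {
  # On NixOS the driver userspace lives under /run/opengl-driver/lib.
  dir="$2"
  [ -e "$dir/$1" ] && printf '%s\n' "$dir/$1"
}
# Upstream behaviour: ldconfig -p | resolve_from_ldcache libcuda.so.1
# NixOS fallback:     resolve_from_dir libcuda.so.1 /run/opengl-driver/lib
```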

Naturally, we need to contact apptainer, singularityce, and libnvidia-container about these issues. Ideally, they would reach out to the glibc maintainers and ask what the reasonable approach to locating the host libraries would be.

⚠️ I apologize for the language, but I'm pretty tired and I wasn't planning to spend time on this.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.05 Release Notes (or backporting 23.05 and 23.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

CC @ShamrockLee



@github-actions github-actions bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` labels Jan 6, 2024
@SomeoneSerge SomeoneSerge added the 6.topic: cuda Parallel computing platform and API label Jan 6, 2024
@ShamrockLee (Contributor) left a comment:


Sounds like a challenging task. Would love to see it work!

Review comment on nixos/modules/programs/singularity.nix (outdated, resolved)
@ShamrockLee (Contributor):

Cc: @misumisumi #230851

@SomeoneSerge force-pushed the feat/nixos-apptainer-nosuid branch from f24b311 to 680bbed on January 8, 2024 at 17:25
@SomeoneSerge (Contributor, Author):

❯ NVIDIA_VISIBLE_DEVICES=all nix run --arg config '{ allowUnfree = true; }' -f . apptainer.gpuChecks.saxpy.unwrapped
INFO:    gocryptfs not found, will not be able to use gocryptfs
WARNING: passwd file doesn't exist in container, not updating
WARNING: group file doesn't exist in container, not updating
Start
Runtime version: 11080
Driver version: 12030
Host memory initialized, copying to the device
Scheduled a cudaMemcpy, calling the kernel
Scheduled a kernel call
Max error: 0.000000
  • --nvccli is still broken; I don't know if I should keep the related patches in this PR or split them out
  • nvliblist-based --nv works on NixOS, cf. the above

@ShamrockLee @misumisumi I'd appreciate it if you could run some tests at this point. It would also be helpful if somebody could test podman/containerd/etc., though again, it's probably more efficient to split that out

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/using-nvidia-container-runtime-with-containerd-on-nixos/27865/30

@SomeoneSerge (Contributor, Author) commented Jan 8, 2024:

This currently breaks virtualisation.docker.enableNvidia:

docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: ldcache error: open failed: /sbin/ldconfig: No such file or directory.  ...

...the cause is likely in the update, rather than in the patches
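For context on the error above: /sbin/ldconfig is libnvidia-container's default ldconfig path and does not exist on NixOS, so it has to be overridden, e.g. via the `--ldconfig=@...` argument visible in the apptainer debug logs earlier (the "@" prefix tells nvidia-container-cli to resolve the path on the host). A small sketch of the kind of probing a wrapper could do; the candidate paths in the comment are assumptions about a typical NixOS system:

```shell
#!/bin/sh
# Sketch: pick the first usable ldconfig to hand to nvidia-container-cli
# as --ldconfig=@<path>. Candidates are passed in by the caller.
find_ldconfig() {
  for p in "$@"; do
    [ -x "$p" ] && { printf '%s\n' "$p"; return 0; }
  done
  return 1
}
# Typical call on NixOS (paths are assumptions):
#   find_ldconfig /run/current-system/sw/bin/ldconfig /sbin/ldconfig
```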

@SomeoneSerge (Contributor, Author):

The new error from docker run --runtime nvidia --rm ... is

docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: /var/lib/docker/runtimes/nvidia did not terminate successfully: exit status 125: unknown.

The only thing I get in the journal (with debug: true in docker.daemon.settings, and --debug in runtimes.nvidia.runtimeArgs) looks like a random CLI tool's --help: https://gist.github.com/SomeoneSerge/e6fbe4747d7ea4f6f5de2b9477c5f6d2. I tried grepping for parts of this message (including "failed to read init pid file") and for /var/lib/docker/runtimes/nvidia, but found no matches in libnvidia-container, nvidia-container-toolkit, runc, or moby.

If anybody knows how to make docker, runc, or nvidia-container-runtime write logs, please say so. Otherwise I'm stuck and will probably proceed by merging just the apptainer bits.

@SomeoneSerge (Contributor, Author):

GitHub code search suggests the message could've come from containerd. I thought containerd was somehow separate from docker? Anyway, I'm hoping someone more experienced with the OCI ecosystem can handle this.

@SomeoneSerge (Contributor, Author):

Actually, this PR didn't introduce the breakage. It turns out I didn't notice docker run --gpus all failing on me on nixos-unstable already.

Nonetheless, I removed the updates and opened #280087

@GTrunSec (Contributor)

GTrunSec commented Jan 8, 2024

CC @jmbaur: just letting you know about this PR before you update the libcontainer-15 draft.

@SomeoneSerge (Contributor, Author):

This is bad scoping. I split out the apptainer changes into #280076. I'll see if I have the bandwidth for libnvidia-docker (as I mentioned, I fail to retrieve any logs from the runtime, and I'm not familiar with the ecosystem).

Labels
6.topic: cuda Parallel computing platform and API 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` 8.has: package (new) This PR adds a new package 10.rebuild-darwin: 1-10 10.rebuild-linux: 11-100
Projects
Status: Done
4 participants