Nvidia GPU Usage - KeyError: 'gpu_card_path' #694

Open
CapitaineThug opened this issue Feb 5, 2025 · 0 comments
Labels
bug Something isn't working


CapitaineThug commented Feb 5, 2025

Existing Resources

  • [ ✅] Please search the existing issues for related problems
  • [ ✅] Consult the product documentation : Docs
  • [ ✅] Consult the FAQ : FAQ
  • [ ✅] Consult the Troubleshooting Guide : Guide
  • [ ✅] Reviewed existing training videos: Youtube

Describe the bug
I am trying to create a Kasm container that uses my Nvidia graphics card (GTX 1050). Starting the workspace fails with an error (the KeyError: 'gpu_card_path' from the title) and the graphics card cannot be used. When I disable the GPU by setting the “GPU Count” variable to 0, the container starts.

I have also tried less conventional configurations, such as modifying the Docker run configuration directly to force GPU usage in the container. That produced no error, but the graphics card was still not used. Here is the override I used:

{
  "device_requests": [
    {
      "Driver": "nvidia",
      "Count": null,
      "DeviceIDs": [
        "GPU-4555762a-faaa-d468-93f5-d91ed192a883"
      ],
      "Capabilities": [
        [
          "gpu",
          "compute",
          "video"
        ]
      ]
    }
  ]
}
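
For anyone who wants to test the same request outside Kasm, here is a minimal Python sketch. It assumes the docker-py SDK, whose `DeviceRequest` type takes snake_case keyword arguments for the same fields. `to_sdk_kwargs` and `run_gpu_check` are hypothetical helper names, and the image to run is left as a parameter because the report does not name one. This is not how Kasm itself applies the override.

```python
import json

# The override exactly as posted in this report.
OVERRIDE = json.loads("""
{
  "device_requests": [
    {
      "Driver": "nvidia",
      "Count": null,
      "DeviceIDs": ["GPU-4555762a-faaa-d468-93f5-d91ed192a883"],
      "Capabilities": [["gpu", "compute", "video"]]
    }
  ]
}
""")

# Docker API field name -> docker-py DeviceRequest keyword argument.
FIELD_MAP = {
    "Driver": "driver",
    "Count": "count",
    "DeviceIDs": "device_ids",
    "Capabilities": "capabilities",
    "Options": "options",
}


def to_sdk_kwargs(request):
    """Translate one API-style device request into docker-py keyword
    arguments, dropping null fields so the SDK applies its defaults."""
    return {FIELD_MAP[key]: value for key, value in request.items() if value is not None}


def run_gpu_check(image, command="nvidia-smi"):
    """Hypothetical helper: run a throwaway container with the same request.
    Needs the docker package and a reachable Docker daemon."""
    import docker  # imported lazily so the translation above works without it
    from docker.types import DeviceRequest

    requests = [DeviceRequest(**to_sdk_kwargs(r)) for r in OVERRIDE["device_requests"]]
    client = docker.from_env()
    return client.containers.run(image, command, device_requests=requests, remove=True)


KWARGS = to_sdk_kwargs(OVERRIDE["device_requests"][0])
```

Calling `run_gpu_check` with a CUDA-capable image of your choice should print the `nvidia-smi` table if the request reaches the Nvidia runtime. That would help separate a Docker-level problem from a Kasm-level one.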

To Reproduce
Steps to reproduce the behavior:

  1. Go to "Workspaces"
  2. Click on any workspace that is supposed to support GPU (in my case: Blender)
  3. Scroll down to "GPU Count" and set the value to 1
  4. Start a new instance of the workspace (Blender)
  5. The error appears in the logs

Expected behavior
The container starts and can use my Nvidia GPU without any error, and the container process appears when I run nvtop on the host.
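
The host-side check can also be scripted. The sketch below uses `nvidia-smi --query-compute-apps` rather than nvtop, so it only lists processes that hold a compute (CUDA) context; a process using the GPU purely for graphics may not appear. The function names and the sample row are invented for illustration and are not output from this machine.

```python
import csv
import io
import subprocess

# Query processes that currently hold a compute context on the GPU.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(text):
    """Parse the CSV rows printed by QUERY into a list of dicts."""
    rows = []
    for fields in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        pid, name, used = (field.strip() for field in fields)
        rows.append({"pid": int(pid), "name": name, "used_mib": used})
    return rows


def gpu_processes():
    """Run nvidia-smi on the host and return the parsed process list."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_compute_apps(result.stdout)


# Invented sample row, only to show the shape of the parsed data.
SAMPLE = "4242, blender, 310\n"
PARSED = parse_compute_apps(SAMPLE)
```

If the workspace were using the GPU for compute, `gpu_processes()` on the host would be expected to include a row for the container's process.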

Screenshots

[Three screenshots were attached to the original issue; the images are not reproduced here.]

Workspaces Version
1.16.1.be60db

Workspaces Installation Method
Single Server

Client Browser

  • OS: Windows 11 23H2
  • Browser: Firefox
  • Version: 135.0 (64-bit)

Workspace Server Information (please provide the output of the following commands):

  • uname -a
Linux nas01 6.6.44-production+truenas #1 SMP PREEMPT_DYNAMIC Mon Dec 16 20:59:32 UTC 2024 x86_64 GNU/Linux
  • cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
  • sudo docker info
Client: Docker Engine - Community
 Version:    27.1.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.16.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.29.1
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 65
  Running: 59
  Paused: 0
  Stopped: 6
 Images: 99
 Server Version: 27.1.1
 Storage Driver: overlay2
  Backing Filesystem: zfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: nvidia runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 2bf793ef6dc9a18e00cb12efb64355c2c9d5eb41
 runc version: v1.1.13-0-g58aa920
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.6.44-production+truenas
 Operating System: Debian GNU/Linux 12 (bookworm)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 47.05GiB
 Name: nas01
 ID: a7d1f8bf-2cdc-44ed-9a50-43b75ea326a2
 Docker Root Dir: /mnt/SSD/Docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
  • sudo docker ps | grep kasm
f24f29f00ef5   a3b8ebd7e2d7                       "/usr/bin/nvidia-smi…"   12 seconds ago   Up 2 seconds (health: starting)   4444/tcp                                    kasm_gpu_helper
4a40a8b08bd2   kasmweb/proxy:1.16.1               "/docker-entrypoint.…"   2 days ago       Up 33 hours                       80/tcp                                      kasm_proxy
c0906690aadf   kasmweb/share:1.16.1               "/bin/sh -c '/usr/bi…"   2 days ago       Up 33 hours (healthy)             8182/tcp                                    kasm_share
53a6c5369f72   kasmweb/rdp-https-gateway:1.16.1   "/opt/rdpgw/rdpgw"       2 days ago       Up 33 hours (healthy)                                                         kasm_rdp_https_gateway
4c2e5566298f   kasmweb/rdp-gateway:1.16.1         "/start.sh"              2 days ago       Up 33 hours (healthy)             0.0.0.0:3389->3389/tcp, :::3389->3389/tcp   kasm_rdp_gateway
6ff3c7c01437   kasmweb/agent:1.16.1               "/bin/sh -c '/usr/bi…"   2 days ago       Up 33 hours (healthy)             4444/tcp                                    kasm_agent
36168f0ad709   kasmweb/kasm-guac:1.16.1           "/dockerentrypoint.sh"   2 days ago       Up 33 hours (healthy)                                                         kasm_guac
6c6ba7c246c2   kasmweb/manager:1.16.1             "/usr/bin/startup.sh…"   2 days ago       Up 33 hours (healthy)             8181/tcp                                    kasm_manager
2fe359d55d5f   kasmweb/api:1.16.1                 "/bin/sh -c '/usr/bi…"   2 days ago       Up 33 hours (healthy)             8080/tcp                                    kasm_api
42927f45e3c2   redis:5-alpine                     "docker-entrypoint.s…"   2 days ago       Up 33 hours                       6379/tcp                                    kasm_redis
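
The `docker info` output above shows the `nvidia` runtime registered alongside `runc`, with `runc` as the default. A small sketch for reading those two facts programmatically follows, assuming the JSON form printed by `docker info --format '{{json .}}'` with its `Runtimes` and `DefaultRuntime` keys. The helper names are hypothetical, and whether the default runtime matters for this error is not established by this report.

```python
import json
import subprocess


def runtime_summary(info):
    """Summarise the runtime fields of a parsed `docker info` JSON document."""
    runtimes = sorted(info.get("Runtimes") or {})
    default = info.get("DefaultRuntime")
    return {
        "runtimes": runtimes,
        "default": default,
        "nvidia_registered": "nvidia" in runtimes,
        "nvidia_is_default": default == "nvidia",
    }


def read_docker_info():
    """Ask the local Docker CLI for its info document as JSON."""
    result = subprocess.run(
        ["docker", "info", "--format", "{{json .}}"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


# Runtime names and default transcribed from the output in this report;
# the per-runtime details are left empty because the text output omits them.
REPORTED = {
    "Runtimes": {"nvidia": {}, "runc": {}, "io.containerd.runc.v2": {}},
    "DefaultRuntime": "runc",
}
SUMMARY = runtime_summary(REPORTED)
```

On a live host, `runtime_summary(read_docker_info())` gives the same summary without copying values by hand.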

Additional context
I am currently able to use my GPU in other containers (Immich, Jellyfin, Frigate) without any problems, using this docker-compose section:

    deploy:
      resources:
        reservations:
          devices:
          - driver: nvidia
            device_ids:
              - 'GPU-4555762a-faaa-d468-93f5-d91ed192a883'
            capabilities:
              - gpu
              - compute
              - video
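
The Compose reservation above and the `device_requests` JSON earlier in this report appear to describe the same request. The sketch below makes that comparison explicit. It assumes that a Compose `capabilities` list corresponds to a single inner list in the Docker API's `Capabilities` field, and it leaves `Count` as `None` to mirror the JSON as posted. `compose_device_to_api` is a hypothetical helper, not part of Compose or Kasm.

```python
def compose_device_to_api(device):
    """Rewrite one Compose `devices` entry in Docker API field names."""
    return {
        "Driver": device.get("driver"),
        "Count": device.get("count"),  # None when device_ids is used instead
        "DeviceIDs": list(device.get("device_ids", [])),
        # Compose lists capabilities flat; the API nests them one level deeper.
        "Capabilities": [list(device.get("capabilities", []))],
    }


# The working Compose entry from this report.
COMPOSE_DEVICE = {
    "driver": "nvidia",
    "device_ids": ["GPU-4555762a-faaa-d468-93f5-d91ed192a883"],
    "capabilities": ["gpu", "compute", "video"],
}

# The Kasm override from this report.
KASM_REQUEST = {
    "Driver": "nvidia",
    "Count": None,
    "DeviceIDs": ["GPU-4555762a-faaa-d468-93f5-d91ed192a883"],
    "Capabilities": [["gpu", "compute", "video"]],
}

SAME_REQUEST = compose_device_to_api(COMPOSE_DEVICE) == KASM_REQUEST
```

Under those assumptions the two configurations match field for field, which suggests the difference in behaviour comes from how the request is applied rather than from its content.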

The GPU was installed after Kasm Workspaces was set up, so when I ran the install script, it did not know that a GPU would be used later.

I have already done this setup on the same hardware with Ubuntu Server 22.04 LTS, and it worked.

PS: I speak French 😉

CapitaineThug added the bug label Feb 5, 2025