This repository has been archived by the owner on Jan 22, 2024. It is now read-only.

no NVIDIA GPU device is present: /dev/nvidia0 does not exist #1330

Closed
prismspecs opened this issue Jun 24, 2020 · 3 comments

prismspecs commented Jun 24, 2020

1. Issue or feature description

When running:

docker run -it -p 8888:8888 tensorflow/tensorflow:latest-gpu-jupyter jupyter notebook --notebook-dir=/tf --ip 0.0.0.0 --no-browser --allow-root --NotebookApp.allow_origin='https://colab.research.google.com'

and then

import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

I get:

Num GPUs Available:  0

And in my terminal:

Adapting from protocol version 5.1 (kernel 9d0342d5-d12c-4366-8b4e-d89bae8f2e3b) to 5.3 (client).
2020-06-24 21:27:37.458905: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-06-24 21:27:37.458946: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (-1)
2020-06-24 21:27:37.458987: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] no NVIDIA GPU device is present: /dev/nvidia0 does not exist
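The last log line reports that /dev/nvidia0 is not visible from inside the container. A minimal sketch for checking which NVIDIA device nodes the container can actually see, meant to be run from the same notebook. The helper name list_nvidia_device_nodes is hypothetical and uses only the Python standard library:

```python
import glob
import os
from typing import List


def list_nvidia_device_nodes(dev_dir: str = "/dev") -> List[str]:
    """Return the sorted paths under dev_dir whose names start with 'nvidia'."""
    return sorted(glob.glob(os.path.join(dev_dir, "nvidia*")))


# Run inside the container: an empty result means no NVIDIA device nodes
# were exposed to it, which matches the cuda_diagnostics message above.
nodes = list_nvidia_device_nodes()
print("NVIDIA device nodes visible here:", nodes if nodes else "none")
print("/dev/nvidia0 present:", os.path.exists("/dev/nvidia0"))
```

If the list is empty in the container but the host shows /dev/nvidia0, the container was probably started without GPU access; this check only narrows the problem down and does not fix it.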

3. Information to attach (optional if deemed irrelevant)

nvidia-container-cli -k -d /dev/tty info

I0624 21:01:13.729499 15587 nvc.c:281] initializing library context (version=1.0.6, build=0000000000000000000000000000000000000000)
I0624 21:01:13.729603 15587 nvc.c:255] using root /
I0624 21:01:13.729615 15587 nvc.c:256] using ldcache /etc/ld.so.cache
I0624 21:01:13.729625 15587 nvc.c:257] using unprivileged user 1000:1000
W0624 21:01:16.511890 15598 nvc.c:186] failed to set inheritable capabilities
W0624 21:01:16.512007 15598 nvc.c:187] skipping kernel modules load due to failure
I0624 21:01:16.512783 15599 driver.c:133] starting driver service
I0624 21:01:16.550477 15587 nvc_info.c:437] requesting driver information with ''
I0624 21:01:16.550745 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvoptix.so.440.82
I0624 21:01:16.550831 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.440.82
I0624 21:01:16.550871 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.440.82
I0624 21:01:16.550892 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.440.82
I0624 21:01:16.550930 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.440.82
I0624 21:01:16.550982 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.440.82
I0624 21:01:16.551002 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.440.82
I0624 21:01:16.551026 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ifr.so.440.82
I0624 21:01:16.551064 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.440.82
I0624 21:01:16.551111 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.440.82
I0624 21:01:16.551130 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.440.82
I0624 21:01:16.551164 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.440.82
I0624 21:01:16.551210 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.440.82
I0624 21:01:16.551228 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.440.82
I0624 21:01:16.551254 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.440.82
I0624 21:01:16.551284 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.440.82
I0624 21:01:16.551300 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.440.82
I0624 21:01:16.551346 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libnvcuvid.so.440.82
I0624 21:01:16.551583 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libcuda.so.440.82
I0624 21:01:16.551671 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.440.82
I0624 21:01:16.551717 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.440.82
I0624 21:01:16.551754 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.440.82
I0624 21:01:16.551797 15587 nvc_info.c:151] selecting /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.440.82
I0624 21:01:16.551861 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-tls.so.440.82
I0624 21:01:16.551879 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-ptxjitcompiler.so.440.82
I0624 21:01:16.551928 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-opticalflow.so.440.82
I0624 21:01:16.551967 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-opencl.so.440.82
I0624 21:01:16.551989 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-ml.so.440.82
I0624 21:01:16.552038 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-ifr.so.440.82
I0624 21:01:16.552104 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-glvkspirv.so.440.82
I0624 21:01:16.552126 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-glsi.so.440.82
I0624 21:01:16.552174 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-glcore.so.440.82
I0624 21:01:16.552214 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-fbc.so.440.82
I0624 21:01:16.552242 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-fatbinaryloader.so.440.82
I0624 21:01:16.552263 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-encode.so.440.82
I0624 21:01:16.552292 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-eglcore.so.440.82
I0624 21:01:16.552315 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvidia-compiler.so.440.82
I0624 21:01:16.552350 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libnvcuvid.so.440.82
I0624 21:01:16.552446 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libcuda.so.440.82
I0624 21:01:16.552509 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libGLX_nvidia.so.440.82
I0624 21:01:16.552564 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libGLESv2_nvidia.so.440.82
I0624 21:01:16.552585 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libGLESv1_CM_nvidia.so.440.82
I0624 21:01:16.552619 15587 nvc_info.c:151] selecting /usr/lib/i386-linux-gnu/libEGL_nvidia.so.440.82
W0624 21:01:16.552630 15587 nvc_info.c:302] missing library libvdpau_nvidia.so
W0624 21:01:16.552633 15587 nvc_info.c:306] missing compat32 library libnvidia-cfg.so
W0624 21:01:16.552635 15587 nvc_info.c:306] missing compat32 library libvdpau_nvidia.so
W0624 21:01:16.552637 15587 nvc_info.c:306] missing compat32 library libnvidia-rtcore.so
W0624 21:01:16.552639 15587 nvc_info.c:306] missing compat32 library libnvoptix.so
I0624 21:01:16.552828 15587 nvc_info.c:232] selecting /usr/bin/nvidia-smi
I0624 21:01:16.552856 15587 nvc_info.c:232] selecting /usr/bin/nvidia-debugdump
I0624 21:01:16.552864 15587 nvc_info.c:232] selecting /usr/bin/nvidia-persistenced
I0624 21:01:16.552897 15587 nvc_info.c:232] selecting /usr/bin/nvidia-cuda-mps-control
I0624 21:01:16.552906 15587 nvc_info.c:232] selecting /usr/bin/nvidia-cuda-mps-server
I0624 21:01:16.552933 15587 nvc_info.c:369] listing device /dev/nvidiactl
I0624 21:01:16.552936 15587 nvc_info.c:369] listing device /dev/nvidia-uvm
I0624 21:01:16.552938 15587 nvc_info.c:369] listing device /dev/nvidia-uvm-tools
I0624 21:01:16.552940 15587 nvc_info.c:369] listing device /dev/nvidia-modeset
I0624 21:01:16.552953 15587 nvc_info.c:273] listing ipc /run/nvidia-persistenced/socket
W0624 21:01:16.552960 15587 nvc_info.c:277] missing ipc /tmp/nvidia-mps
I0624 21:01:16.552962 15587 nvc_info.c:493] requesting device information with ''
I0624 21:01:16.559910 15587 nvc_info.c:523] listing device /dev/nvidia0 (GPU-402283fa-193f-6581-3f85-96a5ac219d8c at 00000000:01:00.0)
NVRM version:   440.82
CUDA version:   10.2

Device Index:   0
Device Minor:   0
Model:          GeForce GTX 1650
Brand:          GeForce
GPU UUID:       GPU-402283fa-193f-6581-3f85-96a5ac219d8c
Bus Location:   00000000:01:00.0
Architecture:   7.5
I0624 21:01:16.560012 15587 nvc.c:318] shutting down library context
I0624 21:01:16.560793 15599 driver.c:192] terminating driver service
I0624 21:01:16.665093 15587 driver.c:231] driver service terminated successfully

uname -a

Linux pop-os 5.4.0-7634-generic #38~1592497129~20.04~9a1ea2e-Ubuntu SMP Fri Jun 19 22:43:37 UTC  x86_64 x86_64 x86_64 GNU/Linux

nvidia-smi -a

==============NVSMI LOG==============

Timestamp                           : Wed Jun 24 17:05:47 2020
Driver Version                      : 440.82
CUDA Version                        : 10.2

Attached GPUs                       : 1
GPU 00000000:01:00.0
    Product Name                    : GeForce GTX 1650
    Product Brand                   : GeForce
    Display Mode                    : Disabled
    Display Active                  : Disabled
    Persistence Mode                : Disabled
    Accounting Mode                 : Disabled
    Accounting Mode Buffer Size     : 4000
    Driver Model
        Current                     : N/A
        Pending                     : N/A
    Serial Number                   : N/A
    GPU UUID                        : GPU-402283fa-193f-6581-3f85-96a5ac219d8c
    Minor Number                    : 0
    VBIOS Version                   : 90.17.1C.40.4B
    MultiGPU Board                  : No
    Board ID                        : 0x100
    GPU Part Number                 : N/A
    Inforom Version
        Image Version               : G001.0000.02.04
        OEM Object                  : 1.1
        ECC Object                  : N/A
        Power Management Object     : N/A
    GPU Operation Mode
        Current                     : N/A
        Pending                     : N/A
    GPU Virtualization Mode
        Virtualization Mode         : None
        Host VGPU Mode              : N/A
    IBMNPU
        Relaxed Ordering Mode       : N/A
    PCI
        Bus                         : 0x01
        Device                      : 0x00
        Domain                      : 0x0000
        Device Id                   : 0x1F9110DE
        Bus Id                      : 00000000:01:00.0
        Sub System Id               : 0x09051028
        GPU Link Info
            PCIe Generation
                Max                 : 3
                Current             : 3
            Link Width
                Max                 : 16x
                Current             : 16x
        Bridge Chip
            Type                    : N/A
            Firmware                : N/A
        Replays Since Reset         : 0
        Replay Number Rollovers     : 0
        Tx Throughput               : 1824000 KB/s
        Rx Throughput               : 7000 KB/s
    Fan Speed                       : N/A
    Performance State               : P3
    Clocks Throttle Reasons
        Idle                        : Not Active
        Applications Clocks Setting : Not Active
        SW Power Cap                : Active
        HW Slowdown                 : Not Active
            HW Thermal Slowdown     : Not Active
            HW Power Brake Slowdown : Not Active
        Sync Boost                  : Not Active
        SW Thermal Slowdown         : Active
        Display Clock Setting       : Not Active
    FB Memory Usage
        Total                       : 3914 MiB
        Used                        : 1497 MiB
        Free                        : 2417 MiB
    BAR1 Memory Usage
        Total                       : 256 MiB
        Used                        : 6 MiB
        Free                        : 250 MiB
    Compute Mode                    : Default
    Utilization
        Gpu                         : 31 %
        Memory                      : 23 %
        Encoder                     : 0 %
        Decoder                     : 0 %
    Encoder Stats
        Active Sessions             : 0
        Average FPS                 : 0
        Average Latency             : 0
    FBC Stats
        Active Sessions             : 0
        Average FPS                 : 0
        Average Latency             : 0
    Ecc Mode
        Current                     : N/A
        Pending                     : N/A
    ECC Errors
        Volatile
            SRAM Correctable        : N/A
            SRAM Uncorrectable      : N/A
            DRAM Correctable        : N/A
            DRAM Uncorrectable      : N/A
        Aggregate
            SRAM Correctable        : N/A
            SRAM Uncorrectable      : N/A
            DRAM Correctable        : N/A
            DRAM Uncorrectable      : N/A
    Retired Pages
        Single Bit ECC              : N/A
        Double Bit ECC              : N/A
        Pending Page Blacklist      : N/A
    Temperature
        GPU Current Temp            : 55 C
        GPU Shutdown Temp           : 102 C
        GPU Slowdown Temp           : 97 C
        GPU Max Operating Temp      : 87 C
        Memory Current Temp         : N/A
        Memory Max Operating Temp   : N/A
    Power Readings
        Power Management            : N/A
        Power Draw                  : 12.51 W
        Power Limit                 : N/A
        Default Power Limit         : N/A
        Enforced Power Limit        : N/A
        Min Power Limit             : N/A
        Max Power Limit             : N/A
    Clocks
        Graphics                    : 1395 MHz
        SM                          : 1395 MHz
        Memory                      : 3500 MHz
        Video                       : 1290 MHz
    Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Default Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Max Clocks
        Graphics                    : 2100 MHz
        SM                          : 2100 MHz
        Memory                      : 4001 MHz
        Video                       : 1950 MHz
    Max Customer Boost Clocks
        Graphics                    : N/A
    Clock Policy
        Auto Boost                  : N/A
        Auto Boost Default          : N/A
    Processes
        Process ID                  : 1231
            Type                    : G
            Name                    : /usr/lib/xorg/Xorg
            Used GPU Memory         : 133 MiB
        Process ID                  : 2004
            Type                    : G
            Name                    : /usr/lib/xorg/Xorg
            Used GPU Memory         : 654 MiB
        Process ID                  : 2251
            Type                    : G
            Name                    : /usr/bin/gnome-shell
            Used GPU Memory         : 330 MiB
        Process ID                  : 9123
            Type                    : G
            Name                    : /opt/Signal/signal-desktop --type=gpu-process --field-trial-handle=14144895808167185940,17353966407298987402,131072 --enable-features=WebComponentsV0Enabled --disable-features=SpareRendererForSitePerProcess --no-sandbox --gpu-preferences=KAAAAAAAAAAgAAAgAAAAAAAAYAAAAAAAEAAAAAAAAAAAAAAAAAAAAAgAAAAAAAAA --shared-files
            Used GPU Memory         : 75 MiB
        Process ID                  : 10541
            Type                    : G
            Name                    : /usr/lib/chromium/chromium --type=gpu-process --field-trial-handle=15787011695909988649,307106227513681798,131072 --enable-gpu-rasterization --gpu-preferences=KAAAAAAAAAAgAAAgAAAAAAAAYAAAAAAAEAAAAAAAAAAAAAAAAAAAAAgAAAAAAAAA --shared-files
            Used GPU Memory         : 227 MiB
        Process ID                  : 14527
            Type                    : G
            Name                    : /tmp/.mount_ObsidiUwpDRu/obsidian --type=gpu-process --field-trial-handle=11973769030279088385,14278091430670226667,131072 --enable-features=WebComponentsV0Enabled --disable-features=SpareRendererForSitePerProcess --gpu-preferences=KAAAAAAAAAAgAAAgAAAAAAAAYAAAAAAAEAAAAAAAAAAAAAAAAAAAAAgAAAAAAAAA --shared-files
            Used GPU Memory         : 60 MiB

docker version

Client: Docker Engine - Community
 Version:           19.03.12
 API version:       1.40
 Go version:        go1.13.10
 Git commit:        48a66213fe
 Built:             Mon Jun 22 15:45:44 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.12
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.10
  Git commit:       48a66213fe
  Built:            Mon Jun 22 15:44:15 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.13
  GitCommit:        7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

nvidia-container-cli -V

version: 1.0.6
build date: 2019-12-24T18:28+00:00
build revision: 0000000000000000000000000000000000000000
build compiler: x86_64-linux-gnu-gcc-9 9.2.1 20191130
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -Wdate-time -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -I/usr/include/tirpc -g -O2 -fdebug-prefix-map=/build/libnvidia-container-7GW8Zd/libnvidia-container-1.0.6=. -fstack-protector-strong -Wformat -Werror=format-security -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections -Wl,-Bsymbolic-functions -Wl,-z,relro
@LeoPotiers

I have exactly the same problem. Did you manage to find a solution?

@dridgway

I had the same problem. I needed to add --gpus=all to the docker run command line.
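A minimal sketch of what that change looks like when applied to the command from the original report. The helper add_gpus_flag is hypothetical (it is not part of Docker or nvidia-docker); it only rewrites the command string and does not run anything. The --gpus option itself assumes Docker 19.03 or later with the NVIDIA container runtime installed on the host:

```python
import shlex


def add_gpus_flag(command: str, gpus: str = "all") -> str:
    """Return `command` with --gpus=<gpus> inserted right after `docker run`."""
    args = shlex.split(command)
    if args[:2] != ["docker", "run"]:
        raise ValueError("expected a `docker run` command")
    # Leave the command alone if it already requests GPUs.
    if any(a == "--gpus" or a.startswith("--gpus=") for a in args):
        return shlex.join(args)
    return shlex.join(args[:2] + ["--gpus=" + gpus] + args[2:])


# Command taken from the original report, unchanged apart from line wrapping.
original = (
    "docker run -it -p 8888:8888 tensorflow/tensorflow:latest-gpu-jupyter "
    "jupyter notebook --notebook-dir=/tf --ip 0.0.0.0 --no-browser --allow-root "
    "--NotebookApp.allow_origin='https://colab.research.google.com'"
)
print(add_gpus_flag(original))
```

The printed command can be pasted into a terminal. Note that shlex.join requires Python 3.8 or newer.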

sosata commented Aug 9, 2021

I have the same problem, and adding --gpus=all did NOT solve the issue.
