[BUG] [GPU] Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory #15

Closed
1 task done
andreaalloway opened this issue Apr 14, 2024 · 14 comments
Labels: bug, no-issue-activity

@andreaalloway

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

I'm using the lscr.io/linuxserver/faster-whisper:gpu image, and any Wyoming prompt results in the following error:

Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory

It appears to be related to this behavior in faster-whisper: SYSTRAN/faster-whisper#516
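
One way to confirm that the dynamic loader really cannot find the library is to try opening it by name from inside the container. The following is a minimal sketch using only the Python standard library; the helper name `can_load` is hypothetical and is not part of faster-whisper or CTranslate2.

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic loader can open the shared library `soname`."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the file is not on the loader search path
        # (LD_LIBRARY_PATH, the ld.so cache, default directories)
        # or when it exists but cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    name = "libcudnn_ops_infer.so.8"
    print(name, "loadable:", can_load(name))
```

If this prints `False` in the same environment the service runs in, the problem is most likely the library search path rather than the Wyoming integration.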

Expected Behavior

faster-whisper should be able to use the GPU to transcribe speech to text.

Steps To Reproduce

1. Set up the faster-whisper Docker container as shown below.
2. Add faster-whisper to Home Assistant using the Wyoming protocol.
3. Set up a Raspberry Pi 3+ with wyoming-satellite per https://github.com/rhasspy/wyoming-satellite/blob/master/docs/tutorial_installer.md
4. Prompts are responded to (by the local wyoming-wakeword.service), but the logs on the Docker container indicate an error.

Logs for docker container lscr.io/linuxserver/faster-whisper:gpu

INFO:faster_whisper:Processing audio with duration 00:15.000
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory

Logs for wyoming-satellite.service

run[807]: WARNING:root:Event(type='error', data={'text': 'speech-to-text failed', 'code': 'stt-stream-failed'}, payload=None)

Environment

- OS: CentOS Stream 8 using the kernel-ml module
Linux 6.5.6-1.el8.elrepo.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Oct  6 17:10:59 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux
- How docker service was installed:
yum install -y yum-utils
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum erase podman buildah
yum install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin docker-compose
wget 'http://international.download.nvidia.com/XFree86/Linux-x86_64/550.67/NVIDIA-Linux-x86_64-550.67.run'
chmod +x NVIDIA-Linux-x86_64-550.67.run
./NVIDIA-Linux-x86_64-550.67.run
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo |   sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
yum install -y nvidia-container-toolkit
nvidia-ctk runtime configure --runtime=docker


nvidia-smi
Sat Apr 13 21:29:22 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.67                 Driver Version: 550.67         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1080        Off |   00000000:05:00.0 Off |                  N/A |
|  0%   31C    P8              8W /  180W |    6487MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A   1731864      C   /app/.venv/bin/python                        3244MiB |
|    0   N/A  N/A   1737066      C   python3                                      3240MiB |
+-----------------------------------------------------------------------------------------+


### CPU architecture

x86-64

### Docker creation

```yaml
version: '3.8'
services:
  faster-whisper:
    image: lscr.io/linuxserver/faster-whisper:gpu
    container_name: faster-whisper
    restart: always
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    environment:
      - PUID=<REDACTED>
      - PGID=<REDACTED>
      - TZ=<REDACTED>
      - WHISPER_MODEL=medium
      - WHISPER_BEAM=1 #optional
      - WHISPER_LANG=en #optional
    volumes:
      - /path/to/docker/whisper/config/:/config
    ports:
      - 10300:10300
    runtime: nvidia
    networks:
      swag_default:
```

### Container logs

```bash
[custom-init] No custom files found, skipping...
[2024-04-13 21:18:16.336] [ctranslate2] [thread 153] [warning] The compute type inferred from the saved model is float16, but the target device or backend do not support efficient float16 computation. The model weights have been automatically converted to use the float32 compute type instead.
INFO:__main__:Ready
[ls.io-init] done.
INFO:faster_whisper:Processing audio with duration 00:15.000
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory
[2024-04-13 21:21:04.052] [ctranslate2] [thread 223] [warning] The compute type inferred from the saved model is float16, but the target device or backend do not support efficient float16 computation. The model weights have been automatically converted to use the float32 compute type instead.
INFO:__main__:Ready
```
@andreaalloway
Author

andreaalloway commented Apr 14, 2024

As a workaround, it appears that I can do the following.
Log into the container using:

docker exec -it faster-whisper /bin/bash

Install torch:

pip install torch --index-url https://download.pytorch.org/whl/cu121

Exit the container's bash shell.

Create a .bashrc file under the /config directory (vim is not installed in the container, so I used the host for this):

vim config/.bashrc

with the contents:

export LD_LIBRARY_PATH=`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; import torch; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__) + ":" + os.path.dirname(torch.__file__) +"/lib")'`:$LD_LIBRARY_PATH

Then restart the container:

docker restart faster-whisper
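
The one-liner in the .bashrc above imports each package just to read its location. As a rough alternative sketch, the same directories can be looked up without fully importing the leaf packages. The function names `package_dirs` and `build_ld_library_path` are hypothetical, and this version leaves out the `torch/lib` directory that the original workaround also appends.

```python
import importlib.util
import os


def package_dirs(module_names):
    """Return the on-disk directories of the importable packages in `module_names`."""
    dirs = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # A missing parent package (e.g. no `nvidia` wheels installed) raises here.
            spec = None
        if spec is None or not spec.submodule_search_locations:
            continue  # not installed, or a plain module rather than a package
        dirs.extend(spec.submodule_search_locations)
    return dirs


def build_ld_library_path(module_names, existing=""):
    """Prepend the package directories to an existing LD_LIBRARY_PATH value."""
    parts = package_dirs(module_names)
    if existing:
        parts.append(existing)
    return os.pathsep.join(parts)


if __name__ == "__main__":
    modules = ["nvidia.cublas.lib", "nvidia.cudnn.lib"]
    print(build_ld_library_path(modules, os.environ.get("LD_LIBRARY_PATH", "")))
```

Packages that are not installed are skipped silently, so an empty result means none of the wheels were found in the active Python environment.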

@homerr homerr self-assigned this Apr 14, 2024
@homerr homerr added the bug Something isn't working label Apr 14, 2024
@PontyJohnty

This worked for me too. Thank you for the suggestion.

@LinuxServer-CI
Contributor

This issue has been automatically marked as stale because it has not had recent activity. This might be due to missing feedback from OP. It will be closed if no further activity occurs. Thank you for your contributions.

@LoghamLogan

I had this same issue. I'm no expert, but is this because the Dockerfile is installing libs for cu11 rather than cu12 (so it breaks for anyone using CUDA 12)? It would be nice if this could get fixed to avoid the need for the workaround suggested above.

@aptalca
Member

aptalca commented May 19, 2024

upstream project wants cu11 iirc

@thespad
Member

thespad commented May 19, 2024

It looks like upstream has switched the default recommendation to CUDA 12 (SYSTRAN/faster-whisper@3d1de60), with the caveat that this may break some CUDA 11 setups. I don't think we can win on that, because the same version of ctranslate2 won't support both 11 and 12, and I don't really want a) a 5GB+ image or b) two different branches for different versions.

@thespad
Member

thespad commented May 19, 2024

It also looks like nvidia-cudnn-cu12 version 9+ has issues, so it's going to need pinning.
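
To see whether an environment has picked up one of the affected cuDNN releases, the installed wheel version can be inspected. This is only a sketch: the helper names are hypothetical, and the threshold of major version 9 is taken from the comment above rather than from any upstream compatibility table.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading numeric component of a dotted version string."""
    return int(version.split(".")[0])


def needs_pin(version: str, first_bad_major: int = 9) -> bool:
    """True if `version` is at or above the first major release reported to have issues."""
    return major_version(version) >= first_bad_major


def installed_version(dist_name: str = "nvidia-cudnn-cu12"):
    """Return the installed version of `dist_name`, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("nvidia-cudnn-cu12 is not installed in this environment")
    else:
        print(found, "-> needs pinning below 9:", needs_pin(found))
```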

@thespad thespad mentioned this issue May 19, 2024
1 task
@thespad
Member

thespad commented May 19, 2024

Please try ghcr.io/linuxserver/lspipepr-faster-whisper:gpu-2.0.0-pkg-c801351f-dev-4db4a97b3e161472da9c546387db12b39d05a816-pr-16 and see if it resolves your issues.

@andreaalloway
Author

This version appears to be working without the .bashrc workaround.

@thespad
Member

thespad commented May 19, 2024

PR has been merged, new image should be built in the next ~30 mins.

@thespad thespad closed this as completed May 19, 2024
@LinuxServer-CI LinuxServer-CI moved this from Issues to Done in Issue & PR Tracker May 19, 2024
@richardoswald

[
    {
        "Id": "5a7daaf35afd582695e2a7334f4df4300568706c021a94dc36b8570f43fa578b",
        "Created": "2024-05-19T23:19:00.278825074Z",
        "Path": "/init",
        "Args": [],
        "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 31380,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2024-05-19T23:19:00.622553426Z",
            "FinishedAt": "0001-01-01T00:00:00Z"
        },
        "Image": "sha256:d21f6ea99e039c4c747462439217435ae5dda8a05de3c9d36d9c3fdd9a77eadb",
        "ResolvConfPath": "/var/lib/docker/containers/5a7daaf35afd582695e2a7334f4df4300568706c021a94dc36b8570f43fa578b/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/5a7daaf35afd582695e2a7334f4df4300568706c021a94dc36b8570f43fa578b/hostname",
        "HostsPath": "/var/lib/docker/containers/5a7daaf35afd582695e2a7334f4df4300568706c021a94dc36b8570f43fa578b/hosts",
        "LogPath": "/var/lib/docker/containers/5a7daaf35afd582695e2a7334f4df4300568706c021a94dc36b8570f43fa578b/5a7daaf35afd582695e2a7334f4df4300568706c021a94dc36b8570f43fa578b-json.log",
        "Name": "/faster-whisper",
        "RestartCount": 0,
        "Driver": "btrfs",
        "Platform": "linux",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": [
                "/mnt/cache/appdata/faster-whisper:/config:rw"
            ],
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "json-file",
                "Config": {
                    "max-file": "1",
                    "max-size": "50m"
                }
            },
            "NetworkMode": "br0.20",
            "PortBindings": {},
            "RestartPolicy": {
                "Name": "no",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "ConsoleSize": [
                0,
                0
            ],
            "CapAdd": null,
            "CapDrop": null,
            "CgroupnsMode": "private",
            "Dns": [
                "10.0.20.1"
            ],
            "DnsOptions": [],
            "DnsSearch": [],
            "ExtraHosts": null,
            "GroupAdd": null,
            "IpcMode": "private",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": false,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": null,
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 67108864,
            **"Runtime": "nvidia",**
            "Isolation": "",
            "CpuShares": 0,
            "Memory": 0,
            "NanoCpus": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": [],
            "BlkioDeviceReadBps": [],
            "BlkioDeviceWriteBps": [],
            "BlkioDeviceReadIOps": [],
            "BlkioDeviceWriteIOps": [],
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": [],
            "DeviceCgroupRules": null,
            **"DeviceRequests": [
                {
                    "Driver": "",
                    "Count": -1,
                    "DeviceIDs": null,
                    "Capabilities": [
                        [
                            "gpu"
                        ]
                    ],
                    "Options": {}
                }
            ],**
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": null,
            "OomKillDisable": null,
            "PidsLimit": null,
            "Ulimits": null,
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0,
            "MaskedPaths": [
                "/proc/asound",
                "/proc/acpi",
                "/proc/kcore",
                "/proc/keys",
                "/proc/latency_stats",
                "/proc/timer_list",
                "/proc/timer_stats",
                "/proc/sched_debug",
                "/proc/scsi",
                "/sys/firmware",
                "/sys/devices/virtual/powercap"
            ],
            "ReadonlyPaths": [
                "/proc/bus",
                "/proc/fs",
                "/proc/irq",
                "/proc/sys",
                "/proc/sysrq-trigger"
            ]
        },
        "GraphDriver": {
            "Data": null,
            "Name": "btrfs"
        },
        "Mounts": [
            {
                "Type": "bind",
                "Source": "/mnt/cache/appdata/faster-whisper",
                "Destination": "/config",
                "Mode": "rw",
                "RW": true,
                "Propagation": "rprivate"
            }
        ],
        "Config": {
            "Hostname": "5a7daaf35afd",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "ExposedPorts": {
                "10300/tcp": {}
            },
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "PUID=99",
                "UMASK=022",
                "HOST_OS=Unraid",
                "HOST_HOSTNAME=zuse",
                "HOST_CONTAINERNAME=faster-whisper",
                "TCP_PORT_10300=10300",
                "WHISPER_MODEL=tiny-int8",
                "PGID=100",
                "TZ=America/Chicago",
                "WHISPER_BEAM=1",
                "WHISPER_LANG=en",
                **"NVIDIA_DRIVER_CAPABILITIES'=gpu",**
                **"NVIDIA_VISIBLE_DEVICES=GPU-4fcc04e7-23a5-2aa8-96e5-76facc3844bc",**
                "PATH=/lsiopy/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                "HOME=/config",
                "LANGUAGE=en_US.UTF-8",
                "LANG=en_US.UTF-8",
                "TERM=xterm",
                "S6_CMD_WAIT_FOR_SERVICES_MAXTIME=0",
                "S6_VERBOSITY=1",
                "S6_STAGE2_HOOK=/docker-mods",
                "VIRTUAL_ENV=/lsiopy",
                "LSIO_FIRST_PARTY=true"
            ],
            "Cmd": null,
            "Image": "ghcr.io/linuxserver/lspipepr-faster-whisper:gpu-2.0.0-pkg-c801351f-dev-4db4a97b3e161472da9c546387db12b39d05a816-pr-16",
            "Volumes": {
                "/config": {}
            },
            "WorkingDir": "/",
            "Entrypoint": [
                "/init"
            ],
            "OnBuild": null,
            "Labels": {
                "build_version": "Linuxserver.io version:- 2.0.0-pkg-c801351f-dev-4db4a97b3e161472da9c546387db12b39d05a816-pr-16 Build-date:- 2024-05-19T15:21:39+00:00",
                "maintainer": "thespad",
                "net.unraid.docker.icon": "https://raw.githubusercontent.com/linuxserver/docker-templates/master/linuxserver.io/img/linuxserver-ls-logo.png",
                "net.unraid.docker.managed": "dockerman",
                "org.opencontainers.image.authors": "linuxserver.io",
                "org.opencontainers.image.created": "2024-05-19T15:21:39+00:00",
                "org.opencontainers.image.description": "[Faster-whisper](https://github.com/SYSTRAN/faster-whisper) is a reimplementation of OpenAI's Whisper model using CTranslate2, which is a fast inference engine for Transformer models. This container provides a Wyoming protocol server for faster-whisper.",
                "org.opencontainers.image.documentation": "https://docs.linuxserver.io/images/docker-faster-whisper",
                "org.opencontainers.image.licenses": "GPL-3.0-only",
                "org.opencontainers.image.ref.name": "4db4a97b3e161472da9c546387db12b39d05a816",
                "org.opencontainers.image.revision": "4db4a97b3e161472da9c546387db12b39d05a816",
                "org.opencontainers.image.source": "https://github.com/linuxserver/docker-faster-whisper",
                "org.opencontainers.image.title": "Faster-whisper",
                "org.opencontainers.image.url": "https://github.com/linuxserver/docker-faster-whisper/packages",
                "org.opencontainers.image.vendor": "linuxserver.io",
                "org.opencontainers.image.version": "2.0.0-ls18",
                "swag": "enable",
                "swag_port": "10300",
                "swag_url": "fw.theoswalds.com"
            }
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "4a8532f3a95ce7242da0cb9c396165aeb2078acd1a812163577270cb16bb172c",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {},
            "SandboxKey": "/var/run/docker/netns/4a8532f3a95c",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "",
            "Gateway": "",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "",
            "IPPrefixLen": 0,
            "IPv6Gateway": "",
            "MacAddress": "",
            "Networks": {
                "br0.20": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": [
                        "5a7daaf35afd"
                    ],
                    "NetworkID": "083fbe85ffc8034005057ad6b3b67ba621dc6c0601e0afbad3dd37bebc35bc6e",
                    "EndpointID": "db5e2a20d843e010245619196d74af94db24ef91b28c529734df2cefcfbb8635",
                    "Gateway": "10.0.20.1",
                    "IPAddress": "10.0.20.27",
                    "IPPrefixLen": 24,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "02:42:0a:00:14:1b",
                    "DriverOpts": null
                }
            }
        }
    }
]

Error

INFO:faster_whisper:Processing audio with duration 00:01.260
ERROR:asyncio:Task exception was never retrieved
future: <Task finished name='wyoming event handler' coro=<AsyncEventHandler.run() done, defined at /lsiopy/lib/python3.10/site-packages/wyoming/server.py:28> exception=RuntimeError('cuBLAS failed with status CUBLAS_STATUS_ALLOC_FAILED')>
Traceback (most recent call last):
  File "/lsiopy/lib/python3.10/site-packages/wyoming/server.py", line 35, in run
    if not (await self.handle_event(event)):
  File "/lsiopy/lib/python3.10/site-packages/wyoming_faster_whisper/handler.py", line 70, in handle_event
    text = " ".join(segment.text for segment in segments)
  File "/lsiopy/lib/python3.10/site-packages/wyoming_faster_whisper/handler.py", line 70, in <genexpr>
    text = " ".join(segment.text for segment in segments)
  File "/lsiopy/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 511, in generate_segments
    encoder_output = self.encode(segment)
  File "/lsiopy/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 762, in encode
    return self.model.encode(features, to_cpu=to_cpu)
RuntimeError: cuBLAS failed with status CUBLAS_STATUS_ALLOC_FAILED

I'm running into an issue with the latest version, and with ghcr.io/linuxserver/lspipepr-faster-whisper:gpu-2.0.0-pkg-c801351f-dev-4db4a97b3e161472da9c546387db12b39d05a816-pr-16 as well. I've put all of the NVIDIA flags that I'm using in bold. Any thoughts?
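
One detail in the `Env` list above is that the `NVIDIA_DRIVER_CAPABILITIES'` entry has a stray quote at the end of the variable name, so the NVIDIA runtime would not see it under its expected name. Whether that has anything to do with the `CUBLAS_STATUS_ALLOC_FAILED` error is not established in this thread. As a hedged sketch, a small check like the following can flag malformed names in a `docker inspect` `Env` list; the function name `suspicious_env_names` is hypothetical.

```python
import re

# Conventional environment variable names: letters, digits and underscores,
# not starting with a digit.
_VALID_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def suspicious_env_names(env_entries):
    """Return the names from NAME=value entries that contain unexpected characters."""
    bad = []
    for entry in env_entries:
        name, _, _value = entry.partition("=")
        if not _VALID_NAME.fullmatch(name):
            bad.append(name)
    return bad


if __name__ == "__main__":
    env = [
        "WHISPER_MODEL=tiny-int8",
        "NVIDIA_DRIVER_CAPABILITIES'=gpu",
        "NVIDIA_VISIBLE_DEVICES=GPU-4fcc04e7-23a5-2aa8-96e5-76facc3844bc",
    ]
    print(suspicious_env_names(env))
```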

@kanjieater

kanjieater commented May 21, 2024

I'm also getting this error with the tool I'm developing using this image.

docker run -it --rm --name subplz --gpus all -v /mnt/d/sync:/sync -v /mnt/d/SyncCache:/SyncCache subplz:latest sync -d "/sync/変な家/" --rerun   
🖥️  We're using cuda. Results will be faster using Cuda with GPU than just CPU. Lot's of RAM needed no matter what.
📝 Transcribing...
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory

I had this error locally on the host as well, and had to add LD_LIBRARY_PATH to my environment variables to get it working. I see the workaround above, but the issue also says it's fixed. Is there any reason I still can't run faster-whisper commands?

Update: I ran this inside my Docker container:

>>> import os
>>> import nvidia.cublas.lib
>>> import nvidia.cudnn.lib
>>> 
>>> print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))
/lsiopy/lib/python3.10/site-packages/nvidia/cublas/lib:/lsiopy/lib/python3.10/site-packages/nvidia/cudnn/lib

Then I copied the output into my Dockerfile to get things working. It's basically the same workaround as before.
ENV LD_LIBRARY_PATH="/lsiopy/lib/python3.10/site-packages/nvidia/cublas/lib:/lsiopy/lib/python3.10/site-packages/nvidia/cudnn/lib"
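
After setting `LD_LIBRARY_PATH` this way, it can be useful to verify that one of the listed directories actually contains the library named in the error. The following is a minimal sketch under that assumption; `find_library` is a hypothetical helper, and it only looks at the directories passed in, not at the system loader cache.

```python
import glob
import os


def find_library(pattern, ld_library_path):
    """Return files matching `pattern` in each directory of an LD_LIBRARY_PATH-style string."""
    matches = []
    for directory in ld_library_path.split(os.pathsep):
        if directory:  # skip empty entries such as a trailing ':'
            matches.extend(sorted(glob.glob(os.path.join(directory, pattern))))
    return matches


if __name__ == "__main__":
    path = os.environ.get("LD_LIBRARY_PATH", "")
    for hit in find_library("libcudnn_ops_infer.so*", path):
        print(hit)
```

No output means none of the directories on the path hold a matching file, which would be consistent with the "cannot open shared object file" error.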

This issue is locked due to inactivity

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 21, 2024