Expand documentation wrt how to use NVIDIA GPU(install TF+CUDA+cuDNN) #50

Open
ghost opened this issue Dec 9, 2020 · 7 comments

@ghost

ghost commented Dec 9, 2020

It would be good to expand the documentation to show users how to use an NVIDIA GPU for inference, and maybe even provide a reference setup (using Ubuntu 18.04 LTS, install whatever CUDA+cuDNN TF2 supports, 10.1 as of right now I think) and offload inference to that.
Currently, on an i7 8665U (4c8t, 25W) CPU, running MobileNetV1 on a 1280x720 video stream pushes all cores to nearly 80-100%. It would be great to offload it to the GPU and actually make the system usable for other tasks while video chatting.
PS: I am volunteering to write the documentation.

@allo-
Owner

allo- commented Dec 9, 2020

There is nothing special to do. You set up cuDNN as for any other project and it should work. I guess we should rather link to the official docs, which stay updated when something changes.

@ghost
Author

ghost commented Dec 9, 2020

It's a real PITA to install TF compatible CUDA+cuDNN unless you're maybe using conda that can resolve it for you. If you pip install tensorflow, you have to use CUDA 10 and then look up which cuDNN version is compatible with CUDA 10 and the TF you have installed (usually a 7.x version). If you want a newer TF/CUDA/cuDNN, you have to build and compile TF C yourself using bazel, which is even more of a PITA. This process assumes a lot of prior knowledge, so it would be good to provide actual documentation for it, or just tell users to use a container.
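To make the version lookup concrete, here is a minimal sketch of it as a small table. The entries are an excerpt from TensorFlow's published "tested build configurations" for Linux GPU builds as I understand them for late 2020; the names `TESTED_CONFIGS` and `required_toolkit` are made up for this illustration, and the values should be checked against the official table before relying on them.

```python
# Excerpt of TensorFlow's tested build configurations (Linux, GPU).
# Assumption: values reflect the official table around late 2020;
# verify them against the current table before installing anything.
TESTED_CONFIGS = {
    "2.0": {"cuda": "10.0", "cudnn": "7.4"},
    "2.1": {"cuda": "10.1", "cudnn": "7.6"},
    "2.2": {"cuda": "10.1", "cudnn": "7.6"},
    "2.3": {"cuda": "10.1", "cudnn": "7.6"},
    "2.4": {"cuda": "11.0", "cudnn": "8.0"},
}


def required_toolkit(tf_version):
    """Return the CUDA/cuDNN pair for a TF version such as '2.3.1', or None."""
    # Only the major.minor part decides which toolkit a wheel was built against.
    major_minor = ".".join(tf_version.split(".")[:2])
    return TESTED_CONFIGS.get(major_minor)


if __name__ == "__main__":
    print(required_toolkit("2.3.1"))
```

A version missing from the table returns `None`, which is the cue to go back to the official compatibility table rather than guess.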

@allo-
Owner

allo- commented Dec 9, 2020

It's a real PITA to install TF compatible CUDA+cuDNN unless you're maybe using conda that can resolve it for you.

Nvidia likes to make it hard. I do not understand why you can only download this after registering. Their Debian packages are strange as well: one package extracts other Debian packages into a cache folder and installs them from there.

Compiling tensorflow using pip was never a problem here, but not every tensorflow version is compatible with every CUDA version. I guess here we need to blame Google.

What worked for me:

  • Install CUDA using the Debian packages
  • Install cuDNN using the downloaded files
  • Find out which tensorflow version works with the CUDA version and install it with pip
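After these three steps, a quick check of whether TensorFlow actually sees the GPU can look like the sketch below. It assumes TensorFlow 2.1 or newer, where `tf.config.list_physical_devices` is available; the helper name `visible_gpus` is made up for this example.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see (empty if none or TF is missing)."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return []
    # Assumption: TF >= 2.1, where this call is no longer under "experimental".
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    print("GPUs visible to TensorFlow:", gpus if gpus else "none")
```

An empty result with a working driver usually points at a CUDA/cuDNN version that does not match the installed tensorflow wheel.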

I guess this is not an end-user-friendly way, but the main problem is Nvidia with its wrappers around kernel modules and its license disaster, which prevents distributions from packaging everything needed.

@ghost
Author

ghost commented Dec 9, 2020

Yeah, the registering is weird lol, but it's been this way with Intel too, so I've kinda given up lol.
wrt deb, it's less of an NVIDIA problem, since they develop one driver that gets packaged in a myriad of ways for each Linux package manager etc. For the longest time, the best way to install stuff was to download the .run file/.tar.gz and set up all the paths yourself. Since then there's been a large influx of people using Linux who aren't up for the task of setting up paths, dynamic libraries and symlinks. The solution is either for the people packaging the .deb (from the graphics PPA or the standard repo) to invest more time in testing instead of just packaging the latest Unix driver, OR to just tell users to install the NV driver (PPA or standard repo) and use containers. For most non-gaming tasks, containers end up being easier, since you don't end up messing up your system when you upgrade or when something goes wrong during the installation.
wrt TF: it's my understanding pip doesn't do any compilation, just pulls down the latest wheel from pypi. And yeah, TF installation really turns me off, and is at least 10% of the reason why I prefer PyTorch, which is so much easier (all you need is the NV Unix driver): https://pytorch.org/get-started/locally/

@allo-
Owner

allo- commented Dec 9, 2020

it's less of an NVIDIA problem, since they develop one driver that gets packaged in a myriad of ways for each Linux package manager etc.

If it had a harmless license (even a non-free one), the distributions would package it. See the Nvidia graphics driver, for example. The problem is the parts that may not be redistributed, for example because Nvidia wants to know who is downloading cuDNN.

it's my understanding pip doesn't do any compilation, just pulls down the latest wheel from pypi.

PyPI allows different types of packages, like source tarballs, binary packages, wheels and I think some others. When you install a source package, pip does the compilation (but you probably need to install the "libfoo-dev" packages on your system first). Tensorflow is often easy to install from the "manylinux" wheel, but on the other hand this means, for example, that the build is not optimized for your CPU, i.e., it does not use modern instructions that are unavailable on the older CPUs tensorflow still supports. I think tensorflow (or pytorch?) even prints a warning like "Your cpu supports avx2, but this binary is not built with avx2 enabled".
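To see whether a CPU-optimized build could help at all, one can check which instruction sets the CPU reports. This is a rough sketch for x86 Linux, where `/proc/cpuinfo` has a `flags` line; the helper names `cpu_flags` and `read_cpuinfo` are made up here, and other platforms expose this information differently.

```python
def cpu_flags(cpuinfo_text):
    """Collect the CPU feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Assumption: x86 Linux, where the feature list is on the "flags" line.
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the cpuinfo file, returning an empty string if it is unavailable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    flags = cpu_flags(read_cpuinfo())
    for name in ("avx", "avx2", "fma"):
        print(name, "yes" if name in flags else "no")
```

If `avx2` shows up here but the installed wheel was built without it, that is the situation the warning describes.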

These details are why it would be best to link to a nice howto that is maintained by someone else who updates it as needed. Who knows what the next tensorflow version will need to run smoothly.

@ghost
Author

ghost commented Dec 12, 2020

re driver: yeah, not a big fan of NVIDIA asking questions while downloading stuff, although I'm okay with it since I'm using their bandwidth (drivers and those proprietary blobs aren't small :P)
Interesting, I didn't know that pip could do compilation. Wrt providing a link maintained by someone else for TF, that's why I think we should provide Docker instructions for GPU, since they would work even if there isn't a GPU, and would also use e.g. the avx2 instructions on the CPU if they are supported. The con is that the binary size would be larger, which I think is something we can deal with when we come to it.

@ghost ghost closed this as completed Feb 6, 2021
@allo-
Owner

allo- commented Feb 6, 2021

Let's keep this one open. I will not describe a Docker setup like in #51 in the near future, but a tracking bug for more documentation on CUDA is useful.

@allo- allo- reopened this Feb 6, 2021