RuntimeError: CUDA error: no kernel image is available for execution on the device #6
Comments
What NVIDIA driver are you using? I have a 3090 and recommend the latest cudatoolkit 11.2 / driver 460 - you can check with - pip install --pre torch torchvision torchaudio -f https://download.pytorch.org/whl/nightly/cu110/torch_nightly.html
https://developer.nvidia.com/Cuda-downloads related |
I followed your suggestions and reinstalled everything, but the error persists. |
I used run_docker.sh; maybe try updating Docker? I made a PR that added extra params for the 3090 card, though it worked without them. Do you have 2 GPUs or just 1? Are you sure you're on CUDA 11.2, not 11.1? |
Here is my configuration. I only have 1 gpu. I'll try docker later, but I do hope the issue can be solved in Windows since I've never used docker before. |
Hi, can you post some extra details about your environment? Here's the procedure that the PyTorch bug process requires. Preferably run the script and post the results (as text please, not a screenshot for this one).
Environment: Please copy and paste the output from our environment collection script. You can get the script and run it with:
|
I'd just throw in another hard drive and boot to Ubuntu / pop-os. https://pop.system76.com/ You can get the Linux subsystem to work with Windows, but you could waste days getting it working. I'm using an iMac and connect to an HP workstation running pop-os via RemoteDesktop.google.com |
Many people have been able to get started with StyleGAN2-ADA on native PyTorch on Windows 10 without problems. PyTorch is pretty well packaged, so it's a lot easier than it used to be with SG2. While I prefer Linux myself, I wouldn't give up on Windows 10 just yet. Waiting for @xielongze to post the results of the env collection script. |
Thanks a lot!
Below is the script result.
OS: Microsoft Windows 10 Professional
Python version: 3.7 (64-bit runtime)
Versions of relevant libraries: |
Thanks! Is PyTorch working for you otherwise on the GPU? Try running the below commands in your Python interpreter:
|
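A minimal sketch of the kind of GPU sanity check being asked for here, not the original commands. It assumes the usual CUDA compatibility rule: a binary built for sm_XY runs on devices with the same major version and an equal or higher minor version, while compute_XY PTX can be JIT-compiled for any equal or newer device. The helper name `has_kernel_image` is made up for illustration, and `torch.cuda.get_arch_list()` is only queried if the installed PyTorch provides it. Note that this inspects PyTorch's own binaries only, not the custom extensions that this repository builds with nvcc.

```python
import re


def has_kernel_image(capability, arch_list):
    """Return True if any entry in arch_list can run on the given device.

    capability: (major, minor) tuple, e.g. (8, 6) for an RTX 3090.
    arch_list:  strings such as ['sm_70', 'sm_80', 'compute_80'].
    """
    major, minor = capability
    for arch in arch_list:
        match = re.fullmatch(r"(sm|compute)_(\d+)", arch)
        if match is None:
            continue  # skip entries this sketch does not understand
        kind = match.group(1)
        arch_major, arch_minor = divmod(int(match.group(2)), 10)
        # Binary code: same major version, minor not newer than the device.
        if kind == "sm" and arch_major == major and arch_minor <= minor:
            return True
        # PTX: can be JIT-compiled for any equal or newer device.
        if kind == "compute" and (arch_major, arch_minor) <= (major, minor):
            return True
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this interpreter")
    else:
        print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
        if not torch.cuda.is_available():
            print("CUDA is not available to PyTorch")
        else:
            cap = torch.cuda.get_device_capability(0)
            print(torch.cuda.get_device_name(0), "capability", cap)
            get_arch_list = getattr(torch.cuda, "get_arch_list", None)
            if get_arch_list is not None:
                archs = get_arch_list()
                print("compiled for:", archs)
                print("kernel image available:", has_kernel_image(cap, archs))
            # A tiny kernel launch; this fails if no usable image exists.
            print(torch.ones(2, device="cuda") * 2)
```

If the last line raises the same "no kernel image" error, the PyTorch build itself does not target the GPU; if it succeeds, the problem is more likely in the extension build.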
As far as I can tell it works pretty well, but the output is slightly different from yours. Is there a version incompatibility here?
(style-gan) F:/>python
|
I'm using the Anaconda3 2020.11 version from the Anaconda website and I'm running Linux. I guess that explains the difference. Worth trying out also: delete this directory and its contents and rerun:
|
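The directory named in the comment above was not preserved. As a hedged sketch, assuming it refers to the PyTorch extension build cache (the Windows path is the one quoted later in this thread, the `~/.cache/torch_extensions` Linux location is an assumption, and `TORCH_EXTENSIONS_DIR` is the environment variable PyTorch reads to override the location), the cache could be located and cleared like this. The function names are hypothetical.

```python
import os
import shutil
from pathlib import Path


def candidate_cache_dirs(env=None, home=None):
    """List places where the extension build cache may live."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    dirs = []
    # An explicit override takes priority when it is set.
    if env.get("TORCH_EXTENSIONS_DIR"):
        dirs.append(Path(env["TORCH_EXTENSIONS_DIR"]))
    # Windows location quoted later in this thread.
    dirs.append(home / "AppData" / "Local" / "torch_extensions"
                / "torch_extensions" / "Cache")
    # Assumed default location on Linux.
    dirs.append(home / ".cache" / "torch_extensions")
    return dirs


def clear_cache(path):
    """Delete the directory and its contents; return True if it existed."""
    path = Path(path)
    if path.is_dir():
        shutil.rmtree(path)
        return True
    return False


if __name__ == "__main__":
    # Dry run: only report what exists. Call clear_cache() explicitly
    # on the directory you actually want to remove.
    for d in candidate_cache_dirs():
        print(d, "(exists)" if d.is_dir() else "(missing)")
```

After clearing, the next run rebuilds the plugins from scratch, so stale objects compiled with a different toolkit are no longer picked up.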
Did it but still no luck. |
Can you run Did you configure packages for CUDA using conda or pip? I'm aware of the CUDA toolkit having problems with the 3090 on conda's latest commit. |
Here is the result of pip freeze. I did install CUDA using conda, but I'm not sure if the problem is with conda. |
I don't know if this helps, but when we've been installing PyTorch, we installed it using conda (as per the pytorch.org instructions for CUDA 11.0 and conda). This automatically installs whatever CUDA toolkit package PyTorch needs. But in addition, you need a separate CUDA toolkit installation from NVIDIA's website so that custom extension builds work (they spawn nvcc among other things, and these tools are not installed when you install PyTorch). I think you already have this, just mentioning this for completeness.
I'm running out of things to try. Probably one more thing to try is to do a full reinstall of Python and ensure that you're running what you intended. For us the full Anaconda distribution version 2020.11 has worked well; you need only a couple of extra packages on top of that. But like John says above, some versions apparently have problems with the 3090.
FWIW, I successfully ran StyleGAN2-ADA PyTorch on an RTX 3090 yesterday on Linux and it was working just fine. Also, a colleague at work was running it on Windows with Anaconda using what I believe was pytorch 1.7-cu110 with either a CUDA 11.1 or CUDA 11.2 toolkit installed with NVIDIA's installers.
One more thing: please post the build.ninja file from here |
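Since extension builds spawn nvcc, it can help to confirm which nvcc is first on PATH and which release it reports. The following is a rough sketch under the assumption that `nvcc --version` prints a line containing `release X.Y` (as current toolkits do); the helper names are made up for illustration.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Extract (major, minor) from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def find_nvcc_release():
    """Return (path_to_nvcc, release) for the nvcc found on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None, None
    result = subprocess.run([nvcc, "--version"],
                            capture_output=True, text=True)
    return nvcc, parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    path, release = find_nvcc_release()
    if path is None:
        print("nvcc not found on PATH; custom extension builds will fail")
    else:
        print("nvcc:", path, "release:", release)
```

A release older than 11.1 cannot target sm_86, so an old toolkit that wins on PATH is one plausible way for the extension build to miss the 3090.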
Can you get torch-1.8.0.dev20210129 installed? You're on 1.7. Don't use conda to install pytorch. I had a similar issue; this worked and resolved all issues:
pip uninstall torch
pip install --pre torch torchvision torchaudio -f https://download.pytorch.org/whl/nightly/cu110/torch_nightly.html
Successfully installed torch-1.8.0.dev20210129+cu110 |
By any chance, could you list all of the dependencies of this project, including cuda&cudnn versions, python packages as well as their versions? I'll try to install the same packages and see if it will work. Thanks in advance. |
I don't have that at hand as I work on Linux, but in a nutshell here's what I know has been working (a colleague used this yesterday - but you should double and triple check what John's been saying in this thread too):
Ensure that you don't have other versions of the CUDA toolkit anywhere on your system. You should NOT need to separately install cudnn, as step #2 above will install it with pytorch. I can't give much more precise information as I don't have a Windows machine to debug this on and I cannot reproduce this on my system. BTW: I think you missed my earlier question: "One more thing: please post the build.ninja file from here %USERPROFILE%\AppData\Local\torch_extensions\torch_extensions\Cache. That might have some clues about what tools get picked up in extension builds." |
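To check for leftover CUDA toolkit versions, one option is to list the version folders under the NVIDIA install root. This is only a sketch: the default root below is taken from the include path in the build.ninja posted in this thread and assumes the standard Windows installer layout (`v11.2`, `v11.0`, ...). The function name is hypothetical; `CUDA_PATH` and `CUDA_HOME` are printed because the toolkit installer and PyTorch's extension builder commonly use them.

```python
import os
import re
from pathlib import Path

# Assumed install root, taken from the include path seen in this thread.
DEFAULT_ROOT = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA"


def installed_toolkits(root=DEFAULT_ROOT):
    """Return version folder names such as ['v11.0', 'v11.2'], sorted."""
    root = Path(root)
    if not root.is_dir():
        return []
    versions = [p.name for p in root.iterdir()
                if p.is_dir() and re.fullmatch(r"v\d+\.\d+", p.name)]
    # Sort numerically so that v9.2 comes before v11.0.
    return sorted(versions,
                  key=lambda v: tuple(int(x) for x in v[1:].split(".")))


if __name__ == "__main__":
    found = installed_toolkits()
    print("toolkits found:", found or "none under the default root")
    if len(found) > 1:
        print("more than one toolkit is installed; consider removing extras")
    for name in ("CUDA_PATH", "CUDA_HOME"):
        print(name, "=", os.environ.get(name))
```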
Don't know if it helps, but Jeff Heaton has a Windows guide - https://www.youtube.com/watch?v=BCde68k6KXg |
Sorry I missed that. Below is the file content. I noticed that I did download and install cudnn from Nvidia separately. Is that a problem? I will try to reinstall everything as you suggest next week and see if that helps.
ninja_required_version = 1.3
cflags = -DTORCH_EXTENSION_NAME=bias_act_plugin -DTORCH_API_INCLUDE_EXTENSION_H -IC:\ProgramData\Anaconda3\envs\style-gan\lib\site-packages\torch\include -IC:\ProgramData\Anaconda3\envs\style-gan\lib\site-packages\torch\include\torch\csrc\api\include -IC:\ProgramData\Anaconda3\envs\style-gan\lib\site-packages\torch\include\TH -IC:\ProgramData\Anaconda3\envs\style-gan\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\include" -IC:\ProgramData\Anaconda3\envs\style-gan\Include -D_GLIBCXX_USE_CXX11_ABI=0 /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc
rule compile
rule cuda_compile
rule link
build bias_act.o: compile F$:\python_projects\stylegan2-ada-pytorch-main\torch_utils\ops\bias_act.cpp
build bias_act_plugin.pyd: link bias_act.o bias_act.cuda.o
default bias_act_plugin.pyd |
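A build.ninja like the one above can be scanned for clues about which toolkit and GPU targets the extension build used. The sketch below is illustrative only: it looks for `CUDA\vX.Y` in include paths (as seen in the posted file) and for `code=sm_XX` style targets, assuming the full file's CUDA flags contain `-gencode` options, which the excerpt posted here does not show. Both function names are hypothetical.

```python
import re


def cuda_versions_in_ninja(text):
    """Return the CUDA toolkit versions referenced in include paths."""
    return sorted(set(re.findall(r"CUDA[\\/]v(\d+\.\d+)", text)))


def gencode_targets(text):
    """Return the code= targets (e.g. 'sm_80') found in gencode flags."""
    return sorted(set(re.findall(r"code=((?:sm|compute)_\d+)", text)))


if __name__ == "__main__":
    # One line copied from the file in this thread, used as sample input.
    sample = (r'cflags = "-IC:\Program Files\NVIDIA GPU Computing Toolkit'
              r'\CUDA\v11.2\include" /MD /EHsc')
    print("toolkit versions:", cuda_versions_in_ninja(sample))
    print("gencode targets:", gencode_targets(sample))
```

If the real file lists more than one toolkit version, or no target matching the GPU, that would point at the build environment and not at PyTorch itself.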
Not sure if it's a problem, just mentioning that this should not be necessary with PyTorch. |
Finally, I am able to get the project to work! I still have no idea what went wrong last time, but below is what I did to make it right. Hope it's helpful. First, as @nurpax mentioned, completely remove python/Anaconda and install Anaconda3-2020.11. Make sure none of it remains in PATH when removing. Thank you soooooooo much for your help! |
Hi, how did you solve the problem in the end? I've encountered the same problem. |
Hi, @lff12940 You can try changing your pytorch version using https://pytorch.org/get-started/previous-versions/. It worked for me when I installed pytorch with |
Hi, I am also facing this problem. I use the following:
After I installed all the C++ related packages in VS2019, I installed the CUDA Toolkit.
I even tried CUDA 11.7, 11.8 and 12.0 with VS2019 and VS2022, but the error still exists. Please help me out. |
I'm trying to run the sample code but it raises an error. I'm running on an RTX 3090 with CUDA 11.1 (as the description recommends) and cudnn 8.0.5. The message is attached below.
I'm able to run pytorch with cuda.
Do you have any idea how to solve this problem? Thanks in advance!