Thank god for this, you saved me #95

Closed · arnicas opened this issue Oct 7, 2022 · 3 comments · Fixed by #103

Comments

arnicas commented Oct 7, 2022

I can't figure out how to say thank you except to file this and say you just saved me after 2 days of wasted broken installs and deps on an A100.

pmeier commented Oct 7, 2022

Thank god for this

Well, you could thank me instead 😛 Jokes aside, thanks for posting this. It means a lot. I'm glad that this project helped you out.

If you find the time, could you tell me how your envs were broken before? While working on #73, I figured that detecting broken envs would also be a good addition for this tool. In the future we might even have an ltt fix command to repair the env automatically. But all of this depends on correctly detecting broken envs and the reason why they are broken in the first place.
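
To give you an idea of what I mean, here is a minimal sketch of the kind of check such a detection could start from. This is not ltt's actual logic, just an illustration; the hypothetical ltt fix mentioned above would need something far more thorough:

import shutil

import torch


def env_looks_broken() -> bool:
    # Sketch only, not ltt's real detection logic.
    has_driver = shutil.which("nvidia-smi") is not None  # crude: is a driver present?
    if torch.version.cuda is None:
        # CPU-only wheel installed even though an NVIDIA driver is available
        return has_driver
    # CUDA wheel installed, but the driver cannot actually run it
    return not torch.cuda.is_available()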

Plus, did you encounter any issues with light-the-torch? Was the usage clear just from the README?

arnicas commented Oct 7, 2022

Hah - it was great. My only request would be to add torchvision (and maybe torchaudio) as extra possible installs alongside torch? I held my breath when I added it but it worked ok.

The envs were broken because I didn't know what I was doing on the installation. My nvidia-smi said one thing about CUDA (11.6), my nvcc -V said another, the PyTorch site had no supporting versions in their little "helper", and I couldn't figure out how to get the right +cu suffix on the install. So various libs that need torch tried to install it, in different versions, and actually broke my installs of older versions that I had gotten working with CUDA. So then I also got repeated errors from cuda/torch saying that my driver with sm_80 wasn't supported, etc. Shrug, hard to describe the chaos.

Thanks again!

pmeier commented Oct 7, 2022

My only request would be to add torchvision (and maybe torchaudio) as extra possible installs alongside torch? I held my breath when I added it but it worked ok.

Most PyTorch distributions, including torchvision and torchaudio, are already supported. There is no public list, but I periodically monitor their indices to check whether we are missing anything.

PYTORCH_DISTRIBUTIONS = {
    "torch",
    "torch_model_archiver",
    "torch_tb_profiler",
    "torcharrow",
    "torchaudio",
    "torchcsprng",
    "torchdata",
    "torchdistx",
    "torchserve",
    "torchtext",
    "torchvision",
}
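
For what it's worth, ltt install works like pip install, so a single command such as ltt install torch torchvision torchaudio should resolve matching binaries for all of them at once, assuming a reasonably recent light-the-torch version.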

If something breaks, feel free to reach out.

My nvidia-smi said one thing about CUDA (11.6), my nvcc -V said another

That is indeed very confusing and I fell for it myself in the beginning. nvcc reports the version of the CUDA toolkit you have installed, while nvidia-smi reports the version up to which compiled CUDA code is supported on your machine. So in your case, you can use everything with CUDA<=11.6, which I believe is most of the binaries PyTorch provides at the moment.
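
If you ever want to see both numbers side by side, something along these lines does the trick. This is just a sketch: it assumes nvcc and nvidia-smi are on PATH and that their output looks like the current format:

import re
import subprocess


def _cuda_version(cmd, pattern):
    # Run the command and pull the first "major.minor" match out of its output.
    out = subprocess.run(cmd, capture_output=True, text=True).stdout
    match = re.search(pattern, out)
    return match.group(1) if match else None


# nvcc reports the locally installed CUDA toolkit
toolkit = _cuda_version(["nvcc", "--version"], r"release (\d+\.\d+)")
# nvidia-smi reports the maximum CUDA version the driver supports
driver = _cuda_version(["nvidia-smi"], r"CUDA Version: (\d+\.\d+)")
print(f"toolkit: {toolkit}, driver supports up to: {driver}")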

Plus, unless you actually want to compile CUDA code, e.g. building PyTorch from source, you don't need the CUDA toolkit installed on your system at all. PyTorch ships everything you need at runtime inside the wheels. This is why they are so large. You only need to have the driver installed.
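
You can verify this directly from Python with the attributes torch ships anyway:

import torch

print(torch.__version__)          # e.g. 1.12.1+cu116, the wheel's build tag
print(torch.version.cuda)         # CUDA runtime bundled in the wheel, None for CPU-only wheels
print(torch.cuda.is_available())  # True only if the installed driver can run that runtime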

TL;DR if you ever have to install PyTorch wheels manually again, trust nvidia-smi.
