Slow speed making Casanovo impractical for most shotgun data. #251

mhoopmann · 2023-10-13T00:02:10Z

Casanovo is working great for amino acid sequence identification, but it's going really slowly in my hands. I do have an NVIDIA card, nothing special (Quadro P400), but it takes days to complete the analysis for the number of spectra I can acquire in a single hour. At this rate, I can't keep pace with how fast I am collecting spectra. Is there something I'm doing wrong? Or perhaps a way to improve the algorithm speed? Casanovo performance seems way too slow for practical use, and I really, really want to use this software.

bittremieux · 2023-10-13T08:13:28Z

Running Casanovo on CPU only is unfortunately very slow, and if there's a mismatch in the CUDA version, Casanovo might inadvertently fall back to the CPU. You have an older GPU, so we first need to ensure that it's actually being used.

Can you share the Casanovo log file? That contains some information on whether a GPU was found. Additionally, can you check what the output of watch nvidia-smi is while running Casanovo and what its GPU resource consumption is?
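
For anyone who would rather capture the GPU load from a script than watch it interactively, here is a minimal sketch in Python. It assumes `nvidia-smi` is on the PATH and uses its CSV query output; the helper names `parse_gpu_csv` and `query_gpu_utilization` are made up for this illustration and are not part of Casanovo.

```python
import shutil
import subprocess


def parse_gpu_csv(text):
    """Parse 'name, utilization, memory' CSV lines into a list of dicts."""
    gpus = []
    for line in text.strip().splitlines():
        # Split from the right so a comma inside the GPU name cannot break parsing.
        fields = [field.strip() for field in line.rsplit(",", 2)]
        if len(fields) != 3:
            continue  # skip lines that do not have the expected three columns
        name, utilization, memory = fields
        gpus.append(
            {
                "name": name,
                "utilization_pct": utilization,
                "memory_used_mib": memory,
            }
        )
    return gpus


def query_gpu_utilization():
    """Return one dict per GPU, or None if nvidia-smi cannot be run."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=name,utilization.gpu,memory.used",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    print(query_gpu_utilization())
```

Running this while Casanovo is predicting should show a utilization well above zero if the GPU is actually in use.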

mhoopmann · 2023-10-13T15:57:40Z

Thanks for the quick response, Wout! Here's the log file: https://regis-web.systemsbiology.net/PublicDatasets/mikeh/Casanovo/20210901-HeLa-01.log

Yes, the nvidia-smi output looks very lackluster while running Casanovo (see image below). Any suggestions? Or perhaps some guidelines for minimum hardware requirements? Thanks much!

bittremieux · 2023-10-15T07:58:23Z

Indeed, it doesn't seem like the Casanovo process is running or even registered on the GPU. I suspect that this is because your GPU is a bit older, but the log file unfortunately is not conclusive. We also don't have a similar GPU to test compatibility.

mhoopmann · 2023-10-17T19:46:13Z

Thanks for putting me in the right direction. I'm making progress. For anyone else who might have the same issues, here are the steps:

remove pytorch
update nvidia drivers to the latest
reinstall pytorch using the appropriate command at https://pytorch.org/get-started/locally/

The key for me was actually the removal of pytorch rather than trying to update any of the packages in place. Updating the OS drivers definitely required the removal and reinstallation of pytorch/cuda on my system.

A simple test to see if GPU compute is working is the following python code:

import torch
a=torch.cuda.is_available()
print(f'CUDA: {a}')
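
If the check above prints False, a slightly longer diagnostic can help narrow down why. The sketch below is only an illustration (the function name `describe_cuda` is invented here): it reports the installed PyTorch version, the CUDA version that build was compiled against (None for a CPU-only build), and the names of any visible devices.

```python
def describe_cuda():
    """Collect basic facts about the PyTorch/CUDA setup in a dict."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment at all.
        return {"torch_installed": False}

    info = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version this PyTorch build targets; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "devices": [],
    }
    if info["cuda_available"]:
        for index in range(torch.cuda.device_count()):
            info["devices"].append(torch.cuda.get_device_name(index))
    return info


if __name__ == "__main__":
    for key, value in describe_cuda().items():
        print(f"{key}: {value}")
```

A `built_with_cuda` of None would point to a CPU-only PyTorch install, whereas a CUDA build with `cuda_available` False would point toward the driver side.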

The results: Casanovo is going along faster. nvidia-smi indicates 100% usage, and the temperature is rising. I'll have to find a new GPU to really try to open the throttle. I am still concerned with the following warning:

".conda\envs\casanovo_env\lib\site-packages\pytorch_lightning\trainer\connectors\data_connector.py:224: PossibleUserWarning: The dataloader, predict_dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the num_workers argument (try 12 which is the number of cpus on this machine) in the DataLoader init to improve performance."

If I'm guessing correctly, n_workers is used to set num_workers. It is set to 0, as far as I can tell from the debug log:
"2023-10-17 12:24:41,189 DEBUG [casanovo/MainProcess] casanovo.main : n_workers = 0"

But the code in dataloaders.py (line 74) indicates that the value should be 12:

self.n_workers = n_workers if n_workers is not None else os.cpu_count()

Am I interpreting this correctly? Or should I not even be concerned? Is there a way to set this parameter without changing the code?

bittremieux · 2023-10-18T15:08:10Z

The number of workers is platform-dependent. Specifically, on Windows, only a single worker thread can be used for data loading. This is maybe slightly sub-optimal, but shouldn't make that much difference in the end.
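
To illustrate why the debug log and the quoted line do not contradict each other, here is a minimal sketch. `resolve_n_workers` mirrors the fallback expression quoted from dataloaders.py, while `platform_default_n_workers` is a hypothetical stand-in for a platform-dependent default and should not be read as Casanovo's actual code. The point is that the fallback to `os.cpu_count()` only applies when the value is None; an explicit 0 passes through unchanged, and a PyTorch DataLoader with `num_workers=0` loads data in the main process.

```python
import os
import platform


def resolve_n_workers(n_workers=None):
    """Same expression as the quoted line: only None triggers the fallback."""
    return n_workers if n_workers is not None else os.cpu_count()


def platform_default_n_workers(system=None):
    """Hypothetical platform-dependent default (assumption for illustration)."""
    system = platform.system() if system is None else system
    if system == "Windows":
        return 0  # no extra worker processes; data is loaded in the main process
    return os.cpu_count()


if __name__ == "__main__":
    # An explicit 0 is kept as-is, because 0 is not None.
    print(resolve_n_workers(0))
    # Only a missing value falls back to the CPU count.
    print(resolve_n_workers(None))
    # A Windows-style default of 0 therefore survives the fallback line.
    print(resolve_n_workers(platform_default_n_workers("Windows")))
```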

Great that you managed to get the GPU working. What's the spectrum throughput on that system? When a GPU is available, this might no longer be the bottleneck, even with an older GPU. Other parts of the Casanovo code can be pretty slow, and this is an active action point for us.

mhoopmann · 2023-10-18T22:30:40Z

Thanks Wout, good to know.
I went big for my latest test: 114,000 spectra from a single run. It's still running (been about 27 hours). I suspect it will finish in 36 hours. This is a huge improvement; without the GPU, it would be 10 days. 10 days was not practical, but 36 hours I can work with. A new GPU or two, or renting some cloud GPUs, should allow me to scale up if I can get the budget. Any other speed improvements to the algorithm would be a huge bonus.
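
To put those numbers in terms of spectrum throughput, here is a quick back-of-the-envelope calculation. It uses only the figures quoted above; the 36 hours is the projected finish time and the 10 days is the estimate without the GPU, so the results are rough.

```python
SPECTRA = 114_000        # spectra in the single run
GPU_HOURS = 36           # projected total runtime with the GPU
CPU_HOURS = 10 * 24      # estimated total runtime without the GPU


def spectra_per_second(n_spectra, hours):
    """Average throughput over the whole run."""
    return n_spectra / (hours * 3600)


gpu_rate = spectra_per_second(SPECTRA, GPU_HOURS)
cpu_rate = spectra_per_second(SPECTRA, CPU_HOURS)
speedup = CPU_HOURS / GPU_HOURS

print(f"GPU: {gpu_rate:.2f} spectra/s")  # GPU: 0.88 spectra/s
print(f"CPU: {cpu_rate:.2f} spectra/s")  # CPU: 0.13 spectra/s
print(f"Speedup: {speedup:.1f}x")        # Speedup: 6.7x
```

So the older GPU works out to a bit under one spectrum per second, roughly a 6.7-fold improvement over the CPU-only estimate.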

bittremieux added the "question" label (Further information is requested) Oct 13, 2023

bittremieux mentioned this issue Oct 23, 2023

Document how to check that the GPU is used #256

Merged

melihyilmaz closed this as completed in #256 Oct 24, 2023
