Slow speed making Casanovo impractical for most shotgun data. #251
Comments
Running Casanovo on CPU-only is very slow unfortunately, and if there's a mismatch in the CUDA version Casanovo might inadvertently fall back to that. You have an older GPU, so we first need to ensure that it's actually being used. Can you share the Casanovo log file? That contains some information on whether a GPU was found. Additionally, can you check what the output of |
Thanks for the quick response, Wout! Here's the log file: https://regis-web.systemsbiology.net/PublicDatasets/mikeh/Casanovo/20210901-HeLa-01.log Yes, the nvidia-smi output looks very lackluster while running Casanovo (see image below). Any suggestions? Or perhaps some guidelines for minimum hardware requirements? Thanks much!
Indeed, it doesn't seem like the Casanovo process is running or even registered on the GPU. I suspect that this is because your GPU is a bit older, but the log file unfortunately is not conclusive. We also don't have a similar GPU to test compatibility.
Thanks for putting me in the right direction. I'm making progress. For anyone else who might have the same issues, here are the steps:
The key for me was actually the removal of pytorch rather than trying to update any of the packages in place. Updating the OS drivers definitely required the removal and reinstallation of pytorch/cuda on my system. A simple test to see if GPU compute is working is the following Python code:
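A minimal sketch of such a check, assuming PyTorch is installed in the active environment. The helper name `gpu_status` is hypothetical and chosen for illustration; only standard `torch.cuda` calls are used. It runs a small matrix multiplication on the GPU, because an older card can be detected by PyTorch and still fail when asked to compute.

```python
def gpu_status():
    """Hypothetical helper: describe whether PyTorch can compute on a CUDA GPU."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"

    if not torch.cuda.is_available():
        return "PyTorch is installed, but no usable CUDA GPU was found"

    name = torch.cuda.get_device_name(0)
    try:
        # Run a tiny computation on the GPU to confirm it actually works.
        x = torch.rand(512, 512, device="cuda")
        checksum = (x @ x).sum().item()
    except RuntimeError as err:
        # A GPU can be visible but unsupported by the installed PyTorch build.
        return f"GPU {name} was found, but compute failed: {err}"
    return f"GPU compute OK on {name} (checksum {checksum:.1f})"


print(gpu_status())
```

If this reports that no GPU was found while `nvidia-smi` lists the card, a mismatch between the driver, CUDA, and PyTorch builds is a likely cause.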
The results: Casanovo is going along faster. nvidia-smi indicates 100% usage, and the temperature is rising. I'll have to find a new GPU to really try to open the throttle. I am still concerned about the following warning: ".conda\envs\casanovo_env\lib\site-packages\pytorch_lightning\trainer\connectors\data_connector.py:224: PossibleUserWarning: The dataloader, predict_dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the..." If I'm guessing correctly, n_workers is used to set num_workers. It is set to 0, as far as I can tell from the debug log: But the code for dataloaders.py (line number 74) indicates that the value should be 12:
Am I interpreting this correctly? Or should I not even be concerned? Is there a way to set this parameter without changing the code?
The number of workers is platform-dependent. Specifically, on Windows, only a single worker thread can be used for data loading. This may be slightly suboptimal, but it shouldn't make much difference in the end. Great that you managed to get the GPU working. What's the spectrum throughput on that system? When a GPU is available, data loading might no longer be the bottleneck, even with an older GPU. Other parts of the Casanovo code can be pretty slow, and this is an active action point for us.
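To illustrate the platform dependence, here is a rough sketch of how a worker count could be chosen. This is not Casanovo's actual code: the helper `choose_num_workers` is hypothetical, and returning 0 on Windows (load data in the main process) is an assumption based on the value reported from the debug log in this thread.

```python
import os
import platform


def choose_num_workers(system=None):
    """Hypothetical helper: pick a DataLoader worker count for a platform."""
    # Default to the platform this script is running on.
    system = system or platform.system()
    if system == "Windows":
        # Assumption: load data in the main process, with no extra workers.
        return 0
    # Elsewhere, use one worker per available CPU core (at least one).
    return os.cpu_count() or 1


print(choose_num_workers())
```

On a 12-core machine this would give 12 on Linux and 0 on Windows, which would be consistent with the warning suggesting 12 while the log shows 0.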
Thanks Wout, good to know.
Casanovo is working great for amino acid sequence identification, but it's going really slowly in my hands. I do have an NVIDIA card, nothing special (Quadro P400), but it takes days to complete the analysis for the number of spectra I can acquire in a single hour. At this rate, I can't keep pace with how fast I am collecting spectra. Is there something I'm doing wrong? Or perhaps a way to improve the algorithm speed? Casanovo performance seems way too slow for practical use, and I really, really want to use this software.