Multiprocessing bug when using CPU only #177
Hi Sami, I was able to reproduce the issue if there are fewer spectra than threads when running on CPU-only. Can you try doubling the number of spectra in your file (which will then exceed your 64 cores) to see whether that works?
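The condition described above (fewer spectra than threads) can be checked before starting a run. This is a minimal sketch, assuming each spectrum in the MGF file is opened by a `BEGIN IONS` line, as the MGF format prescribes. The function names `count_mgf_spectra` and `fewer_spectra_than_cores` are hypothetical and are not part of Casanovo.

```python
import os


def count_mgf_spectra(lines):
    """Count spectra in MGF content by counting 'BEGIN IONS' lines."""
    return sum(1 for line in lines if line.strip().upper() == "BEGIN IONS")


def fewer_spectra_than_cores(lines, n_cores=None):
    """Return True if the content holds fewer spectra than there are CPU cores."""
    if n_cores is None:
        # os.cpu_count() can return None, so fall back to a single core.
        n_cores = os.cpu_count() or 1
    return count_mgf_spectra(lines) < n_cores
```

An open file handle can be passed directly as `lines`, for example the handle of `small_archive.mgf`.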
Hi Wout, I increased the number of spectra, but unfortunately I got another crash. The larger archive is available at: https://node.dy.fi/files/medium_archive.mgf BR,
@bittremieux, #176 may fix this issue, but we'll need new weights to know for sure.
I agree. There have been a few CPU-only related issues in the past, which should normally be addressed in the next major release of Casanovo. This bug seems to be a new one though, so we'll have to double-check it when we have the new version.
In fact, when I run with this medium archive I get "Out of memory" errors from the Linux kernel, after which processes are killed. The machine has 228G of RAM and still it is not enough. Is there a way to reduce casanovo's memory consumption? I can see there are lots of threads. Is it possible to reduce the number of threads and, if it is, would it help to reduce memory consumption?
Casanovo by default uses all available cores. Unfortunately I don't have great insights as to memory consumption when running on CPU only with lots of cores, because we haven't really used it like that. I have locally verified that Casanovo uses a few GB / core in such a situation though, so this could indeed be problematic when running on a PC with lots of cores. There currently isn't a direct config option to restrict the number of cores that Casanovo will use, but on Linux you can use taskset to limit it to a subset of cores, for example to run Casanovo on the first 8 cores only.
Note that this can be very slow though. The medium_archive file took 1 hour with 8 CPUs and 14 seconds with a GPU. We are already working on changes to make the number of CPUs configurable within Casanovo itself (#176), and this will be provided in the next Casanovo release.
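As an illustration of restricting a run to the first 8 cores, here is a minimal sketch that builds and launches a taskset-prefixed command from Python. It assumes the util-linux `taskset` tool and its `--cpu-list` option are available. The helper names `taskset_command` and `run_on_first_cores` are hypothetical and are not part of Casanovo.

```python
import subprocess


def taskset_command(n_cores, command):
    """Prefix a command with taskset so it runs on cores 0..n_cores-1 only."""
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    cpu_list = "0" if n_cores == 1 else f"0-{n_cores - 1}"
    return ["taskset", "--cpu-list", cpu_list, *command]


def run_on_first_cores(n_cores, command):
    """Run the command under taskset (Linux only) and return its exit code."""
    return subprocess.run(taskset_command(n_cores, command)).returncode


# Example, reusing the command from the original report (not executed here):
# run_on_first_cores(8, ["casanovo", "--mode=denovo",
#                        "--peak_path=small_archive.mgf",
#                        "--output=casanovo_out.txt"])
```

Because the memory use reported above scales with the number of cores, lowering `n_cores` trades run time for a smaller memory footprint.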
Hi, With taskset the run completed without errors. I ran an mgf with 237828 entries. It seems that casanovo did not find anything. Could it be that 1) the CPU run does not work, 2) the mgf is too large, which prevents identifications for some reason, or 3) casanovo's default model does not cover, for example, E. coli, which is in this mgf?
I assume it took quite a while for that many spectra on only 8 CPUs? The log doesn't indicate that anything is awry. Did you see the timer progress on the console output? Casanovo works independently of the species, so the final option is not relevant. I also don't think that file size should be a problem. There are known issues with running on the CPU that we are in the process of fixing, but those result in Casanovo crashing, not the output being empty. To diagnose the problem, can you check whether the sample MGF file works correctly or not? And please share both the log file and the console output.
I think the timer constantly shows "Predicting: 0it [00:00, ?it/s]". I have now run with the test data. The output file still looks quite empty. The output files can be found at: https://node.dy.fi/files/casanovo/ I managed to get a machine with a GPU, so I will try with that next.
I ran casanovo (with sample_preprocessed_spectra.mgf) on a new machine that has an NVIDIA GA100 graphics card. The casanovo_out.mztab still did not have any identifications. Is there somewhere I can check whether casanovo used the GPU?
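Whether an output file such as casanovo_out.mztab contains any identifications can be checked by counting its PSM rows. This is a minimal sketch, assuming the standard tab-separated mzTab layout in which each identification is a row whose first column is `PSM` (the column header row starts with `PSH`). The function name `count_mztab_psms` is hypothetical.

```python
def count_mztab_psms(lines):
    """Count PSM rows in mzTab content (rows whose first column is 'PSM')."""
    return sum(1 for line in lines if line.startswith("PSM\t"))
```

A result of 0 means the file holds only metadata and header rows, which matches the "empty" output described above.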
If the GPU was utilized correctly, that does at least indicate that the problem is not due to CPU-only, but that there is some other issue. But it is very weird that you're getting empty output on two different systems without doing anything out of the ordinary, and it's unfortunately very hard to diagnose what the problem could be. Normally there should be an indication that Casanovo used the GPU in the log and on the console output. Additionally, while Casanovo is running, you can use
The GPU drivers were not properly installed, so casanovo ran on the CPU. Once the GPU drivers were properly installed, Casanovo ran successfully and returned the results as expected.
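A missing or broken driver installation like the one described above can be detected before starting a run. This is a minimal sketch, assuming an NVIDIA card and that the `nvidia-smi` tool shipped with the driver is expected on the PATH; `nvidia-smi -L` lists the GPUs the driver can see. The function name `gpu_driver_visible` is hypothetical, and a True result only shows that the driver responds, not that Casanovo will use the GPU.

```python
import shutil
import subprocess


def gpu_driver_visible(tool="nvidia-smi"):
    """Return True if the NVIDIA tool is on PATH and lists GPUs successfully."""
    path = shutil.which(tool)
    if path is None:
        # Tool not installed or not on PATH: driver is probably missing.
        return False
    try:
        result = subprocess.run([path, "-L"], capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```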
Thanks for verifying. We've now merged updates that should hopefully fix previous CPU issues, which will be included in a new release soon.
Hi,
I was trying casanovo with an mgf file. The machine has no GPU.
I am running casanovo with the following command:
casanovo --mode=denovo --peak_path=small_archive.mgf --output=casanovo_out.txt
The small_archive file is available at: https://node.dy.fi/files/small_archive.mgf
I wonder why this crash happens. Are there some configuration parameters I could try?
BR,
Sami