"WARNING: At least one BLAST run failed. ANIb may fail" and "The condensed distance matrix must contain only finite values" #267
Hi @genomesandMGEs - thank you for your interest in pyani. I expect the issue is as the warning says: a BLAST run failed. Failure here includes writing no output, which can happen when there is no identifiable homology between two genomes.

In your position, I would identify the pair(s) of genomes with no identifiable homology and modify the input dataset accordingly. There are several methods for achieving this, including running a fast k-mer-based comparison tool over the dataset first.

I hope this is helpful to you. L.
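For instance, here is a minimal sketch of one way to find such pairs after a run (the ANIb_percentage_identity.tab filename is an assumption based on pyani v0.2's usual output names; adjust it to whatever your output directory contains). Any NaN cell in the identity matrix marks a comparison for which BLAST wrote no usable output:

    import pandas as pd

    # Load the ANIb identity matrix written by average_nucleotide_identity.py
    # (file name assumed; check your -o output directory).
    dfr = pd.read_csv("ANIb_output/ANIb_percentage_identity.tab",
                      sep="\t", index_col=0)

    # NaN cells correspond to comparisons that produced no BLAST output.
    for row, col in zip(*dfr.isna().to_numpy().nonzero()):
        print(dfr.index[row], dfr.columns[col])

Removing (or re-running) the genomes printed here is one way to get past the "finite values" error in the clustering step.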
Hi @widdowquinn, thanks for the quick reply. I find it hard to believe that some pairs of genomes have no identifiable homology, since all these genomes belong to the same species, and I used a filter to only include genomes with a maximum distance to the reference of 0.05 (~95% ANI). Is it possible that ANIb can't handle such a large dataset? I'm trying to run ANIm now to see if it works.
I have run pyani on similarly-sized datasets, so I would not expect this to be the issue. When you determine genome distance, do you take the coverage/alignment fraction into account? (It's possible to get falsely high identities because a very small region of genome is being aligned, for instance.) It's worth keeping in mind also that existing species assignments can be inaccurate. If you can identify the specific BLAST output that gives the error, that may help the diagnosis.
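To illustrate the coverage point, here is a small sketch (a hypothetical helper, not part of pyani) that estimates how much of a query genome a BLAST tabular (-outfmt 6) result actually covers, alongside its mean identity:

    def aligned_stats(blast_tab, query_length):
        """Return (aligned_fraction, mean_identity) from a BLAST -outfmt 6
        file. Overlapping HSPs are not merged, so the fraction is an upper
        bound; good enough to flag near-empty alignments."""
        aligned, identities = 0, []
        with open(blast_tab) as fh:
            for line in fh:
                fields = line.rstrip("\n").split("\t")
                identities.append(float(fields[2]))  # column 3: % identity
                aligned += int(fields[3])            # column 4: alignment length
        if not identities:
            return 0.0, float("nan")                 # no hits at all
        return aligned / query_length, sum(identities) / len(identities)

A pair reporting ~99% identity but only a few percent aligned fraction is exactly the falsely-high-identity situation described above.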
That's a good point - I used PanACoTA to download the genomes of interest and filter by the 0.05 max distance, but there's no mention of a coverage/alignment fraction cut-off. I just ran bindash on my 2009 genomes; here's the lowest hit after sorting by column 5, which represents shared k-mers/total k-mers:

PSAE_0321_00764.fasta PSAE_0321_00952.fasta 2.8550e-02 0.0000e+00 0.378418

So, the genome pair with the smallest fraction of shared k-mers is at ~38%. Do you think this will be problematic for pyani's ANIb?
You'll have a better idea of that if you, for instance, put those genomes into a folder with your reference and run ANIb, then inspect the BLAST output that gets written.

If BLAST falls over for whatever reason and doesn't produce an output, this will be problematic for ANIb.
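As a quick check for comparisons that produced no BLAST output at all (the blastn_output directory name and .blast_tab suffix here are assumptions based on pyani v0.2's ANIb layout; adjust to what you actually see on disk):

    from pathlib import Path

    # Zero-length BLAST tables mean no hits were written for that pair.
    for tab in Path("ANIb_output/blastn_output").glob("*.blast_tab"):
        if tab.stat().st_size == 0:
            print(f"No BLAST output for: {tab.name}")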
FWIW, I try not to encourage ANIb - the ANIm algorithm is more robust and stable, in part because it doesn't require arbitrary fragmentation of genomes, or for properties of the alignments of those specific (yet arbitrary) fragments to be met.
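For context, ANIb-style analysis first chops each genome into fixed-size fragments (1020 nt by default in pyani, following the published ANIb method) and uses those as BLAST queries. A minimal sketch of that fragmentation step (a hypothetical helper for illustration, not pyani's actual code):

    def fragment_sequence(seq, size=1020):
        """Split a genome sequence into consecutive fixed-size fragments;
        the final fragment may be shorter than `size`."""
        return [seq[i:i + size] for i in range(0, len(seq), size)]

It is the properties of alignments of these arbitrary fragments that ANIb depends on, and that ANIm avoids.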
Just a heads up - I just ran ANIm and it also failed; many worker processes died with broken pipes. The traceback repeats itself many times; condensed, it comes down to:

Process ForkPoolWorker-20:
Traceback (most recent call last):
  File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/multiprocessing/process.py", line 108, in run
  File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/multiprocessing/pool.py", line 136, in worker
  File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/multiprocessing/queues.py", line 378, in put
  File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/multiprocessing/connection.py", line 205, in send_bytes
  File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
  File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/multiprocessing/connection.py", line 373, in _send
BrokenPipeError: [Errno 32] Broken pipe
Can you get ANIm to run on a smaller dataset? And do you have enough disk space where the output is being written?
Yes, I tried ANIm just now on a much smaller dataset (n=20), and it worked perfectly. I'm running this command as a batch process on a server, so I guess disk space won't be a problem. Or do you think I should contact IT support and check if there's some kind of limitation?
Looking at your output, it appears you're using SLURM. There may be some issues with how pyani's use of Python multiprocessing interacts with SLURM's resource limits.
Thank you for the tip. I contacted IT support and they told me that, from the Slurm accounting, it seems not enough RAM was available to run this job. So, I extended it to a max of 1400GB and ran ANIm for 5 days as a batch process, but the job didn't finish. You said you've run similar-sized datasets with this tool before; can you please let me know how long they took to complete?
For a set of 1680 genomes, the run came out at around 7d 3.5h.
Thanks for sharing this; I'll try to run again with a longer duration.
So, the job failed after 7 days running on the cluster (a bigmem node, ~1400GB, 32 CPUs), with an out-of-memory error. Do you have any suggestions for how I should proceed?
Thanks in advance for your time!
Which stage was pyani at when it failed?

If it's falling over during the BLAST runs, there's not much we can do to make BLAST more efficient. If it's falling over after the comparison runs are finished, then it may be that the internal data structures holding the compiled outputs are too large to be processed. This compilation step takes place on the node you run pyani from.

The error message you receive would be helpful for diagnosis, if you still have it.

(BTW, we're working to avoid that issue for larger datasets in v0.3 by moving towards asynchronous update of a local database as each comparison ends - we know the current design has this kind of problem, which is a bottleneck for time as well.)
Thanks (again) for the quick and detailed reply. Here's the error output:

Please let me know how I should proceed, and I can then discuss this with IT support.
That looks like it's falling over during the comparison runs, and it looks like a SLURM weirdness to me. I must confess I don't understand it properly yet - and my SysAdmin friends tell me that local configurations do, apparently, make a difference. I've had similar troubles on SLURM before.

If it helps your IT guys diagnose, then what happens is:

1. pyani uses Python's multiprocessing module to run all the pairwise comparisons (nucmer jobs, for ANIm) as subprocesses on the node it was started on.
2. once all the comparisons are complete, the outputs are compiled into matrices on that same node, and the plots are drawn.
I think your runs are failing somewhere in part 1, but I don't know why, I'm afraid. We're looking at a PR for v0.3 that specifically integrates with SLURM, but it's not on the main branch yet. Fingers crossed that might solve your issue when it's ready. I'm sorry you're having these troubles right now, though.
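As a schematic of part 1 (an illustration of the pattern only, not pyani's actual code): a Python multiprocessing pool runs every worker on the node where the parent script started, so SLURM has to grant all the CPUs and RAM on that one node.

    import multiprocessing

    def run_comparison(pair):
        # placeholder for one pairwise nucmer/BLAST comparison
        return pair

    if __name__ == "__main__":
        pairs = [(a, b) for a in range(100) for b in range(100) if a < b]
        # Pool workers are local processes; nothing is farmed out to
        # other nodes in the cluster.
        with multiprocessing.Pool() as pool:
            results = pool.map(run_comparison, pairs)
        print(f"{len(results)} comparisons, all run on this single node")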
I agree it seems like the issue is on the SLURM side of things, probably still to do with the resources being requested versus what is actually needed. Have you tried this?

$ sacct --jobs=your_job-id

I am more familiar with SGE schedulers, but this should output a bunch of information about the job, and if it's similar to SGE's qacct output, it should be useful.

If you post it here, maybe we'll spot something useful; but if you talk to IT, you should definitely share the output from that.
Good idea re: sacct. I'd suggest:

sacct -j <jobid> --units=G --format JobID,MaxVMSize,MaxRSS,NodeList,AllocCPUS,TotalCPU,State,Start,End

where <jobid> is your SLURM job ID.

One of my colleagues had a weird issue recently where they requested 16GB per task and were somehow allocated 500GB. There was a discussion that followed about how to assign memory with SLURM jobs, and some advice came from one of the sysadmins.
I don't know enough about SLURM to advise on good settings here, though.
Thank you both for sharing your thoughts. I had a talk with IT support, and they said this confirms that the calculation is using an extreme amount of memory and was killed by Slurm due to running out of memory. According to them, the real question is why it is using so much memory, and whether this is reasonable even with 1300GB+. They argue the problem is independent of Slurm. They asked me to share the output from the top command and the figure they produced in the PDF file: output_top.txt

Also, they warned me about one process in the list below:

3475335 szrzs212  20   0  510.6g 505.3g  23012 S  0.0 33.5  49:42.26 average_nucleot

This process is continuously accumulating main memory. Maybe that is some kind of bug, or this process is not correctly handled by pyani? Thanks again!
There are a couple of things to comment on here, I think.

The first is that we don't currently support parallelisation with SLURM directly in pyani. What really needs to happen is that each individual pairwise comparison is carried out as a single job in an array of jobs (this is how the SGE scheduler support currently works). I would expect that to keep memory requirements local to each comparison. Because of the lack of SLURM support in pyani v0.2, all the comparisons are instead being managed by Python's multiprocessing module on a single node.

It looks like that task of managing all the necessary comparisons with multiprocessing is what is continuously accumulating memory - that would be the average_nucleotide_identity.py process your sysadmins flagged.

However, we do have a PR (#236) for SLURM support that is not yet tested or integrated into the main codebase. This should enable array jobs using SLURM in the same way we currently do for SGE, and I would expect that to solve your problem. Using this would mean moving up to v0.3, but that version is not yet available through conda.

I appreciate this isn't the positive answer you're looking for to move your analysis on quickly, but I think it's where we are just now. I want to get SLURM support going in the main code quite soon, so hopefully you won't need to wait too long. Many apologies, L.
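To illustrate the job-array idea (a hypothetical sketch; the file layout and nucmer invocation are assumptions for illustration, not pyani's implementation): enumerate the pairwise comparisons into a job list, one command per line, so that scheduler array task N runs line N independently, with only that single comparison's memory footprint.

    from itertools import combinations
    from pathlib import Path

    # One nucmer command per pairwise comparison; an array job of size
    # len(pairs) then runs each line as its own scheduler task.
    genomes = sorted(Path("genomes").glob("*.fasta"))
    with open("joblist.txt", "w") as fh:
        for a, b in combinations(genomes, 2):
            fh.write(f"nucmer --mum -p {a.stem}_vs_{b.stem} {a} {b}\n")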
Hey,
I got an error while running ANIb on a large dataset (>2k bacterial genomes).
I installed pyani via conda (version 0.2.10) and ran this command:
average_nucleotide_identity.py -i ./ -o ANIb_output -g --gformat svg,png -m ANIb --gmethod seaborn
This is the error I got:
WARNING: At least one BLAST run failed. ANIb may fail.
/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/site-packages/seaborn/matrix.py:649: UserWarning: Clustering large matrix with scipy. Installing fastcluster may give better performance.
  warnings.warn(msg)
Traceback (most recent call last):
File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/bin/average_nucleotide_identity.py", line 977, in
draw(methods[args.method][1], gfmt)
File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/bin/average_nucleotide_identity.py", line 809, in draw
pyani_graphics.heatmap_seaborn(
File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/site-packages/pyani/pyani_graphics.py", line 200, in heatmap_seaborn
fig = get_seaborn_clustermap(dfr, params, title=title)
File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/site-packages/pyani/pyani_graphics.py", line 144, in get_seaborn_clustermap
fig = sns.clustermap(
File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/site-packages/seaborn/_decorators.py", line 46, in inner_f
return f(**kwargs)
File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/site-packages/seaborn/matrix.py", line 1408, in clustermap
return plotter.plot(metric=metric, method=method,
File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/site-packages/seaborn/matrix.py", line 1221, in plot
self.plot_dendrograms(row_cluster, col_cluster, metric, method,
File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/site-packages/seaborn/matrix.py", line 1066, in plot_dendrograms
self.dendrogram_row = dendrogram(
File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/site-packages/seaborn/_decorators.py", line 46, in inner_f
return f(**kwargs)
File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/site-packages/seaborn/matrix.py", line 774, in dendrogram
plotter = _DendrogramPlotter(data, linkage=linkage, axis=axis,
File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/site-packages/seaborn/matrix.py", line 584, in init
self.linkage = self.calculated_linkage
File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/site-packages/seaborn/matrix.py", line 651, in calculated_linkage
return self._calculate_linkage_scipy()
File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/site-packages/seaborn/matrix.py", line 619, in _calculate_linkage_scipy
linkage = hierarchy.linkage(self.array, method=self.method,
File "/gxfs_home/cau/sunzm592/anaconda3/envs/pyani/lib/python3.9/site-packages/scipy/cluster/hierarchy.py", line 1065, in linkage
raise ValueError("The condensed distance matrix must contain only "
ValueError: The condensed distance matrix must contain only finite values.
Thanks!