[Bug] DAS_Tool starts but fails after calculating contig lengths (binning-prokaryotic.py) #45
Comments
I can take a look at this tomorrow. Can you try with this sample: https://zenodo.org/records/10094990 You'll need: https://zenodo.org/records/10094990/files/S1_scaffolds.fasta.gz?download=1 Is it possible for you to upload the scaffolds and mapped.sorted.bam so I can try them out tomorrow (maybe by e-mail instead of posting a link here)? Can you try the following?
|
I'll upload the data to a drop box and send you a link; it's too big for email. Maybe it's not needed, though, as I can reproduce the error with the S1 dataset, as below.

dastool --bins ${S2B_ARRAY[0]} --contigs ../veba_output/binning/viral/58348/output/unbinned.fasta --outputbasename ../veba_output/binning/prokaryotic_test_HPC/58348/intermediate/7__dastool/_ --labels ${S2B_ARRAY[1]} --search_engine diamond --score_threshold 0.1 --write_bins 1 --create_plots 0 --threads 32 --proteins ../veba_output/binning/prokaryotic_test/58348/intermediate/2__pyrodigal/gene_models.faa --debug

This completed as expected: "successfully finished".

I tried running only the DAS_Tool command as above from the veba conda env (I was struggling to do it interactively in the container and ran out of time; I can do this if you want me to, but the error seems pretty "stable"), but it failed as before. It appears to have something to do with the DAS_Tool setup in the container/env rather than DAS_Tool per se, since it completed with my system-installed DAS_Tool.

I then noticed that you're using DAS_Tool 1.1.2, whereas my system used 1.1.3. I re-ran the above using 1.1.2 and was able to replicate the error on my system (using DAS_Tool independently of veba). Maybe it's something to do with the version? Maybe updating the container to use the later version would fix the issue (although that wouldn't explain why it's occurring).
Let me know what else to do on my end, thanks for the help! |
This is very helpful, thank you. Trying to walk through this to diagnose: Here you're using 1.2.0 and here you're using 1.4.1 (https://hub.docker.com/r/jolespin/veba_binning-prokaryotic/tags):
I'll try with the singularity image. Can you share the command you used to build the singularity image from the docker image? |
Hmm...so I just did a fresh install for this module using the following yml (https://github.com/jolespin/veba/blob/main/install/environments/VEBA-binning-prokaryotic_env.yml):
I installed the module like this:
Here's the test files:
I just ran it to completion using 1 thread and 20GB of memory:
I'm running it with Docker on my MacOS system right now:
It's working right now but I'll update once it's done running locally (takes a little bit longer) but I confirmed it got past the DAS Tool step successfully. Note there aren't any changes in the binning-prokaryotic module between v1.4.1 -> v1.4.2 (which is why I didn't update the docker container for it): 8502b7a One bit I can try is to slim down the prokaryotic binning module to allow for an updated |
"Here you're using 1.2.0"

I only used 1.2.0 to run

I want to run the container versions if possible; it's much easier for me on the HPC here than setting up local conda envs. I keep running into conflicts with all the conda envs and installs on the HPC, and the containers are much cleaner. I need to run them with Singularity here, as our HPC doesn't support Docker.

To build the singularity containers:

The containers for the previous modules worked as expected (preprocess, assemble, virbin), all built and deployed the same way. |
Sorry, the DAS_Tool version above should be 1.1.3, not 1.3.0! |
Yeah, neither does ours. I typically test locally with Docker. Unfortunately, I haven't been able to get singularity installed and running correctly on our HPC. Do you install singularity with conda/mamba? Also, how can you specify the volume mount points with singularity? I can try getting a working walkthrough for you. |
In the above examples I'm running singularity as an HPC module (as in "module load singularity"), so it's installed by the sys admin, not by me in this case. I've previously installed it from source as per https://docs.sylabs.io/guides/3.0/user-guide/installation.html, but I prefer to use the module version. I haven't tried running your container interactively, only via "singularity run" as above. I think the bind/mount points are set and allowed by your admin in this case. You can use "singularity shell" to use the container interactively, and you can specify the mount points as described here: https://docs.sylabs.io/guides/3.0/user-guide/bind_paths_and_mounts.html. I'll take a look at running it interactively later today. Is there a specific thing you're looking for if I do? |
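For reference, a minimal sketch of specifying a mount point with Singularity's `--bind` option (format `host_path:container_path`). The host and container directories below are hypothetical placeholders, not paths that VEBA requires, and the script only prints the command unless `RUN=1` is set.

```shell
#!/bin/sh
# Image name taken from the issue; HOST_DIR and CONTAINER_DIR are
# hypothetical and should be adjusted to your own layout.
IMAGE="veba_binning-prokaryotic_1.4.1.sif"
HOST_DIR="${HOST_DIR:-$PWD/veba_output}"
CONTAINER_DIR="${CONTAINER_DIR:-/mnt/veba_output}"

# Build the "source:destination" specification expected by --bind.
bind_spec() {
  printf '%s:%s' "$1" "$2"
}

BIND="$(bind_spec "$HOST_DIR" "$CONTAINER_DIR")"

if [ "${RUN:-0}" = "1" ]; then
  # Opens an interactive shell inside the container with HOST_DIR mounted.
  singularity shell --bind "$BIND" "$IMAGE"
else
  # Dry run: show the command that would be executed.
  echo "singularity shell --bind $BIND $IMAGE"
fi
```

The same `--bind` option can be passed to `singularity run` or `singularity exec` when the container is not used interactively.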
Ok, I believe our HPC has the same usage for singularity: https://www.sdsc.edu/support/user_guides/expanse.html#modules I'm pushing a new update on Feb 1st, so I'll give this a try when I do. Apologies for the inconvenience. These containers are supposed to solve these types of issues. |
Any progress here? I downloaded the container for 1.5.0 (veba_binning-prokaryotic_1.5.0) and ran it, but encountered the same error at the DAS_Tool step. Here are the last few lines of the DAS_Tool log, indicating the same issue:

processing query block 1, reference block 1/1, shape 2/2, index chunk 4/4. |
Looking into this right now. I did some setup changes on the repository so I'll double check that the docker container is working as expected. Here's a test to pull and check for DAS_Tool:
I just ran the new container on my local machine and it worked:
....
Can you try running the Docker container on your local machine? The binning is pretty low resource so it shouldn't be too hard on your system. Tomorrow I'll try to contact our server company about using singularity so I can test. |
I'm trying this right now but having issues with singularity loading the correct PATH within the container. Hopefully I can get some help on this:

Apologies for the delay. I'll try to get this resolved ASAP. In the meantime, I have a workaround that just worked for me:
...
I built the singularity container like this:
|
Hi Josh, in case it's useful, I simply ran the workflow on the HPC via the SLURM scheduler as below. Thanks for updating, it seems to have fixed the issue! As an aside (maybe I should open a new issue for it?), I did run into an error with CheckM2 not accepting paths longer than some number of characters (OSError: AF_UNIX path too long). Simply shortening the paths removed the error. I don't think it's a problem when using the default path names from the examples, but it could be when testing things with longer output paths (as in my case).
|
This is EXTREMELY useful information. Thank you! I'm still learning about singularity and have never heard of apptainer but this seems like a much more straightforward implementation.
I've encountered this issue too. One workaround I've used is specifying the temporary directory:
Give this a try here:
Can you create a new issue for this? I feel like other people have encountered this before too (I certainly have). |
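As a rough sketch of the temporary-directory idea (not the maintainer's exact command, which isn't shown here): on Linux the `sun_path` field of a UNIX-domain socket address is 108 bytes, which leaves 107 bytes for the path plus a terminating NUL. The sketch assumes the failing socket is created under the directory named by `TMPDIR`; `SHORT_TMP` and the candidate socket name are hypothetical.

```shell
#!/bin/sh
# Usable path length for a UNIX-domain socket on Linux
# (sun_path is 108 bytes, one of which holds the terminating NUL).
MAX_SOCKET_PATH=107

# Succeeds (exit status 0) if the given path fits in sun_path.
path_fits_unix_socket() {
  # Measure in the C locale so the length is counted in bytes.
  LC_ALL=C
  [ "${#1}" -le "$MAX_SOCKET_PATH" ]
}

# Hypothetical short temporary directory.
SHORT_TMP="${SHORT_TMP:-/tmp/veba_tmp}"

# Hypothetical socket name, only used to estimate the final length.
candidate="${SHORT_TMP}/pymp-XXXXXXXX/listener-XXXXXXXX"

if (path_fits_unix_socket "$candidate"); then
  mkdir -p "$SHORT_TMP"
  export TMPDIR="$SHORT_TMP"
  echo "TMPDIR set to ${TMPDIR}"
else
  echo "Path too long for a UNIX socket: ${candidate}" >&2
fi
```

The function is called in a subshell so that the `LC_ALL` assignment does not leak into the rest of the session.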
Describe the bug:
binning-prokaryotic fails at the 7__dastool step. DAS_Tool appears to start but fails after calculating contig lengths.
Versions
veba_binning-prokaryotic_1.4.1.sif and the equivalent conda install/env
Command used to produce error:
When running veba_binning-prokaryotic using the container veba_binning-prokaryotic_1.4.1.sif, dastools (step 7) does not complete, causing the workflow to fail.
I'm setting all of the inputs as per the docs:
export VEBA_DATABASE=/scratch3/bis068/veba/db
N_JOBS=32
N_ITER=1 #this is set to 1 to make the error show faster, set as 10 usually as per docs
ID=548348
OUT_DIR=veba_output/binning/prokaryotic/
FASTA=veba_output/binning/viral/${ID}/output/unbinned.fasta
BAM=veba_output/assembly/${ID}/output/mapped.sorted.bam
and then the command used to run the workflow module was:
singularity run veba_binning-prokaryotic_1.4.1.sif binning-prokaryotic.py -f ${FASTA} -b ${BAM} -n ${ID} -p ${N_JOBS} -o ${OUT_DIR} -m 1500 -I ${N_ITER} --skip_maxbin2
It's kind of weird: it looks like DAS_Tool starts running and then for some reason stops. I had the same issue when running via the conda environments. I switched to the containers because conda envs become complicated on our HPC, and I thought this might solve the problem (or really just avoid it), but it didn't.
The previous steps all seem to run without problems using the containers (preprocess, assembly, bin-viral).
Log files
Return codes for all prior steps (1__ to 6__) are "0".
7__dastool.e.txt
7__dastool.o.txt
7__dastool.returncode.txt
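A small sketch for locating the failing step from the return-code files. It assumes each `*.returncode.txt` file contains only the numeric exit status and that all of them sit in one log directory; the default `LOG_DIR` below is a hypothetical location.

```shell
#!/bin/sh
# Hypothetical log directory; point this at the directory that holds
# files such as 7__dastool.returncode.txt.
LOG_DIR="${LOG_DIR:-veba_output/binning/prokaryotic/log}"

# Print the name of the first step (in lexical order of file names)
# whose return-code file does not contain "0".
# Note: lexical order places a 10__ step before a 2__ step.
first_failed_step() {
  for f in "$1"/*.returncode.txt; do
    [ -e "$f" ] || continue                 # glob matched nothing
    code="$(tr -d '[:space:]' < "$f")"      # strip newline/whitespace
    if [ "$code" != "0" ]; then
      basename "$f" .returncode.txt
      return 0
    fi
  done
  return 1
}

first_failed_step "$LOG_DIR" || echo "no failed step found in $LOG_DIR"
```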