SPAdes assembler crashed due to odd read correction #188
Comments
Hi, am I missing something, or might there be some internal issue with unpaired short-read data provided via
So, this works:
$ unicycler -1 1.fastq.gz -2 2.fastq.gz -o . --verbosity 3 --keep 3 -t 32
but this does not:
$ unicycler -1 1.fastq.gz -2 2.fastq.gz -s se.fastq.gz -o . --verbosity 3 --keep 3 -t 32
I don't think Unicycler supports multiple short-read libraries: #64
Thanks @dswan for the hint. Nevertheless, I think the discussion in #64 is a little different, as it is about multiple distinct short-read libraries. My question / bug report deals with a single short-read library. I used https://github.com/OpenGene/fastp to QC and stitch the PE reads. Thus, I get PE, merged/stitched and remaining SE reads from a single short-read library. As Unicycler offers parameters for both PE and SE short reads, it would be very beneficial to accept them and forward them to SPAdes, as newer SPAdes versions accept exactly this kind of short-read file set.
I'm curious as to why you're not just supplying the paired-end library. What's the utility in merging a paired-end library and treating the merged pairs as a single-end library, while retaining the pairs that don't stitch? More than anything, I'm intrigued whether you have a specific use case where this is beneficial!
It's because you get slightly longer reads and can thus use larger k-mers, so SPAdes is potentially able to compute better assemblies. The new SPAdes version (>3.12.0) provides extra parameters for supplying these files (-1/-2, -s, --merged).
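The file set described here (paired reads, merged pairs and leftover single reads from one library) maps onto a single SPAdes call. Below is a minimal sketch of how such a command could be put together. The file names and the helper `build_spades_cmd` are hypothetical, it assumes `spades.py` is on the PATH, and it only builds and prints the command without running SPAdes. Note that the merged-read option is spelled `--merged` in SPAdes.

```python
import shlex


def build_spades_cmd(forward, reverse, out_dir, merged=None, single=None, threads=1):
    """Build a spades.py argument list for one short-read library.

    forward/reverse: paired-end FASTQ files (-1 / -2)
    merged:          optional file of merged (stitched) pairs (--merged)
    single:          optional file of unpaired reads (-s)
    """
    cmd = ["spades.py", "-1", forward, "-2", reverse]
    if merged:
        cmd += ["--merged", merged]
    if single:
        cmd += ["-s", single]
    cmd += ["-o", out_dir, "-t", str(threads)]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, e.g. as produced by a QC/merging step.
    cmd = build_spades_cmd(
        "1.fastq.gz", "2.fastq.gz", "spades_out",
        merged="merged.fastq.gz", single="se.fastq.gz", threads=32,
    )
    # Print a copy-pasteable shell command instead of executing it.
    print(shlex.join(cmd))
```

Building the argument list separately from executing it makes it easy to inspect exactly which read files would be passed on before starting a long assembly.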
Thanks, I've seen this strategy used in eukaryotic genome assembly, but only when supplemented with LMP libraries or other positional information. I hadn't seen that this was part of the latest SPAdes release, though!
Does anyone else (@rrwick) have an idea?
Oliver shared his reads with me (thanks!) so I could try to reproduce this issue, but I failed to do so. That is, when I assemble the same reads on my computer, I get the proper result:
So I think this might be a weird platform-specific SPAdes-related bug, and I don't think I can solve it. A decent workaround is to just turn off read correction by using Unicycler's
Sorry for the lack of a solid resolution, but since I think there's nothing to be done on Unicycler's end, I'm going to close this issue now!
Thanks @rrwick for the deep dive into this. As the error seems to be unreproducible but persists at our site, I'll just use the
Hi and thanks a lot for this great tool!
I use Unicycler a lot, and so far it has almost always done a great job.
I recently QCed and assembled SRR1609861 (Illumina PE only) with fastp and Unicycler, and for some reason Unicycler crashes in the first k-mer (K27) assembly iteration, right after the read error correction step:
The SPAdes log says:
The Unicycler command:
$ unicycler -1 1.fastq.gz -2 2.fastq.gz -s se.fastq.gz -o . --verbosity 3 --keep 3 -t 32
se.fastq.gz contains unpaired reads that survived the QC but lack a valid mate.
Indeed, the Unicycler-internal SPAdes-corrected read files are erroneous, as the forward file contains only a fraction of the actual reads:
The fastp PE output seems OK and has exactly the same number of reads:
Interestingly, a normal SPAdes assembly with the same SPAdes version (3.13.0) and internal read error correction enabled finished without any problems.
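The read-count comparison described above (forward vs. reverse file, before and after correction) can be reproduced with a few lines of Python. This is only a sketch: it assumes plain four-line FASTQ records (no wrapped sequence lines), and the helper names `count_fastq_reads` and `pairs_are_consistent` are made up for illustration.

```python
import gzip


def count_fastq_reads(path):
    """Count reads in a FASTQ file, assuming exactly four lines per record.

    Files ending in .gz are read through gzip; anything else as plain text.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        # A truncated or malformed file shows up as an incomplete record.
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def pairs_are_consistent(forward, reverse):
    """Return True if both files of a pair hold the same number of reads."""
    return count_fastq_reads(forward) == count_fastq_reads(reverse)
```

Calling `pairs_are_consistent("1.fastq.gz", "2.fastq.gz")` on the fastp output and again on the corrected read files kept by `--keep 3` would show at which step the forward and reverse counts start to diverge.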
I'm running the latest Unicycler (v0.4.7) on a native Ubuntu machine with 64 cores (HT), 256 GB of memory and local storage, so no VM issues should be involved here. I could also reproduce this issue on a different machine.
I'm equally puzzled and curious to know what exactly causes this crash. Any help is very much appreciated! Please let me know if you need anything else.
Best regards!