Passing multiple PE libraries into Unicycler #64

Closed
ncimb opened this issue Oct 24, 2017 · 7 comments
Comments

@ncimb

ncimb commented Oct 24, 2017

SPAdes supports the passing in of multiple PE and MP files:

spades.py --pe1-1 lib1_forward_1.fastq --pe1-2 lib1_reverse_1.fastq \
    --pe1-1 lib1_forward_2.fastq --pe1-2 lib1_reverse_2.fastq \
    -o spades_output

I'd like to do the same with Unicycler, is there something I'm missing to be able to do this?

@rrwick
Owner

rrwick commented Oct 25, 2017

No, you're not missing anything - Unicycler just can't do that. It's a common feature request, but I haven't gotten around to it yet 😄

If your two libraries are quite similar (i.e. about the same read length and insert size), then you should be able to just cat your read files together and give them to Unicycler as a 'single' library:

cat lib1_forward_1.fastq lib1_forward_2.fastq > reads_1.fastq
cat lib1_reverse_1.fastq lib1_reverse_2.fastq > reads_2.fastq

However, you said "PE and MP", so is one of your inputs a long insert mate pair library? Unicycler doesn't support those, so it probably isn't a good assembler choice for mate pair datasets. I'm not sure which assembler to recommend for that... I haven't encountered mate pair much in bacterial genomics and so don't have any experience.

Ryan
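
The concatenation approach described above can be sketched as a small shell helper. This is only an illustration, not part of Unicycler: the function names `count_reads` and `merge_pe_libraries` are hypothetical, and the sketch assumes uncompressed FASTQ files with exactly four lines per read. It concatenates the forward and reverse files in the same library order, so mates stay on matching lines, and then checks that both outputs hold the same number of reads.

```shell
# Count records in an uncompressed FASTQ file (assumes 4 lines per read).
count_reads() {
    lines=$(wc -l < "$1")
    echo $((lines / 4))
}

# Concatenate two paired-end libraries into one 'single' library.
# Arguments: lib1_R1 lib1_R2 lib2_R1 lib2_R2 output_prefix
# Prints the merged pair count, or fails if the R1 and R2 counts differ.
merge_pe_libraries() {
    # Same library order in both files keeps mates on matching lines.
    cat "$1" "$3" > "${5}_1.fastq"
    cat "$2" "$4" > "${5}_2.fastq"
    n1=$(count_reads "${5}_1.fastq")
    n2=$(count_reads "${5}_2.fastq")
    if [ "$n1" -ne "$n2" ]; then
        echo "read count mismatch: $n1 forward vs $n2 reverse" >&2
        return 1
    fi
    echo "$n1"
}
```

With the file names from this thread, the call would be `merge_pe_libraries lib1_forward_1.fastq lib1_reverse_1.fastq lib1_forward_2.fastq lib1_reverse_2.fastq reads`, and the resulting pair could then be given to Unicycler as `unicycler -1 reads_1.fastq -2 reads_2.fastq -o output_dir`.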

@ncimb
Author

ncimb commented Oct 27, 2017

Thanks Ryan, I was specifically talking about PE libraries despite my vague question! I did wonder if I could pass them as a cat'd pair, but they're likely to differ on read length and insert size.

Apologies for posting a common feature request, should have looked at the previous issues more closely!

@ncimb ncimb closed this as completed Oct 27, 2017
@rrwick
Owner

rrwick commented Oct 30, 2017

Even with different read length or insert size, you might get away with catting the files. It's worth a try, anyway.

That being said, when I've encountered this in the past using SPAdes (which does allow multiple libraries), I've usually found that the better of my two read sets assembles just as well as the combination of the two. So if one of your read sets is deeper or has more even read coverage than the other, you might do fine with that one alone.
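
One rough way to judge which read set is "deeper", as suggested above, is to divide its total bases by the expected genome size. The sketch below is only an illustration under stated assumptions: `total_bases` and `estimate_depth` are hypothetical helper names, the input is uncompressed four-line FASTQ, and the genome size is something you supply yourself. It says nothing about how even the coverage is.

```shell
# Sum the sequence-line lengths of an uncompressed FASTQ file.
total_bases() {
    awk 'NR % 4 == 2 { total += length($0) } END { printf "%.0f\n", total }' "$1"
}

# Approximate depth = (bases in R1 + bases in R2) / genome size.
# Arguments: R1.fastq R2.fastq genome_size_in_bp
# Integer division, so the result is rounded down.
estimate_depth() {
    b1=$(total_bases "$1")
    b2=$(total_bases "$2")
    echo $(( (b1 + b2) / $3 ))
}
```

Running `estimate_depth` once per library with the same genome size (for example a placeholder such as 5000000 for a 5 Mbp bacterium) gives two numbers that can be compared directly.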

@phbrito

phbrito commented Dec 13, 2017

Hello,
If we concatenate reads from two (fairly similar) libraries as you suggested before, does Unicycler take care of duplicated reads, or should we find a way to remove those before running Unicycler?
Thanks!
Patrícia

@judzen

judzen commented Dec 13, 2017 via email

@rrwick
Owner

rrwick commented Jan 4, 2018

I've never actually tried using Filtlong on a short read set - it may work, but I make no guarantees! Generally speaking, there's no harm in having duplicated short reads. If I have way too many short reads and it's slowing things down, I sometimes cut the set down by trimming with a stringent qscore. I like Trim Galore for this sort of thing - set --quality and --length to large values to get rid of more reads.
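
The stringent trimming described above might look like the following wrapper. It is a sketch only: the function name `stringent_trim`, the `DRY_RUN` switch, and the thresholds (quality 30, minimum length 100) are assumptions chosen for illustration, not values given in this thread, and a sensible length cutoff depends on your read length. The `--paired`, `--quality`, `--length` and `--output_dir` options are Trim Galore's own.

```shell
# Trim a read pair with deliberately strict cutoffs to shrink the read set.
# Arguments: R1.fastq R2.fastq output_directory
# Set DRY_RUN=1 to print the command instead of running it.
stringent_trim() {
    r1=$1
    r2=$2
    outdir=$3
    # Illustrative thresholds; raise or lower them to keep fewer or more reads.
    set -- trim_galore --paired --quality 30 --length 100 \
        --output_dir "$outdir" "$r1" "$r2"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Calling `DRY_RUN=1 stringent_trim reads_1.fastq reads_2.fastq trimmed` shows the exact command first, which is a cheap way to check the settings before trimming a large data set.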

@phbrito

phbrito commented Jan 4, 2018

Thanks for the suggestions! I wasn't sure about Filtlong, as it is described for long reads, and will check Trim Galore.
