Evaluate RagTag for gap filling scaffolded de novo assemblies #32

Open
tomkinsc opened this issue Apr 6, 2023 · 2 comments
Labels
enhancement New feature or request

Comments

@tomkinsc
Member

tomkinsc commented Apr 6, 2023

@golu099 brought up GAPPadder as a tool that could replace Gap2Seq for filling gaps in sequence coverage between scaffolded contigs of de novo assemblies. We should evaluate it, potentially using synthetic read sets generated by something like wgsim, likely combining reads from multiple similar genomes to approximate sequencing a sample from a mixed viral population.
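
As a rough illustration of the synthetic-mixture idea, here is a minimal Python sketch that builds one wgsim command per genome and splits a total read-pair budget by mixture fraction. The function name, the FASTA file names, the output prefixes, and the default read length are all hypothetical; only the wgsim options `-N` (read pairs), `-1`/`-2` (read lengths), and `-S` (seed) are taken from wgsim itself. It only constructs the commands and does not run them.

```python
import shlex


def wgsim_mixture_commands(genome_fractions, total_pairs, read_len=150, seed=1):
    """Build one wgsim command per genome, splitting total_pairs by fraction.

    genome_fractions maps a FASTA path to its relative abundance
    (hypothetical inputs; fractions are normalized, so they need not sum to 1).
    """
    total = sum(genome_fractions.values())
    commands = []
    for i, (fasta, fraction) in enumerate(sorted(genome_fractions.items())):
        n_pairs = round(total_pairs * fraction / total)
        prefix = f"sim_{i}"  # hypothetical output prefix
        commands.append([
            "wgsim",
            "-N", str(n_pairs),    # number of read pairs for this genome
            "-1", str(read_len),   # length of read 1
            "-2", str(read_len),   # length of read 2
            "-S", str(seed + i),   # distinct seed per genome
            fasta, f"{prefix}_R1.fq", f"{prefix}_R2.fq",
        ])
    return commands


if __name__ == "__main__":
    mix = {"strainA.fasta": 0.8, "strainB.fasta": 0.2}
    for cmd in wgsim_mixture_commands(mix, total_pairs=100_000):
        print(shlex.join(cmd))
```

The per-genome FASTQ files would then be concatenated (R1 with R1, R2 with R2) to approximate a mixed sample before assembly.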

We may also want to take a look at some of the other scaffolding/gap filling tools, like RagTag (NB: RagTag fills gaps in de novo assemblies using sequence data from assemblies, not from reads).

@dpark01 dpark01 added the enhancement New feature or request label Mar 19, 2024
@dpark01
Member

dpark01 commented Aug 3, 2024

Some notes from @ammaraziz on slack about ragtag:

Use -r to infer the gap sizes; otherwise ragtag adds 100bp gaps. When gaps are inferred, set the minimum gap length with -g to 2: there is an issue relating to a gap length of 1 (for some reason the absolute minimum is 2). Everything else was left on default, and I used minimap2 as the aligner. One very annoying issue is that if your contigs overlap, ragtag will add a 100bp gap (irrespective of the above settings).
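
For reference, the settings described above might translate into an invocation like the one built by this small Python sketch. It assumes RagTag v2's `ragtag.py scaffold` subcommand with its `-r`, `-g`, `--aligner`, and `-o` options; the function name, file names, and output directory are hypothetical, and the command is only assembled, not executed.

```python
import shlex


def ragtag_scaffold_command(reference, contigs, out_dir="ragtag_output",
                            min_gap=2, aligner="minimap2"):
    """Assemble a RagTag scaffold command with inferred gap sizes.

    Assumes the RagTag v2 command line; paths are placeholders.
    """
    return [
        "ragtag.py", "scaffold",
        "-r",                    # infer gap sizes instead of fixed 100bp gaps
        "-g", str(min_gap),      # minimum inferred gap size (2, per the notes)
        "--aligner", aligner,    # minimap2 was the aligner used in the notes
        "-o", out_dir,
        reference, contigs,
    ]


if __name__ == "__main__":
    cmd = ragtag_scaffold_command("reference.fasta", "contigs.fasta")
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```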

@ammaraziz

I did some more testing yesterday and ran into this issue again:

One very annoying issue is that if your contigs overlap, ragtag will add 100bp gap (irrespective of the above settings)

I now think this behavior changes depending on the -r and -g options. If they're not set, ragtag adds a 100bp gap. If they are set, it will not scaffold the second overlapping fragment.
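
One way to check which of these behaviors occurred is to list the lengths of the N runs in the scaffolded sequence and count how many are exactly 100bp. The following Python sketch does that on a plain sequence string; the function names are hypothetical, and it assumes gaps are written as runs of `N` (or `n`).

```python
import re


def gap_lengths(sequence):
    """Return the length of every run of N/n in the sequence, in order."""
    return [m.end() - m.start() for m in re.finditer(r"[Nn]+", sequence)]


def count_fixed_gaps(sequence, fixed_len=100):
    """Count gaps whose length equals the fixed (non-inferred) gap size."""
    return sum(1 for length in gap_lengths(sequence) if length == fixed_len)


if __name__ == "__main__":
    # Toy scaffold: one 100bp gap and one 2bp inferred gap.
    scaffold = "ACGT" + "N" * 100 + "ACGT" + "NN" + "A"
    print(gap_lengths(scaffold))
    print(count_fixed_gaps(scaffold))
```

A scaffold with many gaps of exactly 100bp despite -r being set would be consistent with the overlap issue, although a true inferred gap of that size cannot be ruled out from length alone.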

An aside:
The overlaps occur with spades (skesa doesn't have this problem). It's related to the max kmer in the k range, where the overlap is kmer-2 in length. I keep increasing the kmer size (up to 103 now) but this issue keeps rearing its ugly head. My samples could contain mixtures (either viral quasispecies or coinfection) but it's hard to tell.
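
To confirm that two adjacent contigs really overlap before scaffolding, a simple exact-match check on their ends can be used. This is a minimal Python sketch with a hypothetical function name; it only finds exact, same-strand overlaps between the end of one contig and the start of the next, so it would miss overlaps containing mismatches or reverse-complemented contigs.

```python
def terminal_overlap(left, right, min_len=20):
    """Length of the longest exact match between the suffix of `left`
    and the prefix of `right` that is at least `min_len` long, else 0."""
    max_len = min(len(left), len(right))
    for k in range(max_len, min_len - 1, -1):
        if k > 0 and left[-k:] == right[:k]:
            return k
    return 0


if __name__ == "__main__":
    contig_a = "GATTACAGGCC"
    contig_b = "AGGCCTTTAA"
    k = terminal_overlap(contig_a, contig_b, min_len=3)
    print(k)
    # Trimming the overlap from the second contig gives a gap-free join.
    print(contig_a + contig_b[k:])
```

If the kmer-2 observation above holds, the reported overlap length for contigs assembled with a max k of 103 would be expected to be around 101.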

Hope that helps!
