@golu099 brought up GAPPadder as a potential replacement for Gap2Seq for filling gaps in sequence coverage between scaffolded contigs of de novo assemblies. We should evaluate it, potentially using synthetic read sets generated by something like wgsim, likely combining reads from multiple similar genomes to approximate sequencing a sample from a mixed viral population.
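As a rough sketch of how such a mixed read set could be set up, the snippet below only builds wgsim command lines, one per genome, splitting a total read-pair budget by relative abundance. The file names, abundances, read length, and the `wgsim_commands` helper are all hypothetical; the `-N`, `-1`, `-2`, and `-S` options are wgsim's read-pair count, read lengths, and random seed as far as I know, but check them against `wgsim` usage output before relying on this.

```python
import shlex


def wgsim_commands(genomes, total_pairs, read_len=150, seed=42):
    """Build one wgsim command per genome (hypothetical helper).

    genomes: dict mapping FASTA path -> relative abundance (any positive weights).
    total_pairs: total number of read pairs to split across the genomes.
    """
    total_weight = sum(genomes.values())
    cmds = []
    for i, (fasta, weight) in enumerate(genomes.items()):
        # Share of the read-pair budget proportional to this genome's abundance.
        n_pairs = round(total_pairs * weight / total_weight)
        prefix = f"sim_{i}"  # assumed output naming, not a wgsim convention
        cmds.append([
            "wgsim",
            "-N", str(n_pairs),   # number of read pairs
            "-1", str(read_len),  # length of the first read
            "-2", str(read_len),  # length of the second read
            "-S", str(seed + i),  # a different seed per genome
            fasta, f"{prefix}_R1.fq", f"{prefix}_R2.fq",
        ])
    return cmds


# Example: a 70/30 mix of two similar genomes (placeholder file names).
mix = {"strain_A.fasta": 70, "strain_B.fasta": 30}
for cmd in wgsim_commands(mix, total_pairs=100_000):
    print(shlex.join(cmd))
```

The per-genome FASTQ files would then be concatenated (R1 with R1, R2 with R2) to form the mixed sample before assembly and gap filling.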
We may also want to take a look at some of the other scaffolding/gap filling tools, like RagTag (NB: RagTag fills gaps in de novo assemblies using sequence data from assemblies, not from reads).
Use -r to infer the gap sizes; otherwise RagTag inserts 100 bp gaps. When gaps are inferred, set the minimum gap length to 2 with -g: there is an issue with a gap length of 1 (for some reason the absolute minimum is 2). Everything else was left on default, and I used minimap2 as the aligner. One very annoying issue is that if your contigs overlap, RagTag will add a 100 bp gap (irrespective of the above settings).
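For reference, the settings described above could be assembled into a command roughly like this. It is a sketch only: the `ragtag_scaffold_cmd` helper, the file names, and the output directory are hypothetical, and the minimum of 2 for `-g` is taken from this comment rather than from RagTag's documentation. The command is built and printed, not executed.

```python
import shlex


def ragtag_scaffold_cmd(reference, contigs, outdir="ragtag_output",
                        infer_gaps=True, min_gap=2):
    """Build a `ragtag.py scaffold` command line (hypothetical helper)."""
    cmd = ["ragtag.py", "scaffold", "-o", outdir, "--aligner", "minimap2"]
    if infer_gaps:
        # Per the comment above, a minimum inferred gap of 1 causes problems.
        if min_gap < 2:
            raise ValueError("min_gap must be at least 2 when gaps are inferred")
        cmd += ["-r", "-g", str(min_gap)]  # infer gap sizes, set minimum gap length
    # Without -r, RagTag is reported above to insert fixed 100 bp gaps.
    cmd += [reference, contigs]
    return cmd


# Placeholder file names for illustration.
print(shlex.join(ragtag_scaffold_cmd("reference.fasta", "contigs.fasta")))
```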
I did some more testing yesterday and ran into this issue again:
One very annoying issue is that if your contigs overlap, ragtag will add 100bp gap (irrespective of the above settings)
I now think this behavior depends on the -r and -g options. If they're not set, RagTag adds a 100 bp gap. If they are set, it will not scaffold the second overlapping fragment.
An aside:
The overlaps occur with SPAdes (SKESA doesn't have this problem). They're related to the maximum k-mer size in the k range: the overlap is k-2 in length, where k is that maximum. I keep increasing the k-mer size (up to 103 now), but this issue keeps rearing its ugly head. My samples could contain mixtures (either viral quasispecies or coinfection), but it's hard to tell.
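One way to check for these overlaps before scaffolding is to compare the end of each contig with the start of the next. The sketch below is a plain exact-match check, not anything SPAdes or RagTag provides; the function name and toy sequences are made up, and it assumes both contigs are already in the same orientation and in scaffold order.

```python
def suffix_prefix_overlap(left, right, min_len=1):
    """Length of the longest exact overlap between the end of `left`
    and the start of `right`, or 0 if it is shorter than `min_len`."""
    max_len = min(len(left), len(right))
    # Try the longest candidate first so the first hit is the longest overlap.
    for n in range(max_len, min_len - 1, -1):
        if left[-n:] == right[:n]:
            return n
    return 0


# Toy example: the last four bases of the first contig start the second one.
contig_a = "GGGGACGT"
contig_b = "ACGTCCCC"
overlap = suffix_prefix_overlap(contig_a, contig_b)
print(overlap)

# With a maximum k of 103, the overlaps described above would be 101 bp long,
# so an overlap equal to max_k - 2 would match that pattern.
max_k = 103
print(overlap == max_k - 2)
```

Contigs flagged this way could be trimmed or merged before being handed to the scaffolder, which would sidestep the 100 bp gap behavior for overlapping fragments.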