Commit

ignore too short consensus sequences
hsnguyen committed Nov 4, 2019
2 parents 21883c2 + 399a14d commit be18702
Showing 2 changed files with 2 additions and 1 deletion.
1 change: 1 addition & 0 deletions docs/npgraph.md
@@ -83,6 +83,7 @@ More features would be added later to the GUI but it's not the focus of this pro
All settings from the GUI can be set beforehand via commandline interface.
Without using GUI, the mandatory inputs are assembly graph file (*-si*) and long-read data (*-li*).
The assembly graph must be output from SPAdes in either FASTG or GFA format (normally *assembly_graph.fastg* or *assembly_graph.gfa*).
From new version of SPAdes, the output GFA file is *assembly_graph_with_scaffolds.gfa* which includes SPAdes path finding and scaffolding results. Sometimes, this might give additional mis-assemblies so the original graph of the building-block contigs (fastg file) is preferred.

The long-read data will be used for bridging and can be given as DNA sequences (FASTA/FASTQ format, possible .gz) or alignment records (SAM/BAM) as mentioned above. *npGraph* will try to guess the format of the inputs based on the extensions, but sometimes you'll have to specify it yourself (e.g. when "-" is provided to read from *stdin*).
If the sequences are given, then it's mandatory to have either minimap2 (recommended) or BWA-MEM installed in your system to do the alignment between long reads and the pre-assemblies.
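
The documentation above says that the input format is guessed from the file extension and must be given explicitly when "-" (*stdin*) is used. The following is a minimal sketch of what such a guess could look like. It is not npGraph's actual code: the class `InputFormatGuess`, the method `guessFormat`, the returned labels, and the exact list of accepted extensions are all assumptions made for illustration.

```java
import java.util.Locale;

// Hypothetical helper (not from npGraph): guesses a long-read input format
// from a file name, following the behaviour described in the docs.
final class InputFormatGuess {

    // Returns "fasta", "fastq", "sam", "bam", or "unknown".
    // The extension list is an assumption, not npGraph's real list.
    static String guessFormat(String fileName) {
        if (fileName == null || fileName.equals("-")) {
            return "unknown"; // stdin carries no extension, so the user must specify the format
        }
        String name = fileName.toLowerCase(Locale.ROOT);
        if (name.endsWith(".gz")) {
            name = name.substring(0, name.length() - 3); // look at the extension under .gz
        }
        if (name.endsWith(".fasta") || name.endsWith(".fa") || name.endsWith(".fna")) {
            return "fasta";
        }
        if (name.endsWith(".fastq") || name.endsWith(".fq")) {
            return "fastq";
        }
        if (name.endsWith(".sam")) {
            return "sam";
        }
        if (name.endsWith(".bam")) {
            return "bam";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        for (String f : new String[] {"reads.fastq.gz", "reads.fa", "aln.bam", "-"}) {
            System.out.println(f + " -> " + guessFormat(f));
        }
    }
}
```

Returning "unknown" rather than picking a default makes the stdin case explicit, which matches the note that the format sometimes has to be specified by hand.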
@@ -527,7 +527,7 @@ class BridgeSegment{
BDNode n=new BDNode(graph, "000"+AlignedRead.PSEUDO_ID++);
Sequence seq=graph.consensus.getConsensus(id, greedy);
//FIXME: review this case!
-if(seq==null||seq.length()<=BDGraph.getKmerSize())
+if(seq==null||seq.length()<Math.min(BDGraph.getKmerSize(),100))//ignore empty or too short sequences
return;

n.setAttribute("seq", seq);
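
To see what the changed condition does in practice, here is a small sketch that evaluates the old and new checks side by side. It mirrors only the two conditions in the hunk above. The class `ConsensusLengthCheck`, its method names, the use of `String` in place of `Sequence`, and the sample k-mer size of 127 are assumptions for illustration, not npGraph code.

```java
// Hypothetical helper (not from npGraph): compares the two skip conditions
// from the diff, using plain strings in place of Sequence objects.
final class ConsensusLengthCheck {

    static final int MAX_THRESHOLD = 100; // the cap that appears in the new condition

    // Old condition: skip a null consensus or one no longer than k.
    static boolean skippedBefore(String seq, int kmerSize) {
        return seq == null || seq.length() <= kmerSize;
    }

    // New condition: skip a null consensus or one shorter than min(k, 100).
    static boolean skippedAfter(String seq, int kmerSize) {
        return seq == null || seq.length() < Math.min(kmerSize, MAX_THRESHOLD);
    }

    // Builds a dummy sequence of the requested length.
    static String dummy(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append('A');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int k = 127; // assumed k-mer size, chosen only for the demonstration
        for (int len : new int[] {0, 99, 100, 127, 128}) {
            String seq = dummy(len);
            System.out.println("length " + len
                    + ": skipped before=" + skippedBefore(seq, k)
                    + ", skipped after=" + skippedAfter(seq, k));
        }
    }
}
```

As written, the new check is the less strict of the two: a consensus whose length equals the k-mer size is now kept, and when k exceeds 100 any consensus of at least 100 bases is kept as well.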