DOC: update paper.bib
Vini2 committed Sep 26, 2024
1 parent cde5070 commit 89a246b
Showing 2 changed files with 9 additions and 1 deletion.
8 changes: 8 additions & 0 deletions paper/paper.bib
@@ -400,3 +400,11 @@ @article{Chandrasiri:2022
keywords = {Convex hull, Convex hull distance, Metagenomic binning, Multiple k values, High dimensional data clustering, Clustering algorithm},
abstract = {Metagenomics has enabled culture-independent analysis of micro-organisms present in environmental samples. Metagenomics binning, which involves the grouping of contigs into bins that represent different taxonomic groups, is an important step of a typical metagenomic workflow followed after assembly. The majority of the metagenomic binning tools represent the composition and coverage information of contigs as feature vectors consisting of a large number of dimensions. However, these tools use traditional Euclidean distance or Manhattan distance metrics which become unreliable in the high dimensional space. We propose CH-Bin, a binning approach that leverages the benefits of using convex hull distance for binning contigs represented by high dimensional feature vectors. We demonstrate using experimental evidence on simulated and real datasets that the use of high dimensional feature vectors to represent contigs can preserve additional information, and result in improved binning results. We further demonstrate that the convex hull distance based binning approach can be effectively utilized in binning such high dimensional data. To the best of our knowledge, this is the first time that composition information from oligonucleotides of multiple sizes has been used in representing the composition information of contigs and a convex hull distance based binning algorithm has been used to bin metagenomic contigs. The source code of CH-Bin is available at https://github.com/kdsuneraavinash/CH-Bin.}
}
+
+@misc{Woodcroft:2017,
+author={Woodcroft, BJ and Newell, R},
+title="{WWOOD/coverm: Read coverage calculator for metagenomics}",
+year={2017},
+note= {Accessed: August 11, 2023},
+howpublished = {\url{https://github.com/wwood/CoverM}},
+}
2 changes: 1 addition & 1 deletion paper/paper.md
@@ -71,7 +71,7 @@ The following inputs are required to run the `metacoag` subcommand.

* Contigs file
* Assembly graph file(s)
-* A delimited file containing the contig identifier and its average read coverage for each contig - can be obtained by running a read coverage calculation tool such as CoverM ([https://github.com/wwood/CoverM](https://github.com/wwood/CoverM)) or Koverage [@Roach:2024]
+* A delimited file containing the contig identifier and its average read coverage for each contig - can be obtained by running a read coverage calculation tool such as CoverM [@Woodcroft:2017] or Koverage [@Roach:2024]

The assembly graph files can vary depending on the assembler used to generate the contigs. The metaSPAdes version requires the assembly graph file in `.gfa` format and the paths file corresponding to the contigs file in `.paths` format. The MEGAHIT version requires the assembly graph file in `.gfa` format. The metaFlye version requires the assembly graph file `assembly_graph.gfa` and the paths file `assembly_info.txt` from the final assembly output.
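
Reader's note on the coverage input described in the `paper.md` hunk above: the sketch below shows one way such a delimited file could be read into a mapping from contig identifier to average coverage. It is an illustration only, not MetaCoAG's own parser, which this commit does not show. The function name `read_coverage`, the tab delimiter, the two-column layout, and the rule that non-numeric rows are skipped as headers are all assumptions.

```python
import csv
import io


def read_coverage(handle, delimiter="\t"):
    """Map each contig identifier to its average read coverage.

    Assumes column 1 is the contig identifier and column 2 is the
    average coverage; both the layout and the delimiter are assumptions.
    """
    coverage = {}
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or single-column lines
        contig_id, value = row[0], row[1]
        try:
            coverage[contig_id] = float(value)
        except ValueError:
            continue  # assumed to be a header row or a non-numeric entry
    return coverage


# Hypothetical two-contig input, tab-delimited with no header row.
example = io.StringIO("contig_1\t12.5\ncontig_2\t3.0\n")
print(read_coverage(example))
```

Skipping non-numeric rows keeps the sketch tolerant of an optional header line, at the cost of silently ignoring malformed values; a stricter reader would raise an error instead.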
