diff --git a/docs/output.md b/docs/output.md
index a9a3b642..61068dcd 100644
--- a/docs/output.md
+++ b/docs/output.md
@@ -166,13 +166,27 @@ Sequencing depth per contig and sample is generated by `jgi_summarize_bam_contig
 
 - `--depth.txt.gz`: Sequencing depth for each contig and sample, only for short reads.
 
-### Metabat
+### MetaBAT2
 
-[metabat](https://bitbucket.org/berkeleylab/metabat) recovers genome bins (that is, contigs/scaffolds that all belongs to a same organism) from metagenome assemblies. Additionally, Quast is run again on all the genome bins.
+[MetaBAT2](https://bitbucket.org/berkeleylab/metabat) recovers genome bins (that is, contigs/scaffolds that all belongs to a same organism) from metagenome assemblies.
 
-**output directory: `results/GenomeBinning/MetaBat2`**
+**output directory: `results/GenomeBinning/MetaBAT2`**
 
-- `*.fa`: Genome bins retrieved from the different input assemblies
+- `$assembler-$sample.*.fa`: Genome bins retrieved from input assembly
+- `$assembler-$sample.unbinned.*.fa`: Contigs that were not binned with other contigs but considered interesting. By default, these are at least 1 Mbp (`--min_length_unbinned_contigs`) in length and at most the 100 longest contigs (`--max_unbinned_contigs`) are reported.
+
+All the files in this folder will be assessed by Quast and Busco.
+
+**output directory: `results/GenomeBinning/MetaBAT2/discarded`**
+
+- `*.lowDepth.fa`: Low depth contigs that are filtered by MetaBat2
+- `*.tooShort.fa`: Too short contigs that are filtered by MetaBat2
+- `*.unbinned.pooled.fa`: Pooled unbinned contigs equal or above `--min_contig_size`, by default 1500 bp.
+- `*.unbinned.remaining.fa`: Remaining unbinned contigs below `--min_contig_size`, by default 1500 bp, but not in any other file.
+
+All the files in this folder contain small and/or unbinned contigs that are not further processed.
+
+Files in these two folders contain all contigs of an assembly.
 
 ### Busco
 
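
As a reading aid for the unbinned-contig rules described in the added lines, here is a minimal Python sketch of how contig lengths might map onto the three kinds of unbinned output. It is not the pipeline's implementation: the function name `classify_unbinned`, its parameter names, and the labels it returns are hypothetical. The defaults mirror the values stated in the diff (1 Mbp, 100 contigs, 1500 bp). Sending long contigs beyond the 100 longest to the pooled file is an assumption inferred from the file descriptions, not something the diff states outright.

```python
def classify_unbinned(
    lengths,
    min_length_unbinned=1_000_000,  # mirrors --min_length_unbinned_contigs
    max_unbinned=100,               # mirrors --max_unbinned_contigs
    min_contig_size=1500,           # mirrors --min_contig_size
):
    """Label each unbinned contig as 'individual', 'pooled', or 'remaining'.

    `lengths` is a dict of contig name -> length in bp (hypothetical input).
    """
    # Candidates for individual reporting: long enough, longest first.
    long_contigs = sorted(
        (name for name, n in lengths.items() if n >= min_length_unbinned),
        key=lambda name: lengths[name],
        reverse=True,
    )
    # Only the longest `max_unbinned` candidates are reported on their own.
    individual = set(long_contigs[:max_unbinned])

    result = {}
    for name, n in lengths.items():
        if name in individual:
            result[name] = "individual"   # would go to *.unbinned.*.fa
        elif n >= min_contig_size:
            result[name] = "pooled"       # would go to *.unbinned.pooled.fa
        else:
            result[name] = "remaining"    # would go to *.unbinned.remaining.fa
    return result


# Toy input; max_unbinned is lowered to 1 so the cap is visible.
example = {"c1": 2_500_000, "c2": 1_200_000, "c3": 40_000, "c4": 900}
labels = classify_unbinned(example, max_unbinned=1)
print(labels)
```

With the cap lowered to one contig, `c1` is reported individually, `c2` and `c3` are pooled, and `c4` falls below the size threshold and is labelled as remaining. Every contig receives exactly one label, which matches the statement that the two folders together contain all contigs of an assembly.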