- GUNC will now report the number of failed genomes if some fail when running in batch mode.
- Fixed cases where number of mapped genes might be different when running smaller genomes individually vs grouped.
- Fixed cases where GUNC would fail if _-_ was in a sequence identifier.
- Fixed cases where contig colour in visualisation was incorrect.
- Fixed incorrect ordering of taxonomic levels in plot if user supplies them in an incorrect order.
- Allow GUNC to continue if some genomes fail diamond mapping.
- Fixed crash if diamond fails to map anything to GUNC reference.
- Fixed bug where GUNC goes into an infinite loop if an input filename is merged.fa
- class and order are now included in contig assignment output file (@aaronmussig)
- Fix pandas deprecation warning (@tmaklin)
- Added fixes for pandas FutureWarning (changing behaviour of Series.idxmax)
- Added option to plot specific contigs
- Added option to plot ALL contigs
- Handle cases in which no genes are called with a more useful error message.
- Added check if temp_dir exists before starting analysis.
- Removed dependency on zgrep (for compatibility with nf-core tests)
- Added contig_taxonomy_output option to output detailed taxonomy assignment count per contig.
- Fix version of dependency in conda recipe: requests>=2.22.0
- Running with genecalls as input failed
- GUNC plot contig_display_num displayed a defined number of genes instead of contigs
- GUNC can now be run with the GTDB database
- Added option to download GTDB_GUNC database
- Input file options can be gene_calls (faa) instead of fna if --gene_calls flag is set
- Input genecalls can be gzipped
- Output maxCSS file is now sorted
- Fix version of dependency: requests>=2.22.0 (older versions not compatible)
- Better error message if gunc_db does not exist
- checkm_merge didn't work unless checkm qa was run with -o 2
- Documentation updates
- Links to synthetic datasets added
- Citations for diamond and prodigal added
- Clarified how checkM should be run for checkm_merge
- Corrected command for download_db
- Check if fasta is given with -f option instead of list of filepaths
- Running from genecounts failed
- Fixed case where pass.GUNC output was converted to ints
- Fixed silently ignoring input samples that did not map to reference
- Better error message if output_dir doesn't exist
- Documentation updates
- Added option to download the GUNC_DB file
- Added option to merge GUNC output with checkM output
- Added option to create interactive HTML based visualisation
- Added option to run all fastas in a directory
- Added option to provide input filepaths in a file
- Added min_mapped_genes option so scores are not calculated when there are not enough genes
- Added use_species_level option for determining tax_level with maxCSS score
- Can now accept gzipped fna files (with .gz ending)
- Allow GUNC_DB to be supplied using an env var
- Updated arguments to a subcommand structure
- Complete rewrite of how scores are calculated
- Gene calling is now done in parallel
- genome2taxonomy was not included in pip package
- GUNC failed if nothing left after minor clade filtering
- If duplicate filenames were in input, output files were overwritten
- Inputs that don't map any genes to GUNC_DB were silently missing in output
- Documentation updates
- sklearn dependency removed
- Added the bioconda recipe to repo
- Added check for zgrep, prodigal and diamond
- Changed output names to match those in paper
- Fixed diamond version to 2.0.4 (needs to be compatible with GUNC_DB)
- Better quality LOGOs
- Diamond logs are silenced
- Timestamps added to log output
- First release