Canu v2.1.1
These are release notes for Canu version 2.1.1, which was released on October 16th, 2020. Canu is specialized for assembly of single-molecule high-noise sequences. Full documentation can be found at http://canu.readthedocs.org/.
This release provides a stable, tested, and documented version of the software. The binary distributions should work on any relatively recent version of the respective OS and are the recommended way to install Canu. The source code distribution contains everything you need to create a binary distribution for your own specific OS.
Citation
- Koren S, Walenz BP, Berlin K, Miller JR, Phillippy AM. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Research. (2017).
- Koren S, Rhie A, Walenz BP, Dilthey AT, Bickhart DM, Kingan SB, Hiendleder S, Williams JL, Smith TPL, Phillippy AM. De novo assembly of haplotype-resolved genomes with trio binning. Nature Biotechnology. (2018).
- Nurk S, Walenz BP, Rhie A, Vollger MR, Logsdon GA, Grothe R, Miga KH, Eichler EE, Phillippy AM, Koren S. HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads. Genome Research. (2020).
Minimum Requirements
- 8GB minimum memory; 16GB strongly suggested
- GCC 4.5 (for compilation only); GCC 7 or newer strongly recommended
- Perl 5.12.0, or File::Path 2.08
- Java SE 8
- macOS 10.10 Yosemite (for macOS/Darwin binaries only)
- gnuplot 5.2 (optional, for generating diagnostic graphs)
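The compiler and Perl minimums listed above can be checked before installing. The following is a minimal sketch, not part of the Canu distribution: `version_ge` and `check_min` are hypothetical helper names, and the script assumes a `sort` that supports `-V` (version sort), as GNU coreutils does.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 >= version $2.
# Assumes sort(1) supports -V (version sort), e.g. GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: report whether a tool meets a minimum version.
# $1 = tool name, $2 = detected version (may be empty), $3 = required minimum.
check_min() {
    name=$1; have=$2; need=$3
    if [ -z "$have" ]; then
        echo "$name: not found (need >= $need)"
    elif version_ge "$have" "$need"; then
        echo "$name: $have OK (need >= $need)"
    else
        echo "$name: $have too old (need >= $need)"
    fi
}

# Detect installed versions; leave the value empty when the tool is missing.
gcc_v=""
perl_v=""
if command -v gcc >/dev/null 2>&1; then gcc_v=$(gcc -dumpversion); fi
if command -v perl >/dev/null 2>&1; then perl_v=$(perl -e 'printf "%vd", $^V'); fi

check_min gcc  "$gcc_v"  4.5
check_min perl "$perl_v" 5.12.0
```

GCC is needed only when compiling from source, so a "not found" result for gcc is harmless when using the binary distribution. Java and gnuplot are left out of the sketch because their version output varies between vendors and builds.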
Installation
Users can download Canu as source code or as pre-compiled binaries. The binary distribution is the recommended install method, assuming it is available for your platform. The source code package needs to be compiled and installed before it can be used.
Note that the installation directory has changed compared to previous releases.
To install from a binary distribution (recommended):
tar -xJf canu-2.1.1.*.tar.xz
Canu will be installed at canu-2.1.1/bin/canu.
To install from source code (DO NOT download the "Source code" files provided by GitHub, as these will not compile; use canu-2.1.1.tar.xz instead):
tar -xJf canu-2.1.1.tar.xz
cd canu-2.1.1/src
make -j 8
cd ..
Canu will be installed at canu-2.1.1/build/bin/canu.
Changes
Canu v2.1.1 IS compatible with assemblies started with Canu v2.1.
This minor release adds a small performance enhancement to consensus and fixes two crashes, one in consensus and one in bogart.
- Add multithreading for the final step of consensus, where it aligns the original reads back to the consensus sequence to find the read layout.
- Fix a systematic crash (on some systems) in utgcns: Assertion 'idmap.empty() == true' failed. #1780.
- Fix a crash in bogart (on PacBio HiFi metagenomic datasets): Assertion 'isRepeat == true' failed. #1806 and #1813.
Known Issues
See the issues page for up-to-date open issues, or to report a problem.
- Large memory usage and runtime for long reads (e.g., Nanopore) when using the overlapper=ovl algorithm, and during Overlap Error Adjustment. The -fast option enables a significantly faster algorithm, especially for Nanopore data, but may produce slightly less contiguous assemblies.
- No support for trio binning of HiFi data. As a workaround, specify the HiFi data as -pacbio-raw and run only the haplotyping step (-haplotype) followed by assembly of the partitioned reads.
See the FAQ for many suggestions, including suggestions for specific data types, e.g., Nanopore r9 reads.
Legal
Canu is derived from Celera Assembler and includes code from many other projects. Most, but not all, of the code is GPL licensed. See the README.licenses file and individual source code files for details.