
Canu v2.1.1

brianwalenz released this 19 Oct 07:12

These are release notes for Canu version 2.1.1, which was released on October 16th, 2020. Canu is specialized for assembly of single-molecule high-noise sequences. Full documentation can be found at http://canu.readthedocs.org/.

This release provides a stable, tested, and documented version of the software. The binary distributions should work on any relatively recent version of the respective OS and are the recommended way to install Canu. The source code distribution contains everything you need to create a binary distribution for your own specific OS.

Citation

Minimum Requirements

  • 8GB minimum memory; 16GB strongly suggested
  • GCC 4.5 (for compilation only); GCC 7 or newer strongly recommended
  • Perl 5.12.0, or File::Path 2.08
  • Java SE 8
  • macOS 10.10 Yosemite (for macOS/Darwin binaries only)
  • gnuplot 5.2 (optional, for generating diagnostic graphs)
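As a rough illustration of checking these minimums before installing, the sketch below compares the local Perl and gnuplot versions against the numbers listed above. The helper name `version_ge` is hypothetical, and the comparison assumes a `sort` that supports `-V` (GNU coreutils and recent BSD/macOS versions do). It is not part of Canu.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Hypothetical helper; assumes sort(1) supports -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Perl: the requirement list asks for 5.12.0 or newer.
if command -v perl >/dev/null 2>&1; then
    perl_version=$(perl -e 'printf "%vd", $^V')
    if version_ge "$perl_version" 5.12.0; then
        echo "perl $perl_version: ok"
    else
        echo "perl $perl_version: older than 5.12.0"
    fi
else
    echo "perl: not found"
fi

# gnuplot is optional; "gnuplot --version" prints e.g. "gnuplot 5.2 patchlevel 8".
if command -v gnuplot >/dev/null 2>&1; then
    gnuplot_version=$(gnuplot --version | awk '{print $2}')
    if version_ge "$gnuplot_version" 5.2; then
        echo "gnuplot $gnuplot_version: ok"
    else
        echo "gnuplot $gnuplot_version: older than 5.2"
    fi
else
    echo "gnuplot: not found (optional)"
fi
```

The same helper could be pointed at other tools, though compiler and Java version strings vary enough between vendors that they are left out of this sketch.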

Installation

Users can download Canu as source code or as pre-compiled binaries. The binary distribution is the recommended install method, assuming it is available for your platform. The source code package needs to be compiled and installed before it can be used.

Note that the installation directory has changed compared to previous releases.

To install from a binary distribution (recommended):

tar -xJf canu-2.1.1.*.tar.xz

Canu will be installed at canu-2.1.1/bin/canu.

To install from source code (DO NOT download the Source code files provided by GitHub, as these will not compile; use canu-2.1.1.tar.xz instead):

tar -xJf canu-2.1.1.tar.xz
cd canu-2.1.1/src
make -j 8
cd ..

Canu will be installed at canu-2.1.1/build/bin/canu.
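To confirm which of the two install locations is present and put it on the PATH, something like the following sketch may help. The variable `CANU_ROOT` and the function `find_canu` are hypothetical names introduced here; `CANU_ROOT` is assumed to be the directory where the archive was unpacked.

```shell
#!/bin/sh
# CANU_ROOT (hypothetical): directory in which the archive was unpacked.
CANU_ROOT="${CANU_ROOT:-$PWD}"

# find_canu DIR: print the first executable canu found under DIR,
# checking the binary-distribution path first, then the source-build path.
find_canu() {
    for candidate in "$1/canu-2.1.1/bin/canu" "$1/canu-2.1.1/build/bin/canu"; do
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

if canu_bin=$(find_canu "$CANU_ROOT"); then
    # Prepend the containing directory to PATH and print the installed version.
    PATH="$(dirname "$canu_bin"):$PATH"
    export PATH
    canu -version
else
    echo "canu not found under $CANU_ROOT" >&2
fi
```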

Changes

Canu v2.1.1 IS compatible with assemblies started with Canu v2.1.

This minor release adds a small performance enhancement to consensus and fixes two crashes, one in consensus and one in bogart.

  • Add multithreading for the final step of consensus, where it aligns the original reads back to the consensus sequence to find the read layout.
  • Fix a systematic crash (on some systems) in utgcns: Assertion 'idmap.empty() == true' failed. #1780.
  • Fix a crash in bogart (on PacBio HiFi metagenomic datasets): Assertion 'isRepeat == true' failed. #1806 and #1813.

Known Issues

See the issues page for up-to-date open issues, or to report a problem.

  • Large memory usage and runtime for long reads (e.g., Nanopore) when using the overlapper=ovl algorithm, and during Overlap Error Adjustment. The -fast option enables a significantly faster algorithm, especially for Nanopore data, but may produce slightly less contiguous assemblies.
  • No support for trio binning of HiFi data. As a workaround, specify the HiFi data as -pacbio-raw and run only the haplotyping step (-haplotype) followed by assembly of the partitioned reads.
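The HiFi trio-binning workaround above might look roughly like the dry-run sketch below. All file names, the output directory names, and the genome size are hypothetical placeholders, and the `haplotype/haplotype-NAME.fasta.gz` location of the partitioned reads is an assumption about the output layout that should be checked against the Canu documentation. The `run` wrapper is also hypothetical; by default it only prints each command.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical inputs: replace with your own files and genome size.
GENOME_SIZE=3g
HIFI=child.hifi.fastq.gz
DAD=dad.illumina.fastq.gz
MOM=mom.illumina.fastq.gz

# Step 1: run only the haplotyping step, passing the HiFi reads as -pacbio-raw.
run canu -haplotype -p asm -d trio genomeSize="$GENOME_SIZE" \
    -haplotypeDad "$DAD" -haplotypeMom "$MOM" \
    -pacbio-raw "$HIFI"

# Step 2: assemble each partitioned read set as HiFi data.
# The input paths below are an assumption about where step 1 writes its output.
for hap in Dad Mom; do
    run canu -p "asm-$hap" -d "trio-$hap" genomeSize="$GENOME_SIZE" \
        -pacbio-hifi "trio/haplotype/haplotype-$hap.fasta.gz"
done
```

Printing the commands first makes it easy to review the options before committing compute time to a real run.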

See the FAQ for many suggestions, including suggestions for specific data types, e.g., Nanopore r9 reads.

Legal

Canu is derived from Celera Assembler and includes code from many other projects. Most, but not all, of the code is GPL licensed. See the README.licenses file and individual source code files for details.