mSWEEP-v1.4.0 (10 March 2020)

@tmaklin tmaklin released this 10 Mar 18:17
ecca0a2

Beware the clichés of software naming edition.

New features

  • Support parallel processing through the '-t' flag, with excellent scaling in larger problems.
  • Add possibility to match the input grouping indicators to the fasta file through the '--fasta' and '--groups-list' options.
  • Add the '--bootstrap-count' option which allows resampling fewer input alignments than the original sample contains.
  • Add possibility to specify the initial random seed for bootstrapping through the '--seed' option.
  • Support reading in files compressed with bz2 or lzma if compiled on a machine that supports them.
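
As a rough illustration of what the '--bootstrap-count' and '--seed' options control, the sketch below resamples a list of alignments with replacement. It is not mSWEEP's implementation: the function name `bootstrap_resample` and its parameters are hypothetical, and the alignments are stand-in integers rather than real pseudoalignments.

```python
import random


def bootstrap_resample(alignments, bootstrap_count=None, seed=None):
    """Draw a bootstrap sample (with replacement) from `alignments`.

    Hypothetical helper for illustration only:
    - bootstrap_count: how many alignments to draw; defaults to the
      size of the original sample.
    - seed: initial random seed, so a run can be reproduced.
    """
    rng = random.Random(seed)
    n_draws = len(alignments) if bootstrap_count is None else bootstrap_count
    return rng.choices(alignments, k=n_draws)


# Stand-in data: ten alignments identified by integers.
original = list(range(10))

# Draw fewer alignments than the original sample contains, with a fixed seed.
first = bootstrap_resample(original, bootstrap_count=4, seed=42)
second = bootstrap_resample(original, bootstrap_count=4, seed=42)
```

With the same seed, both calls return the same four draws, which is the property a fixed initial seed is meant to provide.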

Better error checking

  • Validate that all input and output files exist and are accessible.
  • Add possibility to validate the input grouping indicators when using Themisto pseudoalignments (resolves #4).
  • Catch errors in several places that escaped in earlier versions.
  • More informative error messages in the above-mentioned cases.
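
The kind of up-front file validation described above can be sketched as follows. This is a generic illustration, not mSWEEP's code: `validate_paths` is a hypothetical helper, and the specific checks (existence, read access, write access to the output directory) are assumptions about what "exist and are accessible" covers.

```python
import os


def validate_paths(input_files, output_dir):
    """Return a list of human-readable problems; empty if all checks pass.

    Hypothetical helper: checks that every input file exists and is
    readable, and that the output directory exists and is writable.
    """
    problems = []
    for path in input_files:
        if not os.path.isfile(path):
            problems.append(f"input file does not exist: {path}")
        elif not os.access(path, os.R_OK):
            problems.append(f"input file is not readable: {path}")
    if not os.path.isdir(output_dir):
        problems.append(f"output directory does not exist: {output_dir}")
    elif not os.access(output_dir, os.W_OK):
        problems.append(f"output directory is not writable: {output_dir}")
    return problems
```

Collecting every problem before reporting, rather than stopping at the first, lets a single run surface all misconfigured paths at once.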

More efficient resource usage

  • Parallel processing in the RCG optimization using OpenMP.
  • Memory usage reduced by ~40% in large problems.
  • Single core performance increased by ~10% in large problems.

Better build pipeline

  • Download dependencies when running cmake.
  • Build without OpenMP if it is not supported.
  • More aggressive compiler optimization flags.
  • Support build and optimization with the Intel C compiler.

Internal changes

  • Improve code structure and legibility.
  • Use an external library (telescope) to read in pseudoalignments from either kallisto or Themisto.
  • Better internal storage for the pseudoalignments.
  • Change the (rareish) reset step in the RCG optimization to be computationally more expensive but consume significantly less memory.
  • Separate bootstrap and regular sample processing classes.