Sailfish v0.9.0
This is a fairly major new release of Sailfish (thus the major version bump). It includes some new features and makes minor but backward-incompatible changes to the output format.
Major Changes
- Sequence-specific bias correction --- The old bias correction methodology has been removed from Sailfish and replaced with a new sequence-specific bias correction model. Bias correction is enabled with the `--biasCorrect` flag. The new model has two main benefits over the old one. First, it should more accurately correct for sequence-specific biases, leading to better estimates in biased samples. Second, it should not suffer from the pathological "over-correction" failure cases of the old model: if there is no substantial bias in the sample, it should have only a minimal effect on quantification results.
- New output format --- The new output format (which will also be adopted by Salmon v0.6.0 onward) adds another column, `EffectiveLength`, which records the effective length of each transcript. This is the third column, and the `TPM` and `NumReads` columns have both been shifted right by one position. Also, the `quant.sf` output file has been simplified and now contains no comment lines. The first row in the file is an (un-commented) header that lists the column names, and the subsequent rows are the quantification estimates.
- Information about the command used --- Since the comment lines have been removed from the `quant.sf` file, this information (and more), which can sometimes be useful, is now written to other locations. There is a JSON-formatted file in the top-level output directory called `cmd_info.json`. It contains a JSON structure with the relevant command line parameters (which used to appear in the `quant.sf` comments).
- Meta-information about the run --- Quite a bit of useful information appears in the file `aux/meta_info.json` under the main quantification directory. This records information such as the number of reads processed, the number mapped, the percentage mapped, and which type of posterior sampling (e.g. Gibbs / bootstrap), if any, was performed.
- Auxiliary parameters from the run --- In addition to the `meta_info.json` file, the `aux/` directory of the main quantification directory contains other useful files. Specifically, it contains gzipped binary data for any bootstrap or Gibbs samples that were generated, and gzipped binary data about the fragment length distribution and bias parameters (the latter is only meaningful if bias correction was performed).
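
Because the column positions in `quant.sf` changed, downstream scripts may be more robust if they look columns up by header name rather than by index. The following is a minimal Python sketch of that idea, not part of Sailfish itself. It assumes the file is tab-separated and that the first two header names are `Name` and `Length`; neither is stated in these notes. The helper names `read_quant` and `read_run_info` and the example values are hypothetical. No keys inside the JSON files are assumed; the files are simply loaded if present.

```python
import csv
import io
import json
from pathlib import Path


def read_quant(handle):
    """Parse a header-first, comment-free quant.sf stream into a list of dicts.

    Columns are looked up by header name, so the code does not depend on
    column positions (which changed in this release).
    """
    # Assumption: the file is tab-separated.
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for row in reader:
        # Convert the three numeric columns named in the release notes.
        for col in ("EffectiveLength", "TPM", "NumReads"):
            row[col] = float(row[col])
        rows.append(row)
    return rows


def read_run_info(quant_dir):
    """Load cmd_info.json and aux/meta_info.json if they exist."""
    quant_dir = Path(quant_dir)
    info = {}
    for name in ("cmd_info.json", "aux/meta_info.json"):
        path = quant_dir / name
        if path.exists():
            with path.open() as fh:
                info[name] = json.load(fh)
    return info


# Hypothetical two-transcript file for illustration only; the "Name" and
# "Length" headers and all values are made up.
example = (
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx1\t1000\t800.5\t600000.0\t120.0\n"
    "tx2\t500\t310.2\t400000.0\t31.0\n"
)
rows = read_quant(io.StringIO(example))
print(rows[0]["EffectiveLength"], rows[1]["NumReads"])
```

For a real run, `read_quant` would be given an open handle on the `quant.sf` file and `read_run_info` the path to the quantification directory. Any parser that previously skipped comment lines and read `TPM` or `NumReads` by position would need a similar adjustment.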
Bug Fixes
- This release fixes a bug where the mapping location of a fragment may have been miscalculated by a small number of bases in certain cases. This in turn could lead to a small shift in the fragment length distribution and in the resulting quantification estimates.
Acknowledgements
- Special thanks go to Ayush Sengupta for helping out with the implementation of sequence-specific bias correction.
- Special thanks go to Mike Love for testing the effectiveness of the sequence-specific bias correction implementation on some experimental (GEUVADIS) data!