CHANGELOG.txt


4.3.0 June 29, 2024
- bugfix for cigar-N-splitter: now incorporates the required read group ID in each alignment entry (impacts long-read calling sensitivity)

4.2.0 June 3, 2024 (release removed: the cigar-N-splitter used for long reads did not incorporate the read group ID, which is needed for HaplotypeCaller)
- for a faster workflow, set default boosting to 'none' and turned off blat_ED and annot pass annotations unless boosting is requested.
- updated the ctat-mutations WDL and the Terra wrapper WDLs
- added bamsifter and strand-specific read normalization
- replaced the GATK cigar-N-splitter with our own pysam-based version
- added ctat-mutations to Dockstore for easy integration into Terra or other supported cloud platforms


4.1.0 March 13, 2024
- incorporated support for single cell transcriptomics
- more robust accounting of read support per variant type (including indels), with multithreading
- included the HC-realigned BAM file as an output.


4.0.0 Sept. 17, 2023
- included preliminary support for PacBio RNA-seq (operational, but not benchmarked): use --left <pbio.fastq> and --is_long_reads
- turned off boosting by default
- when boosting is on, adds boosting annotations instead of filtering (VCF annotation: BOOSTselect)
- boosting of indels turned off; boosting now applies only to SNVs
- added ctat-genome-lib-builder submodule, incorporated minimap2 index prep
- CRAVAT data resources provided as a tarball; installation documentation updated.


3.3.1 Jan 22, 2023
- greatly improved memory usage around the annotate BLAT ED step

3.3.0 Dec 8, 2022
- support for providing a target interval list (e.g., to search only exome regions)
- WDL imports are now local, so no network connection is required on secure systems.
- further improvements for parallelization and memory usage (annotate_PASS_reads.py)


3.2.0
- revised WDLs for simpler configuration with Terra
- using all common variants for training ML in boosting
- dropped SPLICEADJ; now using only DJ
- now explicitly writing feature matrices for use with the downstream ML steps for improved transparency and flexibility
- our RVBLR and py-snpir are now available as submodules; they are not fully integrated into ctat_mutations, but they also leverage the feature matrices noted above.
- uses CTAT genome lib: https://data.broadinstitute.org/Trinity/CTAT_RESOURCE_LIB/__genome_libs_StarFv1.10/


3.0.1
- bugfix: new syntax for CreateSequenceDictionary with GATK 4.1.9.0 in the mutation lib prep step
- Docker and Singularity images provided.


3.0.0
- improved boosting performance
- rewrote workflow logic using WDL
- run 'make' in the base directory to install Cromwell workflow runner
- moved RVBLR (RVBoost) to https://github.com/broadinstitute/RVBLR


2.5.0
- leverages Open-CRAVAT for annotations in place of REST calls to the web app.
- incorporates additional boosting methods: SGBoost, GBoost, AdaBoost, and RF
- cleaner organization of output files
- added gnomAD population AF annotations
- added ClinVar annotations
- igv-reports incorporates ClinVar and FATHMM
- cancer-related variants selected based on CHASM and VEST p-values, or on FATHMM or ClinVar pathogenic attributes.

2.4.0
- added support for using single-end RNA-seq reads
- support for RVBLR (our RVBoost-integration as RVB-like R)
- pctextpos is computed as an annotation and leveraged by RVBLR
- more robust variant annotation / multithreading
- annot_PASS_reads is on by default again, with an option to disable it if necessary.
- various bugfixes