Merge branch 'release/1.13.0'
David Jones committed Mar 6, 2018
2 parents 23aaf63 + cbaed1e commit 6a2634e
Showing 16 changed files with 1,030 additions and 88 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -9,3 +9,4 @@
/perltidy.LOG
/MYMETA.json
/MYMETA.yml
/reports
2 changes: 2 additions & 0 deletions .travis.yml
@@ -25,6 +25,8 @@ perl:
- "5.22"

script:
- set -e
- export PATH=$HOME/install/bin:$HOME/install/biobambam2/bin:$PATH
- git clone --depth 1 --single-branch --branch develop https://github.com/cancerit/cgpBigWig.git
- cd cgpBigWig
- ./setup.sh $HOME/install
13 changes: 12 additions & 1 deletion CHANGES.md
@@ -1,3 +1,14 @@
### 1.12.0
# CHANGES

## 1.13.0

* Overlapping read support
* Support for csi indexing of bam/cram files
* Scatter gather implemented for flagging
* Updates to README, license dates and contact information
* Uses CaVEMan core with overlapping reads enabled [1.13.0](https://github.com/cancerit/CaVEMan/releases/tag/1.13.0)

## 1.12.0

* Update to use CaVEMan 1.12.0
* Deal with CaVEMan 1.12.0 intermediate output now being gzipped
661 changes: 661 additions & 0 deletions LICENSE

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion Makefile.PL
@@ -1,7 +1,7 @@
#!/usr/bin/perl

##########LICENCE##########
# Copyright (c) 2014 Genome Research Ltd.
# Copyright (c) 2014-2018 Genome Research Ltd.
#
# Author: Cancer Genome Project [email protected]
#
69 changes: 49 additions & 20 deletions README.md
@@ -1,52 +1,66 @@
cgpCaVEManWrapper
=================
# cgpCaVEManWrapper

cgpCaVEManWrapper provides a simplified usage implementation for the complete Cancer Genome Project processing flow of the algorithm CaVEMan.

For details of the underlying algorithm please see the [CaVEMan](http://cancerit.github.io/CaVEMan/) site.
For details of the underlying algorithm please see the [CaVEMan][caveman] site.

For details of the filtering process please see the [cgpCaVEManPostProcessing](http://cancerit.github.io/cgpCaVEManPostProcessing/) site.
For details of the filtering process please see the [cgpCaVEManPostProcessing][caveman-pp] site.

| Master | Dev |
|---|---|
| [![Build Status](https://travis-ci.org/cancerit/cgpCaVEManWrapper.svg?branch=master)](https://travis-ci.org/cancerit/cgpCaVEManWrapper) | [![Build Status](https://travis-ci.org/cancerit/cgpCaVEManWrapper.svg?branch=dev)](https://travis-ci.org/cancerit/cgpCaVEManWrapper) |
| Master | Develop |
| --------------------------------------------- | ----------------------------------------------- |
| [![Master Badge][travis-master]][travis-base] | [![Develop Badge][travis-develop]][travis-base] |

---
## Docker, Singularity and Dockstore

### Dependencies/Install
There are pre-built images containing this codebase on quay.io.

* [dockstore-cgpwxs][ds-cgpwxs-git]
* Contains tools specific to WXS analysis.
* [dockstore-cgpwgs][ds-cgpwgs-git]
* Contains additional tools for WGS analysis.

These were primarily designed for use with dockstore.org but can be used as normal containers.

The docker images are known to work correctly after import into a singularity image.

## Dependencies/Install
Please install the following first:

* [PCAP-core](http://github.com/ICGC-TCGA-PanCancer/PCAP-core/releases)
* [cgpCaVEManPostProcessing](http://github.com/cancerit/cgpCaVEManPostProcessing/releases)
* [PCAP-core][pcap-core-rel]
* [cgpCaVEManPostProcessing][caveman-pp-rel]

Please see these for any child dependencies.

Once complete please run:

```
./setup.sh /some/install/location
```

This will automatically get the appropriate version of the core [CaVEMan](http://cancerit.github.io/CaVEMan/) algorithm.

---
This will automatically get the appropriate version of the core [CaVEMan][caveman] algorithm.

## Creating a release
#### Preparation

### Preparation

* Commit/push all relevant changes.
* Pull a clean version of the repo and use this for the following steps.

#### Cutting the release
### Cutting the release

1. Update `perl/lib/Sanger/CGP/Caveman.pm` to the correct version.
2. Run `./prerelease.sh`
3. Check all tests and coverage reports are acceptable.
4. Commit the updated docs tree and updated module/version.
5. Push commits.
6. Use the GitHub tools to draft a release.

LICENCE
=======
Copyright (c) 2014-2017 Genome Research Ltd.
## LICENCE

```
Copyright (c) 2014-2018 Genome Research Ltd.
Author: Cancer Genome Project <cgpit@sanger.ac.uk>
Author: CASM/Cancer IT <cgphelp@sanger.ac.uk>
This file is part of cgpCaVEManWrapper.
@@ -72,3 +86,18 @@ reads ‘Copyright (c) 2005, 2007, 2008, 2009, 2011, 2012’ and a copyright
statement that reads ‘Copyright (c) 2005-2012’ should be interpreted as being
identical to a statement that reads ‘Copyright (c) 2005, 2006, 2007, 2008,
2009, 2010, 2011, 2012’."
```

<!-- References -->
[caveman]: http://cancerit.github.io/CaVEMan
[caveman-pp]: http://cancerit.github.io/cgpCaVEManPostProcessing
[caveman-pp-rel]: https://github.com/cancerit/cgpCaVEManPostProcessing/releases
[pcap-core-rel]: https://github.com/cancerit/PCAP-core/releases
[bio-db-hts]: http://search.cpan.org/dist/Bio-DB-HTS
[ds-cgpwxs-git]: https://github.com/cancerit/dockstore-cgpwxs
[ds-cgpwgs-git]: https://github.com/cancerit/dockstore-cgpwgs

<!-- Travis -->
[travis-base]: https://travis-ci.org/cancerit/cgpCaVEManWrapper
[travis-master]: https://travis-ci.org/cancerit/cgpCaVEManWrapper.svg?branch=master
[travis-develop]: https://travis-ci.org/cancerit/cgpCaVEManWrapper.svg?branch=dev
44 changes: 33 additions & 11 deletions bin/caveman.pl
@@ -1,9 +1,9 @@
#!/usr/bin/perl

##########LICENCE##########
# Copyright (c) 2014-2017 Genome Research Ltd.
# Copyright (c) 2014-2018 Genome Research Ltd.
#
# Author: David Jones <cgpit@sanger.ac.uk>
# Author: CASM/Cancer IT <cgphelp@sanger.ac.uk>
#
# This file is part of cgpCaVEManWrapper.
#
@@ -64,6 +64,7 @@ BEGIN
const my $IDS_MUTS_TBI => q{%s.muts.ids.vcf.gz.tbi};
const my $NO_ANALYSIS => q{%s.no_analysis.bed};
const my $SP_ASS_MESSAGE => qq{%s defined at commandline (%s) does not match that in the BAM file (%s). Defaulting to BAM file value.\n};
const my $SPLIT_LINE_COUNT => 1000;

const my @VALID_PROTOCOLS => qw(WGS WXS RNA);
const my @PERMITTED_SEQ_TYPES => qw(pulldown|exome|genome|genomic|followup|targeted|rna_seq);
@@ -90,6 +91,7 @@ BEGIN
$threads->add_function('caveman_split', \&Sanger::CGP::Caveman::Implement::caveman_split);
$threads->add_function('caveman_mstep', \&Sanger::CGP::Caveman::Implement::caveman_mstep);
$threads->add_function('caveman_estep', \&Sanger::CGP::Caveman::Implement::caveman_estep);
$threads->add_function('caveman_flag', \&Sanger::CGP::Caveman::Implement::caveman_flag);

# this is here just to make the reference usable if not the same samtools version
my $ref = $options->{'reference'};
@@ -114,8 +116,12 @@ BEGIN

if(!exists $options->{'process'} || $options->{'process'} eq 'split'){
$options->{'out_file'} = $options->{'splitList'};
my $contig_count = Sanger::CGP::Caveman::Implement::file_line_count($options->{'reference'});
my $valid_fai_idx = Sanger::CGP::Caveman::Implement::valid_seq_indexes($options);
my $contig_count = scalar @{$valid_fai_idx};
$options->{'valid_fai_idx'} = $valid_fai_idx;
#= Sanger::CGP::Caveman::Implement::file_line_count($options->{'reference'});
$threads->run($contig_count, 'caveman_split', $options);
delete $options->{'valid_fai_idx'};
}

if(!exists $options->{'process'} || $options->{'process'} eq 'split_concat'){
@@ -168,15 +174,28 @@ BEGIN
if((!exists $options->{'process'} || $options->{'process'} eq 'flag')
&& (!defined $options->{'noflag'} || $options->{'noflag'} != 1)){
$options->{'for_flagging'} = $options->{'ids_muts_file'};
$options->{'flagged'} = sprintf($FLAGGED_MUTS,$options->{'out_file'});
Sanger::CGP::Caveman::Implement::caveman_flag($options);
$options->{'for_split'} = $options->{'ids_muts_file'};
$options->{'split_out'} = $options->{'ids_muts_file'}.'split';
$options->{'split_lines'} = $SPLIT_LINE_COUNT;
#Split flagging file
Sanger::CGP::Caveman::Implement::caveman_split_vcf($options);
#Count flag target
$options->{'vcf_split_count'} = Sanger::CGP::Caveman::Implement::count_files($options,$options->{'split_out'}.'*');
#flag each as an array
$options->{'flagged'} = sprintf($FLAGGED_MUTS,$options->{'out_file'});
#Run the flagging code with number of split jobs.
$threads->run($options->{'vcf_split_count'}, 'caveman_flag', $options);
#concatenate flagged files into a single flagged output file
Sanger::CGP::Caveman::Implement::concat_flagged($options);
#Gzip and index output flagged file
Sanger::CGP::Caveman::Implement::zip_flagged($options);
}

if((!exists $options->{'process'}) #We aren't specifying steps
|| ($options->{'process'} eq 'flag') #We've flagged so we are done anyway
|| ($options->{'noflag'} == 1 && $options->{'process'} eq 'add_ids')){ #No flagging wanted and preflagging step done
#finally cleanup after ourselves by removing the temporary output folder, split files etc.
#TODO Zip the snps files with IDs
#Zip the snps files with IDs
Sanger::CGP::Caveman::Implement::pre_cleanup_zip($options);
cleanup($options);
}
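
The flagging hunk above replaces a single `caveman_flag` call with a scatter/gather pattern: the VCF is split into pieces of `$SPLIT_LINE_COUNT` (1000) lines, each piece is flagged as a separate job, and the flagged pieces are concatenated into one output. The following is only a minimal sketch of that pattern. It works on in-memory records rather than files, and `split_records`, `flag_chunk` and `concat_chunks` are hypothetical names chosen for illustration; they are not the `Sanger::CGP::Caveman::Implement` functions, whose bodies are not part of this diff.

```perl
use strict;
use warnings;

# Scatter: break the records into chunks of at most $size entries.
sub split_records {
  my ($records, $size) = @_;
  my @copy = @{$records};   # work on a copy so the caller's list is untouched
  my @chunks;
  push @chunks, [splice(@copy, 0, $size)] while @copy;
  return \@chunks;
}

# Stand-in for the real flagging step (assumption): every record in the
# chunk is simply marked PASS.
sub flag_chunk {
  my ($chunk) = @_;
  return [map { "$_\tPASS" } @{$chunk}];
}

# Gather: flatten the flagged chunks back into one list, preserving order.
sub concat_chunks {
  my ($chunks) = @_;
  return [map { @{$_} } @{$chunks}];
}

my @records = map { "chr1\t$_" } (1..2500);
my $chunks = split_records(\@records, 1000);
my @flagged_chunks = map { flag_chunk($_) } @{$chunks};
my $flagged = concat_chunks(\@flagged_chunks);
printf "%d chunks, %d flagged records\n", scalar @{$chunks}, scalar @{$flagged};
```

In the wrapper itself each chunk is handed to `$threads->run(...)` as an indexed job; the sequential `map` here only stands in for that dispatch.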
@@ -290,6 +309,7 @@ sub setup {
'mpc|mut_probability_cutoff=f' => \$opts{'mpc'},
'spc|snp_probability_cutoff=f' => \$opts{'spc'},
'e|read-count=i' => \$opts{'read-count'},
'x|exclude=s' => \$opts{'exclude'},
'dbg|debug' => \$opts{'debug_cave'},
) or pod2usage(2);

@@ -311,13 +331,14 @@ sub setup {
PCAP::Cli::file_for_reading('reference',$opts{'reference'});
PCAP::Cli::file_for_reading('tumour-bam',$opts{'tumbam'});
PCAP::Cli::file_for_reading('normal-bam',$opts{'normbam'});
#We should also check the bam indexes exist.
my $tumidx = $opts{'tumbam'}.".bai";
my $normidx = $opts{'normbam'}.".bai";
PCAP::Cli::file_for_reading('tumour-bai',$tumidx);
PCAP::Cli::file_for_reading('normal-bai',$normidx);
PCAP::Cli::file_for_reading('ignore-file',$opts{'ignore'});

#We should also check an index exists.
for my $op(qw(normbam tumbam)) {
pod2usage(-message => "\nERROR: $op |".$opts{$op}."| cannot locate index file.\n", -verbose => 1, -output => \*STDERR)
unless(-f $opts{$op}.'.bai' || -f $opts{$op}.'.csi' || -f $opts{$op}.'.crai');
}

if(exists($opts{'tumcn'}) && defined($opts{'tumcn'})){
if(-e $opts{'tumcn'}) {
if(-s $opts{'tumcn'} == 0 && (!exists $opts{'tumdefcn'} || !defined $opts{'tumdefcn'})) {
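
The index check in this hunk no longer requires a `.bai` file; any of `.bai`, `.csi` or `.crai` next to the alignment file is accepted. The same test can be sketched as a standalone helper. This is an illustration only: `find_index` is a hypothetical name that does not exist in the wrapper, and the demonstration uses empty placeholder files in a temporary directory rather than real BAM/CRAM data.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical helper (not part of cgpCaVEManWrapper): return the first
# index file found alongside an alignment file, or undef if none exists.
sub find_index {
  my ($aln) = @_;
  for my $ext (qw(.bai .csi .crai)) {
    my $idx = $aln.$ext;
    return $idx if -f $idx;
  }
  return;
}

# Demonstration with empty placeholder files: a "BAM" that only has a csi index.
my $dir = tempdir(CLEANUP => 1);
my $bam = File::Spec->catfile($dir, 'tumour.bam');
for my $file ($bam, $bam.'.csi') {
  open my $fh, '>', $file or die "Cannot create $file: $!";
  close $fh;
}

my $found = find_index($bam);
print defined $found ? "index: $found\n" : "no index found\n";
```

Returning the matched path rather than a boolean is a design choice for the sketch; the check in the diff only needs to know that at least one index is present.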
@@ -344,6 +365,7 @@ sub setup {
delete $opts{'process'} unless(defined $opts{'process'});
delete $opts{'index'} unless(defined $opts{'index'});
delete $opts{'limit'} unless(defined $opts{'limit'});
delete $opts{'exclude'} unless(defined $opts{'exclude'});

$opts{'read-count'} = 350_000 unless(defined $opts{'read-count'});

4 changes: 2 additions & 2 deletions bin/caveman_merge_results.pl
@@ -1,9 +1,9 @@
#!/usr/bin/perl

##########LICENCE##########
# Copyright (c) 2014 Genome Research Ltd.
# Copyright (c) 2014-2018 Genome Research Ltd.
#
# Author: David Jones <cgpit@sanger.ac.uk>
# Author: CASM/Cancer IT <cgphelp@sanger.ac.uk>
#
# This file is part of cgpCaVEManWrapper.
#
6 changes: 3 additions & 3 deletions lib/Sanger/CGP/Caveman.pm
@@ -1,9 +1,9 @@
package Sanger::CGP::Caveman;

##########LICENCE##########
# Copyright (c) 2014-2017 Genome Research Ltd.
# Copyright (c) 2014-2018 Genome Research Ltd.
#
# Author: David Jones <cgpit@sanger.ac.uk>
# Author: CASM/Cancer IT <cgphelp@sanger.ac.uk>
#
# This file is part of cgpCaVEManWrapper.
#
@@ -26,6 +26,6 @@ package Sanger::CGP::Caveman;
use strict;
use Const::Fast qw(const);

our $VERSION = '1.12.0';
our $VERSION = '1.13.0';

1;