Expecting ploidyVcfFile in CanvasPartition #89

Rashesh7 · 2018-06-27T15:37:00Z

Hi,

In the latest version (1.38), when I run the Somatic-WGS workflow, it expects a ploidyVcfFile at the fifth step, 'CanvasPartition'.

So the command has:
${DOTNET}/dotnet ${CANVAS}/CanvasPartition/CanvasPartition.dll -p ""

So instead of treating the value as NULL, the script takes it as an empty string, still tries to validate the VCF, and errors out.

Can you please check if that is actually an issue?

The way I am running CANVAS is:

${DOTNET}/dotnet ${CANVAS}/Canvas.dll Somatic-WGS \
--bam=$TUMOR_BAMFILE \
--output=$outdir \
--reference=$CANVAS_data/kmer.fa \
--genome-folder=$CANVAS_data/WholeGenomeFasta \
--sample-name=$SAMPLE \
--filter-bed=$CANVAS_data/filter13.bed \
--sample-b-allele-vcf=$NORMAL_VCF \
--somatic-vcf=$TUMOR_VCF

eroller · 2018-06-27T16:36:30Z

Thanks for pointing this out. It seems like we should make --ploidy-vcf a required parameter then.

If you don't care about proper sex chromosome calling for a male sample, you can provide an empty VCF file (header lines only, no records) for this parameter, and both X and Y will be treated as diploid.
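A minimal sketch of producing such a header-only file, in case it helps. The header lines are copied from the ploidy VCF templates posted in this thread; the function names are hypothetical, and including the `#CHROM` column line with a sample name is an assumption about what "header lines" covers.

```python
# Header lines as they appear in the ploidy VCF templates in this thread.
PLOIDY_VCF_HEADER = [
    "##fileformat=VCFv4.1",
    '##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant described in this record">',
    '##FORMAT=<ID=CN,Number=1,Type=Integer,Description="Copy number genotype for imprecise events">',
]
VCF_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT"]


def header_only_ploidy_vcf(sample_name):
    """Return the text of a ploidy VCF that has header lines but no records."""
    column_line = "\t".join(VCF_COLUMNS + [sample_name])
    return "\n".join(PLOIDY_VCF_HEADER + [column_line]) + "\n"


def write_header_only_ploidy_vcf(path, sample_name):
    """Write the header-only ploidy VCF to `path` (hypothetical helper)."""
    with open(path, "w") as handle:
        handle.write(header_only_ploidy_vcf(sample_name))
```

The resulting path would then be passed via --ploidy-vcf.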

Rashesh7 · 2018-06-27T16:59:41Z

Hi Eric,

I am sorry if you have explained this somewhere, but do you have an example of what the --ploidy-vcf input file should look like?

Thanks

eroller · 2018-06-27T17:16:15Z

Here is a template ploidy VCF for a female sample using GRCh38:

##fileformat=VCFv4.1
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant described in this record">
##FORMAT=<ID=CN,Number=1,Type=Integer,Description="Copy number genotype for imprecise events">
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	HCC1187-NovaSeq-Grch38
chrY	0	.	N	<CNV>	.	PASS	END=57227415	CN	0

Here is a template ploidy VCF for a male sample using GRCh38:

##fileformat=VCFv4.1
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant described in this record">
##FORMAT=<ID=CN,Number=1,Type=Integer,Description="Copy number genotype for imprecise events">
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	MALE
chrX	0	.	N	<CNV>	.	PASS	END=10000	CN	1
chrX	2781479	.	N	<CNV>	.	PASS	END=155701382	CN	1
chrX	156030895	.	N	<CNV>	.	PASS	END=156040895	CN	1
chrY	0	.	N	<CNV>	.	PASS	END=57227415	CN	1

And a template for a male sample using hg19:

##fileformat=VCFv4.1
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant described in this record">
##FORMAT=<ID=CN,Number=1,Type=Integer,Description="Copy number genotype for imprecise events">
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	sim4LP7404n03
chrX	0	.	N	<CNV>	.	PASS	END=60000	CN	1
chrX	2699520	.	N	<CNV>	.	PASS	END=154931043	CN	1
chrX	155260560	.	N	<CNV>	.	PASS	END=155270560	CN	1
chrY	0	.	N	<CNV>	.	PASS	END=59373566	CN	1

Rashesh7 · 2018-06-27T19:20:56Z

So is this supposed to be the ploidy of chromosomes in the Normal sample?
CANVAS will then predict the tumor purity and ploidy from the sample-b-allele-vcf?

eroller · 2018-06-27T20:12:30Z

Yes, the file should contain the sex chromosome ploidy of the normal sample. Canvas estimates tumor sample purity and overall ploidy (e.g. 4 for a tetraploid tumor) from both coverage and b-allele frequencies.

Rashesh7 · 2018-06-27T20:17:01Z

Cool, just wanted to confirm that.
Thank you.

Rashesh7 · 2018-07-06T18:32:33Z

Hi Eric,
What was the source for GRCh37/WholeGenomeFasta/genome.fa and GRCh38/WholeGenomeFasta/genome.fa?

If I were to use the reference genome that we use for mapping, are there any constraints or steps I should be aware of beyond the following?

Generate a similar directory structure as GRCh37/WholeGenomeFasta/.
Generate the GenomeSize.xml for reference genome (including all contigs?)
Run 'FlagUniqueKmers' to generate the kmer.fa file for the reference genome.

Please let me know if I am missing any steps.

Also, would the results differ with the inclusion of decoy and HLA contigs in the reference?

Thank you.

eroller · 2018-07-10T16:22:26Z

What reference genome are you using? As long as the coordinates are the same, you should be able to use one of the existing kmer.fa files we provide and simply change contig names if necessary. The presence of decoys and unmapped contigs may change the resulting calls slightly, but we don't know how significant the impact is for CNVs.

The steps you have listed look correct. Note that FlagUniqueKmers can take a day to run, depending on the machine, and requires a high-memory machine. You may also need to create a FASTA index (i.e. genome.fa.fai) using samtools.
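One rough way to check the "coordinates are the same" condition is to compare contig lengths between your own genome.fa.fai and the index of the provided reference. The sketch below assumes the standard samtools .fai layout (contig name in column 1, length in column 2); the function names and the rename mapping are hypothetical. Matching lengths are a necessary sanity check, not proof that the sequences are identical.

```python
def read_fai_lengths(fai_path):
    """Map contig name -> length from a samtools .fai index (columns 1 and 2)."""
    lengths = {}
    with open(fai_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 2:
                lengths[fields[0]] = int(fields[1])
    return lengths


def mismatched_contigs(own_lengths, provided_lengths, rename=None):
    """Return provided contigs that are missing or differ in length in your reference.

    `rename` maps provided contig names to your names, e.g. {"chr1": "1"}.
    Each value is (provided_length, own_length); own_length is None if missing.
    Extra contigs in your reference (decoys, HLA) are ignored.
    """
    rename = rename or {}
    problems = {}
    for name, length in provided_lengths.items():
        own_length = own_lengths.get(rename.get(name, name))
        if own_length != length:
            problems[name] = (length, own_length)
    return problems


# Example usage (paths are placeholders):
# own = read_fai_lengths("my_reference/genome.fa.fai")
# provided = read_fai_lengths("GRCh37/WholeGenomeFasta/genome.fa.fai")
# print(mismatched_contigs(own, provided, rename={"chr1": "1"}))
```

An empty result suggests the provided kmer.fa may be reusable after renaming contigs; any mismatch points toward rerunning FlagUniqueKmers on your own reference.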

Rashesh7 · 2018-07-10T16:32:22Z

We are using GRCh37 with decoy, and moving forward we will be using GRCh38 (including all the contigs and HLA contigs).
We might need to test that. I will update when we do.

The other question in the email was:
Is there any intermediate file that records the log2 values for each segment?

eroller · 2018-07-10T18:47:02Z

We provide normalized coverage values for each bin in bigwig format. Look for coverage.bigWig in the per-sample temp sub-directory under the analysis output directory.

The bigWig file contains floating-point values normalized to copy number. If you need to convert them to log2 for visualization, you can parse the values from the bedgraph output in the VisualizationTemp subdirectory. Look for coverage.bedgraph.
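A possible conversion is sketched below. It assumes coverage.bedgraph uses the usual four bedGraph columns (chrom, start, end, value) and that the log2 ratio wanted is relative to a diploid baseline, i.e. log2(value / 2); both are assumptions, and the function name is hypothetical.

```python
import math


def bedgraph_to_log2(lines, reference_copy_number=2.0):
    """Yield (chrom, start, end, log2_ratio) for each bedGraph data line.

    Values are assumed to be normalized to copy number, so the ratio is taken
    against `reference_copy_number`. Bins with a value of 0 yield -inf.
    """
    for line in lines:
        line = line.strip()
        # Skip blank lines and bedGraph header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, value = line.split()[:4]
        value = float(value)
        if value > 0:
            log2_ratio = math.log2(value / reference_copy_number)
        else:
            log2_ratio = float("-inf")
        yield chrom, int(start), int(end), log2_ratio


# Example usage (path is a placeholder):
# with open("VisualizationTemp/coverage.bedgraph") as handle:
#     for chrom, start, end, log2_ratio in bedgraph_to_log2(handle):
#         print(chrom, start, end, log2_ratio, sep="\t")
```

For sex chromosomes in a male sample, the baseline would presumably be 1 rather than 2, which is why the reference copy number is a parameter.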

eroller mentioned this issue Jun 28, 2018

NullReferenceException when running Germline-WGS #69

Open

alvaralmstedt mentioned this issue Nov 21, 2018

No output from sex chromosomes #106

Closed

eroller mentioned this issue Apr 23, 2019

Demo evaluation needs correction #118

Open
