Derived dataset for each sample: fraction of genome altered by 1: copy number change 2. number of mutations #1

jingchunzhu · 2017-05-05T17:53:50Z

Build derived datasets:
for each sample: fraction of genome altered by copy number change
for each sample: number of mutations

Hi,

Would you guys be able to create a track for 1: fraction of genome altered by copy number change and 2: number of mutations, for each TCGA cohort or for the pan-cancer cohort? It used to be available in cBioPortal. The number of mutations per sample is still available, but the fraction of the genome altered by copy number is no longer available. Someone from MSKCC is working on getting that live again. Or is there a way to generate this data by downloading it from Xena and calculating it myself?

Thanks,

jingchunzhu · 2017-05-05T22:39:33Z

Total mutation count (mutation burden): It is only important to know how many mutations are present. The specific mutations are not important.
Fraction of genome altered by copy number (0-1): cBioPortal has calculated it as follows: The fraction of copy number altered genome = length of segments with log2 CNA value larger than 0.2 divided by the length of all segments measured. This is basically a measurement of genomic instability.
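
The two definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the cBioPortal or Xena implementation: the function names, the tuple layout of the input rows, and the `use_absolute` switch are all hypothetical. The quoted definition does not say whether "larger than 0.2" refers to the signed value or to its magnitude, so the sketch exposes both readings.

```python
from collections import Counter, defaultdict


def fraction_genome_altered(segments, cutoff=0.2, use_absolute=True):
    """Per-sample fraction of measured genome covered by altered segments.

    segments: iterable of (sample, start, end, log2_value) tuples
              (hypothetical layout, loosely modeled on a segment file).
    use_absolute: if True, count gains and losses (|value| > cutoff);
                  if False, count only gains (value > cutoff).
                  Which reading matches cBioPortal is an assumption.
    """
    altered = defaultdict(int)
    total = defaultdict(int)
    for sample, start, end, value in segments:
        length = end - start
        total[sample] += length
        magnitude = abs(value) if use_absolute else value
        if magnitude > cutoff:
            altered[sample] += length
    # Skip samples with no measured length to avoid dividing by zero.
    return {s: altered[s] / total[s] for s in total if total[s] > 0}


def mutation_counts(mutations):
    """Per-sample mutation burden: only the number of rows matters.

    mutations: iterable of (sample, gene) rows, one per called mutation
               (hypothetical layout).
    """
    return dict(Counter(sample for sample, _gene in mutations))
```

For example, a sample with segments of length 100 at 0.5, length 100 at 0.0, and length 200 at -0.6 has a fraction of 0.75 when losses are counted and 0.25 when only gains are counted, which shows how much the sign convention matters.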

Question: is there any background on the cutoff of 0.2?

jingchunzhu · 2017-05-05T23:37:52Z

In GBM, classifying PTEN with the 0.2 cutoff gives 84% of samples with a PTEN deletion. Is this about right?

http://dev.xenabrowser.net/heatmap/?bookmark=a05f9847421717d27d5e6fa60a67e79b

http://dev.xenabrowser.net/heatmap/?bookmark=723e4cd313b2380869a255f5dde62171

jingchunzhu · 2017-05-07T18:32:10Z

“In a diploid genome, a single-copy gain in a perfectly pure, homogeneous sample has a copy ratio of 3/2. In log2 scale, this is log2(3/2) = 0.585, and a single-copy loss is log2(1/2) = -1.0.” However, most tumors are heterogeneous (multiple clonal tumor populations) and contain some normal stroma. Therefore, the sample’s purity and heterogeneity need to be considered so that alterations are not missed, which means using a lower threshold. I have also seen a lot of cancer-focused publications using 0.2 as a threshold. I am guessing 0.2 is used for these reasons.
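
To make the purity argument concrete, here is a minimal sketch of the expected log2 ratio when tumor cells are mixed with diploid normal cells. It assumes a simple two-population mixture (one tumor clone plus normal stroma); the function name and parameters are hypothetical and do not come from any particular tool.

```python
import math


def expected_log2_ratio(tumor_copies, purity, normal_copies=2):
    """Expected log2 copy ratio for a tumor/normal mixture.

    tumor_copies: copy number of the region in tumor cells.
    purity: fraction of cells in the sample that are tumor (0 to 1).
    normal_copies: copy number in normal cells (2 for a diploid genome).
    Assumes one tumor clone plus normal cells, which is a simplification.
    """
    mean_copies = purity * tumor_copies + (1.0 - purity) * normal_copies
    if mean_copies <= 0:
        # Homozygous deletion in a perfectly pure sample: no signal at all.
        return float("-inf")
    return math.log2(mean_copies / normal_copies)


# Pure sample: reproduces the values quoted above.
print(round(expected_log2_ratio(3, 1.0), 3))  # 0.585
print(expected_log2_ratio(1, 1.0))            # -1.0

# Lower purity pulls both values toward 0.
for purity in (0.8, 0.5, 0.3):
    gain = expected_log2_ratio(3, purity)
    loss = expected_log2_ratio(1, purity)
    print(purity, round(gain, 3), round(loss, 3))
```

Under this model, a single-copy gain at 30% purity lands just above 0.2, so a cutoff near 0.2 still catches single-copy changes in fairly impure samples.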

The frequency of a PTEN deletion (one or both alleles) in GBM is 89% (514/577).

duxiuju · 2018-09-15T14:09:31Z

Dear jingchunzhu,
I would like to ask whether 'log2 CNA value larger than 0.2' means a value larger than +0.2 or an absolute value larger than 0.2. If it only means values larger than +0.2, copy number loss is neglected, right?

jingchunzhu assigned jingchunzhu and unassigned jingchunzhu Jun 14, 2017

ucscXena deleted a comment from souravsingh Mar 10, 2018
