newstudy/mixed_nfosi_2022 #1736

Closed
wants to merge 21 commits into from


@anngvu
Contributor

anngvu commented Nov 22, 2022

Cancer studies updated in this pull request:
Submitting new public study on behalf of NF-OSI as described here: #1734

For issues with the submission, please comment on nf-osi#1 and we'll try to resolve (e.g. if you'd like to remove the HTML validation report). This might fail the genome reference check, but we were told by @ritikakundra and Thomas Yu that this should be OK?

checks

For all pull requests:

  • Passes validation

For a new study (in addition to above):

  • Do the study name and study ID follow our convention? e.g. Tumor_Type (Institute, Journal Year); brca_mskcc_2015
  • Is the study metadata complete? e.g. pmid, group of PUBLIC
  • Were all samples profiled with WES/WGS? If not, is the gene panel file curated?
  • Are the OncoTree codes of all samples curated? Cancer Type and Cancer Type Detailed need to be added in addition to Oncotree Code
  • clinical sample and patient data with meta files
  • mutations data with meta files
  • MAF is based on hg19
  • MAF with 2 isoforms: uniprot and mskcc
  • CNA data with meta files
  • CNA segment data with meta files
  • Expression data including z-scores with meta files
  • Case-lists for all profiles.
  • Manual checking (Niki or JJ): Triage or private Portal link here

@jaybee84
Contributor

@rmadupuri and @anngvu During additional testing of the variant calls, we noticed that the variants included in this PR may not have been fully filtered to remove germline variants and false positives. We are generating a new merged MAF file with those variants filtered out, and our goal is to replace the current merged MAF file by the end of this week. @rmadupuri would you be okay with holding off on reviewing this PR until next week?

@rmadupuri
Collaborator

@jaybee84 of course! Let us know when the data is fixed and ready for review.

@anngvu
Contributor Author

anngvu commented Jan 19, 2023

@rmadupuri @jaybee84 Replaced with updated mutations data file. I took a look at https://output.circle-artifacts.com/output/job/9a956c59-873e-49d1-9943-acd33ac8d41a/artifacts/0/~/test-reports/ERRORS/mixed_nfosi_2022-validation.html and it's still just the genome reference check.
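Since the only remaining validation error is the genome reference check, one quick way to see which build a mutations file declares is to tally its `NCBI_Build` column. The following is a minimal sketch, not part of the cBioPortal validator; the helper name `count_builds` and the two-row MAF excerpt are hypothetical and only illustrate the idea.

```python
import csv
from collections import Counter


def count_builds(maf_text):
    """Tally NCBI_Build values in a tab-separated MAF, skipping '#' comment lines."""
    lines = [ln for ln in maf_text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.DictReader(lines, delimiter="\t")
    return Counter(row["NCBI_Build"] for row in reader)


# Hypothetical excerpt; real MAF files carry many more columns.
example_maf = (
    "#version 2.4\n"
    "Hugo_Symbol\tNCBI_Build\tChromosome\tTumor_Sample_Barcode\n"
    "NF1\tGRCh38\t17\tsample_A\n"
    "SUZ12\tGRCh38\t17\tsample_B\n"
)

build_counts = count_builds(example_maf)
print(build_counts)
```

A file whose rows all report GRCh38 would be expected to trip a check written for hg19 studies, which matches what the report shows here.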

@anngvu
Contributor Author

anngvu commented Jan 20, 2023

@rmadupuri You do a squash merge, right?

@rmadupuri
Collaborator

@anngvu thanks for the fixes. Yes we do a squash merge.

@rmadupuri
Collaborator

rmadupuri commented Jan 26, 2023

@anngvu @jaybee84 quick questions,

  1. The paper mentions 55 tumors from 23 patients were sequenced. The data files have 18 pats / 19 samples. Will you be adding in the data for the missing samples?
  2. Do you have the RNA-seq, copy number data for the samples?
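The patient and sample counts in question 1 can be reproduced from the clinical sample file. This is a rough sketch that assumes the usual cBioPortal layout (leading `#` metadata rows, then a tab-separated header containing `PATIENT_ID` and `SAMPLE_ID`); the function name and the example rows are hypothetical.

```python
import csv


def count_patients_and_samples(clinical_text):
    """Return (unique patients, unique samples) from a clinical sample file."""
    # Drop blank lines and the '#'-prefixed metadata rows above the header.
    lines = [ln for ln in clinical_text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.DictReader(lines, delimiter="\t")
    patients, samples = set(), set()
    for row in reader:
        patients.add(row["PATIENT_ID"])
        samples.add(row["SAMPLE_ID"])
    return len(patients), len(samples)


# Hypothetical data: two patients, one of whom has two samples.
example_clinical = (
    "#Patient Identifier\tSample Identifier\n"
    "#Patient identifier\tSample identifier\n"
    "#STRING\tSTRING\n"
    "#1\t1\n"
    "PATIENT_ID\tSAMPLE_ID\n"
    "p1\tp1_s1\n"
    "p1\tp1_s2\n"
    "p2\tp2_s1\n"
)

n_patients, n_samples = count_patients_and_samples(example_clinical)
print(n_patients, n_samples)
```

Run against the submitted file, the same count should give the 18 patients / 19 samples mentioned above.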

@jaybee84
Contributor

@rmadupuri Thanks for the questions:

> The paper mentions 55 tumors from 23 patients were sequenced. The data files have 18 pats / 19 samples. Will you be adding in the data for the missing samples?

In the scope of this paper, only 18 patients had matched tumor-normal samples for somatic variant calling so we could only submit samples from 18 patients. We are only submitting somatic calls according to GATK best practices.

Although it is outside the scope of the attached publication, these samples are from a biobank. We are currently preparing to release another batch of samples and data that we would like to add to cBioPortal. This submission is a pilot batch to test and finalize our internal processes for later submissions.

> Do you have the RNA-seq, copy number data for the samples?

We have RNA-seq data from these patients and are hoping to submit it this year.

@rmadupuri
Collaborator

Got it. Thanks @jaybee84! Here are a few metadata corrections needed.

Can we update the study identifier to nst_nfosi_2022?

meta_study.txt:

  • Can you update the study name to Nerve Sheath Tumors (Johns Hopkins, Sci Data 2020)
  • And citation to Pollard et al. Sci Data 2020
  • Add an additional row for the PMID, pmid: 32561749. This links out to the publication.
  • And a row for reference_genome: hg38. This differentiates between hg19 and hg38 studies.
  • And update type_of_cancer to nst
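Put together, the requested meta_study.txt might look roughly like the output of the sketch below. The values come from the review comments above; the `description` text is a placeholder, since the thread does not give one, and the helper `render_meta` is hypothetical.

```python
def render_meta(fields):
    """Render a dict as 'key: value' lines, the layout used by cBioPortal meta files."""
    return "".join(f"{key}: {value}\n" for key, value in fields.items())


meta_study = {
    "type_of_cancer": "nst",
    "cancer_study_identifier": "nst_nfosi_2022",
    "name": "Nerve Sheath Tumors (Johns Hopkins, Sci Data 2020)",
    # Assumption: placeholder only; the real description is not stated in this thread.
    "description": "PLACEHOLDER study description",
    "citation": "Pollard et al. Sci Data 2020",
    "pmid": "32561749",
    "reference_genome": "hg38",
}

meta_text = render_meta(meta_study)
print(meta_text)
```

Other rows the submitted file already carries (for example a groups row) are left out here and would stay as they are.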

mutations data:

  • Can you change the file names to data_mutations.txt and meta_mutations.txt?

case lists:

  • Update the all sample case list name to cases_all.txt
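For reference, a cases_all.txt file generally follows the shape sketched below: the stable ID is the study identifier plus `_all`, and sample IDs are tab-separated on one line. The sample IDs, the name and description wording, and the helper `render_case_list` are hypothetical; only the file name and study identifier come from this review.

```python
def render_case_list(study_id, sample_ids):
    """Build the text of an all-samples case list for the given study."""
    lines = [
        f"cancer_study_identifier: {study_id}",
        f"stable_id: {study_id}_all",
        "case_list_name: All samples",
        f"case_list_description: All samples ({len(sample_ids)} samples)",
        "case_list_ids: " + "\t".join(sample_ids),
    ]
    return "\n".join(lines) + "\n"


# Hypothetical sample IDs, for illustration only.
case_list_text = render_case_list("nst_nfosi_2022", ["sample_A", "sample_B"])
print(case_list_text)
```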

Thanks!

@jaybee84
Contributor

tagging @anngvu for the suggested updates to the PR. Thanks for the review @rmadupuri !

@anngvu
Contributor Author

anngvu commented Jan 31, 2023

@rmadupuri Also thanks for the review! 🙏 I applied suggested changes.

@rmadupuri
Collaborator

rmadupuri commented Jan 31, 2023

Thanks for the fixes @anngvu! We have added a few edits and standardized the files to cBioPortal formats in #1760 (I was not able to push my commits to your branch because of access issues). Closing this PR here; we will merge the data along with your commits in PR 1760.

Thanks so much again for all the work. We will update you once the data is released to the public portal.


3 participants