Additional review needed for the identification of WGS/WES samples in clickhouse table development #10872

Open
sheridancbio opened this issue Jun 27, 2024 · 1 comment

Comments

@sheridancbio
Contributor

This relates to cases where a study contains a sample which appears to be part of a genetic profile, but the sample is not present in data_gene_matrix.txt, or the gene panel id value is 'NA' or missing for a sample which is present in data_gene_matrix.txt.
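The two cases above can be checked per study with a short script. The following is only a sketch under stated assumptions: it assumes data_gene_matrix.txt is tab-delimited with a `SAMPLE_ID` column and one column per profile holding the gene panel id, and the function name `find_unresolved_samples`, the column name `mutations`, and the sample ids are hypothetical illustrations rather than anything taken from the importer.

```python
import csv
import io


def find_unresolved_samples(gene_matrix_text, profiled_sample_ids, profile_column):
    """Return (absent, na_or_missing) sample id lists for one profile column.

    Assumption: the matrix is tab-delimited with a SAMPLE_ID header column.
    """
    reader = csv.DictReader(io.StringIO(gene_matrix_text), delimiter="\t")
    # Map each sample in the matrix to its (possibly empty) gene panel id.
    panel_by_sample = {
        row["SAMPLE_ID"]: (row.get(profile_column) or "").strip() for row in reader
    }
    # Case 1: the sample is in the profile but has no row in the matrix.
    absent = sorted(s for s in profiled_sample_ids if s not in panel_by_sample)
    # Case 2: the sample has a row, but the panel id is 'NA' or empty.
    na_or_missing = sorted(
        s
        for s in profiled_sample_ids
        if s in panel_by_sample and panel_by_sample[s] in ("", "NA")
    )
    return absent, na_or_missing


# Hypothetical example data: S2 has 'NA', S3 has an empty value, S4 has no row.
EXAMPLE_MATRIX = "SAMPLE_ID\tmutations\nS1\tPANEL_A\nS2\tNA\nS3\t\n"
ABSENT, NA_OR_MISSING = find_unresolved_samples(
    EXAMPLE_MATRIX, {"S1", "S2", "S3", "S4"}, "mutations"
)
```

Whether either list should be read as WGS/WES or as "not profiled" is exactly the open question in this issue; the sketch only surfaces the samples.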

Translation of raw cBioPortal database tables into derived ClickHouse tables (e.g. sample_to_gene_panel_derived)

Scripts have been developed to produce flattened tables and views for clickhouse development efforts underway. See:

INSERT INTO sample_to_gene_panel_derived

These scripts attempt to connect the PANEL_ID field from the sample_profile table to the panels present in the gene_panel table. If there is no connecting gene panel, the value 'WES' is used in place of the (missing) gene panel stable id. This logic should be considered together with the discussion in #10871: 'NA' values might or might not be present in data_gene_matrix.txt, and the resulting import might or might not introduce a record into sample_profile, depending on whether detected non-silent mutations were imported into the mutations table for the sample.
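To make the fallback concrete, here is a minimal sketch of the join-with-default pattern described above, run against an in-memory SQLite database rather than ClickHouse. It is not the actual script: the reduced table layouts, the lowercase column names (`panel_id`, `internal_id`, `stable_id`), the helper `derive_sample_to_gene_panel`, and the example rows are all assumptions made for illustration.

```python
import sqlite3


def derive_sample_to_gene_panel(sample_profile_rows, gene_panel_rows):
    """Join sample_profile to gene_panel, substituting 'WES' when no panel matches."""
    conn = sqlite3.connect(":memory:")
    # Assumed, simplified versions of the two source tables.
    conn.execute(
        "CREATE TABLE gene_panel (internal_id INTEGER PRIMARY KEY, stable_id TEXT)"
    )
    conn.execute(
        "CREATE TABLE sample_profile "
        "(sample_id TEXT, genetic_profile_id TEXT, panel_id INTEGER)"
    )
    conn.executemany("INSERT INTO gene_panel VALUES (?, ?)", gene_panel_rows)
    conn.executemany("INSERT INTO sample_profile VALUES (?, ?, ?)", sample_profile_rows)
    # LEFT JOIN keeps every sample_profile row; COALESCE supplies the 'WES' default
    # when panel_id is NULL or points at no gene_panel row.
    rows = conn.execute(
        """
        SELECT sp.sample_id,
               sp.genetic_profile_id,
               COALESCE(gp.stable_id, 'WES') AS gene_panel_id
        FROM sample_profile AS sp
        LEFT JOIN gene_panel AS gp ON sp.panel_id = gp.internal_id
        ORDER BY sp.sample_id
        """
    ).fetchall()
    conn.close()
    return rows


# Hypothetical rows: S1 has a panel, S2 has a sample_profile row with no panel.
GENE_PANELS = [(1, "PANEL_A")]
SAMPLE_PROFILES = [("S1", "study_mutations", 1), ("S2", "study_mutations", None)]
DERIVED = derive_sample_to_gene_panel(SAMPLE_PROFILES, GENE_PANELS)
```

In this sketch a sample with no sample_profile row yields no derived row at all, while any row without a matching panel is labeled 'WES' regardless of why the panel is missing. That is why the derived value depends on how the importer decides to create sample_profile records.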

Once the expected data representation in sample_profile is determined and specified for WGS/WES and for non-profiled samples, the logic in these scripts should be examined and updated if necessary.

@sheridancbio
Contributor Author

This issue was created after review of #10867 (@haynescd)
