Visualize bulkATACseq? #1334

Closed
mccalluc opened this issue Oct 29, 2020 · 6 comments
Labels
enhancement (New feature or request), feature: vitessce (View configs / pipelines), question (Further information is requested)

Comments

@mccalluc
Contributor

Currently in QA on PROD:

HBM395.NRTL.659
HBM279.JRTJ.535
HBM583.CJZM.893
HBM742.TXRC.975
HBM555.CGHX.875
HBM256.TSRN.268
HBM643.KKCR.667
HBM488.SDDC.876
@mccalluc added the enhancement, question, and UI labels on Oct 29, 2020
@pecan88
Contributor

pecan88 commented May 24, 2021

@ngehlenborg @mccalluc @ilan-gold - Stanford TMC has approved for release a list of eight processed bulk ATAC-seq datasets that I found in the system from a while back. After investigating, @khanshawPSC & @jswelling identified this open item as relating to those datasets.

May we publish these datasets, or is there a reason to hold off from a visualization perspective?

@ngehlenborg
Member

(For reference, we are talking about these datasets: https://portal.hubmapconsortium.org/search?mapped_data_types[0]=Bulk%20ATAC-seq%20%5BBWA%20%2B%20MACS2%5D&group_name[0]=Stanford%20TMC&entity_type[0]=Dataset)

We could visualize these datasets in Vitessce as is (i.e., no additional processing needed), so that can be added later (see hubmapconsortium/portal-visualization#14).

I noticed, however, that the output directories are not properly annotated. For example, QC report files (here: FastQC HTML reports and ZIP files) are not marked as such, so the "Show QA Files Only" button does not work. The output file formats are not annotated either (hovering over the "?" icon results in a mostly empty tooltip):

image

Most importantly, it is not possible to figure out which genome build was used for the mapping, so the data can't be interpreted.

@pecan88
Contributor

pecan88 commented May 24, 2021

Thank you, @ngehlenborg - I will redirect to Stanford TMC, @khanshawPSC, and @mruffalo regarding the directory and output file format annotation problems.

@pecan88
Contributor

pecan88 commented May 27, 2021

@khanshawPSC & @mruffalo - what did you find when looking at the problem, and what are the next steps toward moving these datasets to publication?

@mruffalo
Contributor

Visualization support isn't a blocker for publication, but it may be worth delaying publication so the pipeline can be modified and re-run to write the additional metadata described by @ngehlenborg and @ilan-gold in hubmapconsortium/portal-visualization#14. There isn't yet any consensus about the file format and content for this metadata, but we could add another pipeline output file quite easily once the contents are finalized.

Alternatively, these datasets can be published as-is (now), then re-run in the future once we add the additional metadata for visualization support, assuming API and UI support for dataset versioning.

@mccalluc
Contributor Author

mccalluc commented Feb 7, 2022

Closing... but please reopen, and clarify the scope, if I have misunderstood.

@mccalluc closed this as completed on Feb 7, 2022