CNV plotting improvements. #2853

Closed
droazen opened this issue Jun 5, 2017 · 2 comments

droazen commented Jun 5, 2017

@LeeTL1220 commented on Wed May 04 2016

Please see broadinstitute/gatk-protected#224 for additional information and proposed solutions.

We would like to have interactive plots generated for ACNV outputs.

In the past, we would use IGV, but it is too inflexible.


@samuelklee commented on Wed Oct 19 2016

MAF-CR plots would also be nice.

Copied from broadinstitute/gatk-protected#224:

We also discussed possibly having some interactive plots in the future. I think that checking out packages like plotly (https://plot.ly/r/) would be a good start. The ultimate goal would be to build some sort of dashboard (maybe using shiny, http://shiny.rstudio.com/) that takes in seg files from CNV/ACNV/etc. and generates several plots at once. Even simple things like being able to interactively select which chromosomes/segments to plot, having the ability to zoom, or hover-highlighting segments would make the results much easier to parse and interpret.


@sooheelee commented on Wed Jan 25 2017

My recommendation is that some part of the output data be compatible with IGV at the very least. Take a look at the mutation overlay feature on this page that allows users to overlay MAF mutations onto expression data. This could easily be mutation data over CNV heatmap data.

Any additional plotting feature would be cherries on top.

Also, we have three related github issues. Perhaps consolidate your efforts so as not to duplicate them?

Finally, the b37-only compatibility is not acceptable in my opinion. Our genomics tools should not have this kind of constraint. The use of model organisms and the light they shed on the biological pathways we study is what has brought and continues to bring the great advances we see in biomedical research. Honestly, we cannot even say that this plotting tool is hg19-compatible. It is not. Some users don't know how to use sed to remove the 'chr' prefix from files, and furthermore, this kind of file editing runs directly counter to the concept of data provenance that rigorous scientific research should uphold.
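For readers unfamiliar with the 'chr' workaround mentioned above, here is a minimal sketch of it in Python. The function names and the tab-separated layout are assumptions made for illustration; they are not part of any GATK tool. The sketch yields modified copies of the lines rather than editing a file in place, so the original file can be kept untouched. It also does not handle every naming difference between references (for example, `chrM` versus `MT`).

```python
def strip_chr_prefix(contig):
    """Return the contig name without a leading 'chr' (e.g. 'chr1' -> '1')."""
    return contig[3:] if contig.startswith("chr") else contig


def normalize_contigs(lines, contig_column=0, has_header=False, sep="\t"):
    """Yield copies of delimited lines with the contig column normalized.

    Lines starting with '@' or '#' are treated as comments and passed through.
    If has_header is True, the first non-comment line is passed through too.
    Both function names and this file layout are hypothetical.
    """
    header_pending = has_header
    for line in lines:
        if line.startswith(("@", "#")):
            yield line
            continue
        if header_pending:
            header_pending = False
            yield line
            continue
        fields = line.rstrip("\n").split(sep)
        fields[contig_column] = strip_chr_prefix(fields[contig_column])
        yield sep.join(fields) + "\n"
```

Writing the result to a new file and keeping the original alongside it only partly addresses the provenance concern; it does not remove the underlying constraint.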

RosieQuezada is another user who is struggling with this constraint:

Will this compatibility issue be fixed? If so, when will it be fixed? Is this something that is difficult to fix? It's embarrassing from my point of view, and we need to address the problems RosieQuezada and other users are running into, perhaps with a concrete solution.


@samuelklee commented on Wed Jan 25 2017

Thanks for your comments, @sooheelee!

Allowing for different references is probably our highest priority in terms of plotting issues---but unfortunately, plotting issues are not very high priority overall. However, I agree that it is embarrassing. In principle, the fix should be relatively quick, so I can try to squeeze one in at some point in the next week or so.

As for MAF output, I agree it would be nice to have the option for IGV compatibility, but I don't think the MAF format is supported by the engine nor do I know if there are plans to implement support. Perhaps @LeeTL1220 or @davidbenjamin can comment.


@sooheelee commented on Wed Jan 25 2017

To clarify, the CNV callset should have IGV compatibility where it displays as a heatmap (like GISTIC outputs in IGV). To this folks can overlay whatever mutation data they have, whether that be in MAF or VCF format.


@samuelklee commented on Wed Jan 25 2017

Ah, gotcha. In that case it should already be relatively easy for users to create IGV-compatible output according to http://software.broadinstitute.org/software/igv/SegmentedData

Depending on which tool output they are trying to plot (CNV or ACNV), they may have to manually create files with the column order expected by IGV by removing or reordering columns, but I don't think this is unreasonable. (I think this is preferable to outputting additional 4-column segment files specifically for use with IGV, right?)
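The manual reordering described above could be sketched as follows. This assumes the layout described on the IGV page linked above, as I understand it: a header row, then track name, chromosome, start, and end in the first four columns, with the numeric value in the last column. The function name `to_igv_seg` and the input column names in the usage example are hypothetical; substitute whatever headers the CNV or ACNV output actually uses.

```python
import csv
import io


def to_igv_seg(rows, sample_col, chrom_col, start_col, end_col, value_col):
    """Return tab-separated text with columns in the order IGV is assumed to expect.

    rows is an iterable of dicts (for example from csv.DictReader); the *_col
    arguments name the input columns to pull from. All names are hypothetical.
    """
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["Sample", "Chromosome", "Start", "End", "Segment_Mean"])
    for row in rows:
        writer.writerow([row[sample_col], row[chrom_col], row[start_col],
                         row[end_col], row[value_col]])
    return out.getvalue()


# Usage with made-up data; the extra Num_Probes column is dropped.
example = ("Sample\tChromosome\tStart\tEnd\tNum_Probes\tSegment_Mean\n"
           "S1\t1\t100\t200\t12\t0.95\n")
reader = csv.DictReader(io.StringIO(example), delimiter="\t")
seg_text = to_igv_seg(reader, "Sample", "Chromosome", "Start", "End", "Segment_Mean")
```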


@samuelklee commented on Wed Jan 25 2017

Started a branch. Will have to cook up some new test data but should hopefully be relatively quick.

Note that we will lose the dotted centromere indicators unless we require their locations as an additional input.


@sooheelee commented on Wed Jan 25 2017

I think IGV's default heatmap coloring is centered around 0 or 1, whichever CNV data isn't.

As for the centromere locations, I'm not sure but perhaps this format can help define those for people who want to define them. I'd have to do some digging through UCSC Golden paths to see what is commonly available.


@samuelklee commented on Wed Jan 25 2017

According to the page linked above, users should be able to set data range and log/linear scale in IGV?


@samuelklee commented on Fri Jan 27 2017

@LeeTL1220 @achevali I would like to get rid of the per-segment ACNV plotting, unless there are any strong objections. @dlivitz Do you guys find this functionality useful?


@LeeTL1220 commented on Fri Jan 27 2017

Whatever you like.


@sooheelee commented on Fri Feb 03 2017

I've been preoccupied with M2 tutorial work. Please let me know how I can help, if there is something I can do to help.


@samuelklee commented on Fri Feb 03 2017

Not super high priority. Just let me know if the way plotting uses the sequence dictionary (which is summarized by the bullet points in the PR) looks reasonable to you, when you get a chance.
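The PR's bullet points are not reproduced in this thread, so the following is only an illustration of one way plotting code could use a sequence dictionary, not a description of the actual implementation. It reads contig names and lengths from the `@SQ` lines (`SN` and `LN` tags) of a SAM-style sequence dictionary and computes the cumulative offset of each contig, which is what a genome-wide plot needs in order to lay contigs end to end on one axis. Both function names are hypothetical.

```python
def read_contig_lengths(dict_lines):
    """Return [(contig_name, length), ...] from the @SQ lines of a sequence dictionary."""
    contigs = []
    for line in dict_lines:
        if not line.startswith("@SQ"):
            continue  # skip @HD and any other header records
        # Each field after the record type looks like TAG:value; split on the first ':'.
        tags = dict(field.split(":", 1)
                    for field in line.rstrip("\n").split("\t")[1:])
        contigs.append((tags["SN"], int(tags["LN"])))
    return contigs


def cumulative_offsets(contigs):
    """Map each contig name to the genome-wide coordinate where it starts."""
    offsets = {}
    total = 0
    for name, length in contigs:
        offsets[name] = total
        total += length
    return offsets
```

Because contig names and order come from the dictionary itself, a sketch like this makes no assumption about b37 versus hg19 naming.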

@samuelklee changed the title from "ACNV plotting improvements." to "CNV plotting improvements." on Jan 10, 2018
@samuelklee removed the ACNV label on Jan 10, 2018
@sooheelee

We've had a preliminary discussion with the IGV team to find out what solutions might be possible and how we could get to them.

@samuelklee

IGV-compatible output was added in #5048 and #5115, and other issues are open for CNV plotting improvements.
