mg$produce_metagene design and syntax for #33

Open
bumproo opened this issue Jun 30, 2021 · 2 comments

Comments


bumproo commented Jun 30, 2021

Hi,
I have the following experimental design:

Samples D A
D.3 ../bwa/d2253.filt.bam 1 0
D.8 ../bwa/d2257.filt.bam 1 0
A.3 ../bwa/d2255.filt.bam 0 1
A.8 ../bwa/d2259.filt.bam 0 1
A.29 ../bwa/d2262.filt.bam 0 1
A.input ../bwa/A_input.bam 0 2
D.input ../bwa/D_input.bam 2 0

So that is 2 D samples with one input, and 3 A samples with a different input.

When I run:
mg$produce_metagene(design=design)

or

mg$produce_metagene(design=design, facet_by= ~ group)

I get metagene plots with 10 lines: each experimental condition paired with each control. I was hoping for just 5 lines: 2 for D with the appropriate input, and 3 for A.

Do you have an idea where I am going wrong?

I also wonder if you could provide some syntax advice for the "design_filter" argument to mg$produce_metagene and mg$group_coverages?

@ericfournier2

Hello bumproo,

Sorry for taking so long to answer.

Could you show me which regions you are specifying when creating your metagene object? You should get a number of lines equal to (number of sample groups) * (number of region groups). Since your design specifies two sample groups (D and A), I'm guessing you are specifying five region groups, and thus are getting 10 lines.
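
To make the line count concrete, here is a minimal R sketch of the arithmetic described above. The region group names are hypothetical placeholders (the thread does not show the actual peak files), and the code only counts combinations; it does not call metagene.

```r
# Design groups from the original design (columns D and A).
design_groups <- c("D", "A")

# Hypothetical region groups, assuming one peak set per sample
# (these names are placeholders, not taken from the thread).
region_groups <- c("peaks_D3", "peaks_D8", "peaks_A3", "peaks_A8", "peaks_A29")

# Every design group is drawn over every region group.
lines_plotted <- expand.grid(design_group = design_groups,
                             region_group = region_groups,
                             stringsAsFactors = FALSE)
n_lines <- nrow(lines_plotted)
print(n_lines)

# With five design groups and a single shared region set, the product is 5.
n_lines_shared <- 5 * 1
print(n_lines_shared)
```

Under these assumptions, two design groups over five region groups account for the 10 lines, while five design groups over one shared set of regions would give the 5 lines that were expected.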

If you want to process each sample individually, your design should look like this:

Samples D3 D8 A3 A8 A29
D.3 ../bwa/d2253.filt.bam 1 0 0 0 0
D.8 ../bwa/d2257.filt.bam 0 1 0 0 0
A.3 ../bwa/d2255.filt.bam 0 0 1 0 0
A.8 ../bwa/d2259.filt.bam 0 0 0 1 0
A.29 ../bwa/d2262.filt.bam 0 0 0 0 1
A.input ../bwa/A_input.bam 0 0 2 2 2
D.input ../bwa/D_input.bam 2 2 0 0 0

This will give you 5 design groups.
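
A per-sample design like this can also be built programmatically. The following is a rough base-R sketch, not metagene's own helper: the BAM paths are copied from the thread, the row labels and the loop are illustrative, and it assumes the coding implied above (0 = not in the group, 1 = sample, 2 = control/input), with each sample flagged in its own column.

```r
# ChIP samples and inputs, with paths as given in the thread.
bams <- c(D.3  = "../bwa/d2253.filt.bam",
          D.8  = "../bwa/d2257.filt.bam",
          A.3  = "../bwa/d2255.filt.bam",
          A.8  = "../bwa/d2259.filt.bam",
          A.29 = "../bwa/d2262.filt.bam")
inputs <- c(A.input = "../bwa/A_input.bam",
            D.input = "../bwa/D_input.bam")

# First column holds the BAM paths; row names are labels only.
all_files <- c(bams, inputs)
design <- data.frame(Samples = unname(all_files), stringsAsFactors = FALSE)
rownames(design) <- names(all_files)

# Add one design column per ChIP sample.
for (s in names(bams)) {
  group <- gsub(".", "", s, fixed = TRUE)   # "D.3" -> "D3"
  condition <- sub("\\..*$", "", s)         # "D.3" -> "D"
  design[[group]] <- 0L                     # default: file not in this group
  design[s, group] <- 1L                    # the sample itself
  design[paste0(condition, ".input"), group] <- 2L  # its matching input
}

print(design)
```

The resulting data frame has one column per sample after `Samples`, so it could be passed as the `design` argument in place of the two-column version.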

Also, remember that your input conditions will have no effect unless you set the normalization argument, for example: mg$produce_metagene(normalization="log2_ratio")

However, since all the samples within each condition share a single input, I would recommend simply plotting it beside your samples, which makes more sense.

You can use design_filter by passing a vector of logical values indicating which design groups should be kept in the plot (i.e., which columns of the design matrix you want). For example, given the design I provided above, passing c(TRUE, TRUE, FALSE, FALSE, FALSE) will keep samples D3 and D8 and hide samples A3, A8 and A29.
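
Rather than typing the logical vector by hand, it can be derived from the design group names. This is a small sketch under the assumption that the filter follows the order of the design columns after the first one; the `mg` and `design` objects in the commented call are the ones from this thread and are not created here.

```r
# Design group names, in the same order as the design columns.
design_groups <- c("D3", "D8", "A3", "A8", "A29")

# Keep only the D samples.
keep <- design_groups %in% c("D3", "D8")
print(keep)

# The vector would then be passed along with the design, e.g.:
# mg$produce_metagene(design = design, design_filter = keep)
```

Building the filter with `%in%` keeps it in step with the column order if the design is later rearranged.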

Does this help?

Have a nice day,
-Eric


bumproo commented Jul 14, 2021

Hi Eric,
No problem, not long at all and thank you very much for the detailed response.

Yes, each one of my experimental samples had its own peak file, and it makes a lot of sense to me now.
I will try to visualize the metagenes with a shared set of regions. I'm sure that will fix the trouble I was having. The design clarification and the design_filter syntax both make good sense too.

Charlie
