Weighted vs unweighted UniFrac strong difference #149

Open
ArnaudGaudry opened this issue Jul 21, 2021 · 3 comments

Comments

@ArnaudGaudry

Hello qemistree developers!

I tried to reproduce analyses from the publication on the evaluation dataset: https://github.com/knightlab-analyses/qemistree-analyses/blob/master/Evaluation-Dataset-Analyses.ipynb

When I generate the plot using the metric unweighted_unifrac instead of weighted_normalized_unifrac, I get a very different plot that is actually quite similar to the one generated using Bray-Curtis (a strong batch effect is visible). Is this inherent to the metric and expected? I thought you might have tested it during development!

Thanks and best regards,
Arnaud

@ElDeveloper
Member

@ArnaudGaudry that's interesting. Mathematically speaking, Bray-Curtis is more similar to weighted UniFrac than it is to unweighted UniFrac. I don't remember seeing this plot, mainly because we already knew from earlier experiments and tests that the abundance-based weighting inherent to weighted UniFrac would play an important role. @anupriyatripathi any thoughts on this?
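
To make the comparison between the metrics concrete, here is a minimal pure-Python sketch rather than the qemistree or QIIME 2 implementation. It assumes a toy star tree with unit branch lengths and made-up feature names (`m1` to `m5`) and counts; none of these come from the evaluation dataset. On such a tree, normalized weighted UniFrac reduces to Bray-Curtis on relative abundances, while unweighted UniFrac reduces to a presence/absence (Jaccard-like) distance, so it reacts strongly to rare features.

```python
def proportions(counts):
    """Convert raw counts to relative abundances."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


def unweighted_unifrac(branches, a, b):
    """Fraction of observed branch length that leads to only one sample."""
    unique = observed = 0.0
    for length, tips in branches:
        in_a = any(a.get(t, 0) > 0 for t in tips)
        in_b = any(b.get(t, 0) > 0 for t in tips)
        if in_a or in_b:
            observed += length
            if in_a != in_b:
                unique += length
    return unique / observed


def weighted_normalized_unifrac(branches, depth, a, b):
    """Abundance-weighted branch differences, normalized by tip depths."""
    pa, pb = proportions(a), proportions(b)
    num = sum(
        length * abs(sum(pa.get(t, 0.0) for t in tips)
                     - sum(pb.get(t, 0.0) for t in tips))
        for length, tips in branches
    )
    denom = sum(d * (pa.get(t, 0.0) + pb.get(t, 0.0)) for t, d in depth.items())
    return num / denom


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity computed on relative abundances."""
    pa, pb = proportions(a), proportions(b)
    keys = set(pa) | set(pb)
    num = sum(abs(pa.get(k, 0.0) - pb.get(k, 0.0)) for k in keys)
    return num / sum(pa.get(k, 0.0) + pb.get(k, 0.0) for k in keys)


# Hypothetical star tree: every feature hangs off the root with length 1.
tips = ["m1", "m2", "m3", "m4", "m5"]
branches = [(1.0, {t}) for t in tips]   # (branch length, tips below it)
depth = {t: 1.0 for t in tips}          # root-to-tip distances

# Two toy samples sharing one dominant feature but differing in rare ones.
sample_a = {"m1": 98, "m2": 1, "m3": 1}
sample_b = {"m1": 98, "m4": 1, "m5": 1}

wu = weighted_normalized_unifrac(branches, depth, sample_a, sample_b)
bc = bray_curtis(sample_a, sample_b)
uu = unweighted_unifrac(branches, sample_a, sample_b)
print(round(wu, 4), round(bc, 4), round(uu, 4))
```

In this toy case the weighted distance and Bray-Curtis agree and are small, whereas the unweighted distance is large because four of the five observed branches are unique to one sample. With a real tree the weighted metric no longer equals Bray-Curtis, but the sensitivity of the unweighted variant to low-abundance features remains.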

@anupriyatripathi
Collaborator

anupriyatripathi commented Jul 23, 2021 via email

@ArnaudGaudry
Author

@anupriyatripathi @ElDeveloper Thank you for your detailed answers!

It may indeed be due to the weight given to low-abundance metabolites. PERMANOVA is a really good idea for measuring group separation, and I'll give it a try. Since the idea is to use chemical relationships to mitigate the batch effect, I also compared Qemistree to CSCS (both weighted and unweighted).
As you can see, unweighted CSCS still mitigates the batch effect, unlike unweighted UniFrac. Since the two methods are completely different methodologically, they are hard to compare, but I expected fairly similar results for the two unweighted versions (as is the case for the weighted versions). This is obviously not the case ^^
[Image: unifrac_vs_cscs]
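
For readers who want to quantify the group separation with PERMANOVA, as suggested in the thread, the following is a minimal sketch of the pseudo-F statistic and its permutation p-value. It is written in plain Python for illustration only; the distance matrix and batch labels are toy values, not results from the evaluation dataset, and in practice an established implementation (for example the one in scikit-bio) would normally be used.

```python
import random
from itertools import combinations


def pseudo_f(dm, labels):
    """PERMANOVA pseudo-F from a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dm[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for g in groups:
        members = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(
            dm[i][j] ** 2 for i, j in combinations(members, 2)
        ) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(dm, labels, permutations=999, seed=0):
    """Return (pseudo-F, permutation p-value) by shuffling the labels."""
    rng = random.Random(seed)
    observed = pseudo_f(dm, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dm, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy data: six samples in two hypothetical batches, with small distances
# within a batch (0.1) and large distances between batches (0.9).
labels = ["batch1"] * 3 + ["batch2"] * 3
dm = [
    [0.0 if i == j else (0.1 if labels[i] == labels[j] else 0.9)
     for j in range(6)]
    for i in range(6)
]

f_stat, p_value = permanova(dm, labels)
print(f_stat, p_value)
```

A larger pseudo-F means the grouping (here, batch) explains more of the variation in the distance matrix, so running the same test on the weighted and unweighted distance matrices gives a numerical comparison of how strongly each metric separates batches. With only six samples the number of distinct label permutations is small, which limits how low the p-value can go.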
