Question about using propr with multi-omics data #9
Comments
G'day, and thanks for your interest in propr.

Are you suggesting that you have multiple data sets which come from the same samples? For example, 16s data, pathway data, etc., each from the same N samples? I assume the pathway data then are derived from the 16s data (and so forth). In that case, it may make more sense to treat each data set separately, and apply propr K times for K data sets.

I'm not exactly sure what you mean by "combine". What propr can do is measure gene-gene (or microbe-microbe or pathway-pathway) associations for a single data set. If you want to integrate data sets in a multivariate analysis, I'd recommend the mixOmics package. They have an option in their software to apply a CLR and do it the CoDa way too.

Anyways, we've recently put together a (hopefully) easy-to-follow workflow for compositional data analysis that covers differential abundance and association testing. You might find it helpful! https://academic.oup.com/gigascience/article/8/9/giz107/5572529

Feel free to describe your data in more detail and I can advise. Otherwise, let me know if you have some more questions.
Love the link! Super useful, thanks.

Your take on the X sets * N samples is close, but in this case it's shotgun:taxonomy and shotgun:metabolic assignments, generated independently through different pipelines but from the same [...]. By "combine", simply meant concatenating the datasets ([...]).

Ultimate goal is to be able to determine proportionality between taxonomic and metabolic features generated from the same data/sample/fastq set. Had thought that CoDa would operate well between multi-omics sets, assuming each set had an appropriately chosen denominator.

This is perhaps edging into a broader question, but in what ways are intercomparisons between CoDa sets (e.g. propr's [...]

edit: checked out the publication provided above, which for reference of others seems to cover this issue fairly exactly under the heading "Vertical data integration". From Quinn et al., 2020:
Ah yes! Now we're on the same page!
This is correct. Unfortunately, I haven't yet written a nice API for this (though it's probably overdue). But there is a rough work-around that can get you started. Something like this...
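As a minimal, language-agnostic sketch of that idea (plain Python, not propr's R API): apply a CLR to each data set separately so that each keeps its own reference, join the transformed columns sample by sample, and then compute the proportionality metric rho between a taxon and a pathway. The counts, the variable names, and the helper functions `clr` and `rho` are all hypothetical and only illustrate the steps; zeros are assumed to have been replaced beforehand.

```python
import math
from statistics import variance


def clr(rows):
    """Centered log-ratio transform. Rows are samples, columns are features (all > 0)."""
    out = []
    for row in rows:
        logs = [math.log(v) for v in row]
        mean_log = sum(logs) / len(logs)  # log of this sample's geometric mean
        out.append([value - mean_log for value in logs])
    return out


def rho(x, y):
    """Proportionality metric rho for two log-ratio transformed features."""
    diff = [a - b for a, b in zip(x, y)]
    return 1 - variance(diff) / (variance(x) + variance(y))


# Hypothetical counts for the same 4 samples (assumption: no zeros remain).
taxa = [[10, 20, 70], [20, 40, 40], [40, 80, 30], [80, 160, 10]]
pathways = [[5, 50, 45], [10, 50, 40], [20, 50, 30], [40, 50, 10]]

# Transform each data set with its own reference, then join columns per sample.
joint = [t + p for t, p in zip(clr(taxa), clr(pathways))]
columns = list(zip(*joint))

# Proportionality between taxon 0 (column 0) and pathway 0 (column 3).
r = rho(columns[0], columns[3])
```

The key design choice is that the geometric mean is taken within each data set before concatenation, never across the joined table.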
Unfortunately, you won't be able to use any of the other propr functions, including FDR estimation. Though, I might be able to hack an update together this week that allows the user to do their own transformation before running the rest of the guts of the program. For example, something like
I'll look into it Tuesday to see if this is doable and update you either way.

FYI we've also written a small commentary on compositional multi-omics analysis. It elaborates on the vertical integration approach in more detail. It sounds like you already understand why each data set needs its own reference, but not everyone gets this point... https://www.biorxiv.org/content/10.1101/847475v1

Enjoy your weekend!
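For readers who want the "own reference" point written out, the CLR can be stated per data set, alongside the proportionality metric ρ. The notation is introduced here only for illustration: x and y are the two data sets (e.g. taxa and pathways) measured on the same samples i, with D_x and D_y features respectively, and a and b are any two transformed features.

```latex
\operatorname{clr}(x_i)_j = \log x_{ij} - \frac{1}{D_x}\sum_{k=1}^{D_x} \log x_{ik},
\qquad
\operatorname{clr}(y_i)_j = \log y_{ij} - \frac{1}{D_y}\sum_{k=1}^{D_y} \log y_{ik}

\rho(a, b) = 1 - \frac{\operatorname{var}(a - b)}{\operatorname{var}(a) + \operatorname{var}(b)}
```

Each data set is centered on its own geometric mean, so a taxon-pathway ρ compares two features that are each expressed relative to the reference of the set they came from.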
If you perform your own (multi-omic) transformation, you can now pass it through propr, and access all of the helper/wrapper functions. Here is a reproducible example for you.
I've done a few tests and everything looks OK. Please let me know if something strange happens and I'll give it a fix.
That's class! Was initially hoping for simple guidance on the propriety of doing this, so this is well beyond. I'll start cramming bugs into pipes and let you know how it goes.
Thanks and regards as ever to the devs.

I'm considering several sets of 'omic data generated from the same cohort of `FASTQ` files (i.e. taxonomic, pathways, etc). The way of the CoDa seems a good choice for integrating these values, but I've not seen it mentioned / suggested / gainsaid anywhere.

Was planning on subsetting the sets independently (but at similar levels) using `propr(... select = )` and then combine, but `propr` has no native function I can see for this. It could just be an unaddressed use case, but I would be interested to hear a sane thought on the issue.

Hope all are well.