Question about lr2glm model.matrix argument #18

MeggyC · 2020-09-01T01:40:43Z

Hi there,

I am working on sequence data from coral associated viral communities, which is in a dataframe of CLR transformed abundance data in which the columns are the samples and the rows are the viral species. I have a metadata dataframe too for each sample with ~15 numerical variables.

I want to build a glm for this data in order to see which metadata variables correlate best with the changes in community composition.

From looking at the documentation from propr, I think that the most appropriate course of action would be to use the lr2glm function. In this function I would pass the clr abundance dataframe as the first argument, and it requires a model matrix as the second argument. How do I generate this model matrix? Is that the result of the model.matrix() function in modelr? I would really appreciate an example please?

Also, does the clr transformed abundance matrix need to have the samples as the columns, or the variables (here, viruses).

Many thanks!

tpq · 2020-09-01T03:09:54Z

Hi Meggy,

There are a few options here. Given that your data are already CLR transformed, you can fit a glm model directly, as you would for any other data. The lr2glm() function in propr does something like this, and you find it helpful to look at the source code. Note however that this function is not really part of propr per se, which is why it is not properly documented. But to answer your question, a model matrix is indeed produced by the model.matrix() function. This function is actually contained within base R! If you check ?model.matrix in base R, you can find an example.

dd <- data.frame(a = gl(3,4), b = gl(4,1,12)) # balanced 2-way
m <- model.matrix(~ a + b, dd)

You would then supply m to the second argument of lr2glm().

tpq · 2020-09-01T03:13:52Z

Note that the GLM will test each OTU one at a time, via the formula OTU_i ~ Covariates. If you want to perform a fully multivariable analysis, you could instead supply the CLR-transformed data, along with the covariates, to a multivariable method, such as redundancy analysis vegan::rda() or canonical correlation analysis vegan::CCorA().

tpq changed the title ~~lr2glm model.matrix argument~~ Question about lr2glm model.matrix argument Jun 15, 2021

tpq added question helpful This question has been marked as potentially helpful to others. labels Jun 15, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Question about lr2glm model.matrix argument #18

Question about lr2glm model.matrix argument #18

MeggyC commented Sep 1, 2020

tpq commented Sep 1, 2020

tpq commented Sep 1, 2020 •

edited

Loading

Question about lr2glm model.matrix argument #18

Question about lr2glm model.matrix argument #18

Comments

MeggyC commented Sep 1, 2020

tpq commented Sep 1, 2020

tpq commented Sep 1, 2020 • edited Loading

tpq commented Sep 1, 2020 •

edited

Loading