Replies: 2 comments 2 replies
-
That looks promising!
…On Wed, May 18, 2022 at 5:58 PM Gibs ***@***.***> wrote:
@mortonjt <https://github.com/mortonjt> and I discussed using BIRDMAn as
a classifier.
The idea is that you can fit parameters on a training dataset and then
reuse those parameters on a testing dataset. The testing model evaluates
the log-likelihood of each test sample under each class's parameters, and
the class with the highest log-likelihood is the predicted class.
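As a toy sketch of that idea (plain scipy with a Poisson model, not BIRDMAn's actual negative binomial models; the rates and sample sizes here are made up):

```python
# Toy illustration of classification by log-likelihood:
# fit per-class parameters on training counts, then assign each test
# sample to the class whose fitted model gives it the higher log-likelihood.
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(0)

# Simulated training counts for two classes with different rates
train_a = rng.poisson(5, size=200)
train_b = rng.poisson(12, size=200)

# "Fit" the model per class (the Poisson MLE is just the sample mean)
lam_a, lam_b = train_a.mean(), train_b.mean()

def classify(counts):
    """Return 0 (class A) or 1 (class B) per sample by log-likelihood."""
    ll_a = poisson.logpmf(counts, lam_a)
    ll_b = poisson.logpmf(counts, lam_b)
    return (ll_b > ll_a).astype(int)

test = rng.poisson(12, size=100)   # truly class B
accuracy = classify(test).mean()   # fraction correctly assigned to B
```

The Stan models below do the same thing, except the per-class likelihood is negative binomial with sequencing depth as an offset, and the log-likelihoods come from posterior draws rather than a point estimate.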
Training model:
data {
  int<lower=1> N;
  int<lower=1> p;
  vector[N] depth;
  int y[N];
  matrix[N, p] x;
  real<lower=0> B_p;
  real<lower=0> phi_s;
}
parameters {
  vector[p] beta_var;
  real<lower=0> reciprocal_phi;
}
transformed parameters {
  real phi = 1 / reciprocal_phi;
  vector[N] lam = x * beta_var + depth;
}
model {
  beta_var[1] ~ normal(-6, B_p);
  for (j in 2:p) {
    beta_var[j] ~ normal(0, B_p);
  }
  reciprocal_phi ~ cauchy(0, phi_s);
  y ~ neg_binomial_2_log(lam, phi);
}
generated quantities {
  vector[N] y_predict;
  vector[N] log_lhood;
  for (n in 1:N) {
    y_predict[n] = neg_binomial_2_log_rng(lam[n], phi);
    log_lhood[n] = neg_binomial_2_log_lpmf(y[n] | lam[n], phi);
  }
}
Testing model:
data {
  int<lower=1> D;                          // Number of microbes
  int<lower=1> N;                          // Number of samples
  int<lower=1> draws;                      // Number of draws
  real log_depths[N];                      // Log sequencing depths
  int<lower=0> y[N, D];                    // Count data
  matrix[N, 2] x;                          // Design matrix (all ones)
  array[draws] matrix[2, D] post_beta_var; // Posterior draws
  array[draws] vector[D] post_phi;         // Overdispersion per microbe
}
parameters {
}
model {
}
generated quantities {
  array[draws] matrix[N, 2] all_log_lhood; // Log-likelihood of each class
  for (i in 1:draws) {
    matrix[N, 2] log_lhood = rep_matrix(rep_row_vector(0, 2), N);
    matrix[2, D] beta_var = post_beta_var[i];
    matrix[N, D] lam1 = col(x, 1) * row(beta_var, 1); // Intercept only
    matrix[N, D] lam2 = x * beta_var;                 // Intercept + beta
    vector[D] phi = post_phi[i];
    for (n in 1:N) {
      for (d in 1:D) {
        log_lhood[n, 1] += neg_binomial_2_log_lpmf(y[n, d] | lam1[n, d] + log_depths[n], phi[d]);
        log_lhood[n, 2] += neg_binomial_2_log_lpmf(y[n, d] | lam2[n, d] + log_depths[n], phi[d]);
      }
    }
    all_log_lhood[i] = log_lhood;
  }
}
I've tried this out on a small dataset (Qiita study ID: 11402) and it
seems pretty promising. Using only an intercept + one predictor, we get
60% accuracy. With a stronger microbial effect and a more robust model, we
should hopefully see better performance. Log-likelihoods were summed across
all chains and draws.
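That summing step could be done downstream roughly like this (a hedged sketch: the array here is a random stand-in with a hypothetical shape; in practice `all_log_lhood` would be extracted from the fit's posterior draws, with chains stacked along the draw axis):

```python
# Hypothetical post-processing of the testing model's output.
# all_log_lhood has shape (chains * draws, N, 2): per-draw log-likelihood
# of each sample under class 1 (intercept only) and class 2 (full model).
import numpy as np

rng = np.random.default_rng(42)
n_draws, n_samples = 8, 10

# Stand-in values; real ones come from the generated quantities block
all_log_lhood = rng.normal(-500.0, 5.0, size=(n_draws, n_samples, 2))

# Sum log-likelihoods over all chains and draws, per sample and class
total_ll = all_log_lhood.sum(axis=0)   # shape (N, 2)

# Most likely class per sample = argmax over the class axis
predicted = total_ll.argmax(axis=1)    # 0 or 1 per sample
```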
[image: image]
<https://user-images.githubusercontent.com/4030868/169162448-6e82adda-7870-42cf-b274-563233bc15cf.png>
-
Try a lognormal prior for the dispersion parameters: https://github.com/flatironinstitute/q2-matchmaker/blob/main/q2_matchmaker/assets/nb_case_control_single.stan#L48