Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lprior argument to prior, enabling separate lprior variables #1724

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

n-kall
Copy link

@n-kall n-kall commented Jan 14, 2025

This would allow for tagging priors for inclusion in different lprior variables, enabling priorsense to selectively power-scale priors. It would fix #1585

@n-kall n-kall changed the title initial work on lprior tags lprior argument to prior, enabling separate lprior variables Jan 14, 2025
@paul-buerkner
Copy link
Owner

paul-buerkner commented Jan 15, 2025

I think this looks pretty good already.

You mentioned that so far it only works with the b coefficents but I couldn't readily see this limitation in your code. Can you elaborate?

Also, could you show an example model here to illustrate the features use?

Thank you for working on this issue!

@paul-buerkner paul-buerkner added this to the brms 2.23.0 milestone Jan 15, 2025
@n-kall
Copy link
Author

n-kall commented Jan 15, 2025

Certainly, here is an example:

Basically, the current code works for priors that are set on the coefficient level, i.e. has_coef_prior == TRUE see here

I wasn't sure how to properly handle the lprior tagging for a class when there is no individual coef prior. Now the lprior variables are created, but not added to if there is no individual coefficient prior.

prior <- c(
  prior(normal(0, 5), coef = "K1", lprior = "covariates"), # works
  prior(normal(0, 1), coef = "N1", lprior = "treatment"), # works
  prior(normal(0, 100), class = "Intercept", lprior = "Intercept"), # does not work
  prior(normal(0, 10), class = "sigma", lprior = "sigma") # does not work
)


x <- brm(yield ~ N + K + P, data = npk,
         prior = prior, backend = "cmdstanr",
         empty = TRUE
         )

prior_summary(x)

s <- stancode(x)

s
##  > prior_summary(x)
##           prior     class coef group resp dpar nlpar lb ub     lprior       source
##          (flat)         b                                                  default
##    normal(0, 5)         b   K1                             covariates         user
##    normal(0, 1)         b   N1                              treatment         user
##          (flat)         b   P1                                        (vectorized)
##  normal(0, 100) Intercept                                   Intercept         user
##   normal(0, 10)     sigma                             0         sigma         user
// generated with brms 2.22.8
functions {
}
data {
   int<lower=1> N;  // total number of observations
   vector[N] Y;  // response variable
   int<lower=1> K;  // number of population-level effects
   matrix[N, K] X;  // population-level design matrix
   int<lower=1> Kc;  // number of population-level effects after centering
   int prior_only;  // should the likelihood be ignored?
 }
 transformed data {
   matrix[N, Kc] Xc;  // centered version of X without an intercept
   vector[Kc] means_X;  // column means of X before centering
   for (i in 2:K) {
     means_X[i - 1] = mean(X[, i]);
     Xc[, i - 1] = X[, i] - means_X[i - 1];
   }
 }
 parameters {
   vector[Kc] b;  // regression coefficients
   real Intercept;  // temporary intercept for centered predictors
   real<lower=0> sigma;  // dispersion parameter
 }
 transformed parameters {
   real lprior = 0;  // prior contributions to the log posterior
   real lprior_covariates = 0;
   real lprior_treatment = 0;
   real lprior_Intercept = 0;
   real lprior_sigma = 0;
   lprior += normal_lpdf(b[1] | 0, 1);
   lprior_treatment += normal_lpdf(b[1] | 0, 1);
   lprior += normal_lpdf(b[2] | 0, 5);
   lprior_covariates += normal_lpdf(b[2] | 0, 5);
   lprior += normal_lpdf(Intercept | 0, 100);
   lprior += normal_lpdf(sigma | 0, 10)
     - 1 * normal_lccdf(0 | 0, 10);
 }
 model {
   // likelihood including constants
   if (!prior_only) {
     target += normal_id_glm_lpdf(Y | Xc, Intercept, b, sigma);
   }
   // priors including constants
   target += lprior;
 }
generated quantities {
   // actual population-level intercept
   real b_Intercept = Intercept - dot_product(means_X, b);
 }

@paul-buerkner
Copy link
Owner

I see. okay, I will take a look at make it work for all priors even without the coef specified

@paul-buerkner
Copy link
Owner

paul-buerkner commented Jan 24, 2025

I have updated your PR so it should now work for all priors. I have also improved the logic of adding priors to the lprior variables to avoid double computation of the prior lpdf statements. Can you double check that the logic is correct?

Some further comments:

  • Upon now seeing it in action, I am not sure we should call the argument to set_prior and friends lprior. It reads as if people should provide a log prior there, which is not what we want to indicate. Do you think just having the argument named tag or so could be better? I am oben for other suggestions
  • However we call the argument, it would still need to be documented in set_prior at least including an example.
  • In the end, could you add some tests to check that the generated stan code is correct?

@n-kall
Copy link
Author

n-kall commented Jan 24, 2025

Great, thanks! I will check it out.

Regarding the naming, originally I had used tag also, but then I wasn't sure if it is distinct enough from group. But yes, lprior is unclear and tag could work with correct explanation.

I'll test it out and add examples / tests

@n-kall
Copy link
Author

n-kall commented Feb 3, 2025

Your changes look great! One thing I noticed though is that the lprior tag is not vectorized as I would expect.

For example, if I specify the tag on class = "b" and no individual coefficients, it works as intended. But if I then specify one specific coefficient to have a different tag, the tag for the class is not used anymore. See below:
Do you think it is reasonable for the tag to (vectorize) like the prior itself?

prior1 <- c(
  prior(normal(0, 5), class = "b", lprior = "covariates") # vectorizes
)
prior2 <- c(
  prior(normal(0, 5), class = "b", lprior = "covariates"), # is removed
  prior(normal(0, 1), coef = "N1", lprior = "treatment") # works  
)
x <- brm(yield ~ N + K + P, data = npk,
         prior = prior1, backend = "cmdstanr",
         empty = TRUE
         )
#> prior_summary(x)
                 prior     class coef group resp dpar nlpar lb ub     lprior       source
            normal(0, 5)         b                                  covariates         user
            normal(0, 5)         b   K1                                        (vectorized)
            normal(0, 5)         b   N1                                        (vectorized)
            normal(0, 5)         b   P1                                        (vectorized)
 student_t(3, 55.6, 6.2) Intercept                                                  default
    student_t(3, 0, 6.2)     sigma                             0                    default
#> stancode(x)
transformed parameters {
  // prior contributions to the log posterior
  real lprior_covariates = 0;
  real lprior = 0;
  lprior_covariates += normal_lpdf(b | 0, 5); // covariates tag works as intended
  lprior += student_t_lpdf(Intercept | 3, 55.6, 6.2);
  lprior += student_t_lpdf(sigma | 3, 0, 6.2)
    - 1 * student_t_lccdf(0 | 3, 0, 6.2);
  lprior += lprior_covariates;
}
x2 <- brm(yield ~ N + K + P, data = npk,
         prior = prior2, backend = "cmdstanr",
         empty = TRUE
         )
#> prior_summary(x2)
                   prior     class coef group resp dpar nlpar lb ub     lprior       source
            normal(0, 5)         b                                  covariates         user
            normal(0, 5)         b   K1                                        (vectorized)
            normal(0, 1)         b   N1                              treatment         user
            normal(0, 5)         b   P1                                        (vectorized)
 student_t(3, 55.6, 6.2) Intercept                                                  default
    student_t(3, 0, 6.2)     sigma                             0                    default
stancode(x2)
// generated with brms 2.22.8
transformed parameters {
  // prior contributions to the log posterior
  real lprior_covariates = 0;
  real lprior = 0;
  real lprior_treatment = 0;
  lprior_treatment += normal_lpdf(b[1] | 0, 1); // treatment tag works
  lprior += normal_lpdf(b[2] | 0, 5); // covariate tag no longer holds the other coefficients
  lprior += normal_lpdf(b[3] | 0, 5);
  lprior += student_t_lpdf(Intercept | 3, 55.6, 6.2);
  lprior += student_t_lpdf(sigma | 3, 0, 6.2)
    - 1 * student_t_lccdf(0 | 3, 0, 6.2);
  lprior += lprior_covariates;
  lprior += lprior_treatment;
}

@paul-buerkner
Copy link
Owner

Good catch. Fixed. Will you make the other adjustments then?

@n-kall
Copy link
Author

n-kall commented Feb 3, 2025

Yes, can do! Thank you again!

@n-kall
Copy link
Author

n-kall commented Feb 5, 2025

I now changed the name to "tag" and made it such that prior_tag can be specified in priorsense functions. There is still a bit of cleanup required, and documentation, but it seems to be working as intended now

prior_cov <- prior(normal(0, 10), class = "b", tag = "covariates")
prior_trt <- prior(normal(0, 1), class = "b", coef = "Trt1", tag = "treatment")
prior_sigma <- prior(exponential(10), class = "sigma", tag = "sigma")

f <- brm(count ~ Trt + zAge + zBase + (1 | patient), data = epilepsy, prior = c(prior_cov, prior_trt, prior_sigma), backend = "cmdstanr")

priorsense::powerscale_sensitivity(f) # all priors

priorsense::powerscale_sensitivity(f, prior_tag = "treatment") # only treatment tag

priorsense::powerscale_sensitivity(f, prior_tag = "covariates") # only covariates tag

priorsense::powerscale_sensitivity(f, prior_tag = "sigma") # only sigma tag

@paul-buerkner
Copy link
Owner

Very cool! with "cleanup required" do you mean in the new brms code or in priorsense?

@n-kall
Copy link
Author

n-kall commented Feb 5, 2025

I didn't actually change any code in priorsense, only in the brmsfit method for create_priorsense_data. Cleanup would include deciding if this way makes sense or if I should adjust priorsense to handle tags (currently there is prior_selection which is numeric)

@paul-buerkner
Copy link
Owner

I think doing it in the brms method sounds sensible given that this is brms-specific stuff. Or what would be the benefit of doing it in priorsense?

Am I understanding it correctly, that you will also add some small tests and doc to the PR?

@n-kall
Copy link
Author

n-kall commented Feb 5, 2025

I was thinking that a user could define e.g. lprior_treatment and lprior_covariates as variables directly in the Stan model code, and then use the same prior_tag (or maybe the current prior_selection) argument to do this selection. It might make sense to have the same way to do things in priorsense regardless of whether the fit is brmsfit or stanfit.

And yes, I can add the tests and docs to finish the PR

@paul-buerkner
Copy link
Owner

I see, makes sense. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Interface for defining which prior log densities are stored for priorsense
2 participants