Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adds support for posterior::as_draws() #51

Merged
merged 16 commits into from
Nov 26, 2024

Conversation

djnavarro
Copy link
Contributor

main changes in this PR:

  • supplies posterior::as_draws_*() methods for stanemax objects
  • the approach is broadly analogous to the approach taken in brms: the internal stanfit object is first coerced to a draws_list and then to the required draws_* class
  • supports variable, regex, and inc_warmup arguments for filtering and subsetting, as per brms methods
  • provides minimal set of unit tests

ancillary changes:

  • posterior package is now included in Suggests
  • roxygen updated to 7.3.2, with "_PACKAGE" replacing the @docType tag to prevent deprecation warning

closes #49

@yoshidk6
Copy link
Owner

yoshidk6 commented Nov 13, 2024

Thank you so much, this is wonderful!!

I'd love to customize further so that the extracted samples are even easier to work with. The main one would be the names of the extracted variables, these are currently in a format like ec50[1] or e0_par[1,1], but removing the indexing would make the interpretation more intuitive.
Ideally we can see something with what we see when printing out the fit, from fit2 in test-extract_param.R

---- Emax model fit with rstanemax ----

           mean se_mean    sd  2.5%    25%    50%    75%  97.5%  n_eff Rhat
emax[B0]  86.34    0.34  3.94 79.53  83.91  86.14  88.55  94.11 132.87 1.01
emax[B2]  81.20    0.28  3.27 75.27  79.22  80.87  83.23  87.54 137.69 1.00
emax[B3]  77.32    0.32  3.84 70.39  74.68  77.31  79.57  85.44 146.36 1.00
e0[A0]     7.15    0.19  2.53  2.54   5.45   7.15   8.74  12.09 185.43 1.01
e0[A1]    19.52    0.15  1.90 15.76  18.31  19.47  20.74  22.86 154.82 1.01
ec50[C1] 131.77    1.51 25.78 84.40 114.27 129.42 147.79 189.34 290.74 1.00
ec50[C0]  84.24    1.27 16.18 55.31  73.77  82.39  92.77 119.21 161.88 1.00
gamma      1.00     NaN  0.00  1.00   1.00   1.00   1.00   1.00    NaN  NaN
sigma      8.38    0.05  0.83  7.03   7.73   8.29   8.92  10.12 323.06 1.00

You can take a look at extract_param.R for how I handled the names of the variables. extract_param_fit() that is used can be found in posterior_predict.R. In there I'm creating a longer data frame when covariates are present, but I think we shouldn't do it in as_draws_*. Just changing the name should be all what we need.

Or maybe we can define something like as_draws_df_longer(), and deprecate extract_param() for more streamlined API? I'd love to hear your thoughts.

Since we are changing the variable names and there aren't great flexibility in model structure anyways, I think we can drop the variable and regex. Similarly it should be fine to just keep the main model variables (i.e. c("emax", "e0", "ec50") with or w/o covariates, and c("gamma", "sigma")). Keeping inc_warmup option would be great if that's not too much of a work.

While doing this it would be good to add unit tests to make sure name mappings are done as expected.

Thanks!

@djnavarro
Copy link
Contributor Author

Hi @yoshidk6 - thanks for the suggestions! Following up from this and our chat, I've made the following changes:

  • The as_draws_*() methods only return the meaningful parameters (i.e., e0, emax, etc) and not the ancillary columns in the stanfit object
  • When the model contains covariates, the parameter names are returned in the same style as extract_param() (e.g., "emax[A0]" rather than "emax[1]")
  • Unit tests for as_draws_*() expanded to check stanfit models with and without covariates
  • Unit tests for as_draws_*() checks that the names are as expected, for all draws types

Informally, I've also checked that relabelling the variables in this fashion doesn't seem to break any internal object representation expected by the posterior package. I didn't try to be too thorough about this but from what I can tell:

  • For draws_df, draws_list, draws_matrix and draws_array objects, a variable name like emax[A0] is passed as-is into the internal representation of the object
  • For draws_rvars objects, a name like "emax[A0]" is carved up such that the object contains a list element $emax and A0 gets pushed into the names of the vector stored in $emax

In both cases, when we call posterior::variables() it returns the expected output for with_indices = TRUE and with_indices = FALSE. Additionally, I tried a few examples to check that posterior::subset_draws() still behaves as expected with the renamed variables.

Let me know what you think! 🙂

Copy link
Owner

@yoshidk6 yoshidk6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wonderful, thank you so much for incorporating my feedback and going extra miles to ensure compatibility with as_draws_rvars!

@yoshidk6 yoshidk6 merged commit 963332c into yoshidk6:master Nov 26, 2024
6 checks passed
@djnavarro djnavarro deleted the add-as-draws branch December 5, 2024 23:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make it work with posterior package
2 participants