Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reconsider distribution interface. #382

Closed
Tracked by #423
sbfnk opened this issue Apr 28, 2023 · 13 comments · Fixed by #504
Closed
Tracked by #423

Reconsider distribution interface. #382

sbfnk opened this issue Apr 28, 2023 · 13 comments · Fixed by #504
Assignees

Comments

@sbfnk
Copy link
Contributor

sbfnk commented Apr 28, 2023

          very much agree on renaming and agree we should/can support the natural scale input with an internal transform. I think we should doc that that isn't an ideal approach though.

Originally posted by @seabbs in #363 (comment)

At the moment the mean and sd variables in dist_spec mean different things depending on whether a gamma or lognormal is used. This is not really ideal and it would be good to think about a way to make this more intuitive, where parameters aquire their "natural" meaning.

@seabbs seabbs changed the title Reconsider distribution interface Depreciate distribution interface due to reimplementation elsewhere May 15, 2023
@seabbs seabbs changed the title Depreciate distribution interface due to reimplementation elsewhere Reconsider distribution interface. May 15, 2023
@seabbs seabbs removed the discussion label May 15, 2023
@sbfnk sbfnk mentioned this issue Jul 18, 2023
18 tasks
@sbfnk
Copy link
Contributor Author

sbfnk commented Oct 12, 2023

I can see two options here, both involving breaking changes but one slightly more than the other

  1. Keep things largely as they are but add meanlog, meanlog_sd, meansd and meansd_sd arguments to dist_spec. These will be used with a lognormal, whereas the natural ones will be used for the gamma. If the user specified e.g. a mean and sd then these will be converted but this will only work for the fixed case, i.e. an error is thrown if any of the _sd arguments is >0 (as converting the uncertainty for transformed parameters is not trivial).
  2. Same but as a single params argument to dist_spec. If this contains not the stan model parameters (mean/sd for gamma, mu/sigma for lognormal) it gets passed on to dist_skel to generate samples that then get fitted with estimate_delay.

Any other suggestions?

@sbfnk sbfnk added this to the CRAN v2.0 release milestone Oct 12, 2023
@sbfnk sbfnk self-assigned this Oct 12, 2023
@seabbs
Copy link
Contributor

seabbs commented Oct 12, 2023

The other alternative in 1. is to make everything including the gamma be parameterised with the names of the parameters used in stan and then link out to those. That would help with documentation etc.

It does mean that the easy of parameterization would go away for the gamma but as most people are estimating these using shape,scale etc. the mean and sd are quite poor priors to use (because of the correlations between them).

If we went with two we could support different parameters for different distributions and potentially multiple for the same distribution (this would mean doing a transformation internally (in both R and stan)).

@seabbs
Copy link
Contributor

seabbs commented Oct 12, 2023

we could make this a bit fancy and specify the dist using some sugar so that you write gamma(shape = 1212, scale = 1212) rather than the list based approach. Then have internal functions for handling each sugar type (and transforming args etc).

@seabbs
Copy link
Contributor

seabbs commented Oct 12, 2023

(as converting the uncertainty for transformed parameters is not trivial).

Unless we also do this in stan but it adds quite a lot of overhead (as above)

@sbfnk
Copy link
Contributor Author

sbfnk commented Oct 12, 2023

If we went with two we could support different parameters for different distributions and potentially multiple for the same distribution (this would mean doing a transformation internally (in both R and stan)).

Yes I suspect that would be the more user friendly option as could also e.g. parameterise using a median etc. but it's potentially less transparent for the user what's happening under the hood with bootstraped_dist_fit or the like.

@sbfnk
Copy link
Contributor Author

sbfnk commented Oct 12, 2023

we could make this a bit fancy and specify the dist using some sugar so that you write gamma(shape = 1212, scale = 1212) rather than the list based approach. Then have internal functions for handling each sugar type (and transforming args etc).

Yes potentially though I don't think I'd know how to do it.

@seabbs
Copy link
Contributor

seabbs commented Oct 12, 2023

I think it would be just some regex stuff? I did a similar thing for the rw() sugar in epinowcast.

@seabbs
Copy link
Contributor

seabbs commented Oct 12, 2023

or literally have gamma be a function would be the other option (which is what say rstanarm does).

@sbfnk
Copy link
Contributor Author

sbfnk commented Oct 12, 2023

I think it would be just some regex stuff? I did a similar thing for the rw() sugar in epinowcast.

Wouldn't this add a whole layer of required input checking, and also preclude us from passing e.g. variables or function calls etc.? Perhaps fine but seems a heavy price for a small (even nonzero?) gain.

@sbfnk
Copy link
Contributor Author

sbfnk commented Oct 12, 2023

or literally have gamma be a function would be the other option (which is what say rstanarm does).

I could see how that could work. But again seems to add some overhead and I'm not even sure that

dist_spec(lognormal(meanlog = 3, sdlog = 1))

is that much better than

dist_spec(distribution = "lognormal", params = c(meanlog = 3, sdlog = 1))

but perhaps it is. Definitely less typing.

@seabbs
Copy link
Contributor

seabbs commented Oct 20, 2023

I think it looks more flashy but I'm not sure it is clearly better. It might help a bit with documentation though (as the parameters that the lognormal need can be in the lognormal docs rather than all in the dist_spec docs.

@Bisaloo
Copy link
Member

Bisaloo commented Oct 25, 2023

I think it would be just some regex stuff? I did a similar thing for the rw() sugar in epinowcast.

I think it's complicated and brittle to use regexes to manipulate code, or more generally treat code as text (eval(parse())). If going this way, I would recommend directly manipulating the syntax tree, the same way lintr is doing it.

@seabbs
Copy link
Contributor

seabbs commented Nov 1, 2023

So could we not try get rid of dist_spec and instead inherit from it for our distributions? So you would just do.

lognormal(meanlog = 3, sdlog = 1)

directly and under the hood that would use dist_spec stuff?

@github-project-automation github-project-automation bot moved this from In Progress to Done in EpiNow2 v1.5.0 Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

3 participants