Model API cleanup #6309
Merged: 6 commits merged into main from model_api_cleanup on Nov 18, 2022

Conversation

ricardoV94 (Member) commented Nov 17, 2022:

Trying to standardize mapping names a bit more to facilitate model transformations such as in pymc-devs/pymc-extras#91

Closes #6305
Closes #5076

Major / Breaking Changes

  • Sampling of transformed variables from prior_predictive is no longer allowed (see the sketch after this list)

Bugfixes / New features

  • ...

Docs / Maintenance

  • Rename several internal Model variables
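
To illustrate the breaking change flagged above, a minimal sketch (the model and variable names are made up, and the exact exception raised is not asserted here):

```python
import pymc as pm

with pm.Model() as model:
    sigma = pm.HalfNormal("sigma", 1.0)
    y = pm.Normal("y", 0.0, sigma, observed=[0.1, -0.3, 0.5])

    # Constrained free RVs and observed RVs can still be requested as before
    prior = pm.sample_prior_predictive(var_names=["sigma", "y"])

    # Requesting the transformed value variable is no longer supported; before
    # this PR it was returned when listed explicitly in var_names.
    # pm.sample_prior_predictive(var_names=["sigma_log__"])  # now raises an error
```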

@ricardoV94 force-pushed the model_api_cleanup branch 3 times, most recently from 8a81537 to 43e3f05 on November 17, 2022 at 10:15
@ricardoV94 force-pushed the model_api_cleanup branch 2 times, most recently from de31798 to 0121fd9 on November 17, 2022 at 10:30
codecov bot commented Nov 17, 2022

Codecov Report

Merging #6309 (60f15e1) into main (d4ff7ae) will increase coverage by 2.31%.
The diff coverage is 94.44%.

Additional details and impacted files


@@            Coverage Diff             @@
##             main    #6309      +/-   ##
==========================================
+ Coverage   91.87%   94.18%   +2.31%     
==========================================
  Files         111      111              
  Lines       23917    23908       -9     
==========================================
+ Hits        21973    22518     +545     
+ Misses       1944     1390     -554     
Impacted Files (Coverage Δ):
pymc/initial_point.py 100.00% <ø> (ø)
pymc/tests/distributions/test_continuous.py 99.76% <ø> (-0.01%) ⬇️
pymc/model.py 89.76% <85.71%> (-0.29%) ⬇️
pymc/backends/arviz.py 90.71% <100.00%> (+2.90%) ⬆️
pymc/data.py 80.08% <100.00%> (ø)
pymc/model_graph.py 78.82% <100.00%> (ø)
pymc/sampling/forward.py 95.45% <100.00%> (-0.11%) ⬇️
pymc/sampling/jax.py 98.19% <100.00%> (+0.85%) ⬆️
pymc/sampling/mcmc.py 92.27% <100.00%> (ø)
pymc/smc/kernels.py 97.41% <100.00%> (+0.03%) ⬆️
... and 8 more

@ricardoV94 added the major label (Include in major changes release notes section) on Nov 17, 2022
@ricardoV94 added this to the v4.4.0 milestone on Nov 17, 2022
@ricardoV94 changed the title from "Model api cleanup" to "Model API cleanup" on Nov 17, 2022
@ricardoV94 force-pushed the model_api_cleanup branch 2 times, most recently from 9246301 to 29e8e05 on November 17, 2022 at 14:18
michaelosthege (Member) left a comment:

Trusting that you'll fix that one test 👍

This also disables prior_predictive sampling of transformed variables
This property was initially added just to handle deterministics created by automatic imputation, in order to ensure that the combined tensor of missing and observed components shows up in prior and posterior predictive sampling. At the same time, it allowed hiding the deterministic during MCMC sampling, saving memory for large datasets. This last benefit is lost for the sake of simplicity. If a user is concerned, they can manually split the observed and missing components of a dataset when defining their model.
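
For reference, a rough sketch of that manual split (data, priors, and variable names are illustrative, not taken from this PR):

```python
import numpy as np
import pymc as pm

data = np.array([1.0, 2.0, np.nan, 4.0])
observed_mask = ~np.isnan(data)

with pm.Model() as model:
    mu = pm.Normal("mu", 0.0, 1.0)
    sigma = pm.HalfNormal("sigma", 1.0)
    # Observed component: only the non-missing entries enter the likelihood
    y_observed = pm.Normal("y_observed", mu, sigma, observed=data[observed_mask])
    # Missing component: modelled explicitly as free draws from the same
    # distribution, instead of relying on automatic imputation of a NaN/masked array
    y_missing = pm.Normal("y_missing", mu, sigma, shape=int((~observed_mask).sum()))
```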
@@ -343,7 +362,7 @@ def sample_prior_predictive(
     var_names : Iterable[str]
         A list of names of variables for which to compute the prior predictive
         samples. Defaults to both observed and unobserved RVs. Transformed values
-        are not included unless explicitly defined in var_names.
+        are not allowed.
pipme (Contributor) commented:

Any reason why transformed (unconstrained) values are not allowed here? I need to get prior samples in the transformed space, what should I do? Thanks!

(MCMC sampling with the observations removed from the likelihood is not a good option. I guess I could also use sample_prior_predictive to get prior parameter samples and then map them to the unconstrained space, but that does not seem easy to do with Transforms in an efficient, vectorized way, i.e. it seems I would have to use a for-loop.)

ricardoV94 (Member, Author) commented Jul 11, 2024:

A helper to go from unconstrained back to constrained draws is needed elsewhere (especially since we don't save them in InferenceData after sampling), so we can add that, and it will also cover your use case.

I think there is an open issue for that.

ricardoV94 (Member, Author):

This issue: #6721

pipme (Contributor) commented Jul 11, 2024:

Thanks for the quick reply! Transforming efficiently between unconstrained and constrained draws would be super useful for our use case. What would be needed to implement it? In the meantime, what would be a good workaround? (I need to transform a lot of prior samples from constrained to unconstrained. In fact, unconstrained prior samples are even more useful to me than constrained ones, so it would be better if I could get them directly.) Could sample_prior_predictive be changed back to include transformed variables like sd_log__?

For the InferenceData, pm.sample(..., idata_kwargs={"include_transformed": True}) seems to include the transformed samples.

pipme (Contributor) commented Jul 11, 2024:

Hi, please see my comment above. Any reason why transformed (unconstrained) values are not allowed in sample_prior_predictive? I need to get prior samples in the transformed space, what should I do? Thanks!

ricardoV94 (Member, Author):

No, we shouldn't change sample_prior_predictive, but you can of course change it in your local installation if that's an option for you. Or you can do something like this yourself: https://discourse.pymc.io/t/logp-questions-synthetic-dataset-to-evaluate-modeling/12129/6?u=ricardov94

I'll open a PR soon to add the functionality.
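
For illustration, a rough sketch of one possible workaround (not necessarily what the linked thread does), assuming the model exposes the rvs_to_transforms mapping and that the transform's forward method accepts the RV's inputs, as in recent PyMC versions:

```python
import pymc as pm

with pm.Model() as model:
    sigma = pm.HalfNormal("sigma", 1.0)
    idata = pm.sample_prior_predictive(500)

# Constrained prior draws, shape (chain, draw)
sigma_draws = idata.prior["sigma"].values

# Map the constrained draws to the unconstrained space in one vectorized call;
# for HalfNormal the registered transform is a log transform, so this should
# reproduce what the sigma_log__ value variable would contain.
transform = model.rvs_to_transforms[sigma]
sigma_unconstrained = transform.forward(sigma_draws, *sigma.owner.inputs).eval()
```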

Labels: maintenance, major (Include in major changes release notes section)
Projects: none yet
3 participants