Improve documentation on InferenceData.map #1252
In particular, I have two concrete examples of what you might want to do with InferenceData.map: (1) evaluating a new function of the model parameters after sampling, and (2) computing evaluation metrics (e.g. from sklearn) against the observed data.
Here's my current working code for what I'm trying to do. Even after reading #1255, I'm still not sure whether my use case falls under that proposal.
I have extended the docs a little. map basically loops over the groups to apply the function to all of them. I think map would be a perfect fit for 1. if the transformation had to be done to both prior and posterior; if it is a posterior-only transformation it can still be used and may be more convenient, but it would be equivalent to the following:
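A rough sketch of that equivalence, using the centered_eight example dataset and np.exp as a stand-in for the transformation fun:

```python
import numpy as np
import arviz as az

idata = az.load_arviz_data("centered_eight")
fun = np.exp  # stand-in for any Dataset -> Dataset transformation

# group-wise application, restricted to the posterior group ...
idata_mapped = idata.map(fun, groups="posterior", inplace=False)

# ... is roughly the same as transforming that single group directly:
idata.posterior = fun(idata.posterior)
```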
Metagroups are still quite experimental, so any comments will be very welcome. I think 2. requires access to both observed_data and posterior_predictive? I think it diverges from the group-wise case that map covers.
I have been thinking about this too lately. We would basically need to create a function, then have some simple way to combine InferenceData groups + external data, where the function arguments could be inferred / manually defined. Something along those lines would pull out the wanted output.
OK, I put together a really simple example of (1), which after your suggestions seems to be working well:
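A minimal sketch of what such an example might look like, assuming the centered_eight example dataset and a hypothetical derived quantity theta_std:

```python
import arviz as az

idata = az.load_arviz_data("centered_eight")

# A derived quantity computed after sampling, for every draw in every chain,
# without having added it to the model's compute graph.
def add_theta_std(ds):
    return ds.assign(theta_std=(ds["theta"] - ds["mu"]) / ds["tau"])

idata = idata.map(add_theta_std, groups="posterior")
print(idata.posterior["theta_std"].dims)  # ('chain', 'draw', 'school')
```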
I'll try to think about example (2) now.
Here's my code for example (2). I can't get the sklearn <> map stuff working because of the need for an implicit broadcast / aggregation across one dimension. I think making that commented out line work in an easily comprehensible way might be a win for usability.
PS: the reason I'm pushing for this functionality is that I think it would be great to have a clean separation between "model training" and "model evaluation", where we use the InferenceData interface to do model evaluation (using available metrics in, e.g., sklearn) after the model has been sampled.
I'd recommend reading the updated radon notebook from pymc-devs/pymc#3963 to see examples of postprocessing calculations; you can access the different variables as fields of the posterior Dataset. In the second example, in fact, using xarray should take care of all the broadcasting; the issue is that y0 is a numpy array, so xarray cannot broadcast it automatically. Converting it to a DataArray with named dimensions should fix that.
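A small sketch of that point, with a hypothetical observed vector y0 wrapped in a DataArray so xarray can align it by dimension name (using the centered_eight example dataset):

```python
import numpy as np
import xarray as xr
import arviz as az

idata = az.load_arviz_data("centered_eight")
post_pred = idata.posterior_predictive["obs"]   # dims: (chain, draw, school)

y0 = np.zeros(post_pred.sizes["school"])        # plain numpy: no dimension names

# Wrapping y0 in a DataArray with a named dim lets xarray align and broadcast
# it against the (chain, draw, school) posterior predictive samples.
y0_da = xr.DataArray(y0, dims=["school"], coords={"school": post_pred["school"]})
residuals = post_pred - y0_da                    # broadcasts over chain and draw
```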
Thanks, the new features (coords in pm.Model() and pm.Data) seem extremely useful for ensuring good bookkeeping and reproducibility in these situations. I still don't see how to solve the issue in (2), though. Basically, the issue there is that I specifically want to avoid re-writing the metric and instead leverage a unit-tested, open-source version from sklearn. To me, the only way to use that existing code is through some sort of "coordinate-aware" function like map, apply, agg, or reduce, where the user specifies which axes are iterated over and which ones are "presented to func".
Oh, my bad, I assumed the sklearn function would work; xarray preserves coords and broadcasts automatically with most numpy functions. It looks like you'll have to use xarray.apply_ufunc.
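A minimal sketch of that route, assuming the centered_eight example dataset and sklearn's mean_squared_error as the metric:

```python
import xarray as xr
import arviz as az
from sklearn.metrics import mean_squared_error

idata = az.load_arviz_data("centered_eight")
y_true = idata.observed_data["obs"]             # dims: (school,)
y_pred = idata.posterior_predictive["obs"]      # dims: (chain, draw, school)

# apply_ufunc hands the metric the "school" core dimension as a 1-D array,
# and vectorize=True loops that call over the remaining chain/draw dims.
mse = xr.apply_ufunc(
    mean_squared_error,
    y_true,
    y_pred,
    input_core_dims=[["school"], ["school"]],
    vectorize=True,
)
print(mse.dims)  # ('chain', 'draw'): one score per posterior predictive draw
```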
Closing, as the doc improvement has been merged and released in 0.9.0, but feel free to reopen if needed 😉
FYI, I was able to figure out my use case by combining arviz + xarray + sklearn metrics. I spent several hours staring at the docs for xarray.apply_ufunc.
For example, suppose you want to evaluate a new function of the model parameters (iterated for each step in each chain). Often, one might use pymc3.Deterministic to add those to the compute graph during model sampling, but I imagine that InferenceData.map might be a nice way to run such calculations without having to build them into the original compute graph at training time. I think a couple of examples and some guardrails on what fun ought to be might improve the usability here.
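For contrast, a rough sketch of the pymc3.Deterministic route, with an eight-schools-style model (variable names are made up for illustration):

```python
import numpy as np
import pymc3 as pm

y = np.array([28., 8., -3., 7., -1., 1., 18., 12.])
sigma = np.array([15., 10., 16., 11., 9., 11., 10., 18.])

with pm.Model() as model:
    mu = pm.Normal("mu", mu=0.0, sigma=5.0)
    tau = pm.HalfCauchy("tau", beta=5.0)
    theta = pm.Normal("theta", mu=mu, sigma=tau, shape=8)
    # Baked into the compute graph: computed and stored on every draw.
    pm.Deterministic("theta_std", (theta - mu) / tau)
    pm.Normal("obs", mu=theta, sigma=sigma, observed=y)
    idata = pm.sample(return_inferencedata=True)
```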