Add opt_sample that performs optimizations prior to calling pymc.sample #396
base: main
Conversation
Force-pushed from 83953eb to dbe62e8
Force-pushed from 403a6bd to 2525ea8
Need to try it out. The example in the description is different from the one in the docstrings. Do both work?
Yes, I wanted an example that is faster than NUTS, but for testing it would be overkill.
conjugate_a = a + y
conjugate_b = b + (n - y)
extra_dims = range(binomial_rv.type.ndim - beta_rv.type.ndim)
Does this work in all cases?
I guess the cases where there is some type of ExpandDims on one of the parameters won't detect conjugacy, since the owner is not Beta:

import numpy as np
import pymc as pm
import pymc_extras as pmx

N = 100
X = np.repeat(
    [1, 50, 99],
    250,
).reshape(3, 250)
coords = {"idx": range(250), "kind": ["low", "mid", "high"]}
with pm.Model(coords=coords) as model:
    p_raw = pm.Beta("p", 1, 1, dims="kind")
    p = p_raw[:, None]
    y = pm.Binomial("y", n=N, p=p, observed=X, dims=("kind", "idx"))
    idata = pmx.opt_sample(verbose=True)
Or a case with shapes like this, which actually fails:

N = 100
X = np.repeat(
    [1, 50, 99],
    250,
).reshape(3, 250)
with pm.Model() as model:
    p = pm.Beta("p", 1, 1, shape=(3, 1))
    y = pm.Binomial("y", n=N, p=p, observed=X, shape=(3, 250))
    idata = pmx.opt_sample(verbose=True)
Yes, I'm being lazy and only allowing expand_dims on the left. I wanted to keep things simple for the expression of the conjugate beta alpha and beta params, especially with batch dims of the Binomial. We can make it more flexible.
To answer your first question, it does not. If alpha already had one dimension, I should sum the contributions from y and n - y and keep that dimension. Basically, I have to unbroadcast uses of the same alpha/beta parameters.
I guess that's what you're seeing as the failure in your second example.
Btw, this sort of thing is what's usually missing from the literature on conjugacy and Bayesian model rewrites. Things are much trickier once you start working with arrays of parameters.
I fixed the second example. The first one is explicitly excluded, as it was before. Consider it not-yet-implemented functionality.
The challenge is that we would need to map the alpha/beta through the transformations that happen afterwards, and a right expand_dims is kind of like a transpose. The left expand_dims is supported because that's just default numpy-like broadcasting.
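To illustrate the broadcasting argument, a minimal numpy sketch (not code from this PR):

import numpy as np

a = np.ones(3)           # beta parameter with shape (3,)
y = np.ones((250, 3))    # data batched on the left

# A left expand_dims is implicit in numpy broadcasting: (3,) aligns with
# (250, 3) as (1, 3), so summing the extra leading axis recovers the
# original (3,) shape of the beta parameter.
assert (a + y).sum(axis=0).shape == (3,)

# A right expand_dims (p[:, None]) instead aligns the beta against a
# (3, 250) batch, a different axis layout that the rewrite would have to
# track explicitly, much like a transpose.
p = a[:, None]           # shape (3, 1)
y2 = np.ones((3, 250))
assert (p + y2).shape == (3, 250)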
conjugate_a = a + y
conjugate_b = b + (n - y)
extra_dims = range(binomial_rv.type.ndim - beta_rv.type.ndim)
if extra_dims:
    conjugate_a = conjugate_a.sum(extra_dims)
    conjugate_b = conjugate_b.sum(extra_dims)
conjugate_beta_rv = Beta.dist(conjugate_a, conjugate_b)
Would summing over the data to compute the summary stats used be more general?
For instance, y.sum(...) and (n - y).sum(...)
That would mix independent dimensions of the beta. The model would no longer be equivalent to the original one.
Specifically, a might itself not be a scalar.
I am saying that the parameter p and its shape define the model, and to keep that. Make n and y broadcastable with p only, as that will also make them broadcastable with a and b. For instance,
a: ()
b: ()
p: ()
n: (100,)
y: (100,)
Then sum n and y to match the dims of p, i.e. (): n.sum(axis=0), y.sum(axis=0)
Another example:
a: (3,)
b: ()
p: (4, 3)  # Defined from shape
n: ()
y: (100, 4, 1)
n and y summed to something broadcastable with (4, 3):
n, y.sum(axis=0)
# Determined shape from p
y_stat = ...
n_stat = ...
# Cookie-cutter conjugate logic
a_post = a + y_stat
b_post = b + (n_stat - y_stat)
# Translate the conjugate parameters to PyMC distribution
conjugate_beta_rv = Beta.dist(alpha=a_post, beta=b_post) # Same shape as p before. Maybe shape needs to be passed as well
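A hypothetical sketch of how y_stat and n_stat could be computed by reducing the data to something broadcastable with p (plain numpy semantics, not code from this PR; the helper name is made up):

import numpy as np

def sum_to_match(x, param_shape):
    # Hypothetical helper: sum the extra leading axes of x that are not part
    # of param_shape, so the result broadcasts against a parameter of that shape.
    extra = x.ndim - len(param_shape)
    return x.sum(axis=tuple(range(extra))) if extra > 0 else x

y = np.ones((100, 4, 1))
y_stat = sum_to_match(y, (4, 3))  # shape (4, 1), broadcastable with p of shape (4, 3)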
Yup, that was the goal originally, but as you found out it was not implemented correctly. I was only handling new dims. The last push has the sum_bcast_dims that reduces a_post/b_post to the shape of the original beta RV.
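A rough sketch of what such a reduction could look like in numpy terms (hypothetical; the PR's actual sum_bcast_dims may differ):

import numpy as np

def sum_bcast_dims_sketch(post, target_shape):
    # Sum the extra leading axes, then sum (keeping dims) any axis where the
    # target shape is broadcast (length 1) but the posterior parameter is not.
    extra = post.ndim - len(target_shape)
    if extra > 0:
        post = post.sum(axis=tuple(range(extra)))
    axes = tuple(
        i for i, size in enumerate(target_shape)
        if size == 1 and post.shape[i] != 1
    )
    return post.sum(axis=axes, keepdims=True) if axes else post

# a of shape (3, 1) broadcast against y of shape (3, 250) gives a_post of
# shape (3, 250); the reduction brings it back to the beta's (3, 1) shape.
a_post = np.ones((3, 250))
assert sum_bcast_dims_sketch(a_post, (3, 1)).shape == (3, 1)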
a: (3,)
b: ()
p: (4, 3)  # Defined from shape
n: ()
y: (100, 4, 1)
This example is not possible/not supported. The (100, 4, 1) y cannot map to the (4, 3) of the p, unless you slice or something, which we don't allow.
Sure, but in the operation to get to posterior parameters, they are fine and broadcast to the desired shape.
Also, why not pass shape to new Beta.dist?
Sure, but in the operation to get to posterior parameters, they are fine and broadcast to the desired shape.
It matters how that shape went from 3 to 1. Were they summed/sliced/multiplied? Was there another operation in between, like exponentiation/reciprocal?
By only allowing direct use or left expand_dims, I rule out all those cases that I wouldn't know how to conjugate.
Also, why not pass shape to new Beta.dist?
Yup, it's doing that now.
Force-pushed from 575beee to a5c036a
@wd60622 I had forgotten to push the updates. I wasn't trying to gaslight you :D
Force-pushed from a5c036a to 7bdb03e
Force-pushed from 7bdb03e to c1809e8
Includes building blocks for #358
CC @wd60622 @theorashid
Hopefully it will make it easy for follow-up PRs to add more cases
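For reference, a minimal usage sketch based on the examples in this thread (assuming pymc_extras is imported as pmx; the exact call signature may differ):

import numpy as np
import pymc as pm
import pymc_extras as pmx

y_obs = np.array([12, 45, 90])

with pm.Model() as model:
    p = pm.Beta("p", 1, 1, shape=(3,))
    y = pm.Binomial("y", n=100, p=p, observed=y_obs)
    # opt_sample rewrites the model graph (e.g. exploiting Beta-Binomial
    # conjugacy) before calling pymc.sample on the optimized model.
    idata = pmx.opt_sample(verbose=True)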