Add new RandomVariable Op and optimizations #137
Codecov Report

```
@@            Coverage Diff             @@
##           master     #137      +/-   ##
==========================================
+ Coverage   71.26%   71.51%   +0.24%
==========================================
  Files         158      162       +4
  Lines       54376    54653     +277
==========================================
+ Hits        38752    39084     +332
+ Misses      15624    15569      -55
```
While attempting to create a lift optimization for `DimShuffle`s of `RandomVariable`s, I found that such a rewrite cannot, in general, reproduce the exact draws of the original graph. The problem is neatly summarized by the following NumPy-only example:

```python
>>> np.random.RandomState(123).normal(mean, std).T
array([[0.99989144, 3.99984937],
       [2.00009973, 4.99994214],
       [3.0000283 , 6.00016514]])
>>> np.random.RandomState(123).normal(mean.T, std.T)
array([[0.99989144, 4.00009973],
       [2.0000283 , 4.99984937],
       [2.99994214, 6.00016514]])
```

The first case is the numeric result one would obtain from a graph like `normal(mean, std).T`, and the second is what the lifted graph `normal(mean.T, std.T)` produces: the same underlying draws are consumed in a different order, so the values differ even though the seeds are identical.

Here's an example of how it could be made to work:

```python
>>> (mean + std * np.random.RandomState(123).standard_normal((2, 3))).T
array([[0.99989144, 3.99984937],
       [2.00009973, 4.99994214],
       [3.0000283 , 6.00016514]])
>>> mean.T + std.T * np.random.RandomState(123).standard_normal((2, 3)).T
array([[0.99989144, 3.99984937],
       [2.00009973, 4.99994214],
       [3.0000283 , 6.00016514]])
```

Simply put, by implementing the affine transform in the graph itself, the `DimShuffle` can be applied to the underlying standard normal draws, and the two graphs remain numerically equal. Since I don't think we want to effectively reimplement all the samplers in NumPy's `RandomState` this way, the lift optimization will generally only preserve equality in distribution, not exact numerical equality.
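For reference, the `mean` and `std` arrays aren't shown in the comment above; definitions consistent with the printed results would be the following (inferred from the output, so treat them as an assumption):

```python
import numpy as np

# Assumed values, reverse-engineered from the printed results above.
mean = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])
std = np.full((2, 3), 1e-4)
```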
OK, it has occurred to me (in another context) that we should address the RNG consistency issue mentioned above if we want to apply these optimizations more often.

### Problem

Imagine that we're creating a Theano graph for the NumPy operations that produce `z`:

```python
import numpy as np

seed = 34893
rng = np.random.RandomState(seed)

x = rng.normal(np.arange(2))
z = x - x[1]
```

```python
>>> z
array([-0.7960794, 0. ])
```

Just as with NumPy, we would expect a Theano-PyMC graph for `z` to produce the same result. The naive lifted graph, however, is equivalent to the following:

```python
rng = np.random.RandomState(seed)
x = rng.normal(np.arange(2))

rng_2 = np.random.RandomState(seed)
y = rng_2.normal(np.arange(2)[1])

z = x - y
```

```python
>>> z
array([-1. , -0.2039206])
```

Unfortunately, that's not what the graph actually represents, so this rewrite is inconsistent. As a simple way to avoid introducing this issue, we should not perform the rewrite if there's another reference to `x` elsewhere in the graph.

### Potential Solutions

#### RNG-based

We might be able to solve a larger number of cases using an RNG-based approach. Such an approach might also preserve numeric equality between graphs (i.e. equality of graphs pre-and-post rewrite, as described above), but it will require some additional Theano-PyMC functionality.

The idea is that we track the number of elements to skip, which might not be too difficult in most cases, especially since we're already computing all the requisite shape and index information for the rewrites themselves. In other words, the Theano-PyMC RNG objects would carry a set of state "jumps" that determine the evolution of the internal RNG state based on the indexing applied to it.

The most basic way of implementing this could use a seed-based approach (offsets from a seed, really). This would work with all RNGs and samplers, but I'm not sure if it could be efficiently extended to blocks/slices of indices. It seems like we would have to ensure that all values were drawn individually from a flattened version of the array. This isn't difficult to do, and it could be implemented in C/Cython to cut costs.

Alternatively, we could (and eventually should) add support for at least one of the more flexible NumPy bit generators whose streams can be advanced directly (e.g. `PCG64.advance`). Our simple example above can be fixed in this way:

```python
drng = np.random.default_rng(seed)
x = drng.normal(np.arange(2))

drng = np.random.default_rng(seed)
# Move the state forward so that the next sample matches the second entry in
# `x`
drng.bit_generator.advance(1)
y = drng.normal(np.arange(2)[1])

z = x - y
```

```python
>>> z
array([-2.68521984, 0. ])
```

Naturally, this uses a different RNG than the `RandomState` example, so the absolute values differ, but the important point is that the second entry of `z` is exactly zero again: the rewritten draw matches the corresponding element of the original.

Unfortunately, this approach would end up sampling the same value multiple times throughout a graph if it's implemented without some form of caching. Otherwise, these RNG-based approaches have a direct correspondence with the actual source of change between rewrites (i.e. the RNG state), which adds to their appeal. In other words, indexing a random variable is equivalent to shifting an abstract stream of draws.

#### Graph-based

We could also attempt to synchronize slices of the original and rewritten random variables at the graph level, so that an indexed variable reuses the values already drawn for the variable it was taken from.

This is a nice solution because it would work for any RNG and sampling method. It would also avoid the RNG-based issue of producing duplicate samples, since it's effectively an extreme type of the caching needed to reduce duplicates in that approach. Unfortunately, it would incur most of the same overhead that sparse arrays do, but some of that could be ameliorated by a simple, low-level C implementation, at least for certain key steps. It also doesn't address the simpler pre-and-post graph rewrite numerical consistency.
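As a rough NumPy-only sketch of the "draw each value individually from a flattened version of the array" idea mentioned above (the helper names and the simple per-element seed-offset scheme are purely illustrative assumptions, not anything implemented in this PR):

```python
import numpy as np

def flat_normal(seed, means):
    """Draw each element separately so every output element has a fixed
    position in an abstract stream of draws."""
    means = np.asarray(means, dtype=float)
    out = np.empty(means.size)
    for i, m in enumerate(means.ravel()):
        # One draw per element, offset from the base seed (illustrative only).
        out[i] = np.random.RandomState(seed + i).normal(m)
    return out.reshape(means.shape)

def flat_normal_at(seed, means, idx):
    """Sample only the element at flat position `idx` by reusing its offset."""
    m = np.asarray(means, dtype=float).ravel()[idx]
    return np.random.RandomState(seed + idx).normal(m)

x = flat_normal(34893, np.arange(2))
y = flat_normal_at(34893, np.arange(2), 1)
assert x[1] == y  # the indexed ("lifted") draw matches the original element
```

With a scheme like this, indexing corresponds to skipping ahead in the stream of draws, which is essentially what the seed-offset and `advance`-based proposals above would formalize.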
To keep things moving, we should probably disable the automatic use of these rewrites until a good RNG/rewrite-consistency solution is worked out. I'll create a separate issue for that.
This allows one to obtain the length of a fixed-length vector that has, for example, been cast to a different datatype, squared, etc.
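Presumably this refers to cases like `theano.tensor.get_vector_length`; here is a small sketch of the kind of usage the commit message describes (my own example, assuming the change behaves as stated):

```python
import theano.tensor as tt

x = tt.as_tensor_variable([0, 1, 2])  # a fixed-length vector
y = tt.cast(x, "float64") ** 2        # cast and squared; the length is unchanged
# With this change, the static length should still be recoverable:
assert tt.get_vector_length(y) == 3
```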
These tests now assume that the C VM versions of the scan `Op`s are faster than their standard Python counterparts.
This optimization does *not* preserve equality between the numeric results of the untransformed and transformed graphs when the RNGs and seeds are equal. The reason is that the underlying sampler methods themselves are not implemented in Theano, so we cannot apply the requisite `DimShuffle`-like operations to the intermediate samples used to generate multiple replications and/or independent variates. For example, sampling a normal of size (3, 2) requires a draw of size (3, 2) from a standard normal, and we can't transpose that (3, 2) array. If we could, then we would be able to maintain numerical equality between graphs.
Is there a branch of pymc3 that is compatible with this change? I'm running into a
This PR adds a more optimizable and robust `Op` for random variables aptly named `RandomVariable`.

It includes:

- A new `RandomVariable` `Op` and tests. The new `Op`s support multiple, stacked independent parameter arguments (with `size` parameter support, as well).
- Removal of `RandomFunction`, its modules, and the functions that depend on it.
- A `DimShuffle` lift optimization for `RandomVariable`s: `normal(mean, std).T` is replaced with `normal(mean.T, std.T)`.
- A `*Subtensor*` lift optimization for `RandomVariable`s: `normal(mean, std)[idx]` is replaced with `normal(mean[idx], std[idx])`.

(This is also a replacement for #131 that comes from my new fork.)
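For illustration, here's a NumPy-only sketch of what the `*Subtensor*` lift means numerically (the array values are my own example, not from the PR; the point is that the rewritten graph is equal in distribution but not bit-for-bit identical, as discussed in the comments above):

```python
import numpy as np

mean = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])
std = 1e-4

# Original graph: draw the full (2, 3) array, then index a column.
a = np.random.RandomState(123).normal(mean, std)[:, 1]

# Lifted graph: index the parameters first, then draw the smaller array.
b = np.random.RandomState(123).normal(mean[:, 1], std)

print(a)  # array([2.00009973, 4.99994214])
print(b)  # array([1.99989144, 5.00009973])
```

Both `a` and `b` are draws from the same distribution, but they consume the RNG stream differently, which is exactly the consistency issue the RNG/rewrite discussion above is about.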