Add new RandomVariable Op and optimizations #137

Merged

Conversation

@brandonwillard (Member) commented Oct 30, 2020

This PR adds a more optimizable and robust Op for random variables aptly named RandomVariable.

  • Add RandomVariable Op and tests
    • Broadcastable multivariate normal, Dirichlet, categorical, and multinomial Ops (i.e. they support multiple stacked sets of independent parameter arguments, along with support for a size parameter)
  • Remove RandomFunction, its modules, and the functions that depend on it.
  • Create DimShuffle lift optimization for RandomVariables
    • E.g. normal(mean, std).T is replaced with normal(mean.T, std.T)
  • Create *Subtensor lift optimization for RandomVariables
    • E.g. normal(mean, std)[idx] is replaced with normal(mean[idx], std[idx]) (a distributional-equivalence check for these rewrites is sketched just below)
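As a NumPy-only illustration of why both lift rewrites are valid in distribution (this is not the PR's Theano code, and the example arrays are made up), transposing the parameters before sampling agrees, up to Monte Carlo error, with transposing the samples; the same argument applies to indexing:

import numpy as np

rng = np.random.default_rng(0)
mean = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
std = np.full_like(mean, 0.1)

n = 200_000
# "normal(mean, std).T" vs. its lifted form "normal(mean.T, std.T)"
a = rng.normal(mean, std, size=(n,) + mean.shape).transpose(0, 2, 1)
b = rng.normal(mean.T, std.T, size=(n,) + mean.T.shape)

# Per-element moments agree up to Monte Carlo error; exact numeric equality
# for a shared seed is a separate question (see the discussion further down).
assert np.allclose(a.mean(axis=0), b.mean(axis=0), atol=1e-2)
assert np.allclose(a.std(axis=0), b.std(axis=0), atol=1e-2)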

(This is also a replacement for #131 that comes from my new fork)

@codecov (bot) commented Oct 30, 2020

Codecov Report

Merging #137 (0505138) into master (ca215a2) will increase coverage by 0.24%.
The diff coverage is 98.37%.


@@            Coverage Diff             @@
##           master     #137      +/-   ##
==========================================
+ Coverage   71.26%   71.51%   +0.24%     
==========================================
  Files         158      162       +4     
  Lines       54376    54653     +277     
==========================================
+ Hits        38752    39084     +332     
+ Misses      15624    15569      -55     
Impacted Files                   Coverage Δ
theano/gpuarray/rng_mrg.py       36.50% <ø> (ø)
theano/sandbox/rng_mrg.py        90.66% <50.00%> (ø)
theano/tensor/basic.py           89.59% <78.57%> (-0.12%) ⬇️
theano/gof/graph.py              91.14% <93.33%> (+0.39%) ⬆️
theano/tensor/random/opt.py      97.27% <97.27%> (ø)
theano/tensor/random/op.py       99.38% <99.38%> (ø)
theano/compile/profiling.py      78.93% <100.00%> (ø)
theano/tensor/random/basic.py   100.00% <100.00%> (ø)
theano/tensor/random/type.py    100.00% <100.00%> (ø)
theano/tensor/random/utils.py   100.00% <100.00%> (ø)
... and 10 more

@brandonwillard (Member, Author) commented:

While attempting to create a lift optimization for DimShuffles on RandomVariables I came across an issue involving numeric reproducibility.

The problem is neatly summarized by the following NumPy-only example:
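(The example below doesn't show how mean and std were defined; a setup along these lines, inferred from the shape and scale of the printed arrays rather than taken from the PR, is consistent with them.)

>>> import numpy as np
>>> mean = np.arange(1.0, 7.0).reshape(2, 3)
>>> std = np.full((2, 3), 1e-4)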

>>> np.random.RandomState(123).normal(mean, std).T
array([[0.99989144, 3.99984937],
       [2.00009973, 4.99994214],
       [3.0000283 , 6.00016514]])

>>> np.random.RandomState(123).normal(mean.T, std.T)
array([[0.99989144, 4.00009973],
       [2.0000283 , 4.99984937],
       [2.99994214, 6.00016514]])

The first case is the numeric result one would obtain from a DimShuffled RandomVariable graph; the second is the lifted version of the same graph. The two are theoretically equivalent and, ideally, should produce the same numeric result for the same RNG and seed. As we can see, they do not.

Here's an example of how it could be made to work:

>>> (mean + std * np.random.RandomState(123).standard_normal((2, 3))).T
array([[0.99989144, 3.99984937],
       [2.00009973, 4.99994214],
       [3.0000283 , 6.00016514]])

>>> mean.T + std.T * np.random.RandomState(123).standard_normal((2, 3)).T
array([[0.99989144, 3.99984937],
       [2.00009973, 4.99994214],
       [3.0000283 , 6.00016514]])

Simply put, by implementing the affine transform ourselves, rather than relying on the one inside RandomState.normal, we can apply the transpose to the underlying block of standard normals; that ability is exactly what we're missing when we call RandomState.normal directly.
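To make that concrete, here is a hypothetical helper (mine, not NumPy's or this PR's) that performs the affine transform itself so the block of standard normals can be dimshuffled together with the parameters:

import numpy as np

def normal_dimshuffled(mean, std, seed, axes=None):
    # Draw the standard-normal block at the *original* parameter shape, then
    # apply the same transpose to the block and to the parameters.
    z = np.random.RandomState(seed).standard_normal(np.shape(mean))
    if axes is not None:
        mean, std, z = (np.transpose(a, axes) for a in (mean, std, z))
    return np.asarray(mean) + np.asarray(std) * z

# Exact (not merely distributional) equality between the unlifted and lifted
# forms for the same seed:
# np.array_equal(normal_dimshuffled(mean, std, 123).T,
#                normal_dimshuffled(mean, std, 123, axes=(1, 0)))  # -> True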

Since I don't think we want to effectively reimplement all of the samplers in NumPy's RandomState, we can either come up with a good workaround that preserves equality, or accept that the two graphs will produce different results even though they're theoretically equivalent.

@brandonwillard force-pushed the add-randomvariable-op branch 4 times, most recently from 9d168b9 to e877ff8 on November 9, 2020 00:56
@brandonwillard changed the title from "Add new RandomVariable Op and implementations" to "Add new RandomVariable Op and optimizations" on Nov 9, 2020
@brandonwillard force-pushed the add-randomvariable-op branch 2 times, most recently from d9411fb to 7c36c55 on November 10, 2020 05:09
@brandonwillard force-pushed the add-randomvariable-op branch 4 times, most recently from 661824b to 4199a7d on November 24, 2020 04:05
@brandonwillard marked this pull request as draft on November 24, 2020 04:31
@brandonwillard (Member, Author) commented:

A preliminary Subtensor lift optimization was just added; however, it needs one addition in order to work with multivariate distributions—and a lot more tests.

@brandonwillard (Member, Author) commented Nov 27, 2020

OK, it has occurred to me—in another context—that we should address the RNG consistency issue mentioned above if we want to apply these optimizations more often.

Problem

Imagine that we're creating a Theano graph for the NumPy operations that produce z in the following:

import numpy as np

seed = 34893
rng = np.random.RandomState(seed)

x = rng.normal(np.arange(2))

z = x - x[1]
>>>  z
array([-0.7960794,  0.       ])

Just as with NumPy, we would expect a Theano-PyMC graph for z to necessarily have a 0 for the element at index 1. This should also hold for any RNG state.

The naive local_subtensor_rv_lift rewrite rule would effectively substitute x[1] with np.random.RandomState(seed).normal(np.arange(2)[1]), which would only imply that the expectation of z[1] is 0. I.e.

rng = np.random.RandomState(seed)

x = rng.normal(np.arange(2))

rng_2 = np.random.RandomState(seed)
y = rng_2.normal(np.arange(2)[1])

z = x - y
>>>  z
array([-1.       , -0.2039206])

Unfortunately, that's not what the graph actually represents, so this rewrite is inconsistent.

As a simple way to avoid introducing this issue, we should not perform the rewrite if there's another reference to x in the graph; however, that would limit the applicability of the optimization. This restriction can be loosened a bit by allowing references to invariant properties (e.g. the shape of x) and not the values in x themselves.
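Schematically, the guard amounts to requiring that the indexing operation be the RandomVariable output's only consumer; over a generic variable-to-consumers mapping (this is not the actual Theano-PyMC rewrite machinery), it would look something like:

def subtensor_is_sole_client(clients, rv_out, subtensor_node):
    # `clients` maps each variable to the list of nodes that consume it.
    # Lifting `rv_out[idx]` is only safe when nothing else reads `rv_out`;
    # otherwise the rest of the graph would see different draws.
    return clients.get(rv_out, []) == [subtensor_node]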

Potential Solutions

RNG-based

We might be able to solve a larger number of cases using an RNG-based approach. Such an approach might also preserve numeric equality between graphs (i.e. equality of the graphs before and after a rewrite, as described above), but it will require some additional Theano-PyMC functionality.

The idea is that we track the number of elements to skip, which might not be too difficult in most cases, especially since we're already computing all the requisite shape and index information for the rewrites themselves. In other words, the Theano-PyMC RNG objects would carry a set of state "jumps" that determine the evolution of the internal RNG state based on the indexing applied to it.

The most basic way of implementing this could use a seed-based approach (offsets from a seed, really). This would work with all RNGs and samplers, but I'm not sure if it could be efficiently extended to blocks/slices of indices. It seems like we would have to ensure that all values were drawn individually from a flattened version of the array. This isn't difficult to do, and it could be implemented in C/Cython to cut costs.
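As a rough, NumPy-only sketch of that flattened, element-by-element scheme (the helper names are mine, not the PR's), drawing each value individually means that reproducing the k-th element only requires burning the k draws that precede it under the same seed:

import numpy as np

def normal_elemwise(mean, seed):
    # Draw each element one at a time, in flat (C) order.
    rng = np.random.RandomState(seed)
    flat = np.asarray(mean, dtype=float).ravel()
    return np.array([rng.normal(m) for m in flat]).reshape(np.shape(mean))

def normal_elemwise_at(mean, seed, k):
    # Reproduce only the k-th flat element by discarding the first k draws.
    rng = np.random.RandomState(seed)
    flat = np.asarray(mean, dtype=float).ravel()
    for m in flat[:k]:
        rng.normal(m)  # burn the draws that precede position k
    return rng.normal(flat[k])

# normal_elemwise(np.arange(2), 34893)[1] == normal_elemwise_at(np.arange(2), 34893, 1)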

Alternatively, we could—and eventually should—add support for at least one of the two more flexible NumPy BitGenerators: PCG64 and/or Philox. These RNGs implement an .advance method that would allow us to manipulate the state in a manner that preserves consistency between shuffles and subsets of RandomVariable arrays.

Our simple example above can be fixed in this way:

drng = np.random.default_rng(seed)
x = drng.normal(np.arange(2))

drng = np.random.default_rng(seed)
# Move the state forward so that the next sample matches the second entry in
# `x`
drng.bit_generator.advance(1)
y = drng.normal(np.arange(2)[1])

z = x - y
>>> z
array([-2.68521984,  0.        ])

Naturally, this .advance-based approach won't work for certain samplers (e.g. rejection-based ones), but it should work for more than a few of the samplers for basic random variables.

Unfortunately, this approach would end up sampling the same value multiple times throughout a graph if it's implemented without some form of caching.

That issue aside, these RNG-based approaches have a direct correspondence with the actual source of change between rewrites (i.e. the RNG state), which adds to their appeal. In other words, indexing is equivalent to shifting an abstract RNG state: normal(mean, stddev, rng)[index] is converted to normal(mean[index], stddev[index], new_rng).
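A hedged sketch of that correspondence using NumPy's Generator API (the helper is hypothetical, and the one-state-increment-per-element assumption fails for samplers that consume a variable amount of state per draw, as noted above):

import numpy as np

def lifted_indexed_normal(mean, seed, index):
    # Hypothetical "normal(mean[index], new_rng)": advance the bit generator
    # by the flat offset of `index` instead of drawing the whole array.
    mean = np.asarray(mean, dtype=float)
    offset = int(np.ravel_multi_index(index, mean.shape))
    rng = np.random.default_rng(seed)
    rng.bit_generator.advance(offset)  # the "state jump" implied by the indexing
    return rng.normal(mean[index])

# E.g. lifted_indexed_normal(np.arange(2), seed, (1,)) plays the role of `y`
# in the example above.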

Graph-based

We could also attempt to synchronize slices of x throughout the graph by replacing the rewritten RandomVariables with stand-ins that are updated in place. In effect, we would replace indexed random arrays with a kind of sparse, lazy random array: when elements are indexed, values are generated and permanently saved for those index locations.

This is a nice solution because it would work for any RNG and sampling method. It would also avoid the RNG-based issue of producing duplicate samples, since it's effectively an extreme type of the caching needed to reduce duplicates in that approach.

Unfortunately, it would incur most of the same overhead that sparse arrays do, though some of that could be mitigated by a simple, low-level C implementation, at least for certain key steps. It also doesn't address the simpler issue of numerical consistency between the pre- and post-rewrite graphs.
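A minimal sketch of such a stand-in (plain Python/NumPy; the class is hypothetical and ignores performance): values are drawn on first access and cached by flat position, so repeated or overlapping indexing of the same array always agrees, even though the values depend on access order rather than matching an eager draw of the full array:

import numpy as np

class LazyNormal:
    # Lazy, cached stand-in for `normal(mean)` that supports consistent indexing.

    def __init__(self, mean, rng):
        self.mean = np.asarray(mean, dtype=float)
        self.rng = rng
        self._cache = {}  # flat position -> drawn value

    def _draw(self, k):
        if k not in self._cache:
            self._cache[k] = self.rng.normal(self.mean.flat[k])
        return self._cache[k]

    def __getitem__(self, idx):
        # Map `idx` to the flat positions it selects and draw/look up each one.
        flat = np.arange(self.mean.size).reshape(self.mean.shape)[idx]
        vals = [self._draw(int(k)) for k in np.atleast_1d(flat).ravel()]
        return np.array(vals).reshape(np.shape(flat))

# x = LazyNormal(np.arange(2), np.random.default_rng(34893))
# (x[:] - x[1])[1] == 0.0 exactly, since x[1] reuses the cached draw.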

@brandonwillard force-pushed the add-randomvariable-op branch 3 times, most recently from 74f8b62 to a518bce on November 29, 2020 00:02
@brandonwillard (Member, Author) commented:

To keep things moving, we should probably disable the automatic use of these rewrites until a good RNG/rewrite-consistency solution is worked out. I'll create a separate issue for that.

@brandonwillard marked this pull request as ready for review on December 5, 2020 23:34
@brandonwillard force-pushed the add-randomvariable-op branch 5 times, most recently from 52495db to b807eea on December 7, 2020 19:08
This optimization does *not* preserve equality between the numeric
results of the untransformed and transformed graphs when the RNGs and seeds are
equal.  The reason is that the underlying sampler methods themselves are not
implemented in Theano, so we cannot apply the requisite DimShuffle-like
operations to the intermediate samples used to generate multiple replications
and/or independent variates.

For example, sampling a normal of size (3, 2) requires a draw of size (3, 2)
from a standard normal and we can't transpose that (3, 2) array.  If we could,
then we would be able to maintain numerical equality between graphs.
@brandonwillard merged commit 50c60b6 into aesara-devs:master on Dec 14, 2020
@brandonwillard deleted the add-randomvariable-op branch on December 14, 2020 00:46
@brandonwillard linked an issue on Dec 15, 2020 that may be closed by this pull request
@kyleabeauchamp (Contributor) commented:

Is there a branch of pymc3 that is compatible with this change? I'm running into an ImportError: cannot import name 'MRG_RandomStreams' exception when I try to use master pymc3 against master theano-pymc.

Development

Successfully merging this pull request may close these issues.

Merge RandomVariable Ops