
Replace slow dist_math::incomplete_beta with aesara op #4519

Closed
Wants to merge 222 commits from the `inc_beta` branch

Conversation

@ricardoV94 (Member) commented Mar 9, 2021

This PR replaces the `incomplete_beta` method in `dist_math` with an Aesara Op `betainc` that wraps `scipy.special.betainc`.

This provides a large speedup (in both evaluation and graph compilation), even without providing a `c_code` implementation directly (that can always be added later). It also allows evaluating the `logcdf` for multiple values in `Beta`, `StudentT`, `Binomial`, `NegativeBinomial` (and the zero-inflated variants). I removed the previous limitation warning that was raised when evaluating these methods for multiple values. These changes are critical for any practical implementation of truncated versions of these distributions.

The gradients were adapted from the Stan implementation and require two extra "pseudo Aesara Ops" that simply wrap the equivalent Python functions. This does not allow for further graph optimizations, but I doubt Aesara could do much with the previous `scan` implementation either. Let me know if you think there is a better solution.
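For context, the function being wrapped is the regularized incomplete beta I_x(a, b). A minimal pure-Python continued-fraction implementation (a Numerical Recipes-style sketch, purely for illustration; the PR itself simply wraps `scipy.special.betainc` rather than reimplementing it) looks like this:

```python
import math

def _betacf(a, b, x, max_iter=200, eps=3e-12, tiny=1e-30):
    """Continued fraction for the incomplete beta (modified Lentz method)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    if abs(d) < tiny:
        d = tiny
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # even step of the continued fraction
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        h *= d * c
        # odd step of the continued fraction
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    # prefactor x^a * (1-x)^b / (a * B(a, b)), computed in log space
    ln_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(ln_front)
    # use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) where the
    # continued fraction converges faster
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b
```

For example, `betainc(2.0, 1.0, 0.4)` equals x² = 0.16, since I_x(2, 1) = x².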


Closes #4420

@ricardoV94 ricardoV94 changed the title Inc beta Replace slow dist_math::incomplete_beta with aesara op Mar 9, 2021
@ricardoV94 ricardoV94 force-pushed the inc_beta branch 2 times, most recently from df52381 to d15d752 Compare March 9, 2021 17:30
@OriolAbril (Member) commented:
Not sure whether this should be merged into master or into v4 🤔 cc @brandonwillard @twiecki @michaelosthege

@brandonwillard (Contributor) commented:
It shouldn't be difficult to rebase v4, so we could merge something like this. I think the problem has more to do with our release plan (e.g. are we going to make version 3 releases with the theano -> aesara change?).

@ricardoV94 (Member, Author) commented Mar 9, 2021

Strange failure on the 32-bit run, for the `logcdf` of the Beta distribution:

```
E               AssertionError:
E               Arrays are not almost equal to 3 decimals
E               {'alpha': array(100., dtype=float32), 'beta': array(0.1, dtype=float32), 'value': array(0.001, dtype=float32)}
E               x and y -inf location mismatch:
E                x: array(-inf, dtype=float32)
E                y: array(-697.172)
```

I don't understand why the Aesara Op would underflow but not the SciPy one. SciPy seems to basically call `betainc` and then take the logarithm, which is exactly what we are doing:

https://github.com/pymc-devs/pymc3/blob/53b642e4124feb0092ef5492685f44420f80cf3b/pymc3/distributions/continuous.py#L1319

https://github.com/scipy/scipy/blob/master/scipy/stats/_continuous_distns.py#L610-L611

AFAIK the SciPy version should also underflow to 0 (and then to `-inf` after the log) in a float32 environment:

```python
np.log(st.beta(100.0, 0.1).cdf(0.001).astype('float32'))
# RuntimeWarning: divide by zero encountered in log
# Out[32]: -inf
```
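The mismatch in the test output (`x` in float32, `y` apparently float64) can be reproduced with plain NumPy: a cdf value around exp(-697) ≈ 1e-303 is representable in float64 but flushes to zero in float32, so the log only becomes `-inf` in the 32-bit case (illustrative values, not the exact test case):

```python
import numpy as np

# About exp(-697): representable in float64, but far below the
# smallest positive float32 (~1.2e-38), so it underflows to 0.
cdf64 = np.float64(1e-303)
cdf32 = cdf64.astype(np.float32)

with np.errstate(divide="ignore"):
    log64 = np.log(cdf64)  # finite, roughly -697.7
    log32 = np.log(cdf32)  # -inf, because cdf32 underflowed to 0

print(log64, log32)
```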

@michaelosthege (Member) commented Mar 9, 2021

@ricardoV94 this error could indicate that `y` ran with float64:

```
E                x: array(-inf, dtype=float32)
E                y: array(-697.172)
```

@ricardoV94 (Member, Author) commented Mar 10, 2021

> @ricardoV94 this error could indicate that `y` ran with float64:
>
> ```
> E                x: array(-inf, dtype=float32)
> E                y: array(-697.172)
> ```

I was probably unclear. The error appeared in the 32-bit GH run: https://github.com/pymc-devs/pymc3/runs/2075254035?check_suite_focus=true

AFAIK both should have underflowed.

@ricardoV94 (Member, Author) commented:

In terms of v3 / v4, I think it's fine if this only appears in the v4 branch. I can toggle this as a draft until we are ready for it.

@ricardoV94 ricardoV94 force-pushed the inc_beta branch 4 times, most recently from 6ea14a8 to 780bd6e Compare March 10, 2021 12:40
@ricardoV94 ricardoV94 marked this pull request as draft March 11, 2021 07:01
@ricardoV94 (Member, Author) commented:

Waiting a bit on this one, following some discussion on the Stan side, to make sure these gradient approximations are good enough: stan-dev/math#2417

brandonwillard and others added 13 commits March 29, 2021 11:32
This value was not representative of its name.
…d basic dists

These changes can be summarized as follows:
- `Model` objects now track fully functional Theano graphs that represent all
relationships between random and "deterministic" variables.  These graphs are
called "sample-space" graphs.  `Model.unobserved_RVs`, `Model.basic_RVs`,
`Model.free_RVs`, and `Model.observed_RVs` contain these
graphs (i.e. `TensorVariable`s), which are generated by `RandomVariable` `Op`s.
- For each random variable, there is now a corresponding "measure-space"
variable (i.e. a `TensorVariable` that corresponds to said variable in a
log-likelihood graph).  These variables are available as `rv_var.tag.value_var`,
for each random variable `rv_var`, or via `Model.vars`.
- Log-likelihood (i.e. measure-space) graphs are now created for individual
random variables by way of the generic functions `logpt`, `logcdf`,
`logp_nojac`, and `logpt_sum` in `pymc3.distributions`.
- Numerous uses of concrete shape information stemming from `Model`
objects (e.g. `Model.size`) have been removed/refactored.
- Use of `FreeRV`, `ObservedRV`, `MultiObservedRV`, and `TransformedRV` has been
deprecated.  The information previously stored in these classes is now tracked
using `TensorVariable.tag`, and log-likelihoods are generated using the
aforementioned `log*` generic functions.
This commit changes `DictToArrayBijection` so that it returns a `RaveledVars`
datatype that contains the original raveled and concatenated vector along with
the information needed to revert it back to dictionary/variables form.

Simply put, the variables-to-single-vector mapping steps have been pushed away
from the model object and its symbolic terms and closer to the (sampling)
processes that produce and work with `ndarray` values for said terms.  In doing
so, we can operate under fewer unnecessarily strong assumptions (e.g. that the
shapes of each term are static and equal to the initial test points), and let
the sampling processes that require vector-only steps deal with any changes in
the mappings.
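The variables-to-vector mapping can be sketched in plain NumPy (hypothetical helper names, purely illustrative of what a `RaveledVars`-style datatype stores, not the actual PyMC3 API):

```python
import numpy as np

def ravel_vars(point):
    """Flatten a dict of arrays into one vector, keeping shape info to undo it."""
    info = [(name, np.asarray(val).shape) for name, val in point.items()]
    flat = np.concatenate([np.asarray(val).ravel() for val in point.values()])
    return flat, info

def unravel_vars(flat, info):
    """Invert ravel_vars: slice the flat vector back into named, shaped arrays."""
    out, offset = {}, 0
    for name, shape in info:
        size = int(np.prod(shape, dtype=int))
        out[name] = flat[offset:offset + size].reshape(shape)
        offset += size
    return out

point = {"mu": np.zeros((2, 3)), "sigma": np.ones(4)}
flat, info = ravel_vars(point)
roundtrip = unravel_vars(flat, info)
```

Because the shape info travels with the vector, the sampling code can work with a single `ndarray` while still recovering each variable exactly.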
The approach currently being used is rather inefficient.  Instead, we should
change the `size` parameters for `RandomVariable` terms in the sample-space
graph(s) so that they match arrays of the inputs in the trace and the desired
number of output samples.  This would allow the compiled graph to vectorize
operations (when it can) and sample variables more efficiently in large batches.
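The batched-sampling idea can be illustrated with NumPy's generator API, where a single call draws many samples per input by broadcasting the `size` argument against the parameter array (illustrative only, not the PyMC3 implementation):

```python
import numpy as np

rng = np.random.default_rng(42)
mu = np.array([0.0, 2.0, 5.0])  # e.g. three posterior values of a parameter

# One vectorized call draws 1000 samples for each of the three inputs;
# `size=(1000, 3)` broadcasts against the shape of `loc`.
samples = rng.normal(loc=mu, scale=1.0, size=(1000, 3))
print(samples.shape)
```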
Classes and functions removed:
- PyMC3Variable
- ObservedRV
- FreeRV
- MultiObservedRV
- TransformedRV
- ArrayOrdering
- VarMap
- DataMap
- _DrawValuesContext
- _DrawValuesContextBlocker
- is_fast_drawable
- _compile_theano_function
- vectorize_theano_function
- get_vectorize_signature
- _draw_value
- draw_values
- generate_samples
- fast_sample_posterior_predictive

Modules removed:
- pymc3.distributions.posterior_predictive
- pymc3.tests.test_random
brandonwillard and others added 18 commits May 15, 2021 16:20
* refactor Moyal distribution

* fixed ndim_params on moyal

Co-authored-by: Farhan Reynaldo <[email protected]>
* refactor kumaraswamy
* add logcdf method

Co-authored-by: Farhan Reynaldo <[email protected]>
Co-authored-by: Ricardo <[email protected]>
* refactor Skewnormal and Rice distribution
* add test for rice b and skewnorm tau params
* change default parameter to b
* Add float32 xfail to `test_rice`

Co-authored-by: Farhan Reynaldo <[email protected]>
Co-authored-by: Ricardo <[email protected]>
* Refactor Flat and HalfFlat distributions

* Re-enable Gumbel logp test

* Remove redundant test
@ricardoV94 ricardoV94 mentioned this pull request Jun 4, 2021
Successfully merging this pull request may close these issues.

dist_math.incomplete_beta is very slow