Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow reusing function in pymc.compute_log_likelihood #7073

Closed
DanielRobertNicoud opened this issue Dec 22, 2023 · 4 comments
Closed

Allow reusing function in pymc.compute_log_likelihood #7073

DanielRobertNicoud opened this issue Dec 22, 2023 · 4 comments

Comments

@DanielRobertNicoud
Copy link

Describe the issue:

Calling pymc.compute_log_likelihood multiple times on the same model leads to multiple compilations via compile_fn. This is a time sink (in some applications) that could easily be solved by storing the compiled function.

Reproduceable code example:

import numpy as np
import pymc

n = 5_000
n_feat = 3
X_train = np.random.normal(size=(n, n_feat))
y_train = np.random.normal(size=(n))

with pymc.Model() as model:
    # data containers
    X = pymc.MutableData("X", X_train)
    y = pymc.MutableData("y", y_train)
    # priors
    intercept = pymc.Normal("intercept", mu=0, sigma=1)
    b = pymc.MvNormal("b", mu=np.zeros(n_feat), cov=np.eye(n_feat))
    sigma = pymc.HalfCauchy("sigma", beta=10)
    mu = intercept + pymc.math.dot(X, b).flatten()
    # likelihood
    likelihood = pymc.Normal("obs", mu=mu, sigma=sigma, observed=y)

    idata = pymc.sample()

def compute_ll_twice(model, idata):
    n_test = 5
    X_test = np.random.normal(size=(n_test, n_feat))
    y_test = np.random.normal(size=(n_test))
    with model:
        pymc.set_data({"X": X_test, "y": y_test})
        for _ in range(5):
            out = pymc.compute_log_likelihood(idata, extend_inferencedata=False)

%prun -l 20 -s cumtime compute_ll_twice(model, idata)

Error message:

No response

PyMC version information:

pymc==5.10.1

Context for the issue:

In a research application I am writing, I need to call pymc.compute_log_likelihood many times (sometimes refitting the model in between). The calls to compile_fn take up to 60% of my computation time. If this could be easily fixed, it would be extremely helpful for me. Thank you very much in advance!

Copy link

welcome bot commented Dec 22, 2023

Welcome Banner
🎉 Welcome to PyMC! 🎉 We're really excited to have your input into the project! 💖

If you haven't done so already, please make sure you check out our Contributing Guidelines and Code of Conduct.

@ricardoV94
Copy link
Member

ricardoV94 commented Dec 22, 2023

All PyMC Model functions are recompiled everytime, including when you call pm.sample (or sample_posterior_predictive) multiple times with the same model. It's not trivial to know whether a model has changed of if the function we are requesting is equivalent to another that was requested in the past.

The compiled function itself is pretty simple to obtain:

elemwise_loglike_fn = model.compile_fn(
inputs=model.value_vars,
outs=model.logp(vars=observed_vars, sum=False),
on_unused_input="ignore",
)

What's a bit tricky is that transformed variables are not available in the trace. compute_log_likelihood has an ugly hack around this but there's a better solution now.

from pymc.model.conditioning import remove_value_transforms

model_wo_transforms = remove_value_transforms(model)
fn = model_wo_transforms.compile_fn(...)

# Reuse fn...

Then there's a bit of boilerplate to map a function over a trace that we could provide to users

@ricardoV94 ricardoV94 changed the title BUG: pymc.compute_log_likelihood calls compile_fn every time Allow reusing function in pymc.compute_log_likelihood Dec 22, 2023
@DanielRobertNicoud
Copy link
Author

@ricardoV94 That would be a very useful addition in the case where the model has not changed!

Maybe a way could be even to pass the compiled function to the compute_log_likelihood function as an optional argument?

@ricardoV94
Copy link
Member

Closing this is favor of #7177

Thanks for bringing this up!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants