
Consider making channel scaling part of the model definition #1383

Open
ricardoV94 opened this issue Jan 16, 2025 · 9 comments

ricardoV94 commented Jan 16, 2025

When working on #1357 to make the optimizer model-agnostic, I still had to worry about channel scales, because these are not part of the model. I imagine they are applied as a pre-processing step when the model is defined?

If instead the model were defined with the raw data and the scaling happened symbolically, that extra handling wouldn't be needed. Is there any part of the codebase that requires sometimes applying the scale and other times not?

with pm.Model() as m:
  natural_x = pm.Data("x", ...)
  rescaled_x = natural_x / natural_x.max().eval()  # So it doesn't change when you change `natural_x`.
  ... # Make use of rescaled_x.

If we needed a function that takes `rescaled_x` as input, that would also be easy: wrap the operation in a `Deterministic`, which gives us a handle to it later.
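For illustration, a minimal sketch of that idea (the data and variable names below are invented, not from any PR):

import pymc as pm

with pm.Model() as m:
    natural_x = pm.Data("x", [1.0, 2.0, 4.0])
    scale = natural_x.max().eval()  # frozen at model-definition time
    # Wrapping the scaled data in a Deterministic gives us a named handle
    # that can be retrieved later, e.g. via m["rescaled_x"].
    rescaled_x = pm.Deterministic("rescaled_x", natural_x / scale)
    beta = pm.Normal("beta")
    pm.Normal("y", mu=beta * rescaled_x, sigma=1.0, observed=[0.2, 0.5, 1.0])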

ricardoV94 changed the title from "Make channel scaling part of the model definition" to "Consider making channel scaling part of the model definition" on Jan 16, 2025

wd60622 commented Jan 16, 2025

Totally. Does this work with scaling the regression target though?

ricardoV94 commented Jan 16, 2025

> Totally. Does this work with scaling the regression target though?

I don't know exactly what you're asking :)


wd60622 commented Jan 16, 2025

This doesn't work (a simple version of the model, showing how I am interpreting your suggestion):

import numpy as np
import pymc as pm

seed = sum(map(ord, "Scaling the likelihood depended variables doesn't work in PyMC"))
rng = np.random.default_rng(seed)

true_mu = 100
true_sigma = 30

n_obs = 10
coords = {
    "date": np.arange(n_obs),
}

dist = pm.Normal.dist(mu=true_mu, sigma=true_sigma, shape=n_obs)
data = pm.draw(dist, random_seed=rng)

scaling = data.max()

with pm.Model(coords=coords) as model:
    mu = pm.Normal("mu")
    sigma = pm.HalfNormal("sigma")

    target = pm.Data("target", data, dims="date")
    scaled_target = target / scaling

    pm.Normal("observed", mu=mu, sigma=sigma, observed=scaled_target)

Results in:

TypeError                              Traceback (most recent call last)
Cell In[6], line 30
     27 target = pm.Data("target", data, dims="date")
     28 scaled_target = target / scaling
---> 30 pm.Normal("observed", mu=mu, sigma=sigma, observed=scaled_target)

File ~/micromamba/envs/pymc-marketing-dev/lib/python3.10/site-packages/pymc/distributions/distribution.py:513, in Distribution.__new__(cls, name, rng, dims, initval, observed, total_size, transform, default_transform, *args, **kwargs)
    509         kwargs["shape"] = tuple(observed.shape)
    511 rv_out = cls.dist(*args, **kwargs)
--> 513 rv_out = model.register_rv(
    514     rv_out,
    515     name,
    516     observed=observed,
    517     total_size=total_size,
    518     dims=dims,
    519     transform=transform,
    520     default_transform=default_transform,
    521     initval=initval,
    522 )
    524 # add in pretty-printing support
    525 rv_out.str_repr = types.MethodType(str_for_dist, rv_out)

File ~/micromamba/envs/pymc-marketing-dev/lib/python3.10/site-packages/pymc/model/core.py:1245, in Model.register_rv(self, rv_var, name, observed, total_size, dims, default_transform, transform, initval)
   1243 else:
   1244     if not is_valid_observed(observed):
-> 1245         raise TypeError(
   1246             "Variables that depend on other nodes cannot be used for observed data."
   1247             f"The data variable was: {observed}"
   1248         )
   1250     # `rv_var` is potentially changed by `make_obs_var`,
   1251     # for example into a new graph for imputation of missing data.
   1252     rv_var = self.make_obs_var(
   1253         rv_var, observed, dims, default_transform, transform, total_size
   1254     )

TypeError: Variables that depend on other nodes cannot be used for observed data.The data variable was: True_div.0


wd60622 commented Jan 16, 2025

However, doing this for the covariates in the model works fine, so breaking this into two steps (covariates and target) would be fine.
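To make the two-step idea concrete, here is a rough sketch (variable names and data are invented for illustration): covariates are scaled symbolically inside the model, while the target is scaled eagerly before being passed to observed.

import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
y = rng.normal(loc=100, scale=30, size=10)

target_scale = y.max()

with pm.Model() as model:
    x_data = pm.Data("x", X)
    # Covariate scaling stays in the graph; `.eval()` freezes the scale
    # so it does not change when `x_data` is swapped out later.
    x_scaled = x_data / x_data.max(axis=0).eval()

    beta = pm.Normal("beta", shape=3)
    sigma = pm.HalfNormal("sigma")
    mu = x_scaled @ beta

    # The target must be scaled outside the graph for now, since `observed`
    # cannot depend on other nodes.
    pm.Normal("y", mu=mu, sigma=sigma, observed=y / target_scale)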


ricardoV94 commented Jan 16, 2025

Ah, I see what you mean. We can and should make `observed` less restrictive in PyMC. As long as it is a function of the data that involves no RVs / value_vars, it should be fine.

We already have a bunch of exceptions, e.g. for casting and for Minibatch (which actually involves RVs, but carefully defined ones).
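One workaround that should already work today (a sketch, not an official recommendation) is to evaluate the data-only expression eagerly before handing it to observed, reusing the names from the failing example above:

with pm.Model(coords=coords) as model:
    mu = pm.Normal("mu")
    sigma = pm.HalfNormal("sigma")

    target = pm.Data("target", data, dims="date")
    scaled_target = target / scaling

    # `.eval()` turns the data-only expression into a concrete array,
    # which `observed` accepts; no RVs or value variables are involved.
    pm.Normal("observed", mu=mu, sigma=sigma, observed=scaled_target.eval(), dims="date")

Note that this loses the symbolic link to `target`, so swapping the data later would not re-apply the scaling automatically.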


wd60622 commented Jan 16, 2025

Cool. Seems good to work toward then. Are there open PyMC issues for this?

ricardoV94 commented Jan 16, 2025

> Cool. Seems good to work toward then. Are there open PyMC issues for this?

I know it has been talked about repeatedly, but I can't find an issue.


wd60622 commented Jan 16, 2025

Cool. I know there are some related issues in pymc-marketing.

Related to #154, #407, #299, and others that are linked there.


cetagostini commented Jan 20, 2025

@ricardoV94 @wd60622 The following code works for me:

with pm.Model(
    coords=self.model_coords,
) as self.model:
    _channel_scale = pm.Data(
        "channel_scale",
        self.scalers._channel.values,
        mutable=False,
        dims="channel",
    )
    _target_scale = pm.Data(
        "target_scale",
        self.scalers._target.item(),
        mutable=False,
    )

    # Scale `channel_data` and `target`
    channel_data_ = pm.Data(
        name="channel_data",
        value=(
            self.xarray_dataset._channel.transpose(
                "date", *self.dims, "channel"
            ).values
            / _channel_scale.eval()
        ),
        dims=("date", *self.dims, "channel"),
    )

    target_ = pm.Data(
        name="target",
        value=(
            self.xarray_dataset._target.sum(dim="target")
            .transpose("date", *self.dims)
            .values
        ),
        dims=("date", *self.dims),
    )

    # ...

    mu_var *= _target_scale.eval()

    mu = pm.Deterministic(name="mu", var=mu_var, dims=("date", *self.dims))

    self.model_config["likelihood"].dims = ("date", *self.dims)
    self.model_config["likelihood"].create_likelihood_variable(
        name=self.output_var,
        mu=mu,
        observed=target_,
    )

You can see the full implementation in PR #1036.
