
Implementing a wider selection of variable transformations #447

Closed
ricardoV94 opened this issue Dec 1, 2023 · Discussed in #442 · 1 comment · Fixed by #499
Labels: enhancement (New feature or request), help wanted (Extra attention is needed), MMM

Comments

@ricardoV94 (Contributor):

Discussed in #442

Originally posted by iraur November 23, 2023
Hello!
I was wondering if anyone has considered implementing alternatives for variable transformations, namely the Weibull PDF for adstock, where the (cumulative) decay rates are given by:
$$G_{k,\lambda}(t) = \frac{k}{\lambda}\Big(\frac{t}{\lambda} \Big)^{k-1}e^{-(\frac{t}{\lambda})^k}$$
and the Hill function for saturation, where transformed spend in week $t$ is given by:
$$x_t^{\textrm{transf}} = \frac{x_t^\alpha}{x_t^\alpha + \gamma^\alpha}$$
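To make the Hill equation above concrete, here is a minimal plain-Python sketch (illustrative parameter values, not from the thread). Note that $\gamma$ acts as the half-saturation point: spend equal to $\gamma$ is always transformed to 0.5, regardless of $\alpha$.

```python
def hill(x, alpha, gamma):
    # Hill saturation: x^alpha / (x^alpha + gamma^alpha)
    return x**alpha / (x**alpha + gamma**alpha)

# gamma is the half-saturation point for any alpha
print(hill(100.0, 2.0, 100.0))  # -> 0.5
print(hill(0.0, 2.0, 100.0))    # -> 0.0
```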
The main motivations for this are:

  • Weibull PDF offers much more flexibility than geometric adstock, which is sometimes criticised as too simplistic. As a result it can lead to better-performing models (see this white paper from Ekimetrics, for example)
  • Weibull PDF is preferred by business stakeholders as it can better model real life marketing effects
  • We started out our MMM journey using Robyn, Meta’s open-source MMM package, which uses Weibull PDF and Hill - it would be great for pymc-marketing and Robyn to align on these transformations (more direct comparisons, QA checks, etc.)

We have attempted to write a scrappy first version of these functions below and would very much appreciate any comments or feedback (still getting up to speed on pymc so appreciate your patience on any silly mistakes here!) 🤗

import numpy as np
import pymc as pm
import pytensor.tensor as pt
from pymc_marketing.mmm.transformers import batched_convolution

def weibull_pdf_adstock(x, shape, scale, l_max: int = 30, axis: int = 0):
    shape = pm.floatX(pt.as_tensor(shape)[..., None])
    scale = pm.floatX(pt.as_tensor(scale)[..., None])

    # Weibull PDF evaluated at integer lags 1..l_max
    t = pt.arange(l_max, dtype=x.dtype) + 1
    w = shape * pt.power(scale, -shape) * pt.power(t, shape - 1) * pt.exp(-pt.power(t / scale, shape))
    # min-max normalize the weights symbolically (no .eval() needed)
    w = (w - pt.min(w)) / (pt.max(w) - pt.min(w))
    return batched_convolution(x, w, axis=axis)

def saturation_hill(x, alpha, gamma, x_marginal=None):
    # inflexion point: linear interpolation between the min and max of x,
    # controlled by gamma in [0, 1]
    inflexion = (1 - gamma) * pt.min(x, axis=0) + gamma * pt.max(x, axis=0)

    if x_marginal is None:
        x_scurve = x**alpha / (x**alpha + inflexion**alpha)
    else:
        x_scurve = x_marginal**alpha / (x_marginal**alpha + inflexion**alpha)
    return x_scurve
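For reference, the Weibull adstock weights can be sanity-checked in plain numpy before wiring them into the pytensor graph. This is a sketch with illustrative parameters (shape=2, scale=5, l_max=10), not values from the thread:

```python
import numpy as np

l_max, shape, scale = 10, 2.0, 5.0
t = np.arange(1, l_max + 1, dtype=float)
# Weibull PDF at integer lags 1..l_max
w = (shape / scale) * (t / scale) ** (shape - 1) * np.exp(-((t / scale) ** shape))
# min-max normalization, mirroring the pytensor version above
w_norm = (w - w.min()) / (w.max() - w.min())
print(w_norm.max())  # -> 1.0
```

With shape > 1 the weights rise to a peak at a lag near the scale and then decay, which is the delayed-effect pattern geometric adstock cannot express.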

Thanks! cc. @louismagowan

ricardoV94 added the MMM, enhancement, and help wanted labels on Dec 1, 2023
@wd60622 (Contributor) commented Dec 1, 2023:

I'd be happy to help out here.

The other adstocks use the sum for normalization (i.e. x / x.sum()). Would that be something to implement for Weibull too?
I think there's a benefit to sum normalization: it keeps the total of the input unchanged.
What was the rationale behind this difference? Would there be a benefit to exposing these as options to the user, perhaps via an argument that is more than a boolean?
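The difference between the two normalization schemes can be sketched in plain numpy (illustrative Weibull parameters, not from the thread): min-max scaling pins the peak lag's weight to 1, while sum normalization makes the kernel sum to 1, so convolving with it roughly preserves total spend.

```python
import numpy as np

t = np.arange(1, 11, dtype=float)
# Weibull PDF weights with illustrative parameters (shape=2, scale=5)
w = (2.0 / 5.0) * (t / 5.0) * np.exp(-((t / 5.0) ** 2))

w_minmax = (w - w.min()) / (w.max() - w.min())  # min-max: peak lag gets weight 1
w_sum = w / w.sum()                             # sum: weights sum to 1

print(round(w_sum.sum(), 6))  # -> 1.0
```

Under sum normalization, each unit of spend contributes exactly one unit of adstocked effect spread across lags (up to edge effects), which is why the existing adstocks use it.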
