Implement ParetoNBD with covariates #545

ricardoV94 · 2024-02-22T14:24:12Z

Description

Related Issue

Supersedes and closes Add Covariates to ParetoNBDModel #463
Related to #

Checklist

Checked that the pre-commit linting/style checks pass
Included tests that prove the fix is effective or that the new feature works
Added necessary documentation (docstrings and/or example notebooks)
If you are a pro: each commit corresponds to a relevant logical change

Modules affected

MMM
CLV

Type of change

📚 Documentation preview 📚: https://pymc-marketing--545.org.readthedocs.build/en/545/

ColtAllen · 2024-02-22T14:45:19Z

I think an updated UML diagram would be a helpful addition to CONTRIBUTING.MD to better understand all these new class inheritances:

Here are some instructions on how to create one:

#178

codecov · 2024-02-22T16:31:32Z

Codecov Report

Attention: Patch coverage is 19.73684%, with 122 lines in your changes missing coverage. Please review.

Project coverage is 35.61%. Comparing base (6f7a3d4) to head (db06d52).

| Files | Patch % | Lines |
|---|---|---|
| pymc_marketing/clv/models/pareto_nbd.py | 20.68% | 115 Missing ⚠️ |
| pymc_marketing/clv/models/beta_geo.py | 0.00% | 5 Missing ⚠️ |
| pymc_marketing/clv/distributions.py | 0.00% | 2 Missing ⚠️ |

Additional details and impacted files

@@             Coverage Diff             @@
##             main     #545       +/-   ##
===========================================
- Coverage   91.56%   35.61%   -55.96%     
===========================================
  Files          21       21               
  Lines        2052     2137       +85     
===========================================
- Hits         1879      761     -1118     
- Misses        173     1376     +1203

ColtAllen · 2024-02-27T06:07:50Z

Code is certainly cleaner than the IF statement jungle of the PR I posted haha. Best way I can contribute is to create a child branch and work in parallel in the dev notebook for user-oriented feedback.

ricardoV94 · 2024-02-27T08:05:01Z

> Code is certainly cleaner than the IF statement jungle of the PR I posted haha. Best way I can contribute is to create a child branch and work in parallel in the dev notebook for user-oriented feedback.

It's always easier to refactor than write the first pass. The notebook would help a lot!

I'll be testing the covariate model today and fixing the stuff I've certainly broken.

ricardoV94 · 2024-03-01T18:34:31Z

@ColtAllen I finally pushed my last changes. I broke the pre-existing method API quite badly, but we can:

1. Go back and make it look like it did before, plus an optional argument for covariates in all methods
2. Use a new API but support the old API with deprecation warnings
3. Something else. I don't like asking for a whole dataframe in cases where the user is only changing one variable, but for covariates it makes sense to request a dataframe anyway, so it makes sense for the other variables to come in the same dataframe

I'm in favor of 2.
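For illustration only, option 2 could look roughly like the sketch below. The helper name `resolve_prediction_data` and the legacy keyword names are hypothetical and are not taken from this PR; the sketch only shows the general pattern of accepting a dataframe while still honouring old-style keyword arguments with a `DeprecationWarning`.

```python
import warnings

import pandas as pd

# Hypothetical names of the old per-variable keyword arguments.
LEGACY_ARGS = ("customer_id", "frequency", "recency", "T")


def resolve_prediction_data(data=None, **legacy):
    """Return a DataFrame from the new `data` argument or from old-style keywords."""
    unknown = set(legacy) - set(LEGACY_ARGS)
    if unknown:
        raise TypeError(f"Unexpected arguments: {sorted(unknown)}")
    if legacy:
        if data is not None:
            raise ValueError("Pass either `data` or the legacy keywords, not both.")
        # Old API: still works, but tells the caller to migrate.
        warnings.warn(
            "Passing individual arrays is deprecated; pass a DataFrame via `data`.",
            DeprecationWarning,
            stacklevel=2,
        )
        return pd.DataFrame(legacy)
    if data is None:
        raise ValueError("`data` is required.")
    return data
```

A predictive method would call such a helper first and then work only with the returned dataframe, so covariate columns and the usual variables travel together.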

Finally, the slow TestParetoNBDModelWithCovariates::test_inference is failing badly. I used synthetic data from fixed true parameters (so I could control the covariate coefficients), but it is not recovering those at all. Should I use different true parameter values?

review-notebook-app · 2024-03-01T18:49:52Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

ColtAllen · 2024-03-01T23:59:39Z

> @ColtAllen I finally pushed my last changes. I broke the pre-existing method API quite badly, but we can:
>
> 1. Go back and make it look like it did before, plus an optional argument for covariates in all methods
> 2. Use a new API but support the old API with deprecation warnings
> 3. Something else. I don't like asking for a whole dataframe in cases where the user is only changing one variable, but for covariates it makes sense to request a dataframe anyway, so it makes sense for the other variables to come in the same dataframe
>
> I'm in favor of 2.

Yes, let's do option 2.

> Finally, the slow TestParetoNBDModelWithCovariates::test_inference is failing badly. I used synthetic data from fixed true parameters (so I could control the covariate coefficients), but it is not recovering those at all. Should I use different true parameter values?

Yes - the true parameters you're using are for a model fit without covariates. When I fit a covariate model in the dev notebook of my previous PR, all parameter values changed.

I also noticed the covariates were synthesized and tacked onto an existing dataset. If we're to test via parameter recovery, we'll need to generate a full dataset from the parameters used for testing.

ricardoV94 · 2024-03-02T08:02:28Z

> I also noticed the covariates were synthesized and tacked onto an existing dataset. If we're to test via parameter recovery, we'll need to generate a full dataset from the parameters used for testing.

No, I'm generating new observations (recency and frequency) in that test as well

ColtAllen · 2024-03-06T16:45:21Z

> No, I'm generating new observations (recency and frequency) in that test as well

But why is only a single prior predictive sample being taken?

ricardoV94 · 2024-03-06T16:46:32Z

> > No, I'm generating new observations (recency and frequency) in that test as well
>
> But why is only a single prior predictive sample being taken?

One draw is enough, it contains an observation for each synthetic customer
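To make the point about a single draw concrete, here is a minimal NumPy sketch of generating one synthetic dataset from fixed "true" parameters. It is not the test from this PR, which uses a PyMC prior predictive draw. The function name, the default parameter values, and the covariate link (covariates rescaling the gamma rate parameters through `exp(-X @ coef)`) are assumptions made for illustration.

```python
import numpy as np


def simulate_pareto_nbd(covariates, coef_purchase, coef_dropout,
                        r=0.5, alpha0=10.0, s=0.6, beta0=12.0, T=40.0, seed=0):
    """Draw one (frequency, recency) pair per customer from a Pareto/NBD process."""
    rng = np.random.default_rng(seed)
    n = covariates.shape[0]
    # Assumed link: covariates scale the gamma rate parameters multiplicatively.
    alpha = alpha0 * np.exp(-covariates @ coef_purchase)
    beta = beta0 * np.exp(-covariates @ coef_dropout)
    purchase_rate = rng.gamma(shape=r, scale=1.0 / alpha)
    dropout_rate = rng.gamma(shape=s, scale=1.0 / beta)
    # Each customer is active until dropout or the end of the observation window.
    lifetime = rng.exponential(scale=1.0 / dropout_rate)
    active_time = np.minimum(lifetime, T)
    frequency = rng.poisson(purchase_rate * active_time)
    # Given k Poisson events on [0, a], the last one is distributed as a * U**(1/k).
    u = rng.uniform(size=n)
    last_event = active_time * u ** (1.0 / np.maximum(frequency, 1))
    recency = np.where(frequency > 0, last_event, 0.0)
    return frequency, recency


# One call already yields an observation for every synthetic customer.
X = np.random.default_rng(1).normal(size=(500, 2))
frequency, recency = simulate_pareto_nbd(X, np.array([0.5, -0.3]), np.array([0.2, 0.1]))
```

Because each customer contributes one row, the amount of information available for recovering the coefficients grows with the number of simulated customers, not with the number of draws.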

ColtAllen · 2024-03-06T17:40:16Z

> No, I'm generating new observations (recency and frequency) in that test as well

> One draw is enough, it contains an observation for each synthetic customer

I see. The new frequency column name is misspelled though, so it's testing on the original column rather than what was generated for the test.
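This kind of bug is easy to miss because assigning to a misspelled column name in pandas silently creates a new column instead of failing. One possible guard, sketched here with a hypothetical helper `replace_columns` that is not part of this PR, is to refuse any column name that does not already exist:

```python
import pandas as pd


def replace_columns(data, **new_columns):
    """Overwrite existing columns, rejecting names that would create new ones."""
    missing = [name for name in new_columns if name not in data.columns]
    if missing:
        raise KeyError(f"Columns not in data: {missing}")
    # `assign` returns a copy, so the original test fixture is left untouched.
    return data.assign(**new_columns)


original = pd.DataFrame({"frequency": [1, 2], "recency": [0.5, 1.5]})
synthetic = replace_columns(original, frequency=[7, 8])
```

With this guard, a typo such as `frequncy=` raises a `KeyError` rather than leaving the model to fit on the original `frequency` column.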

ColtAllen · 2024-03-13T21:24:27Z

Is the user API still a work in progress? All predictive methods require arguments to be converted into dataframe columns beforehand, which seems rather cumbersome.

ColtAllen reviewed Feb 22, 2024

pymc_marketing/clv/models/pareto_nbd.py (outdated, resolved)

ricardoV94 force-pushed the pareto_covar_ricardo branch from cb1d436 to 216257f Compare February 26, 2024 13:39

pymc-labs deleted a comment from BBDS-Colt Feb 27, 2024

ColtAllen reviewed Feb 27, 2024

pymc_marketing/clv/models/pareto_nbd.py (outdated, resolved)

ColtAllen reviewed Feb 27, 2024

pymc_marketing/clv/models/pareto_nbd.py (resolved)

ColtAllen reviewed Feb 27, 2024

pymc_marketing/clv/models/pareto_nbd.py (outdated, resolved)

ColtAllen reviewed Feb 27, 2024

pymc_marketing/clv/models/pareto_nbd.py (outdated, resolved)

ricardoV94 force-pushed the pareto_covar_ricardo branch 3 times, most recently from bc5a7f3 to 6ba6ee1 Compare February 28, 2024 19:24

ricardoV94 commented Feb 28, 2024

tests/clv/models/test_pareto_nbd.py (outdated, resolved)

This was referenced Mar 1, 2024

CLV models don't have working repr without model fit #559

Closed

Refactor CLV build_model logic #564

Merged

ricardoV94 force-pushed the pareto_covar_ricardo branch 3 times, most recently from 5e49dac to 839aaf0 Compare March 1, 2024 18:28

ricardoV94 force-pushed the pareto_covar_ricardo branch 2 times, most recently from cfd6d75 to 133fd36 Compare March 1, 2024 18:49

ricardoV94 force-pushed the pareto_covar_ricardo branch 2 times, most recently from 48770dd to 024e76f Compare March 4, 2024 12:13

ricardoV94 force-pushed the pareto_covar_ricardo branch 4 times, most recently from 4de7aba to 2ae0b94 Compare March 14, 2024 19:47

ricardoV94 marked this pull request as ready for review March 14, 2024 19:49

ricardoV94 added the enhancement and CLV labels Mar 14, 2024

ricardoV94 force-pushed the pareto_covar_ricardo branch 3 times, most recently from db45c51 to 652b0c9 Compare March 15, 2024 14:32

ColtAllen reviewed Mar 15, 2024

Add test util for setting fake data in CLV models

3e92610

ricardoV94 force-pushed the pareto_covar_ricardo branch from 652b0c9 to d64a4df Compare March 15, 2024 15:02

Implement ParetoNBD with covariates

db06d52

Co-authored-by: Colt Allen <[email protected]>

ricardoV94 force-pushed the pareto_covar_ricardo branch from d64a4df to db06d52 Compare March 15, 2024 15:08

ColtAllen approved these changes Mar 15, 2024

ricardoV94 merged commit 5aba73e into pymc-labs:main Mar 15, 2024
9 of 10 checks passed

ricardoV94 deleted the pareto_covar_ricardo branch March 15, 2024 15:46

ricardoV94 mentioned this pull request Mar 15, 2024

Fix bug in predictive methods of ParetoNBD model with covariates #589

Merged

13 tasks

ColtAllen mentioned this pull request Mar 20, 2024

Improve posterior predictive output of ParetoNBD #417

Closed
