Add Cornac BPR deep dive notebook #937
Conversation
Staging to master
Check out this pull request on ReviewNB. You'll be able to see the Jupyter notebook diff and discuss changes.
Hi @yueguoguo, I see the failed checks. How do I get access to the logs to see the reasons?
We currently don't have a way for external contributors to trigger the tests from the forked branch, so these aren't failures due to your code. Thanks for putting this in. I'll take a look at it soon.
/azp run
Azure Pipelines could not run because the pipeline triggers exclude this branch/path.
@eisber, I'm trying to use the trick you told me about, but I'm getting this error:
Have you seen this error before? staging shouldn't be excluded
This is going to run on the fork, not staging, right? We would need to enable that somehow, but automatic triggers at pull request are just disabled.
Thanks @tqtg for the contribution! We may have to come back to this after sorting out some of the testing pipeline issues.
tests/unit/test_cornac_utils.py
Outdated
item = rating_true.iloc[1]["itemID"]
assert preds[(preds["userID"] == user) & (preds["itemID"] == item)][
    "prediction"
].values == pytest.approx(mf.rate(train_set.uid_map[user],
here we are cheating :-) because we are comparing two routines that are the same. What we want is to compare the prediction that comes from predict_rating
with a df explicitly created by us as a fixture, so you would need to generate a new df with the result.
Question, any reason to select MF instead of BPR for the test?
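A hedged sketch of the fixture-based comparison being suggested here, using pandas: the expected values are written out explicitly rather than recomputed with the routine under test. The column names follow the snippet above, but the values and the helper `assert_matches_fixture` are hypothetical, not taken from the repo.

```python
import pandas as pd

# Hypothetical expected output, written out explicitly as a fixture
# instead of being recomputed with the same scoring routine.
expected = pd.DataFrame(
    {"userID": [1, 1], "itemID": [10, 11], "prediction": [4.2, 3.1]}
)

def assert_matches_fixture(preds, expected, tol=1e-6):
    # Align rows on (userID, itemID) and compare predictions numerically.
    merged = preds.merge(expected, on=["userID", "itemID"], suffixes=("", "_expected"))
    assert len(merged) == len(expected)
    assert (merged["prediction"] - merged["prediction_expected"]).abs().max() < tol
```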
I think this test is still valid. We are testing ground-truth training data against the predictions from a trained MF model. In theory, MF should be able to overfit on this small input data, so there will be only a small approximation error here.
MF is used instead of BPR because I want to reuse tests from Surprise (MF is equivalent to SVD in Surprise). For the purpose of testing utility functions, it should be fine.
Hi @miguelgfierro ,
Tests are updated using some evaluation metrics. It might not be that clean since it involves the implementation of the metrics; however, I think it's reliable enough for unit tests.
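For illustration, a metric-backed unit test pins model output against a hand-computed metric value. Below is a minimal precision@k in plain Python — a generic sketch of the idea, not the repo's metric implementation.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are in the relevant set."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

# Example: 1 of the top-2 recommendations is relevant.
score = precision_at_k(["a", "b", "c"], relevant={"a", "c"}, k=2)  # -> 0.5
```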
tests/unit/test_cornac_utils.py
Outdated
train_set = cornac.data.Dataset.from_uir(rating_true.itertuples(index=False), seed=42)
mf.fit(train_set)

preds = predict_ranking(mf, rating_true, remove_seen=True)
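As a plain-Python illustration of what `remove_seen=True` is assumed to do in this utility — drop (user, item) pairs already observed in the training data from the ranked candidates (the actual implementation in the repo may differ):

```python
# (user, item) pairs observed during training -- hypothetical toy data.
train_pairs = {("u1", "i1"), ("u2", "i2")}

# All candidate (user, item) pairs the model could score.
candidates = [("u1", "i1"), ("u1", "i2"), ("u2", "i1"), ("u2", "i2")]

# remove_seen=True keeps only pairs not present in the training data.
unseen = [pair for pair in candidates if pair not in train_pairs]
```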
same here, you would need to input the result df as a fixture and compare it with the output of predict_ranking
updated!
tests/unit/test_cornac_utils.py
Outdated
assert preds["itemID"].dtypes == rating_true["itemID"].dtypes
user = preds.iloc[1]["userID"]
item = preds.iloc[1]["itemID"]
assert preds[(preds["userID"] == user) & (preds["itemID"] == item)][
same here
updated!
"\n",
"One usual approach for item recommendation is directly predicting a preference score $\\hat{x}_{u,i}$ given to item $i$ by user $u$. BPR uses a different approach by using item pairs $(i, j)$ and optimizing for the correct ranking given preference of user $u$, thus, there are notions of *positive* and *negative* items. The training data $D_S : U \\times I \\times I$ is defined as:\n",
"\n",
"$$D_S := \\{(u, i, j) \\mid i \\in I^{+}_{u} \\wedge j \\in I \\setminus I^{+}_{u}\\}$$\n",
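The training-set definition above can be made concrete with a small Python sketch. The users and items here are toy data, purely illustrative: for each user, every positive item is paired with every item outside the user's positive set.

```python
# All items I and each user's positive items I+_u (toy data).
items = {"i1", "i2", "i3"}
positives = {"u1": {"i1"}, "u2": {"i2", "i3"}}

# D_S := {(u, i, j) | i in I+_u and j in I \ I+_u}
D_S = {
    (u, i, j)
    for u, pos in positives.items()
    for i in pos
    for j in items - pos
}
```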
would you mind checking if the math is rendered correctly? I can't see it on GitHub
This is rendered properly on my browser.
All the formulas are rendered correctly on GitHub.
hey @tqtg I'm really impressed by how quickly you were able to understand our ways of working. You did a deep dive, added the tests, even added black. To be honest, not many people outside our team are able to ramp up that fast; even some folks at MS don't do it. I would like to ask you: how much time did you spend reading our documentation to be able to create this PR? Also, please tell us if you think something could be clearer, or if there is too much or too little info about anything.
Thank you @miguelgfierro for your kind words. To be honest, I didn't read the documentation but read through the codebase (it's quite easy to read and not that big though :D). To create the notebook, I mainly based it on other available notebooks and existing utilities. The whole project, to me, is quite clear and well structured, though some parts of the documentation are not rendered properly, I guess. I hope to contribute more to the project because it's very useful for the community!
Awesome @tqtg! Thanks again. Which part of the doc was rendered wrongly?
For example: https://microsoft-recommenders.readthedocs.io/en/latest/dataset.html
@tqtg good catch - it looks like the readthedocs build needs a pre-configured environment to successfully parse the source files. FYI @miguelgfierro
good catch @tqtg, thanks
@@ -0,0 +1,595 @@
{ |
updated!
Cool. OK @miguelgfierro, ready to merge?
this is awesome!
hey @tqtg, for large contributions like yours, we are adding the authors to this list: https://github.com/microsoft/recommenders/blob/staging/AUTHORS.md#contributors--sorted-alphabetically. Feel free to add your name if you want.
@miguelgfierro: can you tell me more about how to add myself? Would it be another PR from staging?
Description
Add Cornac Bayesian Personalized Ranking (BPR) deep dive notebook
Related Issues
#931
Checklist: