Large refactor #1086

Merged
merged 67 commits into from
Jun 29, 2020
c47fe4d
rename and refactor :boom:
miguelgfierro Apr 16, 2020
5551d4a
rename and refactor :boom:
miguelgfierro Apr 16, 2020
f6b0453
refact
miguelgfierro Apr 27, 2020
6db4148
scenarios
miguelgfierro Apr 27, 2020
485c1f1
retail
miguelgfierro Apr 27, 2020
02bec05
retail
miguelgfierro Apr 27, 2020
9c35983
retail
miguelgfierro Apr 27, 2020
3e3756c
retail
miguelgfierro Apr 27, 2020
7864b8e
retail
miguelgfierro Apr 27, 2020
44772c7
comments @yueguoguo
miguelgfierro May 21, 2020
d79f878
Merge branch 'staging' into miguel/burn_and_destroy
miguelgfierro May 21, 2020
c6c20c5
Merge branch 'staging' into miguel/burn_and_destroy
miguelgfierro May 28, 2020
5db328f
advance
miguelgfierro Jun 4, 2020
8eb19fa
advance
miguelgfierro Jun 4, 2020
f3ddcae
advance
miguelgfierro Jun 4, 2020
f01dcb6
review
miguelgfierro Jun 4, 2020
c1baf1e
Merge branch 'staging' into miguel/burn_and_destroy
miguelgfierro Jun 11, 2020
1e78d52
scenarios
miguelgfierro Jun 11, 2020
60d9587
structure change
miguelgfierro Jun 11, 2020
7f44a9d
glossary
miguelgfierro Jun 11, 2020
61923c7
:boom:
miguelgfierro Jun 11, 2020
78986b7
readme
miguelgfierro Jun 12, 2020
36ed9e6
rewrite of retail readme for readability.
Jun 14, 2020
5b007f7
format
Jun 14, 2020
e1a5f51
glossary
miguelgfierro Jun 15, 2020
33c6e5e
:doc:
miguelgfierro Jun 15, 2020
40560c3
:doc:
miguelgfierro Jun 15, 2020
65dd13c
:doc:
miguelgfierro Jun 15, 2020
573e004
Update README.md
wutaomsft Jun 15, 2020
f42e8f5
wip
miguelgfierro Jun 15, 2020
63930e5
Merge branch 'miguel/burn_and_destroy' of github.com:microsoft/recomm…
miguelgfierro Jun 15, 2020
a442096
glossary
miguelgfierro Jun 15, 2020
47f9d25
glossary
miguelgfierro Jun 15, 2020
f572dcf
kg
miguelgfierro Jun 16, 2020
4930065
fix links
miguelgfierro Jun 16, 2020
97672a9
readme
miguelgfierro Jun 16, 2020
f022427
fix paths
miguelgfierro Jun 16, 2020
a46b18f
fix paths
miguelgfierro Jun 16, 2020
f156c0b
fix paths
miguelgfierro Jun 16, 2020
4f77506
rename
miguelgfierro Jun 16, 2020
5dd3f68
fix :bug: and paths
miguelgfierro Jun 16, 2020
123a737
tests
miguelgfierro Jun 17, 2020
14d7c50
fixing tests
miguelgfierro Jun 17, 2020
1ee91fa
:bug:
miguelgfierro Jun 17, 2020
d90c9a3
:bug:
miguelgfierro Jun 18, 2020
39705c1
typo
miguelgfierro Jun 18, 2020
e1bbd2a
fix :bug: test lightfm
miguelgfierro Jun 19, 2020
9d7c661
papers
miguelgfierro Jun 19, 2020
44b4843
papers
miguelgfierro Jun 19, 2020
da7cdbf
typo
miguelgfierro Jun 19, 2020
a6e441e
fixed :bug: with pymanopt
miguelgfierro Jun 22, 2020
57b0c8a
long tail
miguelgfierro Jun 22, 2020
c0185c1
spark
miguelgfierro Jun 22, 2020
b0f8a59
ignore
miguelgfierro Jun 22, 2020
841fc49
mmlspark lgb criteo
miguelgfierro Jun 22, 2020
871ef72
:bug:
miguelgfierro Jun 22, 2020
1881066
java8
miguelgfierro Jun 22, 2020
24b6ba9
benchmark
miguelgfierro Jun 22, 2020
4e9263a
retail
miguelgfierro Jun 23, 2020
16baaed
spark 2.4.3
miguelgfierro Jun 23, 2020
fd1eb0b
Update README.md
anargyri Jun 25, 2020
a281478
lightgcn
miguelgfierro Jun 25, 2020
d4a5244
fix :bug: in readme
miguelgfierro Jun 25, 2020
845964a
readms
miguelgfierro Jun 25, 2020
d5ae933
update authors
miguelgfierro Jun 25, 2020
f4c1f4d
merge staging
miguelgfierro Jun 29, 2020
930427f
:bug:
miguelgfierro Jun 29, 2020
72 changes: 7 additions & 65 deletions scenarios/README.md
@@ -1,74 +1,16 @@
# Recommendation System Scenarios

This section lists a number of business scenarios that are common in Recommendation Systems (RS).
This section lists a number of business scenarios that are common in Recommendation Systems.

The scenarios are:

* Ads
* Entertainment
* Food and restaurants
* News
* Retail
* Travel
* [Ads](ads)
* [Entertainment](entertainment)
* [Food and restaurants](food_and_restaurants)
* [News and document]()
* [Retail](retail)
* [Travel](travel)

## Types of Recommendation Systems

Typically recommendation systems in retail can be divided into three categories:

* Collaborative filtering: This type of recommendation system makes predictions of what might interest a person based on the taste of many other users. It assumes that if person X likes Snickers, and person Y likes Snickers and Milky Way, then person X might like Milky Way as well.

* Content-based filtering: This type of recommendation system focuses on the products themselves and recommends other products that have similar attributes. Content-based filtering relies on the characteristics of the products themselves, so it doesn’t rely on other users to interact with the products before making a recommendation.

* Hybrid filtering: This type of recommendation system can implement a combination of any two of the above systems.
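
The collaborative filtering intuition above (person X likes Snickers, person Y likes Snickers and Milky Way, so X might like Milky Way) can be illustrated with a small item co-occurrence sketch. This is only an illustration in plain Python, not the method used by any algorithm in the repository; the function name `recommend_by_cooccurrence` and the `likes` dictionary are hypothetical.

```python
from collections import defaultdict
from itertools import permutations


def recommend_by_cooccurrence(likes, user, top_k=3):
    """Recommend items that are liked together with items the user already likes.

    `likes` maps a user id to the set of items that user likes (hypothetical format).
    """
    # Count, for each ordered pair of items, how many users like both.
    cooccurrence = defaultdict(int)
    for items in likes.values():
        for a, b in permutations(sorted(items), 2):
            cooccurrence[(a, b)] += 1

    # Score unseen items by how often they co-occur with the user's items.
    seen = likes[user]
    scores = defaultdict(int)
    for (a, b), count in cooccurrence.items():
        if a in seen and b not in seen:
            scores[b] += count

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_k]]


likes = {
    "X": {"Snickers"},
    "Y": {"Snickers", "Milky Way"},
}
print(recommend_by_cooccurrence(likes, "X"))  # ['Milky Way']
```

Real collaborative filtering models learn from many more users and usually weight or normalize these counts; the sketch only shows where the signal comes from.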

## Data in Recommendation Systems

### Data types

* Explicit interactions:

* Implicit interactions:

* Knowledge graph data:

* User features:

* Item features:

### Considerations about data size

The size of the data is important when designing the system...


## Metrics

In RS, there are two types of metrics: offline and online metrics.

### Machine learning metrics (offline metrics)

In Recommenders, offline metric implementations for Python are found in [python_evaluation.py](https://github.com/microsoft/recommenders/blob/master/reco_utils/evaluation/python_evaluation.py) and those for PySpark are found in [spark_evaluation.py](https://github.com/microsoft/recommenders/blob/master/reco_utils/evaluation/spark_evaluation.py).

Currently available metrics include:

- Root Mean Squared Error
- Mean Absolute Error
- R<sup>2</sup>
- Explained Variance
- Precision at K
- Recall at K
- Normalized Discounted Cumulative Gain at K
- Mean Average Precision at K
- Area Under Curve
- Logistic Loss
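
To make a few of these metrics concrete, here is a minimal plain-Python sketch of Root Mean Squared Error, Precision at K and Recall at K using their textbook definitions. These are not the implementations in `python_evaluation.py` or `spark_evaluation.py`; the function names and argument formats below are assumptions made for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Squared Error between true and predicted ratings."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def recall_at_k(recommended, relevant, k):
    """Fraction of the relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


recommended = ["a", "b", "c", "d"]  # ranked list for one user
relevant = {"a", "c", "e"}          # items the user actually interacted with
print(precision_at_k(recommended, relevant, k=2))  # 0.5
print(recall_at_k(recommended, relevant, k=4))
print(rmse([1.0, 1.0], [2.0, 2.0]))  # 1.0
```

In practice these per-user values are averaged over all users in the test set.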


### Business success metrics (online metrics)

Online metrics are specific to the business scenario. More details can be found in each scenario folder.

## Managing Cold Start Scenarios in Recommendation Systems

....



49 changes: 42 additions & 7 deletions scenarios/retail/README.md
@@ -1,6 +1,6 @@
# Recommendation systems for Retail

An increasing number of online companies are utilizing recommendation systems to increase user interaction and enrich shopping potential. Use cases of recommendation systems have been expanding rapidly across many aspects of eCommerce and online media over the last 4-5 years, and we expect this trend to continue.
An increasing number of online companies are utilizing recommendation systems (RS) to increase user interaction and enrich shopping potential. Use cases of recommendation systems have been expanding rapidly across many aspects of eCommerce and online media over the last 4-5 years, and we expect this trend to continue.

Companies across many different areas of enterprise are beginning to implement recommendation systems in an attempt to enhance their customers’ online purchasing experience, increase sales and retain customers. Business owners are recognizing potential in the fact that recommendation systems allow the collection of a huge amount of information relating to users’ behavior and their transactions within an enterprise. This information can then be systematically stored within user profiles to be used for future interactions.

@@ -15,11 +15,39 @@ The most common scenarios companies use are:
* Recommended for you: The "Recommended for you" recommendation predicts the next product that a user is most likely to engage with or purchase, based on the shopping or viewing history of that user. This recommendation is typically used on the home page.


## Data in Recommendation Systems

### Data types

In RS for retail there are typically the following types of data:

* Explicit interactions: When a user explicitly rates an item, typically on a scale from 1 to 5, the user states how much they like the item. In retail, this kind of data is not very common.

* Implicit interactions: Implicit interactions are views or clicks that show a certain interest of the user in a specific item. This kind of data is more common, but it doesn't define the intention of the user as clearly as explicit data.

* User features: These include all information that defines the user. Some examples are name, address, email, demographics, etc.

* Item features: These include information about the item. Some examples are SKU, description, brand, price, etc.

* Knowledge graph data: ...

### Considerations about data size

The size of the data is important when designing the system...

### Cold start scenarios

Personalized recommender systems take advantage of users' past history to make predictions. The cold start problem concerns personalized recommendations for users with little or no past history (new users). Providing recommendations to these users is difficult for collaborative filtering (CF) models because their learning and predictive ability is limited when there is little history to learn from. Several studies have addressed this problem with hybrid models. These models use auxiliary information (multimodal information, side information, etc.) to overcome the cold start problem.
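
As a simple illustration of handling cold start, the sketch below falls back to the most popular items when a user has too little history for a personalized model. Note that this is a popularity fallback, a simpler technique than the hybrid models mentioned above, and it is not taken from the repository. The names `recommend`, `personalized_model` and `min_history` are hypothetical.

```python
from collections import Counter


def recommend(user, history, personalized_model, top_k=2, min_history=3):
    """Use the personalized model only when the user has enough history.

    `history` maps a user id to the list of items the user interacted with.
    `personalized_model` is any callable (user, top_k) -> list of items.
    """
    seen = history.get(user, [])
    if len(seen) >= min_history:
        return personalized_model(user, top_k)

    # Cold start: rank items by overall popularity and skip already seen ones.
    popularity = Counter(item for items in history.values() for item in items)
    ranked = sorted(popularity.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked if item not in seen][:top_k]


history = {"u1": ["a", "b", "c"], "u2": ["a", "b"], "u3": ["a"]}


def dummy_model(user, top_k):
    # Stand-in for a trained model; returns fixed placeholder items.
    return ["z"] * top_k


print(recommend("new_user", history, dummy_model))  # ['a', 'b']
print(recommend("u1", history, dummy_model))        # ['z', 'z']
```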

### Long tail products

Typically, the distribution of item interactions in retail follows a long tail [1,2].
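
One way to check whether a dataset has a long tail is to measure what share of all interactions falls on the most popular items. The sketch below is a minimal example under assumed inputs: `interactions` is a hypothetical flat list with one item id per interaction, and the 20% head size is an arbitrary choice.

```python
import math
from collections import Counter


def head_share(interactions, head_fraction=0.2):
    """Share of interactions that fall on the most popular `head_fraction` of items."""
    counts = sorted(Counter(interactions).values(), reverse=True)
    head_size = max(1, math.ceil(len(counts) * head_fraction))
    return sum(counts[:head_size]) / sum(counts)


# Toy data: item "a" gets 6 interactions, items "b" to "e" get 1 each.
interactions = ["a"] * 6 + ["b", "c", "d", "e"]
print(head_share(interactions))  # 0.6
```

A value close to 1 means that a small set of items concentrates most of the interactions, while the remaining items form the tail.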

## Measuring Recommendation performance

### Machine learning metrics (offline metrics)

Please [see the main metrics description]() for understanding machine learning metrics.
Offline metrics in RS are based on rating, ranking, classification or diversity. To learn more about offline metrics, see the [definitions available in the Recommenders repository](../../examples/03_evaluate).

### Business success metrics (online metrics)

@@ -40,18 +68,21 @@ There is some literature about the relationship between offline and online metri

### Advanced A/B testing: online learning with VW

## Challenges in Recommendation systems for Retail
...

* Cold start: Personalized recommender systems take advantage of users' past history to make predictions. The cold start problem concerns personalized recommendations for users with little or no past history (new users). Providing recommendations to these users is difficult for collaborative filtering (CF) models because their learning and predictive ability is limited when there is little history to learn from. Several studies have addressed this problem with hybrid models. These models use auxiliary information (multimodal information, side information, etc.) to overcome the cold start problem.
## Examples of end 2 end recommendation scenarios with Microsoft Recommenders

* Long tail products:
From a technical perspective, RS can be grouped into these categories [1]:

* Collaborative filtering: This type of recommendation system makes predictions of what might interest a person based on the taste of many other users. It assumes that if person X likes Snickers, and person Y likes Snickers and Milky Way, then person X might like Milky Way as well. See the [list of examples in Recommenders repository](../../examples/02_model_collaborative_filtering).

* Content-based filtering: This type of recommendation system focuses on the products themselves and recommends other products that have similar attributes. Content-based filtering relies on the characteristics of the products themselves, so it doesn’t rely on other users to interact with the products before making a recommendation. See the [list of examples in Recommenders repository](../../examples/02_model_content_based_filtering).

## Building end 2 end recommendation scenarios with Microsoft Recommenders
* Hybrid filtering: This type of recommendation system can implement a combination of any two of the above systems. See the [list of examples in Recommenders repository](../../examples/02_model_hybrid).

In the repository we have the following examples that can be used in retail
* Knowledge-base: ...
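
To illustrate the content-based category, the sketch below ranks items by the overlap of their attributes (Jaccard similarity), so no user interactions are needed. It is only an illustration and not one of the repository's algorithms; the catalog, its attribute values and the function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of attributes."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_items(catalog, item, top_k=2):
    """Return the items whose attributes are most similar to those of `item`."""
    target = catalog[item]
    scores = [
        (jaccard(target, attrs), other)
        for other, attrs in catalog.items()
        if other != item
    ]
    # Highest similarity first; ties are broken alphabetically.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scores[:top_k]]


# Hypothetical catalog: each item is described by a set of attribute tags.
catalog = {
    "sku-1": {"brand-a", "shoes", "running"},
    "sku-2": {"brand-a", "shoes", "hiking"},
    "sku-3": {"brand-b", "jacket"},
}
print(similar_items(catalog, "sku-1", top_k=1))  # ['sku-2']
```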

In the repository we have the following examples that can be used in retail

| Scenario | Description | Algorithm | Implementation |
|----------|-------------|-----------|----------------|
@@ -62,4 +93,8 @@ In the repository we have the following examples that can be used in retail

## References and resources

[1] Aggarwal, Charu C. Recommender systems. Vol. 1. Cham: Springer International Publishing, 2016.
[2] Park, Yoon-Joo, and Alexander Tuzhilin. "The long tail of recommender systems and how to leverage it." In Proceedings of the 2008 ACM conference on Recommender systems, pp. 11-18. 2008. [Link to paper](http://people.stern.nyu.edu/atuzhili/pdf/Park-Tuzhilin-RecSys08-final.pdf).
[3] Armstrong, Robert. "The long tail: Why the future of business is selling less of more." Canadian Journal of Communication 33, no. 1 (2008). [Link to paper](https://www.cjc-online.ca/index.php/journal/article/view/1946/3141).

sources: [1](https://emerj.com/ai-sector-overviews/use-cases-recommendation-systems/), [2](https://cloud.google.com/recommendations-ai/docs/placements), [3](https://www.researchgate.net/post/Can_anyone_explain_what_is_cold_start_problem_in_recommender_system)