推荐系统技术、评估及高效算法

Chapter 1 简介

信息过载

Chapter 2 协同过滤Collaborative Filtering

被大型电商网站使用

用大众的智慧推荐，根据其他用户的数据

只使用用户和物品之间的评分数据。 如果一些用户在过去有相似的偏好，那么在将来也会有相似的偏好（核心）

输入用户-物品评分矩阵

基于用户的CF

对一个活跃用户Alice和它没见过的物品，评估它对该物品的评分

找出一组和ALice喜好相似的用户，把他们的数据综合

用户相似度

Pearson Correlation Coefficient (PCC)评估相似度 a,b:users $r_{a,p}$:rating of user a for item p p:set of items,commonly rated by users a and b $$sim(a,b)=\frac{\sum_{p\in P}(r_{a,p}-\bar{r}a)(r{b,p}-\bar{r}b)}{\sqrt{\sum{p\in P}(r_{a,p}-\bar{r}a)^2}\sqrt{\sum{p\in P}(r_{b,p}-\bar{r}_b)^2}}$$ 值域为[-1,1]，1为完全相似，-1为完全不相似，0为既不完全相似也不完全不相似，不相关

Neighborhood Selection 如何选择相似度高的用户

Similarity threshold 设置相似度选取范围
Top-K most nearest neighbors 选择相似度最高的几个

Making predictions 预测用户对新物品的评分

$$pred(u,j)=\bar{r_u}+\frac{\sum_{v\in N^j_u}sim(u,v)*(r_{v,j}-\bar{r_v})}{\sum_{v\in N^j_u}|sim(u,v)|}$$

不同相似度的用户权重值不同

实现细节

如果找不到K个最近的，用尽量多的neighbor

相似度threshold通常设为0

如果预测值超出值域范围，则强制设置为值域中的最大值或最小值

评估

误差指标： Mean Absolute Error(MAE) $$MAE=\sum_{(u,i,r_{ui})\in R^{te}}|r_{ui}-\hat{r}{ui}|/|R^{te}|$$ Root Mean Square Error(RMSE) $$RMSE=\sqrt{\sum{(u,i,r_{ui})\in R^{te}}(r_{ui}-\hat{r}_{ui})^2/|R^{te}|}$$ RMSE通常更合适,能把误差放大越小越好

Chapter 2.2 相似度衡量

通常使用的衡量
- Pearson Correlation Coefficient(PCC)

Pearson

上面讲了

Constrained PCC

把平均值改为中间值

Adjusted PCC

加一个$\varepsilon$避免除以0的情况

Cosine silmilarity

$$sim(\vec a,\vec b )=\frac{\vec a \cdot \vec b}{|\vec a| *|\vec b|}$$

Cases of Correlations

同一个用户的不同变量
- X:GPA
- Y:SAT
不同用户的相同变量
- X:中国学生的GPA
- Y:新加坡学生的GPA

Pearson

考虑的线性关系，同时看出方向和大小

Spearman's Correlation Coefficient

通常用于排序变量

非线性的通过排序能得到线性关系，用Rank值取代原来的值

标准化

无单位，不同单位计算出的值一样

性质

r(x,y)=r(y,x)

Gaussian Kernel

基于distance的相似度计算

Manhattan Distance

Mean Squared Distance(MSD)

提升基于用户的CF

不是所有邻居的评分都是等价的

大家都喜欢的物品不如观点不一致的物品有效

预测

一般取20到50个邻居

User-based CF vs. Item-based CF

解释

user Users that read this book also read
Item You read these 10 books,so you might also like to read...

复杂性

user：用户数量<<物品数量
item：物品数量<<用户数量

Performance

通常，item>user

因为物品是简单的，人会有不同的喜好

多样性

item有局限，user能推荐更多的商品

Hybrid CF

Chapter 2.2

基于内存的方法

基于用户和基于物品的都是

基于模型的方法

可以离线处理

SVD:Single Value Decomposition

$R=U_0\sum_0 V^T_0$

Write a report regarding "Negative Sampling for Recommender Systems"

A number of references are necessary 参考文献必须得有
Would be better to add contents regarding the use of negative sampling in the deep learning(for recommendation)
Cannote be limited to ther papers introduced in the class

12月6日小组合作(2到3人）更好 guogb@swc.neu.edu.cn

Gradient Descent Variants

Batch Gradient Descent
Stochastic Gradient Descent(SGD)

Biased MF

$$\hat{r}_{u,i}=\mu+b_u+b_i+U_u\cdot V^T_i$$ $\mu$是常量，$b_u,b_i$是user bias和item bias

提升性能

NMF：Non-negative MF

负值不好解释

NMF:learning the parts of objects

Decompose a rating matrix A into two matrices W and H both non-negative 分解出的值都是非负值，W和H都是非负的 $$A=WH$$ Update $$H_{ab}=H_{ab}\frac{(W^tA){ab}}{(W^tWH){ab}}$$ $$W_{ab}=W_{ab}\frac{(AH^t){ab}}{(WHH^t{ab}}$$

Extension to PMF

把两类信息（UU，UI）联系在一起，中间有个桥接

Bayesian Personalized Ranking

One-class collaborative filtering(OCCF) 单类的反馈
- no negative feedback

BPR

Assumption

A user prefers an interacted item to an unknown item 和陌生物品相比，用户更喜欢一个被交互过的物品
Objective function: $p(\Theta)\sim N(0,\sum_\Theta)$ $p(i>uj|\Theta):=\sigma(\hat{x}{uij}(\Theta))$

where \sigma is the logistic sigmoid $\sigma(x):=\frac{1}{1+e^{-x}}$

WARP

Fast Negative Sampling

To choose the negative examples that are most likely to meet the requirements of violation

Training without Sampling

fBGD: fast batch gradient descent

Tensor Factorization

Tucker Decompostion

Factorization Machine

$\hat{y}(x)=\omega_0+\sum\limits^n_{i=1}\omega_ix_i$

Files

推荐系统.md

Latest commit

History

推荐系统.md

File metadata and controls

Chapter 1 简介

Chapter 2 协同过滤Collaborative Filtering

基于用户的CF

用户相似度

Neighborhood Selection 如何选择相似度高的用户

Making predictions 预测用户对新物品的评分

实现细节

评估

Chapter 2.2 相似度衡量

Pearson

Constrained PCC

Adjusted PCC

Cosine silmilarity

Cases of Correlations

Pearson

Spearman's Correlation Coefficient

标准化

性质

Gaussian Kernel

Manhattan Distance

Mean Squared Distance(MSD)

提升基于用户的CF

预测

User-based CF vs. Item-based CF

解释

复杂性

Performance

多样性

Hybrid CF

Chapter 2.2

基于内存的方法

基于模型的方法

SVD:Single Value Decomposition

Gradient Descent Variants

Biased MF

NMF：Non-negative MF

Extension to PMF

Bayesian Personalized Ranking

BPR

WARP

Fast Negative Sampling

Training without Sampling

Tensor Factorization

Factorization Machine

社交网络