diff --git a/.github/ISSUE_TEMPLATE/bug_report_CN.md b/.github/ISSUE_TEMPLATE/bug_report_CN.md
new file mode 100644
index 000000000..9e1de6cc1
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/bug_report_CN.md
@@ -0,0 +1,33 @@
+---
+name: Bug 报告
+about: 提交一份 bug 报告,帮助 RecBole 变得更好
+title: "[\U0001F41BBUG] 用一句话描述您的问题。"
+labels: bug
+assignees: ''
+
+---
+
+**描述这个 bug**
+对 bug 作一个清晰简明的描述。
+
+**如何复现**
+复现这个 bug 的步骤:
+1. 您引入的额外 yaml 文件
+2. 您的代码
+3. 您的运行脚本
+
+**预期**
+对您的预期作清晰简明的描述。
+
+**屏幕截图**
+添加屏幕截图以帮助解释您的问题。(可选)
+
+**链接**
+添加能够复现 bug 的代码链接,如 Colab 或者其他在线 Jupyter 平台。(可选)
+
+**实验环境(请补全下列信息):**
+ - 操作系统: [如 Linux, macOS 或 Windows]
+ - RecBole 版本: [如 0.1.0]
+ - Python 版本: [如 3.7.9]
+ - PyTorch 版本: [如 1.6.0]
+ - cudatoolkit 版本: [如 9.2, none]
diff --git a/.github/ISSUE_TEMPLATE/feature_request_CN.md b/.github/ISSUE_TEMPLATE/feature_request_CN.md
new file mode 100644
index 000000000..861dcc82d
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/feature_request_CN.md
@@ -0,0 +1,20 @@
+---
+name: 请求添加新功能
+about: 提出一个关于本项目新功能/新特性的建议
+title: "[\U0001F4A1SUG] 一句话描述您希望新增的功能或特性"
+labels: enhancement
+assignees: ''
+
+---
+
+**您希望添加的功能是否与某个问题相关?**
+关于这个问题的简洁清晰的描述,例如,当 [...]
时,我总是很沮丧。 + +**描述您希望的解决方案** +关于解决方案的简洁清晰的描述。 + +**描述您考虑的替代方案** +关于您考虑的,能实现这个功能的其他替代方案的简洁清晰的描述。 + +**其他** +您可以添加其他任何的资料、链接或者屏幕截图,以帮助我们理解这个新功能。 diff --git a/.github/workflows/python-package.yml b/.github/workflows/python-package.yml index d5cd6ab38..abeb4fe9e 100644 --- a/.github/workflows/python-package.yml +++ b/.github/workflows/python-package.yml @@ -3,6 +3,7 @@ name: RecBole tests on: - pull_request + jobs: build: @@ -22,12 +23,16 @@ jobs: python -m pip install --upgrade pip pip install pytest pip install dgl + pip install xgboost if [ -f requirements.txt ]; then pip install -r requirements.txt; fi - + # Use "python -m pytest" instead of "pytest" to fix imports - name: Test metrics run: | python -m pytest -v tests/metrics + - name: Test data + run: | + python -m pytest -v tests/data - name: Test evaluation_setting run: | python -m pytest -v tests/evaluation_setting @@ -39,7 +44,4 @@ jobs: python -m pytest -v tests/config/test_config.py export PYTHONPATH=. python tests/config/test_command_line.py --use_gpu=False --valid_metric=Recall@10 --split_ratio=[0.7,0.2,0.1] --metrics=['Recall@10'] --epochs=200 --eval_setting='LO_RS' --learning_rate=0.3 - - name: Test evaluation_setting - run: | - python -m pytest -v tests/evaluation_setting diff --git a/.gitignore b/.gitignore index db5a25afc..660977853 100644 --- a/.gitignore +++ b/.gitignore @@ -1,3 +1,7 @@ +# Saved models +/saved* +*.pth + .vscode/ .idea/ *.pyc @@ -5,3 +9,4 @@ saved/ *.lprof *.egg-info/ +docs/build/ \ No newline at end of file diff --git a/README.md b/README.md index dd78ca451..148e2e48e 100644 --- a/README.md +++ b/README.md @@ -11,23 +11,25 @@ [](./LICENSE) -[HomePage] | [Docs] | [Datasets] | [Paper] +[HomePage] | [Docs] | [Datasets] | [Paper] | [Blogs] | [中文版] [HomePage]: https://recbole.io/ [Docs]: https://recbole.io/docs/ [Datasets]: https://github.com/RUCAIBox/RecDatasets [Paper]: https://arxiv.org/abs/2011.01731 +[Blogs]: https://blog.csdn.net/Turinger_2000/article/details/111182852 
+[中文版]: README_CN.md RecBole is developed based on Python and PyTorch for reproducing and developing recommendation algorithms in a unified, comprehensive and efficient framework for research purpose. -Our library includes 53 recommendation algorithms, covering four major categories: +Our library includes 65 recommendation algorithms, covering four major categories: + General Recommendation + Sequential Recommendation + Context-aware Recommendation + Knowledge-based Recommendation -We design a unified and flexible data file format, and provide the support for 27 benchmark recommendation datasets. +We design a unified and flexible data file format, and provide the support for 28 benchmark recommendation datasets. A user can apply the provided script to process the original data copy, or simply download the processed datasets by our team. @@ -43,8 +45,8 @@ by our team. + **General and extensible data structure.** We design general and extensible data structures to unify the formatting and usage of various recommendation datasets. -+ **Comprehensive benchmark models and datasets.** We implement 53 commonly used recommendation algorithms, and provide -the formatted copies of 27 recommendation datasets. ++ **Comprehensive benchmark models and datasets.** We implement 65 commonly used recommendation algorithms, and provide +the formatted copies of 28 recommendation datasets. + **Efficient GPU-accelerated execution.** We optimize the efficiency of our library with a number of improved techniques oriented to the GPU environment. @@ -52,7 +54,14 @@ oriented to the GPU environment. + **Extensive and standard evaluation protocols.** We support a series of widely adopted evaluation protocols or settings for testing and comparing recommendation algorithms. + ## RecBole News +**01/15/2021**: We release RecBole [v0.2.0](https://github.com/RUCAIBox/RecBole/releases/tag/v0.2.0). 
+
+**12/10/2020**: We released a [series of introductory RecBole blog posts in Chinese (continuously updated)](https://blog.csdn.net/Turinger_2000/article/details/111182852).
+
+**12/06/2020**: We release RecBole [v0.1.2](https://github.com/RUCAIBox/RecBole/releases/tag/v0.1.2).
+
**11/29/2020**: We constructed preliminary experiments to test the time and memory cost on three different-sized datasets and provided the [test result](https://github.com/RUCAIBox/RecBole#time-and-memory-costs) for reference.
@@ -159,22 +168,25 @@ python run_recbole.py --model=[model_name]
## Time and Memory Costs
-We constructed preliminary experiments to test the time and memory cost on three different-sized datasets (small, medium and large). For detailed information, you can click the following links.<br>
+We constructed preliminary experiments to test the time and memory cost on three different-sized datasets
+(small, medium and large). For detailed information, you can click the following links.

-* [General recommendation models](asset/time_test_result/General_recommendation.md)<br>
-* [Sequential recommendation models](asset/time_test_result/Sequential_recommendation.md)<br>
-* [Context-aware recommendation models](asset/time_test_result/Context-aware_recommendation.md)<br>
-* [Knowledge-based recommendation models](asset/time_test_result/Knowledge-based_recommendation.md)<br>
+* [General recommendation models](asset/time_test_result/General_recommendation.md)
+* [Sequential recommendation models](asset/time_test_result/Sequential_recommendation.md)
+* [Context-aware recommendation models](asset/time_test_result/Context-aware_recommendation.md)
+* [Knowledge-based recommendation models](asset/time_test_result/Knowledge-based_recommendation.md)

-NOTE: Our test results only gave the approximate time and memory cost of our implementations in the RecBole library (based on our machine server). Any feedback or suggestions about the implementations and test are welcome.
We will keep improving our implementations, and update these test results.
+NOTE: Our test results only give the approximate time and memory cost of our implementations in the RecBole library
+(based on our machine server). Any feedback or suggestions about the implementations and tests are welcome.
+We will keep improving our implementations, and update these test results.

## RecBole Major Releases
| Releases | Date | Features |
|-----------|--------|-------------------------|
+| v0.2.0 | 01/15/2021 | RecBole |
| v0.1.1 | 11/03/2020 | Basic RecBole |
-
## Contributing

Please let us know if you encounter a bug or have any suggestions by [filing an issue](https://github.com/RUCAIBox/RecBole/issues).

@@ -183,6 +195,9 @@ We welcome all contributions from bug fixes to new features and extensions.

We expect all contributions discussed in the issue tracker and going through PRs.

+We thank the insightful suggestions from [@tszumowski](https://github.com/tszumowski), [@rowedenny](https://github.com/rowedenny), [@deklanw](https://github.com/deklanw) et al.
+
+We thank the nice contributions through PRs from [@rowedenny](https://github.com/rowedenny), [@deklanw](https://github.com/deklanw) et al.
## Cite
If you find RecBole useful for your research or development, please cite the following [paper](https://arxiv.org/abs/2011.01731):
diff --git a/README_CN.md b/README_CN.md
new file mode 100644
index 000000000..024656bdb
--- /dev/null
+++ b/README_CN.md
@@ -0,0 +1,212 @@
+
+
+--------------------------------------------------------------------------------
+
+# RecBole (伯乐)
+
+*“世有伯乐,然后有千里马。千里马常有,而伯乐不常有。”——韩愈《马说》*
+
+[](https://pypi.org/project/recbole/)
+[](https://anaconda.org/aibox/recbole)
+[](./LICENSE)
+
+
+[中文主页] | [文档] | [数据集] | [论文] | [博客] | [English Version]
+
+[中文主页]: https://recbole.io/cn
+[文档]: https://recbole.io/docs/
+[数据集]: https://github.com/RUCAIBox/RecDatasets
+[论文]: https://arxiv.org/abs/2011.01731
+[博客]: https://blog.csdn.net/Turinger_2000/article/details/111182852
+[English Version]: README.md
+
+
+RecBole 是一个基于 PyTorch 实现的,面向研究者的,易于开发与复现的,统一、全面、高效的推荐系统代码库。
+我们实现了 65 个推荐系统模型,包含常见的推荐系统类别,如:
+
++ General Recommendation
++ Sequential Recommendation
++ Context-aware Recommendation
++ Knowledge-based Recommendation
+
+
+我们约定了一个统一、易用的数据文件格式,并已支持 28 个 benchmark dataset。
+用户可以选择使用我们的数据集预处理脚本,或直接下载已被处理好的数据集文件。
+
+
+<p align="center">
+  <img src="asset/framework.png" alt="RecBole v0.1 架构" width="600">
+  <br>
+  <b>图片</b>: RecBole 总体架构
+</p>
+
+
+## 特色
++ **通用和可扩展的数据结构** 我们设计了通用和可扩展的数据结构来支持各种推荐数据集统一化格式和使用。
+
++ **全面的基准模型和数据集** 我们实现了 65 个常用的推荐算法,并提供了 28 个推荐数据集的格式化副本。
+
++ **高效的 GPU 加速实现** 我们针对GPU环境使用了一系列的优化技术来提升代码库的效率。
+
++ **大规模的标准评测** 我们支持一系列被广泛认可的评估方式来测试和比较不同的推荐算法。
+
+
+## RecBole 新闻
+**01/15/2021**: 我们发布了 RecBole [v0.2.0](https://github.com/RUCAIBox/RecBole/releases/tag/v0.2.0)。
+
+**12/10/2020**: 我们发布了[RecBole小白入门系列中文博客(持续更新中)](https://blog.csdn.net/Turinger_2000/article/details/111182852) 。
+
+**12/06/2020**: 我们发布了 RecBole [v0.1.2](https://github.com/RUCAIBox/RecBole/releases/tag/v0.1.2)。
+
+**11/29/2020**: 我们在三个不同大小的数据集上进行了时间和内存开销的初步测试,
+并提供了 [测试结果](https://github.com/RUCAIBox/RecBole#time-and-memory-costs) 以供参考。
+
+**11/03/2020**: 我们发布了第一版 RecBole **v0.1.1**。
+
+
+## 安装
+RecBole可以在以下几种系统上运行:
+
+* Linux
+* Windows 10
+* macOS X
+
+RecBole需要在 Python 3.6 或更高的环境下运行。
+
+RecBole要求torch版本在1.6.0及以上,如果你想在GPU上运行RecBole,请确保你的CUDA版本或CUDAToolkit版本在9.2及以上。
+这需要你的NVIDIA驱动版本为396.26或以上(在 Linux 系统上)或者为397.44或以上(在 Windows 10 系统上)。
+
+
+### 从Conda安装
+
+```bash
+conda install -c aibox recbole
+```
+
+### 从pip安装
+
+```bash
+pip install recbole
+```
+
+### 从源文件安装
+```bash
+git clone https://github.com/RUCAIBox/RecBole.git && cd RecBole
+pip install -e . --verbose
+```
+
+## 快速上手
+如果你从GitHub下载了RecBole的源码,你可以使用提供的脚本进行简单的使用:
+
+```bash
+python run_recbole.py
+```
+
+这个例子将会在ml-100k这个数据集上进行BPR模型的训练和测试。
+
+一般来说,这个例子将花费不到一分钟的时间,我们会得到一些类似下面的输出:
+
+```
+INFO ml-100k
+The number of users: 944
+Average actions of users: 106.04453870625663
+The number of items: 1683
+Average actions of items: 59.45303210463734
+The number of inters: 100000
+The sparsity of the dataset: 93.70575143257098%
+
+INFO Evaluation Settings:
+Group by user_id
+Ordering: {'strategy': 'shuffle'}
+Splitting: {'strategy': 'by_ratio', 'ratios': [0.8, 0.1, 0.1]}
+Negative Sampling: {'strategy': 'full', 'distribution': 'uniform'}
+
+INFO BPRMF(
+  (user_embedding): Embedding(944, 64)
+  (item_embedding): Embedding(1683, 64)
+  (loss): BPRLoss()
+)
+Trainable parameters: 168128
+
+INFO epoch 0 training [time: 0.27s, train loss: 27.7231]
+INFO epoch 0 evaluating [time: 0.12s, valid_score: 0.021900]
+INFO valid result:
+recall@10: 0.0073 mrr@10: 0.0219 ndcg@10: 0.0093 hit@10: 0.0795 precision@10: 0.0088
+
+...
+ +INFO epoch 63 training [time: 0.19s, train loss: 4.7660] +INFO epoch 63 evaluating [time: 0.08s, valid_score: 0.394500] +INFO valid result: +recall@10: 0.2156 mrr@10: 0.3945 ndcg@10: 0.2332 hit@10: 0.7593 precision@10: 0.1591 + +INFO Finished training, best eval result in epoch 52 +INFO Loading model structure and parameters from saved/***.pth +INFO best valid result: +recall@10: 0.2169 mrr@10: 0.4005 ndcg@10: 0.235 hit@10: 0.7582 precision@10: 0.1598 +INFO test result: +recall@10: 0.2368 mrr@10: 0.4519 ndcg@10: 0.2768 hit@10: 0.7614 precision@10: 0.1901 +``` + +如果你要改参数,例如 ``learning_rate``, ``embedding_size``, 只需根据您的需求增加额外的参数,例如: + +```bash +python run_recbole.py --learning_rate=0.0001 --embedding_size=128 +``` + +如果你想改变运行模型,只需要在执行脚本时添加额外的设置参数即可: + +```bash +python run_recbole.py --model=[model_name] +``` + + +## 时间和内存开销 +我们构建了初步的实验来测试三个不同大小的数据集(小、中、大)的时间和内存开销。 +有关详细信息,请单击以下链接。 + +* [General recommendation models](asset/time_test_result/General_recommendation.md)<br> +* [Sequential recommendation models](asset/time_test_result/Sequential_recommendation.md)<br> +* [Context-aware recommendation models](asset/time_test_result/Context-aware_recommendation.md)<br> +* [Knowledge-based recommendation models](asset/time_test_result/Knowledge-based_recommendation.md)<br> + +NOTE: 我们的测试结果只给出了RecBole库中实现模型的大致时间和内存开销(基于我们的机器服务器)。 +我们欢迎任何关于测试、实现的建议。我们将继续改进我们的实现,并更新这些测试结果。 + + +## RecBole 重要发布 +| Releases | Date | Features | +|-----------|--------|-------------------------| +| v0.1.1 | 11/03/2020 | Basic RecBole | + + +## 贡献 + +如果您遇到错误或有任何建议,请通过 [Issue](https://github.com/RUCAIBox/RecBole/issues) 进行反馈 + +我们欢迎关于修复错误、添加新特性的任何贡献。 + +如果想贡献代码,请先在issue中提出问题,然后再提PR。 + +我们对[@tszumowski](https://github.com/tszumowski), [@rowedenny](https://github.com/rowedenny), [@deklanw](https://github.com/deklanw) 等用户提出的建议表示感谢。 + +我们也对[@rowedenny](https://github.com/rowedenny), [@deklanw](https://github.com/deklanw) 等用户做出的贡献表示感谢。 + + +## 引用 
+如果你觉得RecBole对你的科研工作有帮助,请引用我们的[论文](https://arxiv.org/abs/2011.01731): + +``` +@article{recbole, + title={RecBole: Towards a Unified, Comprehensive and Efficient Framework for Recommendation Algorithms}, + author={Wayne Xin Zhao and Shanlei Mu and Yupeng Hou and Zihan Lin and Kaiyuan Li and Yushuo Chen and Yujie Lu and Hui Wang and Changxin Tian and Xingyu Pan and Yingqian Min and Zhichao Feng and Xinyan Fan and Xu Chen and Pengfei Wang and Wendi Ji and Yaliang Li and Xiaoling Wang and Ji-Rong Wen}, + year={2020}, + journal={arXiv preprint arXiv:2011.01731} +} +``` + +## 项目团队 +RecBole由 [中国人民大学, 北京邮电大学, 华东师范大学](https://www.recbole.io/cn/about.html) 的同学和老师进行开发和维护。 + +## 免责声明 +RecBole 基于 [MIT License](./LICENSE) 进行开发,本项目的所有数据和代码只能被用于学术目的。 diff --git a/asset/framework.png b/asset/framework.png index a3a1cdc22..add0b8028 100644 Binary files a/asset/framework.png and b/asset/framework.png differ diff --git a/asset/logo.png b/asset/logo.png index bd828fae0..047e61bcf 100644 Binary files a/asset/logo.png and b/asset/logo.png differ diff --git a/asset/time_test_result/Context-aware_recommendation.md b/asset/time_test_result/Context-aware_recommendation.md index 39751b0b4..fa518892a 100644 --- a/asset/time_test_result/Context-aware_recommendation.md +++ b/asset/time_test_result/Context-aware_recommendation.md @@ -1,189 +1,191 @@ -## Time and memory cost of context-aware recommendation models - -### Datasets information: - -| Dataset | #Interaction | #Feature Field | #Feature | -| ------- | ------------: | --------------: | --------: | -| ml-1m | 1,000,209 | 5 | 134 | -| Criteo | 2,292,530 | 39 | 2,572,192 | -| Avazu | 4,218,938 | 21 | 1,326,631 | - -### Device information - -``` -OS: Linux -Python Version: 3.8.3 -PyTorch Version: 1.7.0 -cudatoolkit Version: 10.1 -GPU: TITAN RTX(24GB) -Machine Specs: 32 CPU machine, 64GB RAM -``` - -### 1) ml-1m dataset: - -#### Time and memory cost on ml-1m dataset: - -| Method | Training Time (sec/epoch) | Evaluation Time (sec/epoch) | GPU 
Memory (GB) | -| --------- | -----------------: | -----------------: | -----------: | -| LR | 18.34 | 2.18 | 0.82 | -| DIN | 20.37 | 2.26 | 1.16 | -| DSSM | 21.93 | 2.24 | 0.95 | -| FM | 19.33 | 2.34 | 0.83 | -| DeepFM | 20.42 | 2.27 | 0.91 | -| Wide&Deep | 26.13 | 2.95 | 0.89 | -| NFM | 23.36 | 2.26 | 0.89 | -| AFM | 20.08 | 2.26 | 0.92 | -| AutoInt | 22.41 | 2.34 | 0.94 | -| DCN | 28.33 | 2.97 | 0.93 | -| FNN(DNN) | 19.51 | 2.21 | 0.91 | -| PNN | 22.29 | 2.23 | 0.91 | -| FFM | 22.98 | 2.47 | 0.87 | -| FwFM | 23.38 | 2.50 | 0.85 | -| xDeepFM | 24.40 | 2.30 | 1.06 | - -#### Config file of ml-1m dataset: - -``` -# dataset config -field_separator: "\t" -seq_separator: " " -USER_ID_FIELD: user_id -ITEM_ID_FIELD: item_id -LABEL_FIELD: label -threshold: - rating: 4.0 -drop_filter_field : True -load_col: - inter: [user_id, item_id, rating] - item: [item_id, release_year, genre] - user: [user_id, age, gender, occupation] - -# training and evaluation -epochs: 500 -train_batch_size: 2048 -eval_batch_size: 2048 -eval_setting: RO_RS -group_by_user: False -valid_metric: AUC -metrics: ['AUC', 'LogLoss'] -``` - -Other parameters (including model parameters) are default value. 
- -### 2)Criteo dataset: - -#### Time and memory cost on Criteo dataset: - -| Method | Training Time (sec/epoch) | Evaluation Time (sec/epoch) | GPU Memory (GB) | -| --------- | -------------------------: | ---------------------------: | ---------------: | -| LR | 7.65 | 0.61 | 1.11 | -| DIN | - | - | - | -| DSSM | - | - | - | -| FM | 9.77 | 0.73 | 1.45 | -| DeepFM | 13.64 | 0.83 | 1.72 | -| Wide&Deep | 13.58 | 0.80 | 1.72 | -| NFM | 13.36 | 0.75 | 1.72 | -| AFM | 19.40 | 1.02 | 2.34 | -| AutoInt | 19.40 | 0.98 | 2.06 | -| DCN | 16.25 | 0.78 | 1.67 | -| FNN(DNN) | 10.03 | 0.64 | 1.63 | -| PNN | 12.92 | 0.72 | 1.85 | -| FFM | - | - | - | -| FwFM | 1175.24 | 8.90 | 2.12 | -| xDeepFM | 32.27 | 1.34 | 2.25 | - -#### Config file of Criteo dataset: - -``` -# dataset config -field_separator: "\t" -seq_separator: " " -USER_ID_FIELD: ~ -ITEM_ID_FIELD: ~ -LABEL_FIELD: label - -load_col: - inter: '*' - -highest_val: - index: 2292530 - -fill_nan: True -normalize_all: True -min_item_inter_num: 0 -min_user_inter_num: 0 - -drop_filter_field : True - - -# training and evaluation -epochs: 500 -train_batch_size: 2048 -eval_batch_size: 2048 -eval_setting: RO_RS -group_by_user: False -valid_metric: AUC -metrics: ['AUC', 'LogLoss'] -``` - -Other parameters (including model parameters) are default value. 
- -### 3)Avazu dataset: - -#### Time and memory cost on Avazu dataset: - -| Method | Training Time (sec/epoch) | Evaluation Time (sec/epoch) | GPU Memory (GB) | -| --------- | -------------------------: | ---------------------------: | ---------------: | -| LR | 9.30 | 0.76 | 1.42 | -| DIN | - | - | - | -| DSSM | - | - | - | -| FM | 25.68 | 0.94 | 2.60 | -| DeepFM | 28.41 | 1.19 | 2.66 | -| Wide&Deep | 27.58 | 0.97 | 2.66 | -| NFM | 30.46 | 1.06 | 2.66 | -| AFM | 31.03 | 1.06 | 2.69 | -| AutoInt | 38.11 | 1.41 | 2.84 | -| DCN | 30.78 | 0.96 | 2.64 | -| FNN(DNN) | 23.53 | 0.84 | 2.60 | -| PNN | 25.86 | 0.90 | 2.68 | -| FFM | - | - | - | -| FwFM | 336.75 | 7.49 | 2.63 | -| xDeepFM | 54.88 | 1.45 | 2.89 | - -#### Config file of Avazu dataset: - -``` -# dataset config -field_separator: "\t" -seq_separator: " " -USER_ID_FIELD: ~ -ITEM_ID_FIELD: ~ -LABEL_FIELD: label -fill_nan: True -normalize_all: True - -load_col: - inter: '*' - -lowest_val: - timestamp: 14102931 -drop_filter_field : False - -# training and evaluation -epochs: 500 -train_batch_size: 2048 -eval_batch_size: 2048 -eval_setting: RO_RS -group_by_user: False -valid_metric: AUC -metrics: ['AUC', 'LogLoss'] -``` - -Other parameters (including model parameters) are default value. 
- - - - - - - +## Time and memory cost of context-aware recommendation models + +### Datasets information: + +| Dataset | #Interaction | #Feature Field | #Feature | +| ------- | ------------: | --------------: | --------: | +| ml-1m | 1,000,209 | 5 | 134 | +| Criteo | 2,292,530 | 39 | 2,572,192 | +| Avazu | 4,218,938 | 21 | 1,326,631 | + +### Device information + +``` +OS: Linux +Python Version: 3.8.3 +PyTorch Version: 1.7.0 +cudatoolkit Version: 10.1 +GPU: TITAN RTX(24GB) +Machine Specs: 32 CPU machine, 64GB RAM +``` + +### 1) ml-1m dataset: + +#### Time and memory cost on ml-1m dataset: + +| Method | Training Time (sec/epoch) | Evaluation Time (sec/epoch) | GPU Memory (GB) | +| --------- | -----------------: | -----------------: | -----------: | +| LR | 18.34 | 2.18 | 0.82 | +| DIN | 20.37 | 2.26 | 1.16 | +| DSSM | 21.93 | 2.24 | 0.95 | +| FM | 19.33 | 2.34 | 0.83 | +| DeepFM | 20.42 | 2.27 | 0.91 | +| Wide&Deep | 26.13 | 2.95 | 0.89 | +| NFM | 23.36 | 2.26 | 0.89 | +| AFM | 20.08 | 2.26 | 0.92 | +| AutoInt | 22.41 | 2.34 | 0.94 | +| DCN | 28.33 | 2.97 | 0.93 | +| FNN(DNN) | 19.51 | 2.21 | 0.91 | +| PNN | 22.29 | 2.23 | 0.91 | +| FFM | 22.98 | 2.47 | 0.87 | +| FwFM | 23.38 | 2.50 | 0.85 | +| xDeepFM | 24.40 | 2.30 | 1.06 | + +#### Config file of ml-1m dataset: + +``` +# dataset config +field_separator: "\t" +seq_separator: " " +USER_ID_FIELD: user_id +ITEM_ID_FIELD: item_id +LABEL_FIELD: label +threshold: + rating: 4.0 +drop_filter_field : True +load_col: + inter: [user_id, item_id, rating] + item: [item_id, release_year, genre] + user: [user_id, age, gender, occupation] + +# training and evaluation +epochs: 500 +train_batch_size: 2048 +eval_batch_size: 2048 +eval_setting: RO_RS +group_by_user: False +valid_metric: AUC +metrics: ['AUC', 'LogLoss'] +``` + +Other parameters (including model parameters) are default value. 
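The ml-1m config above selects models by AUC and reports `metrics: ['AUC', 'LogLoss']`, as do the Criteo and Avazu configs that follow. As a reminder of what these two metrics compute, here is a minimal self-contained sketch (for illustration only, not RecBole's implementation):

```python
import math

def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive example
    is scored above a randomly chosen negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def log_loss(labels, probs, eps=1e-15):
    """Mean negative log-likelihood of the true labels,
    with probabilities clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

labels = [1, 0, 1, 0]
scores = [0.9, 0.3, 0.6, 0.4]
print(auc(labels, scores))                 # 1.0: every positive outranks every negative
print(round(log_loss(labels, scores), 4))  # 0.3709
```

Because both metrics are computed over the whole interaction table rather than per user, the configs set `group_by_user: False`.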
+ +### 2)Criteo dataset: + +#### Time and memory cost on Criteo dataset: + +| Method | Training Time (sec/epoch) | Evaluation Time (sec/epoch) | GPU Memory (GB) | +| --------- | -------------------------: | ---------------------------: | ---------------: | +| LR | 7.65 | 0.61 | 1.11 | +| DIN | - | - | - | +| DSSM | - | - | - | +| FM | 9.77 | 0.73 | 1.45 | +| DeepFM | 13.64 | 0.83 | 1.72 | +| Wide&Deep | 13.58 | 0.80 | 1.72 | +| NFM | 13.36 | 0.75 | 1.72 | +| AFM | 19.40 | 1.02 | 2.34 | +| AutoInt | 19.40 | 0.98 | 2.06 | +| DCN | 16.25 | 0.78 | 1.67 | +| FNN(DNN) | 10.03 | 0.64 | 1.63 | +| PNN | 12.92 | 0.72 | 1.85 | +| FFM | - | - | Out of Memory | +| FwFM | 1175.24 | 8.90 | 2.12 | +| xDeepFM | 32.27 | 1.34 | 2.25 | + +Note: Criteo dataset is not suitable for DIN model and DSSM model. +#### Config file of Criteo dataset: + +``` +# dataset config +field_separator: "\t" +seq_separator: " " +USER_ID_FIELD: ~ +ITEM_ID_FIELD: ~ +LABEL_FIELD: label + +load_col: + inter: '*' + +highest_val: + index: 2292530 + +fill_nan: True +normalize_all: True +min_item_inter_num: 0 +min_user_inter_num: 0 + +drop_filter_field : True + + +# training and evaluation +epochs: 500 +train_batch_size: 2048 +eval_batch_size: 2048 +eval_setting: RO_RS +group_by_user: False +valid_metric: AUC +metrics: ['AUC', 'LogLoss'] +``` + +Other parameters (including model parameters) are default value. 
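The `highest_val` block in the Criteo config caps the `index` field at 2,292,530, and the Avazu config below uses `lowest_val` to keep only rows with `timestamp` at or above 14102931. In plain Python, this kind of per-field threshold filtering amounts to the following sketch (illustrative only; RecBole filters its own internal dataframes, and treating the bounds as inclusive is an assumption here):

```python
def filter_by_value(rows, lowest_val=None, highest_val=None):
    """Drop rows whose fields fall outside the configured bounds
    (bounds treated as inclusive in this sketch)."""
    lowest_val = lowest_val or {}
    highest_val = highest_val or {}
    def keep(row):
        return (all(row[f] >= v for f, v in lowest_val.items())
                and all(row[f] <= v for f, v in highest_val.items()))
    return [row for row in rows if keep(row)]

rows = [{"timestamp": 14102930}, {"timestamp": 14102931}, {"timestamp": 14103000}]
print(filter_by_value(rows, lowest_val={"timestamp": 14102931}))
# [{'timestamp': 14102931}, {'timestamp': 14103000}]
```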
+ +### 3)Avazu dataset: + +#### Time and memory cost on Avazu dataset: + +| Method | Training Time (sec/epoch) | Evaluation Time (sec/epoch) | GPU Memory (GB) | +| --------- | -------------------------: | ---------------------------: | ---------------: | +| LR | 9.30 | 0.76 | 1.42 | +| DIN | - | - | - | +| DSSM | - | - | - | +| FM | 25.68 | 0.94 | 2.60 | +| DeepFM | 28.41 | 1.19 | 2.66 | +| Wide&Deep | 27.58 | 0.97 | 2.66 | +| NFM | 30.46 | 1.06 | 2.66 | +| AFM | 31.03 | 1.06 | 2.69 | +| AutoInt | 38.11 | 1.41 | 2.84 | +| DCN | 30.78 | 0.96 | 2.64 | +| FNN(DNN) | 23.53 | 0.84 | 2.60 | +| PNN | 25.86 | 0.90 | 2.68 | +| FFM | - | - | Out of Memory | +| FwFM | 336.75 | 7.49 | 2.63 | +| xDeepFM | 54.88 | 1.45 | 2.89 | + +Note: Avazu dataset is not suitable for DIN model and DSSM model. +#### Config file of Avazu dataset: + +``` +# dataset config +field_separator: "\t" +seq_separator: " " +USER_ID_FIELD: ~ +ITEM_ID_FIELD: ~ +LABEL_FIELD: label +fill_nan: True +normalize_all: True + +load_col: + inter: '*' + +lowest_val: + timestamp: 14102931 +drop_filter_field : False + +# training and evaluation +epochs: 500 +train_batch_size: 2048 +eval_batch_size: 2048 +eval_setting: RO_RS +group_by_user: False +valid_metric: AUC +metrics: ['AUC', 'LogLoss'] +``` + +Other parameters (including model parameters) are default value. + + + + + + + diff --git a/asset/time_test_result/General_recommendation.md b/asset/time_test_result/General_recommendation.md index e88472078..a9442c57a 100644 --- a/asset/time_test_result/General_recommendation.md +++ b/asset/time_test_result/General_recommendation.md @@ -77,7 +77,7 @@ Other parameters (including model parameters) are default value. 
| BPRMF | 4.42 | 52.81 | 1.08 | | NeuMF | 11.33 | 238.92 | 1.26 | | DMF | 20.62 | 68.89 | 7.12 | -| NAIS | - | - | - | +| NAIS | - | - | Out of Memory | | NGCF | 52.50 | 51.60 | 2.00 | | GCMC | 93.15 | 1810.43 | 3.17 | | LightGCN | 30.21 | 47.12 | 1.58 | @@ -127,13 +127,13 @@ Other parameters (including model parameters) are default value. | BPRMF | 6.31 | 120.03 | 1.29 | | NeuMF | 17.38 | 2069.53 | 1.67 | | DMF | 43.96 | 173.13 | 9.22 | -| NAIS | - | - | - | +| NAIS | - | - | Out of Memory | | NGCF | 122.90 | 129.59 | 3.28 | | GCMC | 299.36 | 9833.24 | 5.96 | | LightGCN | 67.91 | 116.16 | 2.02 | | DGCF | 1542.00 | 119.00 | 17.17 | | ConvNCF | 87.56 | 11155.31 | 1.62 | -| FISM | - | - | - | +| FISM | - | - | Out of Memory | | SpectralCF | 138.99 | 133.37 | 3.10 | #### Config file of Yelp dataset: diff --git a/asset/time_test_result/Sequential_recommendation.md b/asset/time_test_result/Sequential_recommendation.md index 3a0fb4f6c..38290d494 100644 --- a/asset/time_test_result/Sequential_recommendation.md +++ b/asset/time_test_result/Sequential_recommendation.md @@ -1,225 +1,230 @@ -## Time and memory cost of sequential recommendation models - -### Datasets information: - -| Dataset | #User | #Item | #Interaction | Sparsity | -| ---------- | -------: | ------: | ------------: | --------: | -| ml-1m | 6,041 | 3,707 | 1,000,209 | 0.9553 | -| DIGINETICA | 59,425 | 42,116 | 547,416 | 0.9998 | -| Yelp | 102,046 | 98,408 | 2,903,648 | 0.9997 | - -### Device information - -``` -OS: Linux -Python Version: 3.8.3 -PyTorch Version: 1.7.0 -cudatoolkit Version: 10.1 -GPU: TITAN RTX(24GB) -Machine Specs: 32 CPU machine, 64GB RAM -``` - -### 1) ml-1m dataset: - -#### Time and memory cost on ml-1m dataset: - -| Method | Training Time (sec/epoch) | Evaluate Time (sec/epoch) | GPU Memory (GB) | -| ---------------- | -----------------: | -----------------: | -----------: | -| Improved GRU-Rec | 7.78 | 0.11 | 1.27 | -| SASRec | 17.78 | 0.12 | 1.84 | -| NARM | 8.29 | 0.11 | 1.29 | -| 
FPMC | 7.51 | 0.11 | 1.18 | -| STAMP | 7.32 | 0.11 | 1.20 | -| Caser | 44.85 | 0.12 | 1.14 | -| NextItNet | 16433.27 | 96.31 | 1.86 | -| TransRec | 10.08 | 0.16 | 8.18 | -| S3Rec | - | - | - | -| GRU4RecF | 10.20 | 0.15 | 1.80 | -| SASRecF | 18.84 | 0.17 | 1.78 | -| BERT4Rec | 36.09 | 0.34 | 1.97 | -| FDSA | 31.86 | 0.19 | 2.32 | -| SRGNN | 327.38 | 2.19 | 1.21 | -| GCSAN | 335.27 | 0.02 | 1.58 | -| KSR | - | - | - | -| GRU4RecKG | - | - | - | - -#### Config file of ml-1m dataset: - -``` -# dataset config -field_separator: "\t" -seq_separator: " " -USER_ID_FIELD: user_id -ITEM_ID_FIELD: item_id -TIME_FIELD: timestamp -NEG_PREFIX: neg_ -ITEM_LIST_LENGTH_FIELD: item_length -LIST_SUFFIX: _list -MAX_ITEM_LIST_LENGTH: 20 -POSITION_FIELD: position_id -load_col: - inter: [user_id, item_id, timestamp] -min_user_inter_num: 0 -min_item_inter_num: 0 - -# training and evaluation -epochs: 500 -train_batch_size: 2048 -eval_batch_size: 2048 -valid_metric: MRR@10 -eval_setting: TO_LS,full -training_neg_sample_num: 0 -``` - -Other parameters (including model parameters) are default value. - -**NOTE :** - -1) For FPMC and TransRec model, `training_neg_sample_num` should be `1` . 
- -2) For SASRecF, GRU4RecF and FDSA, `load_col` should as below: - -``` -load_col: - inter: [user_id, item_id, timestamp] - item: [item_id, genre] -``` - -### 2)DIGINETICA dataset: - -#### Time and memory cost on DIGINETICA dataset: - -| Method | Training Time (sec/epoch) | Evaluate Time (sec/epoch) | GPU Memory (GB) | -| ---------------- | -----------------: | -----------------: | -----------: | -| Improved GRU-Rec | 4.10 | 1.05 | 4.02 | -| SASRec | 8.36 | 1.21 | 4.43 | -| NARM | 4.30 | 1.08 | 4.09 | -| FPMC | 2.98 | 1.08 | 4.08 | -| STAMP | 4.27 | 1.04 | 3.88 | -| Caser | 17.15 | 1.18 | 3.94 | -| NextItNet | - | - | - | -| TransRec | - | - | - | -| S3Rec | - | - | - | -| GRU4RecF | 4.79 | 1.17 | 4.83 | -| SASRecF | 8.66 | 1.29 | 5.11 | -| BERT4Rec | 16.80 | 3.54 | 7.97 | -| FDSA | 13.44 | 1.47 | 5.66 | -| SRGNN | 88.59 | 15.37 | 4.01 | -| GCSAN | 96.69 | 17.11 | 4.25 | -| KSR | - | - | - | -| GRU4RecKG | - | - | - | - -#### Config file of DIGINETICA dataset: - -``` -# dataset config -field_separator: "\t" -seq_separator: " " -USER_ID_FIELD: session_id -ITEM_ID_FIELD: item_id -TIME_FIELD: timestamp -NEG_PREFIX: neg_ -ITEM_LIST_LENGTH_FIELD: item_length -LIST_SUFFIX: _list -MAX_ITEM_LIST_LENGTH: 20 -POSITION_FIELD: position_id -load_col: - inter: [session_id, item_id, timestamp] -min_user_inter_num: 6 -min_item_inter_num: 1 - -# training and evaluation -epochs: 500 -train_batch_size: 2048 -eval_batch_size: 2048 -valid_metric: MRR@10 -eval_setting: TO_LS,full -training_neg_sample_num: 0 -``` - -Other parameters (including model parameters) are default value. - -**NOTE :** - -1) For FPMC and TransRec model, `training_neg_sample_num` should be `1` . 
- -2) For SASRecF, GRU4RecF and FDSA, `load_col` should as below: - -``` -load_col: - inter: [session_id, item_id, timestamp] - item: [item_id, item_category] -``` - -### 3)Yelp dataset: - -#### Time and memory cost on Yelp dataset: - -| Method | Training Time (sec/epoch) | Evaluation Time (sec/epoch) | GPU Memory (GB) | -| ---------------- | -----------------: | -----------------: | -----------: | -| Improved GRU-Rec | 44.31 | 2.74 | 7.92 | -| SASRec | 75.51 | 3.11 | 8.32 | -| NARM | 45.65 | 2.76 | 7.98 | -| FPMC | 21.05 | 3.05 | 8.22 | -| STAMP | 42.08 | 2.72 | 7.77 | -| Caser | 147.15 | 2.89 | 7.87 | -| NextItNet | 45019.38 | 1670.76 | 8.44 | -| TransRec | - | - | - | -| S3Rec | - | - | - | -| GRU4RecF | - | - | - | -| SASRecF | - | - | - | -| BERT4Rec | 193.74 | 8.43 | 16.57 | -| FDSA | - | - | - | -| SRGNN | 825.11 | 33.20 | 7.90 | -| GCSAN | 837.23 | 33.00 | 8.14 | -| KSR | - | - | - | -| GRU4RecKG | - | - | - | - -#### Config file of DIGINETICA dataset: - -``` -# dataset config -field_separator: "\t" -seq_separator: " " -USER_ID_FIELD: session_id -ITEM_ID_FIELD: item_id -TIME_FIELD: timestamp -NEG_PREFIX: neg_ -ITEM_LIST_LENGTH_FIELD: item_length -LIST_SUFFIX: _list -MAX_ITEM_LIST_LENGTH: 20 -POSITION_FIELD: position_id -load_col: - inter: [session_id, item_id, timestamp] -min_user_inter_num: 6 -min_item_inter_num: 1 - -# training and evaluation -epochs: 500 -train_batch_size: 2048 -eval_batch_size: 2048 -valid_metric: MRR@10 -eval_setting: TO_LS,full -training_neg_sample_num: 0 -``` - -Other parameters (including model parameters) are default value. - -**NOTE :** - -1) For FPMC and TransRec model, `training_neg_sample_num` should be `1` . 
- -2) For SASRecF, GRU4RecF and FDSA, `load_col` should as below: - -``` -load_col: - inter: [session_id, item_id, timestamp] - item: [item_id, item_category] -``` - - - - - - - +## Time and memory cost of sequential recommendation models + +### Datasets information: + +| Dataset | #User | #Item | #Interaction | Sparsity | +| ---------- | -------: | ------: | ------------: | --------: | +| ml-1m | 6,041 | 3,707 | 1,000,209 | 0.9553 | +| DIGINETICA | 59,425 | 42,116 | 547,416 | 0.9998 | +| Yelp | 102,046 | 98,408 | 2,903,648 | 0.9997 | + +### Device information + +``` +OS: Linux +Python Version: 3.8.3 +PyTorch Version: 1.7.0 +cudatoolkit Version: 10.1 +GPU: TITAN RTX(24GB) +Machine Specs: 32 CPU machine, 64GB RAM +``` + +### 1) ml-1m dataset: + +#### Time and memory cost on ml-1m dataset: + +| Method | Training Time (sec/epoch) | Evaluate Time (sec/epoch) | GPU Memory (GB) | +| ---------------- | -----------------: | -----------------: | -----------: | +| Improved GRU-Rec | 7.78 | 0.11 | 1.27 | +| SASRec | 17.78 | 0.12 | 1.84 | +| NARM | 8.29 | 0.11 | 1.29 | +| FPMC | 7.51 | 0.11 | 1.18 | +| STAMP | 7.32 | 0.11 | 1.20 | +| Caser | 44.85 | 0.12 | 1.14 | +| NextItNet | 16433.27 | 96.31 | 1.86 | +| TransRec | 10.08 | 0.16 | 8.18 | +| S3Rec | - | - | - | +| GRU4RecF | 10.20 | 0.15 | 1.80 | +| SASRecF | 18.84 | 0.17 | 1.78 | +| BERT4Rec | 36.09 | 0.34 | 1.97 | +| FDSA | 31.86 | 0.19 | 2.32 | +| SRGNN | 327.38 | 2.19 | 1.21 | +| GCSAN | 335.27 | 0.02 | 1.58 | +| KSR | - | - | - | +| GRU4RecKG | - | - | - | + +#### Config file of ml-1m dataset: + +``` +# dataset config +field_separator: "\t" +seq_separator: " " +USER_ID_FIELD: user_id +ITEM_ID_FIELD: item_id +TIME_FIELD: timestamp +NEG_PREFIX: neg_ +ITEM_LIST_LENGTH_FIELD: item_length +LIST_SUFFIX: _list +MAX_ITEM_LIST_LENGTH: 20 +POSITION_FIELD: position_id +load_col: + inter: [user_id, item_id, timestamp] +min_user_inter_num: 0 +min_item_inter_num: 0 + +# training and evaluation +epochs: 500 +train_batch_size: 2048 
+eval_batch_size: 2048
+valid_metric: MRR@10
+eval_setting: TO_LS,full
+training_neg_sample_num: 0
+```
+
+Other parameters (including model parameters) use their default values.
+
+**NOTE :**
+
+1) For the FPMC and TransRec models, `training_neg_sample_num` should be `1`.
+
+2) For SASRecF, GRU4RecF and FDSA, `load_col` should be set as below:
+
+```
+load_col:
+  inter: [user_id, item_id, timestamp]
+  item: [item_id, genre]
+```
+
+### 2) DIGINETICA dataset:
+
+#### Time and memory cost on DIGINETICA dataset:
+
+| Method           | Training Time (sec/epoch) | Evaluation Time (sec/epoch) | GPU Memory (GB) |
+| ---------------- | -----------------: | -----------------: | -----------: |
+| Improved GRU-Rec | 4.10 | 1.05 | 4.02 |
+| SASRec | 8.36 | 1.21 | 4.43 |
+| NARM | 4.30 | 1.08 | 4.09 |
+| FPMC | 2.98 | 1.08 | 4.08 |
+| STAMP | 4.27 | 1.04 | 3.88 |
+| Caser | 17.15 | 1.18 | 3.94 |
+| NextItNet | 6150.49 | 947.66 | 4.54 |
+| TransRec | - | - | Out of Memory |
+| S3Rec | - | - | - |
+| GRU4RecF | 4.79 | 1.17 | 4.83 |
+| SASRecF | 8.66 | 1.29 | 5.11 |
+| BERT4Rec | 16.80 | 3.54 | 7.97 |
+| FDSA | 13.44 | 1.47 | 5.66 |
+| SRGNN | 88.59 | 15.37 | 4.01 |
+| GCSAN | 96.69 | 17.11 | 4.25 |
+| KSR | - | - | - |
+| GRU4RecKG | - | - | - |
+
+#### Config file of DIGINETICA dataset:
+
+```
+# dataset config
+field_separator: "\t"
+seq_separator: " "
+USER_ID_FIELD: session_id
+ITEM_ID_FIELD: item_id
+TIME_FIELD: timestamp
+NEG_PREFIX: neg_
+ITEM_LIST_LENGTH_FIELD: item_length
+LIST_SUFFIX: _list
+MAX_ITEM_LIST_LENGTH: 20
+POSITION_FIELD: position_id
+load_col:
+  inter: [session_id, item_id, timestamp]
+min_user_inter_num: 6
+min_item_inter_num: 1
+
+# training and evaluation
+epochs: 500
+train_batch_size: 2048
+eval_batch_size: 2048
+valid_metric: MRR@10
+eval_setting: TO_LS,full
+training_neg_sample_num: 0
+```
+
+Other parameters (including model parameters) use their default values.
+
+**NOTE :**
+
+1) For the FPMC and TransRec models, `training_neg_sample_num` should be `1`.
+
+2) For SASRecF, GRU4RecF and FDSA, `load_col` should be set as below:
+
+```
+load_col:
+  inter: [session_id, item_id, timestamp]
+  item: [item_id, item_category]
+```
+
+### 3) Yelp dataset:
+
+#### Time and memory cost on Yelp dataset:
+
+| Method           | Training Time (sec/epoch) | Evaluation Time (sec/epoch) | GPU Memory (GB) |
+| ---------------- | -----------------: | -----------------: | -----------: |
+| Improved GRU-Rec | 44.31 | 2.74 | 7.92 |
+| SASRec | 75.51 | 3.11 | 8.32 |
+| NARM | 45.65 | 2.76 | 7.98 |
+| FPMC | 21.05 | 3.05 | 8.22 |
+| STAMP | 42.08 | 2.72 | 7.77 |
+| Caser | 147.15 | 2.89 | 7.87 |
+| NextItNet | 45019.38 | 1670.76 | 8.44 |
+| TransRec | - | - | Out of Memory |
+| S3Rec | - | - | - |
+| GRU4RecF | - | - | Out of Memory |
+| SASRecF | - | - | Out of Memory |
+| BERT4Rec | 193.74 | 8.43 | 16.57 |
+| FDSA | - | - | Out of Memory |
+| SRGNN | 825.11 | 33.20 | 7.90 |
+| GCSAN | 837.23 | 33.00 | 8.14 |
+| KSR | - | - | - |
+| GRU4RecKG | - | - | - |
+
+#### Config file of Yelp dataset:
+
+```
+# dataset config
+field_separator: "\t"
+seq_separator: " "
+USER_ID_FIELD: user_id
+ITEM_ID_FIELD: business_id
+RATING_FIELD: stars
+TIME_FIELD: date
+NEG_PREFIX: neg_
+ITEM_LIST_LENGTH_FIELD: item_length
+LIST_SUFFIX: _list
+MAX_ITEM_LIST_LENGTH: 20
+POSITION_FIELD: position_id
+load_col:
+  inter: [user_id, business_id, stars, date]
+min_user_inter_num: 10
+min_item_inter_num: 4
+lowest_val:
+  stars: 3
+drop_filter_field: True
+
+# training and evaluation
+epochs: 500
+train_batch_size: 2048
+eval_batch_size: 2048
+valid_metric: MRR@10
+eval_setting: TO_LS,full
+training_neg_sample_num: 0
+```
+
+Other parameters (including model parameters) use their default values.
+
+**NOTE :**
+
+1) For the FPMC and TransRec models, `training_neg_sample_num` should be `1`.
+
+2) For SASRecF, GRU4RecF and FDSA, `load_col` should be set as below:
+
+```
+load_col:
+  inter: [user_id, business_id, stars, date]
+  item: [business_id, categories]
+```
+
diff --git a/conda/conda_release.sh b/conda/conda_release.sh
new file mode 100644
index 000000000..a8b0e3efa
--- /dev/null
+++ b/conda/conda_release.sh
@@ -0,0 +1,8 @@
+#!/bin/bash
+
+conda-build --python 3.6 .
+printf "python 3.6 version is released \n"
+conda-build --python 3.7 .
+printf "python 3.7 version is released \n"
+conda-build --python 3.8 .
+printf "python 3.8 version is released \n"
diff --git a/conda/meta.yaml b/conda/meta.yaml
index fa8d9677d..edef48fb9 100644
--- a/conda/meta.yaml
+++ b/conda/meta.yaml
@@ -1,6 +1,6 @@
 package:
   name: recbole
-  version: 0.1.1
+  version: 0.2.0
 
 source:
   path: ../
diff --git a/docs/Makefile b/docs/Makefile
new file mode 100644
index 000000000..d0c3cbf10
--- /dev/null
+++ b/docs/Makefile
@@ -0,0 +1,20 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line, and also
+# from the environment for the first two.
+SPHINXOPTS    ?=
+SPHINXBUILD   ?= sphinx-build
+SOURCEDIR     = source
+BUILDDIR      = build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile + @$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) diff --git a/docs/source/asset/afm.jpg b/docs/source/asset/afm.jpg new file mode 100644 index 000000000..61c072a0b Binary files /dev/null and b/docs/source/asset/afm.jpg differ diff --git a/docs/source/asset/autoint.png b/docs/source/asset/autoint.png new file mode 100644 index 000000000..41d242649 Binary files /dev/null and b/docs/source/asset/autoint.png differ diff --git a/docs/source/asset/bert4rec.png b/docs/source/asset/bert4rec.png new file mode 100644 index 000000000..adc826e22 Binary files /dev/null and b/docs/source/asset/bert4rec.png differ diff --git a/docs/source/asset/bpr.png b/docs/source/asset/bpr.png new file mode 100644 index 000000000..f48cb41a1 Binary files /dev/null and b/docs/source/asset/bpr.png differ diff --git a/docs/source/asset/caser.png b/docs/source/asset/caser.png new file mode 100644 index 000000000..d09eb29ae Binary files /dev/null and b/docs/source/asset/caser.png differ diff --git a/docs/source/asset/cdae.png b/docs/source/asset/cdae.png new file mode 100644 index 000000000..99859aeed Binary files /dev/null and b/docs/source/asset/cdae.png differ diff --git a/docs/source/asset/cke.png b/docs/source/asset/cke.png new file mode 100644 index 000000000..b5092f70f Binary files /dev/null and b/docs/source/asset/cke.png differ diff --git a/docs/source/asset/convncf.png b/docs/source/asset/convncf.png new file mode 100644 index 000000000..3725ba15e Binary files /dev/null and b/docs/source/asset/convncf.png differ diff --git a/docs/source/asset/data_flow_en.png b/docs/source/asset/data_flow_en.png new file mode 100644 index 000000000..cf13cfa93 Binary files /dev/null and b/docs/source/asset/data_flow_en.png differ diff --git a/docs/source/asset/dcn.png b/docs/source/asset/dcn.png new file mode 100644 index 000000000..9d2faa85b Binary files /dev/null and b/docs/source/asset/dcn.png differ diff --git a/docs/source/asset/deepfm.png 
b/docs/source/asset/deepfm.png new file mode 100644 index 000000000..03978c41c Binary files /dev/null and b/docs/source/asset/deepfm.png differ diff --git a/docs/source/asset/dgcf.jpg b/docs/source/asset/dgcf.jpg new file mode 100644 index 000000000..a3685742b Binary files /dev/null and b/docs/source/asset/dgcf.jpg differ diff --git a/docs/source/asset/din.png b/docs/source/asset/din.png new file mode 100644 index 000000000..a0869747a Binary files /dev/null and b/docs/source/asset/din.png differ diff --git a/docs/source/asset/dmf.jpg b/docs/source/asset/dmf.jpg new file mode 100644 index 000000000..75840cd06 Binary files /dev/null and b/docs/source/asset/dmf.jpg differ diff --git a/docs/source/asset/dssm.png b/docs/source/asset/dssm.png new file mode 100644 index 000000000..9b1def0ec Binary files /dev/null and b/docs/source/asset/dssm.png differ diff --git a/docs/source/asset/enmf.jpg b/docs/source/asset/enmf.jpg new file mode 100644 index 000000000..be094d09f Binary files /dev/null and b/docs/source/asset/enmf.jpg differ diff --git a/docs/source/asset/evaluation.png b/docs/source/asset/evaluation.png new file mode 100644 index 000000000..d2a468f4f Binary files /dev/null and b/docs/source/asset/evaluation.png differ diff --git a/docs/source/asset/fdsa.png b/docs/source/asset/fdsa.png new file mode 100644 index 000000000..55af42a45 Binary files /dev/null and b/docs/source/asset/fdsa.png differ diff --git a/docs/source/asset/ffm.png b/docs/source/asset/ffm.png new file mode 100644 index 000000000..fd12c2295 Binary files /dev/null and b/docs/source/asset/ffm.png differ diff --git a/docs/source/asset/fm.png b/docs/source/asset/fm.png new file mode 100644 index 000000000..4702eb61e Binary files /dev/null and b/docs/source/asset/fm.png differ diff --git a/docs/source/asset/fnn.png b/docs/source/asset/fnn.png new file mode 100644 index 000000000..edd1c7f1b Binary files /dev/null and b/docs/source/asset/fnn.png differ diff --git a/docs/source/asset/fossil.jpg 
b/docs/source/asset/fossil.jpg new file mode 100644 index 000000000..eb14b22b4 Binary files /dev/null and b/docs/source/asset/fossil.jpg differ diff --git a/docs/source/asset/fpmc.png b/docs/source/asset/fpmc.png new file mode 100644 index 000000000..f339cc2c3 Binary files /dev/null and b/docs/source/asset/fpmc.png differ diff --git a/docs/source/asset/fwfm.png b/docs/source/asset/fwfm.png new file mode 100644 index 000000000..01282fe59 Binary files /dev/null and b/docs/source/asset/fwfm.png differ diff --git a/docs/source/asset/gcmc.png b/docs/source/asset/gcmc.png new file mode 100644 index 000000000..99acffab4 Binary files /dev/null and b/docs/source/asset/gcmc.png differ diff --git a/docs/source/asset/gcsan.png b/docs/source/asset/gcsan.png new file mode 100644 index 000000000..0ff336b4f Binary files /dev/null and b/docs/source/asset/gcsan.png differ diff --git a/docs/source/asset/gru4rec.png b/docs/source/asset/gru4rec.png new file mode 100644 index 000000000..c5ef04d29 Binary files /dev/null and b/docs/source/asset/gru4rec.png differ diff --git a/docs/source/asset/gru4recf.png b/docs/source/asset/gru4recf.png new file mode 100644 index 000000000..b1bf6611e Binary files /dev/null and b/docs/source/asset/gru4recf.png differ diff --git a/docs/source/asset/hgn.jpg b/docs/source/asset/hgn.jpg new file mode 100644 index 000000000..9200699b5 Binary files /dev/null and b/docs/source/asset/hgn.jpg differ diff --git a/docs/source/asset/hrm.jpg b/docs/source/asset/hrm.jpg new file mode 100644 index 000000000..5ba634de6 Binary files /dev/null and b/docs/source/asset/hrm.jpg differ diff --git a/docs/source/asset/kgat.png b/docs/source/asset/kgat.png new file mode 100644 index 000000000..84d0622d0 Binary files /dev/null and b/docs/source/asset/kgat.png differ diff --git a/docs/source/asset/kgcn.png b/docs/source/asset/kgcn.png new file mode 100644 index 000000000..86c9040b9 Binary files /dev/null and b/docs/source/asset/kgcn.png differ diff --git 
a/docs/source/asset/kgnnls.png b/docs/source/asset/kgnnls.png new file mode 100644 index 000000000..664ce86e7 Binary files /dev/null and b/docs/source/asset/kgnnls.png differ diff --git a/docs/source/asset/ksr.jpg b/docs/source/asset/ksr.jpg new file mode 100644 index 000000000..6c764c642 Binary files /dev/null and b/docs/source/asset/ksr.jpg differ diff --git a/docs/source/asset/ktup.png b/docs/source/asset/ktup.png new file mode 100644 index 000000000..0bde6eadb Binary files /dev/null and b/docs/source/asset/ktup.png differ diff --git a/docs/source/asset/lightgcn.png b/docs/source/asset/lightgcn.png new file mode 100644 index 000000000..576f91130 Binary files /dev/null and b/docs/source/asset/lightgcn.png differ diff --git a/docs/source/asset/line.png b/docs/source/asset/line.png new file mode 100644 index 000000000..f9e09e9b2 Binary files /dev/null and b/docs/source/asset/line.png differ diff --git a/docs/source/asset/lr.png b/docs/source/asset/lr.png new file mode 100644 index 000000000..c2d6189a7 Binary files /dev/null and b/docs/source/asset/lr.png differ diff --git a/docs/source/asset/macridvae.png b/docs/source/asset/macridvae.png new file mode 100644 index 000000000..f45826c1b Binary files /dev/null and b/docs/source/asset/macridvae.png differ diff --git a/docs/source/asset/mkr.png b/docs/source/asset/mkr.png new file mode 100644 index 000000000..f37307188 Binary files /dev/null and b/docs/source/asset/mkr.png differ diff --git a/docs/source/asset/multidae.png b/docs/source/asset/multidae.png new file mode 100644 index 000000000..919bc67d6 Binary files /dev/null and b/docs/source/asset/multidae.png differ diff --git a/docs/source/asset/multivae.png b/docs/source/asset/multivae.png new file mode 100644 index 000000000..919bc67d6 Binary files /dev/null and b/docs/source/asset/multivae.png differ diff --git a/docs/source/asset/nais.png b/docs/source/asset/nais.png new file mode 100644 index 000000000..6bd404472 Binary files /dev/null and 
b/docs/source/asset/nais.png differ diff --git a/docs/source/asset/narm.png b/docs/source/asset/narm.png new file mode 100644 index 000000000..a51a52e54 Binary files /dev/null and b/docs/source/asset/narm.png differ diff --git a/docs/source/asset/neumf.png b/docs/source/asset/neumf.png new file mode 100644 index 000000000..5af976a5d Binary files /dev/null and b/docs/source/asset/neumf.png differ diff --git a/docs/source/asset/nextitnet.png b/docs/source/asset/nextitnet.png new file mode 100644 index 000000000..aa751d33a Binary files /dev/null and b/docs/source/asset/nextitnet.png differ diff --git a/docs/source/asset/nfm.jpg b/docs/source/asset/nfm.jpg new file mode 100644 index 000000000..c242cd794 Binary files /dev/null and b/docs/source/asset/nfm.jpg differ diff --git a/docs/source/asset/ngcf.jpg b/docs/source/asset/ngcf.jpg new file mode 100644 index 000000000..eae87d17f Binary files /dev/null and b/docs/source/asset/ngcf.jpg differ diff --git a/docs/source/asset/nncf.png b/docs/source/asset/nncf.png new file mode 100644 index 000000000..7655c3d94 Binary files /dev/null and b/docs/source/asset/nncf.png differ diff --git a/docs/source/asset/npe.jpg b/docs/source/asset/npe.jpg new file mode 100644 index 000000000..2bedff1d1 Binary files /dev/null and b/docs/source/asset/npe.jpg differ diff --git a/docs/source/asset/pnn.jpg b/docs/source/asset/pnn.jpg new file mode 100644 index 000000000..1c8b475a4 Binary files /dev/null and b/docs/source/asset/pnn.jpg differ diff --git a/docs/source/asset/repeatnet.jpg b/docs/source/asset/repeatnet.jpg new file mode 100644 index 000000000..e63af5f71 Binary files /dev/null and b/docs/source/asset/repeatnet.jpg differ diff --git a/docs/source/asset/ripplenet.jpg b/docs/source/asset/ripplenet.jpg new file mode 100644 index 000000000..21f059094 Binary files /dev/null and b/docs/source/asset/ripplenet.jpg differ diff --git a/docs/source/asset/s3rec.png b/docs/source/asset/s3rec.png new file mode 100644 index 000000000..2b80cd58d 
Binary files /dev/null and b/docs/source/asset/s3rec.png differ diff --git a/docs/source/asset/sasrec.png b/docs/source/asset/sasrec.png new file mode 100644 index 000000000..e317bd7d5 Binary files /dev/null and b/docs/source/asset/sasrec.png differ diff --git a/docs/source/asset/shan.jpg b/docs/source/asset/shan.jpg new file mode 100644 index 000000000..bb681b387 Binary files /dev/null and b/docs/source/asset/shan.jpg differ diff --git a/docs/source/asset/spectralcf.png b/docs/source/asset/spectralcf.png new file mode 100644 index 000000000..d1e5f59cb Binary files /dev/null and b/docs/source/asset/spectralcf.png differ diff --git a/docs/source/asset/srgnn.png b/docs/source/asset/srgnn.png new file mode 100644 index 000000000..51f7d2935 Binary files /dev/null and b/docs/source/asset/srgnn.png differ diff --git a/docs/source/asset/stamp.png b/docs/source/asset/stamp.png new file mode 100644 index 000000000..32e0a8a40 Binary files /dev/null and b/docs/source/asset/stamp.png differ diff --git a/docs/source/asset/transrec.png b/docs/source/asset/transrec.png new file mode 100644 index 000000000..cc87d45f9 Binary files /dev/null and b/docs/source/asset/transrec.png differ diff --git a/docs/source/asset/widedeep.png b/docs/source/asset/widedeep.png new file mode 100644 index 000000000..76a61d449 Binary files /dev/null and b/docs/source/asset/widedeep.png differ diff --git a/docs/source/asset/xdeepfm.png b/docs/source/asset/xdeepfm.png new file mode 100644 index 000000000..6a92b6431 Binary files /dev/null and b/docs/source/asset/xdeepfm.png differ diff --git a/docs/source/conf.py b/docs/source/conf.py new file mode 100644 index 000000000..befbeb492 --- /dev/null +++ b/docs/source/conf.py @@ -0,0 +1,74 @@ +# Configuration file for the Sphinx documentation builder. +# +# This file only contains a selection of the most common options. 
For a full +# list see the documentation: +# https://www.sphinx-doc.org/en/master/usage/configuration.html + +# -- Path setup -------------------------------------------------------------- + +# If extensions (or modules to document with autodoc) are in another directory, +# add these directories to sys.path here. If the directory is relative to the +# documentation root, use os.path.abspath to make it absolute, like shown here. +# +import sphinx_rtd_theme +import os +import sys +sys.path.insert(0, os.path.abspath('../..')) + + +# -- Project information ----------------------------------------------------- + +project = 'RecBole' +copyright = '2020, RecBole Contributors' +author = 'AIBox RecBole group' + +# The full version, including alpha/beta/rc tags +release = '0.2.0' + + +# -- General configuration --------------------------------------------------- + +# Add any Sphinx extension module names here, as strings. They can be +# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom +# ones. +extensions = [ + 'sphinx.ext.autodoc', + 'sphinx.ext.napoleon', + 'sphinx.ext.viewcode', + 'sphinx_copybutton', +] + +autodoc_mock_imports = ["pandas", "pyecharts"] +# autoclass_content = 'both' + +# Add any paths that contain templates here, relative to this directory. +templates_path = ['_templates'] + +# The language for content autogenerated by Sphinx. Refer to documentation +# for a list of supported languages. +# +# This is also used if you do content translation via gettext catalogs. +# Usually you set "language" from the command line for these cases. +language = 'en' + +# List of patterns, relative to source directory, that match files and +# directories to ignore when looking for source files. +# This pattern also affects html_static_path and html_extra_path. +exclude_patterns = [] + + +# -- Options for HTML output ------------------------------------------------- + +# The theme to use for HTML and HTML Help pages. 
See the documentation for
+# a list of builtin themes.
+#
+# html_theme = 'alabaster'
+
+
+html_theme = 'sphinx_rtd_theme'
+html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]
+
+# Add any paths that contain custom static files (such as style sheets) here,
+# relative to this directory. They are copied after the builtin static files,
+# so a file named "default.css" will overwrite the builtin "default.css".
+html_static_path = ['_static']
diff --git a/docs/source/developer_guide/customize_dataloaders.rst b/docs/source/developer_guide/customize_dataloaders.rst
new file mode 100644
index 000000000..6565dcedf
--- /dev/null
+++ b/docs/source/developer_guide/customize_dataloaders.rst
@@ -0,0 +1,201 @@
+Customize DataLoaders
+======================
+Here, we present how to develop a new DataLoader and apply it in our tool. If we have a new model
+and there is some special requirement for loading its data, then we need to design a new DataLoader.
+
+
+Abstract DataLoader
+--------------------------
+In this project, there are three abstract classes: :class:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader`,
+:class:`~recbole.data.dataloader.neg_sample_mixin.NegSampleMixin` and :class:`~recbole.data.dataloader.neg_sample_mixin.NegSampleByMixin`.
+
+In general, a new dataloader should inherit from one of the above three abstract classes.
+If one only needs to modify an existing DataLoader, one can also inherit from it.
+The documentation of the dataloaders: :doc:`../../recbole/recbole.data.dataloader`
+
+
+AbstractDataLoader
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+:class:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader` is the most basic abstract class,
+which includes three functions: :meth:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader.pr_end`,
+:meth:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader._shuffle`
+and :meth:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader._next_batch_data`.
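Before describing each function, the interplay of these three members can be illustrated with a minimal, self-contained sketch in plain Python. This is a hypothetical illustration of the iteration protocol, not RecBole's actual implementation; the class and data here are made up for the example:

```python
import random


class TinyDataLoader:
    """Minimal sketch of the pr / pr_end / _next_batch_data protocol."""

    def __init__(self, data, batch_size, shuffle=False):
        self.data = list(data)
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.pr = 0  # pointer to the next unread sample

    @property
    def pr_end(self):
        # the maximum value of pr plus 1, i.e. the dataset length
        return len(self.data)

    def _shuffle(self):
        # permute the dataset in place
        random.shuffle(self.data)

    def _next_batch_data(self):
        # slice out the next batch and advance the pointer
        batch = self.data[self.pr: self.pr + self.batch_size]
        self.pr += self.batch_size
        return batch

    def __iter__(self):
        if self.shuffle:
            self._shuffle()
        return self

    def __next__(self):
        if self.pr >= self.pr_end:
            self.pr = 0  # reset so the loader can be iterated again
            raise StopIteration
        return self._next_batch_data()
```

For example, `list(TinyDataLoader(range(5), 2))` yields `[[0, 1], [2, 3], [4]]`. In RecBole, `_next_batch_data` would additionally convert the slice into an `Interaction` object.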
+:meth:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader.pr_end` is the maximum
+:attr:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader.pr` plus 1.
+:meth:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader._shuffle` is leveraged to permute the dataset,
+and it will be invoked by :meth:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader.__iter__`
+if the parameter :attr:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader.shuffle` is ``True``.
+:meth:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader._next_batch_data` is used to
+load the next batch of data and return it in the :class:`~recbole.data.interaction.Interaction` format;
+it will be invoked in :meth:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader.__next__`.
+
+In :class:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader`,
+there are two functions to assist the conversion in :meth:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader._next_batch_data`:
+one is :meth:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader._dataframe_to_interaction`,
+and the other is :meth:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader._dict_to_interaction`.
+They both call the functions of the same name in :attr:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader.dataset`,
+which convert a :class:`pandas.DataFrame` or :class:`dict` into an :class:`~recbole.data.interaction.Interaction`.
+
+In addition to the above three functions, two other functions can also be overridden,
+namely :meth:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader.setup`
+and :meth:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader.data_preprocess`.
+
+:meth:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader.setup` is used to handle setup work other than parameter initialization.
+For example, resetting the :attr:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader.batch_size`
+or examining the :attr:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader.shuffle` setting.
+All these things can be overridden in the subclass.
+:meth:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader.data_preprocess` is used to process the data,
+e.g., negative sampling.
+
+At the end of :meth:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader.__init__`,
+:meth:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader.setup` will be invoked,
+and then, if :attr:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader.real_time` is ``True``,
+:meth:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader.data_preprocess` will also be invoked.
+
+NegSampleMixin
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+:class:`~recbole.data.dataloader.neg_sample_mixin.NegSampleMixin` inherits from
+:class:`~recbole.data.dataloader.abstract_dataloader.AbstractDataLoader` and is used for negative sampling.
+It adds three functions to its parent class:
+:meth:`~recbole.data.dataloader.neg_sample_mixin.NegSampleMixin._batch_size_adaptation`,
+:meth:`~recbole.data.dataloader.neg_sample_mixin.NegSampleMixin._neg_sampling`
+and :meth:`~recbole.data.dataloader.neg_sample_mixin.NegSampleMixin.get_pos_len_list`.
+
+Since the positive and negative samples should be grouped in the same batch,
+the original batch size may not be appropriate.
+:meth:`~recbole.data.dataloader.neg_sample_mixin.NegSampleMixin._batch_size_adaptation` is used to reset the batch size,
+such that the positive and negative samples can be in the same batch.
+:meth:`~recbole.data.dataloader.neg_sample_mixin.NegSampleMixin._neg_sampling` is used for negative sampling
+and should be implemented by the subclass.
+:meth:`~recbole.data.dataloader.neg_sample_mixin.NegSampleMixin.get_pos_len_list` returns the number of positive samples for each user.
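The batch-size adaptation idea can be sketched as follows. This is a simplified stand-alone version for illustration only; the function name and its return convention are assumptions, and RecBole's real `_batch_size_adaptation` also distinguishes pair-wise from point-wise sampling:

```python
def batch_size_adaptation(batch_size, neg_sample_num):
    """Rough sketch: fit each positive sample and its negatives into one batch.

    Returns (step, new_batch_size): `step` positive samples are read per
    iteration, yielding batches of exactly step * (1 + neg_sample_num) rows.
    """
    group_size = 1 + neg_sample_num          # one positive plus its negatives
    step = max(batch_size // group_size, 1)  # positives per batch, at least one
    return step, step * group_size
```

With a configured batch size of 2048 and one negative per positive, this gives a step of 1024 positives and an effective batch of 2048 rows, so every positive stays in the same batch as its negative.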
+
+In addition, :meth:`~recbole.data.dataloader.neg_sample_mixin.NegSampleMixin.setup`
+and :meth:`~recbole.data.dataloader.neg_sample_mixin.NegSampleMixin.data_preprocess` are also overridden.
+:meth:`~recbole.data.dataloader.neg_sample_mixin.NegSampleMixin.setup` will
+call :meth:`~recbole.data.dataloader.neg_sample_mixin.NegSampleMixin._batch_size_adaptation`, while
+:meth:`~recbole.data.dataloader.neg_sample_mixin.NegSampleMixin.data_preprocess` performs the negative sampling
+and should be implemented in the subclass.
+
+NegSampleByMixin
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+:class:`~recbole.data.dataloader.neg_sample_mixin.NegSampleByMixin` inherits
+from :class:`~recbole.data.dataloader.neg_sample_mixin.NegSampleMixin`
+and is used for negative sampling by ratio.
+It supports two strategies: ``pair-wise sampling`` and ``point-wise sampling``.
+Based on the parent class, two functions are added:
+:meth:`~recbole.data.dataloader.neg_sample_mixin.NegSampleByMixin._neg_sample_by_pair_wise_sampling`
+and :meth:`~recbole.data.dataloader.neg_sample_mixin.NegSampleByMixin._neg_sample_by_point_wise_sampling`.
+
+
+Example
+--------------------------
+Here, we take :class:`~recbole.data.dataloader.user_dataloader.UserDataLoader` as an example.
+This dataloader returns user ids, which are leveraged to train the user representations.
+
+
+Implement __init__()
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+:meth:`__init__` can be used to initialize the necessary parameters.
+Here, we just need to record :attr:`uid_field`.
+
+.. code:: python
+
+    def __init__(self, config, dataset,
+                 batch_size=1, dl_format=InputType.POINTWISE, shuffle=False):
+        self.uid_field = dataset.uid_field
+
+        super().__init__(config=config, dataset=dataset,
+                         batch_size=batch_size, dl_format=dl_format, shuffle=shuffle)
+
+Implement setup()
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Due to the training requirements, :attr:`self.shuffle` should be ``True``.
+Then we can check and revise :attr:`self.shuffle` in :meth:`~recbole.data.dataloader.user_dataloader.setup`.
+
+
+.. code:: python
+
+    def setup(self):
+        if self.shuffle is False:
+            self.shuffle = True
+            self.logger.warning('UserDataLoader must shuffle the data')
+
+Implement pr_end() and _shuffle()
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Since this dataloader only returns user ids, these functions can be implemented readily.
+
+.. code:: python
+
+    @property
+    def pr_end(self):
+        return len(self.dataset.user_feat)
+
+    def _shuffle(self):
+        self.dataset.user_feat = self.dataset.user_feat.sample(frac=1).reset_index(drop=True)
+
+Implement _next_batch_data
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+This function only needs to return user ids from :attr:`user_feat`.
+We only have to select one column and use :meth:`_dataframe_to_interaction` to convert
+the :class:`pandas.DataFrame` into an :class:`~recbole.data.interaction.Interaction`.
+
+
+.. code:: python
+
+    def _next_batch_data(self):
+        cur_data = self.dataset.user_feat[[self.uid_field]][self.pr: self.pr + self.step]
+        self.pr += self.step
+        return self._dataframe_to_interaction(cur_data)
+
+
+Complete Code
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code:: python
+
+    class UserDataLoader(AbstractDataLoader):
+        """:class:`UserDataLoader` will return a batch of data which only contains user ids when it is iterated.
+
+        Args:
+            config (Config): The config of dataloader.
+            dataset (Dataset): The dataset of dataloader.
+            batch_size (int, optional): The batch_size of dataloader. Defaults to ``1``.
+            dl_format (InputType, optional): The input type of dataloader. Defaults to
+                :obj:`~recbole.utils.enum_type.InputType.POINTWISE`.
+            shuffle (bool, optional): Whether the dataloader will be shuffled after a round. Defaults to ``False``.
+
+        Attributes:
+            shuffle (bool): Whether the dataloader will be shuffled after a round.
+                However, in :class:`UserDataLoader`, it's guaranteed to be ``True``.
+ """ + dl_type = DataLoaderType.ORIGIN + + def __init__(self, config, dataset, + batch_size=1, dl_format=InputType.POINTWISE, shuffle=False): + self.uid_field = dataset.uid_field + + super().__init__(config=config, dataset=dataset, + batch_size=batch_size, dl_format=dl_format, shuffle=shuffle) + + def setup(self): + """Make sure that the :attr:`shuffle` is True. If :attr:`shuffle` is False, it will be changed to True + and give a warning to user. + """ + if self.shuffle is False: + self.shuffle = True + self.logger.warning('UserDataLoader must shuffle the data') + + @property + def pr_end(self): + return len(self.dataset.user_feat) + + def _shuffle(self): + self.dataset.user_feat = self.dataset.user_feat.sample(frac=1).reset_index(drop=True) + + def _next_batch_data(self): + cur_data = self.dataset.user_feat[[self.uid_field]][self.pr: self.pr + self.step] + self.pr += self.step + return self._dataframe_to_interaction(cur_data) + + +Other more complex Dataloader development can refer to the source code. diff --git a/docs/source/developer_guide/customize_models.rst b/docs/source/developer_guide/customize_models.rst new file mode 100644 index 000000000..e6acc6c90 --- /dev/null +++ b/docs/source/developer_guide/customize_models.rst @@ -0,0 +1,268 @@ +Customize Models +====================== +Here, we present how to develop a new model, and apply it to the RecBole. + +RecBole supports General, Context-aware, Sequential and Knowledge-based +recommendation. + +Create a New Model Class +------------------------------ +To begin with, we should create a new model implementing from one of :class:`~recbole.model.abstract_recommender.GeneralRecommender`, +:class:`~recbole.model.abstract_recommender.ContextRecommender`, :class:`~recbole.model.abstract_recommender.SequentialRecommender`, +:class:`~recbole.model.abstract_recommender.KnowledgeRecommender`. +For example, we would like to develop a general model named as NewModel and write the code to `newmodel.py`. + +.. 
code:: python
+
+    from recbole.model.abstract_recommender import GeneralRecommender
+
+    class NewModel(GeneralRecommender):
+        pass
+
+Then, we need to indicate :attr:`~recbole.model.abstract_recommender.AbstractRecommender.input_type`;
+RecBole supports two input types: :obj:`~recbole.utils.enum_type.InputType.POINTWISE` and :obj:`~recbole.utils.enum_type.InputType.PAIRWISE`.
+
+:obj:`~recbole.utils.enum_type.InputType.POINTWISE` will give the :attr:`item` and the corresponding :attr:`label`, which is suitable for pointwise losses, e.g., Cross Entropy Loss.
+
+:obj:`~recbole.utils.enum_type.InputType.PAIRWISE` will give the items :attr:`pos_item` and :attr:`neg_item`, which is suitable for pairwise losses, e.g., BPR Loss.
+
+Suppose we want to use a pairwise loss:
+
+.. code:: python
+
+    from recbole.utils import InputType
+    from recbole.model.abstract_recommender import GeneralRecommender
+
+    class NewModel(GeneralRecommender):
+
+        input_type = InputType.PAIRWISE
+        pass
+
+Implement __init__()
+--------------------------------
+Then we implement the :meth:`__init__` method, which initializes the model, including loading the dataset information
+and model parameters, defining the model structure, and initializing the parameters.
+
+:meth:`__init__` takes :attr:`config` and :attr:`dataset` as input, where :attr:`config` provides the parameters and
+:attr:`dataset` provides the dataset information, including :attr:`n_users` and :attr:`n_items`.
+
+Here, we suppose NewModel encodes the users and items, uses :func:`~recbole.model.init.xavier_normal_initialization` to initialize the parameters, and uses the inner product to compute the score.
+
+.. 
code:: python
+
+    import torch
+    import torch.nn as nn
+
+    from recbole.model.loss import BPRLoss
+    from recbole.model.init import xavier_normal_initialization
+
+    def __init__(self, config, dataset):
+        super(NewModel, self).__init__(config, dataset)
+
+        # load dataset info
+        self.n_users = dataset.user_num
+        self.n_items = dataset.item_num
+
+        # load parameters info
+        self.embedding_size = config['embedding_size']
+
+        # define layers and loss
+        self.user_embedding = nn.Embedding(self.n_users, self.embedding_size)
+        self.item_embedding = nn.Embedding(self.n_items, self.embedding_size)
+        self.loss = BPRLoss()
+
+        # parameters initialization
+        self.apply(xavier_normal_initialization)
+
+
+Implement calculate_loss()
+----------------------------------------
+Then we define the :meth:`calculate_loss` method, which computes the training loss.
+Its input is an :class:`~recbole.data.interaction.Interaction`, and it returns a :class:`torch.Tensor` used for back-propagation.
+
+.. code:: python
+
+    import torch
+
+    def calculate_loss(self, interaction):
+        user = interaction[self.USER_ID]
+        pos_item = interaction[self.ITEM_ID]
+        neg_item = interaction[self.NEG_ITEM_ID]
+
+        user_e = self.user_embedding(user)  # [batch_size, embedding_size]
+        pos_item_e = self.item_embedding(pos_item)  # [batch_size, embedding_size]
+        neg_item_e = self.item_embedding(neg_item)  # [batch_size, embedding_size]
+        pos_item_score = torch.mul(user_e, pos_item_e).sum(dim=1)  # [batch_size]
+        neg_item_score = torch.mul(user_e, neg_item_e).sum(dim=1)  # [batch_size]
+
+        loss = self.loss(pos_item_score, neg_item_score)  # []
+
+        return loss
+
+
+Implement predict()
+------------------------------
+At last, we define the :meth:`predict` method, which is used to compute the score for a given user-item pair.
+The input is a :class:`~recbole.data.interaction.Interaction`, and the output is a score.
+
+.. 
code:: python
+
+    import torch
+
+    def predict(self, interaction):
+        user = interaction[self.USER_ID]
+        item = interaction[self.ITEM_ID]
+
+        user_e = self.user_embedding(user)  # [batch_size, embedding_size]
+        item_e = self.item_embedding(item)  # [batch_size, embedding_size]
+
+        scores = torch.mul(user_e, item_e).sum(dim=1)  # [batch_size]
+
+        return scores
+
+If you would like to evaluate NewModel with full ranking, RecBole also supports an accelerated predict method.
+
+.. code:: python
+
+    import torch
+
+    def full_sort_predict(self, interaction):
+        user = interaction[self.USER_ID]
+
+        user_e = self.user_embedding(user)  # [batch_size, embedding_size]
+        all_item_e = self.item_embedding.weight  # [n_items, embedding_size]
+
+        scores = torch.matmul(user_e, all_item_e.transpose(0, 1))  # [batch_size, n_items]
+
+        return scores
+
+
+RecBole will call this method to accelerate the full ranking evaluation.
+
+
+Complete Code
+------------------------
+Thus the final implemented NewModel is:
+
+.. code:: python
+
+    import torch
+    import torch.nn as nn
+
+    from recbole.utils import InputType
+    from recbole.model.abstract_recommender import GeneralRecommender
+    from recbole.model.loss import BPRLoss
+    from recbole.model.init import xavier_normal_initialization
+
+
+    class NewModel(GeneralRecommender):
+
+        input_type = InputType.PAIRWISE
+
+        def __init__(self, config, dataset):
+            super(NewModel, self).__init__(config, dataset)
+
+            # load dataset info
+            self.n_users = dataset.user_num
+            self.n_items = dataset.item_num
+
+            # load parameters info
+            self.embedding_size = config['embedding_size']
+
+            # define layers and loss
+            self.user_embedding = nn.Embedding(self.n_users, self.embedding_size)
+            self.item_embedding = nn.Embedding(self.n_items, self.embedding_size)
+            self.loss = BPRLoss()
+
+            # parameters initialization
+            self.apply(xavier_normal_initialization)
+
+        def calculate_loss(self, interaction):
+            user = interaction[self.USER_ID]
+            pos_item = interaction[self.ITEM_ID]
+            neg_item 
= interaction[self.NEG_ITEM_ID]
+
+            user_e = self.user_embedding(user)  # [batch_size, embedding_size]
+            pos_item_e = self.item_embedding(pos_item)  # [batch_size, embedding_size]
+            neg_item_e = self.item_embedding(neg_item)  # [batch_size, embedding_size]
+            pos_item_score = torch.mul(user_e, pos_item_e).sum(dim=1)  # [batch_size]
+            neg_item_score = torch.mul(user_e, neg_item_e).sum(dim=1)  # [batch_size]
+
+            loss = self.loss(pos_item_score, neg_item_score)  # []
+
+            return loss
+
+        def predict(self, interaction):
+            user = interaction[self.USER_ID]
+            item = interaction[self.ITEM_ID]
+
+            user_e = self.user_embedding(user)  # [batch_size, embedding_size]
+            item_e = self.item_embedding(item)  # [batch_size, embedding_size]
+
+            scores = torch.mul(user_e, item_e).sum(dim=1)  # [batch_size]
+
+            return scores
+
+        def full_sort_predict(self, interaction):
+            user = interaction[self.USER_ID]
+
+            user_e = self.user_embedding(user)  # [batch_size, embedding_size]
+            all_item_e = self.item_embedding.weight  # [n_items, embedding_size]
+
+            scores = torch.matmul(user_e, all_item_e.transpose(0, 1))  # [batch_size, n_items]
+
+            return scores
+
+Then, we can use NewModel in RecBole as follows (e.g., `run.py`):
+
+.. 
code:: python
+
+    from logging import getLogger
+    from recbole.utils import init_logger, init_seed
+    from recbole.trainer import Trainer
+    from newmodel import NewModel
+    from recbole.config import Config
+    from recbole.data import create_dataset, data_preparation
+
+
+    if __name__ == '__main__':
+
+        config = Config(model=NewModel, dataset='ml-100k')
+        init_seed(config['seed'], config['reproducibility'])
+
+        # logger initialization
+        init_logger(config)
+        logger = getLogger()
+
+        logger.info(config)
+
+        # dataset filtering
+        dataset = create_dataset(config)
+        logger.info(dataset)
+
+        # dataset splitting
+        train_data, valid_data, test_data = data_preparation(config, dataset)
+
+        # model loading and initialization
+        model = NewModel(config, train_data).to(config['device'])
+        logger.info(model)
+
+        # trainer loading and initialization
+        trainer = Trainer(config, model)
+
+        # model training
+        best_valid_score, best_valid_result = trainer.fit(train_data, valid_data)
+
+        # model evaluation
+        test_result = trainer.evaluate(test_data)
+
+        logger.info('best valid result: {}'.format(best_valid_result))
+        logger.info('test result: {}'.format(test_result))
+
+Then, we can run NewModel:
+
+.. code:: bash
+
+    python run.py --embedding_size=64
+
+Note: please remember to configure the model parameters
+(such as ``embedding_size``) through config files, parameter dicts or the command line.
diff --git a/docs/source/developer_guide/customize_samplers.rst b/docs/source/developer_guide/customize_samplers.rst
new file mode 100644
index 000000000..3c37797cc
--- /dev/null
+++ b/docs/source/developer_guide/customize_samplers.rst
@@ -0,0 +1,174 @@
+Customize Samplers
+======================
+Here we present how to develop a new sampler and apply it to RecBole.
+A new sampler is needed when the model requires a complex sampling method.
+
+Here, we take the :class:`~recbole.sampler.sampler.KGSampler` as an example.
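
Before building the sampler class, it may help to see in isolation the mechanism that :class:`~recbole.sampler.sampler.AbstractSampler` relies on, as described in the sections below: a pre-shuffled random list consumed by a pointer that wraps around at the end. The following is only a standalone toy sketch with hypothetical names, not RecBole code:

```python
import random

class ToyPointerSampler:
    """Standalone illustration (not RecBole code) of the random-list-plus-pointer
    idea: shuffle the candidate list once, then serve samples by advancing a
    pointer that wraps back to the head after the last element."""

    def __init__(self, candidates, seed=2020):
        # id 0 is assumed to be reserved for padding, so it is excluded
        self.random_list = [c for c in candidates if c != 0]
        random.Random(seed).shuffle(self.random_list)
        self.pointer = 0

    def random(self):
        value = self.random_list[self.pointer]
        # advance the pointer; wrap around at the end of the list
        self.pointer = (self.pointer + 1) % len(self.random_list)
        return value

sampler = ToyPointerSampler(range(6))            # candidate ids 1..5
samples = [sampler.random() for _ in range(10)]
```

Ten draws walk the shuffled list twice, so the first five samples are a permutation of 1..5 and the next five repeat them in the same order; each draw costs constant time, which is the efficiency argument made below.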
+
+
+Create a New Sampler Class
+-----------------------------
+To begin with, we create a new sampler based on :class:`~recbole.sampler.sampler.AbstractSampler`:
+
+.. code:: python
+
+    from recbole.sampler import AbstractSampler
+
+    class KGSampler(AbstractSampler):
+        pass
+
+
+Implement __init__()
+-----------------------
+Then, we implement :meth:`~recbole.sampler.sampler.KGSampler.__init__()`. In this method, we can flexibly define and initialize the parameters;
+we only need to invoke :obj:`super().__init__(distribution=distribution)`.
+
+.. code:: python
+
+    def __init__(self, dataset, distribution='uniform'):
+        self.dataset = dataset
+
+        self.hid_field = dataset.head_entity_field
+        self.tid_field = dataset.tail_entity_field
+        self.hid_list = dataset.head_entities
+        self.tid_list = dataset.tail_entities
+
+        self.head_entities = set(dataset.head_entities)
+        self.entity_num = dataset.entity_num
+
+        super().__init__(distribution=distribution)
+
+
+Implement get_random_list()
+------------------------------
+We do not use the random functions in Python or NumPy due to their lower efficiency.
+Instead, we implement our own :meth:`~recbole.sampler.sampler.AbstractSampler.random` function, whose key idea is to combine a random list with a pointer.
+The pointer points to some element of the random list. Each call to :meth:`self.random` returns that element and advances the pointer by one.
+When the pointer reaches the last element, it wraps around to the head of the list.
+
+In :class:`~recbole.sampler.sampler.AbstractSampler`, :meth:`~recbole.sampler.sampler.AbstractSampler.__init__` calls :meth:`~recbole.sampler.sampler.AbstractSampler.get_random_list` and shuffles the result.
+We only need to return a list including all the elements.
+
+It should be noted that ``0`` may be used as the padding token, so this value should be reserved.
+
+Example code:
+
+.. 
code:: python
+
+    def get_random_list(self):
+        if self.distribution == 'uniform':
+            return list(range(1, self.entity_num))
+        elif self.distribution == 'popularity':
+            return list(self.hid_list) + list(self.tid_list)
+        else:
+            raise NotImplementedError('Distribution [{}] has not been implemented'.format(self.distribution))
+
+
+Implement get_used_ids()
+----------------------------
+For negative sampling, we do not want to sample positive instances, so this function computes the positive samples to be excluded.
+It returns a :class:`numpy.ndarray` indexed by ID, which will be saved in :attr:`self.used_ids`.
+
+Example code:
+
+.. code:: python
+
+    def get_used_ids(self):
+        used_tail_entity_id = np.array([set() for i in range(self.entity_num)])
+        for hid, tid in zip(self.hid_list, self.tid_list):
+            used_tail_entity_id[hid].add(tid)
+        return used_tail_entity_id
+
+
+Implement the sampling function
+-----------------------------------
+In :class:`~recbole.sampler.sampler.AbstractSampler`, we have implemented the :meth:`~recbole.sampler.sampler.AbstractSampler.sample_by_key_ids` function,
+which has three parameters: :attr:`key_ids`, :attr:`num` and :attr:`used_ids`.
+:attr:`key_ids` is the list of candidate key IDs, :attr:`num` is the number of samples per key, and :attr:`used_ids` is the list of positive samples to be excluded.
+
+The function samples :attr:`num` instances for each element in :attr:`key_ids` and returns a :class:`numpy.ndarray`:
+indices 0, len(key_ids), len(key_ids) * 2, …, len(key_ids) * (num - 1) hold the samples for key_ids[0];
+indices 1, len(key_ids) + 1, len(key_ids) * 2 + 1, …, len(key_ids) * (num - 1) + 1 hold the samples for key_ids[1]; and so on.
+
+One can also design their own sampling function if the above process is not appropriate.
+
+Example code:
+
+.. 
code:: python + + def sample_by_entity_ids(self, head_entity_ids, num=1): + try: + return self.sample_by_key_ids(head_entity_ids, num, self.used_ids[head_entity_ids]) + except IndexError: + for head_entity_id in head_entity_ids: + if head_entity_id not in self.head_entities: + raise ValueError('head_entity_id [{}] not exist'.format(head_entity_id)) + + +Complete Code +---------------------- +.. code:: python + + class KGSampler(AbstractSampler): + """:class:`KGSampler` is used to sample negative entities in a knowledge graph. + + Args: + dataset (Dataset): The knowledge graph dataset, which contains triplets in a knowledge graph. + distribution (str, optional): Distribution of the negative entities. Defaults to 'uniform'. + """ + def __init__(self, dataset, distribution='uniform'): + self.dataset = dataset + + self.hid_field = dataset.head_entity_field + self.tid_field = dataset.tail_entity_field + self.hid_list = dataset.head_entities + self.tid_list = dataset.tail_entities + + self.head_entities = set(dataset.head_entities) + self.entity_num = dataset.entity_num + + super().__init__(distribution=distribution) + + def get_random_list(self): + """ + Returns: + np.ndarray or list: Random list of entity_id. + """ + if self.distribution == 'uniform': + return list(range(1, self.entity_num)) + elif self.distribution == 'popularity': + return list(self.hid_list) + list(self.tid_list) + else: + raise NotImplementedError('Distribution [{}] has not been implemented'.format(self.distribution)) + + def get_used_ids(self): + """ + Returns: + np.ndarray: Used entity_ids is the same as tail_entity_ids in knowledge graph. + Index is head_entity_id, and element is a set of tail_entity_ids. + """ + used_tail_entity_id = np.array([set() for i in range(self.entity_num)]) + for hid, tid in zip(self.hid_list, self.tid_list): + used_tail_entity_id[hid].add(tid) + return used_tail_entity_id + + def sample_by_entity_ids(self, head_entity_ids, num=1): + """Sampling by head_entity_ids. 
+
+        Args:
+            head_entity_ids (np.ndarray or list): Input head_entity_ids.
+            num (int, optional): Number of sampled entity_ids for each head_entity_id. Defaults to ``1``.
+
+        Returns:
+            np.ndarray: Sampled entity_ids.
+            entity_ids[0], entity_ids[len(head_entity_ids)], entity_ids[len(head_entity_ids) * 2], ...,
+            entity_ids[len(head_entity_ids) * (num - 1)] is sampled for head_entity_ids[0];
+            entity_ids[1], entity_ids[len(head_entity_ids) + 1], entity_ids[len(head_entity_ids) * 2 + 1], ...,
+            entity_ids[len(head_entity_ids) * (num - 1) + 1] is sampled for head_entity_ids[1]; ...; and so on.
+        """
+        try:
+            return self.sample_by_key_ids(head_entity_ids, num, self.used_ids[head_entity_ids])
+        except IndexError:
+            for head_entity_id in head_entity_ids:
+                if head_entity_id not in self.head_entities:
+                    raise ValueError('head_entity_id [{}] not exist'.format(head_entity_id))
+
diff --git a/docs/source/developer_guide/customize_trainers.rst b/docs/source/developer_guide/customize_trainers.rst
new file mode 100644
index 000000000..fbd6df60e
--- /dev/null
+++ b/docs/source/developer_guide/customize_trainers.rst
@@ -0,0 +1,104 @@
+Customize Trainers
+======================
+Here, we present how to develop a new trainer and apply it to RecBole.
+For a new model, if the training procedure is complex and the existing trainers cannot be used for training and evaluation,
+then we need to develop a new trainer.
+
+The function used to train the model is :meth:`fit`, which calls :meth:`_train_epoch` to train the model.
+
+The function used to evaluate the model is :meth:`evaluate`, which calls :meth:`_valid_epoch` to evaluate the model.
+
+If the developed model needs a more complex training method,
+one can inherit from :class:`~recbole.trainer.trainer.Trainer`
+and override :meth:`~recbole.trainer.trainer.Trainer.fit` or :meth:`~recbole.trainer.trainer.Trainer._train_epoch`.
+
+If the developed model needs a more complex evaluation method,
+one can inherit from :class:`~recbole.trainer.trainer.Trainer`
+and override :meth:`~recbole.trainer.trainer.Trainer.evaluate` or :meth:`~recbole.trainer.trainer.Trainer._valid_epoch`.
+
+
+Example
+----------------
+Here we present a simple trainer example for alternating optimization,
+where we override the :meth:`~recbole.trainer.trainer.Trainer._train_epoch` method.
+To begin with, we create a new class
+:class:`NewTrainer` based on :class:`~recbole.trainer.trainer.Trainer`.
+
+.. code:: python
+
+    from recbole.trainer import Trainer
+
+    class NewTrainer(Trainer):
+
+        def __init__(self, config, model):
+            super(NewTrainer, self).__init__(config, model)
+
+
+Then we override :meth:`~recbole.trainer.trainer.Trainer._train_epoch`.
+Here, two losses are optimized alternately from epoch to epoch,
+computed by :meth:`calculate_loss1` and :meth:`calculate_loss2`.
+
+.. code:: python
+
+    def _train_epoch(self, train_data, epoch_idx):
+        self.model.train()
+        total_loss = 0.
+
+        if epoch_idx % 2 == 0:
+            for batch_idx, interaction in enumerate(train_data):
+                interaction = interaction.to(self.device)
+                self.optimizer.zero_grad()
+                loss = self.model.calculate_loss1(interaction)
+                self._check_nan(loss)
+                loss.backward()
+                self.optimizer.step()
+                total_loss += loss.item()
+        else:
+            for batch_idx, interaction in enumerate(train_data):
+                interaction = interaction.to(self.device)
+                self.optimizer.zero_grad()
+                loss = self.model.calculate_loss2(interaction)
+                self._check_nan(loss)
+                loss.backward()
+                self.optimizer.step()
+                total_loss += loss.item()
+        return total_loss
+
+
+Complete Code
+^^^^^^^^^^^^^^^^
+
+.. code:: python
+
+    from recbole.trainer import Trainer
+
+    class NewTrainer(Trainer):
+
+        def __init__(self, config, model):
+            super(NewTrainer, self).__init__(config, model)
+
+        def _train_epoch(self, train_data, epoch_idx):
+            self.model.train()
+            total_loss = 0.
+
+            if epoch_idx % 2 == 0:
+                for batch_idx, interaction in enumerate(train_data):
+                    interaction = interaction.to(self.device)
+                    self.optimizer.zero_grad()
+                    loss = self.model.calculate_loss1(interaction)
+                    self._check_nan(loss)
+                    loss.backward()
+                    self.optimizer.step()
+                    total_loss += loss.item()
+            else:
+                for batch_idx, interaction in enumerate(train_data):
+                    interaction = interaction.to(self.device)
+                    self.optimizer.zero_grad()
+                    loss = self.model.calculate_loss2(interaction)
+                    self._check_nan(loss)
+                    loss.backward()
+                    self.optimizer.step()
+                    total_loss += loss.item()
+            return total_loss
+
diff --git a/docs/source/get_started/install.rst b/docs/source/get_started/install.rst
new file mode 100644
index 000000000..b798defcc
--- /dev/null
+++ b/docs/source/get_started/install.rst
@@ -0,0 +1,56 @@
+Install RecBole
+======================
+RecBole can be installed from ``conda``, ``pip`` and source files.
+
+
+System requirements
+------------------------
+RecBole is compatible with the following operating systems:
+
+* Linux
+* Windows 10
+* macOS
+
+Python 3.6 (or later) and torch 1.6.0 (or later) are required to install our library. If you want to use RecBole with GPU,
+please ensure that the CUDA or CUDAToolkit version is 9.2 or later.
+This requires NVIDIA driver version >= 396.26 (for Linux) or >= 397.44 (for Windows 10).
+
+
+Install from conda
+--------------------------
+``conda`` can be installed from `miniconda <https://conda.io/miniconda.html>`_ or
+the full `anaconda <https://www.anaconda.com/download/>`_.
+If you are in China, `Tsinghua Mirrors <https://mirror.tuna.tsinghua.edu.cn/help/anaconda/>`_ is recommended.
+
+After installing ``conda``,
+run ``conda create -n recbole python=3.6`` to create the Python 3.6 conda environment.
+Then the environment can be activated by ``conda activate recbole``.
+At last, run the following command to install RecBole:
+
+.. 
code:: bash
+
+    conda install -c aibox recbole
+
+
+Install from pip
+-------------------------
+To install RecBole from pip, only the following command is needed:
+
+.. code:: bash
+
+    pip install recbole
+
+
+Install from source
+-------------------------
+Download the source files from GitHub:
+
+.. code:: bash
+
+    git clone https://github.com/RUCAIBox/RecBole.git && cd RecBole
+
+Run the following command to install:
+
+.. code:: bash
+
+    pip install -e . --verbose
diff --git a/docs/source/get_started/introduction.rst b/docs/source/get_started/introduction.rst
new file mode 100644
index 000000000..406fa98b7
--- /dev/null
+++ b/docs/source/get_started/introduction.rst
@@ -0,0 +1,31 @@
+Introduction
+==============
+
+RecBole is a unified, comprehensive and efficient framework developed based on PyTorch.
+It aims to help researchers reproduce and develop recommendation models.
+
+In the first release, our library includes 65 recommendation algorithms `[Model List]`_, covering four major categories:
+
+- General Recommendation
+- Sequential Recommendation
+- Context-aware Recommendation
+- Knowledge-based Recommendation
+
+We design a unified and flexible data file format, and provide support for 28 benchmark recommendation datasets `[Collected Datasets]`_. A user can apply the provided script to process the original data copy, or simply download the datasets processed by our team.
+
+Features:
+
+- General and extensible data structure
+  We design general and extensible data structures to unify the formatting and usage of various recommendation datasets.
+- Comprehensive benchmark models and datasets
+  We implement 65 commonly used recommendation algorithms, and provide the formatted copies of 28 recommendation datasets.
+- Efficient GPU-accelerated execution
+  We design many tailored strategies in the GPU environment to enhance the efficiency of our library.
+- Extensive and standard evaluation protocols + We support a series of commonly used evaluation protocols or settings for testing and comparing recommendation algorithms. + +.. _[Collected Datasets]: + /dataset_list.html + +.. _[Model List]: + /model_list.html diff --git a/docs/source/get_started/quick_start.rst b/docs/source/get_started/quick_start.rst new file mode 100644 index 000000000..9a97a8a15 --- /dev/null +++ b/docs/source/get_started/quick_start.rst @@ -0,0 +1,120 @@ +Quick Start +=============== +Here is a quick-start example for using RecBole. + +Quick-start From Source +-------------------------- +With the source code of `RecBole <https://github.com/RUCAIBox/RecBole>`_, +the following script can be used to run a toy example of our library. + +.. code:: bash + + python run_recbole.py + +This script will run the BPR model on the ml-100k dataset. + +Typically, this example takes less than one minute. We will obtain some output like: + +.. code:: none + + INFO ml-100k + The number of users: 944 + Average actions of users: 106.04453870625663 + The number of items: 1683 + Average actions of items: 59.45303210463734 + The number of inters: 100000 + The sparsity of the dataset: 93.70575143257098% + + INFO Evaluation Settings: + Group by user_id + Ordering: {'strategy': 'shuffle'} + Splitting: {'strategy': 'by_ratio', 'ratios': [0.8, 0.1, 0.1]} + Negative Sampling: {'strategy': 'full', 'distribution': 'uniform'} + + INFO BPRMF( + (user_embedding): Embedding(944, 64) + (item_embedding): Embedding(1683, 64) + (loss): BPRLoss() + ) + Trainable parameters: 168128 + + INFO epoch 0 training [time: 0.27s, train loss: 27.7231] + INFO epoch 0 evaluating [time: 0.12s, valid_score: 0.021900] + INFO valid result: + recall@10: 0.0073 mrr@10: 0.0219 ndcg@10: 0.0093 hit@10: 0.0795 precision@10: 0.0088 + + ... 
+ + INFO epoch 63 training [time: 0.19s, train loss: 4.7660] + INFO epoch 63 evaluating [time: 0.08s, valid_score: 0.394500] + INFO valid result: + recall@10: 0.2156 mrr@10: 0.3945 ndcg@10: 0.2332 hit@10: 0.7593 precision@10: 0.1591 + + INFO Finished training, best eval result in epoch 52 + INFO Loading model structure and parameters from saved/***.pth + INFO best valid result: + recall@10: 0.2169 mrr@10: 0.4005 ndcg@10: 0.235 hit@10: 0.7582 precision@10: 0.1598 + INFO test result: + recall@10: 0.2368 mrr@10: 0.4519 ndcg@10: 0.2768 hit@10: 0.7614 precision@10: 0.1901 + +Note that using the quick start pipeline we provide, the original dataset will be divided into training set, validation set and test set by default. +We optimize model parameters on the training set, do parameter selection according to the results on the validation set, +and finally report the results on the test set. + +If you want to change the parameters, such as ``learning_rate``, ``embedding_size``, +just set the additional command parameters as you need: + +.. code:: bash + + python run_recbole.py --learning_rate=0.0001 --embedding_size=128 + + +If you want to change the models, just run the script by setting additional command parameters: + +.. code:: bash + + python run_recbole.py --model=[model_name] + +``model_name`` indicates the model to be initialized. +RecBole has implemented four categories of recommendation algorithms +including general recommendation, context-aware recommendation, +sequential recommendation and knowledge-based recommendation. +More details can be found in :doc:`../user_guide/model_intro`. + + +The datasets can be changed according to :doc:`../user_guide/data_intro`. + + +Quick-start From API +------------------------- +If RecBole is installed from ``pip`` or ``conda``, you can create a new python file (e.g., `run.py`), +and write the following code: + +.. 
code:: python
+
+    from recbole.quick_start import run_recbole
+
+    run_recbole()
+
+
+Then run the following command:
+
+.. code:: bash
+
+    python run.py --dataset=ml-100k --model=BPR
+
+This will perform training and testing of the BPR model on the ml-100k dataset.
+
+One can also run different models, parameters or datasets in a similar way as mentioned above;
+the operations are the same as in `Quick-start From Source`_.
+
+
+In-depth Usage
+-------------------
+For more in-depth usage of RecBole, take a look at:
+
+- :doc:`../user_guide/config_settings`
+- :doc:`../user_guide/data_intro`
+- :doc:`../user_guide/model_intro`
+- :doc:`../user_guide/evaluation_support`
+- :doc:`../user_guide/usage`
diff --git a/docs/source/index.rst b/docs/source/index.rst
new file mode 100644
index 000000000..96c20bbe7
--- /dev/null
+++ b/docs/source/index.rst
@@ -0,0 +1,59 @@
+.. RecBole documentation master file.
+
+RecBole v0.2.0
+=========================================================
+
+`HomePage <https://recbole.io/>`_ | `Docs <https://recbole.io/docs/>`_ | `GitHub <https://github.com/RUCAIBox/RecBole>`_ | `Datasets <https://github.com/RUCAIBox/RecDatasets>`_ | `v0.1.2 </docs/v0.1.2/>`_
+
+.. toctree::
+    :maxdepth: 1
+    :caption: Get Started
+
+    get_started/introduction
+    get_started/install
+    get_started/quick_start
+
+.. toctree::
+    :maxdepth: 1
+    :caption: User Guide
+
+    user_guide/config_settings
+    user_guide/data_intro
+    user_guide/model_intro
+    user_guide/evaluation_support
+    user_guide/usage
+
+
+.. toctree::
+    :maxdepth: 1
+    :caption: Developer Guide
+
+    developer_guide/customize_models
+    developer_guide/customize_trainers
+    developer_guide/customize_dataloaders
+    developer_guide/customize_samplers
+
+
+.. 
toctree:: + :maxdepth: 1 + :caption: API REFERENCE: + + recbole/recbole.config + recbole/recbole.data + recbole/recbole.evaluator + recbole/recbole.model + recbole/recbole.quick_start.quick_start + recbole/recbole.sampler.sampler + recbole/recbole.trainer.hyper_tuning + recbole/recbole.trainer.trainer + recbole/recbole.utils.case_study + recbole/recbole.utils.enum_type + recbole/recbole.utils.logger + recbole/recbole.utils.utils + + +Indices and tables +================== + +* :ref:`genindex` +* :ref:`search` diff --git a/docs/source/recbole/recbole.config.configurator.rst b/docs/source/recbole/recbole.config.configurator.rst new file mode 100644 index 000000000..1818e7650 --- /dev/null +++ b/docs/source/recbole/recbole.config.configurator.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.config.configurator + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.config.eval_setting.rst b/docs/source/recbole/recbole.config.eval_setting.rst new file mode 100644 index 000000000..cd78d6dca --- /dev/null +++ b/docs/source/recbole/recbole.config.eval_setting.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.config.eval_setting + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.config.rst b/docs/source/recbole/recbole.config.rst new file mode 100644 index 000000000..1b58676fc --- /dev/null +++ b/docs/source/recbole/recbole.config.rst @@ -0,0 +1,8 @@ +recbole.config +====================== + +.. toctree:: + :maxdepth: 4 + + recbole.config.configurator + recbole.config.eval_setting diff --git a/docs/source/recbole/recbole.data.dataloader.abstract_dataloader.rst b/docs/source/recbole/recbole.data.dataloader.abstract_dataloader.rst new file mode 100644 index 000000000..a7d6772f3 --- /dev/null +++ b/docs/source/recbole/recbole.data.dataloader.abstract_dataloader.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.data.dataloader.abstract_dataloader + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.data.dataloader.context_dataloader.rst b/docs/source/recbole/recbole.data.dataloader.context_dataloader.rst new file mode 100644 index 000000000..f46d5ee0c --- /dev/null +++ b/docs/source/recbole/recbole.data.dataloader.context_dataloader.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.data.dataloader.context_dataloader + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.data.dataloader.general_dataloader.rst b/docs/source/recbole/recbole.data.dataloader.general_dataloader.rst new file mode 100644 index 000000000..a1ab677fa --- /dev/null +++ b/docs/source/recbole/recbole.data.dataloader.general_dataloader.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.data.dataloader.general_dataloader + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.data.dataloader.knowledge_dataloader.rst b/docs/source/recbole/recbole.data.dataloader.knowledge_dataloader.rst new file mode 100644 index 000000000..fd4eaa083 --- /dev/null +++ b/docs/source/recbole/recbole.data.dataloader.knowledge_dataloader.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.data.dataloader.knowledge_dataloader + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.data.dataloader.neg_sample_mixin.rst b/docs/source/recbole/recbole.data.dataloader.neg_sample_mixin.rst new file mode 100644 index 000000000..67fdd0e93 --- /dev/null +++ b/docs/source/recbole/recbole.data.dataloader.neg_sample_mixin.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.data.dataloader.neg_sample_mixin + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.data.dataloader.rst b/docs/source/recbole/recbole.data.dataloader.rst new file mode 100644 index 000000000..1dc37ef3d --- /dev/null +++ b/docs/source/recbole/recbole.data.dataloader.rst @@ -0,0 +1,13 @@ +recbole.data.dataloader +=============================== + +.. toctree:: + :maxdepth: 4 + + recbole.data.dataloader.abstract_dataloader + recbole.data.dataloader.context_dataloader + recbole.data.dataloader.general_dataloader + recbole.data.dataloader.knowledge_dataloader + recbole.data.dataloader.neg_sample_mixin + recbole.data.dataloader.sequential_dataloader + recbole.data.dataloader.user_dataloader diff --git a/docs/source/recbole/recbole.data.dataloader.sequential_dataloader.rst b/docs/source/recbole/recbole.data.dataloader.sequential_dataloader.rst new file mode 100644 index 000000000..94ab388e9 --- /dev/null +++ b/docs/source/recbole/recbole.data.dataloader.sequential_dataloader.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.data.dataloader.sequential_dataloader + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.data.dataloader.user_dataloader.rst b/docs/source/recbole/recbole.data.dataloader.user_dataloader.rst new file mode 100644 index 000000000..0fdba68de --- /dev/null +++ b/docs/source/recbole/recbole.data.dataloader.user_dataloader.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.data.dataloader.user_dataloader + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.data.dataset.customized_dataset.rst b/docs/source/recbole/recbole.data.dataset.customized_dataset.rst new file mode 100644 index 000000000..e70b27f01 --- /dev/null +++ b/docs/source/recbole/recbole.data.dataset.customized_dataset.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.data.dataset.customized_dataset + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.data.dataset.dataset.rst b/docs/source/recbole/recbole.data.dataset.dataset.rst new file mode 100644 index 000000000..b7174a373 --- /dev/null +++ b/docs/source/recbole/recbole.data.dataset.dataset.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.data.dataset.dataset + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.data.dataset.kg_dataset.rst b/docs/source/recbole/recbole.data.dataset.kg_dataset.rst new file mode 100644 index 000000000..c825654bc --- /dev/null +++ b/docs/source/recbole/recbole.data.dataset.kg_dataset.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.data.dataset.kg_dataset + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.data.dataset.kg_seq_dataset.rst b/docs/source/recbole/recbole.data.dataset.kg_seq_dataset.rst new file mode 100644 index 000000000..44bdec0ca --- /dev/null +++ b/docs/source/recbole/recbole.data.dataset.kg_seq_dataset.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.data.dataset.kg_seq_dataset + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.data.dataset.rst b/docs/source/recbole/recbole.data.dataset.rst new file mode 100644 index 000000000..64d432fae --- /dev/null +++ b/docs/source/recbole/recbole.data.dataset.rst @@ -0,0 +1,12 @@ +recbole.data.dataset +============================ + +.. 
toctree:: + :maxdepth: 4 + + recbole.data.dataset.customized_dataset + recbole.data.dataset.dataset + recbole.data.dataset.kg_dataset + recbole.data.dataset.kg_seq_dataset + recbole.data.dataset.sequential_dataset + recbole.data.dataset.social_dataset diff --git a/docs/source/recbole/recbole.data.dataset.sequential_dataset.rst b/docs/source/recbole/recbole.data.dataset.sequential_dataset.rst new file mode 100644 index 000000000..158a3d393 --- /dev/null +++ b/docs/source/recbole/recbole.data.dataset.sequential_dataset.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.data.dataset.sequential_dataset + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.data.dataset.social_dataset.rst b/docs/source/recbole/recbole.data.dataset.social_dataset.rst new file mode 100644 index 000000000..47db01c07 --- /dev/null +++ b/docs/source/recbole/recbole.data.dataset.social_dataset.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.data.dataset.social_dataset + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.data.interaction.rst b/docs/source/recbole/recbole.data.interaction.rst new file mode 100644 index 000000000..f669c89f8 --- /dev/null +++ b/docs/source/recbole/recbole.data.interaction.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.data.interaction + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.data.rst b/docs/source/recbole/recbole.data.rst new file mode 100644 index 000000000..3996bb35c --- /dev/null +++ b/docs/source/recbole/recbole.data.rst @@ -0,0 +1,10 @@ +recbole.data +==================== + +.. toctree:: + :maxdepth: 4 + + recbole.data.dataloader + recbole.data.dataset + recbole.data.interaction + recbole.data.utils diff --git a/docs/source/recbole/recbole.data.utils.rst b/docs/source/recbole/recbole.data.utils.rst new file mode 100644 index 000000000..8b20deadf --- /dev/null +++ b/docs/source/recbole/recbole.data.utils.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.data.utils + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.evaluator.abstract_evaluator.rst b/docs/source/recbole/recbole.evaluator.abstract_evaluator.rst new file mode 100644 index 000000000..66de59e49 --- /dev/null +++ b/docs/source/recbole/recbole.evaluator.abstract_evaluator.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.evaluator.abstract_evaluator + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.evaluator.evaluators.rst b/docs/source/recbole/recbole.evaluator.evaluators.rst new file mode 100644 index 000000000..8aeca9804 --- /dev/null +++ b/docs/source/recbole/recbole.evaluator.evaluators.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.evaluator.evaluators + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.evaluator.metrics.rst b/docs/source/recbole/recbole.evaluator.metrics.rst new file mode 100644 index 000000000..2ef968131 --- /dev/null +++ b/docs/source/recbole/recbole.evaluator.metrics.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.evaluator.metrics + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.evaluator.proxy_evaluator.rst b/docs/source/recbole/recbole.evaluator.proxy_evaluator.rst new file mode 100644 index 000000000..c4689d43a --- /dev/null +++ b/docs/source/recbole/recbole.evaluator.proxy_evaluator.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.evaluator.proxy_evaluator + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.evaluator.rst b/docs/source/recbole/recbole.evaluator.rst new file mode 100644 index 000000000..0f3c99793 --- /dev/null +++ b/docs/source/recbole/recbole.evaluator.rst @@ -0,0 +1,11 @@ +recbole.evaluator +========================= + +.. 
toctree:: + :maxdepth: 4 + + recbole.evaluator.abstract_evaluator + recbole.evaluator.evaluators + recbole.evaluator.metrics + recbole.evaluator.proxy_evaluator + recbole.evaluator.utils diff --git a/docs/source/recbole/recbole.evaluator.utils.rst b/docs/source/recbole/recbole.evaluator.utils.rst new file mode 100644 index 000000000..d57b0fa71 --- /dev/null +++ b/docs/source/recbole/recbole.evaluator.utils.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.evaluator.utils + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.abstract_recommender.rst b/docs/source/recbole/recbole.model.abstract_recommender.rst new file mode 100644 index 000000000..a346344cc --- /dev/null +++ b/docs/source/recbole/recbole.model.abstract_recommender.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.abstract_recommender + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.context_aware_recommender.afm.rst b/docs/source/recbole/recbole.model.context_aware_recommender.afm.rst new file mode 100644 index 000000000..fe80d513d --- /dev/null +++ b/docs/source/recbole/recbole.model.context_aware_recommender.afm.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.context_aware_recommender.afm + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.context_aware_recommender.autoint.rst b/docs/source/recbole/recbole.model.context_aware_recommender.autoint.rst new file mode 100644 index 000000000..9e914fede --- /dev/null +++ b/docs/source/recbole/recbole.model.context_aware_recommender.autoint.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.model.context_aware_recommender.autoint + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.context_aware_recommender.dcn.rst b/docs/source/recbole/recbole.model.context_aware_recommender.dcn.rst new file mode 100644 index 000000000..0aac91819 --- /dev/null +++ b/docs/source/recbole/recbole.model.context_aware_recommender.dcn.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.context_aware_recommender.dcn + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.context_aware_recommender.deepfm.rst b/docs/source/recbole/recbole.model.context_aware_recommender.deepfm.rst new file mode 100644 index 000000000..356462127 --- /dev/null +++ b/docs/source/recbole/recbole.model.context_aware_recommender.deepfm.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.context_aware_recommender.deepfm + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.context_aware_recommender.dssm.rst b/docs/source/recbole/recbole.model.context_aware_recommender.dssm.rst new file mode 100644 index 000000000..0c2bb69db --- /dev/null +++ b/docs/source/recbole/recbole.model.context_aware_recommender.dssm.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.context_aware_recommender.dssm + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.context_aware_recommender.ffm.rst b/docs/source/recbole/recbole.model.context_aware_recommender.ffm.rst new file mode 100644 index 000000000..cece0e842 --- /dev/null +++ b/docs/source/recbole/recbole.model.context_aware_recommender.ffm.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.model.context_aware_recommender.ffm + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.context_aware_recommender.fm.rst b/docs/source/recbole/recbole.model.context_aware_recommender.fm.rst new file mode 100644 index 000000000..93d6eba0a --- /dev/null +++ b/docs/source/recbole/recbole.model.context_aware_recommender.fm.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.context_aware_recommender.fm + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.context_aware_recommender.fnn.rst b/docs/source/recbole/recbole.model.context_aware_recommender.fnn.rst new file mode 100644 index 000000000..f884fe80b --- /dev/null +++ b/docs/source/recbole/recbole.model.context_aware_recommender.fnn.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.context_aware_recommender.fnn + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.context_aware_recommender.fwfm.rst b/docs/source/recbole/recbole.model.context_aware_recommender.fwfm.rst new file mode 100644 index 000000000..d776a3fb0 --- /dev/null +++ b/docs/source/recbole/recbole.model.context_aware_recommender.fwfm.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.context_aware_recommender.fwfm + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.context_aware_recommender.lr.rst b/docs/source/recbole/recbole.model.context_aware_recommender.lr.rst new file mode 100644 index 000000000..d64ba2088 --- /dev/null +++ b/docs/source/recbole/recbole.model.context_aware_recommender.lr.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.model.context_aware_recommender.lr + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.context_aware_recommender.nfm.rst b/docs/source/recbole/recbole.model.context_aware_recommender.nfm.rst new file mode 100644 index 000000000..15cf09969 --- /dev/null +++ b/docs/source/recbole/recbole.model.context_aware_recommender.nfm.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.context_aware_recommender.nfm + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.context_aware_recommender.pnn.rst b/docs/source/recbole/recbole.model.context_aware_recommender.pnn.rst new file mode 100644 index 000000000..d43f4f08d --- /dev/null +++ b/docs/source/recbole/recbole.model.context_aware_recommender.pnn.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.context_aware_recommender.pnn + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.context_aware_recommender.rst b/docs/source/recbole/recbole.model.context_aware_recommender.rst new file mode 100644 index 000000000..aeb5c3c5e --- /dev/null +++ b/docs/source/recbole/recbole.model.context_aware_recommender.rst @@ -0,0 +1,20 @@ +recbole.model.context\_aware\_recommender +================================================= + +.. 
toctree:: + :maxdepth: 4 + + recbole.model.context_aware_recommender.afm + recbole.model.context_aware_recommender.autoint + recbole.model.context_aware_recommender.dcn + recbole.model.context_aware_recommender.deepfm + recbole.model.context_aware_recommender.dssm + recbole.model.context_aware_recommender.ffm + recbole.model.context_aware_recommender.fm + recbole.model.context_aware_recommender.fnn + recbole.model.context_aware_recommender.fwfm + recbole.model.context_aware_recommender.lr + recbole.model.context_aware_recommender.nfm + recbole.model.context_aware_recommender.pnn + recbole.model.context_aware_recommender.widedeep + recbole.model.context_aware_recommender.xdeepfm diff --git a/docs/source/recbole/recbole.model.context_aware_recommender.widedeep.rst b/docs/source/recbole/recbole.model.context_aware_recommender.widedeep.rst new file mode 100644 index 000000000..8bcb6834d --- /dev/null +++ b/docs/source/recbole/recbole.model.context_aware_recommender.widedeep.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.context_aware_recommender.widedeep + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.context_aware_recommender.xdeepfm.rst b/docs/source/recbole/recbole.model.context_aware_recommender.xdeepfm.rst new file mode 100644 index 000000000..8e64f67dc --- /dev/null +++ b/docs/source/recbole/recbole.model.context_aware_recommender.xdeepfm.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.context_aware_recommender.xdeepfm + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.exlib_recommender.rst b/docs/source/recbole/recbole.model.exlib_recommender.rst new file mode 100644 index 000000000..d7d03911f --- /dev/null +++ b/docs/source/recbole/recbole.model.exlib_recommender.rst @@ -0,0 +1,7 @@ +recbole.model.exlib\_recommender +============================================= + +.. 
toctree:: + :maxdepth: 4 + + recbole.model.exlib_recommender.xgboost diff --git a/docs/source/recbole/recbole.model.exlib_recommender.xgboost.rst b/docs/source/recbole/recbole.model.exlib_recommender.xgboost.rst new file mode 100644 index 000000000..fbbfc2b6f --- /dev/null +++ b/docs/source/recbole/recbole.model.exlib_recommender.xgboost.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.exlib_recommender.xgboost + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.general_recommender.bpr.rst b/docs/source/recbole/recbole.model.general_recommender.bpr.rst new file mode 100644 index 000000000..0eb41563d --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.bpr.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.general_recommender.bpr + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.general_recommender.cdae.rst b/docs/source/recbole/recbole.model.general_recommender.cdae.rst new file mode 100644 index 000000000..5ec3b7dec --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.cdae.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.general_recommender.cdae + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.general_recommender.convncf.rst b/docs/source/recbole/recbole.model.general_recommender.convncf.rst new file mode 100644 index 000000000..ee388d326 --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.convncf.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.general_recommender.convncf + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.general_recommender.dgcf.rst b/docs/source/recbole/recbole.model.general_recommender.dgcf.rst new file mode 100644 index 000000000..6551d7966 --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.dgcf.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.model.general_recommender.dgcf + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.general_recommender.dmf.rst b/docs/source/recbole/recbole.model.general_recommender.dmf.rst new file mode 100644 index 000000000..499706b1a --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.dmf.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.general_recommender.dmf + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.general_recommender.fism.rst b/docs/source/recbole/recbole.model.general_recommender.fism.rst new file mode 100644 index 000000000..c706123e5 --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.fism.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.general_recommender.fism + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.general_recommender.gcmc.rst b/docs/source/recbole/recbole.model.general_recommender.gcmc.rst new file mode 100644 index 000000000..27a234987 --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.gcmc.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.general_recommender.gcmc + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.general_recommender.itemknn.rst b/docs/source/recbole/recbole.model.general_recommender.itemknn.rst new file mode 100644 index 000000000..aabf3fd97 --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.itemknn.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.general_recommender.itemknn + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.general_recommender.lightgcn.rst b/docs/source/recbole/recbole.model.general_recommender.lightgcn.rst new file mode 100644 index 000000000..001bcc62e --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.lightgcn.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.model.general_recommender.lightgcn + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.general_recommender.line.rst b/docs/source/recbole/recbole.model.general_recommender.line.rst new file mode 100644 index 000000000..1d1da4ea9 --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.line.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.general_recommender.line + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.general_recommender.macridvae.rst b/docs/source/recbole/recbole.model.general_recommender.macridvae.rst new file mode 100644 index 000000000..f1363d7a6 --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.macridvae.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.general_recommender.macridvae + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.general_recommender.multidae.rst b/docs/source/recbole/recbole.model.general_recommender.multidae.rst new file mode 100644 index 000000000..becbaba13 --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.multidae.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.general_recommender.multidae + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.general_recommender.multivae.rst b/docs/source/recbole/recbole.model.general_recommender.multivae.rst new file mode 100644 index 000000000..0888bad2c --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.multivae.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.model.general_recommender.multivae + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.general_recommender.nais.rst b/docs/source/recbole/recbole.model.general_recommender.nais.rst new file mode 100644 index 000000000..b3d30cab1 --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.nais.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.general_recommender.nais + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.general_recommender.neumf.rst b/docs/source/recbole/recbole.model.general_recommender.neumf.rst new file mode 100644 index 000000000..48173b93f --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.neumf.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.general_recommender.neumf + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.general_recommender.ngcf.rst b/docs/source/recbole/recbole.model.general_recommender.ngcf.rst new file mode 100644 index 000000000..12703290b --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.ngcf.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.general_recommender.ngcf + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.general_recommender.pop.rst b/docs/source/recbole/recbole.model.general_recommender.pop.rst new file mode 100644 index 000000000..8e32fc007 --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.pop.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.model.general_recommender.pop + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.general_recommender.rst b/docs/source/recbole/recbole.model.general_recommender.rst new file mode 100644 index 000000000..b4cc32610 --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.rst @@ -0,0 +1,24 @@ +recbole.model.general\_recommender +========================================== + +.. toctree:: + :maxdepth: 4 + + recbole.model.general_recommender.bpr + recbole.model.general_recommender.cdae + recbole.model.general_recommender.convncf + recbole.model.general_recommender.dgcf + recbole.model.general_recommender.dmf + recbole.model.general_recommender.fism + recbole.model.general_recommender.gcmc + recbole.model.general_recommender.itemknn + recbole.model.general_recommender.lightgcn + recbole.model.general_recommender.line + recbole.model.general_recommender.macridvae + recbole.model.general_recommender.multidae + recbole.model.general_recommender.multivae + recbole.model.general_recommender.nais + recbole.model.general_recommender.neumf + recbole.model.general_recommender.ngcf + recbole.model.general_recommender.pop + recbole.model.general_recommender.spectralcf diff --git a/docs/source/recbole/recbole.model.general_recommender.spectralcf.rst b/docs/source/recbole/recbole.model.general_recommender.spectralcf.rst new file mode 100644 index 000000000..209accaa2 --- /dev/null +++ b/docs/source/recbole/recbole.model.general_recommender.spectralcf.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.general_recommender.spectralcf + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.init.rst b/docs/source/recbole/recbole.model.init.rst new file mode 100644 index 000000000..e7afaeb72 --- /dev/null +++ b/docs/source/recbole/recbole.model.init.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.model.init + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.knowledge_aware_recommender.cfkg.rst b/docs/source/recbole/recbole.model.knowledge_aware_recommender.cfkg.rst new file mode 100644 index 000000000..46f6fe493 --- /dev/null +++ b/docs/source/recbole/recbole.model.knowledge_aware_recommender.cfkg.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.knowledge_aware_recommender.cfkg + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.knowledge_aware_recommender.cke.rst b/docs/source/recbole/recbole.model.knowledge_aware_recommender.cke.rst new file mode 100644 index 000000000..7ada3c3d9 --- /dev/null +++ b/docs/source/recbole/recbole.model.knowledge_aware_recommender.cke.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.knowledge_aware_recommender.cke + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.knowledge_aware_recommender.kgat.rst b/docs/source/recbole/recbole.model.knowledge_aware_recommender.kgat.rst new file mode 100644 index 000000000..5387132b5 --- /dev/null +++ b/docs/source/recbole/recbole.model.knowledge_aware_recommender.kgat.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.knowledge_aware_recommender.kgat + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.knowledge_aware_recommender.kgcn.rst b/docs/source/recbole/recbole.model.knowledge_aware_recommender.kgcn.rst new file mode 100644 index 000000000..d8e5dd177 --- /dev/null +++ b/docs/source/recbole/recbole.model.knowledge_aware_recommender.kgcn.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.model.knowledge_aware_recommender.kgcn + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.knowledge_aware_recommender.kgnnls.rst b/docs/source/recbole/recbole.model.knowledge_aware_recommender.kgnnls.rst new file mode 100644 index 000000000..450e497c0 --- /dev/null +++ b/docs/source/recbole/recbole.model.knowledge_aware_recommender.kgnnls.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.knowledge_aware_recommender.kgnnls + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.knowledge_aware_recommender.ktup.rst b/docs/source/recbole/recbole.model.knowledge_aware_recommender.ktup.rst new file mode 100644 index 000000000..83f316b52 --- /dev/null +++ b/docs/source/recbole/recbole.model.knowledge_aware_recommender.ktup.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.knowledge_aware_recommender.ktup + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.knowledge_aware_recommender.mkr.rst b/docs/source/recbole/recbole.model.knowledge_aware_recommender.mkr.rst new file mode 100644 index 000000000..13af4b806 --- /dev/null +++ b/docs/source/recbole/recbole.model.knowledge_aware_recommender.mkr.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.knowledge_aware_recommender.mkr + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.knowledge_aware_recommender.ripplenet.rst b/docs/source/recbole/recbole.model.knowledge_aware_recommender.ripplenet.rst new file mode 100644 index 000000000..da6d7f790 --- /dev/null +++ b/docs/source/recbole/recbole.model.knowledge_aware_recommender.ripplenet.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.model.knowledge_aware_recommender.ripplenet + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.knowledge_aware_recommender.rst b/docs/source/recbole/recbole.model.knowledge_aware_recommender.rst new file mode 100644 index 000000000..fd025e998 --- /dev/null +++ b/docs/source/recbole/recbole.model.knowledge_aware_recommender.rst @@ -0,0 +1,14 @@ +recbole.model.knowledge\_aware\_recommender +=================================================== + +.. toctree:: + :maxdepth: 4 + + recbole.model.knowledge_aware_recommender.cfkg + recbole.model.knowledge_aware_recommender.cke + recbole.model.knowledge_aware_recommender.kgat + recbole.model.knowledge_aware_recommender.kgcn + recbole.model.knowledge_aware_recommender.kgnnls + recbole.model.knowledge_aware_recommender.ktup + recbole.model.knowledge_aware_recommender.mkr + recbole.model.knowledge_aware_recommender.ripplenet diff --git a/docs/source/recbole/recbole.model.layers.rst b/docs/source/recbole/recbole.model.layers.rst new file mode 100644 index 000000000..d4ee82d6e --- /dev/null +++ b/docs/source/recbole/recbole.model.layers.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.layers + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.loss.rst b/docs/source/recbole/recbole.model.loss.rst new file mode 100644 index 000000000..9876f53e5 --- /dev/null +++ b/docs/source/recbole/recbole.model.loss.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.loss + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.rst b/docs/source/recbole/recbole.model.rst new file mode 100644 index 000000000..5b83fbbde --- /dev/null +++ b/docs/source/recbole/recbole.model.rst @@ -0,0 +1,15 @@ +recbole.model +===================== + +.. 
toctree:: + :maxdepth: 4 + + recbole.model.context_aware_recommender + recbole.model.exlib_recommender + recbole.model.general_recommender + recbole.model.knowledge_aware_recommender + recbole.model.sequential_recommender + recbole.model.abstract_recommender + recbole.model.init + recbole.model.layers + recbole.model.loss diff --git a/docs/source/recbole/recbole.model.sequential_recommender.bert4rec.rst b/docs/source/recbole/recbole.model.sequential_recommender.bert4rec.rst new file mode 100644 index 000000000..271b09a48 --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.bert4rec.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.bert4rec + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.caser.rst b/docs/source/recbole/recbole.model.sequential_recommender.caser.rst new file mode 100644 index 000000000..8428fa2a4 --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.caser.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.caser + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.din.rst b/docs/source/recbole/recbole.model.sequential_recommender.din.rst new file mode 100644 index 000000000..652222c6e --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.din.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.din + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.fdsa.rst b/docs/source/recbole/recbole.model.sequential_recommender.fdsa.rst new file mode 100644 index 000000000..b3638d5b2 --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.fdsa.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.model.sequential_recommender.fdsa + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.fossil.rst b/docs/source/recbole/recbole.model.sequential_recommender.fossil.rst new file mode 100644 index 000000000..0b0bafbfb --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.fossil.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.fossil + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.fpmc.rst b/docs/source/recbole/recbole.model.sequential_recommender.fpmc.rst new file mode 100644 index 000000000..93d3af95c --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.fpmc.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.fpmc + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.gcsan.rst b/docs/source/recbole/recbole.model.sequential_recommender.gcsan.rst new file mode 100644 index 000000000..7bde80be3 --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.gcsan.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.gcsan + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.gru4rec.rst b/docs/source/recbole/recbole.model.sequential_recommender.gru4rec.rst new file mode 100644 index 000000000..8fadecf8e --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.gru4rec.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.model.sequential_recommender.gru4rec + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.gru4recf.rst b/docs/source/recbole/recbole.model.sequential_recommender.gru4recf.rst new file mode 100644 index 000000000..0c2ee9d4c --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.gru4recf.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.gru4recf + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.gru4reckg.rst b/docs/source/recbole/recbole.model.sequential_recommender.gru4reckg.rst new file mode 100644 index 000000000..a5a8e4724 --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.gru4reckg.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.gru4reckg + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.hgn.rst b/docs/source/recbole/recbole.model.sequential_recommender.hgn.rst new file mode 100644 index 000000000..510849766 --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.hgn.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.hgn + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.hrm.rst b/docs/source/recbole/recbole.model.sequential_recommender.hrm.rst new file mode 100644 index 000000000..d4c7c9db0 --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.hrm.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.model.sequential_recommender.hrm + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.ksr.rst b/docs/source/recbole/recbole.model.sequential_recommender.ksr.rst new file mode 100644 index 000000000..99442a674 --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.ksr.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.ksr + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.narm.rst b/docs/source/recbole/recbole.model.sequential_recommender.narm.rst new file mode 100644 index 000000000..15e52ffdb --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.narm.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.narm + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.nextitnet.rst b/docs/source/recbole/recbole.model.sequential_recommender.nextitnet.rst new file mode 100644 index 000000000..a6e917b59 --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.nextitnet.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.nextitnet + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.npe.rst b/docs/source/recbole/recbole.model.sequential_recommender.npe.rst new file mode 100644 index 000000000..9a56fa28f --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.npe.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.model.sequential_recommender.npe + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.repeatnet.rst b/docs/source/recbole/recbole.model.sequential_recommender.repeatnet.rst new file mode 100644 index 000000000..f0055ec7a --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.repeatnet.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.repeatnet + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.rst b/docs/source/recbole/recbole.model.sequential_recommender.rst new file mode 100644 index 000000000..258674a8b --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.rst @@ -0,0 +1,30 @@ +recbole.model.sequential\_recommender +============================================= + +.. toctree:: + :maxdepth: 4 + + recbole.model.sequential_recommender.bert4rec + recbole.model.sequential_recommender.caser + recbole.model.sequential_recommender.din + recbole.model.sequential_recommender.fdsa + recbole.model.sequential_recommender.fossil + recbole.model.sequential_recommender.fpmc + recbole.model.sequential_recommender.gcsan + recbole.model.sequential_recommender.gru4rec + recbole.model.sequential_recommender.gru4recf + recbole.model.sequential_recommender.gru4reckg + recbole.model.sequential_recommender.hgn + recbole.model.sequential_recommender.hrm + recbole.model.sequential_recommender.ksr + recbole.model.sequential_recommender.narm + recbole.model.sequential_recommender.nextitnet + recbole.model.sequential_recommender.npe + recbole.model.sequential_recommender.repeatnet + recbole.model.sequential_recommender.s3rec + recbole.model.sequential_recommender.sasrec + recbole.model.sequential_recommender.sasrecf + recbole.model.sequential_recommender.shan + recbole.model.sequential_recommender.srgnn + recbole.model.sequential_recommender.stamp + 
recbole.model.sequential_recommender.transrec diff --git a/docs/source/recbole/recbole.model.sequential_recommender.s3rec.rst b/docs/source/recbole/recbole.model.sequential_recommender.s3rec.rst new file mode 100644 index 000000000..c9392886f --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.s3rec.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.s3rec + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.sasrec.rst b/docs/source/recbole/recbole.model.sequential_recommender.sasrec.rst new file mode 100644 index 000000000..2c6a563ed --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.sasrec.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.sasrec + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.sasrecf.rst b/docs/source/recbole/recbole.model.sequential_recommender.sasrecf.rst new file mode 100644 index 000000000..0ed23675d --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.sasrecf.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.sasrecf + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.shan.rst b/docs/source/recbole/recbole.model.sequential_recommender.shan.rst new file mode 100644 index 000000000..6ced74f83 --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.shan.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.model.sequential_recommender.shan + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.srgnn.rst b/docs/source/recbole/recbole.model.sequential_recommender.srgnn.rst new file mode 100644 index 000000000..76201dbff --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.srgnn.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.srgnn + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.stamp.rst b/docs/source/recbole/recbole.model.sequential_recommender.stamp.rst new file mode 100644 index 000000000..eea454975 --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.stamp.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.stamp + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.model.sequential_recommender.transrec.rst b/docs/source/recbole/recbole.model.sequential_recommender.transrec.rst new file mode 100644 index 000000000..d4b44d0f8 --- /dev/null +++ b/docs/source/recbole/recbole.model.sequential_recommender.transrec.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.model.sequential_recommender.transrec + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.quick_start.quick_start.rst b/docs/source/recbole/recbole.quick_start.quick_start.rst new file mode 100644 index 000000000..da62b7bdc --- /dev/null +++ b/docs/source/recbole/recbole.quick_start.quick_start.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.quick_start.quick_start + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.sampler.sampler.rst b/docs/source/recbole/recbole.sampler.sampler.rst new file mode 100644 index 000000000..30f94ef93 --- /dev/null +++ b/docs/source/recbole/recbole.sampler.sampler.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.sampler.sampler + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.trainer.hyper_tuning.rst b/docs/source/recbole/recbole.trainer.hyper_tuning.rst new file mode 100644 index 000000000..347f549e4 --- /dev/null +++ b/docs/source/recbole/recbole.trainer.hyper_tuning.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.trainer.hyper_tuning + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.trainer.trainer.rst b/docs/source/recbole/recbole.trainer.trainer.rst new file mode 100644 index 000000000..db0f69d84 --- /dev/null +++ b/docs/source/recbole/recbole.trainer.trainer.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.trainer.trainer + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.utils.case_study.rst b/docs/source/recbole/recbole.utils.case_study.rst new file mode 100644 index 000000000..3f6570eae --- /dev/null +++ b/docs/source/recbole/recbole.utils.case_study.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.utils.case_study + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.utils.enum_type.rst b/docs/source/recbole/recbole.utils.enum_type.rst new file mode 100644 index 000000000..9d8483655 --- /dev/null +++ b/docs/source/recbole/recbole.utils.enum_type.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.utils.enum_type + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.utils.logger.rst b/docs/source/recbole/recbole.utils.logger.rst new file mode 100644 index 000000000..d3bd2975d --- /dev/null +++ b/docs/source/recbole/recbole.utils.logger.rst @@ -0,0 +1,4 @@ +.. 
automodule:: recbole.utils.logger + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/recbole/recbole.utils.utils.rst b/docs/source/recbole/recbole.utils.utils.rst new file mode 100644 index 000000000..9e9fd62d3 --- /dev/null +++ b/docs/source/recbole/recbole.utils.utils.rst @@ -0,0 +1,4 @@ +.. automodule:: recbole.utils.utils + :members: + :undoc-members: + :show-inheritance: diff --git a/docs/source/user_guide/config_settings.rst b/docs/source/user_guide/config_settings.rst new file mode 100644 index 000000000..062391967 --- /dev/null +++ b/docs/source/user_guide/config_settings.rst @@ -0,0 +1,252 @@ +Config Settings +=================== +RecBole is able to config different parameters for controlling the experiment +setup (e.g., data processing, data splitting, training and evaluation). +The users can select the settings according to their own requirements. + +The introduction of different parameter configurations are presented as follows: + +Parameters Introduction +----------------------------- +The parameters in RecBole can be divided into three categories: +Basic Parameters, Dataset Parameters and Model Parameters. + +Basic Parameters +^^^^^^^^^^^^^^^^^^^^^^ +Basic parameters are used to build the general environment including the settings for +model training and evaluation. + +**Environment Setting** + +- ``gpu_id (int or str)`` : The id of GPU device. Defaults to ``0``. +- ``use_gpu (bool)`` : Whether or not to use GPU. If True, using GPU, else using CPU. + Defaults to ``True``. +- ``seed (int)`` : Random seed. Defaults to ``2020``. +- ``state (str)`` : Logging level. Defaults to ``'INFO'``. + Range in ``['INFO', 'DEBUG', 'WARNING', 'ERROR', 'CRITICAL']``. +- ``reproducibility (bool)`` : If True, the tool will use deterministic + convolution algorithms, which makes the result reproducible. 
If False, + the tool will benchmark multiple convolution algorithms and select the fastest one, + which makes the result not reproducible but can speed up model training in + some cases. Defaults to ``True``. +- ``data_path (str)`` : The path of the input dataset. Defaults to ``'dataset/'``. +- ``checkpoint_dir (str)`` : The path to save checkpoint files. + Defaults to ``'saved/'``. +- ``show_progress (bool)`` : Whether to show the progress of training and evaluation epochs. + Defaults to ``True``. + +**Training Setting** + +- ``epochs (int)`` : The number of training epochs. Defaults to ``300``. +- ``train_batch_size (int)`` : The training batch size. Defaults to ``2048``. +- ``learner (str)`` : The name of the optimizer. Defaults to ``'adam'``. + Range in ``['adam', 'sgd', 'adagrad', 'rmsprop', 'sparse_adam']``. +- ``learning_rate (float)`` : Learning rate. Defaults to ``0.001``. +- ``training_neg_sample_num (int)`` : The number of negative samples during + training. If it is set to 0, the negative sampling operation will not be + performed. Defaults to ``1``. +- ``training_neg_sample_distribution (str)`` : Distribution of the negative items + in the training phase. Defaults to ``uniform``. Range in ``['uniform', 'popularity']``. +- ``eval_step (int)`` : The number of training epochs before an evaluation + on the valid dataset. If it is less than 1, the model will not be + evaluated on the valid dataset. Defaults to ``1``. +- ``stopping_step (int)`` : The threshold for validation-based early stopping. + Defaults to ``10``. +- ``clip_grad_norm (dict)`` : The args of `clip_grad_norm_ <https://pytorch.org/docs/stable/generated/torch.nn.utils.clip_grad_norm_.html>`_, + which clips the gradient norm of the model. Defaults to ``None``. +- ``loss_decimal_place (int)``: The decimal place of the training loss. Defaults to ``4``. +- ``weight_decay (float)`` : Weight decay (L2 penalty), used for the `optimizer <https://pytorch.org/docs/stable/optim.html?highlight=weight_decay>`_. Defaults to ``0.0``.
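The ``stopping_step`` mechanism above can be sketched in plain Python: training stops once the validation metric has failed to improve for more than ``stopping_step`` consecutive evaluations. The following is a simplified illustration of this idea; the helper name and signature are assumptions for this sketch, not RecBole's actual API.

```python
def early_stopping(value, best, cur_step, max_step, bigger=True):
    """One update step of validation-based early stopping.

    Returns (best, cur_step, stop_flag, update_flag), where stop_flag
    becomes True after more than max_step evaluations without improvement.
    """
    stop_flag, update_flag = False, False
    improved = value > best if bigger else value < best
    if improved:
        # New best score: reset the patience counter.
        best, cur_step, update_flag = value, 0, True
    else:
        cur_step += 1
        if cur_step > max_step:
            stop_flag = True
    return best, cur_step, stop_flag, update_flag


# Simulate a run with stopping_step = 2 (hypothetical validation scores).
best, cur_step, stop = float("-inf"), 0, False
for score in [0.10, 0.12, 0.11, 0.11, 0.11]:
    best, cur_step, stop, _ = early_stopping(score, best, cur_step, max_step=2)
    if stop:
        break
print(best, stop)  # 0.12 True
```

The same counter is reset every time ``valid_metric`` improves, which is why a larger ``stopping_step`` tolerates longer plateaus before stopping.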
+ + +**Evaluation Setting** + +- ``eval_setting (str)``: The evaluation settings. Defaults to ``'RO_RS,full'``. + The parameter has two parts. The first part controls the splitting methods, + the range is ``['RO_RS','TO_LS','RO_LS','TO_RS']``. The second part (optional) + controls the ranking mechanism, the range is ``['full','uni100','uni1000','pop100','pop1000']``. +- ``group_by_user (bool)``: Whether or not to group the users. + It must be ``True`` when ``eval_setting`` is in ``['RO_LS', 'TO_LS']``. + Defaults to ``True``. +- ``split_ratio (list)``: The split ratio between train data, valid data and + test data. It only takes effect when the first part of ``eval_setting`` + is in ``['RO_RS', 'TO_RS']``. Defaults to ``[0.8, 0.1, 0.1]``. +- ``leave_one_num (int)``: It only takes effect when the first part of + ``eval_setting`` is in ``['RO_LS', 'TO_LS']``. Defaults to ``2``. + +- ``metrics (list or str)``: Evaluation metrics. Defaults to + ``['Recall', 'MRR', 'NDCG', 'Hit', 'Precision']``. Range in + ``['Recall', 'MRR', 'NDCG', 'Hit', 'MAP', 'Precision', 'AUC', 'GAUC', + 'MAE', 'RMSE', 'LogLoss']``. +- ``topk (list or int or None)``: The value of k for topk evaluation metrics. + Defaults to ``10``. +- ``valid_metric (str)``: The evaluation metric used for early stopping. + It must be one of the used ``metrics``. Defaults to ``'MRR@10'``. +- ``eval_batch_size (int)``: The evaluation batch size. Defaults to ``4096``. +- ``metric_decimal_place (int)``: The decimal place of metric scores. Defaults to ``4``. + +Please refer to :doc:`evaluation_support` for more details about the parameters +in Evaluation Setting. + +Dataset Parameters +^^^^^^^^^^^^^^^^^^^^^^^ +Dataset Parameters are used to describe the dataset information and control +the dataset loading and filtering. + +Please refer to :doc:`data/data_args` for more details. + +Model Parameters +^^^^^^^^^^^^^^^^^^^^^ +Model Parameters are used to describe the model structures.
+ +Please refer to :doc:`model_intro` for more details. + + +Parameters Configuration +------------------------------ +RecBole supports three types of parameter configurations: Config files, +Parameter Dicts and Command Line. The parameters are assigned via the +Configuration module. + +Config Files +^^^^^^^^^^^^^^^^ +Config Files should be organized in the format of yaml. +The users should write their parameters according to the rules aligned with +yaml, and the final config files are processed by the configuration module +to complete the parameter settings. + +To begin with, we write the parameters into the yaml files (e.g. `example.yaml`). + +.. code:: yaml + + gpu_id: 1 + training_batch_size: 1024 + +Then, the yaml files are conveyed to the configuration module to finish the +parameter settings. + +.. code:: python + + from recbole.config import Config + + config = Config(model='BPR', dataset='ml-100k', config_file_list=['example.yaml']) + print('gpu_id: ', config['gpu_id']) + print('training_batch_size: ', config['training_batch_size']) + + +output: + +.. code:: bash + + gpu_id: 1 + training_batch_size: 1024 + +The parameter ``config_file_list`` supports multiple yaml files. + +For more details on yaml, please refer to YAML_. + +.. _YAML: https://yaml.org/ + +When using our toolkit, the parameters belonging to **Dataset parameters** and +Evaluation Settings of **Basic Parameters** are recommended to be written into +the config files, which may be convenient for reusing the configurations. + +Parameter Dicts +^^^^^^^^^^^^^^^^^^ +Parameter Dict is realized by the dict data structure in python, where the key +is the parameter name, and the value is the parameter value. The users can write their +parameters into a dict, and input it into the configuration module. + +An example is as follows: + +.. 
code:: python + + from recbole.config import Config + + parameter_dict = { + 'gpu_id': 2, + 'training_batch_size': 512 + } + config = Config(model='BPR', dataset='ml-100k', config_dict=parameter_dict) + print('gpu_id: ', config['gpu_id']) + print('training_batch_size: ', config['training_batch_size']) + +output: + +.. code:: bash + + gpu_id: 2 + training_batch_size: 512 + + +Command Line +^^^^^^^^^^^^^^^^^^^^^^^^ +We can also assign parameters based on the command line. +The parameters in the command line can be read from the configuration module. +The format is: `--parameter_name=[parameter_value]`. + +Write the following code into a Python file (e.g. `run.py`): + +.. code:: python + + from recbole.config import Config + + config = Config(model='BPR', dataset='ml-100k') + print('gpu_id: ', config['gpu_id']) + print('training_batch_size: ', config['training_batch_size']) + +Running: + +.. code:: bash + + python run.py --gpu_id=3 --training_batch_size=256 + +output: + +.. code:: bash + + gpu_id: 3 + training_batch_size: 256 + + +Priority +^^^^^^^^^^^^^^^^^ +RecBole supports the combination of three types of parameter configurations. + +The priority of the configuration methods is: Command Line > Parameter Dicts +> Config Files > Default Settings + +An example is as follows: + +`example.yaml`: + +.. code:: yaml + + gpu_id: 1 + training_batch_size: 1024 + +`run.py`: + +.. code:: python + + from recbole.config import Config + + parameter_dict = { + 'gpu_id': 2, + 'training_batch_size': 512 + } + config = Config(model='BPR', dataset='ml-100k', config_file_list=['example.yaml'], config_dict=parameter_dict) + print('gpu_id: ', config['gpu_id']) + print('training_batch_size: ', config['training_batch_size']) + +Running: + +.. code:: bash + + python run.py --gpu_id=3 --training_batch_size=256 + +output: + +.. 
code:: bash + + gpu_id: 3 + training_batch_size: 256 diff --git a/docs/source/user_guide/data/atomic_files.rst b/docs/source/user_guide/data/atomic_files.rst new file mode 100644 index 000000000..9e74dd4d7 --- /dev/null +++ b/docs/source/user_guide/data/atomic_files.rst @@ -0,0 +1,144 @@ +Atomic Files +=================== + +Atomic files are introduced to format the input of mainstream recommendation tasks in a flexible way. + +So far, our library introduces six atomic file types, and we identify different files by their suffixes. + +========= ============================== ======================================================== +Suffix Content Example Format +========= ============================== ======================================================== +`.inter` User-item interaction `user_id`, `item_id`, `rating`, `timestamp`, `review` +`.user` User feature `user_id`, `age`, `gender` +`.item` Item feature `item_id`, `category` +`.kg` Triplets in a knowledge graph `head_entity`, `tail_entity`, `relation` +`.link` Item-entity linkage data `entity`, `item_id` +`.net` Social graph data `source`, `target` +========= ============================== ======================================================== + +Atomic files are combined to support the input of different recommendation tasks. + +One can write the suffixes into the config arg ``load_col`` to load the corresponding atomic files. + +For each recommendation task, we have to provide several mandatory files: + +================ ================================ +Tasks Mandatory atomic files +================ ================================ +General `.inter` +Context-aware `.inter`, `.user`, `.item` +Knowledge-aware `.inter`, `.kg`, `.link` +Sequential `.inter` +Social `.inter`, `.net` +================ ================================ + +Format +-------- + +Each atomic file can be viewed as an m x n table, where n is the number of features and m-1 is the number of data records (one line is reserved for the header). 
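Each entry in the header row of such a table takes the form ``feat_name:feat_type``, with columns separated by the configured ``field_separator`` (tab by default). As a rough sketch in plain Python (the helper name is hypothetical, not part of RecBole), a header line can be split into (name, type) pairs:

```python
def parse_atomic_header(header_line, field_separator="\t"):
    """Split an atomic-file header row into (feat_name, feat_type) pairs."""
    known_types = {"token", "token_seq", "float", "float_seq"}
    fields = []
    for column in header_line.rstrip("\n").split(field_separator):
        name, feat_type = column.split(":")
        if feat_type not in known_types:
            raise ValueError(f"unknown feature type: {feat_type}")
        fields.append((name, feat_type))
    return fields


header = "user_id:token\titem_id:token\trating:float\ttimestamp:float"
print(parse_atomic_header(header))
# [('user_id', 'token'), ('item_id', 'token'), ('rating', 'float'), ('timestamp', 'float')]
```

A loader following this convention needs no per-dataset schema file: the header row alone determines the column names and how each column is converted to tensors.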
+ +The first row corresponds to feature names, in which each entry has the form of ``feat_name:feat_type``, indicating the feature name and feature type. + +We support four feature types, which can be processed as tensors in batches. + +============ =========================== ===================== +feat_type Explanations Examples +============ =========================== ===================== +`token` single discrete feature `user_id`, `age` +`token_seq` discrete feature sequence `review` +`float` single continuous feature `rating`, `timestamp` +`float_seq` continuous feature sequence `vector` +============ =========================== ===================== + +Examples +---------- + +We present example data rows from the formatted ML-1M dataset. + +**ml-1m.inter** + +============= ============= ============ =============== +user_id:token item_id:token rating:float timestamp:float +============= ============= ============ =============== +1 1193 5 978300760 +1 661 3 978302109 +============= ============= ============ =============== + +**ml-1m.user** + +============= ========= ============ ================ ============== +user_id:token age:token gender:token occupation:token zip_code:token +============= ========= ============ ================ ============== +1 1 F 10 48067 +2 56 M 16 70072 +============= ========= ============ ================ ============== + +**ml-1m.item** + +============= ===================== ================== ============================ +item_id:token movie_title:token_seq release_year:token genre:token_seq +============= ===================== ================== ============================ +1 Toy Story 1995 Animation Children's Comedy +2 Jumanji 1995 Adventure Children's Fantasy +============= ===================== ================== ============================ + +**ml-1m.kg** + +============= =================================== ============= +head_id:token relation_id:token tail_id:token +============= 
=================================== ============= +m.0gs6m film.film_genre.films_in_this_genre m.01b195 +m.052_dz film.film.actor m.02nrdp +============= =================================== ============= + +**ml-1m.link** + +============= =============== +item_id:token entity_id:token +============= =============== +2694 m.02hxhz +2079 m.0kvcr9 +============= =============== + +Additional Atomic Files +---------------------------- + +For users who want to load features from additional atomic files (e.g. pretrained entity embeddings), we provide a simple way as follows. + +Firstly, prepare your additional atomic file (e.g. ``ml-1m.ent``). + +============= =============================== +ent_id:token ent_emb:float_seq +============= =============================== +m.0gs6m -115.08 13.60 113.69 +m.01b195 -130.97 263.05 -129.88 +============= =============================== + +Secondly, update the args as: + +.. code:: yaml + + additional_feat_suffix: [ent] + load_col: + # inter/user/item/...: As usual + ent: [ent_id, ent_emb] + +Then, this additional atomic file will be loaded into the :class:`Dataset` object. These new features can be used as follows. + +.. code:: python + + dataset = create_dataset(config) + print(dataset.ent_feat) + +Note that these features can be preprocessed in the same way as the other features. + +For example, if you want to map the tokens of ``ent_id`` into the same space as ``entity_id``, then update the args as: + +.. 
code:: yaml + + additional_feat_suffix: [ent] + load_col: + # inter/user/item/...: As usual + ent: [ent_id, ent_emb] + + fields_in_same_space: [[ent_id, entity_id]] diff --git a/docs/source/user_guide/data/data_args.rst b/docs/source/user_guide/data/data_args.rst new file mode 100644 index 000000000..497eb65e6 --- /dev/null +++ b/docs/source/user_guide/data/data_args.rst @@ -0,0 +1,104 @@ +Args for Data +========================= + +RecBole provides several arguments for describing: + +- Basic information of the dataset +- Operations of dataset preprocessing + +See below for the details: + +Atomic File Format +---------------------- + +- ``field_separator (str)`` : Separator of different columns in atomic files. Defaults to ``"\t"``. +- ``seq_separator (str)`` : Separator inside the sequence features. Defaults to ``" "``. + +Basic Information +---------------------- + +Common Features +'''''''''''''''''' + +- ``USER_ID_FIELD (str)`` : Field name of user ID feature. Defaults to ``user_id``. +- ``ITEM_ID_FIELD (str)`` : Field name of item ID feature. Defaults to ``item_id``. +- ``RATING_FIELD (str)`` : Field name of rating feature. Defaults to ``rating``. +- ``TIME_FIELD (str)`` : Field name of timestamp feature. Defaults to ``timestamp``. +- ``seq_len (dict)`` : Keys are field names of sequence features, values are maximum length of each sequence (which means sequences too long will be cut off). If not set, the sequences will not be cut off. Defaults to ``None``. + +Label for Point-wise DataLoader +''''''''''''''''''''''''''''''''''' + +- ``LABEL_FIELD (str)`` : Expected field name of the generated labels. Defaults to ``label``. +- ``threshold (dict)`` : The format is ``{k (str): v (float)}``. 0/1 labels will be generated according to the value of ``inter_feat[k]`` and ``v``. The rows with ``inter_feat[k] >= v`` will be labeled as positive, otherwise the label is negative. Note that at most one pair of ``k`` and ``v`` can exist in ``threshold``. 
Defaults to ``None``. + +NegSample Prefix for Pair-wise DataLoader +'''''''''''''''''''''''''''''''''''''''''''''''''' + +- ``NEG_PREFIX (str)`` : Prefix of field names which are generated as negative cases. E.g., if the positive item ID field is named ``item_id``, then the item IDs in negative samples will be named ``NEG_PREFIX + item_id``. Defaults to ``neg_``. + +Sequential Model Needed +''''''''''''''''''''''''''''''''''' + +- ``ITEM_LIST_LENGTH_FIELD (str)`` : Field name of the feature representing item sequences' length. Defaults to ``item_length``. +- ``LIST_SUFFIX (str)`` : Suffix of field names which are generated as sequences. E.g., if the item ID field is named ``item_id``, then the item ID sequences will be named ``item_id + LIST_SUFFIX``. Defaults to ``_list``. +- ``MAX_ITEM_LIST_LENGTH (int)``: Maximum length of each generated sequence. Defaults to ``50``. +- ``POSITION_FIELD (str)`` : Field name of the generated position sequence. For a sequence of length ``k``, its position sequence is ``range(k)``. Note that this field will only be generated if this arg is not ``None``. Defaults to ``position_id``. + +Knowledge-based Model Needed +''''''''''''''''''''''''''''''''''' + +- ``HEAD_ENTITY_ID_FIELD (str)`` : Field name of the head entity ID feature. Defaults to ``head_id``. +- ``TAIL_ENTITY_ID_FIELD (str)`` : Field name of the tail entity ID feature. Defaults to ``tail_id``. +- ``RELATION_ID_FIELD (str)`` : Field name of the relation ID feature. Defaults to ``relation_id``. +- ``ENTITY_ID_FIELD (str)`` : Field name of the entity ID. Note that it's only a symbol of entities, not a real feature stored in one of the ``xxx_feat``. Defaults to ``entity_id``. + +Selective Loading +------------------------------ + +- ``load_col (dict)`` : Keys are the suffixes of loaded atomic files, values are lists of field names to be loaded. If a suffix doesn't exist in ``load_col``, the corresponding atomic file will not be loaded.
Note that if ``load_col`` is ``None``, then all the existing atomic files will be loaded. Defaults to ``{inter: [user_id, item_id]}``. +- ``unload_col (dict)`` : Keys are suffixes of loaded atomic files, values are lists of field names NOT to be loaded. Note that ``load_col`` and ``unload_col`` cannot be set at the same time. Defaults to ``None``. +- ``unused_col (dict)`` : Keys are suffixes of loaded atomic files, values are lists of field names that are loaded for data processing but not used by the model. E.g., ``time_field`` may be used for time ordering even though the model itself does not use it. Defaults to ``None``. +- ``additional_feat_suffix (list)``: Controls the loading of additional atomic files. E.g., if you want to load features from ``ml-100k.hello``, just set this arg as ``additional_feat_suffix: [hello]``. Features from additional atomic files will be stored in ``Dataset.feat_list``. Defaults to ``None``. + +Filtering +----------- + +Remove duplicated user-item interactions +'''''''''''''''''''''''''''''''''''''''' + +- ``rm_dup_inter (str)`` : Whether to remove duplicated user-item interactions. If ``time_field`` exists, ``inter_feat`` will be sorted by ``time_field`` in ascending order. Otherwise it will remain unchanged. After that, if ``rm_dup_inter == first``, we will keep the first user-item interaction among duplicates; if ``rm_dup_inter == last``, we will keep the last user-item interaction among duplicates. Defaults to ``None``. + +Filter by value +'''''''''''''''''' + +- ``lowest_val (dict)`` : Has the format ``{k (str): v (float)}, ...``. The rows whose ``feat[k] < v`` will be filtered out. Defaults to ``None``. +- ``highest_val (dict)`` : Has the format ``{k (str): v (float)}, ...``. The rows whose ``feat[k] > v`` will be filtered out. Defaults to ``None``. +- ``equal_val (dict)`` : Has the format ``{k (str): v (float)}, ...``. The rows whose ``feat[k] != v`` will be filtered out. Defaults to ``None``. +- ``not_equal_val (dict)`` : Has the format ``{k (str): v (float)}, ...``. 
The rows whose ``feat[k] == v`` will be filtered out. Defaults to ``None``. + +Remove interactions by user or item +''''''''''''''''''''''''''''''''''' + +- ``filter_inter_by_user_or_item (bool)`` : If ``True``, we will remove the interactions in ``inter_feat`` whose user or item does not appear in ``user_feat`` or ``item_feat``. Defaults to ``True``. + +Filter by number of interactions +'''''''''''''''''''''''''''''''''''' + +- ``max_user_inter_num (int)`` : Users whose number of interactions is more than ``max_user_inter_num`` will be filtered out. Defaults to ``None``. +- ``min_user_inter_num (int)`` : Users whose number of interactions is less than ``min_user_inter_num`` will be filtered out. Defaults to ``0``. +- ``max_item_inter_num (int)`` : Items whose number of interactions is more than ``max_item_inter_num`` will be filtered out. Defaults to ``None``. +- ``min_item_inter_num (int)`` : Items whose number of interactions is less than ``min_item_inter_num`` will be filtered out. Defaults to ``0``. + +Preprocessing +----------------- + +- ``fields_in_same_space (list)`` : List of spaces. A space is a list of field-name strings. The fields in the same space will be remapped into the same index system. Note that if you want some fields to be remapped into the same space as entities, just set ``fields_in_same_space = [entity_id, xxx, ...]``. (If ``ENTITY_ID_FIELD != 'entity_id'``, then change the ``'entity_id'`` in the above example.) Defaults to ``None``. +- ``preload_weight (dict)`` : Has the format ``{k (str): v (float)}, ...``. ``k`` is a token-like field, representing the ID of each row of the preloaded weight matrix. ``v`` is a float-like field. Each pair of ``k`` and ``v`` should be from the same atomic file. This arg can be used to load pretrained vectors. Defaults to ``None``. +- ``normalize_field (list)`` : List of field names to be normalized. Note that only float-like fields can be normalized. Defaults to ``None``.
+ +- ``normalize_all (bool)`` : Normalize all the float-like fields if ``True``. Defaults to ``True``. + +Benchmark file +------------------- + +- ``benchmark_filename (list)`` : List of pre-split user-item interaction suffixes. Only normalization and ID remapping will be applied, which do not delete interactions in ``inter_feat``; ``inter_feat`` is then split according to ``benchmark_filename``. E.g., assume the dataset is called ``click`` and ``benchmark_filename`` equals ``['part1', 'part2', 'part3']``. Then we will load ``click.part1.inter``, ``click.part2.inter``, ``click.part3.inter``, and treat them as the train, valid, and test datasets. Defaults to ``None``. diff --git a/docs/source/user_guide/data/data_flow.rst b/docs/source/user_guide/data/data_flow.rst new file mode 100644 index 000000000..75022e575 --- /dev/null +++ b/docs/source/user_guide/data/data_flow.rst @@ -0,0 +1,28 @@ +Data Flow +=========== + +For extensibility and reusability, our data module designs an elegant data flow that transforms raw data into the model input. + +The overall data flow can be described as follows: + +.. image:: ../../asset/data_flow_en.png + :align: center + +The details are as follows: + +- Raw Input + Unprocessed raw input dataset. Detailed in `Dataset List </dataset_list.html>`_. +- Atomic Files + Basic components for characterizing the input of various recommendation tasks, proposed by RecBole. Detailed in :doc:`atomic_files`. +- Dataset: + Mainly based on the primary data structure of :class:`pandas.DataFrame` in the library of `pandas <https://pandas.pydata.org/>`_. + During the transformation step from atomic files to class :class:`Dataset`, + we provide many useful functions that support a series of preprocessing operations in recommender systems, + such as k-core data filtering and missing value imputation. +- DataLoader: + Mainly based on a general internal data structure implemented by our library, called :class:`~recbole.data.interaction.Interaction`. 
+ :class:`~recbole.data.interaction.Interaction` is the internal data structure that is fed into the recommendation algorithms. + It is implemented as a new abstract data type based on :class:`python.Dict`, which is a key-value indexed data structure. + The keys correspond to features from input, which can be conveniently referenced with feature names when writing the recommendation algorithms; + and the values correspond to tensors (implemented by :class:`torch.Tensor`), which will be used for the update and computation in learning algorithms. + Specifically, the value entry for a specific key stores all the corresponding tensor data in a batch or mini-batch. diff --git a/docs/source/user_guide/data/interaction.rst b/docs/source/user_guide/data/interaction.rst new file mode 100644 index 000000000..43e5c0356 --- /dev/null +++ b/docs/source/user_guide/data/interaction.rst @@ -0,0 +1,29 @@ +Interaction +================ + +:class:`~recbole.data.interaction.Interaction` is the internal data structure that is loaded from :class:`DataLoader` and fed into the recommendation algorithms. + +It is implemented as a new abstract data type based on :class:`python.dict`. The keys correspond to features from input, which can be conveniently referenced with feature names when writing the recommendation algorithms; and the values correspond to tensors (implemented by :class:`torch.Tensor`), which will be used for the update and computation in learning algorithms. Specifically, the value entry for a specific key stores all the corresponding tensor data in a batch or mini-batch. + +With such a data structure, our library provides a friendly interface to write the recommendation algorithms in a batch-based mode. For example, we can read all the user embeddings and item embeddings from an instantiated :class:`~recbole.data.interaction.Interaction` object ``inter`` simply based on the feature names: + +.. 
code:: python
+
+    user_vec = inter['UserID']
+    item_vec = inter['ItemID']
+
+The contents of an :class:`~recbole.data.interaction.Interaction` are decided by the loaded fields.
+However, it should be noted that some features can be generated by :class:`DataLoader`, e.g. if a model has ``input_type = InputType.PAIRWISE``, then each item feature has a corresponding negative item feature, whose keys begin with the prefix given by ``NEG_PREFIX``.
+
+Besides, the value components are implemented based on :class:`torch.Tensor`. We wrap many functions of PyTorch to develop a GPU-oriented data structure, which supports batch-based mechanisms (e.g., copying a batch of data to the GPU). Specifically, we summarize the important functions as follows:
+
+============================ ======================================================================
+Function                     Description
+============================ ======================================================================
+to(device)                   transfer all tensors to :class:`torch.device`
+cpu                          transfer all tensors to CPU
+numpy                        convert all tensors to :class:`numpy.ndarray`
+repeat                       repeat each tensor along the batch-size dimension
+repeat_interleave            repeat elements of a tensor, similar to ``torch.repeat_interleave``
+update                       update this object with another Interaction, similar to ``dict.update``
+============================ ======================================================================
diff --git a/docs/source/user_guide/data_intro.rst b/docs/source/user_guide/data_intro.rst
new file mode 100644
index 000000000..f84d2c04e
--- /dev/null
+++ b/docs/source/user_guide/data_intro.rst
@@ -0,0 +1,12 @@
+Data Introduction
+===================
+
+Here we introduce the whole data flow and highlight its key features.
+
+.. 
toctree::
+   :maxdepth: 1
+
+   data/data_flow
+   data/atomic_files
+   data/interaction
+   data/data_args
diff --git a/docs/source/user_guide/evaluation_support.rst b/docs/source/user_guide/evaluation_support.rst
new file mode 100644
index 000000000..39cc2167c
--- /dev/null
+++ b/docs/source/user_guide/evaluation_support.rst
@@ -0,0 +1,65 @@
+Evaluation Support
+===========================
+
+The function of the evaluation module is to implement commonly used evaluation
+protocols for recommender systems. Since different models can be compared under
+the same evaluation module, RecBole standardizes the evaluation of recommender
+systems.
+
+
+Evaluation Settings
+-----------------------
+The evaluation settings supported by RecBole are as follows. Among them, the
+first four rows correspond to the dataset splitting methods, while the last
+three rows correspond to the ranking mechanism, namely a full ranking over all
+the items or a sample-based ranking.
+
+================== ========================================================
+ Notation          Explanation
+================== ========================================================
+ RO_RS             Random Ordering + Ratio-based Splitting
+ TO_LS             Temporal Ordering + Leave-one-out Splitting
+ RO_LS             Random Ordering + Leave-one-out Splitting
+ TO_RS             Temporal Ordering + Ratio-based Splitting
+ full              full ranking with all item candidates
+ uniN              sample-based ranking: each positive item is paired with N negative items sampled from a uniform distribution
+ popN              sample-based ranking: each positive item is paired with N negative items sampled from a popularity distribution
+================== ========================================================
+
+The parameters used to control the evaluation settings are as follows:
+
+- ``eval_setting (str)``: The evaluation settings. Defaults to ``'RO_RS,full'``.
+  The parameter has two parts. The first part controls the splitting method,
+  chosen from ``['RO_RS','TO_LS','RO_LS','TO_RS']``. 
The second part (optional)
+  controls the ranking mechanism, chosen from ``['full','uni100','uni1000','pop100','pop1000']``.
+- ``group_by_user (bool)``: Whether the users are grouped.
+  It must be ``True`` when ``eval_setting`` is in ``['RO_LS', 'TO_LS']``.
+  Defaults to ``True``.
+- ``split_ratio (list)``: The split ratio between the train, valid and
+  test data. It only takes effect when the first part of ``eval_setting``
+  is in ``['RO_RS', 'TO_RS']``. Defaults to ``[0.8, 0.1, 0.1]``.
+- ``leave_one_num (int)``: It only takes effect when the first part of
+  ``eval_setting`` is in ``['RO_LS', 'TO_LS']``. Defaults to ``2``.
+
+Evaluation Metrics
+-----------------------
+
+RecBole supports both value-based and ranking-based evaluation metrics.
+
+The value-based metrics (i.e., for rating prediction) include ``RMSE``, ``MAE``,
+``AUC`` and ``LogLoss``, measuring the prediction difference between the true
+and predicted values.
+
+The ranking-based metrics (i.e., for top-k item recommendation) include the most
+common ranking-aware metrics, such as ``Recall``, ``Precision``, ``Hit``,
+``NDCG``, ``MAP`` and ``MRR``, measuring the ranking performance of the
+recommendation lists generated by an algorithm.
+
+The parameters used to control the evaluation metrics are as follows:
+
+- ``metrics (list or str)``: Evaluation metrics. Defaults to
+  ``['Recall', 'MRR', 'NDCG', 'Hit', 'Precision']``. Range in
+  ``['Recall', 'MRR', 'NDCG', 'Hit', 'MAP', 'Precision', 'AUC',
+  'MAE', 'RMSE', 'LogLoss']``.
+- ``topk (list or int or None)``: The value of k for top-k evaluation metrics.
+  Defaults to ``10``. 
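The splitting and metric settings above combine naturally in a single configuration file. A minimal sketch for illustration (the values below are examples, not RecBole's defaults):

```yaml
# Temporal ordering + leave-one-out splitting, sample-based ranking
eval_setting: TO_LS,uni100
group_by_user: True     # must be True for leave-one-out splitting
leave_one_num: 2
metrics: ['Recall', 'MRR', 'NDCG']
topk: 10
```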
diff --git a/docs/source/user_guide/model/context/afm.rst b/docs/source/user_guide/model/context/afm.rst new file mode 100644 index 000000000..b67e6fa43 --- /dev/null +++ b/docs/source/user_guide/model/context/afm.rst @@ -0,0 +1,73 @@ +AFM +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/abs/10.5555/3172077.3172324>`_ + +**Title:** Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks + +**Authors:** Jun Xiao, Hao Ye, Xiangnan He, Hanwang Zhang, Fei Wu, Tat-Seng Chua + +**Abstract:** *Factorization Machines* (FMs) are a supervised learning approach that enhances the linear regression model by incorporating the second-order feature interactions. Despite effectiveness, FM can be hindered by its modelling of all feature interactions with the same weight, as not all feature interactions are equally useful and predictive. For example, the interactions with useless features may even introduce noises and adversely degrade the performance. In this work, we improve FM by discriminating the importance of different feature interactions. We propose a novel model named *Attentional Factorization Machine* (AFM), which learns the importance of each feature interaction from data via a neural attention network. Extensive experiments on two real-world datasets demonstrate the effectiveness of AFM. Empirically, it is shown on regression task AFM betters FM with a 8.6% relative improvement, and consistently outperforms the state-of-the-art deep learning methods Wide&Deep [Cheng *et al.* , 2016] and Deep-Cross [Shan *et al.* , 2016] with a much simpler structure and fewer model parameters. + +.. image:: ../../../asset/afm.jpg + :width: 700 + :align: center + +Quick Start with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of features. Defaults to ``10``. +- ``attention_size (int)`` : The vector size in attention mechanism. 
Defaults to ``25``.
+- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.3``.
+- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``2``.
+
+**A Running Example:**
+
+Write the following code to a python file, such as `run.py`
+
+.. code:: python
+
+   from recbole.quick_start import run_recbole
+
+   run_recbole(model='AFM', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+   python run.py
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``.
+
+.. code:: bash
+
+   learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+   dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5]
+   attention_size choice [10,15,20,25,30,40]
+   reg_weight choice [0,0.1,0.2,1,2,5,10]
+
+Note that these hyper parameter ranges are provided for reference only; we cannot guarantee that they are the optimal ranges for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+   python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
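To fix the hyper parameters instead of tuning them, you can also put them in a config file and pass it to ``run_recbole``. A sketch (the file name ``afm.yaml`` and the values are illustrative; note that the L2 weight appears as ``reg_weight`` in the tuning file above):

```yaml
# afm.yaml -- illustrative values, not tuned
embedding_size: 10
attention_size: 25
dropout_prob: 0.3
reg_weight: 2
```

Then run, for example, ``run_recbole(model='AFM', dataset='ml-100k', config_file_list=['afm.yaml'])``.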
+ + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` diff --git a/docs/source/user_guide/model/context/autoint.rst b/docs/source/user_guide/model/context/autoint.rst new file mode 100644 index 000000000..664bdb281 --- /dev/null +++ b/docs/source/user_guide/model/context/autoint.rst @@ -0,0 +1,75 @@ +AutoInt +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/3357384.3357925>`_ + +**Title:** AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks + +**Authors:** Weiping Song, Chence Shi, Zhiping Xiao, Zhijian Duan, Yewen Xu, Ming Zhang, Jian Tang + +**Abstract:** Click-through rate (CTR) prediction, which aims to predict the probability of a user clicking on an ad or an item, is critical to many online applications such as online advertising and recommender systems. The problem is very challenging since (1) the input features (e.g., the user id, user age, item id, item category) are usually sparse and high-dimensional, and (2) an effective prediction relies on high-order combinatorial features (a.k.a. cross features), which are very time-consuming to hand-craft by domain experts and are impossible to be enumerated. Therefore, there have been efforts in finding low-dimensional representations of the sparse and high-dimensional raw features and their meaningful combinations. In this paper, we propose an effective and efficient method called the AutoInt to automatically learn the high-order feature interactions of input features. Our proposed algorithm is very general, which can be applied to both numerical and categorical input features. Specifically, we map both the numerical and categorical features into the same low-dimensional space. 
Afterwards, a multi-head self-attentive neural network with residual connections is proposed to explicitly model the feature interactions in the low-dimensional space. With different layers of the multi-head self-attentive neural networks, different orders of feature combinations of input features can be modeled. The whole model can be efficiently fit on large-scale raw data in an end-to-end fashion. Experimental results on four real-world datasets show that our proposed approach not only outperforms existing state-of-the-art approaches for prediction but also offers good explainability. + +.. image:: ../../../asset/autoint.png + :width: 500 + :align: center + +Quick Start with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of features. Defaults to ``10``. +- ``attention_size (int)`` : The vector size in attention mechanism. Defaults to ``16``. +- ``n_layers (int)`` : The number of attention layers. Defaults to ``3``. +- ``num_heads (int)`` : The number of attention heads. Defaults to ``2``. +- ``dropout_probs (list of float)`` : The dropout rate of dropout layer. Defaults to ``[0.2,0.2,0.2]``. +- ``mlp_hidden_size (list of int)`` : The hidden size of MLP layers. Defaults to ``[128,128]``. + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='AutoInt', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. 
code:: bash
+
+   learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+   dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5]
+   attention_size choice [8,16,32]
+   mlp_hidden_size choice ['[64,64,64]','[128,128,128]','[256,256,256]','[64,64]','[128,128]','[256,256]','[512,512]']
+
+Note that these hyper parameter ranges are provided for reference only; we cannot guarantee that they are the optimal ranges for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+   python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
+
+
+If you want to change parameters, dataset or evaluation settings, take a look at
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
diff --git a/docs/source/user_guide/model/context/dcn.rst b/docs/source/user_guide/model/context/dcn.rst
new file mode 100644
index 000000000..49206b533
--- /dev/null
+++ b/docs/source/user_guide/model/context/dcn.rst
@@ -0,0 +1,91 @@
+DCN
+===========
+
+Introduction
+---------------------
+
+`[paper] <https://dl.acm.org/doi/10.1145/3124749.3124754>`_
+
+**Title:** Deep & Cross Network for Ad Click Predictions
+
+**Authors:** Ruoxi Wang, Bin Fu, Gang Fu, Mingliang Wang
+
+**Abstract:** Feature engineering has been the key to the success of many prediction
+models. However, the process is nontrivial and often requires
+manual feature engineering or exhaustive searching. DNNs
+are able to automatically learn feature interactions; however, they
+generate all the interactions implicitly, and are not necessarily efficient
+in learning all types of cross features. 
In this paper, we propose +the Deep & Cross Network (DCN) which keeps the benefits of +a DNN model, and beyond that, it introduces a novel cross network +that is more efficient in learning certain bounded-degree feature +interactions. In particular, DCN explicitly applies feature crossing +at each layer, requires no manual feature engineering, and adds +negligible extra complexity to the DNN model. Our experimental +results have demonstrated its superiority over the state-of-art algorithms +on the CTR prediction dataset and dense classification +dataset, in terms of both model accuracy and memory usage. + +.. image:: ../../../asset/dcn.png + :width: 500 + :align: center + +Quick Start with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of features. Defaults to ``10``. +- ``mlp_hidden_size (list of int)`` : The hidden size of MLP layers. Defaults to ``[256,256,256]``. +- ``cross_layer_num (int)`` : The number of cross layers. Defaults to ``6``. +- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``2``. +- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.2``. + + + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='DCN', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. 
code:: bash
+
+   learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+   dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5]
+   mlp_hidden_size choice ['[64,64,64]','[128,128,128]','[256,256,256]','[512,512,512]','[1024, 1024]']
+   reg_weight choice [0.1,1,2,5,10]
+   cross_layer_num choice [3,4,5,6]
+
+Note that these hyper parameter ranges are provided for reference only; we cannot guarantee that they are the optimal ranges for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+   python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
+
+
+If you want to change parameters, dataset or evaluation settings, take a look at
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
\ No newline at end of file
diff --git a/docs/source/user_guide/model/context/deepfm.rst b/docs/source/user_guide/model/context/deepfm.rst
new file mode 100644
index 000000000..bf6817b24
--- /dev/null
+++ b/docs/source/user_guide/model/context/deepfm.rst
@@ -0,0 +1,72 @@
+DeepFM
+===========
+
+Introduction
+---------------------
+
+`[paper] <https://dl.acm.org/doi/abs/10.5555/3172077.3172127>`_
+
+**Title:** DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
+
+**Authors:** Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, Xiuqiang He
+
+**Abstract:** Learning sophisticated feature interactions behind user behaviors is critical in maximizing CTR for recommender systems. Despite great progress, existing methods seem to have a strong bias towards low- or high-order interactions, or require expertise feature engineering. 
In this paper, we show that it is possible to derive an end-to-end learning model that emphasizes both low- and high-order feature interactions. The proposed model, DeepFM, combines the power of factorization machines for recommendation and deep learning for feature learning in a new neural network architecture. Compared to the latest Wide \& Deep model from Google, DeepFM has a shared input to its "wide" and "deep" parts, with no need of feature engineering besides raw features. Comprehensive experiments are conducted to demonstrate the effectiveness and efficiency of DeepFM over the existing models for CTR prediction, on both benchmark data and commercial data.
+
+.. image:: ../../../asset/deepfm.png
+   :width: 700
+   :align: center
+
+Quick Start with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``embedding_size (int)`` : The embedding size of features. Defaults to ``10``.
+- ``mlp_hidden_size (list of int)`` : The hidden size of MLP layers. Defaults to ``[128,128,128]``.
+- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.2``.
+
+**A Running Example:**
+
+Write the following code to a python file, such as `run.py`
+
+.. code:: python
+
+   from recbole.quick_start import run_recbole
+
+   run_recbole(model='DeepFM', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+   python run.py
+
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``.
+
+.. code:: bash
+
+   learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+   dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5]
+   mlp_hidden_size choice ['[64,64,64]','[128,128,128]','[256,256,256]','[512,512,512]']
+
+Note that these hyper parameter ranges are provided for reference only; we cannot guarantee that they are the optimal ranges for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+   python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
+
+
+If you want to change parameters, dataset or evaluation settings, take a look at
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
\ No newline at end of file
diff --git a/docs/source/user_guide/model/context/din.rst b/docs/source/user_guide/model/context/din.rst
new file mode 100644
index 000000000..90018241d
--- /dev/null
+++ b/docs/source/user_guide/model/context/din.rst
@@ -0,0 +1,100 @@
+DIN
+===========
+
+Introduction
+---------------------
+
+`[paper] <https://dl.acm.org/doi/10.1145/3219819.3219823>`_
+
+**Title:** Deep Interest Network for Click-Through Rate Prediction
+
+**Authors:** Guorui Zhou, Chengru Song, Xiaoqiang Zhu, Ying Fan, Han Zhu, Xiao Ma,
+Yanghui Yan, Junqi Jin, Han Li, Kun Gai
+
+**Abstract:** Click-through rate prediction is an essential task in industrial
+applications, such as online advertising. Recently deep learning
+based models have been proposed, which follow a similar Embedding&
+MLP paradigm. In these methods large scale sparse input
+features are first mapped into low dimensional embedding vectors,
+and then transformed into fixed-length vectors in a group-wise
+manner, finally concatenated together to fed into a multilayer perceptron
+(MLP) to learn the nonlinear relations among features. In
+this way, user features are compressed into a fixed-length representation
+vector, in regardless of what candidate ads are. 
The use +of fixed-length vector will be a bottleneck, which brings difficulty +for Embedding&MLP methods to capture user’s diverse interests +effectively from rich historical behaviors. In this paper, we propose +a novel model: Deep Interest Network (DIN) which tackles this challenge +by designing a local activation unit to adaptively learn the +representation of user interests from historical behaviors with respect +to a certain ad. This representation vector varies over different +ads, improving the expressive ability of model greatly. Besides, we +develop two techniques: mini-batch aware regularization and data +adaptive activation function which can help training industrial deep +networks with hundreds of millions of parameters. Experiments on +two public datasets as well as an Alibaba real production dataset +with over 2 billion samples demonstrate the effectiveness of proposed +approaches, which achieve superior performance compared +with state-of-the-art methods. DIN now has been successfully deployed +in the online display advertising system in Alibaba, serving +the main traffic. + +.. image:: ../../../asset/din.png + :width: 1000 + :align: center + +Quick Start with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of features. Defaults to ``10``. +- ``mlp_hidden_size (list of int)`` : The hidden size of MLP layers. Defaults to ``[256,256,256]``. +- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.0``. +- ``pooling_mode (str)`` : Pooling mode of sequence data. Defaults to ``'mean'``. Range in ``['max', 'mean', 'sum']``. + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='DIN', dataset='ml-100k') + +And then: + +.. 
code:: bash
+
+   python run.py
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``.
+
+.. code:: bash
+
+   learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+   dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5]
+   mlp_hidden_size choice ['[64,64,64]','[128,128,128]','[256,256,256]','[512,512,512]']
+   pooling_mode choice ['mean','max','sum']
+
+Note that these hyper parameter ranges are provided for reference only; we cannot guarantee that they are the optimal ranges for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+   python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
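Each line of a ``hyper.test`` file names one parameter and its candidate values, so the full search space is (at most) the Cartesian product of the candidate lists. A toy sketch of that expansion, assuming a plain exhaustive grid for illustration (``HyperTuning`` itself does not necessarily enumerate the grid exhaustively):

```python
from itertools import product

# Candidate values, mirroring a hyper.test file like the one above
grid = {
    "learning_rate": [0.01, 0.005, 0.001, 0.0005, 0.0001],
    "dropout_prob": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
    "pooling_mode": ["mean", "max", "sum"],
}

# Expand the per-parameter lists into concrete configurations
names = list(grid)
configs = [dict(zip(names, values)) for values in product(*grid.values())]

print(len(configs))   # 5 * 6 * 3 = 90 candidate configurations
print(configs[0])     # {'learning_rate': 0.01, 'dropout_prob': 0.0, 'pooling_mode': 'mean'}
```

This is why each added parameter line multiplies the number of candidates, and why the tuner samples from the space rather than trying every combination.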
+ + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/context/dssm.rst b/docs/source/user_guide/model/context/dssm.rst new file mode 100644 index 000000000..55da708e4 --- /dev/null +++ b/docs/source/user_guide/model/context/dssm.rst @@ -0,0 +1,78 @@ +DSSM +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/2505515.2505665>`_ + +**Title:** Learning deep structured semantic models for web search using clickthrough data + +**Authors:** Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Acero, Larry Heck + +**Abstract:** Latent semantic models, such as LSA, intend to map a query to its relevant documents at the semantic level where keyword-based matching often fails. In this study we strive to develop a series of new latent semantic models with a deep structure that project queries and documents into a common low-dimensional space where the relevance of a document given a query is readily computed as the distance between them. The proposed deep structured semantic models are discriminatively trained by maximizing the conditional likelihood of the clicked documents given a query using the clickthrough data. + +To make our models applicable to large-scale Web search applications, we also use a technique called word hashing, which is shown to effectively scale up our semantic models to handle large vocabularies which are common in such tasks. The new models are evaluated on a Web document ranking task using a real-world data set. Results show that our best model significantly outperforms other latent semantic models, which were considered state-of-the-art in the performance prior to the work presented in this paper. + +.. 
image:: ../../../asset/dssm.png
+   :width: 600
+   :align: center
+
+Quick Start with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``embedding_size (int)`` : The embedding size of features. Defaults to ``10``.
+- ``mlp_hidden_size (list of int)`` : The hidden size of MLP layers. Defaults to ``[256, 256, 256]``.
+- ``dropout_prob (float)`` : The dropout rate used in the linear prediction layers. Defaults to ``0.3``.
+
+
+**A Running Example:**
+
+Write the following code to a python file, such as `run.py`
+
+.. code:: python
+
+   from recbole.quick_start import run_recbole
+
+   run_recbole(model='DSSM', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+   python run.py
+
+**Notes:**
+
+ - DSSM requires user-side and item-side features.
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``.
+
+.. code:: bash
+
+   learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+   dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5]
+   mlp_hidden_size choice ['[64,64,64]','[128,128,128]','[256,256,256]','[512,512,512]']
+
+Note that these hyper parameter ranges are provided for reference only; we cannot guarantee that they are the optimal ranges for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+   python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
+ + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/context/ffm.rst b/docs/source/user_guide/model/context/ffm.rst new file mode 100644 index 000000000..95119ae48 --- /dev/null +++ b/docs/source/user_guide/model/context/ffm.rst @@ -0,0 +1,73 @@ +FFM +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/2959100.2959134>`_ + +**Title:** Field-aware Factorization Machines for CTR Prediction + +**Authors:** Yuchin Juan, Yong Zhuang, Wei-Sheng Chin, Chih-Jen Lin + +**Abstract:** Click-through rate (CTR) prediction plays an important role in computational advertising. Models based on degree-2 polynomial mappings and factorization machines (FMs) are widely used for this task. Recently, a variant of FMs, field-aware factorization machines (FFMs), outperforms existing models in some world-wide CTR-prediction competitions. Based on our experiences in winning two of them, in this paper we establish FFMs as an effective method for classifying large sparse data including those from CTR prediction. First, we propose efficient implementations for training FFMs. Then we comprehensively analyze FFMs and compare this approach with competing models. Experiments show that FFMs are very useful for certain classification problems. Finally, we have released a package of FFMs for public use. + +.. image:: ../../../asset/ffm.png + :width: 500 + :align: center + +Quick Start with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of features. Defaults to ``10``. +- ``fields (dict)`` : This parameter defines the mapping from fields to features, key is field's id, value is a list of features in this field. 
For example, in the ml-100k dataset, it can be set as ``{0: ['user_id','age'], 1: ['item_id', 'class']}``. If it is set to ``None``, the features and the fields correspond one-to-one. Defaults to ``None``.
+
+**A Running Example:**
+
+Write the following code to a python file, such as `run.py`
+
+.. code:: python
+
+   from recbole.quick_start import run_recbole
+
+   run_recbole(model='FFM', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+   python run.py
+
+**Notes:**
+
+- The features defined in ``fields`` must exist in the dataset and be loaded by the data module in RecBole. This means the values in ``fields`` must appear in ``load_col``.
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``.
+
+.. code:: bash
+
+   learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+
+Note that these hyper parameter ranges are provided for reference only; we cannot guarantee that they are the optimal ranges for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+   python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
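As an illustration of the note above, a config fragment that both loads the referenced features and groups them into fields might look like the following sketch (assuming the ml-100k column names from the example; exact config syntax depends on how you configure RecBole):

```yaml
load_col:
    inter: [user_id, item_id, rating]
    user: [user_id, age]
    item: [item_id, class]
fields: {0: [user_id, age], 1: [item_id, class]}
```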
+ + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` + diff --git a/docs/source/user_guide/model/context/fm.rst b/docs/source/user_guide/model/context/fm.rst new file mode 100644 index 000000000..6565fac29 --- /dev/null +++ b/docs/source/user_guide/model/context/fm.rst @@ -0,0 +1,67 @@ +FM +=========== + +Introduction +--------------------- + +`[paper] <https://ieeexplore.ieee.org/abstract/document/5694074/>`_ + +**Title:** Factorization Machines + +**Authors:** Steffen Rendle + +**Abstract:** In this paper, we introduce Factorization Machines (FM) which are a new model class that combines the advantages of Support Vector Machines (SVM) with factorization models. Like SVMs, FMs are a general predictor working with any real valued feature vector. In contrast to SVMs, FMs model all interactions between variables using factorized parameters. Thus they are able to estimate interactions even in problems with huge sparsity (like recommender systems) where SVMs fail. We show that the model equation of FMs can be calculated in linear time and thus FMs can be optimized directly. So unlike nonlinear SVMs, a transformation in the dual form is not necessary and the model parameters can be estimated directly without the need of any support vector in the solution. We show the relationship to SVMs and the advantages of FMs for parameter estimation in sparse settings. On the other hand there are many different factorization models like matrix factorization, parallel factor analysis or specialized models like SVD++, PITF or FPMC. The drawback of these models is that they are not applicable for general prediction tasks but work only with special input data. Furthermore their model equations and optimization algorithms are derived individually for each task. 
We show that FMs can mimic these models just by specifying the input data (i.e. the feature vectors). This makes FMs easily applicable even for users without expert knowledge in factorization models. + +.. image:: ../../../asset/fm.png + :width: 700 + :align: center + +Quick Start with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of features. Defaults to ``10``. + +**A Running Example:** + +Write the following code into a Python file, such as `run.py`: + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='FM', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and save them as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + +Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the hyper-parameters: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
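The abstract notes that the FM model equation can be computed in linear time. A minimal NumPy sketch of that identity (illustrative only, not RecBole code) compares the naive quadratic pairwise sum with Rendle's linear-time reformulation:

```python
import numpy as np

# Illustrative sketch, not RecBole code: the FM second-order term
#   sum_{i<j} <v_i, v_j> x_i x_j
# computed two ways -- a naive O(n^2 k) double loop, and the
# linear-time identity
#   0.5 * sum_f [ (sum_i v_{i,f} x_i)^2 - sum_i v_{i,f}^2 x_i^2 ].
def fm_pairwise_naive(x, V):
    total = 0.0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            total += np.dot(V[i], V[j]) * x[i] * x[j]
    return total

def fm_pairwise_linear(x, V):
    s = V.T @ x                   # (k,): sum_i v_i x_i
    s2 = (V ** 2).T @ (x ** 2)    # (k,): sum_i v_i^2 x_i^2
    return 0.5 * float(np.sum(s * s - s2))
```

Both functions agree on any input, but the second scales linearly in the number of features, which is what makes FM practical on the huge sparse inputs described above.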
+ + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` diff --git a/docs/source/user_guide/model/context/fnn.rst b/docs/source/user_guide/model/context/fnn.rst new file mode 100644 index 000000000..358287123 --- /dev/null +++ b/docs/source/user_guide/model/context/fnn.rst @@ -0,0 +1,71 @@ +FNN +=========== + +Introduction +--------------------- + +`[paper] <https://link.springer.com/chapter/10.1007/978-3-319-30671-1_4>`_ + +**Title:** Deep Learning over Multi-field Categorical Data + +**Authors:** Weinan Zhang, Tianming Du, and Jun Wang + +**Abstract:** Predicting user responses, such as click-through rate and conversion rate, are critical in many web applications including web search, personalised recommendation, and online advertising. Different from continuous raw features that we usually found in the image and audio domains, the input features in web space are always of multi-field and are mostly discrete and categorical while their dependencies are little known. Major user response prediction models have to either limit themselves to linear models or require manually building up high-order combination features. The former loses the ability of exploring feature interactions, while the latter results in a heavy computation in the large feature space. To tackle the issue, we propose two novel models using deep neural networks (DNNs) to automatically learn effective patterns from categorical feature interactions and make predictions of users’ ad clicks. To get our DNNs efficiently work, we propose to leverage three feature transformation methods, i.e., factorisation machines (FMs), restricted Boltzmann machines (RBMs) and denoising auto-encoders (DAEs). This paper presents the structure of our models and their efficient training algorithms. 
The large-scale experiments with real-world data demonstrate that our methods work better than major state-of-the-art models. + +.. image:: ../../../asset/fnn.png + :width: 700 + :align: center + +Quick Start with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of features. Defaults to ``10``. +- ``mlp_hidden_size (list of int)`` : The hidden size of MLP layers. Defaults to ``[256,256,256]``. +- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.2``. + +**A Running Example:** + +Write the following code into a Python file, such as `run.py`: + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='FNN', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and save them as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5] + mlp_hidden_size choice ['[128,256,128]','[128,128,128]','[64,128,64]','[256,256,256]'] + +Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the hyper-parameters: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
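As a rough picture of what FNN computes, here is a hypothetical NumPy sketch (not RecBole's implementation; all names are illustrative): one embedding is looked up per field (in FNN these embeddings are initialised from a pretrained FM), the embeddings are concatenated, and an MLP produces a sigmoid click probability.

```python
import numpy as np

# Hypothetical sketch of an FNN-style forward pass, not RecBole code.
# field_ids: one active feature id per field
# embeddings: list with one (num_features, k) table per field
# weights/biases: MLP parameters; the last pair produces the logit
def fnn_forward(field_ids, embeddings, weights, biases):
    # look up one embedding per field and concatenate them
    h = np.concatenate([embeddings[f][i] for f, i in enumerate(field_ids)])
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.tanh(W @ h + b)               # hidden layers
    logit = weights[-1] @ h + biases[-1]     # final linear unit
    return 1.0 / (1.0 + np.exp(-logit))      # predicted click probability
```

The point of the FM pretraining step described in the abstract is only to give the embedding tables a good starting point; the forward pass itself is a plain MLP.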
+ + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/context/fwfm.rst b/docs/source/user_guide/model/context/fwfm.rst new file mode 100644 index 000000000..15c41be3e --- /dev/null +++ b/docs/source/user_guide/model/context/fwfm.rst @@ -0,0 +1,74 @@ +FwFM +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/3178876.3186040>`_ + +**Title:** Field-weighted Factorization Machines for Click-Through Rate Prediction in Display Advertising + +**Authors:** Junwei Pan, Jian Xu, Alfonso Lobos Ruiz, Wenliang Zhao, Shengjun Pan, Yu Sun, Quan Lu + +**Abstract:** Click-through rate (CTR) prediction is a critical task in online display advertising. The data involved in CTR prediction are typically multi-field categorical data, i.e., every feature is categorical and belongs to one and only one field. One of the interesting characteristics of such data is that features from one field often interact differently with features from different other fields. Recently, Field-aware Factorization Machines (FFMs) have been among the best performing models for CTR prediction by explicitly modeling such difference. However, the number of parameters in FFMs is in the order of feature number times field number, which is unacceptable in the real-world production systems. In this paper, we propose Field-weighted Factorization Machines (FwFMs) to model the different feature interactions between different fields in a much more memory-efficient way. Our experimental evaluations show that FwFMs can achieve competitive prediction performance with only as few as 4% parameters of FFMs. 
When using the same number of parameters, FwFMs can bring 0.92% and 0.47% AUC lift over FFMs on two real CTR prediction data sets. + +.. image:: ../../../asset/fwfm.png + :width: 500 + :align: center + +Quick Start with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of features. Defaults to ``10``. +- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.0``. +- ``fields (dict)`` : This parameter defines the mapping from fields to features; each key is a field id and each value is a list of the features in that field. For example, in the ml-100k dataset, it can be set as ``{0: ['user_id','age'], 1: ['item_id', 'class']}``. If it is set to ``None``, the features and the fields correspond one-to-one. Defaults to ``None``. + +**A Running Example:** + +Write the following code into a Python file, such as `run.py`: + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='FwFM', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +**Notes:** + +- The features defined in ``fields`` must exist in the dataset and be loaded by the data module in RecBole. That is, the values in ``fields`` must appear in ``load_col``. + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and save them as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5] + +Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the hyper-parameters: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` + diff --git a/docs/source/user_guide/model/context/lr.rst b/docs/source/user_guide/model/context/lr.rst new file mode 100644 index 000000000..77b8e615a --- /dev/null +++ b/docs/source/user_guide/model/context/lr.rst @@ -0,0 +1,67 @@ +LR +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/1242572.1242643>`_ + +**Title:** Predicting Clicks: Estimating the Click-Through Rate for New Ads + +**Authors:** Matthew Richardson, Ewa Dominowska, Robert Ragno + +**Abstract:** Search engine advertising has become a significant element of the Web browsing experience. Choosing the right ads for the query and the order in which they are displayed greatly affects the probability that a user will see and click on each ad. This ranking has a strong impact on the revenue the search engine receives from the ads. Further, showing the user an ad that they prefer to click on improves user satisfaction. For these reasons, it is important to be able to accurately estimate the click-through rate of ads in the system. For ads that have been displayed repeatedly, this is empirically measurable, but for new ads, other means must be used. We show that we can use features of ads, terms, and advertisers to learn a model that accurately predicts the click-through rate for new ads. We also show that using our model improves the convergence and performance of an advertising system. As a result, our model increases both revenue and user satisfaction. + +.. image:: ../../../asset/lr.png + :width: 500 + :align: center + +Quick Start with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of features. Defaults to ``10``. + +**A Running Example:** + +Write the following code into a Python file, such as `run.py`: + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='LR', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and save them as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + +Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the hyper-parameters: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
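Conceptually, the LR model scores an impression with a single logistic unit over its sparse one-hot features. A minimal sketch under assumed notation (illustrative only, not RecBole code):

```python
import numpy as np

# Hypothetical sketch of a logistic-regression CTR score, not RecBole code.
# active_feature_ids: indices of the one-hot features that are "on"
# w: learned weight per feature; b: global bias
def lr_predict(active_feature_ids, w, b):
    # with one-hot inputs, the dot product reduces to summing the
    # weights of the active features
    logit = b + sum(w[i] for i in active_feature_ids)
    return 1.0 / (1.0 + np.exp(-logit))
```

Because there are no interaction terms, LR is the usual baseline that the factorization models on this page (FM, FFM, FwFM) improve on.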
+ + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` diff --git a/docs/source/user_guide/model/context/nfm.rst b/docs/source/user_guide/model/context/nfm.rst new file mode 100644 index 000000000..0c4e8472c --- /dev/null +++ b/docs/source/user_guide/model/context/nfm.rst @@ -0,0 +1,75 @@ +NFM +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/abs/10.1145/3077136.3080777>`_ + +**Title:** Neural Factorization Machines for Sparse Predictive Analytics + +**Authors:** Xiangnan He, Tat-Seng Chua + +**Abstract:** Many predictive tasks of web applications need to model categorical variables, such as user IDs and demographics like genders and occupations. To apply standard machine learning techniques, these categorical predictors are always converted to a set of binary features via one-hot encoding, making the resultant feature vector highly sparse. To learn from such sparse data effectively, it is crucial to account for the interactions between features. + +*Factorization Machines* (FMs) are a popular solution for efficiently using the second-order feature interactions. However, FM models feature interactions in a linear way, which can be insufficient for capturing the non-linear and complex inherent structure of real-world data. While deep neural networks have recently been applied to learn non-linear feature interactions in industry, such as the *Wide&Deep* by Google and *DeepCross* by Microsoft, the deep structure meanwhile makes them difficult to train. + +In this paper, we propose a novel model *Neural Factorization Machine* (NFM) for prediction under sparse settings. 
NFM seamlessly combines the linearity of FM in modelling second-order feature interactions and the non-linearity of neural network in modelling higher-order feature interactions. Conceptually, NFM is more expressive than FM since FM can be seen as a special case of NFM without hidden layers. Empirical results on two regression tasks show that with one hidden layer only, NFM significantly outperforms FM with a 7.3% relative improvement. Compared to the recent deep learning methods Wide&Deep and DeepCross, our NFM uses a shallower structure but offers better performance, being much easier to train and tune in practice. + +.. image:: ../../../asset/nfm.jpg + :width: 700 + :align: center + +Quick Start with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of features. Defaults to ``10``. +- ``mlp_hidden_size (list of int)`` : The hidden size of MLP layers. Defaults to ``[64, 64, 64]``. +- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.0``. + +**A Running Example:** + +Write the following code into a Python file, such as `run.py`: + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='NFM', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and save them as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5] + mlp_hidden_size choice ['[10,10]','[20,20]','[30,30]','[40,40]','[50,50]','[20,20,20]','[30,30,30]','[40,40,40]','[50,50,50]'] + +Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+ +Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the hyper-parameters: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` diff --git a/docs/source/user_guide/model/context/pnn.rst b/docs/source/user_guide/model/context/pnn.rst new file mode 100644 index 000000000..75fe48d13 --- /dev/null +++ b/docs/source/user_guide/model/context/pnn.rst @@ -0,0 +1,75 @@ +PNN +=========== + +Introduction +--------------------- + +`[paper] <https://ieeexplore.ieee.org/abstract/document/7837964/>`_ + +**Title:** Product-based neural networks for user response prediction + +**Authors:** Yanru Qu, Han Cai, Kan Ren, Weinan Zhang, Yong Yu, Ying Wen, Jun Wang + +**Abstract:** Predicting user responses, such as clicks and conversions, is of great importance and has found its usage in many Web applications including recommender systems, web search and online advertising. The data in those applications is mostly categorical and contains multiple fields; a typical representation is to transform it into a high-dimensional sparse binary feature representation via one-hot encoding. Facing with the extreme sparsity, traditional models may limit their capacity of mining shallow patterns from the data, i.e. low-order feature combinations. Deep models like deep neural networks, on the other hand, cannot be directly applied for the high-dimensional input because of the huge feature space.
In this paper, we propose a Product-based Neural Networks (PNN) with an embedding layer to learn a distributed representation of the categorical data, a product layer to capture interactive patterns between inter-field categories, and further fully connected layers to explore high-order feature interactions. Our experimental results on two large-scale real-world ad click datasets demonstrate that PNNs consistently outperform the state-of-the-art models on various metrics. + +.. image:: ../../../asset/pnn.jpg + :width: 700 + :align: center + +Quick Start with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of features. Defaults to ``10``. +- ``mlp_hidden_size (list of int)`` : The hidden size of MLP layers. Defaults to ``[128, 256, 128]``. +- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.0``. +- ``use_inner (bool)`` : Whether to use the inner product in the model. Defaults to ``True``. +- ``use_outer (bool)`` : Whether to use the outer product in the model. Defaults to ``False``. +- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``0.0``. + +**A Running Example:** + +Write the following code into a Python file, such as `run.py`: + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='PNN', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and save them as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5] + mlp_hidden_size choice ['[64,64,64]','[128,128,128]','[256,256,256]'] + reg_weight choice [0.0] + +Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+ +Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the hyper-parameters: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` diff --git a/docs/source/user_guide/model/context/widedeep.rst b/docs/source/user_guide/model/context/widedeep.rst new file mode 100644 index 000000000..8428a281e --- /dev/null +++ b/docs/source/user_guide/model/context/widedeep.rst @@ -0,0 +1,73 @@ +WideDeep +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/2988450.2988454>`_ + +**Title:** Wide & Deep Learning for Recommender Systems + +**Authors:** Heng-Tze Cheng, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson, Greg Corrado, Wei Chai, Mustafa Ispir, Rohan Anil, Zakaria Haque, Lichan Hong, Vihan Jain, Xiaobing Liu, Hemal Shah + +**Abstract:** Generalized linear models with nonlinear feature transformations are widely used for large-scale regression and classification problems with sparse inputs. Memorization of feature interactions through a wide set of cross-product feature transformations are effective and interpretable, while generalization requires more feature engineering effort. With less feature engineering, deep neural networks can generalize better to unseen feature combinations through low-dimensional dense embeddings learned for the sparse features.
However, deep neural networks with embeddings can over-generalize and recommend less relevant items when the user-item interactions are sparse and high-rank. In this paper, we present Wide & Deep learning---jointly trained wide linear models and deep neural networks---to combine the benefits of memorization and generalization for recommender systems. We productionized and evaluated the system on Google Play, a commercial mobile app store with over one billion active users and over one million apps. Online experiment results show that Wide & Deep significantly increased app acquisitions compared with wide-only and deep-only models. We have also open-sourced our implementation in TensorFlow. + +.. image:: ../../../asset/widedeep.png + :width: 700 + :align: center + +Quick Start with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of features. Defaults to ``10``. +- ``mlp_hidden_size (list of int)`` : The hidden size of MLP layers. Defaults to ``[32, 16, 8]``. +- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.1``. + +**A Running Example:** + +Write the following code into a Python file, such as `run.py`: + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='WideDeep', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and save them as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5] + mlp_hidden_size choice ['[64,64,64]','[128,128,128]','[256,256,256]','[512,512,512]'] + +Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+ +Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the hyper-parameters: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` + diff --git a/docs/source/user_guide/model/context/xdeepfm.rst b/docs/source/user_guide/model/context/xdeepfm.rst new file mode 100644 index 000000000..8efbaef0a --- /dev/null +++ b/docs/source/user_guide/model/context/xdeepfm.rst @@ -0,0 +1,102 @@ +xDeepFM +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/3219819.3220023>`_ + +**Title:** xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems + +**Authors:** Jianxun Lian, Xiaohuan Zhou, Fuzheng Zhang, +Zhongxia Chen, Xing Xie, Guangzhong Sun + +**Abstract:** Combinatorial features are essential for the success of many commercial +models. Manually crafting these features usually comes +with high cost due to the variety, volume and velocity of raw data +in web-scale systems. Factorization based models, which measure +interactions in terms of vector product, can learn patterns of combinatorial +features automatically and generalize to unseen features +as well. With the great success of deep neural networks (DNNs) +in various fields, recently researchers have proposed several DNN-based +factorization models to learn both low- and high-order feature +interactions. Despite the powerful ability of learning an arbitrary +function from data, plain DNNs generate feature interactions implicitly +and at the bit-wise level.
In this paper, we propose a novel +Compressed Interaction Network (CIN), which aims to generate +feature interactions in an explicit fashion and at the vector-wise +level. We show that the CIN share some functionalities with convolutional +neural networks (CNNs) and recurrent neural networks +(RNNs). We further combine a CIN and a classical DNN into one +unified model, and named this new model eXtreme Deep Factorization +Machine (xDeepFM). On one hand, the xDeepFM is able +to learn certain bounded-degree feature interactions explicitly; on +the other hand, it can learn arbitrary low- and high-order feature +interactions implicitly. We conduct comprehensive experiments on +three real-world datasets. Our results demonstrate that xDeepFM +outperforms state-of-the-art models. + +.. image:: ../../../asset/xdeepfm.png + :width: 500 + :align: center + +Quick Start with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of features. Defaults to ``10``. +- ``mlp_hidden_size (list of int)`` : The hidden size of MLP layers. Defaults to ``[128,128,128]``. +- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``5e-4``. +- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.2``. +- ``direct (bool)`` : Whether the output of the current layer will be output directly or not. When it is set to ``False``, the output of the current layer will be equally divided into two parts: one part will be the input of the next hidden layer, and the other part will be output directly. Defaults to ``False``. +- ``cin_layer_size (list of int)`` : The size of CIN layers. Defaults to ``[100,100,100]``. + + +**A Running Example:** + +Write the following code into a Python file, such as `run.py`: + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='xDeepFM', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and save them as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5] + mlp_hidden_size choice ['[64,64,64]','[128,128,128]','[256,256,256]','[512,512,512]'] + cin_layer_size choice ['[60,60,60]','[80,80,80]','[100,100,100]','[120,120,120]'] + reg_weight choice [1e-7,1e-5,5e-4,1e-3] + +Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the hyper-parameters: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` + diff --git a/docs/source/user_guide/model/context/xgboost.rst b/docs/source/user_guide/model/context/xgboost.rst new file mode 100644 index 000000000..c290aafbb --- /dev/null +++ b/docs/source/user_guide/model/context/xgboost.rst @@ -0,0 +1,51 @@ +XGBoost (External algorithm library) +===================================== + +Introduction +--------------------- + +`[XGBoost] <https://xgboost.readthedocs.io/en/latest/>`_ + +**XGBoost** is an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. It implements machine learning algorithms under the Gradient Boosting framework.
XGBoost provides a parallel tree boosting (also known as GBDT, GBM) that solves many data science problems in a fast and accurate way. The same code runs on major distributed environments (Hadoop, SGE, MPI) and can solve problems beyond billions of examples. + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``convert_token_to_onehot (bool)`` : If True, the token type features will be converted into one-hot form. Defaults to ``False``. +- ``token_num_threhold (int)`` : The threshold for doing the one-hot conversion. + +- ``xgb_silent (bool, optional)`` : Whether to print messages during construction. +- ``xgb_nthread (int, optional)`` : Number of threads to use for loading data when parallelization is applicable. If -1, uses maximum threads available on the system. +- ``xgb_model (file name of stored xgb model or 'Booster' instance)`` : XGBoost model to be loaded before training. +- ``xgb_params (dict)`` : Booster params. +- ``xgb_num_boost_round (int)`` : Number of boosting iterations. +- ``xgb_early_stopping_rounds (int)`` : Activates early stopping. +- ``xgb_verbose_eval (bool or int)`` : If verbose_eval is True then the evaluation metric on the validation set is printed at each boosting stage. If verbose_eval is an integer then the evaluation metric on the validation set is printed at every given verbose_eval boosting stage. + +Please refer to the `XGBoost Python package <https://xgboost.readthedocs.io/en/latest/python/python_api.html>`_ for more details. + +**A Running Example:** + +Write the following code into a Python file, such as `run.py`: + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='xgboost', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/general/bpr.rst b/docs/source/user_guide/model/general/bpr.rst new file mode 100644 index 000000000..ee9c6492d --- /dev/null +++ b/docs/source/user_guide/model/general/bpr.rst @@ -0,0 +1,80 @@ +BPR +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.5555/1795114.1795167>`_ + +**Title:** BPR: Bayesian Personalized Ranking from Implicit Feedback + +**Authors:** Steffen Rendle, Christoph Freudenthaler, Zeno Gantner and Lars Schmidt-Thieme + +**Abstract:** Item recommendation is the task of predicting a personalized ranking on a set of items (e.g. websites, movies, products). +In this paper, we investigate the most common scenario with implicit feedback (e.g. clicks, purchases). +There are many methods for item recommendation from implicit feedback like matrix factorization (MF) or +adaptive k-nearest-neighbor (kNN). Even though these methods are designed for the item prediction task of personalized +ranking, none of them is directly optimized for ranking. In this paper we present a generic optimization criterion +BPR-Opt for personalized ranking that is the maximum posterior estimator derived from a Bayesian analysis of the problem. +We also provide a generic learning algorithm for optimizing models with respect to BPR-Opt. The learning method is based +on stochastic gradient descent with bootstrap sampling. We show how to apply our method to two state-of-the-art +recommender models: matrix factorization and adaptive kNN. Our experiments indicate that for the task of personalized +ranking our optimization method outperforms the standard learning techniques for MF and kNN.
The results show the +importance of optimizing models for the right criterion. + +.. image:: ../../../asset/bpr.png + :width: 500 + :align: center + + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``. + + + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='BPR', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + +Note that we provide these hyper-parameter ranges for reference only, and we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. 
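As a concrete illustration of the BPR-Opt criterion described in the abstract, the following minimal Python sketch computes the pairwise loss, -ln sigmoid(x_ui - x_uj), for a single (user, positive item, negative item) triple. This is only an illustrative sketch, not RecBole's implementation; the function name and the toy embeddings are made up for the example.

```python
import math

def dot(u, v):
    # Inner product of two embedding vectors.
    return sum(a * b for a, b in zip(u, v))

def bpr_loss(user_emb, pos_emb, neg_emb):
    """-ln sigmoid(x_ui - x_uj) for one (user, positive, negative) triple."""
    x_uij = dot(user_emb, pos_emb) - dot(user_emb, neg_emb)
    return -math.log(1.0 / (1.0 + math.exp(-x_uij)))

# Toy 3-dimensional embeddings (hypothetical values).
user = [0.1, 0.4, -0.2]
pos = [0.3, 0.5, 0.0]
neg = [-0.3, -0.1, 0.2]
loss = bpr_loss(user, pos, neg)
print(loss)  # a positive value; it shrinks as the positive item is ranked higher
```

Training repeatedly samples such triples with bootstrap sampling and follows the gradient of this loss (plus regularization), pushing observed items above unobserved ones in the personalized ranking.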
+ + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/general/cdae.rst b/docs/source/user_guide/model/general/cdae.rst new file mode 100644 index 000000000..ca0e8716f --- /dev/null +++ b/docs/source/user_guide/model/general/cdae.rst @@ -0,0 +1,74 @@ +CDAE +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/2835776.2835837>`_ + +**Title:** Collaborative Denoising Auto-Encoders for Top-N Recommender Systems + +**Authors:** Yao Wu, Christopher DuBois, Alice X. Zheng, Martin Ester + +**Abstract:** Most real-world recommender services measure their performance based on the top-N results shown to the end users. Thus, advances in top-N recommendation have far-ranging consequences in practical applications. In this paper, we present a novel method, called Collaborative Denoising Auto-Encoder (CDAE), for top-N recommendation that utilizes the idea of Denoising Auto-Encoders. We demonstrate that the proposed model is a generalization of several well-known collaborative filtering models but with more flexible components. Thorough experiments are conducted to understand the performance of CDAE under various component settings. Furthermore, experimental results on several public datasets demonstrate that CDAE consistently outperforms state-of-the-art top-N recommendation methods on a variety of common evaluation metrics. + +.. image:: ../../../asset/cdae.png + :width: 500 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``loss_type (str)`` : The loss function of model, now we support ``[BCE, MSE]``. Defaults to ``BCE``. 
+- ``hid_activation (str)`` : The hidden layer activation function, now we support ``[sigmoid, relu, tanh]``. Defaults to ``relu``. +- ``out_activation (str)`` : The output layer activation function, now we support ``[sigmoid, relu]``. Defaults to ``sigmoid``. +- ``corruption_ratio (float)`` : The corruption ratio of the input. Defaults to ``0.5``. +- ``embedding_size (int)`` : The embedding size of user. Defaults to ``64``. +- ``reg_weight_1 (float)`` : L1-regularization weight. Defaults to ``0.``. +- ``reg_weight_2 (float)`` : L2-regularization weight. Defaults to ``0.01``. + + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='CDAE', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +**Note**: Because this model is a non-sampling model, you must set ``training_neg_sample=0`` when you run this model. + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + +Note that we provide these hyper-parameter ranges for reference only, and we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning: + +.. 
code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/general/convncf.rst b/docs/source/user_guide/model/general/convncf.rst new file mode 100644 index 000000000..36cd1303b --- /dev/null +++ b/docs/source/user_guide/model/general/convncf.rst @@ -0,0 +1,79 @@ +ConvNCF +=========== + +Introduction +--------------------- + +`[paper] <https://www.ijcai.org/Proceedings/2018/308>`_ + +**Title:** Outer Product-based Neural Collaborative Filtering + +**Authors:** Xiangnan He, Xiaoyu Du, Xiang Wang, Feng Tian, Jinhui Tang and Tat-Seng Chua + +**Abstract:** In this work, we contribute a new multi-layer neural network architecture named ONCF to perform collaborative filtering. The idea is to use an outer product to explicitly model the pairwise correlations between the dimensions of the embedding space. In contrast to existing neural recommender models that combine user embedding and item embedding via a simple concatenation or element-wise product, our proposal of using outer product above the embedding layer results in a two-dimensional interaction map that is more expressive and semantically plausible. +Above the interaction map obtained by outer product, we propose to employ a convolutional neural network to learn high-order correlations among embedding dimensions. Extensive experiments on two public implicit feedback data demonstrate the effectiveness of our proposed ONCF framework, in particular, the positive effect of using outer product to model the correlations between embedding dimensions in the low level of multi-layer neural recommender model. + +.. 
image:: ../../../asset/convncf.png + :width: 500 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``. +- ``cnn_channels (list)`` : The number of channels in each convolutional neural network layer. Defaults to ``[1, 32, 32, 32, 32]``. +- ``cnn_kernels (list)`` : The size of convolutional kernel in each convolutional neural network layer. Defaults to ``[4, 4, 2, 2]``. +- ``cnn_strides (list)`` : The strides of convolution in each convolutional neural network layer. Defaults to ``[4, 4, 2, 2]``. +- ``dropout_prob (float)`` : The dropout rate in the linear predict layer. Defaults to ``0.2``. +- ``reg_weights (list)`` : The L2 regularization weights. Defaults to ``[0.1, 0.1]``. + + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='ConvNCF', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5] + cnn_channels choice ['[1,128,128,64,32]','[1,32,32,32,32,32,32]','[1,64,32,32,32,32]','[1,64,32,32,32]'] + cnn_kernels choice ['[4,4,2,2]','[2,2,2,2,2,2]','[4,2,2,2,2]','[8,4,2]'] + cnn_strides choice ['[4,4,2,2]','[2,2,2,2,2,2]','[4,2,2,2,2]','[8,4,2]'] + reg_weights choice ['[0.1,0.1]','[0.2,0.2]'] + +Note that we just provide these hyper parameter ranges for reference only, and we can not guarantee that they are the optimal range of this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run the ``run_hyper.py`` to tuning: + +.. 
code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/general/dgcf.rst b/docs/source/user_guide/model/general/dgcf.rst new file mode 100644 index 000000000..01f30c0c1 --- /dev/null +++ b/docs/source/user_guide/model/general/dgcf.rst @@ -0,0 +1,110 @@ +DGCF +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/3397271.3401137>`_ + +**Title:** Disentangled Graph Collaborative Filtering + +**Authors:** Xiang Wang, Hongye Jin, An Zhang, Xiangnan He, Tong Xu, Tat-Seng Chua + +**Abstract:** Learning informative representations of users and items from the +interaction data is of crucial importance to collaborative filtering +(CF). Present embedding functions exploit user-item relationships +to enrich the representations, evolving from a single user-item +instance to the holistic interaction graph. Nevertheless, they largely +model the relationships in a uniform manner, while neglecting +the diversity of user intents on adopting the items, which could +be to pass time, for interest, or shopping for others like families. +Such uniform approach to model user interests easily results in +suboptimal representations, failing to model diverse relationships +and disentangle user intents in representations. + +In this work, we pay special attention to user-item relationships +at the finer granularity of user intents. 
We hence devise a new +model, Disentangled Graph Collaborative Filtering (DGCF), to +disentangle these factors and yield disentangled representations. +Specifically, by modeling a distribution over intents for each +user-item interaction, we iteratively refine the intent-aware +interaction graphs and representations. Meanwhile, we encourage +independence of different intents. This leads to disentangled +representations, effectively distilling information pertinent to each +intent. We conduct extensive experiments on three benchmark +datasets, and DGCF achieves significant improvements over several +state-of-the-art models like NGCF, DisenGCN, and +MacridVAE. Further analyses offer insights into the advantages +of DGCF on the disentanglement of user intents and interpretability +of representations. + +.. image:: ../../../asset/dgcf.jpg + :width: 700 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``. +- ``n_factors (int)`` : The number of factors for disentanglement. Defaults to ``4``. +- ``n_iterations (int)`` : The number of iterations for each layer. Defaults to ``2``. +- ``n_layers (int)`` : The number of reasoning layers. Defaults to ``1``. +- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``1e-03``. +- ``cor_weight (float)`` : The correlation loss weight. Defaults to ``0.01``. + + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='DGCF', dataset='ml-100k') + +And then: + +.. 
code:: bash + + python run.py + +**Notes:** + +- ``embedding_size`` needs to be exactly divisible by ``n_factors`` + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + n_factors choice [2,4,8] + reg_weight choice [1e-03] + cor_weight choice [0.005,0.01,0.02,0.05] + n_layers choice [1] + n_iterations choice [2] + delay choice [1e-03] + cor_delay choice [1e-02] + +Note that we just provide these hyper parameter ranges for reference only, and we can not guarantee that they are the optimal range of this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run the ``run_hyper.py`` to tuning: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. 
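To see why ``embedding_size`` needs to be exactly divisible by ``n_factors``, note that DGCF disentangles each embedding into ``n_factors`` equal-sized intent-aware chunks. The following minimal Python sketch illustrates the divisibility requirement; the helper name is hypothetical and this is not RecBole's implementation.

```python
def split_into_factors(embedding, n_factors):
    """Split one embedding vector into n_factors equal-sized intent chunks."""
    dim = len(embedding)
    if dim % n_factors != 0:
        raise ValueError("embedding_size must be exactly divisible by n_factors")
    size = dim // n_factors
    return [embedding[i * size:(i + 1) * size] for i in range(n_factors)]

emb = list(range(8))                 # stand-in for an 8-dimensional embedding
chunks = split_into_factors(emb, 4)  # 4 intent chunks of 2 dimensions each
print(len(chunks), len(chunks[0]))
```

With the defaults (``embedding_size=64``, ``n_factors=4``), each intent is represented by a 16-dimensional chunk; a setting like ``n_factors=3`` would raise an error here for the same reason it is invalid in the model.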
+ + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/general/dmf.rst b/docs/source/user_guide/model/general/dmf.rst new file mode 100644 index 000000000..ce6490d11 --- /dev/null +++ b/docs/source/user_guide/model/general/dmf.rst @@ -0,0 +1,80 @@ +DMF +=========== + +Introduction +--------------------- + +`[paper] <https://www.ijcai.org/Proceedings/2017/447>`_ + +**Title:** Deep Matrix Factorization Models for Recommender Systems + +**Authors:** Hong-Jian Xue, Xin-Yu Dai, Jianbing Zhang, Shujian Huang, Jiajun Chen + +**Abstract:** Recommender systems usually make personalized recommendation with user-item interaction ratings, implicit feedback and auxiliary information. Matrix factorization is the basic idea to predict a personalized ranking over a set of items for an individual user with the similarities among users and items. In this paper, we propose a novel matrix factorization model with neural network architecture. Firstly, we construct a user-item matrix with explicit ratings and non-preference implicit feedback. With this matrix as the input, we present a deep structure learning architecture to learn a common low dimensional space for the representations of users and items. Secondly, we design a new loss function based on binary cross entropy, in which we consider both explicit ratings and implicit feedback for a better optimization. The experimental results show the effectiveness of both our proposed model and the loss function. On several benchmark datasets, our model outperformed other state-of-the-art methods. We also conduct extensive experiments to evaluate the performance within different experimental settings. + +.. 
image:: ../../../asset/dmf.jpg + :width: 500 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``user_embedding_size (int)`` : The initial embedding size of users. Defaults to ``64``. +- ``item_embedding_size (int)`` : The initial embedding size of items. Defaults to ``64``. +- ``user_hidden_size_list (list)`` : The hidden size of each layer in the MLP for users; the length of the list equals the number of layers. Defaults to ``[64,64]``. +- ``item_hidden_size_list (list)`` : The hidden size of each layer in the MLP for items; the length of the list equals the number of layers. Defaults to ``[64,64]``. +- ``inter_matrix_type (str)`` : Use the implicit interaction matrix or the rating matrix. Defaults to ``'01'``. Range in ``['01', 'rating']``. + + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='DMF', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +**Notes:** + +- The last value in ``user_hidden_size_list`` and ``item_hidden_size_list`` must be the same. + +- If you set ``inter_matrix_type='rating'``, the 'rating' field from the \*.inter atomic files must be retained when loading the dataset, which means that 'rating' must appear in ``load_col``. Besides, if you use the 'rating' field to filter the dataset, please set ``drop_filter_field=False``. + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. 
code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + user_layers_dim choice ['[64, 64]','[64, 32]','[128,64]'] + item_layers_dim choice ['[64, 64]','[64, 32]','[128,64]'] + +Note that we provide these hyper-parameter ranges for reference only, and we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/general/enmf.rst b/docs/source/user_guide/model/general/enmf.rst new file mode 100644 index 000000000..6b30a2d05 --- /dev/null +++ b/docs/source/user_guide/model/general/enmf.rst @@ -0,0 +1,76 @@ +ENMF +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/abs/10.1145/3373807>`_ + +**Title:** Efficient Neural Matrix Factorization without Sampling for Recommendation + +**Authors:** Chen, Chong and Zhang, Min and Wang, Chenyang and Ma, Weizhi and Li, Minming and Liu, Yiqun and Ma, Shaoping + +**Abstract:** Recommendation systems play a vital role to keep users engaged with personalized contents in modern online platforms. Recently, deep learning has revolutionized many research fields and there is a surge of interest in applying it for recommendation. 
However, existing studies have largely focused on exploring complex deep-learning architectures for recommendation task, while typically applying the negative sampling strategy for model learning. Despite effectiveness, we argue that these methods suffer from two important limitations: (1) the methods with complex network structures have a substantial number of parameters, and require expensive computations even with a sampling-based learning strategy; (2) the negative sampling strategy is not robust, making sampling-based methods difficult to achieve the optimal performance in practical applications. + +In this work, we propose to learn neural recommendation models from the whole training data without sampling. However, such a non-sampling strategy poses strong challenges to learning efficiency. To address this, we derive three new optimization methods through rigorous mathematical reasoning, which can efficiently learn model parameters from the whole data (including all missing data) with a rather low time complexity. Moreover, based on a simple Neural Matrix Factorization architecture, we present a general framework named ENMF, short for *Efficient Neural Matrix Factorization*. Extensive experiments on three real-world public datasets indicate that the proposed ENMF framework consistently and significantly outperforms the state-of-the-art methods on the Top-K recommendation task. Remarkably, ENMF also shows significant advantages in training efficiency, which makes it more applicable to real-world large-scale systems. + +.. image:: ../../../asset/enmf.jpg + :width: 500 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``dropout_prob (float)`` : The dropout ratio of the embedding. Defaults to ``0.7``. +- ``embedding_size (int)`` : The embedding size of user. Defaults to ``64``. +- ``reg_weight (float)`` : L2-regularization weight. Defaults to ``0.``. 
+- ``negative_weight (float)`` : The weight of non-observed data. Defaults to ``0.5``. + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='ENMF', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +**Note**: Because this model is a non-sampling model, you must set ``training_neg_sample=0`` when you run this model. + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + + dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5] + + negative_weight choice [0.001,0.005,0.01,0.02,0.05,0.1,0.2,0.5] + +Note that we provide these hyper-parameter ranges for reference only, and we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning: + +.. 
code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/general/fism.rst b/docs/source/user_guide/model/general/fism.rst new file mode 100644 index 000000000..c93eaf00d --- /dev/null +++ b/docs/source/user_guide/model/general/fism.rst @@ -0,0 +1,82 @@ +FISM +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/2487575.2487589>`_ + +**Title:** FISM: Factored Item Similarity Models for Top-N Recommender Systems + +**Authors:** Santosh Kabbur, Xia Ning, George Karypis + +**Abstract:** The effectiveness of existing top-N recommendation methods decreases as +the sparsity of the datasets increases. To alleviate this problem, we present an +item-based method for generating top-N recommendations that learns the item-item +similarity matrix as the product of two low dimensional latent factor matrices. +These matrices are learned using a structural equation modeling approach, wherein the +value being estimated is not used for its own estimation. A comprehensive set of +experiments on multiple datasets at three different sparsity levels indicates that +the proposed methods can handle sparse datasets effectively and outperform other +state-of-the-art top-N recommendation methods. The experimental results also show +that the relative performance gains compared to competing methods increase as the +data gets sparser. + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``. 
+- ``split_to (int)`` : This is a parameter used to reduce the GPU memory usage during the evaluation. The larger the value, the less the memory usage and the slower the evaluation speed. Defaults to ``0``. +- ``alpha (float)`` : It is a hyper-parameter controlling the normalization effect of the number of user history interactions when calculating the similarity. Defaults to ``0``. +- ``reg_weights (list)`` : The L2 regularization weights. Defaults to ``[1e-2, 1e-2]``. + + + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='FISM', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + reg_weights choice ['[1e-7, 1e-7]','[0, 0]'] + alpha choice [0] + weight_size choice [64] + beta choice [0.5] + +Note that we just provide these hyper parameter ranges for reference only, and we can not guarantee that they are the optimal range of this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run the ``run_hyper.py`` to tuning: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. 
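The scoring idea from the abstract, item-item similarities factored as the product of two low-dimensional latent matrices and aggregated over the user's history with the ``alpha`` normalization, can be sketched as follows. This is an illustrative sketch only (bias terms are omitted, and the function name and toy factors are hypothetical), not RecBole's implementation.

```python
def fism_score(p, q, user_history, target, alpha):
    """Score `target` for a user by aggregating factored item-item
    similarities sim(j, i) = p_j . q_i over the user's history."""
    history = [j for j in user_history if j != target]  # exclude the target itself
    if not history:
        return 0.0
    sim_sum = sum(sum(a * b for a, b in zip(p[j], q[target])) for j in history)
    # alpha controls how strongly the history length normalizes the sum.
    return sim_sum / (len(history) ** alpha)

# Toy 2-dimensional latent factors for three items (hypothetical values).
p = {0: [1.0, 0.0], 1: [0.5, 0.5], 2: [0.0, 1.0]}
q = {0: [1.0, 0.0], 1: [0.5, 0.5], 2: [0.0, 1.0]}
score = fism_score(p, q, user_history=[0, 1], target=2, alpha=0.5)
print(score)
```

With ``alpha=0`` the history length has no normalization effect; larger values damp the score of users with long histories, which matches the description of the ``alpha`` hyper-parameter above.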
+ + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/general/gcmc.rst b/docs/source/user_guide/model/general/gcmc.rst new file mode 100644 index 000000000..efb985579 --- /dev/null +++ b/docs/source/user_guide/model/general/gcmc.rst @@ -0,0 +1,89 @@ +GCMC +=========== + +Introduction +--------------------- + +`[paper] <https://arxiv.org/abs/1706.02263>`_ + +**Title:** Graph Convolutional Matrix Completion + +**Authors:** Rianne van den Berg, Thomas N. Kipf, Max Welling + +**Abstract:** We consider matrix completion for recommender systems from the point of view of +link prediction on graphs. Interaction data +such as movie ratings can be represented by a +bipartite user-item graph with labeled edges +denoting observed ratings. Building on recent +progress in deep learning on graph-structured +data, we propose a graph auto-encoder framework based on differentiable message passing +on the bipartite interaction graph. Our model +shows competitive performance on standard +collaborative filtering benchmarks. In settings +where complimentary feature information or +structured data such as a social network is +available, our framework outperforms recent +state-of-the-art methods. + +.. image:: ../../../asset/gcmc.png + :width: 700 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``accum (str)`` : The accumulation function in the GCN layers. Defaults to ``'stack'``. Range in ``['sum', 'stack']``. +- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.3``. +- ``gcn_output_dim (int)`` : The output dimension of GCN layer in GCN encoder. Defaults to ``500``. +- ``embedding_size (int)`` : The embedding size of user and item. Defaults to ``64``. 
+- ``sparse_feature (bool)`` : Whether to use sparse tensor to represent the features. Defaults to ``True``. +- ``class_num (int)`` : Number of rating types. Defaults to ``2``. +- ``num_basis_functions (int)`` : Number of basis functions for BiDecoder. Defaults to ``2``. + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='GCMC', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5] + accum choice ['stack','sum'] + gcn_output_dim choice [500,256,1024] + num_basis_functions choice ['2'] + +Note that we just provide these hyper parameter ranges for reference only, and we can not guarantee that they are the optimal range of this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run the ``run_hyper.py`` to tuning: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. 
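The ``accum`` hyper-parameter chooses how the per-rating-class messages are combined in the GCN encoder: ``'sum'`` adds them element-wise, while ``'stack'`` concatenates them, so the output dimension grows with ``class_num``. A minimal Python sketch of that choice (illustrative only, not RecBole's code):

```python
def accumulate(messages, accum):
    """Combine per-rating-class messages for one node.
    'sum' adds them element-wise; 'stack' concatenates them."""
    if accum == "sum":
        return [sum(vals) for vals in zip(*messages)]
    if accum == "stack":
        return [x for m in messages for x in m]
    raise ValueError("accum must be 'sum' or 'stack'")

msgs = [[1.0, 2.0], [3.0, 4.0]]       # one toy message per rating class
print(accumulate(msgs, "sum"))        # same dimension as a single message
print(accumulate(msgs, "stack"))      # dimension grows with the number of classes
```

This is why, with ``accum='stack'``, the layer that follows the accumulation has to accept an input whose size scales with the number of rating types.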
+ +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` diff --git a/docs/source/user_guide/model/general/itemknn.rst b/docs/source/user_guide/model/general/itemknn.rst new file mode 100644 index 000000000..1f63e6228 --- /dev/null +++ b/docs/source/user_guide/model/general/itemknn.rst @@ -0,0 +1,81 @@ +ItemKNN +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/963770.963776>`_ + +**Title:** Item-based top-N recommendation algorithms + +**Authors:** Mukund Deshpande and George Karypis + +**Abstract:** The explosive growth of the world-wide-web and the emergence of e-commerce has led to the development of recommender systems—a personalized information filtering technology used to identify +a set of items that will be of interest to a certain user. User-based collaborative filtering is the most +successful technology for building recommender systems to date and is extensively used in many +commercial recommender systems. Unfortunately, the computational complexity of these methods +grows linearly with the number of customers, which in typical commercial applications can be several millions. To address these scalability concerns model-based recommendation techniques have +been developed. These techniques analyze the user–item matrix to discover relations between the +different items and use these relations to compute the list of recommendations. + +In this article, we present one such class of model-based recommendation algorithms that first +determines the similarities between the various items and then uses them to identify the set of +items to be recommended. 
The key steps in this class of algorithms are (i) the method used to +compute the similarity between the items, and (ii) the method used to combine these similarities +in order to compute the similarity between a basket of items and a candidate recommender item. +Our experimental evaluation on eight real datasets shows that these item-based algorithms are +up to two orders of magnitude faster than the traditional user-neighborhood based recommender +systems and provide recommendations with comparable or better quality. + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``k (int)`` : The neighborhood size. Defaults to ``100``. + +- ``shrink (float)`` : A normalization hyper-parameter used when calculating the cosine distance. Defaults to ``0.0``. + + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='ItemKNN', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. code:: bash + + k choice [10,50,100,200,250,300,400,500,1000,1500,2000,2500] + shrink choice [0.0,1.0] + +Note that we provide these hyper-parameter ranges for reference only, and we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. 
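The ``shrink`` hyper-parameter enters the denominator of the cosine similarity, damping similarities that are supported by little overlapping evidence. A minimal Python sketch of the shrunk cosine (illustrative only; the function name and toy vectors are made up, and this is not RecBole's implementation):

```python
import math

def shrunk_cosine(vec_i, vec_j, shrink=0.0):
    """Cosine similarity with a shrink term added to the denominator."""
    dot = sum(a * b for a, b in zip(vec_i, vec_j))
    norm = math.sqrt(sum(a * a for a in vec_i)) * math.sqrt(sum(b * b for b in vec_j))
    return dot / (norm + shrink)

# Two items' interaction vectors over five users (1 = interacted).
item_i = [1, 0, 1, 1, 0]
item_j = [1, 0, 1, 0, 0]
print(shrunk_cosine(item_i, item_j))              # plain cosine, shrink = 0
print(shrunk_cosine(item_i, item_j, shrink=1.0))  # shrunk toward zero
```

With ``shrink=0.0`` (the default) this reduces to the ordinary cosine; increasing ``shrink`` pulls all similarities toward zero, with the strongest effect on item pairs whose vectors have small norms.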
+
+If you want to change parameters, dataset or evaluation settings, take a look at
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
\ No newline at end of file
diff --git a/docs/source/user_guide/model/general/lightgcn.rst b/docs/source/user_guide/model/general/lightgcn.rst
new file mode 100644
index 000000000..01127d929
--- /dev/null
+++ b/docs/source/user_guide/model/general/lightgcn.rst
@@ -0,0 +1,102 @@
+LightGCN
+============
+
+Introduction
+------------------
+
+`[paper] <https://dl.acm.org/doi/abs/10.1145/3397271.3401063>`_
+
+**Title:** LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation
+
+**Authors:** Xiangnan He, Kuan Deng, Xiang Wang, Yan Li, Yongdong Zhang, Meng Wang
+
+**Abstract:**
+Graph Convolution Network (GCN) has become new state-of-the-art for
+collaborative filtering. Nevertheless, the reasons of
+its effectiveness for recommendation are not well understood.
+Existing work that adapts GCN to recommendation lacks thorough
+ablation analyses on GCN, which is originally designed for graph
+classification tasks and equipped with many neural network
+operations. However, we empirically find that the two most
+common designs in GCNs — feature transformation and nonlinear
+activation — contribute little to the performance of collaborative
+filtering. Even worse, including them adds to the difficulty of
+training and degrades recommendation performance.
+
+In this work, we aim to simplify the design of GCN to
+make it more concise and appropriate for recommendation. We
+propose a new model named LightGCN, including only the most
+essential component in GCN — neighborhood aggregation — for
+collaborative filtering. 
Specifically, LightGCN learns user and
+item embeddings by linearly propagating them on the user-item
+interaction graph, and uses the weighted sum of the embeddings
+learned at all layers as the final embedding. Such simple, linear,
+and neat model is much easier to implement and train, exhibiting
+substantial improvements (about 16.0% relative improvement on
+average) over Neural Graph Collaborative Filtering (NGCF) — a
+state-of-the-art GCN-based recommender model — under exactly
+the same experimental setting. Further analyses are provided
+towards the rationality of the simple LightGCN from both analytical
+and empirical perspectives.
+
+
+.. image:: ../../../asset/lightgcn.png
+    :width: 500
+    :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``.
+- ``n_layers (int)`` : The number of layers in LightGCN. Defaults to ``2``.
+- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``1e-05``.
+
+
+**A Running Example:**
+
+Write the following code into a Python file, such as `run.py`:
+
+.. code:: python
+
+    from recbole.quick_start import run_recbole
+
+    run_recbole(model='LightGCN', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+    python run.py
+
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and name the file ``hyper.test``.
+
+.. code:: bash
+
+    learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+    n_layers choice [1,2,3,4]
+    reg_weight choice [1e-05,1e-04,1e-03,1e-02]
+
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+    python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
+
+
+If you want to change parameters, dataset or evaluation settings, take a look at
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
+
diff --git a/docs/source/user_guide/model/general/line.rst b/docs/source/user_guide/model/general/line.rst
new file mode 100644
index 000000000..c154154b2
--- /dev/null
+++ b/docs/source/user_guide/model/general/line.rst
@@ -0,0 +1,72 @@
+LINE
+===========
+
+Introduction
+---------------------
+
+`[paper] <https://dl.acm.org/doi/10.1145/2736277.2741093>`_
+
+**Title:** LINE: Large-scale Information Network Embedding
+
+**Authors:** Jian Tang, Meng Qu, Mingzhe Wang, Ming Zhang, Jun Yan, Qiaozhu Mei
+
+**Abstract:** This paper studies the problem of embedding very large information networks into low-dimensional vector spaces, which is useful in many tasks such as visualization, node classification, and link prediction. Most existing graph embedding methods do not scale for real world information networks which usually contain millions of nodes. In this paper, we propose a novel network embedding method called the ``LINE``, which is suitable for arbitrary types of information networks: undirected, directed, and/or weighted. The method optimizes a carefully designed objective function that preserves both the local and global network structures. An edge-sampling algorithm is proposed that addresses the limitation of the classical stochastic gradient descent and improves both the effectiveness and the efficiency of the inference. 
Empirical experiments prove the effectiveness of the LINE on a variety of real-world information networks, including language networks, social networks, and citation networks. The algorithm is very efficient, which is able to learn the embedding of a network with millions of vertices and billions of edges in a few hours on a typical single machine. The source code of the LINE is available.
+
+.. image:: ../../../asset/line.png
+    :width: 500
+    :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``.
+- ``order (int)`` : The order of proximity of the model. Defaults to ``2``.
+- ``second_order_loss_weight (float)`` : The weight of the second-order proximity loss. Defaults to ``1``.
+
+
+**A Running Example:**
+
+Write the following code into a Python file, such as `run.py`:
+
+.. code:: python
+
+    from recbole.quick_start import run_recbole
+
+    run_recbole(model='LINE', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+    python run.py
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and name the file ``hyper.test``.
+
+.. code:: bash
+
+    learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+    training_neg_sample_num choice [1,3,5]
+    second_order_loss_weight choice [0.3,0.6,1]
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+    python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. 
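The second-order proximity objective pulls a node's embedding toward the context embeddings of its observed neighbors and pushes it away from sampled negatives; ``second_order_loss_weight`` then scales this term when it is combined with the first-order loss. A pure-Python sketch of the per-edge negative-sampling loss (the function names are illustrative, not RecBole's API):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def second_order_edge_loss(u, ctx_pos, ctx_negs):
    # Negative-sampling form of LINE's second-order proximity for one
    # edge (u, v): maximise sigma(u . v') for the observed context
    # embedding, and sigma(-u . v'_neg) for each sampled negative.
    loss = -math.log(sigmoid(dot(u, ctx_pos)))
    for neg in ctx_negs:
        loss -= math.log(sigmoid(-dot(u, neg)))
    return loss
```

The loss shrinks as the vertex embedding aligns more strongly with the positive context, which is the behavior stochastic gradient descent exploits during training.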
+ + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/general/macridvae.rst b/docs/source/user_guide/model/general/macridvae.rst new file mode 100644 index 000000000..cf2cbd72f --- /dev/null +++ b/docs/source/user_guide/model/general/macridvae.rst @@ -0,0 +1,99 @@ +MacridVAE +=========== + +Introduction +--------------------- + +`[paper] <https://jianxinma.github.io/assets/disentangle-recsys.pdf>`_ + +**Title:** Learning Disentangled Representations for Recommendation + +**Authors:** Jianxin Ma, Chang Zhou, Peng Cui, Hongxia Yang, Wenwu Zhu + +**Abstract:** User behavior data in recommender systems are driven by the complex interactions +of many latent factors behind the users’ decision making processes. The factors are +highly entangled, and may range from high-level ones that govern user intentions, +to low-level ones that characterize a user’s preference when executing an intention. +Learning representations that uncover and disentangle these latent factors can bring +enhanced robustness, interpretability, and controllability. However, learning such +disentangled representations from user behavior is challenging, and remains largely +neglected by the existing literature. In this paper, we present the MACRo-mIcro +Disentangled Variational Auto-Encoder (MacridVAE) for learning disentangled +representations from user behavior. Our approach achieves macro disentanglement +by inferring the high-level concepts associated with user intentions (e.g., to buy +a shirt or a cellphone), while capturing the preference of a user regarding the +different concepts separately. 
A micro-disentanglement regularizer, stemming
+from an information-theoretic interpretation of VAEs, then forces each dimension
+of the representations to independently reflect an isolated low-level factor (e.g.,
+the size or the color of a shirt). Empirical results show that our approach can
+achieve substantial improvement over the state-of-the-art baselines. We further
+demonstrate that the learned representations are interpretable and controllable,
+which can potentially lead to a new paradigm for recommendation where users are
+given fine-grained control over targeted aspects of the recommendation lists.
+
+.. image:: ../../../asset/macridvae.png
+    :width: 500
+    :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``embedding_size (int)`` : The latent dimension of the auto-encoder. Defaults to ``128``.
+- ``dropout_prob (float)`` : The dropout probability of the input. Defaults to ``0.5``.
+- ``kfac (int)`` : The number of facets (macro concepts). Defaults to ``10``.
+- ``nogb (boolean)`` : Whether to disable Gumbel-Softmax sampling. Defaults to ``False``.
+- ``std (float)`` : The standard deviation of the Gaussian prior.
+- ``encoder_hidden_size (list)`` : The hidden layer sizes of the encoder MLP. Defaults to ``[600]``.
+- ``tau (float)`` : The temperature of the sigmoid/softmax, in (0, oo).
+- ``anneal_cap (float)`` : The cap on the weight of the KL loss. Defaults to ``0.2``.
+- ``total_anneal_steps (int)`` : The maximum number of annealing update steps. Defaults to ``200000``.
+- ``reg_weights (list)`` : The L2 regularization weights. Defaults to ``[0.0,0.0]``.
+- ``training_neg_sample (int)`` : The number of negative samples for training. Defaults to ``0``.
+
+
+**A Running Example:**
+
+Write the following code into a Python file, such as `run.py`:
+
+.. code:: python
+
+    from recbole.quick_start import run_recbole
+
+    run_recbole(model='MacridVAE', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+    python run.py
+
+**Note**: Because this is a non-sampling model, you must set ``training_neg_sample=0`` when running it. 
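``total_anneal_steps`` and ``anneal_cap`` together implement the usual beta-annealing schedule for the KL term: the weight grows linearly with the number of parameter updates and is clipped at ``anneal_cap``. A minimal sketch of that schedule (this follows the common Mult-VAE convention; RecBole's exact implementation may differ in detail):

```python
def kl_weight(update_count, anneal_cap=0.2, total_anneal_steps=200000):
    # Linear KL warm-up: the weight ramps up with the update count and
    # is clipped at anneal_cap; if total_anneal_steps is 0, the cap is
    # used directly from the first step.
    if total_anneal_steps > 0:
        return min(anneal_cap, update_count / total_anneal_steps)
    return anneal_cap
```

Early in training the reconstruction term dominates; the KL regularizer is phased in gradually, which tends to stabilize optimization of VAE-style recommenders.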
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and name the file ``hyper.test``.
+
+.. code:: bash
+
+    learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+    kfac choice [3,5,10,20]
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+    python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
+
+
+If you want to change parameters, dataset or evaluation settings, take a look at
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
\ No newline at end of file
diff --git a/docs/source/user_guide/model/general/multidae.rst b/docs/source/user_guide/model/general/multidae.rst
new file mode 100644
index 000000000..7ba8cbbf9
--- /dev/null
+++ b/docs/source/user_guide/model/general/multidae.rst
@@ -0,0 +1,73 @@
+MultiDAE
+===========
+
+Introduction
+---------------------
+
+`[paper] <https://dl.acm.org/doi/10.1145/3178876.3186150>`_
+
+**Title:** Variational Autoencoders for Collaborative Filtering
+
+**Authors:** Dawen Liang, Rahul G. Krishnan, Matthew D. Hoffman, Tony Jebara
+
+**Abstract:** We extend variational autoencoders (VAEs) to collaborative filtering for implicit feedback. 
This non-linear probabilistic model enables us to go beyond the limited modeling capacity of linear factor models which still largely dominate collaborative filtering research. We introduce a generative model with multinomial likelihood and use Bayesian inference for parameter estimation. Despite widespread use in language modeling and economics, the multinomial likelihood receives less attention in the recommender systems literature. We introduce a different regularization parameter for the learning objective, which proves to be crucial for achieving competitive performance. Remarkably, there is an efficient way to tune the parameter using annealing. The resulting model and learning algorithm has information-theoretic connections to maximum entropy discrimination and the information bottleneck principle. Empirically, we show that the proposed approach significantly outperforms several state-of-the-art baselines, including two recently-proposed neural network approaches, on several real-world datasets. We also provide extended experiments comparing the multinomial likelihood with other commonly used likelihood functions in the latent factor collaborative filtering literature and show favorable results. Finally, we identify the pros and cons of employing a principled Bayesian inference approach and characterize settings where it provides the most significant improvements.
+
+.. image:: ../../../asset/multidae.png
+    :width: 500
+    :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``latent_dimension (int)`` : The latent dimension of the auto-encoder. Defaults to ``64``.
+- ``mlp_hidden_size (list)`` : The hidden layer sizes of the MLP. Defaults to ``[600]``.
+- ``dropout_prob (float)`` : The dropout probability of the input. Defaults to ``0.5``.
+- ``training_neg_sample (int)`` : The number of negative samples for training. Defaults to ``0``.
+
+
+**A Running Example:**
+
+Write the following code into a Python file, such as `run.py`:
+
+.. code:: python
+
+    from recbole.quick_start import run_recbole
+
+    run_recbole(model='MultiDAE', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+    python run.py
+
+**Note**: Because this is a non-sampling model, you must set ``training_neg_sample=0`` when running it.
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and name the file ``hyper.test``.
+
+.. code:: bash
+
+    learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+    python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
+
+
+If you want to change parameters, dataset or evaluation settings, take a look at
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
\ No newline at end of file
diff --git a/docs/source/user_guide/model/general/multivae.rst b/docs/source/user_guide/model/general/multivae.rst
new file mode 100644
index 000000000..620e02bd4
--- /dev/null
+++ b/docs/source/user_guide/model/general/multivae.rst
@@ -0,0 +1,75 @@
+MultiVAE
+===========
+
+Introduction
+---------------------
+
+`[paper] <https://dl.acm.org/doi/10.1145/3178876.3186150>`_
+
+**Title:** Variational Autoencoders for Collaborative Filtering
+
+**Authors:** Dawen Liang, Rahul G. Krishnan, Matthew D. Hoffman, Tony Jebara
+
+**Abstract:** We extend variational autoencoders (VAEs) to collaborative filtering for implicit feedback. 
This non-linear probabilistic model enables us to go beyond the limited modeling capacity of linear factor models which still largely dominate collaborative filtering research. We introduce a generative model with multinomial likelihood and use Bayesian inference for parameter estimation. Despite widespread use in language modeling and economics, the multinomial likelihood receives less attention in the recommender systems literature. We introduce a different regularization parameter for the learning objective, which proves to be crucial for achieving competitive performance. Remarkably, there is an efficient way to tune the parameter using annealing. The resulting model and learning algorithm has information-theoretic connections to maximum entropy discrimination and the information bottleneck principle. Empirically, we show that the proposed approach significantly outperforms several state-of-the-art baselines, including two recently-proposed neural network approaches, on several real-world datasets. We also provide extended experiments comparing the multinomial likelihood with other commonly used likelihood functions in the latent factor collaborative filtering literature and show favorable results. Finally, we identify the pros and cons of employing a principled Bayesian inference approach and characterize settings where it provides the most significant improvements.
+
+.. image:: ../../../asset/multivae.png
+    :width: 500
+    :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``latent_dimension (int)`` : The latent dimension of the auto-encoder. Defaults to ``128``.
+- ``mlp_hidden_size (list)`` : The hidden layer sizes of the MLP. Defaults to ``[600]``.
+- ``dropout_prob (float)`` : The dropout probability of the input. Defaults to ``0.5``.
+- ``anneal_cap (float)`` : The cap on the weight of the KL loss. Defaults to ``0.2``.
+- ``total_anneal_steps (int)`` : The maximum number of annealing update steps. Defaults to ``200000``.
+- ``training_neg_sample (int)`` : The number of negative samples for training. Defaults to ``0``.
+
+
+**A Running Example:**
+
+Write the following code into a Python file, such as `run.py`:
+
+.. code:: python
+
+    from recbole.quick_start import run_recbole
+
+    run_recbole(model='MultiVAE', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+    python run.py
+
+**Note**: Because this is a non-sampling model, you must set ``training_neg_sample=0`` when running it.
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and name the file ``hyper.test``.
+
+.. code:: bash
+
+    learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+    python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. 
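The multinomial likelihood means the reconstruction term is a cross-entropy between a user's interaction vector and the softmax over all item logits produced by the decoder. A NumPy sketch of this term (the function name is illustrative, not RecBole's API):

```python
import numpy as np

def multinomial_nll(logits, x):
    # Reconstruction term of Mult-VAE style models, per user:
    # -sum_i x_ui * log softmax(logits)_ui,
    # computed with a numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(x * log_softmax).sum(axis=1)
```

Because the softmax normalizes over the whole catalogue, raising the probability of interacted items necessarily lowers that of the others, which gives the multinomial likelihood its implicit ranking flavor.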
+
+
+If you want to change parameters, dataset or evaluation settings, take a look at
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
\ No newline at end of file
diff --git a/docs/source/user_guide/model/general/nais.rst b/docs/source/user_guide/model/general/nais.rst
new file mode 100644
index 000000000..2e0f60ea7
--- /dev/null
+++ b/docs/source/user_guide/model/general/nais.rst
@@ -0,0 +1,95 @@
+NAIS
+===========
+
+Introduction
+---------------------
+
+`[paper] <https://doi.ieeecomputersociety.org/10.1109/TKDE.2018.2831682>`_
+
+**Title:** NAIS: Neural Attentive Item Similarity Model for Recommendation
+
+**Authors:** Xiangnan He, Zhankui He, Jingkuan Song, Zhenguang Liu, Yu-Gang Jiang, and Tat-Seng Chua
+
+**Abstract:** Item-to-item collaborative filtering (aka. item-based CF) has been long used for building
+recommender systems in industrial settings, owing to its interpretability and efficiency in real-time
+personalization. It builds a user’s profile as her historically interacted items, recommending new items
+that are similar to the user’s profile. As such, the key to an item-based CF method is in the estimation
+of item similarities. Early approaches use statistical measures such as cosine similarity and Pearson
+coefficient to estimate item similarities, which are less accurate since they lack tailored optimization
+for the recommendation task. In recent years, several works attempt to learn item similarities from data,
+by expressing the similarity as an underlying model and estimating model parameters by optimizing a
+recommendation-aware objective function. While extensive efforts have been made to use shallow linear
+models for learning item similarities, there has been relatively less work exploring nonlinear neural
+network models for item-based CF. 
In this work, we propose a neural network model named Neural Attentive
+Item Similarity Model (NAIS) for item-based CF. The key to our design of NAIS is an attention network,
+which is capable of distinguishing which historical items in a user profile are more important for a prediction.
+Compared to the state-of-the-art item-based CF method Factored Item Similarity Model (FISM), our NAIS has
+stronger representation power with only a few additional parameters brought by the attention network.
+Extensive experiments on two public benchmarks demonstrate the effectiveness of NAIS. This work is the first
+attempt that designs neural network models for item-based CF, opening up new research possibilities for future
+developments of neural recommender systems.
+
+.. image:: ../../../asset/nais.png
+    :width: 500
+    :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``.
+- ``weight_size (int)`` : The size of the vector that projects the hidden layer into the output attention weight. Defaults to ``64``.
+- ``algorithm (str)`` : The attention method. Defaults to ``'prod'``. Range in ``['prod', 'concat']``.
+- ``split_to (int)`` : A parameter used to reduce GPU memory usage during evaluation. The larger the value, the lower the memory usage and the slower the evaluation speed. Defaults to ``0``.
+- ``alpha (float)`` : A hyper-parameter controlling the normalization effect of the number of user history interactions when calculating the similarity. Defaults to ``0``.
+- ``beta (float)`` : The smoothing exponent controlling the denominator of the softmax; it is set in the range ``[0, 1]``. When ``beta`` is ``1``, it recovers the standard softmax function; when ``beta`` is smaller than ``1``, the denominator is suppressed, so the attention weights of active users are not overly penalized. Defaults to ``0.5``.
+- ``reg_weights (list)`` : The L2 regularization weights. Defaults to ``[1e-7, 1e-7, 1e-5]``.
+
+
+**A Running Example:**
+
+Write the following code into a Python file, such as `run.py`:
+
+.. code:: python
+
+    from recbole.quick_start import run_recbole
+
+    run_recbole(model='NAIS', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+    python run.py
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and name the file ``hyper.test``.
+
+.. code:: bash
+
+    learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+    weight_size choice [64]
+    reg_weights choice ['[1e-7, 1e-7, 1e-5]','[0,0,0]']
+    alpha choice [0]
+    beta choice [0.5]
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+    python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. 
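The ``beta``-smoothed attention can be written as a softmax whose denominator is raised to the power ``beta``: with ``beta = 1`` it is the ordinary softmax, while smaller values weaken the denominator so that users with long histories do not have their attention weights crushed. A sketch of that normalization (illustrative, not RecBole's internal code; note that for ``beta != 1`` the result depends on how the scores are shifted before exponentiation):

```python
import numpy as np

def smoothed_softmax(scores, beta=0.5):
    # NAIS-style normalisation: divide by (sum of exponentials) ** beta.
    # beta = 1 gives the standard softmax; beta < 1 suppresses the
    # denominator, producing larger weights for long score vectors.
    e = np.exp(scores - scores.max())  # keep exp() finite
    return e / (e.sum() ** beta)
```

With ``beta < 1`` the weights no longer sum to one, which is intentional: the model trades a proper distribution for attention magnitudes that scale more gently with history length.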
+ + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/general/neumf.rst b/docs/source/user_guide/model/general/neumf.rst new file mode 100644 index 000000000..3ff4548a2 --- /dev/null +++ b/docs/source/user_guide/model/general/neumf.rst @@ -0,0 +1,81 @@ +NeuMF +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/abs/10.1145/3038912.3052569>`_ + +**Title:** Neural Collaborative Filtering + +**Authors:** Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu and Tat-Seng Chua + +**Abstract:** In recent years, deep neural networks have yielded immense success on speech recognition, computer vision and natural language processing. However, the exploration of deep neural networks on recommender systems has received relatively less scrutiny. In this work, we strive to develop techniques based on neural networks to tackle the key problem in recommendation --- collaborative filtering --- on the basis of implicit feedback. + +Although some recent work has employed deep learning for recommendation, they primarily used it to model auxiliary information, such as textual descriptions of items and acoustic features of musics. When it comes to model the key factor in collaborative filtering --- the interaction between user and item features, they still resorted to matrix factorization and applied an inner product on the latent features of users and items. + +By replacing the inner product with a neural architecture that can learn an arbitrary function from data, we present a general framework named NCF, short for Neural network-based Collaborative Filtering. NCF is generic and can express and generalize matrix factorization under its framework. 
To supercharge NCF modelling with non-linearities, we propose to leverage a multi-layer perceptron to learn the user-item interaction function. Extensive experiments on two real-world datasets show significant improvements of our proposed NCF framework over the state-of-the-art methods. Empirical evidence shows that using deeper layers of neural networks offers better recommendation performance.
+
+.. image:: ../../../asset/neumf.png
+    :width: 500
+    :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``mf_embedding_size (int)`` : The MF embedding size of users and items. Defaults to ``64``.
+- ``mlp_embedding_size (int)`` : The MLP embedding size of users and items. Defaults to ``64``.
+- ``mlp_hidden_size (list)`` : The hidden size of each layer in the MLP; the length of the list equals the number of layers. Defaults to ``[128,64]``.
+- ``dropout_prob (float)`` : The dropout rate in the MLP layers. Defaults to ``0.1``.
+- ``mf_train (bool)`` : Whether to train the MF part of the model. Defaults to ``True``.
+- ``mlp_train (bool)`` : Whether to train the MLP part of the model. Defaults to ``True``.
+- ``use_pretrain (bool)`` : Whether to use pre-trained parameters for the MF and MLP parts. Defaults to ``False``.
+- ``mf_pretrain_path`` : The path of the pre-trained MF part of the model. Ignored if ``use_pretrain`` is ``False``. Defaults to ``None``.
+- ``mlp_pretrain_path`` : The path of the pre-trained MLP part of the model. Ignored if ``use_pretrain`` is ``False``. Defaults to ``None``.
+
+**A Running Example:**
+
+Write the following code into a Python file, such as `run.py`:
+
+.. code:: python
+
+    from recbole.quick_start import run_recbole
+
+    run_recbole(model='NeuMF', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+    python run.py
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and name the file ``hyper.test``.
+
+.. code:: bash
+
+    learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+    dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5]
+    mlp_hidden_size choice ['[64,32,16]','[32,16,8]']
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+    python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
+
+
+If you want to change parameters, dataset or evaluation settings, take a look at
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
\ No newline at end of file
diff --git a/docs/source/user_guide/model/general/ngcf.rst b/docs/source/user_guide/model/general/ngcf.rst
new file mode 100644
index 000000000..b1e12ae01
--- /dev/null
+++ b/docs/source/user_guide/model/general/ngcf.rst
@@ -0,0 +1,79 @@
+NGCF
+===========
+
+Introduction
+---------------------
+
+`[paper] <https://dl.acm.org/doi/abs/10.1145/3331184.3331267>`_
+
+**Title:** Neural Graph Collaborative Filtering
+
+**Authors:** Xiang Wang, Xiangnan He, Meng Wang, Fuli Feng and Tat-Seng Chua
+
+**Abstract:** Learning vector representations (aka. embeddings) of users and items lies at the core of modern recommender systems. 
Ranging from early matrix factorization to recently emerged deep learning based methods, existing efforts typically obtain a user's (or an item's) embedding by mapping from pre-existing features that describe the user (or the item), such as ID and attributes. We argue that an inherent drawback of such methods is that, the collaborative signal, which is latent in user-item interactions, is not encoded in the embedding process. As such, the resultant embeddings may not be sufficient to capture the collaborative filtering effect. + +In this work, we propose to integrate the user-item interactions - more specifically the bipartite graph structure - into the embedding process. We develop a new recommendation framework Neural Graph Collaborative Filtering (NGCF), which exploits the user-item graph structure by propagating embeddings on it. This leads to the expressive modeling of high-order connectivity in user-item graph, effectively injecting the collaborative signal into the embedding process in an explicit manner. We conduct extensive experiments on three public benchmarks, demonstrating significant improvements over several state-of-the-art models like HOP-Rec and Collaborative Memory Network . Further analysis verifies the importance of embedding propagation for learning better user and item representations, justifying the rationality and effectiveness of NGCF. + +.. image:: ../../../asset/ngcf.jpg + :width: 500 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``. +- ``hidden_size_list (list)`` : The hidden size of each layer in GCN layers, the length of list is equal to the number of layers. Defaults to ``[64,64,64]``. +- ``node_dropout (float)`` : The dropout rate of node in each GNN layer. Defaults to ``0.0``. +- ``message_dropout (float)`` : The dropout rate of edge in each GNN layer. Defaults to ``0.1``. 
+- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``1e-5``. + + +**A Running Example:** + +Write the following code to a Python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='NGCF', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings into a file named ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + hidden_size_list choice ['[64,64,64]','[128,128,128]','[256,256,256]','[512,512,512]'] + node_dropout choice [0.0,0.1,0.2] + message_dropout choice [0.0,0.1,0.2,0.3] + reg_weight choice [1e-5,1e-4] + delay choice [1e-5,1e-4,1e-3,1e-2,1e-1] + +Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (which you can download from GitHub), run ``run_hyper.py`` to start tuning: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
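The ``hyper.test`` file above uses one ``<parameter> choice <list-of-values>`` line per tunable parameter. As a rough illustration of the search space such a file describes (this is a hypothetical stdlib sketch, not RecBole's actual parser), the full exhaustive-search grid can be enumerated like this:

```python
import ast
import itertools

def parse_hyper_file(lines):
    """Parse 'name choice [v1,v2,...]' lines into {name: [values]}."""
    space = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name, kind, values = line.split(None, 2)
        assert kind == "choice", "only the 'choice' form is sketched here"
        space[name] = ast.literal_eval(values)
    return space

hyper_test = [
    "learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]",
    "node_dropout choice [0.0,0.1,0.2]",
    "message_dropout choice [0.0,0.1,0.2,0.3]",
]
space = parse_hyper_file(hyper_test)
# Every combination an exhaustive search over this file would try:
grid = [dict(zip(space, combo)) for combo in itertools.product(*space.values())]
# 5 * 3 * 4 = 60 candidate configurations

```

Note that list-valued parameters such as ``hidden_size_list`` are written as quoted strings (e.g. ``'[64,64,64]'``), so each choice stays a single token on the line.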
+ + +If you want to change the parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/general/nncf.rst b/docs/source/user_guide/model/general/nncf.rst new file mode 100644 index 000000000..824483b45 --- /dev/null +++ b/docs/source/user_guide/model/general/nncf.rst @@ -0,0 +1,84 @@ +NNCF +========== + +Introduction +------------- + +`[paper] <https://dl.acm.org/doi/10.1145/3132847.3133083>`_ + +**Title:** A Neural Collaborative Filtering Model with Interaction-based Neighborhood + +**Authors:** Ting Bai, Ji-Rong Wen, Jun Zhang, Wayne Xin Zhao + +**Abstract:** Recently, deep neural networks have been widely applied to recommender systems. A representative work is to utilize deep learning for modeling complex user-item interactions. However, similar to traditional latent factor models by factorizing user-item interactions, they tend to be ineffective to capture localized information. Localized information, such as neighborhood, is important to recommender systems in complementing the user-item interaction data. Based on this consideration, we propose a novel Neighborhood-based Neural Collaborative Filtering model (NNCF). To the best of our knowledge, it is the first time that the neighborhood information is integrated into the neural collaborative filtering methods. Extensive experiments on three real-world datasets demonstrate the effectiveness of our model for the implicit recommendation task. + +.. image:: ../../../asset/nncf.png + :width: 500 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``ui_embedding_size (int)``: The embedding size of users and items. Defaults to ``64``. +- ``neigh_embedding_size (int)``: The embedding size of neighborhood information.
Defaults to ``64``. +- ``num_conv_kernel (int)``: The number of kernels in the convolution layer. Defaults to ``128``. +- ``conv_kernel_size (int)``: The size of each kernel in the convolution layer. Defaults to ``5``. +- ``pool_kernel_size (int)``: The size of each kernel in the pooling layer. Defaults to ``5``. +- ``mlp_hidden_size (list)``: The hidden size of each layer in the MLP; the length of the list is equal to the number of layers. Defaults to ``[128,64,32,16]``. +- ``neigh_num (int)``: The number of neighbors to choose. Defaults to ``20``. +- ``dropout (float)``: The dropout rate in MLP layers. Defaults to ``0.5``. +- ``resolution (float)``: The parameter of the Louvain algorithm, which decides the size of the communities. Defaults to ``1.0``. +- ``use_random (bool)``: Whether to use the random method to train the neighborhood embedding. Defaults to ``True``. +- ``use_knn (bool)``: Whether to use the kNN method to train the neighborhood embedding. Defaults to ``False``. +- ``use_louvain (bool)``: Whether to use the Louvain method to train the neighborhood embedding. Defaults to ``False``. + + +**A Running Example:** + +Write the following code to a Python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='NNCF', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings into a file named ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.0005,0.0001,0.00005] + neigh_embedding_size choice [64,32] + mlp_hidden_size choice ['[128,64,32,16]','[64,32,16,8]'] + num_conv_kernel choice [128,64] + + +Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (which you can download from GitHub), run ``run_hyper.py`` to start tuning: + +.. 
code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change the parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` diff --git a/docs/source/user_guide/model/general/pop.rst b/docs/source/user_guide/model/general/pop.rst new file mode 100644 index 000000000..7171fe4a1 --- /dev/null +++ b/docs/source/user_guide/model/general/pop.rst @@ -0,0 +1,39 @@ +Pop +=========== + +Introduction +--------------------- + +This is a non-personalized baseline that records the popularity of each item in the dataset and recommends the most popular items to every user. + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- No hyper-parameters + + +**A Running Example:** + +Write the following code to a Python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='Pop', dataset='ml-100k') + +And then: + +.. 
code:: bash + + python run.py + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/general/spectralcf.rst b/docs/source/user_guide/model/general/spectralcf.rst new file mode 100644 index 000000000..b222c6f14 --- /dev/null +++ b/docs/source/user_guide/model/general/spectralcf.rst @@ -0,0 +1,81 @@ +SpectralCF +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/3240323.3240343>`_ + +**Title:** Spectral collaborative filtering + +**Authors:** Lei Zheng, Chun-Ta Lu, Fei Jiang, Jiawei Zhang, Philip S. Yu + +**Abstract:** Despite the popularity of Collaborative Filtering (CF), CF-based methods are haunted by the cold-start problem, +which has a significantly negative impact on users' experiences with Recommender Systems (RS). In this paper, to overcome the +aforementioned drawback, we first formulate the relationships between users and items as a bipartite graph. Then, we propose +a new spectral convolution operation directly performing in the spectral domain, where not only the proximity information of +a graph but also the connectivity information hidden in the graph are revealed. With the proposed spectral convolution operation, +we build a deep recommendation model called Spectral Collaborative Filtering (SpectralCF). Benefiting from the rich information +of connectivity existing in the spectral domain, SpectralCF is capable of discovering deep connections between users and items +and therefore, alleviates the cold-start problem for CF. To the best of our knowledge, SpectralCF is the first CF-based method +directly learning from the spectral domains of user-item bipartite graphs. We apply our method on several standard datasets. 
+It is shown that SpectralCF significantly outperforms state-of-the-art models. + +.. image:: ../../../asset/spectralcf.png + :width: 700 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``. +- ``n_layers (int)`` : The number of layers in SpectralCF. Defaults to ``4``. +- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``1e-3``. + +**A Running Example:** + +Write the following code to a Python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='SpectralCF', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings into a file named ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + reg_weight choice [0.01,0.002,0.001,0.0005] + n_layers choice [1,2,3,4] + +Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (which you can download from GitHub), run ``run_hyper.py`` to start tuning: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
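SpectralCF's convolution operates in the spectral domain of the user-item bipartite graph, i.e. on the spectrum of the normalized graph Laplacian L = I - D^{-1/2} A D^{-1/2}. As a toy, pure-Python sketch (hypothetical two-user, two-item graph; not RecBole's implementation), that Laplacian can be built as follows:

```python
# Toy bipartite graph: 2 users, 2 items, node order [u0, u1, i0, i1].
# Interactions: u0-i0, u0-i1, u1-i1.
edges = [(0, 2), (0, 3), (1, 3)]
n = 4
A = [[0.0] * n for _ in range(n)]  # symmetric adjacency over all nodes
for u, i in edges:
    A[u][i] = A[i][u] = 1.0

deg = [sum(row) for row in A]  # node degrees

# Normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
L = [[(1.0 if r == c else 0.0) - A[r][c] / ((deg[r] * deg[c]) ** 0.5)
     for c in range(n)] for r in range(n)]
```

The eigenvectors of this symmetric matrix define the graph Fourier basis on which SpectralCF's spectral convolution is formulated.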
+ + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` + diff --git a/docs/source/user_guide/model/knowledge/cfkg.rst b/docs/source/user_guide/model/knowledge/cfkg.rst new file mode 100644 index 000000000..220897e92 --- /dev/null +++ b/docs/source/user_guide/model/knowledge/cfkg.rst @@ -0,0 +1,89 @@ +CFKG +=========== + +Introduction +--------------------- + +`[paper] <https://www.mdpi.com/1999-4893/11/9/137>`_ + +**Title:** Learning Heterogeneous Knowledge Base Embeddings for Explainable Recommendation + +**Authors:** Qingyao Ai, Vahid Azizi, Xu Chen and Yongfeng Zhang + +**Abstract:** Providing model-generated explanations in recommender systems is important to user +experience. State-of-the-art recommendation algorithms—especially the collaborative filtering +(CF)-based approaches with shallow or deep models—usually work with various unstructured +information sources for recommendation, such as textual reviews, visual images, and various implicit or +explicit feedbacks. Though structured knowledge bases were considered in content-based approaches, +they have been largely ignored recently due to the availability of vast amounts of data and the learning +power of many complex models. However, structured knowledge bases exhibit unique advantages +in personalized recommendation systems. When the explicit knowledge about users and items is +considered for recommendation, the system could provide highly customized recommendations based +on users’ historical behaviors and the knowledge is helpful for providing informed explanations +regarding the recommended items. 
A great challenge for using knowledge bases for recommendation is +how to integrate large-scale structured and unstructured data, while taking advantage of collaborative +filtering for highly accurate performance. Recent achievements in knowledge-base embedding (KBE) +sheds light on this problem, which makes it possible to learn user and item representations while +preserving the structure of their relationship with external knowledge for explanation. In this work, +we propose to explain knowledge-base embeddings for explainable recommendation. Specifically, +we propose a knowledge-base representation learning framework to embed heterogeneous entities for +recommendation, and based on the embedded knowledge base, a soft matching algorithm is proposed +to generate personalized explanations for the recommended items. Experimental results on real-world +e-commerce datasets verified the superior recommendation performance and the explainability power +of our approach compared with state-of-the-art baselines. + + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of users, items, entities and relations. Defaults to ``64``. +- ``loss_function (str)`` : The optimization loss function. Defaults to ``'inner_product'``. Range in ``['inner_product', 'transe']``. +- ``margin (float)`` : The margin in margin loss, only be used when ``loss_function`` is set to ``'transe'``. Defaults to ``1.0``. + + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='CFKG', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. 
code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + loss_function choice ['inner_product', 'transe'] + margin choice [0.5,1.0,2.0] + +Note that we just provide these hyper parameter ranges for reference only, and we can not guarantee that they are the optimal range of this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run the ``run_hyper.py`` to tuning: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/knowledge/cke.rst b/docs/source/user_guide/model/knowledge/cke.rst new file mode 100644 index 000000000..336542f4c --- /dev/null +++ b/docs/source/user_guide/model/knowledge/cke.rst @@ -0,0 +1,88 @@ +CKE +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/2939672.2939673>`_ + +**Title:** Collaborative Knowledge Base Embedding for Recommender Systems + +**Authors:** Fuzheng Zhang, Nicholas Jing Yuan, Defu Lian, Xing Xie, Wei-Ying Ma + +**Abstract:** Among different recommendation techniques, collaborative filtering usually suffer from limited performance due to the sparsity +of user-item interactions. To address the issues, auxiliary information is usually used to boost the performance. Due to the rapid +collection of information on the web, the knowledge base provides +heterogeneous information including both structured and unstructured data with different semantics, which can be consumed by various applications. 
In this paper, we investigate how to leverage +the heterogeneous information in a knowledge base to improve the +quality of recommender systems. First, by exploiting the knowledge base, we design three components to extract items’ semantic +representations from structural content, textual content and visual content, respectively. To be specific, we adopt a heterogeneous +network embedding method, termed as TransR, to extract items’ +structural representations by considering the heterogeneity of both +nodes and relationships. We apply stacked denoising auto-encoders +and stacked convolutional auto-encoders, which are two types of +deep learning based embedding techniques, to extract items’ textual representations and visual representations, respectively. Finally, we propose our final integrated framework, which is termed as +Collaborative Knowledge Base Embedding (CKE), to jointly learn +the latent representations in collaborative filtering as well as items’ semantic representations from the knowledge base. To evaluate the performance of each embedding component as well as the +whole system, we conduct extensive experiments with two realworld datasets from different scenarios. The results reveal that our +approaches outperform several widely adopted state-of-the-art recommendation methods. + +.. image:: ../../../asset/cke.png + :width: 600 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of users, items and entities. Defaults to ``64``. +- ``kg_embedding_size (int)`` : The embedding size of relations in knowledge graph. Defaults to ``64``. +- ``reg_weights (list of float)`` : The L2 regularization weights, there are two values, + the former is for user and item embedding regularization and the latter is for entity and relation embedding regularization. 
Defaults to ``[1e-02,1e-02]``. + + +**A Running Example:** + +Write the following code to a Python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='CKE', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings into a file named ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + kg_embedding_size choice [16,32,64,128] + reg_weights choice ['[0.1,0.1]','[0.01,0.01]','[0.001,0.001]'] + +Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (which you can download from GitHub), run ``run_hyper.py`` to start tuning: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
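The two entries of ``reg_weights`` penalize the two parameter groups separately: the first scales the user/item embedding penalty, the second the entity/relation embedding penalty. A minimal plain-Python sketch of that split L2 regularization (illustrative function names, not CKE's actual code):

```python
def l2(vecs):
    """Sum of squared L2 norms over a list of embedding vectors."""
    return sum(x * x for v in vecs for x in v)

def split_reg_loss(user_item_embs, kg_embs, reg_weights):
    """Two-part L2 penalty: reg_weights[0] scales user/item embeddings,
    reg_weights[1] scales entity/relation embeddings."""
    return reg_weights[0] * l2(user_item_embs) + reg_weights[1] * l2(kg_embs)

# With the default [1e-02, 1e-02]: 0.01 * (1 + 4) + 0.01 * 9
reg = split_reg_loss([[1.0, 2.0]], [[3.0]], reg_weights=[1e-2, 1e-2])
```

Setting the two entries independently (as the ``reg_weights`` search space above does) lets the collaborative and knowledge-base parts be regularized at different strengths.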
+ + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/knowledge/kgat.rst b/docs/source/user_guide/model/knowledge/kgat.rst new file mode 100644 index 000000000..84c6aac65 --- /dev/null +++ b/docs/source/user_guide/model/knowledge/kgat.rst @@ -0,0 +1,109 @@ +KGAT +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/3292500.3330989>`_ + +**Title:** KGAT: Knowledge Graph Attention Network for Recommendation + +**Authors:** Xiang Wang, Xiangnan He, Yixin Cao, Meng Liu, Tat-Seng Chua + +**Abstract:** To provide more accurate, diverse, and explainable recommendation, +it is compulsory to go beyond modeling user-item interactions +and take side information into account. Traditional methods like +factorization machine (FM) cast it as a supervised learning problem, +which assumes each interaction as an independent instance with +side information encoded. Due to the overlook of the relations +among instances or items (e.g., the director of a movie is also an +actor of another movie), these methods are insufficient to distill the +collaborative signal from the collective behaviors of users. + +In this work, we investigate the utility of knowledge graph +(KG), which breaks down the independent interaction assumption +by linking items with their attributes. We argue that in such a +hybrid structure of KG and user-item graph, high-order relations +— which connect two items with one or multiple linked attributes +— are an essential factor for successful recommendation. We +propose a new method named Knowledge Graph Attention Network +(KGAT) which explicitly models the high-order connectivities +in KG in an end-to-end fashion. 
It recursively propagates the +embeddings from a node’s neighbors (which can be users, items, +or attributes) to refine the node’s embedding, and employs +an attention mechanism to discriminate the importance of the +neighbors. Our KGAT is conceptually advantageous to existing +KG-based recommendation methods, which either exploit highorder relations by extracting paths or implicitly modeling them +with regularization. Empirical results on three public benchmarks +show that KGAT significantly outperforms state-of-the-art methods +like Neural FM and RippleNet. Further studies verify +the efficacy of embedding propagation for high-order relation +modeling and the interpretability benefits brought by the attention +mechanism. + +.. image:: ../../../asset/kgat.png + :width: 600 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of users, items and entities. Defaults to ``64``. +- ``kg_embedding_size (int)`` : The embedding size of relations in knowledge graph. Defaults to ``64``. +- ``layers (list of int)`` : The hidden size in GNN layers, the length of this list is equal to the number of layers in GNN structure. Defaults to ``[64]``. +- ``mess_dropout (float)`` : The message dropout rate in GNN layer. Defaults to ``0.1``. +- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``1e-05``. +- ``aggregator_type (str)`` : The aggregator type used in GNN layer. Defaults to ``'bi'``. Range in ``['gcn', 'graphsage', 'bi']``. + + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='KGAT', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +**Notes:** + +- If you want to run KGAT in RecBole, please ensure the torch version is 1.6.0 or later. 
This is because KGAT relies on ``torch.sparse.softmax``, which earlier torch versions do not provide. + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings into a file named ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + layers choice ['[64,32,16]','[64,64,64]','[128,64,32]'] + reg_weight choice [1e-4,5e-5,1e-5,5e-6,1e-6] + mess_dropout choice [0.1,0.2,0.3,0.4,0.5] + +Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (which you can download from GitHub), run ``run_hyper.py`` to start tuning: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
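The ``'bi'`` aggregator combines an additive and an element-wise interaction between a node's embedding and the attention-weighted sum of its neighbors' embeddings, each passed through a LeakyReLU. A rough scalar-weight sketch of that idea (hypothetical simplification; KGAT uses learned weight matrices, not scalars):

```python
def leaky_relu(x, slope=0.01):
    """Standard LeakyReLU on a single value."""
    return x if x >= 0 else slope * x

def bi_aggregate(e, e_neigh, w1=1.0, w2=1.0):
    """Bi-interaction aggregation, element-wise with scalar 'weights':
    LeakyReLU(w1 * (e + eN)) + LeakyReLU(w2 * (e * eN))."""
    add_part = [leaky_relu(w1 * (a + b)) for a, b in zip(e, e_neigh)]
    mul_part = [leaky_relu(w2 * (a * b)) for a, b in zip(e, e_neigh)]
    return [x + y for x, y in zip(add_part, mul_part)]

# e = node embedding, e_neigh = attention-weighted neighbor sum
out = bi_aggregate([1.0, -1.0], [0.5, 0.25])
```

The ``'gcn'`` and ``'graphsage'`` choices keep only an additive or a concatenation-based combination, respectively; ``'bi'`` retains both signals.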
+ + +If you want to change the parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/knowledge/kgcn.rst b/docs/source/user_guide/model/knowledge/kgcn.rst new file mode 100644 index 000000000..79a9fdc15 --- /dev/null +++ b/docs/source/user_guide/model/knowledge/kgcn.rst @@ -0,0 +1,86 @@ +KGCN +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/3308558.3313417>`_ + +**Title:** Knowledge Graph Convolutional Networks for Recommender Systems + +**Authors:** Hongwei Wang, Miao Zhao, Xing Xie, Wenjie Li, Minyi Guo + +**Abstract:** To alleviate sparsity and cold start problem of collaborative filtering +based recommender systems, researchers and engineers usually +collect attributes of users and items, and design delicate algorithms +to exploit these additional information. In general, the attributes are +not isolated but connected with each other, which forms a knowledge graph (KG). In this paper, we propose Knowledge Graph +Convolutional Networks (KGCN), an end-to-end framework that +captures inter-item relatedness effectively by mining their associated attributes on the KG. To automatically discover both high-order +structure information and semantic information of the KG, we sample from the neighbors for each entity in the KG as their receptive +field, then combine neighborhood information with bias when calculating the representation of a given entity. The receptive field can +be extended to multiple hops away to model high-order proximity +information and capture users’ potential long-distance interests. +Moreover, we implement the proposed KGCN in a minibatch fashion, which enables our model to operate on large datasets and KGs. 
+We apply the proposed model to three datasets about movie, book, +and music recommendation, and experiment results demonstrate +that our approach outperforms strong recommender baselines. + +.. image:: ../../../asset/kgcn.png + :width: 500 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of users, relations and entities. Defaults to ``64``. +- ``aggregator (str)`` : The aggregator used in GNN layers. Defaults to ``'sum'``. Range in ``['sum', 'neighbor', 'concat']``. +- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``1e-7``. +- ``neighbor_sample_size (int)`` : The number of neighbors to be sampled. Defaults to ``4``. +- ``n_iter (int)`` : The number of iterations when computing entity representation. Defaults to ``1``. + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='KGCN', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + +Note that we just provide these hyper parameter ranges for reference only, and we can not guarantee that they are the optimal range of this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run the ``run_hyper.py`` to tuning: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. 
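``neighbor_sample_size`` fixes the size of each entity's receptive field: entities with more neighbors are subsampled, and entities with fewer are padded by sampling with replacement, so every entity contributes the same number of neighbors per layer. A plain-Python sketch of that idea (a hypothetical helper, not RecBole's implementation):

```python
import random

def sample_neighbors(neighbors, size, rng):
    """Return exactly `size` neighbors: subsample without replacement
    when there are enough, otherwise sample with replacement to pad."""
    if len(neighbors) >= size:
        return rng.sample(neighbors, size)
    return [rng.choice(neighbors) for _ in range(size)]

rng = random.Random(0)
few = sample_neighbors([7], size=4, rng=rng)              # padded: [7, 7, 7, 7]
many = sample_neighbors(list(range(10)), size=4, rng=rng)  # 4 distinct neighbors
```

Repeating this sampling ``n_iter`` times extends the receptive field to multiple hops, which is how KGCN models higher-order proximity.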
+ + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` + diff --git a/docs/source/user_guide/model/knowledge/kgnnls.rst b/docs/source/user_guide/model/knowledge/kgnnls.rst new file mode 100644 index 000000000..915633f47 --- /dev/null +++ b/docs/source/user_guide/model/knowledge/kgnnls.rst @@ -0,0 +1,86 @@ +KGNNLS +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/3292500.3330836>`_ + +**Title:** Knowledge-aware Graph Neural Networks with Label Smoothness Regularization for Recommender Systems + +**Authors:** Hongwei Wang, Fuzheng Zhang, Mengdi Zhang, Jure Leskovec, Miao Zhao, Wenjie Li, Zhongyuan Wang + +**Abstract:** Knowledge graphs capture structured information and relations between a set of entities or items. +As such knowledge graphs represent an attractive source of information that could help improve recommender systems. +However, existing approaches in this domain rely on manual feature engineering and do not allow for an end-to-end +training. Here we propose Knowledge-aware Graph Neural Networks with Label Smoothness regularization (KGNN-LS) to +provide better recommendations. Conceptually, our approach computes user-specific item embeddings by first applying +a trainable function that identifies important knowledge graph relationships for a given user. This way we transform +the knowledge graph into a user-specific weighted graph and then apply a graph neural network to compute personalized +item embeddings. To provide better inductive bias, we rely on label smoothness assumption, which posits that adjacent +items in the knowledge graph are likely to have similar user relevance labels/scores. 
Label smoothness provides +regularization over the edge weights and we prove that it is equivalent to a label propagation scheme on a graph. +We also develop an efficient implementation that shows strong scalability with respect to the knowledge graph size. +Experiments on four datasets show that our method outperforms state of the art baselines. KGNN-LS also achieves +strong performance in cold-start scenarios where user-item interactions are sparse. + +.. image:: ../../../asset/kgnnls.png + :width: 600 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The initial embedding size of users, relations and entities. Defaults to ``64``. +- ``aggregator (str)`` : The aggregator used in GNN layers. Defaults to ``'sum'``. Range in ``['sum', 'neighbor', 'concat']``. +- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``1e-7``. +- ``neighbor_sample_size (int)`` : The number of neighbors to be sampled. Defaults to ``4``. +- ``n_iter (int)`` : The number of iterations when computing entity representation. Defaults to ``1``. +- ``ls_weight (float)`` : The label smoothness regularization weight. Defaults to ``0.5``. + + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='KGNNLS', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + +Note that we just provide these hyper parameter ranges for reference only, and we can not guarantee that they are the optimal range of this model. 
+ +Then, with the source code of RecBole (you can download it from GitHub), you can run the ``run_hyper.py`` to tuning: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` + diff --git a/docs/source/user_guide/model/knowledge/ktup.rst b/docs/source/user_guide/model/knowledge/ktup.rst new file mode 100644 index 000000000..b3ed4b760 --- /dev/null +++ b/docs/source/user_guide/model/knowledge/ktup.rst @@ -0,0 +1,103 @@ +KTUP +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/3308558.3313705>`_ + +**Title:** Unifying Knowledge Graph Learning and Recommendation: Towards a Better Understanding of User Preferences + +**Authors:** Yixin Cao, Xiang Wang, Xiangnan He, Zikun hu, Tat-Seng Chua + +**Abstract:** Incorporating knowledge graph (KG) into recommender system +is promising in improving the recommendation accuracy and explainability. However, existing methods largely assume that a KG is +complete and simply transfer the "knowledge" in KG at the shallow +level of entity raw data or embeddings. This may lead to suboptimal +performance, since a practical KG can hardly be complete, and it is +common that a KG has missing facts, relations, and entities. Thus, +we argue that it is crucial to consider the incomplete nature of KG +when incorporating it into recommender system. + +In this paper, we jointly learn the model of recommendation +and knowledge graph completion. 
Distinct from previous KG-based
+recommendation methods, we transfer the relation information
+in KG, so as to understand the reasons that a user likes an item.
+As an example, if a user has watched several movies directed by
+(relation) the same person (entity), we can infer that the director
+relation plays a critical role when the user makes the decision, thus
+helping to understand the user’s preference at a finer granularity.
+
+Technically, we contribute a new translation-based recommendation model, which specially accounts for various preferences in
+translating a user to an item, and then jointly train it with a KG
+completion model by combining several transfer schemes. Extensive experiments on two benchmark datasets show that our method
+outperforms state-of-the-art KG-based recommendation methods.
+Further analysis verifies the positive effect of joint training on both
+tasks of recommendation and KG completion, and the advantage
+of our model in understanding user preference.
+
+.. image:: ../../../asset/ktup.png
+    :width: 600
+    :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``train_rec_step (int)`` : The number of steps for continuously training the recommendation task. Defaults to ``5``.
+- ``train_kg_step (int)`` : The number of steps for continuously training the knowledge-related task. Defaults to ``5``.
+- ``embedding_size (int)`` : The embedding size of users, items, entities, relations and preferences. Defaults to ``64``.
+- ``use_st_gumbel (bool)`` : Whether to use Gumbel softmax. Defaults to ``True``.
+- ``L1_flag (bool)`` : Whether to use L1 distance to calculate dissimilarity; if set to ``False``, L2 distance is used. Defaults to ``False``.
+- ``margin (float)`` : The margin in margin loss. Defaults to ``1.0``.
+- ``kg_weight (float)`` : The weight decay for the kg model. Defaults to ``1.0``.
+- ``align_weight (float)`` : The align loss weight (pulls the item embeddings in the rec and kg tasks closer together).
Defaults to ``1.0``. + + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='KTUP', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + L1_flag choice [True, False] + use_st_gumbel choice [True, False] + train_rec_step choice [8,10] + train_kg_step choice [0,1,2,3,4,5] + +Note that we just provide these hyper parameter ranges for reference only, and we can not guarantee that they are the optimal range of this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run the ``run_hyper.py`` to tuning: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. 
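The ``L1_flag`` and ``margin`` settings above come from KTUP's TransE-style translation scoring: an interaction is scored by the distance between the translated user and the item, and a margin (hinge) loss pushes negative items farther away than positive ones. A minimal plain-Python sketch of that idea (our illustration only, with hypothetical names; not RecBole's implementation):

```python
import math

# Illustrative TransE-style dissimilarity used in translation-based models
# such as KTUP: score an interaction by how far `user + preference` lands
# from `item` in the embedding space.
def dissimilarity(user, preference, item, l1_flag=False):
    diff = [u + p - i for u, p, i in zip(user, preference, item)]
    if l1_flag:  # L1 distance
        return sum(abs(d) for d in diff)
    return math.sqrt(sum(d * d for d in diff))  # L2 distance

def margin_loss(pos_score, neg_score, margin=1.0):
    # Hinge loss: require the negative item to be at least `margin` farther away.
    return max(0.0, margin + pos_score - neg_score)

print(dissimilarity([1, 2], [3, 1], [4, 3]))  # 0.0: the translation lands exactly on the item
```

A smaller dissimilarity for the positive item than for negatives drives the margin loss toward zero.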
+
+
+If you want to change parameters, dataset or evaluation settings, take a look at
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
\ No newline at end of file
diff --git a/docs/source/user_guide/model/knowledge/mkr.rst b/docs/source/user_guide/model/knowledge/mkr.rst
new file mode 100644
index 000000000..62633f5fb
--- /dev/null
+++ b/docs/source/user_guide/model/knowledge/mkr.rst
@@ -0,0 +1,78 @@
+MKR
+===========
+
+Introduction
+---------------------
+
+`[paper] <https://dl.acm.org/doi/10.1145/3308558.3313411>`_
+
+**Title:** Multi-Task Feature Learning for Knowledge Graph Enhanced Recommendation
+
+**Authors:** Hongwei Wang, Fuzheng Zhang, Miao Zhao, Wenjie Li, Xing Xie, Minyi Guo
+
+**Abstract:** Collaborative filtering often suffers from sparsity and cold start problems in real recommendation scenarios, therefore, researchers and engineers usually use side information to address the issues and improve the performance of recommender systems. In this paper, we consider knowledge graphs as the source of side information. We propose MKR, a Multi-task feature learning approach for Knowledge graph enhanced Recommendation. MKR is a deep end-to-end framework that utilizes the knowledge graph embedding task to assist the recommendation task. The two tasks are associated by cross&compress units, which automatically share latent features and learn high-order interactions between items in recommender systems and entities in the knowledge graph. We prove that cross&compress units have sufficient capability of polynomial approximation, and show that MKR is a generalized framework over several representative methods of recommender systems and multi-task learning. Through extensive experiments on real-world datasets, we demonstrate that MKR achieves substantial gains in movie, book, music, and news recommendation, over state-of-the-art baselines.
MKR is also shown to be able to maintain satisfactory performance even if user-item interactions are sparse. + +.. image:: ../../../asset/mkr.png + :width: 600 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``. +- ``kg_embedding_size (int)`` : The embedding size of entities, relations. Defaults to ``64``. +- ``low_layers_num (int)`` : The number of low layers. Defaults to ``1``. +- ``high_layers_num (int)`` : The number of high layers. Defaults to ``1``. +- ``kge_interval (int)`` : The number of steps for continuous training knowledge related task. Defaults to ``3``. +- ``use_inner_product (bool)`` : Whether to use inner product to calculate scores. Defaults to ``True``. +- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``1e-6``. +- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.0``. + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='MKR', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + low_layers_num choice [1,2,3] + high_layers_num choice [1,2] + l2_weight choice [1e-6,1e-4] + kg_embedding_size choice [16,32,64] + +Note that we just provide these hyper parameter ranges for reference only, and we can not guarantee that they are the optimal range of this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run the ``run_hyper.py`` to tuning: + +.. 
code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` diff --git a/docs/source/user_guide/model/knowledge/ripplenet.rst b/docs/source/user_guide/model/knowledge/ripplenet.rst new file mode 100644 index 000000000..f1fadbd28 --- /dev/null +++ b/docs/source/user_guide/model/knowledge/ripplenet.rst @@ -0,0 +1,86 @@ +RippleNet +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/3269206.3271739>`_ + +**Title:** RippleNet: Propagating User Preferences on the Knowledge Graph for Recommender Systems + +**Authors:** Hongwei Wang, Fuzheng Zhang, Jialin Wang, Miao Zhao, Wenjie Li, Xing Xie, Minyi Guo + +**Abstract:** To address the sparsity and cold start problem of collaborative filtering, researchers usually make use of side information, such as social +networks or item attributes, to improve recommendation performance. This paper considers the knowledge graph as the source of +side information. To address the limitations of existing embeddingbased and path-based methods for knowledge-graph-aware recommendation, we propose RippleNet, an end-to-end framework that +naturally incorporates the knowledge graph into recommender +systems. Similar to actual ripples propagating on the water, RippleNet stimulates the propagation of user preferences over the set +of knowledge entities by automatically and iteratively extending a +user’s potential interests along links in the knowledge graph. 
The
+multiple "ripples" activated by a user’s historically clicked items
+are thus superposed to form the preference distribution of the user
+with respect to a candidate item, which could be used for predicting the final clicking probability. Through extensive experiments
+on real-world datasets, we demonstrate that RippleNet achieves
+substantial gains in a variety of scenarios, including movie, book
+and news recommendation, over several state-of-the-art baselines.
+
+.. image:: ../../../asset/ripplenet.jpg
+    :width: 600
+    :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``embedding_size (int)`` : The embedding size of users, items and entities. Defaults to ``64``.
+- ``n_hop (int)`` : The number of reasoning hops over the knowledge graph. Defaults to ``2``.
+- ``n_memory (int)`` : The memory size of every hop (the size of each ripple set). Defaults to ``16``.
+- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``1e-07``.
+- ``kg_weight (float)`` : The kg loss weight. Defaults to ``0.01``.
+
+
+**A Running Example:**
+
+Write the following code to a python file, such as `run.py`
+
+.. code:: python
+
+    from recbole.quick_start import run_recbole
+
+    run_recbole(model='RippleNet', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+    python run.py
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``.
+
+.. code:: bash
+
+    learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+    n_memory choice [4, 8, 16, 32]
+    training_neg_sample_num choice [1, 2, 5, 10]
+
+Note that we just provide these hyper parameter ranges for reference only, and we cannot guarantee that they are the optimal range of this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+..
code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/sequential/bert4rec.rst b/docs/source/user_guide/model/sequential/bert4rec.rst new file mode 100644 index 000000000..f6c6210ae --- /dev/null +++ b/docs/source/user_guide/model/sequential/bert4rec.rst @@ -0,0 +1,98 @@ +BERT4Rec +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/3357384.3357895>`_ + +**Title:** BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer + +**Authors:** Fei Sun, Jun Liu, Jian Wu, Changhua Pei, Xiao Lin, Wenwu Ou, Peng Jiang + +**Abstract:** Modeling users' dynamic preferences from their historical behaviors is challenging and crucial for recommendation systems. Previous methods employ sequential neural networks to encode users' +historical interactions from left to right into hidden representations +for making recommendations. Despite their effectiveness, we argue +that such left-to-right unidirectional models are sub-optimal due +to the limitations including: a) unidirectional architectures restrict +the power of hidden representation in users' behavior sequences; +b) they often assume a rigidly ordered sequence which is not always +practical. To address these limitations, we proposed a sequential recommendation model called BERT4Rec, which employs the deep +bidirectional self-attention to model user behavior sequences. 
To
+avoid the information leakage and efficiently train the bidirectional
+model, we adopt the Cloze objective to sequential recommendation,
+predicting the random masked items in the sequence by jointly
+conditioning on their left and right context. In this way, we learn
+a bidirectional representation model to make recommendations
+by allowing each item in user historical behaviors to fuse information from both left and right sides. Extensive experiments on
+four benchmark datasets show that our model outperforms various
+state-of-the-art sequential models consistently.
+
+.. image:: ../../../asset/bert4rec.png
+    :width: 600
+    :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``hidden_size (int)`` : The number of features in the hidden state. It is also the initial embedding size of items. Defaults to ``64``.
+- ``inner_size (int)`` : The inner hidden size in feed-forward layer. Defaults to ``256``.
+- ``n_layers (int)`` : The number of transformer layers in transformer encoder. Defaults to ``2``.
+- ``n_heads (int)`` : The number of attention heads for multi-head attention layer. Defaults to ``2``.
+- ``hidden_dropout_prob (float)`` : The probability of an element to be zeroed. Defaults to ``0.5``.
+- ``attn_dropout_prob (float)`` : The probability of an attention score to be zeroed. Defaults to ``0.5``.
+- ``hidden_act (str)`` : The activation function in feed-forward layer. Defaults to ``'gelu'``. Range in ``['gelu', 'relu', 'swish', 'tanh', 'sigmoid']``.
+- ``layer_norm_eps (float)`` : A value added to the denominator for numerical stability. Defaults to ``1e-12``.
+- ``initializer_range (float)`` : The standard deviation for normal initialization. Defaults to ``0.02``.
+- ``mask_ratio (float)`` : The probability of an item being replaced by the MASK token. Defaults to ``0.2``.
+- ``loss_type (str)`` : The type of loss function.
If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth; in this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task is optimized in a pair-wise way which maximizes the difference between the positive item and the negative item; in this way, negative sampling is necessary, such as setting ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+
+
+**A Running Example:**
+
+Write the following code to a python file, such as `run.py`
+
+.. code:: python
+
+    from recbole.quick_start import run_recbole
+
+    run_recbole(model='BERT4Rec', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+    python run.py
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``.
+
+.. code:: bash
+
+    learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+    attn_dropout_prob choice [0.2,0.5]
+    hidden_dropout_prob choice [0.2,0.5]
+    n_heads choice [1,2]
+    n_layers choice [1,2]
+
+Note that we just provide these hyper parameter ranges for reference only, and we cannot guarantee that they are the optimal range of this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+    python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
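The ``mask_ratio`` hyper-parameter above controls BERT4Rec's Cloze task: during training, a fraction of the positions in each sequence is replaced by a special MASK token, and the model must recover those items from both their left and right context. A toy illustration of the masking step (our own sketch with a hypothetical ``MASK`` id; RecBole's actual sampling differs in detail):

```python
import random

MASK = 0  # hypothetical id reserved for the MASK token

def cloze_mask(item_seq, mask_ratio=0.2, rng=None):
    """Replace roughly mask_ratio of the positions by MASK.

    Returns the masked sequence and the (position, original item) pairs
    that the model is trained to recover.
    """
    rng = rng or random.Random()
    masked, targets = list(item_seq), []
    for pos, item in enumerate(item_seq):
        if rng.random() < mask_ratio:
            masked[pos] = MASK
            targets.append((pos, item))
    return masked, targets

seq = [5, 12, 7, 9, 3, 8, 11, 4]
masked, targets = cloze_mask(seq, mask_ratio=0.25, rng=random.Random(42))
print(masked, targets)
```

Because prediction conditions on both sides of each masked position, this objective avoids the information leakage that a plain left-to-right objective would suffer from in a bidirectional encoder.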
+
+
+If you want to change parameters, dataset or evaluation settings, take a look at
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
\ No newline at end of file
diff --git a/docs/source/user_guide/model/sequential/caser.rst b/docs/source/user_guide/model/sequential/caser.rst
new file mode 100644
index 000000000..c502e0299
--- /dev/null
+++ b/docs/source/user_guide/model/sequential/caser.rst
@@ -0,0 +1,79 @@
+Caser
+===========
+
+Introduction
+---------------------
+
+`[paper] <https://dl.acm.org/doi/abs/10.1145/3159652.3159656>`_
+
+**Title:** Personalized Top-N Sequential Recommendation via Convolutional Sequence Embedding
+
+**Authors:** Jiaxi Tang, Ke Wang
+
+**Abstract:** Top-N sequential recommendation models each user as a sequence of items interacted in the past and aims to predict top-N ranked items that a user will likely interact in a “near future”. The order of interaction implies that sequential patterns play an important role where more recent items in a sequence have a larger impact on the next item. In this paper, we propose a Convolutional Sequence Embedding Recommendation Model (Caser) as a solution to address this requirement. The idea is to embed a sequence of recent items into an “image” in the time and latent spaces and learn sequential patterns as local features of the image using convolutional filters. This approach provides a unified and flexible network structure for capturing both general preferences and sequential patterns. The experiments on public data sets demonstrated that Caser consistently outperforms state-of-the-art sequential recommendation methods on a variety of common evaluation metrics.
+
+..
image:: ../../../asset/caser.png
+    :width: 600
+    :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``.
+- ``n_h (int)`` : The number of horizontal convolutional filters. Defaults to ``16``.
+- ``n_v (int)`` : The number of vertical convolutional filters. Defaults to ``8``.
+- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``1e-4``.
+- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.4``.
+- ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth; in this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task is optimized in a pair-wise way which maximizes the difference between the positive item and the negative item; in this way, negative sampling is necessary, such as setting ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+
+**A Running Example:**
+
+Write the following code to a python file, such as `run.py`
+
+.. code:: python
+
+    from recbole.quick_start import run_recbole
+
+    run_recbole(model='Caser', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+    python run.py
+
+**Notes:**
+
+- By setting ``reproducibility=False``, the training speed of Caser can be greatly accelerated.
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``.
+
+.. code:: bash
+
+    learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+    reg_weight choice [0,1e-4,1e-5]
+    dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5]
+
+Note that we just provide these hyper parameter ranges for reference only, and we cannot guarantee that they are the optimal range of this model.
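Caser stacks the embeddings of the last L items into an L x d "image": the ``n_v`` vertical filters span the whole time axis (one weight per row), while the ``n_h`` horizontal filters slide over windows of consecutive items. A toy plain-Python sketch of the two filter types (our illustration only; the real model learns the filters and adds max-pooling and fully-connected layers on top):

```python
# Toy sketch of Caser's two filter types over an L x d "image" of item
# embeddings (rows = items in time order, columns = latent dimensions).

def vertical_filter(seq_emb, weights):
    # One vertical filter: a weighted sum of the L rows -> a d-dim vector,
    # i.e. a learned aggregation over the time axis.
    d = len(seq_emb[0])
    return [sum(w * row[k] for w, row in zip(weights, seq_emb)) for k in range(d)]

def horizontal_filter(seq_emb, filt):
    # One horizontal filter of height h (an h x d kernel): slide over h
    # consecutive items, producing one value per window (before max-pooling).
    h = len(filt)
    out = []
    for start in range(len(seq_emb) - h + 1):
        window = seq_emb[start:start + h]
        out.append(sum(f * x for frow, row in zip(filt, window) for f, x in zip(frow, row)))
    return out

seq_emb = [[1, 0], [0, 1], [1, 1]]            # L = 3 items, d = 2 dimensions
print(vertical_filter(seq_emb, [1, 1, 1]))    # [2, 2]
print(horizontal_filter(seq_emb, [[1, 1], [1, 1]]))  # [2, 3]
```

Horizontal filters of different heights capture sequential patterns over different numbers of recent items, which is why Caser uses several of them (``n_h`` per height).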
+ +Then, with the source code of RecBole (you can download it from GitHub), you can run the ``run_hyper.py`` to tuning: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` + diff --git a/docs/source/user_guide/model/sequential/fdsa.rst b/docs/source/user_guide/model/sequential/fdsa.rst new file mode 100644 index 000000000..90ee4aec5 --- /dev/null +++ b/docs/source/user_guide/model/sequential/fdsa.rst @@ -0,0 +1,100 @@ +FDSA +=========== + +Introduction +--------------------- + +`[paper] <https://www.ijcai.org/Proceedings/2019/600>`_ + +**Title:** Feature-level Deeper Self-Attention Network for Sequential Recommendation + +**Authors:** Tingting Zhang, Pengpeng Zhao, Yanchi Liu, Victor S. Sheng, Jiajie Xu, Deqing Wang, Guanfeng Liu, Xiaofang Zhou + +**Abstract:** Sequential recommendation, which aims to recommend next item that the user will +likely interact in a near future, has become essential in various Internet applications. +Existing methods usually consider the transition patterns between items, but ignore the +transition patterns between features of items. We argue that only the item-level sequences +cannot reveal the full sequential patterns, while explicit and implicit feature-level +sequences can help extract the full sequential patterns. In this paper, we propose a novel +method named Feature-level Deeper Self-Attention Network (FDSA) for sequential recommendation. 
+Specifically, FDSA first integrates various heterogeneous features of items into feature
+sequences with different weights through a vanilla attention mechanism. After that, FDSA applies
+separated self-attention blocks on item-level sequences and feature-level sequences,
+respectively, to model item transition patterns and feature transition patterns.
+Then, we integrate the outputs of these two blocks into a fully-connected layer for next item recommendation.
+Finally, comprehensive experimental results demonstrate that considering the transition relationships between
+features can significantly improve the performance of sequential recommendation.
+
+.. image:: ../../../asset/fdsa.png
+    :width: 500
+    :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``hidden_size (int)`` : The number of features in the hidden state. It is also the initial embedding size of items. Defaults to ``64``.
+- ``inner_size (int)`` : The inner hidden size in feed-forward layer. Defaults to ``256``.
+- ``n_layers (int)`` : The number of transformer layers in transformer encoder. Defaults to ``2``.
+- ``n_heads (int)`` : The number of attention heads for multi-head attention layer. Defaults to ``2``.
+- ``hidden_dropout_prob (float)`` : The probability of an element to be zeroed. Defaults to ``0.5``.
+- ``attn_dropout_prob (float)`` : The probability of an attention score to be zeroed. Defaults to ``0.5``.
+- ``hidden_act (str)`` : The activation function in feed-forward layer. Defaults to ``'gelu'``. Range in ``['gelu', 'relu', 'swish', 'tanh', 'sigmoid']``.
+- ``layer_norm_eps (float)`` : A value added to the denominator for numerical stability. Defaults to ``1e-12``.
+- ``initializer_range (float)`` : The standard deviation for normal initialization. Defaults to ``0.02``.
+- ``selected_features (list)`` : The list of selected item features. Defaults to ``['class']`` for ml-100k dataset.
+- ``pooling_mode (str)``: The intra-feature pooling mode.
Defaults to ``'mean'``. Range in ``['max', 'mean', 'sum']``. +- ``loss_type (str)`` : The type of loss function. If it set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximize the difference between positive item and negative item. In this way, negative sampling is necessary, such as setting ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``. + + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='FDSA', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +**Notes:** + +- FDSA is a sequential model that integrates item context information. ``selected_features`` controls the used item context information. The used context information must be in the dataset and be loaded by data module in RecBole. It means the value in ``selected_features`` must appear in ``load_col``. + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + attn_dropout_prob choice [0.2, 0.5] + hidden_dropout_prob choice [0.2, 0.5] + n_heads choice [1, 2] + n_layers choice [1,2,3] + +Note that we just provide these hyper parameter ranges for reference only, and we can not guarantee that they are the optimal range of this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run the ``run_hyper.py`` to tuning: + +.. 
code:: bash
+
+    python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
+
+
+If you want to change parameters, dataset or evaluation settings, take a look at
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
\ No newline at end of file
diff --git a/docs/source/user_guide/model/sequential/fossil.rst b/docs/source/user_guide/model/sequential/fossil.rst
new file mode 100644
index 000000000..f05e5dd71
--- /dev/null
+++ b/docs/source/user_guide/model/sequential/fossil.rst
@@ -0,0 +1,101 @@
+FOSSIL
+===========
+
+Introduction
+---------------------
+
+`[paper] <https://ieeexplore.ieee.org/abstract/document/7837843/>`_
+
+**Title:** FOSSIL: Fusing Similarity Models with Markov Chains for Sparse Sequential Recommendation
+
+**Authors:** Ruining He, Julian McAuley
+
+**Abstract:** Predicting personalized sequential behavior is a
+key task for recommender systems. In order to predict user
+actions such as the next product to purchase, movie to watch,
+or place to visit, it is essential to take into account both
+long-term user preferences and sequential patterns (i.e., short-term
+dynamics). Matrix Factorization and Markov Chain methods
+have emerged as two separate but powerful paradigms for
+modeling the two respectively. Combining these ideas has led
+to unified methods that accommodate long- and short-term
+dynamics simultaneously by modeling pairwise user-item and
+item-item interactions.
+In spite of the success of such methods for tackling dense
+data, they are challenged by sparsity issues, which are prevalent
+in real-world datasets.
In recent years, similarity-based methods
+have been proposed for (sequentially-unaware) item recommendation with promising results on sparse datasets. In this
+paper, we propose to fuse such methods with Markov Chains to
+make personalized sequential recommendations. We evaluate
+our method, Fossil, on a variety of large, real-world datasets.
+We show quantitatively that Fossil outperforms alternative
+algorithms, especially on sparse datasets, and qualitatively
+that it captures personalized dynamics and is able to make
+meaningful recommendations.
+
+.. image:: ../../../asset/fossil.jpg
+    :width: 600
+    :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``.
+- ``order_len (int)`` : The number of the most recent items considered (the order of the Markov chain). Defaults to ``3``.
+- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``0.00``.
+- ``alpha (float)`` : The weighting parameter alpha used in calculating the similarity. Defaults to ``0.6``.
+- ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth; in this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task is optimized in a pair-wise way which maximizes the difference between the positive item and the negative item; in this way, negative sampling is necessary, such as setting ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+
+**A Running Example:**
+
+Write the following code to a python file, such as `run.py`
+
+.. code:: python
+
+    from recbole.quick_start import run_recbole
+
+    run_recbole(model='FOSSIL', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+    python run.py
+
+**Notes:**
+
+- By setting ``reproducibility=False``, the training speed of FOSSIL can be greatly accelerated.
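As the abstract describes, FOSSIL fuses a similarity model (long-term preferences aggregated over the whole history) with a high-order Markov chain over the last ``order_len`` items (short-term dynamics), with ``alpha`` damping the aggregated history. A heavily simplified plain-Python sketch of how the two parts combine into one score (our own simplification with hypothetical names; personalized bias and weighting terms are omitted, and this is not RecBole's implementation):

```python
def fossil_score(hist_embs, recent_embs, item_emb, etas, alpha=0.6):
    """Toy FOSSIL-style score: similarity part + high-order Markov part.

    hist_embs:   embeddings of all items the user has interacted with
    recent_embs: embeddings of the last `order_len` items
    etas:        per-position weights for the Markov part (len == order_len)
    alpha:       exponent damping the size of the aggregated history
    """
    d = len(item_emb)
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    # Long-term part: aggregate the history, normalized by |history| ** alpha.
    agg = [sum(e[k] for e in hist_embs) / (len(hist_embs) ** alpha) for k in range(d)]
    long_term = dot(agg, item_emb)
    # Short-term part: weighted sum over the last items (order-L Markov chain).
    short_term = sum(eta * dot(e, item_emb) for eta, e in zip(etas, recent_embs))
    return long_term + short_term
```

The damping exponent is what keeps the long-term part well-behaved for users with very long histories, which is why it matters on the sparse datasets the paper targets.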
+ +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.001] + embedding_size choice [64] + reg_weight choice [0,0.0001] + order_len choice [1,2,3,5] + alpha choice [0.2,0.5,0.6] + +Note that we just provide these hyper parameter ranges for reference only, and we can not guarantee that they are the optimal range of this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run the ``run_hyper.py`` to tuning: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` + diff --git a/docs/source/user_guide/model/sequential/fpmc.rst b/docs/source/user_guide/model/sequential/fpmc.rst new file mode 100644 index 000000000..6f759d8f5 --- /dev/null +++ b/docs/source/user_guide/model/sequential/fpmc.rst @@ -0,0 +1,93 @@ +FPMC +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/1772690.1772773>`_ + +**Title:** Factorizing personalized Markov chains for next-basket recommendation + +**Authors:** Steffen Rendle, Christoph Freudenthaler, Lars Schmidt-Thieme + +**Abstract:** Recommender systems are an important component of many +websites. Two of the most popular approaches are based on +matrix factorization (MF) and Markov chains (MC). MF +methods learn the general taste of a user by factorizing the +matrix over observed user-item preferences. 
On the other +hand, MC methods model sequential behavior by learning a +transition graph over items that is used to predict the next +action based on the recent actions of a user. In this paper, we +present a method bringing both approaches together. Our +method is based on personalized transition graphs over underlying Markov chains. That means for each user an own +transition matrix is learned – thus in total the method uses +a transition cube. As the observations for estimating the +transitions are usually very limited, our method factorizes +the transition cube with a pairwise interaction model which +is a special case of the Tucker Decomposition. We show +that our factorized personalized MC (FPMC) model subsumes both a common Markov chain and the normal matrix +factorization model. For learning the model parameters, we +introduce an adaption of the Bayesian Personalized Ranking +(BPR) framework for sequential basket data. Empirically, +we show that our FPMC model outperforms both the common matrix factorization and the unpersonalized MC model +both learned with and without factorization. + +.. image:: ../../../asset/fpmc.png + :width: 500 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``. + + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='FPMC', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +**Notes:** + +- Different from other sequential models, FPMC must be optimized in pair-wise way using negative sampling, so it needs ``training_neg_sample_num=1``. + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. 
code:: bash
+
+   learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+   python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about parameter tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
+
+
+If you want to change the parameters, dataset or evaluation settings, take a look at:
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
+
diff --git a/docs/source/user_guide/model/sequential/gcsan.rst b/docs/source/user_guide/model/sequential/gcsan.rst
new file mode 100644
index 000000000..3707767d4
--- /dev/null
+++ b/docs/source/user_guide/model/sequential/gcsan.rst
@@ -0,0 +1,101 @@
+GCSAN
+===========
+
+Introduction
+---------------------
+
+`[paper] <https://www.ijcai.org/Proceedings/2019/547>`_
+
+**Title:** Graph Contextualized Self-Attention Network for Session-based Recommendation
+
+**Authors:** Chengfeng Xu, Pengpeng Zhao, Yanchi Liu, Victor S. Sheng, Jiajie Xu, Fuzhen Zhuang, Junhua Fang, Xiaofang Zhou
+
+**Abstract:** Session-based recommendation, which aims to predict the user’s immediate next action based on
+anonymous sessions, is a key task in many online
+services (e.g., e-commerce, media streaming). Recently, Self-Attention Network (SAN) has achieved
+significant success in various sequence modeling
+tasks without using either recurrent or convolutional network. However, SAN lacks local dependencies that exist over adjacent items and limits its capacity for learning contextualized representations of items in sequences. 
In this paper, we propose a graph contextualized self-attention model
+(GC-SAN), which utilizes both graph neural network and self-attention mechanism, for session-based recommendation. In GC-SAN, we dynamically construct a graph structure for session sequences and capture rich local dependencies via graph neural network (GNN). Then each session learns long-range dependencies by applying
+the self-attention mechanism. Finally, each session
+is represented as a linear combination of the global
+preference and the current interest of that session.
+Extensive experiments on two real-world datasets show that GC-SAN outperforms state-of-the-art
+methods consistently.
+
+.. image:: ../../../asset/gcsan.png
+   :width: 600
+   :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``hidden_size (int)`` : The number of features in the hidden state. It is also the initial embedding size of items. Defaults to ``64``.
+- ``inner_size (int)`` : The inner hidden size of the feed-forward layer. Defaults to ``256``.
+- ``n_layers (int)`` : The number of transformer layers in the transformer encoder. Defaults to ``1``.
+- ``n_heads (int)`` : The number of attention heads in the multi-head attention layer. Defaults to ``1``.
+- ``hidden_dropout_prob (float)`` : The probability of an element being zeroed. Defaults to ``0.2``.
+- ``attn_dropout_prob (float)`` : The probability of an attention score being zeroed. Defaults to ``0.2``.
+- ``hidden_act (str)`` : The activation function of the feed-forward layer. Defaults to ``'gelu'``. Range in ``['gelu', 'relu', 'swish', 'tanh', 'sigmoid']``.
+- ``layer_norm_eps (float)`` : A value added to the denominator for numerical stability. Defaults to ``1e-12``.
+- ``initializer_range (float)`` : The standard deviation for normal initialization. Defaults to ``0.02``.
+- ``step (int)`` : The number of layers in the GNN. Defaults to ``1``. 
+- ``weight (float)`` : Controls the contribution of the self-attention representation versus that of the last-clicked action; the original paper suggests that a value of 0.4 to 0.8 works best. Defaults to ``0.6``.
+- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``[5e-5]``.
+- ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is treated as a multi-classification task with the target item as the ground truth, so negative sampling is not needed. If it is set to ``'BPR'``, the training task is optimized in a pair-wise way that maximizes the score difference between the positive item and a sampled negative item, so negative sampling is necessary, e.g., ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+
+**A Running Example:**
+
+Write the following code to a Python file, such as `run.py`:
+
+.. code:: python
+
+   from recbole.quick_start import run_recbole
+
+   run_recbole(model='GCSAN', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+   python run.py
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and name the file ``hyper.test``.
+
+.. code:: bash
+
+   learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+   step choice [1]
+   n_layers choice [1]
+   n_heads choice [1]
+   hidden_size choice [64]
+   inner_size choice [256]
+   hidden_dropout_prob choice [0.2]
+   attn_dropout_prob choice [0.2]
+   hidden_act choice ['gelu']
+   layer_norm_eps choice [1e-12]
+   initializer_range choice [0.02]
+   weight choice [0.5,0.6]
+   reg_weight choice [5e-5]
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. 
code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` diff --git a/docs/source/user_guide/model/sequential/gru4rec.rst b/docs/source/user_guide/model/sequential/gru4rec.rst new file mode 100644 index 000000000..8139db4e6 --- /dev/null +++ b/docs/source/user_guide/model/sequential/gru4rec.rst @@ -0,0 +1,83 @@ +GRU4Rec +================= + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/2988450.2988452>`_ + +**Title:** Improved Recurrent Neural Networks for Session-based Recommendations + +**Authors:** Yong Kiam Tan, Xinxing Xu, Yong Liu + +**Abstract:** Recurrent neural networks (RNNs) were recently proposed +for the session-based recommendation task. The models +showed promising improvements over traditional recommendation approaches. In this work, we further study RNNbased models for session-based recommendations. We propose the application of two techniques to improve model +performance, namely, data augmentation, and a method to +account for shifts in the input data distribution. We also +empirically study the use of generalised distillation, and a +novel alternative model that directly predicts item embeddings. Experiments on the RecSys Challenge 2015 dataset +demonstrate relative improvements of 12.8% and 14.8% over +previously reported results on the Recall\@20 and Mean Reciprocal Rank\@20 metrics respectively. + +.. 
image:: ../../../asset/gru4rec.png + :width: 500 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of items. Defaults to ``64``. +- ``hidden_size (int)`` : The number of features in the hidden state. Defaults to ``128``. +- ``num_layers (int)`` : The number of layers in GRU. Defaults to ``1``. +- ``dropout_prob (float)``: The dropout rate. Defaults to ``0.3``. +- ``loss_type (str)`` : The type of loss function. If it set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximize the difference between positive item and negative item. In this way, negative sampling is necessary, such as setting ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``. + + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='GRU4Rec', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5] + num_layers choice [1,2,3] + hidden_size choice [128] + +Note that we just provide these hyper parameter ranges for reference only, and we can not guarantee that they are the optimal range of this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run the ``run_hyper.py`` to tuning: + +.. 
code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` diff --git a/docs/source/user_guide/model/sequential/gru4recf.rst b/docs/source/user_guide/model/sequential/gru4recf.rst new file mode 100644 index 000000000..488546ba8 --- /dev/null +++ b/docs/source/user_guide/model/sequential/gru4recf.rst @@ -0,0 +1,95 @@ +GRU4RecF +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/2959100.2959167>`_ + +**Title:** Parallel Recurrent Neural Network Architectures for +Feature-rich Session-based Recommendations + +**Authors:** Balázs Hidasi, Massimo Quadrana, Alexandros Karatzoglou, Domonkos Tikk + +**Abstract:** Real-life recommender systems often face the daunting task +of providing recommendations based only on the clicks of +a user session. Methods that rely on user profiles – such +as matrix factorization – perform very poorly in this setting, thus item-to-item recommendations are used most of +the time. However the items typically have rich feature representations such as pictures and text descriptions that can +be used to model the sessions. Here we investigate how these +features can be exploited in Recurrent Neural Network based +session models using deep learning. We show that obvious +approaches do not leverage these data sources. We thus introduce a number of parallel RNN (p-RNN) architectures to +model sessions based on the clicks and the features (images +and text) of the clicked items. 
We also propose alternative +training strategies for p-RNNs that suit them better than +standard training. We show that p-RNN architectures with +proper training have significant performance improvements +over feature-less session models while all session-based models outperform the item-to-item type baseline. + +.. image:: ../../../asset/gru4recf.png + :width: 500 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of items. Defaults to ``64``. +- ``hidden_size (int)`` : The number of features in the hidden state. Defaults to ``128``. +- ``num_layers (int)`` : The number of layers in GRU. Defaults to ``1``. +- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.3``. +- ``selected_features (list)`` : The list of selected item features. Defaults to ``['class']`` for ml-100k dataset. +- ``pooling_mode (str)`` : The intra-feature pooling mode. Defaults to ``'sum'``. Range in ``['max', 'mean', 'sum']``. +- ``loss_type (str)`` : The type of loss function. If it set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximize the difference between positive item and negative item. In this way, negative sampling is necessary, such as setting ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``. + + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='GRU4RecF', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + + +**Notes:** + +- GRU4RecF is a sequential model that integrates item context information. ``selected_features`` controls the used item context information. 
These features must exist in the dataset and be loaded by RecBole's data module, i.e., every value in ``selected_features`` must also appear in ``load_col``.
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and name the file ``hyper.test``.
+
+.. code:: bash
+
+   learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+   num_layers choice [1, 2]
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+   python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about parameter tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
+
+
+If you want to change the parameters, dataset or evaluation settings, take a look at:
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
+
diff --git a/docs/source/user_guide/model/sequential/gru4reckg.rst b/docs/source/user_guide/model/sequential/gru4reckg.rst
new file mode 100644
index 000000000..f1653cabf
--- /dev/null
+++ b/docs/source/user_guide/model/sequential/gru4reckg.rst
@@ -0,0 +1,88 @@
+GRU4RecKG
+===========
+
+Introduction
+---------------------
+
+GRU4RecKG is an extension of GRU4Rec that concatenates each item embedding with the item's corresponding knowledge graph embedding as the input.
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``embedding_size (int)`` : The embedding size of items and the KG feature. Defaults to ``64``.
+- ``hidden_size (int)`` : The number of features in the hidden state. Defaults to ``128``. 
+- ``num_layers (int)`` : The number of layers in GRU. Defaults to ``1``. +- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.1``. +- ``freeze_kg (bool)`` : Whether to freeze the pre-trained knowledge embedding feature. Defaults to ``True``. +- ``loss_type (str)`` : The type of loss function. If it set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximize the difference between positive item and negative item. In this way, negative sampling is necessary, such as setting ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``. + + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='GRU4RecKG', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +**Notes:** + +- If you want to run GRU4RecKG, please prepare pretrained knowledge graph embedding and add the following settings to config files: + + .. code:: yaml + + load_col: + inter: [user_id, item_id] + kg: [head_id, relation_id, tail_id] + link: [item_id, entity_id] + ent_feature: [ent_id, ent_vec] + fields_in_same_space: [ + [ent_id, entity_id] + ] + preload_weight: + ent_id: ent_vec + additional_feat_suffix: [ent_feature] + + where the pretrained knowledge graph embedding should be stored in file named [dataset_name].ent_feature. If you want to + add additional feature embedding, please refer to this example. + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. 
code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5] + num_layers choice [1,2,3] + hidden_size choice [128] + freeze_kg choice [True, False] + +Note that we just provide these hyper parameter ranges for reference only, and we can not guarantee that they are the optimal range of this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run the ``run_hyper.py`` to tuning: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/sequential/hgn.rst b/docs/source/user_guide/model/sequential/hgn.rst new file mode 100644 index 000000000..630fcb445 --- /dev/null +++ b/docs/source/user_guide/model/sequential/hgn.rst @@ -0,0 +1,96 @@ +HGN +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/abs/10.1145/3292500.3330984>`_ + +**Title:** HGN: Hierarchical Gating Networks for Sequential Recommendation. + +**Authors:** Chen Ma + +**Abstract:** The chronological order of user-item interactions is a key feature +in many recommender systems, where the items that users will +interact may largely depend on those items that users just accessed +recently. 
However, with the tremendous increase of users and items,
+sequential recommender systems still face several challenging problems: (1) the hardness of modeling the long-term user interests from
+sparse implicit feedback; (2) the difficulty of capturing the short-term user interests given several items the user just accessed. To
+cope with these challenges, we propose a hierarchical gating network (HGN), integrated with the Bayesian Personalized Ranking
+(BPR) to capture both the long-term and short-term user interests.
+Our HGN consists of a feature gating module, an instance gating
+module, and an item-item product module. In particular, our feature
+gating and instance gating modules select what item features can
+be passed to the downstream layers from the feature and instance
+levels, respectively. Our item-item product module explicitly captures the item relations between the items that users accessed in
+the past and those items users will access in the future. We extensively evaluate our model with several state-of-the-art methods
+and different validation metrics on five real-world datasets. The
+experimental results demonstrate the effectiveness of our model on
+Top-N sequential recommendation.
+
+.. image:: ../../../asset/hgn.jpg
+   :width: 600
+   :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``.
+- ``pooling_type (str)`` : The type of pooling, either average pooling or max pooling. Defaults to ``average``.
+- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``[0.00,0.00]``.
+- ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is treated as a multi-classification task with the target item as the ground truth, so negative sampling is not needed. 
If it is set to ``'BPR'``, the training task is optimized in a pair-wise way that maximizes the score difference between the positive item and a sampled negative item, so negative sampling is necessary, e.g., ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+
+**A Running Example:**
+
+Write the following code to a Python file, such as `run.py`:
+
+.. code:: python
+
+   from recbole.quick_start import run_recbole
+
+   run_recbole(model='HGN', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+   python run.py
+
+**Notes:**
+
+- By setting ``reproducibility=False``, the training of HGN can be greatly accelerated.
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and name the file ``hyper.test``.
+
+.. code:: bash
+
+   learning_rate choice [0.01,0.001]
+   embedding_size choice [64]
+   pooling_type choice ["average","max"]
+   reg_weight choice ['[0.00,0.00]','[0.001,0.00001]']
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+   python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about parameter tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. 
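
The pair-wise ``'BPR'`` objective mentioned in the ``loss_type`` description can be sketched in a few lines of plain Python. This is an illustrative toy example only; the ``bpr_loss`` helper and the scores below are ours for demonstration, not RecBole's implementation:

```python
import math

def bpr_loss(pos_scores, neg_scores):
    """Pair-wise BPR loss: -log(sigmoid(pos - neg)), averaged over pairs."""
    assert len(pos_scores) == len(neg_scores)
    total = 0.0
    for p, n in zip(pos_scores, neg_scores):
        total += -math.log(1.0 / (1.0 + math.exp(-(p - n))))
    return total / len(pos_scores)

# Toy scores: each positive item is paired with one sampled negative item,
# mirroring training_neg_sample_num = 1.
print(bpr_loss([2.0, 1.5], [0.5, 1.0]))
print(bpr_loss([0.0], [0.0]))  # equal scores -> log(2) ≈ 0.6931
```

Training with this objective pushes the score of each observed (user, item) pair above the score of its sampled negative, which is why pair-wise models need negative sampling at all.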
+
+
+If you want to change parameters, dataset or evaluation settings, take a look at:
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
+
diff --git a/docs/source/user_guide/model/sequential/hrm.rst b/docs/source/user_guide/model/sequential/hrm.rst
new file mode 100644
index 000000000..766d9b21a
--- /dev/null
+++ b/docs/source/user_guide/model/sequential/hrm.rst
@@ -0,0 +1,102 @@
+HRM
+===========
+
+Introduction
+---------------------
+
+`[paper] <https://dl.acm.org/doi/abs/10.1145/2766462.2767694>`_
+
+**Title:** HRM: Learning Hierarchical Representation Model for Next Basket Recommendation.
+
+**Authors:** Pengfei Wang
+
+**Abstract:** Next basket recommendation is a crucial task in market basket analysis. Given a user’s purchase history, usually a sequence of
+transaction data, one attempts to build a recommender that can predict the next few items that the user most probably would like.
+Ideally, a good recommender should be able to explore the sequential behavior (i.e., buying one item leads to buying another next),
+as well as account for users’ general taste (i.e., what items a user is typically interested in) for recommendation. Moreover, these
+two factors may interact with each other to influence users’ next purchase. To tackle the above problems, in this paper, we introduce
+a novel recommendation approach, namely hierarchical representation model (HRM). HRM can well capture both sequential behavior and
+users’ general taste by involving transaction and user representations in prediction. Meanwhile, the flexibility of applying
+different aggregation operations, especially nonlinear operations, on representations allows us to model complicated interactions
+among different factors. Theoretically, we show that our model subsumes several existing methods when choosing proper aggregation operations. 
Empirically, we demonstrate that +our model can consistently outperform the state-of-the-art +baselines under different evaluation metrics on real-world +transaction data. + +.. image:: ../../../asset/hrm.jpg + :width: 600 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``. +- ``high_order (int)`` : The last N items . Defaults to ``2``. +- ``pooling_type_layer_1 (str)`` : The type of pooling in the first floor include average pooling and max pooling . Defaults to ``max``. +- ``pooling_type_layer_2 (str)`` : The type of pooling in the second floor include average pooling and max pooling . Defaults to ``max``. +- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.2``. +- ``loss_type (str)`` : The type of loss function. If it set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it set to ``'BPR'``, the training task will be optimized in the pair-wise way, which maximize the difference between positive item and negative item. In this way, negative sampling is necessary, such as setting ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``. + +**A Running Example:** + +Write the following code to a python file, such as `run.py` + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='HRM', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +**Notes:** + +- By setting ``reproducibility=False``, the training speed of HRM can be greatly accelerated. + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune hyper parameters of this model, you can copy the following settings and name it as ``hyper.test``. + +.. 
code:: bash
+
+   learning_rate choice [0.001]
+   embedding_size choice [64]
+   high_order choice [1,2,4]
+   dropout_prob choice [0.2]
+   pooling_type_layer_1 choice ["max","average"]
+   pooling_type_layer_2 choice ["max","average"]
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to start tuning:
+
+.. code:: bash
+
+   python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about parameter tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
+
+
+If you want to change the parameters, dataset or evaluation settings, take a look at:
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
+
diff --git a/docs/source/user_guide/model/sequential/ksr.rst b/docs/source/user_guide/model/sequential/ksr.rst
new file mode 100644
index 000000000..8085e71b4
--- /dev/null
+++ b/docs/source/user_guide/model/sequential/ksr.rst
@@ -0,0 +1,102 @@
+KSR
+===========
+
+Introduction
+---------------------
+
+`[paper] <https://dl.acm.org/doi/10.1145/3209978.3210017>`_
+
+**Title:** Improving Sequential Recommendation with Knowledge-Enhanced Memory Networks
+
+**Authors:** Jin Huang, Wayne Xin Zhao, Hongjian Dou, Ji-Rong Wen, Edward Y. Chang
+
+**Abstract:** With the revival of neural networks, many studies try to adapt powerful sequential neural models, i.e., Recurrent Neural Networks (RNN), to sequential recommendation. RNN-based networks encode historical interaction records into a hidden state vector. Although the state vector is able to encode sequential dependency, it still has limited representation power in capturing complicated user preference. 
It is difficult to capture fine-grained user preference from the interaction sequence. Furthermore, the latent vector representation is usually hard to understand and explain. To address these issues, in this paper, we propose a novel knowledge enhanced sequential recommender. Our model integrates the RNN-based networks with Key-Value Memory Network (KV-MN). We further incorporate knowledge base (KB) information to enhance the semantic representation of KV-MN. RNN-based models are good at capturing sequential user preference, while knowledge-enhanced KV-MNs are good at capturing attribute-level user preference. By using a hybrid of RNNs and KV-MNs, it is expected to be endowed with both benefits from these two components. The sequential preference representation together with the attribute-level preference representation are combined as the final representation of user preference. With the incorporation of KB information, our model is also highly interpretable. To our knowledge, it is the first time that sequential recommender is integrated with external memories by leveraging large-scale KB information. + +.. image:: ../../../asset/ksr.jpg + :width: 500 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of items and the KG feature. Defaults to ``64``. +- ``hidden_size (int)`` : The number of features in the hidden state. Defaults to ``128``. +- ``num_layers (int)`` : The number of layers in GRU. Defaults to ``1``. +- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.1``. +- ``freeze_kg (bool)`` : Whether to freeze the pre-trained knowledge embedding feature. Defaults to ``True``. +- ``gamma (float)`` : The scaling factor used in read operation when calculating the attention weights of user preference on attributes. Defaults to ``10``. +- ``loss_type (str)`` : The type of loss function. 
If it is set to ``'CE'``, the training task is treated as a multi-classification task with the target item as the ground truth, so negative sampling is not needed. If it is set to ``'BPR'``, the training task is optimized in a pair-wise way that maximizes the score difference between the positive item and a sampled negative item, so negative sampling is necessary, e.g., ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+
+
+**A Running Example:**
+
+Write the following code to a Python file, such as `run.py`:
+
+.. code:: python
+
+   from recbole.quick_start import run_recbole
+
+   run_recbole(model='KSR', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+   python run.py
+
+**Notes:**
+
+- If you want to run KSR, please prepare pretrained knowledge graph embeddings and add the following settings to your config files:
+
+  .. code:: yaml
+
+     load_col:
+        inter: [user_id, item_id]
+        kg: [head_id, relation_id, tail_id]
+        link: [item_id, entity_id]
+        ent_feature: [ent_id, ent_vec]
+        rel_feature: [rel_id, rel_vec]
+     fields_in_same_space: [
+        [ent_id, entity_id],
+        [rel_id, relation_id]
+     ]
+     preload_weight:
+        ent_id: ent_vec
+        rel_id: rel_vec
+     additional_feat_suffix: [ent_feature, rel_feature]
+
+  where the pretrained knowledge graph embeddings should be stored in a file named ``[dataset_name].ent_feature``. If you want to
+  add additional feature embeddings, please refer to this example.
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and name the file ``hyper.test``.
+
+.. code:: bash
+
+   learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+   dropout_prob choice [0.0,0.1,0.2,0.3,0.4,0.5]
+   num_layers choice [1,2,3]
+   hidden_size choice [128]
+   freeze_kg choice [True, False]
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model. 
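
Each line of ``hyper.test`` pairs a parameter name with a ``choice`` list of candidate values. As a rough sketch of how such a file can be read into a search-space dict (the ``parse_hyper_file`` helper is hypothetical and for illustration only; it is not RecBole's own parser):

```python
import ast

def parse_hyper_file(lines):
    """Parse lines of the form '<param> choice <python-list>' into a dict."""
    space = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        name, strategy, values = line.split(None, 2)
        if strategy != "choice":
            raise ValueError("only the 'choice' strategy is shown in this sketch")
        space[name] = ast.literal_eval(values)
    return space

space = parse_hyper_file([
    "learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]",
    "freeze_kg choice [True, False]",
])
print(space["freeze_kg"])  # [True, False]
```

A grid search over such a space evaluates every combination of candidates, so keeping the lists short (as in the reference ranges above) keeps the number of runs manageable.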
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the model:
+
+.. code:: bash
+
+    python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
+
+
+If you want to change parameters, dataset or evaluation settings, take a look at
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
\ No newline at end of file
diff --git a/docs/source/user_guide/model/sequential/narm.rst b/docs/source/user_guide/model/sequential/narm.rst
new file mode 100644
index 000000000..d9bd60460
--- /dev/null
+++ b/docs/source/user_guide/model/sequential/narm.rst
@@ -0,0 +1,92 @@
+NARM
+===========
+
+Introduction
+---------------------
+
+`[paper] <https://dl.acm.org/doi/10.1145/3132847.3132926>`_
+
+**Title:** Neural Attentive Session-based Recommendation
+
+**Authors:** Jing Li, Pengjie Ren, Zhumin Chen, Zhaochun Ren, Tao Lian, Jun Ma
+
+**Abstract:** Given e-commerce scenarios that user profiles are invisible, session-based recommendation is proposed to generate recommendation
+results from short sessions. Previous work only considers the
+user’s sequential behavior in the current session, whereas the
+user’s main purpose in the current session is not emphasized. In
+this paper, we propose a novel neural networks framework, i.e.,
+Neural Attentive Recommendation Machine (NARM), to tackle
+this problem. Specifically, we explore a hybrid encoder with an
+attention mechanism to model the user’s sequential behavior and
+capture the user’s main purpose in the current session, which
+are combined as a unified session representation later.
We then
+compute the recommendation scores for each candidate item with
+a bi-linear matching scheme based on this unified session representation. We train NARM by jointly learning the item and session
+representations as well as their matchings. We carried out extensive experiments on two benchmark datasets. Our experimental
+results show that NARM outperforms state-of-the-art baselines on
+both datasets. Furthermore, we also find that NARM achieves a
+significant improvement on long sessions, which demonstrates its
+advantages in modeling the user’s sequential behavior and main
+purpose simultaneously.
+
+.. image:: ../../../asset/narm.png
+   :width: 600
+   :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``embedding_size (int)`` : The embedding size of items. Defaults to ``64``.
+- ``hidden_size (int)`` : The number of features in the hidden state. Defaults to ``128``.
+- ``n_layers (int)`` : The number of layers in GRU. Defaults to ``1``.
+- ``dropout_probs (list of float)`` : The dropout rates; there are two values:
+  the former is for the embedding layer, and the latter is for the concatenation of the vector obtained by the local encoder and the vector obtained by the global encoder. Defaults to ``[0.25,0.5]``.
+- ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task is optimized in a pair-wise way, which maximizes the difference between the positive item and the negative item. In this way, negative sampling is necessary, e.g. setting ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+
+**A Running Example:**
+
+Write the following code to a python file, such as ``run.py``
+
+.. 
code:: python
+
+    from recbole.quick_start import run_recbole
+
+    run_recbole(model='NARM', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+    python run.py
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and save them as ``hyper.test``.
+
+.. code:: bash
+
+    learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+    hidden_size choice [128]
+    n_layers choice [1,2]
+    dropout_probs choice ['[0.25,0.5]','[0.2,0.2]','[0.1,0.2]']
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the model:
+
+.. code:: bash
+
+    python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
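As a reference for the ``loss_type`` description above, switching NARM to pair-wise training also requires enabling negative sampling. A minimal yaml sketch to merge into your config file (key names as documented on this page):

```yaml
# Pair-wise (BPR) training needs negative samples
loss_type: 'BPR'
training_neg_sample_num: 1
```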
+
+
+If you want to change parameters, dataset or evaluation settings, take a look at
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
\ No newline at end of file
diff --git a/docs/source/user_guide/model/sequential/nextitnet.rst b/docs/source/user_guide/model/sequential/nextitnet.rst
new file mode 100644
index 000000000..2dcd97da5
--- /dev/null
+++ b/docs/source/user_guide/model/sequential/nextitnet.rst
@@ -0,0 +1,79 @@
+NextItNet
+===========
+
+Introduction
+---------------------
+
+`[paper] <https://dl.acm.org/doi/abs/10.1145/3289600.3290975>`_
+
+**Title:** A Simple Convolutional Generative Network for Next Item Recommendation
+
+**Authors:** Fajie Yuan, Alexandros Karatzoglou, Ioannis Arapakis, Joemon M Jose, Xiangnan He
+
+**Abstract:** Convolutional Neural Networks (CNNs) have been recently introduced in the domain of session-based next item recommendation. An ordered collection of past items the user has interacted with in a session (or sequence) are embedded into a 2-dimensional latent matrix, and treated as an image. The convolution and pooling operations are then applied to the mapped item embeddings. In this paper, we first examine the typical session-based CNN recommender and show that both the generative model and network architecture are suboptimal when modeling long-range dependencies in the item sequence. To address the issues, we introduce a simple, but very effective generative model that is capable of learning high-level representation from both short- and long-range item dependencies. The network architecture of the proposed model is formed of a stack of holed convolutional layers, which can efficiently increase the receptive fields without relying on the pooling operation.
Another contribution is the effective use of residual block structure in recommender systems, which can ease the optimization for much deeper networks. The proposed generative model attains state-of-the-art accuracy with less training time in the next item recommendation task. It accordingly can be used as a powerful recommendation baseline to beat in future, especially when there are long sequences of user feedback.
+
+.. image:: ../../../asset/nextitnet.png
+   :width: 600
+   :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``.
+- ``kernel_size (int)`` : The width of the convolutional filter. Defaults to ``3``.
+- ``block_num (int)`` : The number of residual blocks. Defaults to ``5``.
+- ``dilations (list)`` : Controls the spacing between the kernel points. Defaults to ``[1,4]``.
+- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``1e-5``.
+- ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task is optimized in a pair-wise way, which maximizes the difference between the positive item and the negative item. In this way, negative sampling is necessary, e.g. setting ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+
+**A Running Example:**
+
+Write the following code to a python file, such as ``run.py``
+
+.. code:: python
+
+    from recbole.quick_start import run_recbole
+
+    run_recbole(model='NextItNet', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+    python run.py
+
+**Notes:**
+
+- By setting ``reproducibility=False``, the training speed of NextItNet can be greatly accelerated.
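A minimal config sketch for the note above, as a yaml fragment to merge into your config file (the key name is as stated in the note):

```yaml
# Trade strict run-to-run reproducibility for faster training
reproducibility: False
```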
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and save them as ``hyper.test``.
+
+.. code:: bash
+
+    learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+    reg_weight choice [0,1e-5,1e-4]
+    block_num choice [2,3,4,5]
+    dilations choice ['[1, 2]','[1, 4]']
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the model:
+
+.. code:: bash
+
+    python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
+
+
+If you want to change parameters, dataset or evaluation settings, take a look at
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
diff --git a/docs/source/user_guide/model/sequential/npe.rst b/docs/source/user_guide/model/sequential/npe.rst
new file mode 100644
index 000000000..0cf9ca655
--- /dev/null
+++ b/docs/source/user_guide/model/sequential/npe.rst
@@ -0,0 +1,89 @@
+NPE
+===========
+
+Introduction
+---------------------
+
+`[paper] <https://arxiv.org/abs/1805.06563>`_
+
+**Title:** NPE: Neural Personalized Embedding for Collaborative Filtering.
+
+**Authors:** Ying, H
+
+**Abstract:** Matrix factorization is one of the most efficient approaches in recommender systems. However, such
+algorithms, which rely on the interactions between
+users and items, perform poorly for “cold-users”
+(users with little history of such interactions) and
+at capturing the relationships between closely related items.
To address these problems, we propose
+a neural personalized embedding (NPE) model,
+which improves the recommendation performance
+for cold-users and can learn effective representations of items. It models a user’s click on an item
+in two terms: the personal preference of the user
+for the item, and the relationships between this
+item and other items clicked by the user. We show
+that NPE outperforms competing methods for
+top-N recommendations, especially for cold-user recommendations. We also performed a qualitative analysis that shows the effectiveness
+of the representations learned by the model.
+
+.. image:: ../../../asset/npe.jpg
+   :width: 600
+   :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``.
+- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.3``.
+- ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task is optimized in a pair-wise way, which maximizes the difference between the positive item and the negative item. In this way, negative sampling is necessary, e.g. setting ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+
+**A Running Example:**
+
+Write the following code to a python file, such as ``run.py``
+
+.. code:: python
+
+    from recbole.quick_start import run_recbole
+
+    run_recbole(model='NPE', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+    python run.py
+
+**Notes:**
+
+- By setting ``reproducibility=False``, the training speed of NPE can be greatly accelerated.
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and save them as ``hyper.test``.
+
+.. code:: bash
+
+    learning_rate choice [0.001]
+    embedding_size choice [64]
+    dropout_prob choice [0.2,0.3,0.5]
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the model:
+
+.. code:: bash
+
+    python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
+
+
+If you want to change parameters, dataset or evaluation settings, take a look at
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
+
diff --git a/docs/source/user_guide/model/sequential/repeatnet.rst b/docs/source/user_guide/model/sequential/repeatnet.rst
new file mode 100644
index 000000000..47dcc5325
--- /dev/null
+++ b/docs/source/user_guide/model/sequential/repeatnet.rst
@@ -0,0 +1,94 @@
+RepeatNet
+===========
+
+Introduction
+---------------------
+
+`[paper] <https://ojs.aaai.org//index.php/AAAI/article/view/4408>`_
+
+**Title:** RepeatNet: A Repeat Aware Neural Recommendation Machine for Session-based Recommendation.
+
+**Authors:** Pengjie Ren, Zhumin Chen, Jing Li, Zhaochun Ren, Jun Ma, Maarten de Rijke
+
+**Abstract:** Recurrent neural networks for session-based recommendation have attracted a lot of attention recently because of
+their promising performance.
Repeat consumption is a common phenomenon in many recommendation scenarios (e.g., e-commerce, music, and TV program recommendations),
+where the same item is re-consumed repeatedly over time.
+However, no previous studies have emphasized repeat consumption with neural networks. An effective neural approach
+is needed to decide when to perform repeat recommendation. In this paper, we incorporate a repeat-explore mechanism into neural networks and propose a new model, called
+RepeatNet, with an encoder-decoder structure. RepeatNet integrates a regular neural recommendation approach in the
+decoder with a new repeat recommendation mechanism that can
+choose items from a user’s history and recommend them at
+the right time. We report on extensive experiments on three
+benchmark datasets. RepeatNet outperforms state-of-the-art
+baselines on all three datasets in terms of MRR and Recall.
+Furthermore, as the dataset size and the repeat ratio increase,
+the improvements of RepeatNet over the baselines also
+increase, which demonstrates its advantage in handling repeat
+recommendation scenarios.
+
+.. image:: ../../../asset/repeatnet.jpg
+   :width: 600
+   :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``.
+- ``hidden_size (int)`` : The number of features in the hidden state. Defaults to ``64``.
+- ``joint_train (bool)`` : Whether the training loss should also include the repeat_explore_loss. Defaults to ``False``.
+- ``dropout_prob (float)`` : The dropout rate. Defaults to ``0.5``.
+- ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed.
If it is set to ``'BPR'``, the training task is optimized in a pair-wise way, which maximizes the difference between the positive item and the negative item. In this way, negative sampling is necessary, e.g. setting ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+
+**A Running Example:**
+
+Write the following code to a python file, such as ``run.py``
+
+.. code:: python
+
+    from recbole.quick_start import run_recbole
+
+    run_recbole(model='RepeatNet', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+    python run.py
+
+**Notes:**
+
+- By setting ``reproducibility=False``, the training speed of RepeatNet can be greatly accelerated.
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and save them as ``hyper.test``.
+
+.. code:: bash
+
+    learning_rate choice [0.001]
+    embedding_size choice [64]
+    joint_train choice [False,True]
+    dropout_prob choice [0.5]
+    train_batch_size choice [2048]
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the model:
+
+.. code:: bash
+
+    python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
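For reference, the joint objective described by ``joint_train`` above can be switched on with a minimal yaml fragment merged into the model config (key name as documented on this page):

```yaml
# Add repeat_explore_loss to the training objective
joint_train: True
```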
+ + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` + diff --git a/docs/source/user_guide/model/sequential/s3rec.rst b/docs/source/user_guide/model/sequential/s3rec.rst new file mode 100644 index 000000000..ee24b0f43 --- /dev/null +++ b/docs/source/user_guide/model/sequential/s3rec.rst @@ -0,0 +1,138 @@ +S3Rec +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/abs/10.1145/3340531.3411954>`_ + +**Title:** S^3-Rec: Self-Supervised Learning for Sequential +Recommendation with Mutual Information Maximization + +**Authors:** Kun Zhou, Hui Wang, Wayne Xin Zhao, Yutao Zhu, Sirui Wang, Fuzheng Zhang, Zhongyuan Wang, Ji-Rong Wen + +**Abstract:** Recently, significant progress has been made in sequential recommendation with deep learning. Existing neural sequential recommendation models usually rely on the item prediction loss to learn +model parameters or data representations. However, the model +trained with this loss is prone to suffer from data sparsity problem. +Since it overemphasizes the final performance, the association or +fusion between context data and sequence data has not been well +captured and utilized for sequential recommendation. +To tackle this problem, we propose the model S3-Rec, which +stands for Self-Supervised learning for Sequential Recommendation, +based on the self-attentive neural architecture. The main idea of +our approach is to utilize the intrinsic data correlation to derive +self-supervision signals and enhance the data representations via +pre-training methods for improving sequential recommendation. 
+
+For our task, we devise four auxiliary self-supervised objectives
+to learn the correlations among attribute, item, subsequence, and
+sequence by utilizing the mutual information maximization (MIM)
+principle. MIM provides a unified way to characterize the correlation between different types of data, which is particularly suitable
+in our scenario. Extensive experiments conducted on six real-world
+datasets demonstrate the superiority of our proposed method over
+existing state-of-the-art methods, especially when only limited
+training data is available. Besides, we extend our self-supervised
+learning method to other recommendation models, which also improve their performance.
+
+.. image:: ../../../asset/s3rec.png
+   :width: 600
+   :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``hidden_size (int)`` : The number of features in the hidden state. It is also the initial embedding size of items. Defaults to ``64``.
+- ``inner_size (int)`` : The inner hidden size in the feed-forward layer. Defaults to ``256``.
+- ``n_layers (int)`` : The number of transformer layers in the transformer encoder. Defaults to ``2``.
+- ``n_heads (int)`` : The number of attention heads for the multi-head attention layer. Defaults to ``2``.
+- ``hidden_dropout_prob (float)`` : The probability of an element to be zeroed. Defaults to ``0.5``.
+- ``attn_dropout_prob (float)`` : The probability of an attention score to be zeroed. Defaults to ``0.5``.
+- ``hidden_act (str)`` : The activation function in the feed-forward layer. Defaults to ``'gelu'``. Range in ``['gelu', 'relu', 'swish', 'tanh', 'sigmoid']``.
+- ``layer_norm_eps (float)`` : A value added to the denominator for numerical stability. Defaults to ``1e-12``.
+- ``initializer_range (float)`` : The standard deviation for normal initialization. Defaults to ``0.02``.
+- ``mask_ratio (float)`` : The probability of an item being replaced by the MASK token. Defaults to ``0.2``.
+- ``aap_weight (float)`` : The weight for the Associated Attribute Prediction loss. Defaults to ``1.0``.
+- ``mip_weight (float)`` : The weight for the Masked Item Prediction loss. Defaults to ``0.2``.
+- ``map_weight (float)`` : The weight for the Masked Attribute Prediction loss. Defaults to ``1.0``.
+- ``sp_weight (float)`` : The weight for the Segment Prediction loss. Defaults to ``0.5``.
+- ``train_stage (str)`` : The training stage. Defaults to ``'pretrain'``. Range in ``['pretrain', 'finetune']``.
+- ``item_attribute (str)`` : The item features used as attributes for pre-training. Defaults to ``'class'`` for the ml-100k dataset.
+- ``save_step (int)`` : Save the pre-trained model every ``save_step`` pre-training epochs. Defaults to ``10``.
+- ``pre_model_path (str)`` : The path of the pretrained model. Defaults to ``''``.
+- ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task is optimized in a pair-wise way, which maximizes the difference between the positive item and the negative item. In this way, negative sampling is necessary, e.g. setting ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+
+
+**A Running Example:**
+
+1. Run pre-training. Write the following code to ``run_pretrain.py``
+
+.. code:: python
+
+    from recbole.quick_start import run_recbole
+
+    config_dict = {
+        'train_stage': 'pretrain',
+        'save_step': 10,
+    }
+    run_recbole(model='S3Rec', dataset='ml-100k',
+                config_dict=config_dict, saved=False)
+
+And then:
+
+.. code:: bash
+
+    python run_pretrain.py
+
+2. Run fine-tuning. Write the following code to ``run_finetune.py``
+
+.. 
code:: python
+
+    from recbole.quick_start import run_recbole
+
+    config_dict = {
+        'train_stage': 'finetune',
+        'pre_model_path': './saved/S3Rec-ml-100k-100.pth',
+    }
+    run_recbole(model='S3Rec', dataset='ml-100k',
+                config_dict=config_dict)
+
+And then:
+
+.. code:: bash
+
+    python run_finetune.py
+
+
+**Notes:**
+
+- In the pre-training stage, the pre-trained model is saved every 10 epochs to ``./saved/``, named ``S3Rec-[dataset_name]-[pretrain_epochs].pth`` (e.g. ``S3Rec-ml-100k-100.pth``).
+
+- In the fine-tuning stage, please make sure that the pre-trained model path exists.
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and save them as ``hyper.test``.
+
+.. code:: bash
+
+    pretrain_epochs choice [50, 100, 150]
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the model:
+
+.. code:: bash
+
+    python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
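For reference, the checkpoint naming described in the Notes above can be sketched with a small helper (hypothetical, not part of RecBole; it only reproduces the ``S3Rec-[dataset_name]-[pretrain_epochs].pth`` pattern, so the last entry is the path to pass as ``pre_model_path``):

```python
def pretrain_checkpoints(model, dataset, pretrain_epochs, save_step):
    """List the checkpoint filenames produced during pre-training,
    assuming one checkpoint every save_step epochs under ./saved/."""
    return [
        f"./saved/{model}-{dataset}-{epoch}.pth"
        for epoch in range(save_step, pretrain_epochs + 1, save_step)
    ]

# With the defaults above (save_step=10) and 100 pre-training epochs,
# the final checkpoint matches the fine-tuning example:
print(pretrain_checkpoints("S3Rec", "ml-100k", 100, 10)[-1])
# → ./saved/S3Rec-ml-100k-100.pth
```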
+ + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` diff --git a/docs/source/user_guide/model/sequential/sasrec.rst b/docs/source/user_guide/model/sequential/sasrec.rst new file mode 100644 index 000000000..b67d2f3f0 --- /dev/null +++ b/docs/source/user_guide/model/sequential/sasrec.rst @@ -0,0 +1,97 @@ +SASRec +=========== + +Introduction +--------------------- + +`[paper] <https://ieeexplore.ieee.org/document/8594844/>`_ + +**Title:** Self-Attentive Sequential Recommendation + +**Authors:** Wang-Cheng Kang, Julian McAuley + +**Abstract:** Sequential dynamics are a key feature of many modern recommender systems, +which seek to capture the 'context' of users' activities on the basis of actions they have +performed recently. To capture such patterns, two approaches have proliferated: Markov Chains (MCs) +and Recurrent Neural Networks (RNNs). Markov Chains assume that a user's next action can be +predicted on the basis of just their last (or last few) actions, while RNNs in principle allow +for longer-term semantics to be uncovered. Generally speaking, MC-based methods perform best in +extremely sparse datasets, where model parsimony is critical, while RNNs perform better in denser +datasets where higher model complexity is affordable. The goal of our work is to balance these +two goals, by proposing a self-attention based sequential model (SASRec) that allows us to capture +long-term semantics (like an RNN), but, using an attention mechanism, makes its predictions based +on relatively few actions (like an MC). At each time step, SASRec seeks to identify which items +are 'relevant' from a user's action history, and use them to predict the next item. 
Extensive
+empirical studies show that our method outperforms various state-of-the-art sequential
+models (including MC/CNN/RNN-based approaches) on both sparse and dense datasets.
+Moreover, the model is an order of magnitude more efficient than comparable CNN/RNN-based models.
+Visualizations on attention weights also show how our model adaptively handles datasets with
+various density, and uncovers meaningful patterns in activity sequences.
+
+.. image:: ../../../asset/sasrec.png
+   :width: 500
+   :align: center
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``hidden_size (int)`` : The number of features in the hidden state. It is also the initial embedding size of items. Defaults to ``64``.
+- ``inner_size (int)`` : The inner hidden size in the feed-forward layer. Defaults to ``256``.
+- ``n_layers (int)`` : The number of transformer layers in the transformer encoder. Defaults to ``2``.
+- ``n_heads (int)`` : The number of attention heads for the multi-head attention layer. Defaults to ``2``.
+- ``hidden_dropout_prob (float)`` : The probability of an element to be zeroed. Defaults to ``0.5``.
+- ``attn_dropout_prob (float)`` : The probability of an attention score to be zeroed. Defaults to ``0.5``.
+- ``hidden_act (str)`` : The activation function in the feed-forward layer. Defaults to ``'gelu'``. Range in ``['gelu', 'relu', 'swish', 'tanh', 'sigmoid']``.
+- ``layer_norm_eps (float)`` : A value added to the denominator for numerical stability. Defaults to ``1e-12``.
+- ``initializer_range (float)`` : The standard deviation for normal initialization. Defaults to ``0.02``.
+- ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task is optimized in a pair-wise way, which maximizes the difference between the positive item and the negative item.
In this way, negative sampling is necessary, e.g. setting ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+
+
+**A Running Example:**
+
+Write the following code to a python file, such as ``run.py``
+
+.. code:: python
+
+    from recbole.quick_start import run_recbole
+
+    run_recbole(model='SASRec', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+    python run.py
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and save them as ``hyper.test``.
+
+.. code:: bash
+
+    learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+    attn_dropout_prob choice [0.2, 0.5]
+    hidden_dropout_prob choice [0.2, 0.5]
+    n_heads choice [1, 2]
+    n_layers choice [1,2,3]
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the model:
+
+.. code:: bash
+
+    python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test
+
+For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`.
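For instance, the template command above instantiated for this page's model and dataset (``test.yaml`` here is a hypothetical config file; adjust it to your setup):

```bash
python run_hyper.py --model=SASRec --dataset=ml-100k --config_files=test.yaml --params_file=hyper.test
```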
+
+
+If you want to change parameters, dataset or evaluation settings, take a look at
+
+- :doc:`../../../user_guide/config_settings`
+- :doc:`../../../user_guide/data_intro`
+- :doc:`../../../user_guide/evaluation_support`
+- :doc:`../../../user_guide/usage`
\ No newline at end of file
diff --git a/docs/source/user_guide/model/sequential/sasrecf.rst b/docs/source/user_guide/model/sequential/sasrecf.rst
new file mode 100644
index 000000000..ecf8bab8a
--- /dev/null
+++ b/docs/source/user_guide/model/sequential/sasrecf.rst
@@ -0,0 +1,77 @@
+SASRecF
+===========
+
+Introduction
+---------------------
+
+SASRecF is an extension of SASRec that concatenates item representations with item feature representations as the model input.
+
+Running with RecBole
+-------------------------
+
+**Model Hyper-Parameters:**
+
+- ``hidden_size (int)`` : The number of features in the hidden state. It is also the initial embedding size of items. Defaults to ``64``.
+- ``inner_size (int)`` : The inner hidden size in the feed-forward layer. Defaults to ``256``.
+- ``n_layers (int)`` : The number of transformer layers in the transformer encoder. Defaults to ``2``.
+- ``n_heads (int)`` : The number of attention heads for the multi-head attention layer. Defaults to ``2``.
+- ``hidden_dropout_prob (float)`` : The probability of an element to be zeroed. Defaults to ``0.5``.
+- ``attn_dropout_prob (float)`` : The probability of an attention score to be zeroed. Defaults to ``0.5``.
+- ``hidden_act (str)`` : The activation function in the feed-forward layer. Defaults to ``'gelu'``. Range in ``['gelu', 'relu', 'swish', 'tanh', 'sigmoid']``.
+- ``layer_norm_eps (float)`` : A value added to the denominator for numerical stability. Defaults to ``1e-12``.
+- ``initializer_range (float)`` : The standard deviation for normal initialization. Defaults to ``0.02``.
+- ``selected_features (list)`` : The list of selected item features. Defaults to ``['class']`` for the ml-100k dataset.
+- ``pooling_mode (str)`` : The intra-feature pooling mode. Defaults to ``'sum'``.
Range in ``['max', 'mean', 'sum']``.
+- ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth. In this way, negative sampling is not needed. If it is set to ``'BPR'``, the training task is optimized in a pair-wise way, which maximizes the difference between the positive item and the negative item. In this way, negative sampling is necessary, e.g. setting ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``.
+
+
+**A Running Example:**
+
+Write the following code to a python file, such as ``run.py``
+
+.. code:: python
+
+    from recbole.quick_start import run_recbole
+
+    run_recbole(model='SASRecF', dataset='ml-100k')
+
+And then:
+
+.. code:: bash
+
+    python run.py
+
+**Notes:**
+
+- SASRecF is a sequential model that integrates item context information. ``selected_features`` controls which item context information is used. The features used must exist in the dataset and be loaded by the data module of RecBole, i.e. every value in ``selected_features`` must appear in ``load_col``.
+
+Tuning Hyper Parameters
+-------------------------
+
+If you want to use ``HyperTuning`` to tune the hyper-parameters of this model, you can copy the following settings and save them as ``hyper.test``.
+
+.. code:: bash
+
+    learning_rate choice [0.01,0.005,0.001,0.0005,0.0001]
+    attn_dropout_prob choice [0.2, 0.5]
+    hidden_dropout_prob choice [0.2, 0.5]
+    n_heads choice [1, 2]
+    n_layers choice [1,2,3]
+
+Note that these hyper-parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model.
+
+Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the model:
+
+.. 
code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` \ No newline at end of file diff --git a/docs/source/user_guide/model/sequential/shan.rst b/docs/source/user_guide/model/sequential/shan.rst new file mode 100644 index 000000000..6e97a1b0d --- /dev/null +++ b/docs/source/user_guide/model/sequential/shan.rst @@ -0,0 +1,96 @@ +SHAN +=========== + +Introduction +--------------------- + +`[paper] <https://opus.lib.uts.edu.au/handle/10453/126040>`_ + +**Title:** Sequential Recommender System based on Hierarchical Attention Networks + +**Authors:** Haochao Ying, Fuzhen Zhuang, Fuzheng Zhang, Yanchi Liu, Guandong Xu, Xing Xie, Hui Xiong, Jian Wu + +**Abstract:** With a large amount of user activity data accumulated, it is crucial to exploit user sequential behavior for sequential recommendations. Conventionally, user general taste and recent demand are combined to promote recommendation performances. However, existing methods often neglect that user long-term preference keep evolving over time, and building a static representation for user general taste may not adequately reflect the dynamic characters. Moreover, they integrate user-item or item-item interactions through a linear way which limits the capability of model. To this end, in this paper, we propose a novel two-layer hierarchical attention network, which takes the above properties into account, to recommend the next item user might be interested. Specifically, the first attention layer learns user long-term preferences based on the historical purchased item representation, while the second one outputs final user representation through coupling user long-term and short-term preferences. The experimental study demonstrates the superiority of our method compared with other state-of-the-art ones. + +.. image:: ../../../asset/shan.jpg + :width: 600 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of users and items. Defaults to ``64``. +- ``short_item_length (int)`` : The number of the user's most recent items regarded as short-term behavior. Defaults to ``2``. +- ``reg_weight (float)`` : The L2 regularization weight. Defaults to ``[0.01,0.0001]``. +- ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth; in this case, negative sampling is not needed. If it is set to ``'BPR'``, the training task is optimized in a pair-wise way, which maximizes the difference between the positive item and the negative item; in this case, negative sampling is necessary, e.g. setting ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``. + +**A Running Example:** + +Write the following code into a python file, such as `run.py`: + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='SHAN', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +**Notes:** + +- By setting ``reproducibility=False``, the training speed of SHAN can be greatly accelerated. + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune the hyper parameters of this model, you can copy the following settings and name the file ``hyper.test``. + +.. 
code:: bash + + learning_rate choice [0.001] + embedding_size choice [64] + short_item_length choice [1,2,4,8] + reg_weight choice ['[0.0,0.0]','[0.01,0.0001]'] + +Note that these hyper parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the hyper parameters: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` + diff --git a/docs/source/user_guide/model/sequential/srgnn.rst b/docs/source/user_guide/model/sequential/srgnn.rst new file mode 100644 index 000000000..64bdfb9a4 --- /dev/null +++ b/docs/source/user_guide/model/sequential/srgnn.rst @@ -0,0 +1,84 @@ +SRGNN +=========== + +Introduction +--------------------- + +`[paper] <https://www.aaai.org/ojs/index.php/AAAI/article/view/3804>`_ + +**Title:** Session-based Recommendation with Graph Neural Networks + +**Authors:** Shu Wu, Yuyuan Tang, Yanqiao Zhu, Liang Wang, Xing Xie, Tieniu Tan + +**Abstract:** The problem of session-based recommendation aims to predict user actions based on anonymous sessions. Previous methods model a session as a sequence and estimate user representations besides item representations to make recommendations. Though achieved promising results, they are insufficient to obtain accurate user vectors in sessions and neglect complex transitions of items. To obtain accurate item embedding and take complex transitions of items into account, we propose a novel method, i.e. Session-based Recommendation with Graph Neural Networks, SR-GNN for brevity. In the proposed method, session sequences are modeled as graph-structured data. Based on the session graph, GNN can capture complex transitions of items, which are difficult to be revealed by previous conventional sequential methods. Each session is then represented as the composition of the global preference and the current interest of that session using an attention network. Extensive experiments conducted on two real datasets show that SR-GNN evidently outperforms the state-of-the-art session-based recommendation methods consistently. + + +.. image:: ../../../asset/srgnn.png + :width: 700 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of items. Defaults to ``64``. +- ``step (int)`` : The number of layers in the GNN. Defaults to ``1``. +- ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth; in this case, negative sampling is not needed. If it is set to ``'BPR'``, the training task is optimized in a pair-wise way, which maximizes the difference between the positive item and the negative item; in this case, negative sampling is necessary, e.g. setting ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``. + +**A Running Example:** + +Write the following code into a python file, such as `run.py`: + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='SRGNN', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune the hyper parameters of this model, you can copy the following settings and name the file ``hyper.test``. + +.. 
code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + step choice [1, 2] + +Note that these hyper parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the hyper parameters: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` + + diff --git a/docs/source/user_guide/model/sequential/stamp.rst b/docs/source/user_guide/model/sequential/stamp.rst new file mode 100644 index 000000000..448c68745 --- /dev/null +++ b/docs/source/user_guide/model/sequential/stamp.rst @@ -0,0 +1,87 @@ +STAMP +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/abs/10.1145/3219819.3219950>`_ + +**Title:** STAMP: Short-Term Attention/Memory Priority Model for Session-based Recommendation + +**Authors:** Qiao Liu, Yifu Zeng, Refuoe Mokhosi, Haibin Zhang + +**Abstract:** Predicting users’ actions based on anonymous sessions is a challenging problem in web-based behavioral modeling research, mainly due to the uncertainty of user behavior and the limited information. Recent advances in recurrent neural networks have led to promising approaches to solving this problem, with long short-term memory model proving effective in capturing users’ general interests from previous clicks. However, none of the existing approaches explicitly take the effects of users’ current actions on their next moves into account. In this study, we argue that a long-term memory model may be insufficient for modeling long sessions that usually contain user interests drift caused by unintended clicks. A novel short-term attention/memory priority model is proposed as a remedy, which is capable of capturing users’ general interests from the long-term memory of a session context, whilst taking into account users’ current interests from the short-term memory of the last-clicks. The validity and efficacy of the proposed attention mechanism is extensively evaluated on three benchmark data sets from the RecSys Challenge 2015 and CIKM Cup 2016. The numerical results show that our model achieves state-of-the-art performance in all the tests. + +.. image:: ../../../asset/stamp.png + :width: 500 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of items. Defaults to ``64``. +- ``loss_type (str)`` : The type of loss function. If it is set to ``'CE'``, the training task is regarded as a multi-classification task and the target item is the ground truth; in this case, negative sampling is not needed. If it is set to ``'BPR'``, the training task is optimized in a pair-wise way, which maximizes the difference between the positive item and the negative item; in this case, negative sampling is necessary, e.g. setting ``training_neg_sample_num = 1``. Defaults to ``'CE'``. Range in ``['BPR', 'CE']``. + +**A Running Example:** + +Write the following code into a python file, such as `run.py`: + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='STAMP', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune the hyper parameters of this model, you can copy the following settings and name the file ``hyper.test``. + +.. 
code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + +Note that these hyper parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the hyper parameters: + +.. code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` + + diff --git a/docs/source/user_guide/model/sequential/transrec.rst b/docs/source/user_guide/model/sequential/transrec.rst new file mode 100644 index 000000000..5e6839b5d --- /dev/null +++ b/docs/source/user_guide/model/sequential/transrec.rst @@ -0,0 +1,84 @@ +TransRec +=========== + +Introduction +--------------------- + +`[paper] <https://dl.acm.org/doi/10.1145/3109859.3109882>`_ + +**Title:** Translation-based Recommendation + +**Authors:** Ruining He, Wang-Cheng Kang, Julian McAuley + +**Abstract:** Modeling the complex interactions between users and items as well as amongst items themselves is at the core of designing successful recommender systems. One classical setting is predicting users' personalized sequential behavior (or 'next-item' recommendation), where the challenges mainly lie in modeling 'third-order' interactions between a user, her previously visited item(s), and the next item to consume. Existing methods typically decompose these higher-order interactions into a combination of pairwise relationships, by way of which user preferences (user-item interactions) and sequential patterns (item-item interactions) are captured by separate components. In this paper, we propose a unified method, TransRec, to model such third-order relationships for large-scale sequential prediction. Methodologically, we embed items into a 'transition space' where users are modeled as translation vectors operating on item sequences. Empirically, this approach outperforms the state-of-the-art on a wide spectrum of real-world datasets. + +.. image:: ../../../asset/transrec.png + :width: 500 + :align: center + +Running with RecBole +------------------------- + +**Model Hyper-Parameters:** + +- ``embedding_size (int)`` : The embedding size of items. Defaults to ``64``. + +**A Running Example:** + +Write the following code into a python file, such as `run.py`: + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(model='TransRec', dataset='ml-100k') + +And then: + +.. code:: bash + + python run.py + +**Notes:** + +- Different from other sequential models, TransRec must be optimized in a pair-wise way using negative sampling, so it needs ``training_neg_sample_num=1``. + +Tuning Hyper Parameters +------------------------- + +If you want to use ``HyperTuning`` to tune the hyper parameters of this model, you can copy the following settings and name the file ``hyper.test``. + +.. code:: bash + + learning_rate choice [0.01,0.005,0.001,0.0005,0.0001] + train_batch_size choice [512, 1024, 2048] + +Note that these hyper parameter ranges are provided for reference only; we cannot guarantee that they are optimal for this model. + +Then, with the source code of RecBole (you can download it from GitHub), you can run ``run_hyper.py`` to tune the hyper parameters: + +.. 
code:: bash + + python run_hyper.py --model=[model_name] --dataset=[dataset_name] --config_files=[config_files_path] --params_file=hyper.test + +For more details about Parameter Tuning, refer to :doc:`../../../user_guide/usage/parameter_tuning`. + + +If you want to change parameters, dataset or evaluation settings, take a look at + +- :doc:`../../../user_guide/config_settings` +- :doc:`../../../user_guide/data_intro` +- :doc:`../../../user_guide/evaluation_support` +- :doc:`../../../user_guide/usage` diff --git a/docs/source/user_guide/model_intro.rst b/docs/source/user_guide/model_intro.rst new file mode 100644 index 000000000..5be3c657d --- /dev/null +++ b/docs/source/user_guide/model_intro.rst @@ -0,0 +1,106 @@ +Model Introduction +===================== +We implement 65 recommendation models covering general recommendation, sequential recommendation, +context-aware recommendation and knowledge-based recommendation. A brief introduction to these models is as follows: + + +General Recommendation +-------------------------- + +.. toctree:: + :maxdepth: 1 + + model/general/pop + model/general/itemknn + model/general/bpr + model/general/neumf + model/general/convncf + model/general/dmf + model/general/fism + model/general/nais + model/general/spectralcf + model/general/gcmc + model/general/ngcf + model/general/lightgcn + model/general/dgcf + model/general/line + model/general/multivae + model/general/multidae + model/general/macridvae + model/general/cdae + model/general/enmf + model/general/nncf + + +Context-aware Recommendation +------------------------------- + +.. 
toctree:: + :maxdepth: 1 + + model/context/lr + model/context/fm + model/context/nfm + model/context/deepfm + model/context/xdeepfm + model/context/afm + model/context/ffm + model/context/fwfm + model/context/fnn + model/context/pnn + model/context/dssm + model/context/widedeep + model/context/din + model/context/dcn + model/context/autoint + model/context/xgboost + + +Sequential Recommendation +--------------------------------- + +.. toctree:: + :maxdepth: 1 + + model/sequential/fpmc + model/sequential/gru4rec + model/sequential/narm + model/sequential/stamp + model/sequential/caser + model/sequential/nextitnet + model/sequential/transrec + model/sequential/sasrec + model/sequential/bert4rec + model/sequential/srgnn + model/sequential/gcsan + model/sequential/gru4recf + model/sequential/sasrecf + model/sequential/fdsa + model/sequential/s3rec + model/sequential/gru4reckg + model/sequential/ksr + model/sequential/fossil + model/sequential/shan + model/sequential/repeatnet + model/sequential/hgn + model/sequential/hrm + model/sequential/npe + + + +Knowledge-based Recommendation +--------------------------------- + +.. toctree:: + :maxdepth: 1 + + model/knowledge/cke + model/knowledge/cfkg + model/knowledge/ktup + model/knowledge/kgat + model/knowledge/ripplenet + model/knowledge/mkr + model/knowledge/kgcn + model/knowledge/kgnnls + + diff --git a/docs/source/user_guide/usage.rst b/docs/source/user_guide/usage.rst new file mode 100644 index 000000000..f98dd081a --- /dev/null +++ b/docs/source/user_guide/usage.rst @@ -0,0 +1,14 @@ +Usage +=================== +Here we introduce how to use RecBole. + +.. 
toctree:: + :maxdepth: 1 + + usage/run_recbole + usage/use_modules + usage/parameter_tuning + usage/running_new_dataset + usage/running_different_models + usage/qa + usage/load_pretrained_embedding \ No newline at end of file diff --git a/docs/source/user_guide/usage/load_pretrained_embedding.rst b/docs/source/user_guide/usage/load_pretrained_embedding.rst new file mode 100644 index 000000000..c8f3c010a --- /dev/null +++ b/docs/source/user_guide/usage/load_pretrained_embedding.rst @@ -0,0 +1,44 @@ +Load Pre-trained Embedding +=========================== + +For users who want to use pre-trained user (item) embeddings to train their model, we provide a simple way as follows. + +Firstly, prepare your additional embedding feature file, which contains at least two columns (id & embedding vector) in the following format, and name it as ``dataset.suffix`` (e.g.: ``ml-1m.useremb``). + +============= =============================== +uid:token user_emb:float_seq +============= =============================== +1 -115.08 13.60 113.69 +2 -130.97 263.05 -129.88 +============= =============================== + +Note that the header of the user id here must be different from the user id in your ``.user`` file or ``.inter`` file (e.g.: if the header of the user id in the ``.user`` or ``.inter`` file is ``user_id:token``, the header of the user id in your additional embedding feature file must be different. It can be either ``uid:token`` or ``userid:token``). + +Secondly, update the args as follows (suppose that ``USER_ID_FIELD: user_id``): + +.. code:: yaml + + additional_feat_suffix: [useremb] + load_col: + # inter/user/item/...: As usual + useremb: [uid, user_emb] + fields_in_same_space: [[uid, user_id]] + preload_weight: + uid: user_emb + +Then, this additional embedding feature file will be loaded into the :class:`Dataset` object. These new features can be accessed as follows: + +.. code:: python + + dataset = create_dataset(config) + print(dataset.useremb_feat) + +In your model, the user embedding matrix can be initialized with your pre-trained embedding vectors as follows: + +.. code:: python + + class YourModel(GeneralRecommender): + def __init__(self, config, dataset): + pretrained_user_emb = dataset.get_preload_weight('uid') + self.user_embedding = nn.Embedding.from_pretrained(torch.from_numpy(pretrained_user_emb)) + diff --git a/docs/source/user_guide/usage/parameter_tuning.rst b/docs/source/user_guide/usage/parameter_tuning.rst new file mode 100644 index 000000000..5c8168432 --- /dev/null +++ b/docs/source/user_guide/usage/parameter_tuning.rst @@ -0,0 +1,148 @@ +Parameter Tuning +===================== +RecBole features the capability of automatic parameter +(or hyper-parameter) tuning. One can readily optimize +a given model over the provided hyper-parameter spaces. + +The general steps are given as follows: + +To begin with, the user has to declare a +:class:`~recbole.trainer.hyper_tuning.HyperTuning` +instance in the running python file (e.g., `run.py`): + +.. code:: python + + from recbole.trainer import HyperTuning + from recbole.quick_start import objective_function + + hp = HyperTuning(objective_function=objective_function, algo='exhaustive', + params_file='model.hyper', fixed_config_file_list=['example.yaml']) + +:attr:`objective_function` is the optimization objective: +its input is a set of parameters, +and its output is the result achieved with these parameters. +Users can design this :attr:`objective_function` according to their own requirements. +The user can also use the encapsulated :attr:`objective_function`, that is: + +.. 
code:: python + + def objective_function(config_dict=None, config_file_list=None): + + config = Config(config_dict=config_dict, config_file_list=config_file_list) + init_seed(config['seed']) + dataset = create_dataset(config) + train_data, valid_data, test_data = data_preparation(config, dataset) + model = get_model(config['model'])(config, train_data).to(config['device']) + trainer = get_trainer(config['MODEL_TYPE'], config['model'])(config, model) + best_valid_score, best_valid_result = trainer.fit(train_data, valid_data, verbose=False) + test_result = trainer.evaluate(test_data) + + return { + 'best_valid_score': best_valid_score, + 'valid_score_bigger': config['valid_metric_bigger'], + 'best_valid_result': best_valid_result, + 'test_result': test_result + } + +:attr:`algo` is the optimization algorithm. RecBole realizes this module based +on hyperopt_. In addition, we also support the grid search tuning method. + +.. code:: python + + from hyperopt import tpe + + # an optimization algorithm provided by hyperopt + hp1 = HyperTuning(algo=tpe.suggest) + + # Grid Search + hp2 = HyperTuning(algo='exhaustive') + +:attr:`params_file` contains the ranges of the parameters, which is exemplified as +(e.g., `model.hyper`): + +.. code:: none + + learning_rate loguniform -8,0 + embedding_size choice [64,96,128] + mlp_hidden_size choice ['[64,64,64]','[128,128]'] + +Each line represents a parameter and the corresponding search range. +There are three components: the parameter name, the range type, and the range. 
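+
+As a sketch (the parameter names below are only illustrative and not tied to any particular model), a params file that exercises each of the four supported range types could look like:
+
+.. code:: none
+
+    hidden_dropout_prob uniform 0.1,0.9
+    learning_rate loguniform -8,0
+    embedding_size choice [32,64,128]
+    train_batch_size quniform 256,2048,256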
+ +:class:`~recbole.trainer.hyper_tuning.HyperTuning` supports four range types, +the details are as follows: + ++----------------+---------------------------------+------------------------------------------------------------------+ +| range type     | range                           | description                                                      | ++================+=================================+==================================================================+ +| choice         | options(list)                   | search in options                                                | ++----------------+---------------------------------+------------------------------------------------------------------+ +| uniform        | low(int),high(int)              | search in uniform distribution: (low,high)                       | ++----------------+---------------------------------+------------------------------------------------------------------+ +| loguniform     | low(int),high(int)              | search in uniform distribution: exp(uniform(low,high))           | ++----------------+---------------------------------+------------------------------------------------------------------+ +| quniform       | low(int),high(int),q(int)       | search in uniform distribution: round(uniform(low,high)/q)*q     | ++----------------+---------------------------------+------------------------------------------------------------------+ + +It should be noted that if the parameter is a list and the range type is choice, +then the inner list should be quoted, e.g., :attr:`mlp_hidden_size` in `model.hyper`. + +.. _hyperopt: https://github.com/hyperopt/hyperopt + +:attr:`fixed_config_file_list` holds the fixed parameters, e.g., dataset related parameters and evaluation parameters. +These parameters should be aligned with the format in :attr:`config_file_list`. See details in :doc:`../config_settings`. + +HyperTuning can be called like this: + +.. 
code:: python + + from recbole.trainer import HyperTuning + from recbole.quick_start import objective_function + + hp = HyperTuning(objective_function=objective_function, algo='exhaustive', + params_file='model.hyper', fixed_config_file_list=['example.yaml']) + + # run + hp.run() + # export result to the file + hp.export_result(output_file='hyper_example.result') + # print best parameters + print('best params: ', hp.best_params) + # print best result + print('best result: ') + print(hp.params2result[hp.params2str(hp.best_params)]) + +Run like: + +.. code:: bash + + python run.py --dataset=[dataset_name] --model=[model_name] + +:attr:`dataset_name` is the dataset name, :attr:`model_name` is the model name, which can be controlled by the command line or the yaml configuration files. + +For example: + +.. code:: yaml + + dataset: ml-100k + model: BPR + +A simple example is to search the :attr:`learning_rate` and :attr:`embedding_size` in BPR, that is, + +.. code:: bash + + running_parameters: + {'embedding_size': 128, 'learning_rate': 0.005} + current best valid score: 0.3795 + current best valid result: + {'recall@10': 0.2008, 'mrr@10': 0.3795, 'ndcg@10': 0.2151, 'hit@10': 0.7306, 'precision@10': 0.1466} + current test result: + {'recall@10': 0.2186, 'mrr@10': 0.4388, 'ndcg@10': 0.2591, 'hit@10': 0.7381, 'precision@10': 0.1784} + + ... 
+ + best params: {'embedding_size': 64, 'learning_rate': 0.001} + best result: { + 'best_valid_result': {'recall@10': 0.2169, 'mrr@10': 0.4005, 'ndcg@10': 0.235, 'hit@10': 0.7582, 'precision@10': 0.1598} + 'test_result': {'recall@10': 0.2368, 'mrr@10': 0.4519, 'ndcg@10': 0.2768, 'hit@10': 0.7614, 'precision@10': 0.1901} + } diff --git a/docs/source/user_guide/usage/qa.rst b/docs/source/user_guide/usage/qa.rst new file mode 100644 index 000000000..f088c6802 --- /dev/null +++ b/docs/source/user_guide/usage/qa.rst @@ -0,0 +1,36 @@ +Clarifications on some practical issues +========================================= + +**Q1**: + +Why is the result of ``Dataset.item_num`` always one more than the actual number of items in the dataset? + +**A1**: + +We add ``[PAD]`` for all the token-like fields. Thus, after remapping IDs, ``0`` will be reserved for ``[PAD]``, which makes the result of ``Dataset.item_num`` one more than the actual number. + +Note that for knowledge-based models, we add one more relation called ``U-I Relation``. It describes the historical interactions which will be used in :meth:`recbole.data.dataset.kg_dataset.KnowledgeBasedDataset.ckg_graph`. +Thus the result of ``KGDataset.relation_num`` is two more than the actual number of relations. + +**Q2**: + +Why are the test results usually better than the best valid results? + +**A2**: + +For more rigorous evaluation, the user-item interaction records in the validation set will not be ranked while testing. +Thus the distributions of the validation & test sets may be inconsistent. + +However, this doesn't affect the comparison between models. + +**Q3**: + +Why do I receive a warning about ``batch_size changed``? What is the meaning of :attr:`batch_size` in the dataloader? + +**A3**: + +In RecBole's dataloader, :attr:`batch_size` means the upper bound of the number of **interactions** in one single batch. + +On the one hand, it's easy to calculate and control the usage of GPU memories. 
E.g., when comparing between different datasets, you don't need to change the value of :attr:`batch_size`, because the usage of GPU memories will not change a lot. + +On the other hand, in RecBole's top-k evaluation, we need the interactions of each user grouped in one batch. In other words, the interactions of any user should not be separated into multiple batches. We try to feed more interactions into one batch, but due to the above rules, the :attr:`batch_size` is just an upper bound. And :meth:`_batch_size_adaptation` is designed to adapt the actual batch size dynamically. Thus, while executing :meth:`_batch_size_adaptation`, you will receive a warning message. diff --git a/docs/source/user_guide/usage/run_recbole.rst b/docs/source/user_guide/usage/run_recbole.rst new file mode 100644 index 000000000..90b979481 --- /dev/null +++ b/docs/source/user_guide/usage/run_recbole.rst @@ -0,0 +1,39 @@ +Use run_recbole +========================== +We enclose the training and evaluation processes in the api of +:func:`~recbole.quick_start.quick_start.run_recbole`, +which is composed of: dataset loading, dataset splitting, model initialization, +model training and model evaluation. + +If this process satisfies your requirements, you can call this api to use +RecBole. + +You can create a python file (e.g., `run.py`), and write the following code +into the file. + +.. code:: python + + from recbole.quick_start import run_recbole + + run_recbole(dataset=dataset, model=model, config_file_list=config_file_list, config_dict=config_dict) + +:attr:`dataset` is the name of the dataset, such as 'ml-100k', and +:attr:`model` indicates the model name, such as 'BPR'. + +:attr:`config_file_list` indicates the configuration files, and +:attr:`config_dict` is the parameter dict. +The two variables are used to configure parameters in our toolkit. +If you do not want to use the two variables to configure parameters, +please ignore them. 
In addition, parameters can also be controlled +via the command line. + +Please refer to :doc:`../config_settings` for more details about config settings. + +Then execute the following command to run: + +.. code:: bash + + python run.py --[param_name]=[param_value] + +`--[param_name]=[param_value]` is the way to control parameters via +the command line. diff --git a/docs/source/user_guide/usage/running_different_models.rst b/docs/source/user_guide/usage/running_different_models.rst new file mode 100644 index 000000000..e262a940d --- /dev/null +++ b/docs/source/user_guide/usage/running_different_models.rst @@ -0,0 +1,160 @@ +Running Different Models +========================== +Here, we present how to run different models in RecBole. + +Proper Parameters Configuration +---------------------------------- +Since different categories of models have different requirements for data +processing and evaluation settings, we need to configure these settings +appropriately. + +The following will introduce the parameter configuration of these four +categories of models: namely general recommendation, context-aware +recommendation, sequential recommendation and knowledge-based recommendation. + +General Recommendation +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +**specify and load the user and item columns** + +General recommendation models utilize the historical interactions between +users and items to make recommendations, so we need to specify and load the +user and item columns of the dataset. + +.. code:: yaml + + USER_ID_FIELD: user_id + ITEM_ID_FIELD: item_id + load_col: + inter: [user_id, item_id] + +For some datasets, the column names corresponding to user and item in atomic +files may not be `user_id` and `item_id`. Just replace them with the +corresponding column names. + +**training and evaluation settings** + +General recommendation models usually need to group data by user and perform +negative sampling. + +.. 
code:: yaml + + group_by_user: True + training_neg_sample_num: 1 + +Context-aware Recommendation +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +**load the feature columns** + +Context-aware recommendation models utilize the features of users, items and +interactions to make CTR predictions, so we need to load the used features. + +.. code:: yaml + + load_col: + inter: [inter_feature1, inter_feature2] + item: [item_feature1, item_feature2] + user: [user_feature1, user_feature2] + +`inter_feature1` refers to the column name of the corresponding feature in the +inter atomic file. + +**label setting** + +We also need to configure `LABEL_FIELD`, which represents the label column in +the CTR prediction. For context-aware recommendation models, the setting of +`LABEL_FIELD` falls into two cases: + +1) There is a label field in the atomic file whose values are 0/1; we only need to +set it as follows: + +.. code:: yaml + + LABEL_FIELD: label + +2) There is no label field in the atomic file; we need to generate a label field +based on other information. + +.. code:: yaml + + LABEL_FIELD: label + threshold: + rating: 3 + +`rating` is a column in the atomic file and is loaded (by ``load_col``). In this way, +the label of an interaction with ``rating >= 3`` is set to 1, and the rest are +set to 0. + +**training and evaluation settings** + +Context-aware recommendation models usually do not need to group data by user or +perform negative sampling. + +.. code:: yaml + + group_by_user: False + training_neg_sample_num: 0 + +Since there is no need to rank the results, only the first part of ``eval_setting`` +needs to be set, for example: + +.. code:: yaml + + eval_setting: RO_RS + +The evaluation metrics are generally set to `AUC` and `LogLoss`. + +.. 
code:: yaml + + metrics: ['AUC', 'LogLoss'] + + +Sequential Recommendation +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +**specify and load the user, item and time columns** + +Sequential recommendation models utilize the historical interaction sequences +to predict the next item, so we need to specify and load the user, item and +time columns of the dataset. + +.. code:: yaml + + USER_ID_FIELD: user_id + ITEM_ID_FIELD: item_id + TIME_FIELD: timestamp + load_col: + inter: [user_id, item_id, timestamp] + +For some datasets, the column names corresponding to user, item and time in +atomic files may not be `user_id`, `item_id` and `timestamp`; just replace them +with the corresponding column names. + +**maximum length of the sequence** + +The maximum length of the sequence can be modified by setting +``MAX_ITEM_LIST_LENGTH``. + +.. code:: yaml + + MAX_ITEM_LIST_LENGTH: 50 + +Knowledge-based Recommendation +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +**specify and load the KG entity columns** + +Knowledge-based recommendation models utilize KG information to make +recommendations, so we need to specify and load the KG information of the dataset. + +.. code:: yaml + + USER_ID_FIELD: user_id + ITEM_ID_FIELD: item_id + HEAD_ENTITY_ID_FIELD: head_id + TAIL_ENTITY_ID_FIELD: tail_id + RELATION_ID_FIELD: relation_id + ENTITY_ID_FIELD: entity_id + load_col: + inter: [user_id, item_id] + kg: [head_id, relation_id, tail_id] + link: [item_id, entity_id] diff --git a/docs/source/user_guide/usage/running_new_dataset.rst b/docs/source/user_guide/usage/running_new_dataset.rst new file mode 100644 index 000000000..b986567f5 --- /dev/null +++ b/docs/source/user_guide/usage/running_new_dataset.rst @@ -0,0 +1,131 @@ +Running New Dataset +======================= +Here, we present how to use a new dataset in RecBole. + + +Convert to Atomic Files +------------------------- + +If you use one of the collected datasets, you can choose one of the following ways: + +1. 
Download the converted atomic files from `Google Drive <https://drive.google.com/drive/folders/1so0lckI6N6_niVEYaBu-LIcpOdZf99kj?usp=sharing>`_ or `Baidu Wangpan <https://pan.baidu.com/s/1p51sWMgVFbAaHQmL4aD_-g>`_ (Password: e272). +2. Find the conversion scripts in RecDatasets_, and transform the raw data into atomic files. + +If you use other datasets, you should format the data according to the atomic file format. + +.. _RecDatasets: https://github.com/RUCAIBox/RecDatasets + +For the ml-1m dataset, the converted files are: + +**ml-1m.inter** + +============= ============= ============ =============== +user_id:token item_id:token rating:float timestamp:float +============= ============= ============ =============== +1 1193 5 978300760 +1 661 3 978302109 +============= ============= ============ =============== + +**ml-1m.user** + +============= ========= ============ ================ ============== +user_id:token age:token gender:token occupation:token zip_code:token +============= ========= ============ ================ ============== +1 1 F 10 48067 +2 56 M 16 70072 +============= ========= ============ ================ ============== + +**ml-1m.item** + +============= ===================== ================== ============================ +item_id:token movie_title:token_seq release_year:token genre:token_seq +============= ===================== ================== ============================ +1 Toy Story 1995 Animation Children's Comedy +2 Jumanji 1995 Adventure Children's Fantasy +============= ===================== ================== ============================ + + +Local Path +--------------- +The name of the atomic files, the name of the directory containing them, and ``config['dataset']`` should be the same. + +``config['data_path']`` should be the parent directory of the directory containing the atomic files. + +For example: + +.. code:: none + + ~/xxx/yyy/ml-1m/ + ├── ml-1m.inter + ├── ml-1m.item + ├── ml-1m.kg + ├── ml-1m.link + └── ml-1m.user + +.. 
code:: yaml + + data_path: ~/xxx/yyy/ + dataset: ml-1m + +Convert to Dataset +--------------------- +Here, we present how to convert atomic files into :class:`~recbole.data.dataset.dataset.Dataset`. + +Suppose we use ml-1m to train BPR. + +According to the dataset information, you should set the dataset and filtering parameters in the configuration file `ml-1m.yaml`. +For example, we conduct 10-core filtering, remove the records whose rating is smaller than 3 or whose timestamp is earlier than 97830000, and only load the inter data. + +.. code:: yaml + + USER_ID_FIELD: user_id + ITEM_ID_FIELD: item_id + RATING_FIELD: rating + TIME_FIELD: timestamp + + load_col: + inter: [user_id, item_id, rating, timestamp] + + min_user_inter_num: 10 + min_item_inter_num: 10 + lowest_val: + rating: 3 + timestamp: 97830000 + + +.. code:: python + + from recbole.config import Config + from recbole.data import create_dataset, data_preparation + + if __name__ == '__main__': + config = Config(model='BPR', dataset='ml-1m', config_file_list=['ml-1m.yaml']) + dataset = create_dataset(config) + + +Convert to Dataloader +------------------------ +Here, we present how to convert :class:`~recbole.data.dataset.dataset.Dataset` into :obj:`Dataloader`. + +We first set the parameters in the configuration file `ml-1m.yaml`. +We leverage random ordering + ratio-based splitting and full ranking with all item candidates; the splitting ratio is set to 8:1:1. + +.. code:: yaml + + ... + + eval_setting: RO_RS,full + split_ratio: [0.8,0.1,0.1] + + +.. code:: python + + from recbole.config import Config + from recbole.data import create_dataset, data_preparation + + + if __name__ == '__main__': + + ... 
+ + train_data, valid_data, test_data = data_preparation(config, dataset) diff --git a/docs/source/user_guide/usage/use_modules.rst b/docs/source/user_guide/usage/use_modules.rst new file mode 100644 index 000000000..6b20556a1 --- /dev/null +++ b/docs/source/user_guide/usage/use_modules.rst @@ -0,0 +1,203 @@ +Use Modules +================ +You can call different modules in RecBole to satisfy your requirements. + +The complete process is as follows: + +.. code:: python + + from logging import getLogger + from recbole.config import Config + from recbole.data import create_dataset, data_preparation + from recbole.model.general_recommender import BPR + from recbole.trainer import Trainer + from recbole.utils import init_seed, init_logger + + if __name__ == '__main__': + + # configurations initialization + config = Config(model='BPR', dataset='ml-100k') + + # init random seed + init_seed(config['seed'], config['reproducibility']) + + # logger initialization + init_logger(config) + logger = getLogger() + + # write config info into log + logger.info(config) + + # dataset creating and filtering + dataset = create_dataset(config) + logger.info(dataset) + + # dataset splitting + train_data, valid_data, test_data = data_preparation(config, dataset) + + # model loading and initialization + model = BPR(config, train_data).to(config['device']) + logger.info(model) + + # trainer loading and initialization + trainer = Trainer(config, model) + + # model training + best_valid_score, best_valid_result = trainer.fit(train_data, valid_data) + + # model evaluation + test_result = trainer.evaluate(test_data) + print(test_result) + + +Configurations Initialization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code:: python + + config = Config(model='BPR', dataset='ml-100k') + +The :class:`~recbole.config.configurator.Config` module is used to set parameters and the experiment setup. +Please refer to :doc:`../config_settings` for more details. + + +Init Random Seed +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. 
code:: python + + init_seed(config['seed'], config['reproducibility']) + +Initializing the random seed to ensure the reproducibility of the experiments. + + +Dataset Filtering +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code:: python + + dataset = create_dataset(config) + +Filtering the data files according to the parameters indicated in the configuration. + + +Dataset Splitting +^^^^^^^^^^^^^^^^^^^^^ + +.. code:: python + + train_data, valid_data, test_data = data_preparation(config, dataset) + +Splitting the dataset according to the parameters indicated in the configuration. + + +Model Initialization +^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code:: python + + model = BPR(config, train_data).to(config['device']) + +Initializing an instance of the model according to the model name. + + +Trainer Initialization +^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code:: python + + trainer = Trainer(config, model) + +Initializing the trainer, which is used for model training and evaluation. + + +Automatic Selection of Model and Trainer +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +In the above example, we manually import the model class :class:`~recbole.model.general_recommender.bpr.BPR` and the trainer class :class:`~recbole.trainer.trainer.Trainer`. +For the implemented models, we support automatically acquiring the corresponding model class and +trainer class through the model name. + + +.. code:: python + + from recbole.utils import get_model, get_trainer + + if __name__ == '__main__': + + ... + + # model loading and initialization + model = get_model(config['model'])(config, train_data).to(config['device']) + + # trainer loading and initialization + trainer = get_trainer(config['MODEL_TYPE'], config['model'])(config, model) + + ... + + +Model Training +^^^^^^^^^^^^^^^^^^^ + +.. code:: python + + best_valid_score, best_valid_result = trainer.fit(train_data, valid_data) + +Inputting the training and validation data, and beginning the training process. 
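The returned :attr:`best_valid_score` is simply the best validation result observed across epochs. A self-contained sketch of that bookkeeping (the per-epoch scores below are hypothetical; RecBole's ``Trainer`` computes them from ``valid_data``):

```python
# Hypothetical per-epoch validation scores standing in for Trainer.fit's
# per-epoch evaluation on valid_data.
valid_scores = [0.21, 0.25, 0.24, 0.27, 0.26]

best_valid_score, best_epoch = float('-inf'), -1
for epoch, score in enumerate(valid_scores):
    if score > best_valid_score:  # assumes a higher valid_metric is better
        best_valid_score, best_epoch = score, epoch

print(best_valid_score, best_epoch)  # -> 0.27 3
```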
+ + +Model Evaluation +^^^^^^^^^^^^^^^^^^^^^^^ +.. code:: python + + test_result = trainer.evaluate(test_data) + +Inputting the test data, and evaluating based on the trained model. + + +Resume Model From Break Point +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Our toolkit also supports reloading the parameters from previously trained models. + +In this example, we present how to continue training the model from previously saved parameters. + +.. code:: python + + ... + + if __name__ == '__main__': + + ... + + # trainer loading and initialization + trainer = get_trainer(config['MODEL_TYPE'], config['model'])(config, model) + + # resume from break point + checkpoint_file = 'checkpoint.pth' + trainer.resume_checkpoint(checkpoint_file) + + # model training + best_valid_score, best_valid_result = trainer.fit(train_data, valid_data) + + ... + +:attr:`checkpoint_file` is the file used to store the model. + + +In this example, we present how to test a model based on previously saved parameters. + +.. code:: python + + ... + + if __name__ == '__main__': + + ... + + # trainer loading and initialization + trainer = get_trainer(config['MODEL_TYPE'], config['model'])(config, model) + + # model evaluation + checkpoint_file = 'checkpoint.pth' + test_result = trainer.evaluate(test_data, model_file=checkpoint_file) + print(test_result) + ... 
\ No newline at end of file diff --git a/recbole/__init__.py b/recbole/__init__.py index 182dd10a5..5e6584fac 100644 --- a/recbole/__init__.py +++ b/recbole/__init__.py @@ -2,4 +2,4 @@ from __future__ import print_function from __future__ import division -__version__ = '0.1.1' +__version__ = '0.2.0' \ No newline at end of file diff --git a/recbole/config/configurator.py b/recbole/config/configurator.py index 381164ae7..44da2fae7 100644 --- a/recbole/config/configurator.py +++ b/recbole/config/configurator.py @@ -19,7 +19,7 @@ import torch from logging import getLogger -from recbole.evaluator import loss_metrics, topk_metrics +from recbole.evaluator import group_metrics, individual_metrics from recbole.utils import get_model, Enum, EvaluatorType, ModelType, InputType, \ general_arguments, training_arguments, evaluation_arguments, dataset_arguments @@ -89,14 +89,16 @@ def _build_yaml_loader(self): loader = yaml.FullLoader loader.add_implicit_resolver( u'tag:yaml.org,2002:float', - re.compile(u'''^(?: + re.compile( + u'''^(?: [-+]?(?:[0-9][0-9_]*)\\.[0-9_]*(?:[eE][-+]?[0-9]+)? |[-+]?(?:[0-9][0-9_]*)(?:[eE][-+]?[0-9]+) |\\.[0-9_]+(?:[eE][-+][0-9]+)? 
|[-+]?[0-9][0-9_]*(?::[0-5]?[0-9])+\\.[0-9_]* |[-+]?\\.(?:inf|Inf|INF) - |\\.(?:nan|NaN|NAN))$''', re.X), - list(u'-+0123456789.')) + |\\.(?:nan|NaN|NAN))$''', re.X + ), list(u'-+0123456789.') + ) return loader def _convert_config_dict(self, config_dict): @@ -175,7 +177,8 @@ def _get_model_and_dataset(self, model, dataset): except KeyError: raise KeyError( 'model need to be specified in at least one of the these ways: ' - '[model variable, config file, config dict, command line] ') + '[model variable, config file, config dict, command line] ' + ) if not isinstance(model, str): final_model_class = model final_model = model.__name__ @@ -187,13 +190,22 @@ def _get_model_and_dataset(self, model, dataset): try: final_dataset = self.external_config_dict['dataset'] except KeyError: - raise KeyError('dataset need to be specified in at least one of the these ways: ' - '[dataset variable, config file, config dict, command line] ') + raise KeyError( + 'dataset need to be specified in at least one of the these ways: ' + '[dataset variable, config file, config dict, command line] ' + ) else: final_dataset = dataset return final_model, final_model_class, final_dataset + def _update_internal_config_dict(self, file): + with open(file, 'r', encoding='utf-8') as f: + config_dict = yaml.load(f.read(), Loader=self.yaml_loader) + if config_dict is not None: + self.internal_config_dict.update(config_dict) + return config_dict + def _load_internal_config_dict(self, model, model_class, dataset): current_path = os.path.dirname(os.path.realpath(__file__)) overall_init_file = os.path.join(current_path, '../properties/overall.yaml') @@ -201,65 +213,46 @@ def _load_internal_config_dict(self, model, model_class, dataset): sample_init_file = os.path.join(current_path, '../properties/dataset/sample.yaml') dataset_init_file = os.path.join(current_path, '../properties/dataset/' + dataset + '.yaml') + quick_start_config_path = os.path.join(current_path, '../properties/quick_start_config/') + 
context_aware_init = os.path.join(quick_start_config_path, 'context-aware.yaml') + context_aware_on_ml_100k_init = os.path.join(quick_start_config_path, 'context-aware_ml-100k.yaml') + DIN_init = os.path.join(quick_start_config_path, 'sequential_DIN.yaml') + DIN_on_ml_100k_init = os.path.join(quick_start_config_path, 'sequential_DIN_on_ml-100k.yaml') + sequential_init = os.path.join(quick_start_config_path, 'sequential.yaml') + special_sequential_on_ml_100k_init = os.path.join(quick_start_config_path, 'special_sequential_on_ml-100k.yaml') + sequential_embedding_model_init = os.path.join(quick_start_config_path, 'sequential_embedding_model.yaml') + knowledge_base_init = os.path.join(quick_start_config_path, 'knowledge_base.yaml') + self.internal_config_dict = dict() for file in [overall_init_file, model_init_file, sample_init_file, dataset_init_file]: if os.path.isfile(file): - with open(file, 'r', encoding='utf-8') as f: - config_dict = yaml.load(f.read(), Loader=self.yaml_loader) - if file == dataset_init_file: - self.parameters['Dataset'] += [key for key in config_dict.keys() if - key not in self.parameters['Dataset']] - if config_dict is not None: - self.internal_config_dict.update(config_dict) + config_dict = self._update_internal_config_dict(file) + if file == dataset_init_file: + self.parameters['Dataset'] += [ + key for key in config_dict.keys() if key not in self.parameters['Dataset'] + ] + self.internal_config_dict['MODEL_TYPE'] = model_class.type if self.internal_config_dict['MODEL_TYPE'] == ModelType.GENERAL: pass - elif self.internal_config_dict['MODEL_TYPE'] == ModelType.CONTEXT: - self.internal_config_dict.update({ - 'eval_setting': 'RO_RS', - 'group_by_user': False, - 'training_neg_sample_num': 0, - 'metrics': ['AUC', 'LogLoss'], - 'valid_metric': 'AUC', - }) + elif self.internal_config_dict['MODEL_TYPE'] in {ModelType.CONTEXT, ModelType.XGBOOST}: + self._update_internal_config_dict(context_aware_init) if dataset == 'ml-100k': - 
self.internal_config_dict.update({ - 'threshold': {'rating': 4}, - 'load_col': {'inter': ['user_id', 'item_id', 'rating', 'timestamp'], - 'user': ['user_id', 'age', 'gender', 'occupation'], - 'item': ['item_id', 'release_year', 'class']}, - }) - + self._update_internal_config_dict(context_aware_on_ml_100k_init) elif self.internal_config_dict['MODEL_TYPE'] == ModelType.SEQUENTIAL: if model == 'DIN': - self.internal_config_dict.update({ - 'eval_setting': 'TO_LS, uni100', - 'metrics': ['AUC', 'LogLoss'], - 'valid_metric': 'AUC', - }) + self._update_internal_config_dict(DIN_init) if dataset == 'ml-100k': - self.internal_config_dict.update({ - 'load_col': {'inter': ['user_id', 'item_id', 'rating', 'timestamp'], - 'user': ['user_id', 'age', 'gender', 'occupation'], - 'item': ['item_id', 'release_year']}, - }) - + self._update_internal_config_dict(DIN_on_ml_100k_init) + elif model in ['GRU4RecKG', 'KSR']: + self._update_internal_config_dict(sequential_embedding_model_init) else: - self.internal_config_dict.update({ - 'eval_setting': 'TO_LS,full', - }) + self._update_internal_config_dict(sequential_init) if dataset == 'ml-100k' and model in ['GRU4RecF', 'SASRecF', 'FDSA', 'S3Rec']: - self.internal_config_dict.update({ - 'load_col': {'inter': ['user_id', 'item_id', 'rating', 'timestamp'], - 'item': ['item_id', 'release_year', 'class']}, - }) + self._update_internal_config_dict(special_sequential_on_ml_100k_init) elif self.internal_config_dict['MODEL_TYPE'] == ModelType.KNOWLEDGE: - self.internal_config_dict.update({ - 'load_col': {'inter': ['user_id', 'item_id', 'rating', 'timestamp'], - 'kg': ['head_id', 'relation_id', 'tail_id'], - 'link': ['item_id', 'entity_id']} - }) + self._update_internal_config_dict(knowledge_base_init) def _get_final_config_dict(self): final_config_dict = dict() @@ -285,17 +278,16 @@ def _set_default_parameters(self): elif self.final_config_dict['loss_type'] in ['BPR']: self.final_config_dict['MODEL_INPUT_TYPE'] = InputType.PAIRWISE else: - raise 
ValueError('Either Model has attr \'input_type\',' - 'or arg \'loss_type\' should exist in config.') + raise ValueError('Either Model has attr \'input_type\',' 'or arg \'loss_type\' should exist in config.') eval_type = None for metric in self.final_config_dict['metrics']: - if metric.lower() in loss_metrics: + if metric.lower() in individual_metrics: if eval_type is not None and eval_type == EvaluatorType.RANKING: raise RuntimeError('Ranking metrics and other metrics can not be used at the same time.') else: eval_type = EvaluatorType.INDIVIDUAL - if metric.lower() in topk_metrics: + if metric.lower() in group_metrics: if eval_type is not None and eval_type == EvaluatorType.INDIVIDUAL: raise RuntimeError('Ranking metrics and other metrics can not be used at the same time.') else: @@ -337,10 +329,10 @@ def __str__(self): args_info = '' for category in self.parameters: args_info += category + ' Hyper Parameters: \n' - args_info += '\n'.join( - ["{}={}".format(arg, value) - for arg, value in self.final_config_dict.items() - if arg in self.parameters[category]]) + args_info += '\n'.join([ + "{}={}".format(arg, value) for arg, value in self.final_config_dict.items() + if arg in self.parameters[category] + ]) args_info += '\n\n' return args_info diff --git a/recbole/config/eval_setting.py b/recbole/config/eval_setting.py index 27d910c5a..c377eb51a 100644 --- a/recbole/config/eval_setting.py +++ b/recbole/config/eval_setting.py @@ -7,7 +7,6 @@ # @Author : Yupeng Hou, Yushuo Chen # @Email : houyupeng@ruc.edu.cn, chenyushuo@ruc.edu.cn - """ recbole.config.eval_setting ################################ @@ -57,7 +56,7 @@ class EvalSetting(object): Usually records are sorted by timestamp, or shuffled. split_args (dict): Args about splitting. - usually records are splitted by ratio (eg. 8:1:1), + usually records are split by ratio (eg. 8:1:1), or by 'leave one out' strategy, which means the last purchase record of one user is used for evaluation. 
@@ -163,7 +162,7 @@ def random_ordering(self): """ self.set_ordering('shuffle') - def sort_by(self, field, ascending=None): + def sort_by(self, field, ascending=True): """Setting about Sorting. Similar with pandas' sort_values_ @@ -173,14 +172,8 @@ def sort_by(self, field, ascending=None): Args: field (str or list of str): Name or list of names ascending (bool or list of bool): Sort ascending vs. descending. Specify list for multiple sort orders. - If this is a list of bools, must match the length of the field + If this is a list of bool, must match the length of the field """ - if not isinstance(field, list): - field = [field] - if ascending is None: - ascending = [True] * len(field) - if len(ascending) == 1: - ascending = True self.set_ordering('by', field=field, ascending=ascending) def temporal_ordering(self): @@ -278,6 +271,40 @@ def neg_sample_by(self, by, distribution='uniform'): """ self.set_neg_sampling(strategy='by', by=by, distribution=distribution) + def set_ordering_and_splitting(self, es_str): + """Setting about ordering and split method. + + Args: + es_str (str): Ordering and splitting method string. Either ``RO_RS``, ``RO_LS``, ``TO_RS`` or ``TO_LS``. 
+ """ + args = es_str.split('_') + if len(args) != 2: + raise ValueError(f'`{es_str}` is invalid eval_setting.') + ordering_args, split_args = args + + if self.config['group_by_user']: + self.group_by_user() + + if ordering_args == 'RO': + self.random_ordering() + elif ordering_args == 'TO': + self.temporal_ordering() + else: + raise NotImplementedError(f'Ordering args `{ordering_args}` is not implemented.') + + if split_args == 'RS': + ratios = self.config['split_ratio'] + if ratios is None: + raise ValueError('`ratios` should be set if `RS` is set.') + self.split_by_ratio(ratios) + elif split_args == 'LS': + leave_one_num = self.config['leave_one_num'] + if leave_one_num is None: + raise ValueError('`leave_one_num` should be set if `LS` is set.') + self.leave_one_out(leave_one_num=leave_one_num) + else: + raise NotImplementedError(f'Split args `{split_args}` is not implemented.') + def RO_RS(self, ratios=(0.8, 0.1, 0.1), group_by_user=True): """Preset about Random Ordering and Ratio-based Splitting. 
diff --git a/recbole/data/__init__.py b/recbole/data/__init__.py index 76b29b2e0..4b790aba7 100644 --- a/recbole/data/__init__.py +++ b/recbole/data/__init__.py @@ -1,4 +1,3 @@ from recbole.data.utils import * - __all__ = ['create_dataset', 'data_preparation'] diff --git a/recbole/data/dataloader/__init__.py b/recbole/data/dataloader/__init__.py index 18e6f0674..90ffa311f 100644 --- a/recbole/data/dataloader/__init__.py +++ b/recbole/data/dataloader/__init__.py @@ -4,3 +4,5 @@ from recbole.data.dataloader.context_dataloader import * from recbole.data.dataloader.sequential_dataloader import * from recbole.data.dataloader.knowledge_dataloader import * +from recbole.data.dataloader.xgboost_dataloader import * +from recbole.data.dataloader.user_dataloader import * diff --git a/recbole/data/dataloader/abstract_dataloader.py b/recbole/data/dataloader/abstract_dataloader.py index 6b21009cb..73e642472 100644 --- a/recbole/data/dataloader/abstract_dataloader.py +++ b/recbole/data/dataloader/abstract_dataloader.py @@ -42,8 +42,7 @@ class AbstractDataLoader(object): """ dl_type = None - def __init__(self, config, dataset, - batch_size=1, dl_format=InputType.POINTWISE, shuffle=False): + def __init__(self, config, dataset, batch_size=1, dl_format=InputType.POINTWISE, shuffle=False): self.config = config self.logger = getLogger() self.dataset = dataset @@ -56,11 +55,6 @@ def __init__(self, config, dataset, if self.real_time is None: self.real_time = True - self.join = self.dataset.join - self.history_item_matrix = self.dataset.history_item_matrix - self.history_user_matrix = self.dataset.history_user_matrix - self.inter_matrix = self.dataset.inter_matrix - for dataset_attr in self.dataset._dataloader_apis: try: flag = hasattr(self.dataset, dataset_attr) @@ -80,7 +74,7 @@ def setup(self): pass def data_preprocess(self): - """This function is used to do some data preprocess, such as pre-neg-sampling and pre-data-augmentation. 
+ """This function is used to do some data preprocess, such as pre-data-augmentation. By default, it will do nothing. """ pass @@ -127,24 +121,13 @@ def set_batch_size(self, batch_size): raise PermissionError('Cannot change dataloader\'s batch_size while iteration') if self.batch_size != batch_size: self.batch_size = batch_size - self.logger.warning('Batch size is changed to {}'.format(batch_size)) - - def get_user_feature(self): - """It is similar to :meth:`~recbole.data.dataset.dataset.Dataset.get_user_feature`, but it will return an - :class:`~recbole.data.interaction.Interaction` of user feature instead of a :class:`pandas.DataFrame`. - - Returns: - Interaction: The interaction of user feature. - """ - user_df = self.dataset.get_user_feature() - return self._dataframe_to_interaction(user_df) + self.logger.warning(f'Batch size is changed to {batch_size}.') - def get_item_feature(self): - """It is similar to :meth:`~recbole.data.dataset.dataset.Dataset.get_item_feature`, but it will return an - :class:`~recbole.data.interaction.Interaction` of item feature instead of a :class:`pandas.DataFrame`. + def upgrade_batch_size(self, batch_size): + """Upgrade the batch_size of the dataloader, if input batch_size is bigger than current batch_size. - Returns: - Interaction: The interaction of item feature. + Args: + batch_size (int): the new batch_size of dataloader. 
""" - item_df = self.dataset.get_item_feature() - return self._dataframe_to_interaction(item_df) + if self.batch_size < batch_size: + self.set_batch_size(batch_size) diff --git a/recbole/data/dataloader/context_dataloader.py b/recbole/data/dataloader/context_dataloader.py index a42ff8be4..9ca4fc4df 100644 --- a/recbole/data/dataloader/context_dataloader.py +++ b/recbole/data/dataloader/context_dataloader.py @@ -12,11 +12,13 @@ ################################################ """ -from recbole.data.dataloader.general_dataloader import GeneralDataLoader, GeneralNegSampleDataLoader +from recbole.data.dataloader.general_dataloader import GeneralDataLoader, GeneralNegSampleDataLoader, \ + GeneralFullDataLoader class ContextDataLoader(GeneralDataLoader): - """:class:`ContextDataLoader` is inherit from :class:`~recbole.data.dataloader.general_dataloader.GeneralDataLoader`, + """:class:`ContextDataLoader` is inherit from + :class:`~recbole.data.dataloader.general_dataloader.GeneralDataLoader`, and didn't add/change anything at all. """ pass @@ -28,3 +30,11 @@ class ContextNegSampleDataLoader(GeneralNegSampleDataLoader): and didn't add/change anything at all. """ pass + + +class ContextFullDataLoader(GeneralFullDataLoader): + """:class:`ContextFullDataLoader` is inherit from + :class:`~recbole.data.dataloader.general_dataloader.GeneralFullDataLoader`, + and didn't add/change anything at all. 
+ """ + pass diff --git a/recbole/data/dataloader/general_dataloader.py b/recbole/data/dataloader/general_dataloader.py index 58627fa3e..4e91dd945 100644 --- a/recbole/data/dataloader/general_dataloader.py +++ b/recbole/data/dataloader/general_dataloader.py @@ -13,12 +13,11 @@ """ import numpy as np -import pandas as pd import torch -from tqdm import tqdm from recbole.data.dataloader.abstract_dataloader import AbstractDataLoader from recbole.data.dataloader.neg_sample_mixin import NegSampleMixin, NegSampleByMixin +from recbole.data.interaction import Interaction, cat_interactions from recbole.utils import DataLoaderType, InputType @@ -35,10 +34,8 @@ class GeneralDataLoader(AbstractDataLoader): """ dl_type = DataLoaderType.ORIGIN - def __init__(self, config, dataset, - batch_size=1, dl_format=InputType.POINTWISE, shuffle=False): - super().__init__(config, dataset, - batch_size=batch_size, dl_format=dl_format, shuffle=shuffle) + def __init__(self, config, dataset, batch_size=1, dl_format=InputType.POINTWISE, shuffle=False): + super().__init__(config, dataset, batch_size=batch_size, dl_format=dl_format, shuffle=shuffle) @property def pr_end(self): @@ -48,9 +45,9 @@ def _shuffle(self): self.dataset.shuffle() def _next_batch_data(self): - cur_data = self.dataset[self.pr: self.pr + self.step] + cur_data = self.dataset[self.pr:self.pr + self.step] self.pr += self.step - return self._dataframe_to_interaction(cur_data) + return cur_data class GeneralNegSampleDataLoader(NegSampleByMixin, AbstractDataLoader): @@ -70,33 +67,38 @@ class GeneralNegSampleDataLoader(NegSampleByMixin, AbstractDataLoader): :obj:`~recbole.utils.enum_type.InputType.POINTWISE`. shuffle (bool, optional): Whether the dataloader will be shuffle after a round. Defaults to ``False``. 
""" - def __init__(self, config, dataset, sampler, neg_sample_args, - batch_size=1, dl_format=InputType.POINTWISE, shuffle=False): - self.uid2index, self.uid2items_num = None, None - super().__init__(config, dataset, sampler, neg_sample_args, - batch_size=batch_size, dl_format=dl_format, shuffle=shuffle) + def __init__( + self, config, dataset, sampler, neg_sample_args, batch_size=1, dl_format=InputType.POINTWISE, shuffle=False + ): + self.uid_field = dataset.uid_field + self.iid_field = dataset.iid_field + self.uid_list, self.uid2index, self.uid2items_num = None, None, None + + super().__init__( + config, dataset, sampler, neg_sample_args, batch_size=batch_size, dl_format=dl_format, shuffle=shuffle + ) def setup(self): if self.user_inter_in_one_batch: - self.uid2index, self.uid2items_num = self.dataset.uid2index + uid_field = self.dataset.uid_field + user_num = self.dataset.user_num + self.dataset.sort(by=uid_field, ascending=True) + self.uid_list = [] + start, end = dict(), dict() + for i, uid in enumerate(self.dataset.inter_feat[uid_field].numpy()): + if uid not in start: + self.uid_list.append(uid) + start[uid] = i + end[uid] = i + self.uid2index = np.array([None] * user_num) + self.uid2items_num = np.zeros(user_num, dtype=np.int64) + for uid in self.uid_list: + self.uid2index[uid] = slice(start[uid], end[uid] + 1) + self.uid2items_num[uid] = end[uid] - start[uid] + 1 + self.uid_list = np.array(self.uid_list) self._batch_size_adaptation() - def data_preprocess(self): - if self.user_inter_in_one_batch: - new_inter_num = 0 - new_inter_feat = [] - new_uid2index = [] - for uid, index in self.uid2index: - new_inter_feat.append(self._neg_sampling(self.dataset.inter_feat[index])) - new_num = len(new_inter_feat[-1]) - new_uid2index.append((uid, slice(new_inter_num, new_inter_num + new_num))) - new_inter_num += new_num - self.dataset.inter_feat = pd.concat(new_inter_feat, ignore_index=True) - self.uid2index = np.array(new_uid2index) - else: - self.dataset.inter_feat = 
self._neg_sampling(self.dataset.inter_feat) - def _batch_size_adaptation(self): if self.user_inter_in_one_batch: inters_num = sorted(self.uid2items_num * self.times, reverse=True) @@ -105,85 +107,83 @@ def _batch_size_adaptation(self): for i in range(1, len(inters_num)): if new_batch_size + inters_num[i] > self.batch_size: break - batch_num = i + batch_num = i + 1 new_batch_size += inters_num[i] self.step = batch_num - self.set_batch_size(new_batch_size) + self.upgrade_batch_size(new_batch_size) else: batch_num = max(self.batch_size // self.times, 1) new_batch_size = batch_num * self.times - self.step = batch_num if self.real_time else new_batch_size - self.set_batch_size(new_batch_size) + self.step = batch_num + self.upgrade_batch_size(new_batch_size) @property def pr_end(self): if self.user_inter_in_one_batch: - return len(self.uid2index) + return len(self.uid_list) else: return len(self.dataset) def _shuffle(self): if self.user_inter_in_one_batch: - new_index = np.random.permutation(len(self.uid2index)) - self.uid2index = self.uid2index[new_index] - self.uid2items_num = self.uid2items_num[new_index] + np.random.shuffle(self.uid_list) else: self.dataset.shuffle() def _next_batch_data(self): if self.user_inter_in_one_batch: - sampling_func = self._neg_sampling if self.real_time else (lambda x: x) - cur_data = [] - for uid, index in self.uid2index[self.pr: self.pr + self.step]: - cur_data.append(sampling_func(self.dataset[index])) - cur_data = pd.concat(cur_data, ignore_index=True) - pos_len_list = self.uid2items_num[self.pr: self.pr + self.step] + uid_list = self.uid_list[self.pr:self.pr + self.step] + data_list = [] + for uid in uid_list: + index = self.uid2index[uid] + data_list.append(self._neg_sampling(self.dataset[index])) + cur_data = cat_interactions(data_list) + pos_len_list = self.uid2items_num[uid_list] user_len_list = pos_len_list * self.times + cur_data.set_additional_info(list(pos_len_list), list(user_len_list)) self.pr += self.step - return 
self._dataframe_to_interaction(cur_data, list(pos_len_list), list(user_len_list)) + return cur_data else: - cur_data = self.dataset[self.pr: self.pr + self.step] + cur_data = self._neg_sampling(self.dataset[self.pr:self.pr + self.step]) self.pr += self.step - if self.real_time: - cur_data = self._neg_sampling(cur_data) - return self._dataframe_to_interaction(cur_data) + return cur_data def _neg_sampling(self, inter_feat): - uid_field = self.config['USER_ID_FIELD'] - iid_field = self.config['ITEM_ID_FIELD'] - uids = inter_feat[uid_field].to_list() + uids = inter_feat[self.uid_field] neg_iids = self.sampler.sample_by_user_ids(uids, self.neg_sample_by) - return self.sampling_func(uid_field, iid_field, neg_iids, inter_feat) - - def _neg_sample_by_pair_wise_sampling(self, uid_field, iid_field, neg_iids, inter_feat): - inter_feat.insert(len(inter_feat.columns), self.neg_item_id, neg_iids) - - if self.dataset.item_feat is not None: - neg_prefix = self.config['NEG_PREFIX'] - neg_item_feat = self.dataset.item_feat.add_prefix(neg_prefix) - inter_feat = pd.merge(inter_feat, neg_item_feat, - on=self.neg_item_id, how='left', suffixes=('_inter', '_item')) - + return self.sampling_func(inter_feat, neg_iids) + + def _neg_sample_by_pair_wise_sampling(self, inter_feat, neg_iids): + inter_feat = inter_feat.repeat(self.times) + neg_item_feat = Interaction({self.iid_field: neg_iids}) + neg_item_feat = self.dataset.join(neg_item_feat) + neg_item_feat.add_prefix(self.neg_prefix) + inter_feat.update(neg_item_feat) return inter_feat - def _neg_sample_by_point_wise_sampling(self, uid_field, iid_field, neg_iids, inter_feat): + def _neg_sample_by_point_wise_sampling(self, inter_feat, neg_iids): pos_inter_num = len(inter_feat) - - new_df = pd.concat([inter_feat] * self.times, ignore_index=True) - new_df[iid_field].values[pos_inter_num:] = neg_iids - - labels = np.zeros(pos_inter_num * self.times, dtype=np.int64) - labels[: pos_inter_num] = 1 - new_df[self.label_field] = labels - - return 
new_df + new_data = inter_feat.repeat(self.times) + new_data[self.iid_field][pos_inter_num:] = neg_iids + new_data = self.dataset.join(new_data) + labels = torch.zeros(pos_inter_num * self.times) + labels[:pos_inter_num] = 1.0 + new_data.update(Interaction({self.label_field: labels})) + return new_data def get_pos_len_list(self): """ Returns: - np.ndarray or list: Number of positive item for each user in a training/evaluating epoch. + numpy.ndarray: Number of positive item for each user in a training/evaluating epoch. + """ + return self.uid2items_num[self.uid_list] + + def get_user_len_list(self): + """ + Returns: + numpy.ndarray: Number of all item for each user in a training/evaluating epoch. """ - return self.uid2items_num + return self.uid2items_num[self.uid_list] * self.times class GeneralFullDataLoader(NegSampleMixin, AbstractDataLoader): @@ -203,92 +203,97 @@ class GeneralFullDataLoader(NegSampleMixin, AbstractDataLoader): """ dl_type = DataLoaderType.FULL - def __init__(self, config, dataset, sampler, neg_sample_args, - batch_size=1, dl_format=InputType.POINTWISE, shuffle=False): + def __init__( + self, config, dataset, sampler, neg_sample_args, batch_size=1, dl_format=InputType.POINTWISE, shuffle=False + ): if neg_sample_args['strategy'] != 'full': raise ValueError('neg_sample strategy in GeneralFullDataLoader() should be `full`') - self.uid2index, self.uid2items_num = dataset.uid2index - - super().__init__(config, dataset, sampler, neg_sample_args, - batch_size=batch_size, dl_format=dl_format, shuffle=shuffle) - - def data_preprocess(self): - self.user_tensor, tmp_pos_idx, tmp_used_idx, self.pos_len_list, self.neg_len_list = \ - self._neg_sampling(self.uid2index, show_progress=True) - tmp_pos_len_list = [sum(self.pos_len_list[_: _ + self.step]) for _ in range(0, self.pr_end, self.step)] - tot_item_num = self.dataset.item_num - tmp_used_len_list = [sum( - [tot_item_num - x for x in self.neg_len_list[_: _ + self.step]] - ) for _ in range(0, self.pr_end, 
self.step)] - self.pos_idx = list(torch.split(tmp_pos_idx, tmp_pos_len_list)) - self.used_idx = list(torch.split(tmp_used_idx, tmp_used_len_list)) - for i in range(len(self.pos_idx)): - self.pos_idx[i] -= i * tot_item_num * self.step - for i in range(len(self.used_idx)): - self.used_idx[i] -= i * tot_item_num * self.step + + uid_field = dataset.uid_field + iid_field = dataset.iid_field + user_num = dataset.user_num + self.uid_list = [] + self.uid2items_num = np.zeros(user_num, dtype=np.int64) + self.uid2swap_idx = np.array([None] * user_num) + self.uid2rev_swap_idx = np.array([None] * user_num) + self.uid2history_item = np.array([None] * user_num) + + dataset.sort(by=uid_field, ascending=True) + last_uid = None + positive_item = set() + uid2used_item = sampler.used_ids + for uid, iid in zip(dataset.inter_feat[uid_field].numpy(), dataset.inter_feat[iid_field].numpy()): + if uid != last_uid: + self._set_user_property(last_uid, uid2used_item[last_uid], positive_item) + last_uid = uid + self.uid_list.append(uid) + positive_item = set() + positive_item.add(iid) + self._set_user_property(last_uid, uid2used_item[last_uid], positive_item) + self.uid_list = torch.tensor(self.uid_list) + self.user_df = dataset.join(Interaction({uid_field: self.uid_list})) + + super().__init__( + config, dataset, sampler, neg_sample_args, batch_size=batch_size, dl_format=dl_format, shuffle=shuffle + ) + + def _set_user_property(self, uid, used_item, positive_item): + if uid is None: + return + history_item = used_item - positive_item + positive_item_num = len(positive_item) + self.uid2items_num[uid] = positive_item_num + swap_idx = torch.tensor(sorted(set(range(positive_item_num)) ^ positive_item)) + self.uid2swap_idx[uid] = swap_idx + self.uid2rev_swap_idx[uid] = swap_idx.flip(0) + self.uid2history_item[uid] = torch.tensor(list(history_item), dtype=torch.int64) def _batch_size_adaptation(self): batch_num = max(self.batch_size // self.dataset.item_num, 1) new_batch_size = batch_num * 
self.dataset.item_num
         self.step = batch_num
-        self.set_batch_size(new_batch_size)
+        self.upgrade_batch_size(new_batch_size)
 
     @property
     def pr_end(self):
-        return len(self.uid2index)
+        return len(self.uid_list)
 
     def _shuffle(self):
         self.logger.warning('GeneralFullDataLoader can\'t shuffle')
 
     def _next_batch_data(self):
-        if not self.real_time:
-            slc = slice(self.pr, self.pr + self.step)
-            idx = self.pr // self.step
-            cur_data = self.user_tensor[slc], self.pos_idx[idx], self.used_idx[idx], \
-                       self.pos_len_list[slc], self.neg_len_list[slc]
-        else:
-            cur_data = self._neg_sampling(self.uid2index[self.pr: self.pr + self.step])
+        user_df = self.user_df[self.pr:self.pr + self.step]
+        cur_data = self._neg_sampling(user_df)
         self.pr += self.step
         return cur_data
 
-    def _neg_sampling(self, uid2index, show_progress=False):
-        uid_field = self.dataset.uid_field
-        iid_field = self.dataset.iid_field
-        tot_item_num = self.dataset.item_num
-
-        start_idx = 0
-        pos_len_list = []
-        neg_len_list = []
-
-        pos_idx = []
-        used_idx = []
-
-        iter_data = tqdm(uid2index) if show_progress else uid2index
-        for uid, index in iter_data:
-            pos_item_id = self.dataset.inter_feat[iid_field][index].values
-            pos_idx.extend([_ + start_idx for _ in pos_item_id])
-            pos_num = len(pos_item_id)
-            pos_len_list.append(pos_num)
+    def _neg_sampling(self, user_df):
+        uid_list = list(user_df[self.dataset.uid_field])
+        pos_len_list = self.uid2items_num[uid_list]
+        user_len_list = np.full(len(uid_list), self.item_num)
+        user_df.set_additional_info(pos_len_list, user_len_list)
 
-            used_item_id = self.sampler.used_ids[uid]
-            used_idx.extend([_ + start_idx for _ in used_item_id])
-            used_num = len(used_item_id)
+        history_item = self.uid2history_item[uid_list]
+        history_row = torch.cat([torch.full_like(hist_iid, i) for i, hist_iid in enumerate(history_item)])
+        history_col = torch.cat(list(history_item))
 
-            neg_num = tot_item_num - used_num
-            neg_len_list.append(neg_num)
-
-            start_idx += tot_item_num
-
-        user_df = 
pd.DataFrame({uid_field: np.array(uid2index[:, 0], dtype=np.int)}) - user_interaction = self._dataframe_to_interaction(self.join(user_df)) - - return user_interaction, \ - torch.LongTensor(pos_idx), torch.LongTensor(used_idx), \ - pos_len_list, neg_len_list + swap_idx = self.uid2swap_idx[uid_list] + rev_swap_idx = self.uid2rev_swap_idx[uid_list] + swap_row = torch.cat([torch.full_like(swap, i) for i, swap in enumerate(swap_idx)]) + swap_col_after = torch.cat(list(swap_idx)) + swap_col_before = torch.cat(list(rev_swap_idx)) + return user_df, (history_row, history_col), swap_row, swap_col_after, swap_col_before def get_pos_len_list(self): """ Returns: - np.ndarray or list: Number of positive item for each user in a training/evaluating epoch. + numpy.ndarray: Number of positive item for each user in a training/evaluating epoch. + """ + return self.uid2items_num[self.uid_list] + + def get_user_len_list(self): + """ + Returns: + numpy.ndarray: Number of all item for each user in a training/evaluating epoch. """ - return self.uid2items_num + return np.full(self.pr_end, self.item_num) diff --git a/recbole/data/dataloader/knowledge_dataloader.py b/recbole/data/dataloader/knowledge_dataloader.py index f4718ad20..6b6bb00ac 100644 --- a/recbole/data/dataloader/knowledge_dataloader.py +++ b/recbole/data/dataloader/knowledge_dataloader.py @@ -13,6 +13,7 @@ """ from recbole.data.dataloader import AbstractDataLoader, GeneralNegSampleDataLoader +from recbole.data.interaction import Interaction from recbole.utils import InputType, KGDataLoaderState @@ -34,8 +35,7 @@ class KGDataLoader(AbstractDataLoader): However, in :class:`KGDataLoader`, it's guaranteed to be ``True``. 
""" - def __init__(self, config, dataset, sampler, - batch_size=1, dl_format=InputType.PAIRWISE, shuffle=False): + def __init__(self, config, dataset, sampler, batch_size=1, dl_format=InputType.PAIRWISE, shuffle=False): self.sampler = sampler self.neg_sample_num = 1 @@ -47,8 +47,7 @@ def __init__(self, config, dataset, sampler, self.neg_tid_field = self.neg_prefix + self.tid_field dataset.copy_field_property(self.neg_tid_field, self.tid_field) - super().__init__(config, dataset, - batch_size=batch_size, dl_format=dl_format, shuffle=shuffle) + super().__init__(config, dataset, batch_size=batch_size, dl_format=dl_format, shuffle=shuffle) def setup(self): """Make sure that the :attr:`shuffle` is True. If :attr:`shuffle` is False, it will be changed to True @@ -63,24 +62,17 @@ def pr_end(self): return len(self.dataset.kg_feat) def _shuffle(self): - self.dataset.kg_feat = self.dataset.kg_feat.sample(frac=1).reset_index(drop=True) + self.dataset.kg_feat.shuffle() def _next_batch_data(self): - cur_data = self.dataset.kg_feat[self.pr: self.pr + self.step] + cur_data = self._neg_sampling(self.dataset.kg_feat[self.pr:self.pr + self.step]) self.pr += self.step - if self.real_time: - cur_data = self._neg_sampling(cur_data) - return self._dataframe_to_interaction(cur_data) - - def data_preprocess(self): - """Do neg-sampling before training/evaluation. - """ - self.dataset.kg_feat = self._neg_sampling(self.dataset.kg_feat) + return cur_data def _neg_sampling(self, kg_feat): - hids = kg_feat[self.hid_field].to_list() + hids = kg_feat[self.hid_field] neg_tids = self.sampler.sample_by_entity_ids(hids, self.neg_sample_num) - kg_feat.insert(len(kg_feat.columns), self.neg_tid_field, neg_tids) + kg_feat.update(Interaction({self.neg_tid_field: neg_tids})) return kg_feat @@ -118,76 +110,84 @@ class KnowledgeBasedDataLoader(AbstractDataLoader): and user-item interaction information. 
""" - def __init__(self, config, dataset, sampler, kg_sampler, neg_sample_args, - batch_size=1, dl_format=InputType.POINTWISE, shuffle=False): + def __init__( + self, + config, + dataset, + sampler, + kg_sampler, + neg_sample_args, + batch_size=1, + dl_format=InputType.POINTWISE, + shuffle=False + ): # using sampler - self.general_dataloader = GeneralNegSampleDataLoader(config=config, dataset=dataset, - sampler=sampler, neg_sample_args=neg_sample_args, - batch_size=batch_size, dl_format=dl_format, - shuffle=shuffle) + self.general_dataloader = GeneralNegSampleDataLoader( + config=config, + dataset=dataset, + sampler=sampler, + neg_sample_args=neg_sample_args, + batch_size=batch_size, + dl_format=dl_format, + shuffle=shuffle + ) # using kg_sampler and dl_format is pairwise - self.kg_dataloader = KGDataLoader(config, dataset, kg_sampler, - batch_size=batch_size, dl_format=InputType.PAIRWISE, shuffle=shuffle) - - self.main_dataloader = self.general_dataloader + self.kg_dataloader = KGDataLoader( + config, dataset, kg_sampler, batch_size=batch_size, dl_format=InputType.PAIRWISE, shuffle=True + ) - super().__init__(config, dataset, - batch_size=batch_size, dl_format=dl_format, shuffle=shuffle) + self.state = None - @property - def pr(self): - """Pointer of :class:`KnowledgeBasedDataLoader`. It would be affect by self.state. 
- """ - return self.main_dataloader.pr - - @pr.setter - def pr(self, value): - self.main_dataloader.pr = value + super().__init__(config, dataset, batch_size=batch_size, dl_format=dl_format, shuffle=shuffle) def __iter__(self): - if not hasattr(self, 'state') or not hasattr(self, 'main_dataloader'): - raise ValueError('The dataloader\'s state and main_dataloader must be set ' - 'when using the kg based dataloader') - return super().__iter__() + if self.state is None: + raise ValueError( + 'The dataloader\'s state must be set when using the kg based dataloader, ' + 'you should call set_mode() before __iter__()' + ) + if self.state == KGDataLoaderState.KG: + return self.kg_dataloader.__iter__() + elif self.state == KGDataLoaderState.RS: + return self.general_dataloader.__iter__() + elif self.state == KGDataLoaderState.RSKG: + self.kg_dataloader.__iter__() + self.general_dataloader.__iter__() + return self def _shuffle(self): - if self.state == KGDataLoaderState.RSKG: - self.general_dataloader._shuffle() - self.kg_dataloader._shuffle() - else: - self.main_dataloader._shuffle() + pass def __next__(self): - if self.pr >= self.pr_end: - if self.state == KGDataLoaderState.RSKG: - self.general_dataloader.pr = 0 - self.kg_dataloader.pr = 0 - else: - self.pr = 0 + if self.general_dataloader.pr >= self.general_dataloader.pr_end: + self.general_dataloader.pr = 0 + self.kg_dataloader.pr = 0 raise StopIteration() return self._next_batch_data() def __len__(self): - return len(self.main_dataloader) + if self.state == KGDataLoaderState.KG: + return len(self.kg_dataloader) + else: + return len(self.general_dataloader) @property def pr_end(self): - return self.main_dataloader.pr_end + if self.state == KGDataLoaderState.KG: + return self.kg_dataloader.pr_end + else: + return self.general_dataloader.pr_end def _next_batch_data(self): - if self.state == KGDataLoaderState.KG: - return self.kg_dataloader._next_batch_data() - elif self.state == KGDataLoaderState.RS: - return 
self.general_dataloader._next_batch_data()
-        elif self.state == KGDataLoaderState.RSKG:
-            if self.kg_dataloader.pr >= self.kg_dataloader.pr_end:
-                self.kg_dataloader.pr = 0
-            kg_data = self.kg_dataloader._next_batch_data()
-            rec_data = self.general_dataloader._next_batch_data()
-            rec_data.update(kg_data)
-            return rec_data
+        try:
+            kg_data = self.kg_dataloader.__next__()
+        except StopIteration:
+            kg_data = self.kg_dataloader.__next__()
+        rec_data = self.general_dataloader.__next__()
+        rec_data.update(kg_data)
+        return rec_data
 
     def set_mode(self, state):
         """Set the mode of :class:`KnowledgeBasedDataLoader`, it can be set to three states:
@@ -202,13 +202,5 @@ def set_mode(self, state):
             state (KGDataLoaderState): the state of :class:`KnowledgeBasedDataLoader`.
         """
         if state not in set(KGDataLoaderState):
-            raise NotImplementedError('kg data loader has no state named [{}]'.format(self.state))
+            raise NotImplementedError(f'Kg data loader has no state named [{state}].')
         self.state = state
-        if self.state == KGDataLoaderState.RS:
-            self.main_dataloader = self.general_dataloader
-        elif self.state == KGDataLoaderState.KG:
-            self.main_dataloader = self.kg_dataloader
-        else:  # RSKG
-            kgpr = self.kg_dataloader.pr_end
-            rspr = self.general_dataloader.pr_end
-            self.main_dataloader = self.general_dataloader if rspr < kgpr else self.kg_dataloader
diff --git a/recbole/data/dataloader/neg_sample_mixin.py b/recbole/data/dataloader/neg_sample_mixin.py
index 2187227de..e21d614ac 100644
--- a/recbole/data/dataloader/neg_sample_mixin.py
+++ b/recbole/data/dataloader/neg_sample_mixin.py
@@ -33,27 +33,22 @@ class NegSampleMixin(AbstractDataLoader):
     """
     dl_type = DataLoaderType.NEGSAMPLE
 
-    def __init__(self, config, dataset, sampler, neg_sample_args,
-                 batch_size=1, dl_format=InputType.POINTWISE, shuffle=False):
+    def __init__(
+        self, config, dataset, sampler, neg_sample_args, batch_size=1, dl_format=InputType.POINTWISE, shuffle=False
+    ):
         if neg_sample_args['strategy'] not in ['by', 
'full']: - raise ValueError('neg_sample strategy [{}] has not been implemented'.format(neg_sample_args['strategy'])) + raise ValueError(f"Neg_sample strategy [{neg_sample_args['strategy']}] has not been implemented.") self.sampler = sampler self.neg_sample_args = neg_sample_args - super().__init__(config, dataset, - batch_size=batch_size, dl_format=dl_format, shuffle=shuffle) + super().__init__(config, dataset, batch_size=batch_size, dl_format=dl_format, shuffle=shuffle) def setup(self): """Do batch size adaptation. """ self._batch_size_adaptation() - def data_preprocess(self): - """Do neg-sampling before training/evaluation. - """ - raise NotImplementedError('Method [data_preprocess] should be implemented.') - def _batch_size_adaptation(self): """Adjust the batch size to ensure that each positive and negative interaction can be in a batch. """ @@ -72,10 +67,17 @@ def _neg_sampling(self, inter_feat): def get_pos_len_list(self): """ Returns: - np.ndarray or list: Number of positive item for each user in a training/evaluating epoch. + numpy.ndarray: Number of positive item for each user in a training/evaluating epoch. """ raise NotImplementedError('Method [get_pos_len_list] should be implemented.') + def get_user_len_list(self): + """ + Returns: + numpy.ndarray: Number of all item for each user in a training/evaluating epoch. + """ + raise NotImplementedError('Method [get_user_len_list] should be implemented.') + class NegSampleByMixin(NegSampleMixin): """:class:`NegSampleByMixin` is an abstract class which can sample negative examples by ratio. @@ -92,12 +94,12 @@ class NegSampleByMixin(NegSampleMixin): :obj:`~recbole.utils.enum_type.InputType.POINTWISE`. shuffle (bool, optional): Whether the dataloader will be shuffle after a round. Defaults to ``False``. 
"""

-    def __init__(self, config, dataset, sampler, neg_sample_args,
-                 batch_size=1, dl_format=InputType.POINTWISE, shuffle=False):
+
+    def __init__(
+        self, config, dataset, sampler, neg_sample_args, batch_size=1, dl_format=InputType.POINTWISE, shuffle=False
+    ):
         if neg_sample_args['strategy'] != 'by':
             raise ValueError('neg_sample strategy in GeneralInteractionBasedDataLoader() should be `by`')
 
-        if dl_format == InputType.PAIRWISE and neg_sample_args['by'] != 1:
-            raise ValueError('Pairwise dataloader can only neg sample by 1')
-
         self.user_inter_in_one_batch = (sampler.phase != 'train') and (config['eval_type'] != EvaluatorType.INDIVIDUAL)
         self.neg_sample_by = neg_sample_args['by']
 
@@ -109,22 +111,23 @@ def __init__(self, config, dataset, sampler, neg_sample_args,
             self.label_field = config['LABEL_FIELD']
             dataset.set_field_property(self.label_field, FeatureType.FLOAT, FeatureSource.INTERACTION, 1)
         elif dl_format == InputType.PAIRWISE:
-            self.times = 1
+            self.times = self.neg_sample_by
             self.sampling_func = self._neg_sample_by_pair_wise_sampling
 
-            neg_prefix = config['NEG_PREFIX']
+            self.neg_prefix = config['NEG_PREFIX']
             iid_field = config['ITEM_ID_FIELD']
-            self.neg_item_id = neg_prefix + iid_field
+            self.neg_item_id = self.neg_prefix + iid_field
             columns = [iid_field] if dataset.item_feat is None else dataset.item_feat.columns
             for item_feat_col in columns:
-                neg_item_feat_col = neg_prefix + item_feat_col
+                neg_item_feat_col = self.neg_prefix + item_feat_col
                 dataset.copy_field_property(neg_item_feat_col, item_feat_col)
         else:
-            raise ValueError('`neg sampling by` with dl_format [{}] not been implemented'.format(dl_format))
+            raise ValueError(f'`neg sampling by` with dl_format [{dl_format}] has not been implemented.')
 
-        super().__init__(config, dataset, sampler, neg_sample_args,
-                         batch_size=batch_size, dl_format=dl_format, shuffle=shuffle)
+        super().__init__(
+            config, dataset, sampler, neg_sample_args, batch_size=batch_size, dl_format=dl_format, shuffle=shuffle
+        )
 
     def 
_neg_sample_by_pair_wise_sampling(self, *args): """Pair-wise sampling. diff --git a/recbole/data/dataloader/sequential_dataloader.py b/recbole/data/dataloader/sequential_dataloader.py index 0013b1132..51d3a845d 100644 --- a/recbole/data/dataloader/sequential_dataloader.py +++ b/recbole/data/dataloader/sequential_dataloader.py @@ -16,7 +16,8 @@ import torch from recbole.data.dataloader.abstract_dataloader import AbstractDataLoader -from recbole.data.dataloader.neg_sample_mixin import NegSampleByMixin +from recbole.data.dataloader.neg_sample_mixin import NegSampleByMixin, NegSampleMixin +from recbole.data.interaction import Interaction, cat_interactions from recbole.utils import DataLoaderType, FeatureSource, FeatureType, InputType @@ -42,136 +43,107 @@ class SequentialDataLoader(AbstractDataLoader): """ dl_type = DataLoaderType.ORIGIN - def __init__(self, config, dataset, - batch_size=1, dl_format=InputType.POINTWISE, shuffle=False): + def __init__(self, config, dataset, batch_size=1, dl_format=InputType.POINTWISE, shuffle=False): self.uid_field = dataset.uid_field self.iid_field = dataset.iid_field self.time_field = dataset.time_field self.max_item_list_len = config['MAX_ITEM_LIST_LENGTH'] list_suffix = config['LIST_SUFFIX'] - self.item_list_field = self.iid_field + list_suffix - self.time_list_field = self.time_field + list_suffix - self.position_field = config['POSITION_FIELD'] - self.target_iid_field = self.iid_field - self.target_time_field = self.time_field - self.item_list_length_field = config['ITEM_LIST_LENGTH_FIELD'] - for field in dataset.inter_feat: - if field not in [self.uid_field, self.iid_field, self.time_field]: + if field != self.uid_field: + list_field = field + list_suffix + setattr(self, f'{field}_list_field', list_field) ftype = dataset.field2type[field] - setattr(self, f'{field}_list_field', field + list_suffix) - if dataset.field2type[field] == FeatureType.TOKEN: - dataset.set_field_property(getattr(self, f'{field}_list_field'), 
FeatureType.TOKEN_SEQ, - FeatureSource.INTERACTION, - self.max_item_list_len) - elif dataset.field2type[field] == FeatureType.FLOAT: - dataset.set_field_property(getattr(self, f'{field}_list_field'), FeatureType.FLOAT_SEQ, - FeatureSource.INTERACTION, - self.max_item_list_len) + + if ftype in [FeatureType.TOKEN, FeatureType.TOKEN_SEQ]: + list_ftype = FeatureType.TOKEN_SEQ + else: + list_ftype = FeatureType.FLOAT_SEQ + + if ftype in [FeatureType.TOKEN_SEQ, FeatureType.FLOAT_SEQ]: + list_len = (self.max_item_list_len, dataset.field2seqlen[field]) else: - raise NotImplementedError('Field with ftype [{}] is not implemented for sequential model'.format(ftype)) - - dataset.set_field_property(self.item_list_field, FeatureType.TOKEN_SEQ, FeatureSource.INTERACTION, - self.max_item_list_len) - dataset.set_field_property(self.time_list_field, FeatureType.FLOAT_SEQ, FeatureSource.INTERACTION, - self.max_item_list_len) - if self.position_field: - dataset.set_field_property(self.position_field, FeatureType.TOKEN_SEQ, FeatureSource.INTERACTION, - self.max_item_list_len) - dataset.set_field_property(self.target_iid_field, FeatureType.TOKEN, FeatureSource.INTERACTION, 1) - dataset.set_field_property(self.target_time_field, FeatureType.FLOAT, FeatureSource.INTERACTION, 1) + list_len = self.max_item_list_len + + dataset.set_field_property(list_field, list_ftype, FeatureSource.INTERACTION, list_len) + + self.item_list_length_field = config['ITEM_LIST_LENGTH_FIELD'] dataset.set_field_property(self.item_list_length_field, FeatureType.TOKEN, FeatureSource.INTERACTION, 1) - self.uid_list, self.item_list_index, self.target_index, self.item_list_length = \ - dataset.prepare_data_augmentation() + self.uid_list = dataset.uid_list + self.item_list_index = dataset.item_list_index + self.target_index = dataset.target_index + self.item_list_length = dataset.item_list_length self.pre_processed_data = None - super().__init__(config, dataset, - batch_size=batch_size, dl_format=dl_format, 
shuffle=shuffle) + super().__init__(config, dataset, batch_size=batch_size, dl_format=dl_format, shuffle=shuffle) def data_preprocess(self): """Do data augmentation before training/evaluation. """ - self.pre_processed_data = self.augmentation(self.uid_list, self.item_list_field, - self.target_index, self.item_list_length) + self.pre_processed_data = self.augmentation(self.item_list_index, self.target_index, self.item_list_length) @property def pr_end(self): return len(self.uid_list) def _shuffle(self): - new_index = np.random.permutation(len(self.item_list_index)) if self.real_time: + new_index = torch.randperm(self.pr_end) self.uid_list = self.uid_list[new_index] self.item_list_index = self.item_list_index[new_index] self.target_index = self.target_index[new_index] self.item_list_length = self.item_list_length[new_index] else: - new_data = {} - for key, value in self.pre_processed_data.items(): - new_data[key] = value[new_index] - self.pre_processed_data = new_data + self.pre_processed_data.shuffle() def _next_batch_data(self): - cur_index = slice(self.pr, self.pr + self.step) + cur_data = self._get_processed_data(slice(self.pr, self.pr + self.step)) + self.pr += self.step + return cur_data + + def _get_processed_data(self, index): if self.real_time: - cur_data = self.augmentation(self.uid_list[cur_index], - self.item_list_index[cur_index], - self.target_index[cur_index], - self.item_list_length[cur_index]) + cur_data = self.augmentation( + self.item_list_index[index], self.target_index[index], self.item_list_length[index] + ) else: - cur_data = {} - for key, value in self.pre_processed_data.items(): - cur_data[key] = value[cur_index] - self.pr += self.step - return self._dict_to_interaction(cur_data) + cur_data = self.pre_processed_data[index] + return cur_data - def augmentation(self, uid_list, item_list_index, target_index, item_list_length): + def augmentation(self, item_list_index, target_index, item_list_length): """Data augmentation. 
        Args:
-            uid_list (np.ndarray): user id list.
-            item_list_index (np.ndarray): the index of history items list in interaction.
-            target_index (np.ndarray): the index of items to be predicted in interaction.
-            item_list_length (np.ndarray): history list length.
+            item_list_index (numpy.ndarray): the index of history items list in interaction.
+            target_index (numpy.ndarray): the index of items to be predicted in interaction.
+            item_list_length (numpy.ndarray): history list length.

         Returns:
             dict: the augmented data.
         """
         new_length = len(item_list_index)
+        new_data = self.dataset.inter_feat[target_index]
         new_dict = {
-            self.uid_field: uid_list,
-            self.item_list_field: np.zeros((new_length, self.max_item_list_len), dtype=np.int64),
-            self.time_list_field: np.zeros((new_length, self.max_item_list_len)),
-            self.target_iid_field: self.dataset.inter_feat[self.iid_field][target_index].values,
-            self.target_time_field: self.dataset.inter_feat[self.time_field][target_index].values,
-            self.item_list_length_field: item_list_length,
+            self.item_list_length_field: torch.tensor(item_list_length),
         }
-        for field in self.dataset.inter_feat:
-            if field not in [self.uid_field, self.iid_field, self.time_field]:
-                new_dict[field] = self.dataset.inter_feat[field][target_index].values
-                """Add extra field feature for interaction"""
-                ftype = self.dataset.field2type[field]
-                if ftype == FeatureType.TOKEN or ftype == FeatureType.FLOAT:
-                    field_value = self.dataset.inter_feat[field]
-                    dtype = np.int64 if ftype == FeatureType.TOKEN else np.float32
-                    new_dict[getattr(self, f'{field}_list_field')] = np.zeros((new_length, self.max_item_list_len),
-                                                                              dtype=dtype)
-                    for i, (index, length) in enumerate(zip(item_list_index, item_list_length)):
-                        new_dict[getattr(self, f'{field}_list_field')][i][:length] = field_value[index]
-                else:
-                    raise NotImplementedError('Field with ftype [{}] is not implemented for sequential model'.format(ftype))
-        if self.position_field:
-            new_dict[self.position_field] = np.tile(np.arange(self.max_item_list_len), (new_length, 1))
-
-        iid_value = self.dataset.inter_feat[self.iid_field].values
-        time_value = self.dataset.inter_feat[self.time_field].values
-        for i, (index, length) in enumerate(zip(item_list_index, item_list_length)):
-            new_dict[self.item_list_field][i][:length] = iid_value[index]
-            new_dict[self.time_list_field][i][:length] = time_value[index]
-        return new_dict
+        for field in self.dataset.inter_feat:
+            if field != self.uid_field:
+                list_field = getattr(self, f'{field}_list_field')
+                list_len = self.dataset.field2seqlen[list_field]
+                shape = (new_length, list_len) if isinstance(list_len, int) else (new_length,) + list_len
+                list_ftype = self.dataset.field2type[list_field]
+                dtype = torch.int64 if list_ftype in [FeatureType.TOKEN, FeatureType.TOKEN_SEQ] else torch.float64
+                new_dict[list_field] = torch.zeros(shape, dtype=dtype)
+
+                value = self.dataset.inter_feat[field]
+                for i, (index, length) in enumerate(zip(item_list_index, item_list_length)):
+                    new_dict[list_field][i][:length] = value[index]
+
+        new_data.update(Interaction(new_dict))
+        return new_data


 class SequentialNegSampleDataLoader(NegSampleByMixin, SequentialDataLoader):
@@ -191,88 +163,77 @@ class SequentialNegSampleDataLoader(NegSampleByMixin, SequentialDataLoader):
             :obj:`~recbole.utils.enum_type.InputType.POINTWISE`.
         shuffle (bool, optional): Whether the dataloader will be shuffle after a round. Defaults to ``False``.
     """

-    def __init__(self, config, dataset, sampler, neg_sample_args,
-                 batch_size=1, dl_format=InputType.POINTWISE, shuffle=False):
-        super().__init__(config, dataset, sampler, neg_sample_args,
-                         batch_size=batch_size, dl_format=dl_format, shuffle=shuffle)
-
-    def data_preprocess(self):
-        """Do data augmentation and neg-sampling before training/evaluation.
-        """
-        self.pre_processed_data = self.augmentation(self.uid_list, self.item_list_field,
-                                                    self.target_index, self.item_list_length)
-        self.pre_processed_data = self._neg_sampling(self.pre_processed_data)
+    def __init__(
+        self, config, dataset, sampler, neg_sample_args, batch_size=1, dl_format=InputType.POINTWISE, shuffle=False
+    ):
+        super().__init__(
+            config, dataset, sampler, neg_sample_args, batch_size=batch_size, dl_format=dl_format, shuffle=shuffle
+        )

     def _batch_size_adaptation(self):
         batch_num = max(self.batch_size // self.times, 1)
         new_batch_size = batch_num * self.times
-        self.step = batch_num if self.real_time else new_batch_size
-        self.set_batch_size(new_batch_size)
+        self.step = batch_num
+        self.upgrade_batch_size(new_batch_size)

     def _next_batch_data(self):
-        cur_index = slice(self.pr, self.pr + self.step)
-        if self.real_time:
-            cur_data = self.augmentation(self.uid_list[cur_index],
-                                         self.item_list_index[cur_index],
-                                         self.target_index[cur_index],
-                                         self.item_list_length[cur_index])
-            cur_data = self._neg_sampling(cur_data)
-        else:
-            cur_data = {}
-            for key, value in self.pre_processed_data.items():
-                cur_data[key] = value[cur_index]
+        cur_data = self._get_processed_data(slice(self.pr, self.pr + self.step))
+        cur_data = self._neg_sampling(cur_data)
         self.pr += self.step

         if self.user_inter_in_one_batch:
             cur_data_len = len(cur_data[self.uid_field])
             pos_len_list = np.ones(cur_data_len // self.times, dtype=np.int64)
             user_len_list = pos_len_list * self.times
-            return self._dict_to_interaction(cur_data, list(pos_len_list), list(user_len_list))
-        else:
-            return self._dict_to_interaction(cur_data)
+            cur_data.set_additional_info(list(pos_len_list), list(user_len_list))
+        return cur_data

     def _neg_sampling(self, data):
         if self.user_inter_in_one_batch:
             data_len = len(data[self.uid_field])
             data_list = []
             for i in range(data_len):
-                uids = data[self.uid_field][i: i + 1]
+                uids = data[self.uid_field][i:i + 1]
                 neg_iids = self.sampler.sample_by_user_ids(uids, self.neg_sample_by)
-                cur_data = {field: data[field][i: i + 1] for field in data}
+                cur_data = data[i:i + 1]
                 data_list.append(self.sampling_func(cur_data, neg_iids))
-            return {field: np.concatenate([d[field] for d in data_list])
-                    for field in data}
+            return cat_interactions(data_list)
         else:
             uids = data[self.uid_field]
             neg_iids = self.sampler.sample_by_user_ids(uids, self.neg_sample_by)
             return self.sampling_func(data, neg_iids)

     def _neg_sample_by_pair_wise_sampling(self, data, neg_iids):
-        data[self.neg_item_id] = neg_iids
-        return data
+        new_data = data.repeat(self.times)
+        new_data.update(Interaction({self.neg_item_id: neg_iids}))
+        return new_data

     def _neg_sample_by_point_wise_sampling(self, data, neg_iids):
-        new_data = {}
-        for key, value in data.items():
-            if key == self.target_iid_field:
-                new_data[key] = np.concatenate([value, neg_iids])
-            else:
-                new_data[key] = np.concatenate([value] * self.times)
-        pos_len = len(data[self.target_iid_field])
-        total_len = len(new_data[self.target_iid_field])
-        new_data[self.label_field] = np.zeros(total_len, dtype=np.int)
-        new_data[self.label_field][:pos_len] = 1
+        pos_inter_num = len(data)
+        new_data = data.repeat(self.times)
+        new_data[self.iid_field][pos_inter_num:] = neg_iids
+        labels = torch.zeros(pos_inter_num * self.times)
+        labels[:pos_inter_num] = 1.0
+        new_data.update(Interaction({self.label_field: labels}))
         return new_data

     def get_pos_len_list(self):
         """
         Returns:
-            np.ndarray or list: Number of positive item for each user in a training/evaluating epoch.
+            numpy.ndarray: Number of positive items for each user in a training/evaluating epoch.
         """
         return np.ones(self.pr_end, dtype=np.int64)

+    def get_user_len_list(self):
+        """
+        Returns:
+            numpy.ndarray: Number of all items for each user in a training/evaluating epoch.
+        """
+        return np.full(self.pr_end, self.times)
+

-class SequentialFullDataLoader(SequentialDataLoader):
+class SequentialFullDataLoader(NegSampleMixin, SequentialDataLoader):
     """:class:`SequentialFullDataLoader` is a sequential-dataloader with full sort. In order to speed up calculation,
     this dataloader would only return then user part of interactions, positive items and used items.
     It would not return negative items.
@@ -289,26 +250,45 @@ class SequentialFullDataLoader(SequentialDataLoader):
     """
     dl_type = DataLoaderType.FULL

-    def __init__(self, config, dataset, sampler, neg_sample_args,
-                 batch_size=1, dl_format=InputType.POINTWISE, shuffle=False):
-        super().__init__(config, dataset,
-                         batch_size=batch_size, dl_format=dl_format, shuffle=shuffle)
+    def __init__(
+        self, config, dataset, sampler, neg_sample_args, batch_size=1, dl_format=InputType.POINTWISE, shuffle=False
+    ):
+        super().__init__(
+            config, dataset, sampler, neg_sample_args, batch_size=batch_size, dl_format=dl_format, shuffle=shuffle
+        )
+
+    def _batch_size_adaptation(self):
+        pass
+
+    def _neg_sampling(self, inter_feat):
+        pass

     def _shuffle(self):
         self.logger.warnning('SequentialFullDataLoader can\'t shuffle')

     def _next_batch_data(self):
         interaction = super()._next_batch_data()
-        tot_item_num = self.dataset.item_num
         inter_num = len(interaction)
-        pos_idx = used_idx = interaction[self.target_iid_field] + torch.arange(inter_num) * tot_item_num
-        pos_len_list = [1] * inter_num
-        neg_len_list = [tot_item_num - 1] * inter_num
-        return interaction, pos_idx, used_idx, pos_len_list, neg_len_list
+        pos_len_list = np.ones(inter_num, dtype=np.int64)
+        user_len_list = np.full(inter_num, self.item_num)
+        interaction.set_additional_info(pos_len_list, user_len_list)
+        scores_row = torch.arange(inter_num).repeat(2)
+        padding_idx = torch.zeros(inter_num, dtype=torch.int64)
+        positive_idx = interaction[self.iid_field]
+        scores_col_after = torch.cat((padding_idx, positive_idx))
+        scores_col_before = torch.cat((positive_idx, padding_idx))
+        return interaction, None, scores_row, scores_col_after, scores_col_before

     def get_pos_len_list(self):
         """
         Returns:
-            np.ndarray or list: Number of positive item for each user in a training/evaluating epoch.
+            numpy.ndarray or list: Number of positive items for each user in a training/evaluating epoch.
+        """
+        return np.ones(self.pr_end, dtype=np.int64)
+
+    def get_user_len_list(self):
+        """
+        Returns:
+            numpy.ndarray: Number of all items for each user in a training/evaluating epoch.
         """
-        return np.ones(self.pr_end, dtype=np.int64)
\ No newline at end of file
+        return np.full(self.pr_end, self.item_num)
diff --git a/recbole/data/dataloader/user_dataloader.py b/recbole/data/dataloader/user_dataloader.py
index 73d92aa51..2d2fd62a0 100644
--- a/recbole/data/dataloader/user_dataloader.py
+++ b/recbole/data/dataloader/user_dataloader.py
@@ -3,16 +3,18 @@
 # @Email  : chenyushuo@ruc.edu.cn

 # UPDATE
-# @Time   : 2020/9/23
-# @Author : Yushuo Chen
-# @email  : chenyushuo@ruc.edu.cn
+# @Time   : 2020/9/23, 2020/12/28
+# @Author : Yushuo Chen, Xingyu Pan
+# @email  : chenyushuo@ruc.edu.cn, panxy@ruc.edu.cn

 """
 recbole.data.dataloader.user_dataloader
 ################################################
 """
+import torch

 from recbole.data.dataloader import AbstractDataLoader
+from recbole.data.interaction import Interaction
 from recbole.utils.enum_type import DataLoaderType, InputType

@@ -33,12 +35,11 @@ class UserDataLoader(AbstractDataLoader):
     """
     dl_type = DataLoaderType.ORIGIN

-    def __init__(self, config, dataset,
-                 batch_size=1, dl_format=InputType.POINTWISE, shuffle=False):
+    def __init__(self, config, dataset, batch_size=1, dl_format=InputType.POINTWISE, shuffle=False):
         self.uid_field = dataset.uid_field
+        self.user_list = Interaction({self.uid_field: torch.arange(dataset.user_num)})

-        super().__init__(config=config, dataset=dataset,
-                         batch_size=batch_size, dl_format=dl_format, shuffle=shuffle)
+        super().__init__(config=config, dataset=dataset, batch_size=batch_size, dl_format=dl_format, shuffle=shuffle)

     def setup(self):
         """Make sure that the :attr:`shuffle` is True. If :attr:`shuffle` is False, it will be changed to True
@@ -50,12 +51,12 @@ def setup(self):

     @property
     def pr_end(self):
-        return len(self.dataset.user_feat)
+        return len(self.user_list)

     def _shuffle(self):
-        self.dataset.user_feat = self.dataset.user_feat.sample(frac=1).reset_index(drop=True)
+        self.user_list.shuffle()

     def _next_batch_data(self):
-        cur_data = self.dataset.user_feat[[self.uid_field]][self.pr: self.pr + self.step]
+        cur_data = self.user_list[self.pr:self.pr + self.step]
         self.pr += self.step
-        return self._dataframe_to_interaction(cur_data)
+        return cur_data
diff --git a/recbole/data/dataloader/xgboost_dataloader.py b/recbole/data/dataloader/xgboost_dataloader.py
new file mode 100644
index 000000000..17c9ab2b0
--- /dev/null
+++ b/recbole/data/dataloader/xgboost_dataloader.py
@@ -0,0 +1,40 @@
+# @Time   : 2020/11/19
+# @Author : Chen Yang
+# @Email  : 254170321@qq.com
+
+# UPDATE:
+# @Time   : 2020/11/19
+# @Author : Chen Yang
+# @Email  : 254170321@qq.com
+
+"""
+recbole.data.dataloader.xgboost_dataloader
+################################################
+"""
+
+from recbole.data.dataloader.general_dataloader import GeneralDataLoader, GeneralNegSampleDataLoader, \
+    GeneralFullDataLoader
+
+
+class XgboostDataLoader(GeneralDataLoader):
+    """:class:`XgboostDataLoader` inherits from
+    :class:`~recbole.data.dataloader.general_dataloader.GeneralDataLoader`,
+    and doesn't add/change anything at all.
+    """
+    pass
+
+
+class XgboostNegSampleDataLoader(GeneralNegSampleDataLoader):
+    """:class:`XgboostNegSampleDataLoader` inherits from
+    :class:`~recbole.data.dataloader.general_dataloader.GeneralNegSampleDataLoader`,
+    and doesn't add/change anything at all.
+    """
+    pass
+
+
+class XgboostFullDataLoader(GeneralFullDataLoader):
+    """:class:`XgboostFullDataLoader` inherits from
+    :class:`~recbole.data.dataloader.general_dataloader.GeneralFullDataLoader`,
+    and doesn't add/change anything at all.
+    """
+    pass
diff --git a/recbole/data/dataset/__init__.py b/recbole/data/dataset/__init__.py
index 2bdb7f4ad..58026c332 100644
--- a/recbole/data/dataset/__init__.py
+++ b/recbole/data/dataset/__init__.py
@@ -3,4 +3,5 @@
 from recbole.data.dataset.kg_dataset import KnowledgeBasedDataset
 from recbole.data.dataset.social_dataset import SocialDataset
 from recbole.data.dataset.kg_seq_dataset import Kg_Seq_Dataset
+from recbole.data.dataset.xgboost_dataset import XgboostDataset
 from recbole.data.dataset.customized_dataset import *
diff --git a/recbole/data/dataset/customized_dataset.py b/recbole/data/dataset/customized_dataset.py
index 2d7a6d7c7..676e35cdb 100644
--- a/recbole/data/dataset/customized_dataset.py
+++ b/recbole/data/dataset/customized_dataset.py
@@ -15,10 +15,12 @@

 class GRU4RecKGDataset(Kg_Seq_Dataset):
+
     def __init__(self, config, saved_dataset=None):
         super().__init__(config, saved_dataset=saved_dataset)


 class KSRDataset(Kg_Seq_Dataset):
+
     def __init__(self, config, saved_dataset=None):
         super().__init__(config, saved_dataset=saved_dataset)
diff --git a/recbole/data/dataset/dataset.py b/recbole/data/dataset/dataset.py
index aadf4eeeb..22c5c8b60 100644
--- a/recbole/data/dataset/dataset.py
+++ b/recbole/data/dataset/dataset.py
@@ -23,11 +23,10 @@
 import torch
 import torch.nn.utils.rnn as rnn_utils
 from scipy.sparse import coo_matrix
-from sklearn.impute import SimpleImputer

-from recbole.utils import FeatureSource, FeatureType
 from recbole.data.interaction import Interaction
 from recbole.data.utils import dlapi
+from recbole.utils import FeatureSource, FeatureType


 class Dataset(object):
@@ -78,17 +77,18 @@ class Dataset(object):
         time_field (str or None): The same as ``config['TIME_FIELD']``.
-        inter_feat (:class:`pandas.DataFrame`): Internal data structure stores the interaction features.
+        inter_feat (:class:`Interaction`): Internal data structure stores the interaction features.
             It's loaded from file ``.inter``.

-        user_feat (:class:`pandas.DataFrame` or None): Internal data structure stores the user features.
+        user_feat (:class:`Interaction` or None): Internal data structure stores the user features.
             It's loaded from file ``.user`` if existed.

-        item_feat (:class:`pandas.DataFrame` or None): Internal data structure stores the item features.
+        item_feat (:class:`Interaction` or None): Internal data structure stores the item features.
             It's loaded from file ``.item`` if existed.

-        feat_list (list): A list contains all the features (:class:`pandas.DataFrame`), including additional features.
+        feat_name_list (list): A list containing all the features' names (:class:`str`), including additional features.
     """
+
     def __init__(self, config, saved_dataset=None):
         self.config = config
         self.dataset_name = config['dataset']
@@ -105,18 +105,18 @@ def _from_scratch(self):
         """Load dataset from scratch.
         Initialize attributes firstly, then load data from atomic files, pre-process the dataset lastly.
         """
-        self.logger.debug('Loading {} from scratch'.format(self.__class__))
+        self.logger.debug(f'Loading {self.__class__} from scratch')

         self._get_preset()
         self._get_field_from_config()
         self._load_data(self.dataset_name, self.dataset_path)
         self._data_processing()
+        self._change_feat_format()

     def _get_preset(self):
         """Initialization useful inside attributes.
         """
         self.dataset_path = self.config['data_path']
-        self._fill_nan_flag = self.config['fill_nan']

         self.field2type = {}
         self.field2source = {}
@@ -134,20 +134,24 @@ def _get_field_from_config(self):
         self.label_field = self.config['LABEL_FIELD']
         self.time_field = self.config['TIME_FIELD']

-        self.logger.debug('uid_field: {}'.format(self.uid_field))
-        self.logger.debug('iid_field: {}'.format(self.iid_field))
+        if (self.uid_field is None) ^ (self.iid_field is None):
+            raise ValueError(
+                'USER_ID_FIELD and ITEM_ID_FIELD need to be set at the same time or not set at the same time.'
+            )
+
+        self.logger.debug(f'uid_field: {self.uid_field}')
+        self.logger.debug(f'iid_field: {self.iid_field}')

     def _data_processing(self):
         """Data preprocessing, including:

-        - K-core data filtering
-        - Value-based data filtering
+        - Data filtering
         - Remap ID
         - Missing value imputation
         - Normalization
         - Preloading weights initialization
         """
-        self.feat_list = self._build_feat_list()
+        self.feat_name_list = self._build_feat_name_list()
         if self.benchmark_filename_list is None:
             self._data_filtering()
@@ -162,36 +166,42 @@ def _data_filtering(self):
         """Data filtering

         - Filter missing user_id or item_id
+        - Remove duplicated user-item interaction
         - Value-based data filtering
+        - Remove interaction by user or item
         - K-core data filtering

         Note:
             After filtering, feats(``DataFrame``) has non-continuous index,
-            thus :meth:`~recbole.data.dataset.dataset.Dataset._reset_index()` will reset the index of feats.
+            thus :meth:`~recbole.data.dataset.dataset.Dataset._reset_index` will reset the index of feats.
         """
         self._filter_nan_user_or_item()
         self._remove_duplication()
         self._filter_by_field_value()
+        self._filter_inter_by_user_or_item()
         self._filter_by_inter_num()
         self._reset_index()

-    def _build_feat_list(self):
+    def _build_feat_name_list(self):
         """Feat list building.

-        Any feat loaded by Dataset can be found in ``feat_list``
+        Any feat loaded by Dataset can be found in ``feat_name_list``

         Returns:
-            builded feature list.
+            built feature name list.

         Note:
             Subclasses can inherit this method to add new feat.
         """
-        feat_list = [feat for feat in [self.inter_feat, self.user_feat, self.item_feat] if feat is not None]
+        feat_name_list = [
+            feat_name for feat_name in ['inter_feat', 'user_feat', 'item_feat']
+            if getattr(self, feat_name, None) is not None
+        ]
         if self.config['additional_feat_suffix'] is not None:
             for suf in self.config['additional_feat_suffix']:
-                if hasattr(self, '{}_feat'.format(suf)):
-                    feat_list.append(getattr(self, '{}_feat'.format(suf)))
-        return feat_list
+                if getattr(self, f'{suf}_feat', None) is not None:
+                    feat_name_list.append(f'{suf}_feat')
+        return feat_name_list

     def _restore_saved_dataset(self, saved_dataset):
         """Restore saved dataset from ``saved_dataset``.
@@ -199,10 +209,10 @@ def _restore_saved_dataset(self, saved_dataset):
         Args:
             saved_dataset (str): path for the saved dataset.
         """
-        self.logger.debug('Restoring dataset from [{}]'.format(saved_dataset))
+        self.logger.debug(f'Restoring dataset from [{saved_dataset}].')

         if (saved_dataset is None) or (not os.path.isdir(saved_dataset)):
-            raise ValueError('filepath [{}] need to be a dir'.format(saved_dataset))
+            raise ValueError(f'Filepath [{saved_dataset}] needs to be a dir.')

         with open(os.path.join(saved_dataset, 'basic-info.json')) as file:
             basic_info = json.load(file)
@@ -212,12 +222,12 @@ def _restore_saved_dataset(self, saved_dataset):
         feats = ['inter', 'user', 'item']
         for name in feats:
-            cur_file_name = os.path.join(saved_dataset, '{}.csv'.format(name))
+            cur_file_name = os.path.join(saved_dataset, f'{name}.csv')
             if os.path.isfile(cur_file_name):
                 df = pd.read_csv(cur_file_name)
-                setattr(self, '{}_feat'.format(name), df)
+                setattr(self, f'{name}_feat', df)
             else:
-                setattr(self, '{}_feat'.format(name), None)
+                setattr(self, f'{name}_feat', None)

         self._get_field_from_config()
@@ -250,24 +260,24 @@ def _load_inter_feat(self, token, dataset_path):
             dataset_path (str): path of dataset dir.
         """
         if self.benchmark_filename_list is None:
-            inter_feat_path = os.path.join(dataset_path, '{}.{}'.format(token, 'inter'))
+            inter_feat_path = os.path.join(dataset_path, f'{token}.inter')
             if not os.path.isfile(inter_feat_path):
-                raise ValueError('File {} not exist'.format(inter_feat_path))
+                raise ValueError(f'File {inter_feat_path} does not exist.')

             inter_feat = self._load_feat(inter_feat_path, FeatureSource.INTERACTION)
-            self.logger.debug('interaction feature loaded successfully from [{}]'.format(inter_feat_path))
+            self.logger.debug(f'Interaction feature loaded successfully from [{inter_feat_path}].')
             self.inter_feat = inter_feat
         else:
             sub_inter_lens = []
             sub_inter_feats = []
             for filename in self.benchmark_filename_list:
-                file_path = os.path.join(dataset_path, '{}.{}.{}'.format(token, filename, 'inter'))
+                file_path = os.path.join(dataset_path, f'{token}.{filename}.inter')
                 if os.path.isfile(file_path):
                     temp = self._load_feat(file_path, FeatureSource.INTERACTION)
                     sub_inter_feats.append(temp)
                     sub_inter_lens.append(len(temp))
                 else:
-                    raise ValueError('File {} not exist'.format(file_path))
+                    raise ValueError(f'File {file_path} does not exist.')
             inter_feat = pd.concat(sub_inter_feats)
             self.inter_feat, self.file_size_list = inter_feat, sub_inter_lens
@@ -287,19 +297,19 @@ def _load_user_or_item_feat(self, token, dataset_path, source, field_name):
             ``user_id`` and ``item_id`` has source :obj:`~recbole.utils.enum_type.FeatureSource.USER_ID` and
             :obj:`~recbole.utils.enum_type.FeatureSource.ITEM_ID`
         """
-        feat_path = os.path.join(dataset_path, '{}.{}'.format(token, source.value))
+        feat_path = os.path.join(dataset_path, f'{token}.{source.value}')
         if os.path.isfile(feat_path):
             feat = self._load_feat(feat_path, source)
-            self.logger.debug('[{}] feature loaded successfully from [{}]'.format(source.value, feat_path))
+            self.logger.debug(f'[{source.value}] feature loaded successfully from [{feat_path}].')
         else:
             feat = None
-            self.logger.debug('[{}] not found, [{}] features are not loaded'.format(feat_path, source.value))
+            self.logger.debug(f'[{feat_path}] not found, [{source.value}] features are not loaded.')

         field = getattr(self, field_name, None)
         if feat is not None and field is None:
-            raise ValueError('{} must be exist if {}_feat exist'.format(field_name, source.value))
+            raise ValueError(f'{field_name} must exist if {source.value}_feat exists.')
         if feat is not None and field not in feat:
-            raise ValueError('{} must be loaded if {}_feat is loaded'.format(field_name, source.value))
+            raise ValueError(f'{field_name} must be loaded if {source.value}_feat is loaded.')

         if field in self.field2source:
             self.field2source[field] = FeatureSource(source.value + '_id')
@@ -310,7 +320,7 @@ def _load_additional_feat(self, token, dataset_path):

         For those additional features, e.g. pretrained entity embedding, user can set them as
         ``config['additional_feat_suffix']``, then they will be loaded and stored in
-        :attr:`feat_list`. See :doc:`../user_guide/data/data_args` for details.
+        :attr:`feat_name_list`. See :doc:`../user_guide/data/data_args` for details.

         Args:
             token (str): dataset name.
@@ -319,14 +329,14 @@
         if self.config['additional_feat_suffix'] is None:
             return
         for suf in self.config['additional_feat_suffix']:
-            if hasattr(self, '{}_feat'.format(suf)):
-                raise ValueError('{}_feat already exist'.format(suf))
-            feat_path = os.path.join(dataset_path, '{}.{}'.format(token, suf))
+            if hasattr(self, f'{suf}_feat'):
+                raise ValueError(f'{suf}_feat already exists.')
+            feat_path = os.path.join(dataset_path, f'{token}.{suf}')
             if os.path.isfile(feat_path):
                 feat = self._load_feat(feat_path, suf)
             else:
-                raise ValueError('Additional feature file [{}] not found'.format(feat_path))
-            setattr(self, '{}_feat'.format(suf), feat)
+                raise ValueError(f'Additional feature file [{feat_path}] not found.')
+            setattr(self, f'{suf}_feat', feat)

     def _get_load_and_unload_col(self, source):
         """Parsing ``config['load_col']`` and ``config['unload_col']`` according to source.
@@ -355,10 +365,11 @@ def _get_load_and_unload_col(self, source):
             unload_col = None

         if load_col and unload_col:
-            raise ValueError('load_col [{}] and unload_col [{}] can not be set the same time'.format(
-                load_col, unload_col))
+            raise ValueError(f'load_col [{load_col}] and unload_col [{unload_col}] cannot be set at the same time.')

-        self.logger.debug('\n [{}]:\n\t load_col: [{}]\n\t unload_col: [{}]\n'.format(source, load_col, unload_col))
+        self.logger.debug(f'[{source}]: ')
+        self.logger.debug(f'\t load_col: [{load_col}]')
+        self.logger.debug(f'\t unload_col: [{unload_col}]')
         return load_col, unload_col

     def _load_feat(self, filepath, source):
@@ -374,11 +385,11 @@ def _load_feat(self, filepath, source):
             pandas.DataFrame: Loaded feature

         Note:
-            For sequence features, ``seqlen`` will be loaded, but data in DataFrame will not be cutted off.
+            For sequence features, ``seqlen`` will be loaded, but data in DataFrame will not be cut off.
             Their length is limited only after calling :meth:`~_dict_to_interaction` or
             :meth:`~_dataframe_to_interaction`
         """
-        self.logger.debug('loading feature from [{}] (source: [{}])'.format(filepath, source))
+        self.logger.debug(f'Loading feature from [{filepath}] (source: [{source}]).')

         load_col, unload_col = self._get_load_and_unload_col(source)
         if load_col == set():
@@ -395,7 +406,7 @@ def _load_feat(self, filepath, source):
             try:
                 ftype = FeatureType(ftype)
             except ValueError:
-                raise ValueError('Type {} from field {} is not supported'.format(ftype, field))
+                raise ValueError(f'Type {ftype} from field {field} is not supported.')
             if load_col is not None and field not in load_col:
                 continue
             if unload_col is not None and field in unload_col:
@@ -410,7 +421,7 @@
             dtype[field_type] = np.float64 if ftype == FeatureType.FLOAT else str

         if len(columns) == 0:
-            self.logger.warning('no columns has been loaded from [{}]'.format(source))
+            self.logger.warning(f'No columns have been loaded from [{source}]')
             return None

         df = pd.read_csv(filepath, delimiter=self.config['field_separator'], usecols=usecols, dtype=dtype)
@@ -421,7 +432,7 @@
             ftype = self.field2type[field]
             if not ftype.value.endswith('seq'):
                 continue
-            df[field].fillna(value='0', inplace=True)
+            df[field].fillna(value='', inplace=True)
             if ftype == FeatureType.TOKEN_SEQ:
                 df[field] = [list(filter(None, _.split(seq_separator))) for _ in df[field].values]
             elif ftype == FeatureType.FLOAT_SEQ:
@@ -431,24 +442,16 @@ def _user_item_feat_preparation(self):
         """Sort :attr:`user_feat` and :attr:`item_feat` by ``user_id`` or ``item_id``.

-        Missing values will be filled.
+        Missing values will be filled later.
         """
-        flag = False
         if self.user_feat is not None:
             new_user_df = pd.DataFrame({self.uid_field: np.arange(self.user_num)})
             self.user_feat = pd.merge(new_user_df, self.user_feat, on=self.uid_field, how='left')
-            flag = True
             self.logger.debug('ordering user features by user id.')
         if self.item_feat is not None:
             new_item_df = pd.DataFrame({self.iid_field: np.arange(self.item_num)})
             self.item_feat = pd.merge(new_item_df, self.item_feat, on=self.iid_field, how='left')
-            flag = True
             self.logger.debug('ordering item features by user id.')
-        if flag:
-            # CANNOT be removed
-            # user/item feat has been updated, thus feat_list should be updated too.
-            self.feat_list = self._build_feat_list()
-            self._fill_nan_flag = True

     def _preload_weight_matrix(self):
         """Transfer preload weight features into :class:`numpy.ndarray` with shape ``[id_token_length]``
@@ -457,32 +460,31 @@ def _preload_weight_matrix(self):
         preload_fields = self.config['preload_weight']
         if preload_fields is None:
             return
-        drop_flag = self.config['drop_preload_weight']
-        if drop_flag is None:
-            drop_flag = True

-        self.logger.debug('preload weight matrix for {}, drop=[{}]'.format(preload_fields, drop_flag))
+        self.logger.debug(f'Preload weight matrix for {preload_fields}.')

         for preload_id_field in preload_fields:
             preload_value_field = preload_fields[preload_id_field]
             if preload_id_field not in self.field2source:
-                raise ValueError('prelaod id field [{}] not exist'.format(preload_id_field))
+                raise ValueError(f'Preload id field [{preload_id_field}] does not exist.')
             if preload_value_field not in self.field2source:
-                raise ValueError('prelaod value field [{}] not exist'.format(preload_value_field))
+                raise ValueError(f'Preload value field [{preload_value_field}] does not exist.')
             pid_source = self.field2source[preload_id_field]
             pv_source = self.field2source[preload_value_field]
             if pid_source != pv_source:
-                raise ValueError('preload id field [{}] is from source [{}],'
-                                 'while prelaod value field [{}] is from source [{}], which should be the same'.format(
-                                     preload_id_field, pid_source, preload_value_field, pv_source
-                                 ))
-            for feat in self.feat_list:
+                raise ValueError(
+                    f'Preload id field [{preload_id_field}] is from source [{pid_source}], '
+                    f'while preload value field [{preload_value_field}] is from source [{pv_source}], '
+                    f'which should be the same.'
+                )
+            for feat_name in self.feat_name_list:
+                feat = getattr(self, feat_name)
                 if preload_id_field in feat:
                     id_ftype = self.field2type[preload_id_field]
                     if id_ftype != FeatureType.TOKEN:
-                        raise ValueError('prelaod id field [{}] should be type token, but is [{}]'.format(
-                            preload_id_field, id_ftype
-                        ))
+                        raise ValueError(
+                            f'Preload id field [{preload_id_field}] should be type token, but is [{id_ftype}].'
+                        )
                     value_ftype = self.field2type[preload_value_field]
                     token_num = self.num(preload_id_field)
                     if value_ftype == FeatureType.FLOAT:
@@ -503,14 +505,12 @@
                     else:
                         matrix[pid] = prow[:max_len]
                 else:
-                    self.logger.warning('Field [{}] with type [{}] is not \'float\' or \'float_seq\', \
-                        which will not be handled by preload matrix.'.format(preload_value_field,
-                                                                             value_ftype))
+                    self.logger.warning(
+                        f'Field [{preload_value_field}] with type [{value_ftype}] is not `float` or `float_seq`, '
+                        f'which will not be handled by preload matrix.'
+                    )
                     continue
                 self._preloaded_weight[preload_id_field] = matrix
-            if drop_flag:
-                self._del_col(preload_id_field)
-                self._del_col(preload_value_field)

     def _fill_nan(self):
         """Missing value imputation.
@@ -520,28 +520,19 @@
         For fields with type :obj:`~recbole.utils.enum_type.FeatureType.FLOAT`, missing value will be filled by
         the average of original data.
-
-        For sequence features, missing value will be filled by ``[0]``.
""" self.logger.debug('Filling nan') - if not self._fill_nan_flag: - return - - most_freq = SimpleImputer(missing_values=np.nan, strategy='most_frequent', copy=False) - aveg = SimpleImputer(missing_values=np.nan, strategy='mean', copy=False) - - for feat in self.feat_list: + for feat_name in self.feat_name_list: + feat = getattr(self, feat_name) for field in feat: ftype = self.field2type[field] if ftype == FeatureType.TOKEN: - feat[field] = most_freq.fit_transform(feat[field].values.reshape(-1, 1)) + feat[field].fillna(value=0, inplace=True) elif ftype == FeatureType.FLOAT: - feat[field] = aveg.fit_transform(feat[field].values.reshape(-1, 1)) - elif ftype.value.endswith('seq'): - feat[field] = feat[field].apply(lambda x: [0] - if (not isinstance(x, np.ndarray) and (not isinstance(x, list))) - else x) + feat[field].fillna(value=feat[field].mean(), inplace=True) + else: + feat[field] = feat[field].apply(lambda x: [] if isinstance(x, float) else x) def _normalize(self): """Normalization if ``config['normalize_field']`` or ``config['normalize_all']`` is set. @@ -553,25 +544,26 @@ def _normalize(self): Note: Only float-like fields can be normalized. 
""" - if self.config['normalize_field'] is not None and self.config['normalize_all'] is not None: - raise ValueError('normalize_field and normalize_all can\'t be set at the same time') + if self.config['normalize_field'] is not None and self.config['normalize_all'] is True: + raise ValueError('Normalize_field and normalize_all can\'t be set at the same time.') if self.config['normalize_field']: fields = self.config['normalize_field'] for field in fields: ftype = self.field2type[field] if field not in self.field2type: - raise ValueError('Field [{}] doesn\'t exist'.format(field)) + raise ValueError(f'Field [{field}] does not exist.') elif ftype != FeatureType.FLOAT and ftype != FeatureType.FLOAT_SEQ: - self.logger.warning('{} is not a FLOAT/FLOAT_SEQ feat, which will not be normalized.'.format(field)) + self.logger.warning(f'{field} is not a FLOAT/FLOAT_SEQ feat, which will not be normalized.') elif self.config['normalize_all']: fields = self.float_like_fields else: return - self.logger.debug('Normalized fields: {}'.format(fields)) + self.logger.debug(f'Normalized fields: {fields}') - for feat in self.feat_list: + for feat_name in self.feat_name_list: + feat = getattr(self, feat_name) for field in feat: if field not in fields: continue @@ -580,15 +572,19 @@ def _normalize(self): lst = feat[field].values mx, mn = max(lst), min(lst) if mx == mn: - raise ValueError('All the same value in [{}] from [{}_feat]'.format(field, feat)) - feat[field] = (lst - mn) / (mx - mn) + self.logger.warning(f'All the same value in [{field}] from [{feat}_feat].') + feat[field] = 1.0 + else: + feat[field] = (lst - mn) / (mx - mn) elif ftype == FeatureType.FLOAT_SEQ: split_point = np.cumsum(feat[field].agg(len))[:-1] lst = feat[field].agg(np.concatenate) mx, mn = max(lst), min(lst) if mx == mn: - raise ValueError('All the same value in [{}] from [{}_feat]'.format(field, feat)) - lst = (lst - mn) / (mx - mn) + self.logger.warning(f'All the same value in [{field}] from [{feat}_feat].') + lst = 
1.0 + else: + lst = (lst - mn) / (mx - mn) lst = np.split(lst, split_point) feat[field] = lst @@ -599,15 +595,17 @@ def _filter_nan_user_or_item(self): feat = getattr(self, name + '_feat') if feat is not None: dropped_feat = feat.index[feat[field].isnull()] - if dropped_feat.any(): - self.logger.warning('In {}_feat, line {}, {} do not exist, so they will be removed'.format( - name, list(dropped_feat + 2), field)) + if len(dropped_feat): + self.logger.warning( + f'In {name}_feat, line {list(dropped_feat + 2)}, {field} do not exist, so they will be removed.' + ) feat.drop(feat.index[dropped_feat], inplace=True) if field is not None: dropped_inter = self.inter_feat.index[self.inter_feat[field].isnull()] - if dropped_inter.any(): - self.logger.warning('In inter_feat, line {}, {} do not exist, so they will be removed'.format( - name, list(dropped_inter + 2), field)) + if len(dropped_inter): + self.logger.warning( + f'In inter_feat, line {list(dropped_inter + 2)}, {field} do not exist, so they will be removed.' + ) self.inter_feat.drop(self.inter_feat.index[dropped_inter], inplace=True) def _remove_duplication(self): @@ -626,11 +624,14 @@ def _remove_duplication(self): if self.time_field in self.inter_feat: self.inter_feat.sort_values(by=[self.time_field], ascending=True, inplace=True) - self.logger.info('Records in original dataset have been sorted by value of [{}] in ascending order.'.format( - self.time_field)) + self.logger.info( + f'Records in original dataset have been sorted by value of [{self.time_field}] in ascending order.' + ) else: - self.logger.warning('Timestamp field has not been loaded or specified, ' - 'thus strategy [{}] of duplication removal may be meaningless.'.format(keep)) + self.logger.warning( + f'Timestamp field has not been loaded or specified, ' + f'thus strategy [{keep}] of duplication removal may be meaningless.' 
+ ) self.inter_feat.drop_duplicates(subset=[self.uid_field, self.iid_field], keep=keep, inplace=True) def _filter_by_inter_num(self): @@ -643,16 +644,42 @@ def _filter_by_inter_num(self): Lower bound is also called k-core filtering, which means this method will filter loops until all the users and items has at least k interactions. """ + if self.uid_field is None or self.iid_field is None: + return + + max_user_inter_num = self.config['max_user_inter_num'] + min_user_inter_num = self.config['min_user_inter_num'] + max_item_inter_num = self.config['max_item_inter_num'] + min_item_inter_num = self.config['min_item_inter_num'] + + if max_user_inter_num is None and min_user_inter_num is None: + user_inter_num = Counter() + else: + user_inter_num = Counter(self.inter_feat[self.uid_field].values) + + if max_item_inter_num is None and min_item_inter_num is None: + item_inter_num = Counter() + else: + item_inter_num = Counter(self.inter_feat[self.iid_field].values) + while True: - ban_users = self._get_illegal_ids_by_inter_num(field=self.uid_field, feat=self.user_feat, - max_num=self.config['max_user_inter_num'], - min_num=self.config['min_user_inter_num']) - ban_items = self._get_illegal_ids_by_inter_num(field=self.iid_field, feat=self.item_feat, - max_num=self.config['max_item_inter_num'], - min_num=self.config['min_item_inter_num']) + ban_users = self._get_illegal_ids_by_inter_num( + field=self.uid_field, + feat=self.user_feat, + inter_num=user_inter_num, + max_num=max_user_inter_num, + min_num=min_user_inter_num + ) + ban_items = self._get_illegal_ids_by_inter_num( + field=self.iid_field, + feat=self.item_feat, + inter_num=item_inter_num, + max_num=max_item_inter_num, + min_num=min_item_inter_num + ) if len(ban_users) == 0 and len(ban_items) == 0: - return + break if self.user_feat is not None: dropped_user = self.user_feat[self.uid_field].isin(ban_users) @@ -663,46 +690,43 @@ def _filter_by_inter_num(self): self.item_feat.drop(self.item_feat.index[dropped_item], 
inplace=True) dropped_inter = pd.Series(False, index=self.inter_feat.index) - if self.uid_field: - dropped_inter |= self.inter_feat[self.uid_field].isin(ban_users) - if self.iid_field: - dropped_inter |= self.inter_feat[self.iid_field].isin(ban_items) - self.logger.debug('[{}] dropped interactions'.format(len(dropped_inter))) - self.inter_feat.drop(self.inter_feat.index[dropped_inter], inplace=True) - - def _get_illegal_ids_by_inter_num(self, field, feat, max_num=None, min_num=None): + user_inter = self.inter_feat[self.uid_field] + item_inter = self.inter_feat[self.iid_field] + dropped_inter |= user_inter.isin(ban_users) + dropped_inter |= item_inter.isin(ban_items) + + user_inter_num -= Counter(user_inter[dropped_inter].values) + item_inter_num -= Counter(item_inter[dropped_inter].values) + + dropped_index = self.inter_feat.index[dropped_inter] + self.logger.debug(f'[{len(dropped_index)}] dropped interactions.') + self.inter_feat.drop(dropped_index, inplace=True) + + def _get_illegal_ids_by_inter_num(self, field, feat, inter_num, max_num=None, min_num=None): """Given inter feat, return illegal ids, whose inter num out of [min_num, max_num] Args: field (str): field name of user_id or item_id. feat (pandas.DataFrame): interaction feature. + inter_num (Counter): interaction number counter. max_num (int, optional): max number of interaction. Defaults to ``None``. min_num (int, optional): min number of interaction. Defaults to ``None``. 
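The rewritten `_filter_by_inter_num` keeps `Counter` objects alive across loop iterations and decrements them as interactions are dropped. The sketch below shows the same k-core fixed point, but rebuilds the counters each pass for clarity rather than updating them incrementally:

```python
from collections import Counter

def k_core(pairs, k):
    # Repeatedly drop interactions whose user or item has fewer than k
    # records, until the set of interactions stabilizes.
    pairs = list(pairs)
    while True:
        user_cnt = Counter(u for u, _ in pairs)
        item_cnt = Counter(i for _, i in pairs)
        kept = [(u, i) for u, i in pairs if user_cnt[u] >= k and item_cnt[i] >= k]
        if len(kept) == len(pairs):
            return kept
        pairs = kept
```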
Returns: set: illegal ids, whose inter num out of [min_num, max_num] """ - self.logger.debug('\n get_illegal_ids_by_inter_num:\n\t field=[{}], max_num=[{}], min_num=[{}]'.format( - field, max_num, min_num - )) - - if field is None: - return set() - if max_num is None and min_num is None: - return set() + self.logger.debug(f'get_illegal_ids_by_inter_num: field=[{field}], max_num=[{max_num}], min_num=[{min_num}]') max_num = max_num or np.inf min_num = min_num or -1 - ids = self.inter_feat[field].values - inter_num = Counter(ids) ids = {id_ for id_ in inter_num if inter_num[id_] < min_num or inter_num[id_] > max_num} if feat is not None: for id_ in feat[field].values: if inter_num[id_] < min_num: ids.add(id_) - self.logger.debug('[{}] illegal_ids_by_inter_num, field=[{}]'.format(len(ids), field)) + self.logger.debug(f'[{len(ids)}] illegal_ids_by_inter_num, field=[{field}]') return ids def _filter_by_field_value(self): @@ -714,16 +738,11 @@ def _filter_by_field_value(self): filter_field += self._drop_by_value(self.config['equal_val'], lambda x, y: x != y) filter_field += self._drop_by_value(self.config['not_equal_val'], lambda x, y: x == y) - if not filter_field: - return - if self.config['drop_filter_field']: - for field in set(filter_field): - self._del_col(field) - def _reset_index(self): - """Reset index for all feats in :attr:`feat_list`. + """Reset index for all feats in :attr:`feat_name_list`. """ - for feat in self.feat_list: + for feat_name in self.feat_name_list: + feat = getattr(self, feat_name) if feat.empty: raise ValueError('Some feat is empty, please check the filtering settings.') feat.reset_index(drop=True, inplace=True) @@ -732,8 +751,8 @@ def _drop_by_value(self, val, cmp): """Drop illegal rows by value. Args: - val (float): value that compared to. - cmp (function): return False if a row need to be droped + val (dict): value that compared to. 
+ cmp (Callable): return False if a row need to be dropped Returns: field names that used to compare with val. @@ -741,33 +760,54 @@ def _drop_by_value(self, val, cmp): if val is None: return [] - self.logger.debug('drop_by_value: val={}'.format(val)) + self.logger.debug(f'drop_by_value: val={val}') filter_field = [] for field in val: if field not in self.field2type: - raise ValueError('field [{}] not defined in dataset'.format(field)) + raise ValueError(f'Field [{field}] not defined in dataset.') if self.field2type[field] not in {FeatureType.FLOAT, FeatureType.FLOAT_SEQ}: - raise ValueError('field [{}] is not float-like field in dataset, which can\'t be filter'.format(field)) - for feat in self.feat_list: + raise ValueError(f'Field [{field}] is not float-like field in dataset, which can\'t be filter.') + for feat_name in self.feat_name_list: + feat = getattr(self, feat_name) if field in feat: feat.drop(feat.index[cmp(feat[field].values, val[field])], inplace=True) filter_field.append(field) return filter_field - def _del_col(self, field): + def _del_col(self, feat, field): """Delete columns Args: - field (str): field name to be droped. + feat (pandas.DataFrame or Interaction): the feat contains field. + field (str): field name to be dropped. """ - self.logger.debug('delete column [{}]'.format(field)) - for feat in self.feat_list: - if field in feat: - feat.drop(columns=field, inplace=True) - for dct in [self.field2id_token, self.field2seqlen, self.field2source, self.field2type]: + self.logger.debug(f'Delete column [{field}].') + if isinstance(feat, Interaction): + feat.drop(column=field) + else: + feat.drop(columns=field, inplace=True) + for dct in [self.field2id_token, self.field2token_id, self.field2seqlen, self.field2source, self.field2type]: if field in dct: del dct[field] + def _filter_inter_by_user_or_item(self): + """Remove interaction in inter_feat which user or item is not in user_feat or item_feat. 
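`_drop_by_value` receives a dict mapping field names to threshold values plus a comparator that flags rows to drop. A pandas sketch of one such filter (the field name and threshold are illustrative):

```python
import pandas as pd

# Mirror of the _drop_by_value pattern: cmp returns True for rows to drop,
# so val={'rating': 3.0} with cmp "x < y" keeps ratings >= 3.0.
feat = pd.DataFrame({'rating': [1.0, 3.0, 4.5]})
val = {'rating': 3.0}
cmp = lambda x, y: x < y
for field in val:
    feat.drop(feat.index[cmp(feat[field].values, val[field])], inplace=True)
```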
+ """ + if self.config['filter_inter_by_user_or_item'] is not True: + return + + remained_inter = pd.Series(True, index=self.inter_feat.index) + + if self.user_feat is not None: + remained_uids = self.user_feat[self.uid_field].values + remained_inter &= self.inter_feat[self.uid_field].isin(remained_uids) + + if self.item_feat is not None: + remained_iids = self.item_feat[self.iid_field].values + remained_inter &= self.inter_feat[self.iid_field].isin(remained_iids) + + self.inter_feat.drop(self.inter_feat.index[~remained_inter], inplace=True) + def _set_label_by_threshold(self): """Generate 0/1 labels according to value of features. @@ -777,24 +817,24 @@ def _set_label_by_threshold(self): Note: Key of ``config['threshold']`` if a field name. - This field will be droped after label generation. + This field will be dropped after label generation. """ threshold = self.config['threshold'] if threshold is None: return - self.logger.debug('set label by {}'.format(threshold)) + self.logger.debug(f'Set label by {threshold}.') if len(threshold) != 1: - raise ValueError('threshold length should be 1') + raise ValueError('Threshold length should be 1.') self.set_field_property(self.label_field, FeatureType.FLOAT, FeatureSource.INTERACTION, 1) for field, value in threshold.items(): if field in self.inter_feat: self.inter_feat[self.label_field] = (self.inter_feat[field] >= value).astype(int) else: - raise ValueError('field [{}] not in inter_feat'.format(field)) - self._del_col(field) + raise ValueError(f'Field [{field}] not in inter_feat.') + self._del_col(self.inter_feat, field) def _get_fields_in_same_space(self): """Parsing ``config['fields_in_same_space']``. See :doc:`../user_guide/data/data_args` for detail arg setting. 
@@ -818,14 +858,14 @@ def _get_fields_in_same_space(self): elif count == 1: continue else: - raise ValueError('field [{}] occurred in `fields_in_same_space` more than one time'.format(field)) + raise ValueError(f'Field [{field}] occurred in `fields_in_same_space` more than one time.') for field_set in fields_in_same_space: if self.uid_field in field_set and self.iid_field in field_set: raise ValueError('uid_field and iid_field can\'t in the same ID space') for field in field_set: if field not in token_like_fields: - raise ValueError('field [{}] is not a token-like field'.format(field)) + raise ValueError(f'Field [{field}] is not a token-like field.') fields_in_same_space.extend(additional) return fields_in_same_space @@ -859,7 +899,7 @@ def _get_remap_list(self, field_set): source = self.field2source[field] if isinstance(source, FeatureSource): source = source.value - feat = getattr(self, '{}_feat'.format(source)) + feat = getattr(self, f'{source}_feat') ftype = self.field2type[field] remap_list.append((feat, field, ftype)) return remap_list @@ -868,7 +908,7 @@ def _remap_ID_all(self): """Get ``config['fields_in_same_space']`` firstly, and remap each. """ fields_in_same_space = self._get_fields_in_same_space() - self.logger.debug('fields_in_same_space: {}'.format(fields_in_same_space)) + self.logger.debug(f'fields_in_same_space: {fields_in_same_space}') for field_set in fields_in_same_space: remap_list = self._get_remap_list(field_set) self._remap(remap_list) @@ -916,6 +956,13 @@ def _remap(self, remap_list): split_point = np.cumsum(feat[field].agg(len))[:-1] feat[field] = np.split(new_ids, split_point) + def _change_feat_format(self): + """Change feat format from :class:`pandas.DataFrame` to :class:`Interaction`. 
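The `_remap` machinery referenced above maps raw tokens sharing one ID space to contiguous internal ids. A sketch of the usual factorize-plus-padding scheme, assuming (as RecBole's token fields do) that internal id `0` is reserved for a `'[PAD]'` token:

```python
import numpy as np
import pandas as pd

# Factorize raw tokens into contiguous codes, then shift by one so that
# id 0 stays free for the padding token.
tokens = np.array(['u1', 'u3', 'u1', 'u2'])
new_ids, mp = pd.factorize(tokens)
new_ids = new_ids + 1
field2id_token = np.array(['[PAD]'] + list(mp))
```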
+ """ + for feat_name in self.feat_name_list: + feat = getattr(self, feat_name) + setattr(self, feat_name, self._dataframe_to_interaction(feat)) + @dlapi.set() def num(self, field): """Given ``field``, for token-like fields, return the number of different tokens after remapping, @@ -928,7 +975,7 @@ def num(self, field): int: The number of different tokens (``1`` if ``field`` is a float-like field). """ if field not in self.field2type: - raise ValueError('field [{}] not defined in dataset'.format(field)) + raise ValueError(f'Field [{field}] not defined in dataset.') if self.field2type[field] not in {FeatureType.TOKEN, FeatureType.TOKEN_SEQ}: return self.field2seqlen[field] else: @@ -1024,10 +1071,10 @@ def token2id(self, field, tokens): Args: field (str): Field of external tokens. - tokens (str, list or np.ndarray): External tokens. + tokens (str, list or numpy.ndarray): External tokens. Returns: - int or np.ndarray: The internal ids of external tokens. + int or numpy.ndarray: The internal ids of external tokens. """ if isinstance(tokens, str): if tokens in self.field2token_id[field]: @@ -1045,18 +1092,18 @@ def id2token(self, field, ids): Args: field (str): Field of internal ids. - ids (int, list, np.ndarray or torch.Tensor): Internal ids. + ids (int, list, numpy.ndarray or torch.Tensor): Internal ids. Returns: - str or np.ndarray: The external tokens of internal ids. + str or numpy.ndarray: The external tokens of internal ids. """ try: return self.field2id_token[field][ids] except IndexError: if isinstance(ids, list): - raise ValueError('[{}] is not a one-dimensional list'.format(ids)) + raise ValueError(f'[{ids}] is not a one-dimensional list.') else: - raise ValueError('[{}] is not a valid ids'.format(ids)) + raise ValueError(f'[{ids}] is not a valid ids.') @property @dlapi.set() @@ -1096,7 +1143,7 @@ def avg_actions_of_users(self): Returns: numpy.float64: Average number of users' interaction records. 
""" - return np.mean(self.inter_feat.groupby(self.uid_field).size()) + return np.mean(list(Counter(self.inter_feat[self.uid_field].numpy()).values())) @property def avg_actions_of_items(self): @@ -1105,7 +1152,7 @@ def avg_actions_of_items(self): Returns: numpy.float64: Average number of items' interaction records. """ - return np.mean(self.inter_feat.groupby(self.iid_field).size()) + return np.mean(list(Counter(self.inter_feat[self.iid_field].numpy()).values())) @property def sparsity(self): @@ -1116,31 +1163,6 @@ def sparsity(self): """ return 1 - self.inter_num / self.user_num / self.item_num - @property - def uid2index(self): - """Sort ``self.inter_feat``, - and get the mapping of user_id and index of its interaction records. - - Returns: - tuple: - - ``numpy.ndarray`` of tuple ``(uid, slice)``, - interaction records between slice are all belong to the same uid. - - ``numpy.ndarray`` of int, - representing number of interaction records of each user. - """ - self._check_field('uid_field') - self.sort(by=self.uid_field, ascending=True) - uid_list = [] - start, end = dict(), dict() - for i, uid in enumerate(self.inter_feat[self.uid_field].values): - if uid not in start: - uid_list.append(uid) - start[uid] = i - end[uid] = i - index = [(uid, slice(start[uid], end[uid] + 1)) for uid in uid_list] - uid2items_num = [end[uid] - start[uid] + 1 for uid in uid_list] - return np.array(index), np.array(uid2items_num) - def _check_field(self, *field_names): """Given a name of attribute, check if it's exist. @@ -1149,21 +1171,22 @@ def _check_field(self, *field_names): """ for field_name in field_names: if getattr(self, field_name, None) is None: - raise ValueError('{} isn\'t set'.format(field_name)) + raise ValueError(f'{field_name} isn\'t set.') + @dlapi.set() def join(self, df): """Given interaction feature, join user/item feature into it. Args: - df (pandas.DataFrame): Interaction feature to be joint. + df (Interaction): Interaction feature to be joint. 
Returns: - pandas.DataFrame: Interaction feature after joining operation. + Interaction: Interaction feature after joining operation. """ if self.user_feat is not None and self.uid_field in df: - df = pd.merge(df, self.user_feat, on=self.uid_field, how='left', suffixes=('_inter', '_user')) + df.update(self.user_feat[df[self.uid_field]]) if self.item_feat is not None and self.iid_field in df: - df = pd.merge(df, self.item_feat, on=self.iid_field, how='left', suffixes=('_inter', '_item')) + df.update(self.item_feat[df[self.iid_field]]) return df def __getitem__(self, index, join=True): @@ -1179,15 +1202,17 @@ def __repr__(self): def __str__(self): info = [self.dataset_name] if self.uid_field: - info.extend(['The number of users: {}'.format(self.user_num), - 'Average actions of users: {}'.format(self.avg_actions_of_users)]) + info.extend([ + f'The number of users: {self.user_num}', f'Average actions of users: {self.avg_actions_of_users}' + ]) if self.iid_field: - info.extend(['The number of items: {}'.format(self.item_num), - 'Average actions of items: {}'.format(self.avg_actions_of_items)]) - info.append('The number of inters: {}'.format(self.inter_num)) + info.extend([ + f'The number of items: {self.item_num}', f'Average actions of items: {self.avg_actions_of_items}' + ]) + info.append(f'The number of inters: {self.inter_num}') if self.uid_field and self.iid_field: - info.append('The sparsity of the dataset: {}%'.format(self.sparsity * 100)) - info.append('Remain Fields: {}'.format(list(self.field2type))) + info.append(f'The sparsity of the dataset: {self.sparsity * 100}%') + info.append(f'Remain Fields: {list(self.field2type)}') return '\n'.join(info) def copy(self, new_inter_feat): @@ -1195,7 +1220,7 @@ def copy(self, new_inter_feat): whose interaction feature is updated with ``new_inter_feat``, and all the other attributes the same. Args: - new_inter_feat (pandas.DataFrame): The new interaction feature need to be updated. 
+ new_inter_feat (Interaction): The new interaction feature need to be updated. Returns: :class:`~Dataset`: the new :class:`~Dataset` object, whose interaction feature has been updated. @@ -1204,6 +1229,32 @@ def copy(self, new_inter_feat): nxt.inter_feat = new_inter_feat return nxt + def _drop_unused_col(self): + """Drop columns which are loaded for data preparation but not used in model. + """ + unused_col = self.config['unused_col'] + if unused_col is None: + return + + for feat_name, unused_fields in unused_col.items(): + feat = getattr(self, feat_name + '_feat') + for field in unused_fields: + if field not in feat: + self.logger.warning( + f'Field [{field}] is not in [{feat_name}_feat], which can not be set in `unused_col`.' + ) + continue + self._del_col(feat, field) + + def _grouped_index(self, group_by_list): + index = {} + for i, key in enumerate(group_by_list): + if key not in index: + index[key] = [i] + else: + index[key].append(i) + return index.values() + def _calcu_split_ids(self, tot, ratios): """Given split ratios, and total number, calculate the number of each part after splitting. @@ -1230,12 +1281,12 @@ def split_by_ratio(self, ratios, group_by=None): Defaults to ``None`` Returns: - list: List of :class:`~Dataset`, whose interaction features has been splitted. + list: List of :class:`~Dataset`, whose interaction features has been split. Note: Other than the first one, each part is rounded down. 
""" - self.logger.debug('split by ratios [{}], group_by=[{}]'.format(ratios, group_by)) + self.logger.debug(f'split by ratios [{ratios}], group_by=[{group_by}]') tot_ratio = sum(ratios) ratios = [_ / tot_ratio for _ in ratios] @@ -1244,15 +1295,16 @@ def split_by_ratio(self, ratios, group_by=None): split_ids = self._calcu_split_ids(tot=tot_cnt, ratios=ratios) next_index = [range(start, end) for start, end in zip([0] + split_ids, split_ids + [tot_cnt])] else: - grouped_inter_feat_index = self.inter_feat.groupby(by=group_by).groups.values() - next_index = [[] for i in range(len(ratios))] + grouped_inter_feat_index = self._grouped_index(self.inter_feat[group_by].numpy()) + next_index = [[] for _ in range(len(ratios))] for grouped_index in grouped_inter_feat_index: tot_cnt = len(grouped_index) split_ids = self._calcu_split_ids(tot=tot_cnt, ratios=ratios) for index, start, end in zip(next_index, [0] + split_ids, split_ids + [tot_cnt]): - index.extend(grouped_index[start: end]) + index.extend(grouped_index[start:end]) - next_df = [self.inter_feat.loc[index].reset_index(drop=True) for index in next_index] + self._drop_unused_col() + next_df = [self.inter_feat[index] for index in next_index] next_ds = [self.copy(_) for _ in next_df] return next_ds @@ -1260,13 +1312,13 @@ def _split_index_by_leave_one_out(self, grouped_index, leave_one_num): """Split indexes by strategy leave one out. Args: - grouped_index (pandas.DataFrameGroupBy): Index to be splitted. + grouped_index (list of list of int): Index to be split. leave_one_num (int): Number of parts whose length is expected to be ``1``. Returns: - list: List of index that has been splitted. + list: List of index that has been split. """ - next_index = [[] for i in range(leave_one_num + 1)] + next_index = [[] for _ in range(leave_one_num + 1)] for index in grouped_index: index = list(index) tot_cnt = len(index) @@ -1287,32 +1339,34 @@ def leave_one_out(self, group_by, leave_one_num=1): Defaults to ``1``. 
Returns: - list: List of :class:`~Dataset`, whose interaction features has been splitted. + list: List of :class:`~Dataset`, whose interaction features has been split. """ - self.logger.debug('leave one out, group_by=[{}], leave_one_num=[{}]'.format(group_by, leave_one_num)) + self.logger.debug(f'leave one out, group_by=[{group_by}], leave_one_num=[{leave_one_num}]') if group_by is None: raise ValueError('leave one out strategy require a group field') - grouped_inter_feat_index = self.inter_feat.groupby(by=group_by).groups.values() + grouped_inter_feat_index = self._grouped_index(self.inter_feat[group_by].numpy()) next_index = self._split_index_by_leave_one_out(grouped_inter_feat_index, leave_one_num) - next_df = [self.inter_feat.loc[index].reset_index(drop=True) for index in next_index] + + self._drop_unused_col() + next_df = [self.inter_feat[index] for index in next_index] next_ds = [self.copy(_) for _ in next_df] return next_ds def shuffle(self): """Shuffle the interaction records inplace. """ - self.inter_feat = self.inter_feat.sample(frac=1).reset_index(drop=True) + self.inter_feat.shuffle() def sort(self, by, ascending=True): """Sort the interaction records inplace. Args: - by (str): Field that as the key in the sorting process. - ascending (bool, optional): Results are ascending if ``True``, otherwise descending. + by (str or list of str): Field that as the key in the sorting process. + ascending (bool or list of bool, optional): Results are ascending if ``True``, otherwise descending. Defaults to ``True`` """ - self.inter_feat.sort_values(by=by, ascending=ascending, inplace=True, ignore_index=True) + self.inter_feat.sort(by=by, ascending=ascending) def build(self, eval_setting): """Processing dataset according to evaluation setting, including Group, Order and Split. @@ -1323,8 +1377,13 @@ def build(self, eval_setting): Object contains evaluation settings, which guide the data processing procedure. Returns: - list: List of builded :class:`Dataset`. 
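`_split_index_by_leave_one_out` peels the last `leave_one_num` records of each group into single-record evaluation parts, with everything earlier going to the first part. A standalone sketch (the function name is illustrative):

```python
def split_leave_one_out(grouped_index, leave_one_num=1):
    # Part 0 collects each group's head; each of the last `leave_one_num`
    # records per group goes to its own part.
    next_index = [[] for _ in range(leave_one_num + 1)]
    for index in grouped_index:
        index = list(index)
        train_cnt = max(len(index) - leave_one_num, 0)
        next_index[0].extend(index[:train_cnt])
        for i, idx in enumerate(index[train_cnt:]):
            next_index[i + 1].append(idx)
    return next_index
```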
+ list: List of built :class:`Dataset`. """ + if self.benchmark_filename_list is not None: + cumsum = list(np.cumsum(self.file_size_list)) + datasets = [self.copy(self.inter_feat[start:end]) for start, end in zip([0] + cumsum[:-1], cumsum)] + return datasets + ordering_args = eval_setting.ordering_args if ordering_args['strategy'] == 'shuffle': self.shuffle() @@ -1352,9 +1411,9 @@ def save(self, filepath): filepath (str): path of saved dir. """ if (filepath is None) or (not os.path.isdir(filepath)): - raise ValueError('filepath [{}] need to be a dir'.format(filepath)) + raise ValueError(f'Filepath [{filepath}] need to be a dir.') - self.logger.debug('Saving into [{}]'.format(filepath)) + self.logger.debug(f'Saving into [{filepath}]') basic_info = { 'field2type': self.field2type, 'field2source': self.field2source, @@ -1367,29 +1426,31 @@ def save(self, filepath): feats = ['inter', 'user', 'item'] for name in feats: - df = getattr(self, '{}_feat'.format(name)) + df = getattr(self, f'{name}_feat') if df is not None: - df.to_csv(os.path.join(filepath, '{}.csv'.format(name))) + df.to_csv(os.path.join(filepath, f'{name}.csv')) + @dlapi.set() def get_user_feature(self): """ Returns: - pandas.DataFrame: user features + Interaction: user features """ if self.user_feat is None: self._check_field('uid_field') - return pd.DataFrame({self.uid_field: np.arange(self.user_num)}) + return Interaction({self.uid_field: torch.arange(self.user_num)}) else: return self.user_feat + @dlapi.set() def get_item_feature(self): """ Returns: - pandas.DataFrame: item features + Interaction: item features """ if self.item_feat is None: self._check_field('iid_field') - return pd.DataFrame({self.iid_field: np.arange(self.item_num)}) + return Interaction({self.iid_field: torch.arange(self.item_num)}) else: return self.item_feat @@ -1404,7 +1465,7 @@ def _create_sparse_matrix(self, df_feat, source_field, target_field, form='coo', else ``matrix[src, tgt] = df_feat[value_field][src, tgt]``. 
Args: - df_feat (pandas.DataFrame): Feature where src and tgt exist. + df_feat (Interaction): Feature where src and tgt exist. source_field (str): Source field target_field (str): Target field form (str, optional): Sparse matrix format. Defaults to ``coo``. @@ -1414,14 +1475,14 @@ def _create_sparse_matrix(self, df_feat, source_field, target_field, form='coo', Returns: scipy.sparse: Sparse matrix in form ``coo`` or ``csr``. """ - src = df_feat[source_field].values - tgt = df_feat[target_field].values + src = df_feat[source_field] + tgt = df_feat[target_field] if value_field is None: data = np.ones(len(df_feat)) else: - if value_field not in df_feat.columns: - raise ValueError('value_field [{}] should be one of `df_feat`\'s features.'.format(value_field)) - data = df_feat[value_field].values + if value_field not in df_feat: + raise ValueError(f'Value_field [{value_field}] should be one of `df_feat`\'s features.') + data = df_feat[value_field] mat = coo_matrix((data, (src, tgt)), shape=(self.num(source_field), self.num(target_field))) if form == 'coo': @@ -1429,9 +1490,9 @@ def _create_sparse_matrix(self, df_feat, source_field, target_field, form='coo', elif form == 'csr': return mat.tocsr() else: - raise NotImplementedError('sparse matrix format [{}] has not been implemented.'.format(form)) + raise NotImplementedError(f'Sparse matrix format [{form}] has not been implemented.') - def _create_graph(self, df_feat, source_field, target_field, form='dgl', value_field=None): + def _create_graph(self, tensor_feat, source_field, target_field, form='dgl', value_field=None): """Get graph that describe relations between two fields. Source and target should be token-like fields. @@ -1442,7 +1503,7 @@ def _create_graph(self, df_feat, source_field, target_field, form='dgl', value_f Currently, we support graph in `DGL`_ and `PyG`_. Args: - df_feat (pandas.DataFrame): Feature where src and tgt exist. + tensor_feat (Interaction): Feature where src and tgt exist. 
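`_create_sparse_matrix` builds a `scipy.sparse.coo_matrix` from the source/target id columns, with every entry defaulting to `1` when no `value_field` is given. A toy instance of that construction:

```python
import numpy as np
from scipy.sparse import coo_matrix

# Two users, three items: mat[src, tgt] = 1 for each interaction.
src = np.array([0, 1, 1])
tgt = np.array([2, 0, 1])
data = np.ones(len(src))
mat = coo_matrix((data, (src, tgt)), shape=(2, 3))
dense = mat.toarray()
```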
source_field (str): Source field target_field (str): Target field form (str, optional): Library of graph data structure. Defaults to ``dgl``. @@ -1458,7 +1519,6 @@ def _create_graph(self, df_feat, source_field, target_field, form='dgl', value_f .. _PyG: https://github.com/rusty1s/pytorch_geometric """ - tensor_feat = self._dataframe_to_interaction(df_feat) src = tensor_feat[source_field] tgt = tensor_feat[target_field] @@ -1477,8 +1537,9 @@ def _create_graph(self, df_feat, source_field, target_field, form='dgl', value_f graph = Data(edge_index=torch.stack([src, tgt]), edge_attr=edge_attr) return graph else: - raise NotImplementedError('graph format [{}] has not been implemented.'.format(form)) + raise NotImplementedError(f'Graph format [{form}] has not been implemented.') + @dlapi.set() def inter_matrix(self, form='coo', value_field=None): """Get sparse matrix that describe interactions between user_id and item_id. @@ -1496,7 +1557,7 @@ def inter_matrix(self, form='coo', value_field=None): scipy.sparse: Sparse matrix in form ``coo`` or ``csr``. 
""" if not self.uid_field or not self.iid_field: - raise ValueError('dataset doesn\'t exist uid/iid, thus can not converted to sparse matrix') + raise ValueError('dataset does not exist uid/iid, thus can not converted to sparse matrix.') return self._create_sparse_matrix(self.inter_feat, self.uid_field, self.iid_field, form, value_field) def _history_matrix(self, row, value_field=None): @@ -1524,13 +1585,13 @@ def _history_matrix(self, row, value_field=None): """ self._check_field('uid_field', 'iid_field') - user_ids, item_ids = self.inter_feat[self.uid_field].values, self.inter_feat[self.iid_field].values + user_ids, item_ids = self.inter_feat[self.uid_field].numpy(), self.inter_feat[self.iid_field].numpy() if value_field is None: values = np.ones(len(self.inter_feat)) else: - if value_field not in self.inter_feat.columns: - raise ValueError('value_field [{}] should be one of `inter_feat`\'s features.'.format(value_field)) - values = self.inter_feat[value_field].values + if value_field not in self.inter_feat: + raise ValueError(f'Value_field [{value_field}] should be one of `inter_feat`\'s features.') + values = self.inter_feat[value_field].numpy() if row == 'user': row_num, max_col_num = self.user_num, self.item_num @@ -1545,9 +1606,10 @@ def _history_matrix(self, row, value_field=None): col_num = np.max(history_len) if col_num > max_col_num * 0.2: - self.logger.warning('max value of {}\'s history interaction records has reached {}% of the total'.format( - row, col_num / max_col_num * 100, - )) + self.logger.warning( + f'Max value of {row}\'s history interaction records has reached ' + f'{col_num / max_col_num * 100}% of the total.' 
+ ) history_matrix = np.zeros((row_num, col_num), dtype=np.int64) history_value = np.zeros((row_num, col_num)) @@ -1559,6 +1621,7 @@ def _history_matrix(self, row, value_field=None): return torch.LongTensor(history_matrix), torch.FloatTensor(history_value), torch.LongTensor(history_len) + @dlapi.set() def history_item_matrix(self, value_field=None): """Get dense matrix describe user's history interaction records. @@ -1583,6 +1646,7 @@ def history_item_matrix(self, value_field=None): """ return self._history_matrix(row='user', value_field=value_field) + @dlapi.set() def history_user_matrix(self, value_field=None): """Get dense matrix describe item's history interaction records. @@ -1620,11 +1684,10 @@ def get_preload_weight(self, field): numpy.ndarray: preloaded weight matrix. See :doc:`../user_guide/data/data_args` for details. """ if field not in self._preloaded_weight: - raise ValueError('field [{}] not in preload_weight'.format(field)) + raise ValueError(f'Field [{field}] not in preload_weight') return self._preloaded_weight[field] - @dlapi.set() - def _dataframe_to_interaction(self, data, *args): + def _dataframe_to_interaction(self, data): """Convert :class:`pandas.DataFrame` to :class:`~recbole.data.interaction.Interaction`. Args: @@ -1633,37 +1696,18 @@ def _dataframe_to_interaction(self, data, *args): Returns: :class:`~recbole.data.interaction.Interaction`: Converted data. """ - data = data.to_dict(orient='list') - return self._dict_to_interaction(data, *args) - - @dlapi.set() - def _dict_to_interaction(self, data, *args): - """Convert :class:`dict` to :class:`~recbole.data.interaction.Interaction`. - - Args: - data (dict): data to be converted. - - Returns: - :class:`~recbole.data.interaction.Interaction`: Converted data. 
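`_history_matrix` produces a dense, zero-padded matrix of each user's (or item's) interaction history, together with a per-row length vector. A numpy-only sketch for `row='user'` (the real method additionally wraps the results in torch tensors):

```python
import numpy as np

user_ids = np.array([0, 0, 1])
item_ids = np.array([3, 1, 2])
user_num = 2

# Count each user's interactions to size the padded matrix.
history_len = np.zeros(user_num, dtype=np.int64)
for uid in user_ids:
    history_len[uid] += 1

# Fill row `uid` with that user's item ids; unused slots stay 0.
history_matrix = np.zeros((user_num, history_len.max()), dtype=np.int64)
cursor = np.zeros(user_num, dtype=np.int64)
for uid, iid in zip(user_ids, item_ids):
    history_matrix[uid, cursor[uid]] = iid
    cursor[uid] += 1
```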
- """ + new_data = {} for k in data: + value = data[k].values ftype = self.field2type[k] if ftype == FeatureType.TOKEN: - data[k] = torch.LongTensor(data[k]) + new_data[k] = torch.LongTensor(value) elif ftype == FeatureType.FLOAT: - data[k] = torch.FloatTensor(data[k]) + new_data[k] = torch.FloatTensor(value) elif ftype == FeatureType.TOKEN_SEQ: - if isinstance(data[k], np.ndarray): - data[k] = torch.LongTensor(data[k][:, :self.field2seqlen[k]]) - else: - seq_data = [torch.LongTensor(d[:self.field2seqlen[k]]) for d in data[k]] - data[k] = rnn_utils.pad_sequence(seq_data, batch_first=True) + seq_data = [torch.LongTensor(d[:self.field2seqlen[k]]) for d in value] + new_data[k] = rnn_utils.pad_sequence(seq_data, batch_first=True) elif ftype == FeatureType.FLOAT_SEQ: - if isinstance(data[k], np.ndarray): - data[k] = torch.FloatTensor(data[k][:, :self.field2seqlen[k]]) - else: - seq_data = [torch.FloatTensor(d[:self.field2seqlen[k]]) for d in data[k]] - data[k] = rnn_utils.pad_sequence(seq_data, batch_first=True) - else: - raise ValueError('Illegal ftype [{}]'.format(ftype)) - return Interaction(data, *args) + seq_data = [torch.FloatTensor(d[:self.field2seqlen[k]]) for d in value] + new_data[k] = rnn_utils.pad_sequence(seq_data, batch_first=True) + return Interaction(new_data) diff --git a/recbole/data/dataset/kg_dataset.py b/recbole/data/dataset/kg_dataset.py index 06855c310..11b8f20f6 100644 --- a/recbole/data/dataset/kg_dataset.py +++ b/recbole/data/dataset/kg_dataset.py @@ -16,9 +16,8 @@ from collections import Counter import numpy as np -import pandas as pd -from scipy.sparse import coo_matrix import torch +from scipy.sparse import coo_matrix from recbole.data.dataset import Dataset from recbole.data.utils import dlapi @@ -59,10 +58,11 @@ class KnowledgeBasedDataset(Dataset): Note: :attr:`entity_field` doesn't exist exactly. It's only a symbol, - representing entitiy features. E.g. it can be written into ``config['fields_in_same_space']``. 
+ representing entity features. E.g. it can be written into ``config['fields_in_same_space']``. ``[UI-Relation]`` is a special relation token. """ + def __init__(self, config, saved_dataset=None): super().__init__(config, saved_dataset=saved_dataset) @@ -80,8 +80,8 @@ def _get_field_from_config(self): self._check_field('head_entity_field', 'tail_entity_field', 'relation_field', 'entity_field') self.set_field_property(self.entity_field, FeatureType.TOKEN, FeatureSource.KG, 1) - self.logger.debug('relation_field: {}'.format(self.relation_field)) - self.logger.debug('entity_field: {}'.format(self.entity_field)) + self.logger.debug(f'relation_field: {self.relation_field}') + self.logger.debug(f'entity_field: {self.entity_field}') def _data_processing(self): self._set_field2ent_level() @@ -116,18 +116,20 @@ def _load_data(self, token, dataset_path): self.item2entity, self.entity2item = self._load_link(self.dataset_name, self.dataset_path) def __str__(self): - info = [super().__str__(), - 'The number of entities: {}'.format(self.entity_num), - 'The number of relations: {}'.format(self.relation_num), - 'The number of triples: {}'.format(len(self.kg_feat)), - 'The number of items that have been linked to KG: {}'.format(len(self.item2entity))] + info = [ + super().__str__(), + f'The number of entities: {self.entity_num}', + f'The number of relations: {self.relation_num}', + f'The number of triples: {len(self.kg_feat)}', + f'The number of items that have been linked to KG: {len(self.item2entity)}' + ] # yapf: disable return '\n'.join(info) - def _build_feat_list(self): - feat_list = super()._build_feat_list() + def _build_feat_name_list(self): + feat_name_list = super()._build_feat_name_list() if self.kg_feat is not None: - feat_list.append(self.kg_feat) - return feat_list + feat_name_list.append('kg_feat') + return feat_name_list def _restore_saved_dataset(self, saved_dataset): raise NotImplementedError() @@ -136,10 +138,10 @@ def save(self, filepath): raise 
NotImplementedError() def _load_kg(self, token, dataset_path): - self.logger.debug('loading kg from [{}]'.format(dataset_path)) - kg_path = os.path.join(dataset_path, '{}.{}'.format(token, 'kg')) + self.logger.debug(f'Loading kg from [{dataset_path}].') + kg_path = os.path.join(dataset_path, f'{token}.kg') if not os.path.isfile(kg_path): - raise ValueError('[{}.{}] not found in [{}]'.format(token, 'kg', dataset_path)) + raise ValueError(f'[{token}.kg] not found in [{dataset_path}].') df = self._load_feat(kg_path, FeatureSource.KG) self._check_kg(df) return df @@ -151,10 +153,10 @@ def _check_kg(self, kg): assert self.relation_field in kg, kg_warn_message.format(self.relation_field) def _load_link(self, token, dataset_path): - self.logger.debug('loading link from [{}]'.format(dataset_path)) - link_path = os.path.join(dataset_path, '{}.{}'.format(token, 'link')) + self.logger.debug(f'Loading link from [{dataset_path}].') + link_path = os.path.join(dataset_path, f'{token}.link') if not os.path.isfile(link_path): - raise ValueError('[{}.{}] not found in [{}]'.format(token, 'link', dataset_path)) + raise ValueError(f'[{token}.link] not found in [{dataset_path}].') df = self._load_feat(link_path, 'link') self._check_link(df) @@ -179,9 +181,7 @@ def _get_fields_in_same_space(self): - ``head_entity_id`` and ``target_entity_id`` should be remapped with ``item_id``. 
""" fields_in_same_space = super()._get_fields_in_same_space() - fields_in_same_space = [ - _ for _ in fields_in_same_space if not self._contain_ent_field(_) - ] + fields_in_same_space = [_ for _ in fields_in_same_space if not self._contain_ent_field(_)] ent_fields = self._get_ent_fields_in_same_space() for field_set in fields_in_same_space: if self.iid_field in field_set: @@ -207,7 +207,7 @@ def _get_ent_fields_in_same_space(self): if self._contain_ent_field(field_set): field_set = self._remove_ent_field(field_set) ent_fields.update(field_set) - self.logger.debug('ent_fields: {}'.format(fields_in_same_space)) + self.logger.debug(f'ent_fields: {fields_in_same_space}') return ent_fields def _remove_ent_field(self, field_set): @@ -268,7 +268,7 @@ def _remap_entities_by_link(self): source = self.field2source[ent_field] if not isinstance(source, str): source = source.value - feat = getattr(self, '{}_feat'.format(source)) + feat = getattr(self, f'{source}_feat') entity_list = feat[ent_field].values for i, entity_id in enumerate(entity_list): if entity_id in self.entity2item: @@ -309,7 +309,7 @@ def _reset_ent_remapID(self, field, new_id_token): if self.item_feat is not None: feats.append(self.item_feat) else: - feats = [getattr(self, '{}_feat'.format(source))] + feats = [getattr(self, f'{source}_feat')] for feat in feats: old_idx = feat[field].values new_idx = np.array([idmap[_] for _ in old_idx]) @@ -382,7 +382,7 @@ def head_entities(self): Returns: numpy.ndarray: List of head entities of kg triplets. """ - return self.kg_feat[self.head_entity_field].values + return self.kg_feat[self.head_entity_field].numpy() @property @dlapi.set() @@ -391,7 +391,7 @@ def tail_entities(self): Returns: numpy.ndarray: List of tail entities of kg triplets. """ - return self.kg_feat[self.tail_entity_field].values + return self.kg_feat[self.tail_entity_field].numpy() @property @dlapi.set() @@ -400,7 +400,7 @@ def relations(self): Returns: numpy.ndarray: List of relations of kg triplets. 
""" - return self.kg_feat[self.relation_field].values + return self.kg_feat[self.relation_field].numpy() @property @dlapi.set() @@ -419,7 +419,7 @@ def kg_graph(self, form='coo', value_field=None): else ``graph[src, tgt] = self.kg_feat[value_field][src, tgt]``. Currently, we support graph in `DGL`_ and `PyG`_, - and two type of sparse matrixes, ``coo`` and ``csr``. + and two type of sparse matrices, ``coo`` and ``csr``. Args: form (str, optional): Format of sparse matrix, or library of graph data structure. @@ -447,11 +447,11 @@ def kg_graph(self, form='coo', value_field=None): def _create_ckg_sparse_matrix(self, form='coo', show_relation=False): user_num = self.user_num - hids = self.kg_feat[self.head_entity_field].values + user_num - tids = self.kg_feat[self.tail_entity_field].values + user_num + hids = self.head_entities + user_num + tids = self.tail_entities + user_num - uids = self.inter_feat[self.uid_field].values - iids = self.inter_feat[self.iid_field].values + user_num + uids = self.inter_feat[self.uid_field].numpy() + iids = self.inter_feat[self.iid_field].numpy() + user_num ui_rel_num = len(uids) ui_rel_id = self.relation_num - 1 @@ -463,7 +463,7 @@ def _create_ckg_sparse_matrix(self, form='coo', show_relation=False): if not show_relation: data = np.ones(len(src)) else: - kg_rel = self.kg_feat[self.relation_field].values + kg_rel = self.kg_feat[self.relation_field].numpy() ui_rel = np.full(2 * ui_rel_num, ui_rel_id, dtype=kg_rel.dtype) data = np.concatenate([ui_rel, kg_rel]) node_num = self.entity_num + self.user_num @@ -473,13 +473,13 @@ def _create_ckg_sparse_matrix(self, form='coo', show_relation=False): elif form == 'csr': return mat.tocsr() else: - raise NotImplementedError('sparse matrix format [{}] has not been implemented.'.format(form)) + raise NotImplementedError(f'Sparse matrix format [{form}] has not been implemented.') def _create_ckg_graph(self, form='dgl', show_relation=False): user_num = self.user_num - kg_tensor = 
self._dataframe_to_interaction(self.kg_feat)
-        inter_tensor = self._dataframe_to_interaction(self.inter_feat)
+        kg_tensor = self.kg_feat
+        inter_tensor = self.inter_feat

         head_entity = kg_tensor[self.head_entity_field] + user_num
         tail_entity = kg_tensor[self.tail_entity_field] + user_num
@@ -510,7 +510,7 @@ def _create_ckg_graph(self, form='dgl', show_relation=False):
             graph = Data(edge_index=torch.stack([src, tgt]), edge_attr=edge_attr)
             return graph
         else:
-            raise NotImplementedError('graph format [{}] has not been implemented.'.format(form))
+            raise NotImplementedError(f'Graph format [{form}] has not been implemented.')

     @dlapi.set()
     def ckg_graph(self, form='coo', value_field=None):
@@ -524,7 +524,7 @@ def ckg_graph(self, form='coo', value_field=None):
         or ``graph[src, tgt] = [UI-Relation]``.

         Currently, we support graph in `DGL`_ and `PyG`_,
-        and two type of sparse matrixes, ``coo`` and ``csr``.
+        and two types of sparse matrices, ``coo`` and ``csr``.

         Args:
             form (str, optional): Format of sparse matrix, or library of graph data structure.
@@ -542,9 +542,7 @@ def ckg_graph(self, form='coo', value_field=None):
             https://github.com/rusty1s/pytorch_geometric
         """
         if value_field is not None and value_field != self.relation_field:
-            raise ValueError('value_field [{}] can only be [{}] in ckg_graph.'.format(
-                value_field, self.relation_field
-            ))
+            raise ValueError(f'value_field [{value_field}] can only be [{self.relation_field}] in ckg_graph.')
         show_relation = value_field is not None

         if form in ['coo', 'csr']:
diff --git a/recbole/data/dataset/kg_seq_dataset.py b/recbole/data/dataset/kg_seq_dataset.py
index 90a34800c..c1c2e3102 100644
--- a/recbole/data/dataset/kg_seq_dataset.py
+++ b/recbole/data/dataset/kg_seq_dataset.py
@@ -16,5 +16,6 @@ class Kg_Seq_Dataset(SequentialDataset, KnowledgeBasedDataset):
     Inherit from :class:`~recbole.data.dataset.sequential_dataset.SequentialDataset` and
     :class:`~recbole.data.dataset.kg_dataset.KnowledgeBasedDataset`.
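The `_create_ckg_sparse_matrix` hunk above stacks user-item edges (in both directions, labelled with the reserved `[UI-Relation]` id `relation_num - 1`) on top of the KG triples, all inside one `(user_num + entity_num)`-sized adjacency. A toy sketch of that layout with made-up ids and sizes, not RecBole's real fields:

```python
import numpy as np
from scipy.sparse import coo_matrix

user_num, entity_num, relation_num = 2, 3, 4   # toy sizes; users occupy ids [0, user_num)
ui_rel_id = relation_num - 1                   # the reserved [UI-Relation] id

uids = np.array([0, 1])                        # two user-item interactions
iids = np.array([0, 0]) + user_num             # items live in the entity id space
hids = np.array([0, 1]) + user_num             # KG heads, shifted past the user ids
tids = np.array([1, 2]) + user_num             # KG tails
kg_rel = np.array([1, 2])                      # relation ids of the two triples

src = np.concatenate([uids, iids, hids])       # user->item, item->user, then KG edges
tgt = np.concatenate([iids, uids, tids])
data = np.concatenate([np.full(2 * len(uids), ui_rel_id, dtype=kg_rel.dtype), kg_rel])

node_num = user_num + entity_num
mat = coo_matrix((data, (src, tgt)), shape=(node_num, node_num))
```

`form='csr'` is then just `mat.tocsr()`, exactly as in the patch.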
""" + def __init__(self, config, saved_dataset=None): super().__init__(config, saved_dataset=saved_dataset) diff --git a/recbole/data/dataset/sequential_dataset.py b/recbole/data/dataset/sequential_dataset.py index 4f2af4139..1720d2865 100644 --- a/recbole/data/dataset/sequential_dataset.py +++ b/recbole/data/dataset/sequential_dataset.py @@ -12,10 +12,10 @@ ############################### """ -import numpy as np -import pandas as pd import copy +import numpy as np + from recbole.data.dataset import Dataset @@ -34,6 +34,7 @@ class SequentialDataset(Dataset): item_list_length (numpy.ndarray): List of item sequences' length after augmentation. """ + def __init__(self, config, saved_dataset=None): super().__init__(config, saved_dataset=saved_dataset) @@ -54,20 +55,13 @@ def prepare_data_augmentation(self): ``u1, <i1, i2, i3> | i4`` - Returns: - Tuple of ``self.uid_list``, ``self.item_list_index``, - ``self.target_index``, ``self.item_list_length``. - See :class:`SequentialDataset`'s attributes for details. - Note: - Actually, we do not realy generate these new item sequences. + Actually, we do not really generate these new item sequences. One user's item sequence is stored only once in memory. We store the index (slice) of each item sequence after augmentation, which saves memory and accelerates a lot. 
""" self.logger.debug('prepare_data_augmentation') - if hasattr(self, 'uid_list'): - return self.uid_list, self.item_list_index, self.target_index, self.item_list_length self._check_field('uid_field', 'time_field') max_item_list_len = self.config['MAX_ITEM_LIST_LENGTH'] @@ -75,7 +69,7 @@ def prepare_data_augmentation(self): last_uid = None uid_list, item_list_index, target_index, item_list_length = [], [], [], [] seq_start = 0 - for i, uid in enumerate(self.inter_feat[self.uid_field].values): + for i, uid in enumerate(self.inter_feat[self.uid_field].numpy()): if last_uid != uid: last_uid = uid seq_start = i @@ -90,17 +84,20 @@ def prepare_data_augmentation(self): self.uid_list = np.array(uid_list) self.item_list_index = np.array(item_list_index) self.target_index = np.array(target_index) - self.item_list_length = np.array(item_list_length) - return self.uid_list, self.item_list_index, self.target_index, self.item_list_length + self.item_list_length = np.array(item_list_length, dtype=np.int64) def leave_one_out(self, group_by, leave_one_num=1): - self.logger.debug('leave one out, group_by=[{}], leave_one_num=[{}]'.format(group_by, leave_one_num)) + self.logger.debug(f'Leave one out, group_by=[{group_by}], leave_one_num=[{leave_one_num}].') if group_by is None: - raise ValueError('leave one out strategy require a group field') + raise ValueError('Leave one out strategy require a group field.') + if group_by != self.uid_field: + raise ValueError('Sequential models require group by user.') self.prepare_data_augmentation() - grouped_index = pd.DataFrame(self.uid_list).groupby(by=0).groups.values() + grouped_index = self._grouped_index(self.uid_list) next_index = self._split_index_by_leave_one_out(grouped_index, leave_one_num) + + self._drop_unused_col() next_ds = [] for index in next_index: ds = copy.copy(self) @@ -108,3 +105,43 @@ def leave_one_out(self, group_by, leave_one_num=1): setattr(ds, field, np.array(getattr(ds, field)[index])) next_ds.append(ds) return 
next_ds
+
+    def inter_matrix(self, form='coo', value_field=None):
+        """Get sparse matrix that describes interactions between user_id and item_id.
+        Sparse matrix has shape (user_num, item_num).
+        For a row of <src, tgt>, ``matrix[src, tgt] = 1`` if ``value_field`` is ``None``,
+        else ``matrix[src, tgt] = self.inter_feat[src, tgt]``.
+
+        Args:
+            form (str, optional): Sparse matrix format. Defaults to ``coo``.
+            value_field (str, optional): Data of sparse matrix, which should exist in ``df_feat``.
+                Defaults to ``None``.
+
+        Returns:
+            scipy.sparse: Sparse matrix in form ``coo`` or ``csr``.
+        """
+        if not self.uid_field or not self.iid_field:
+            raise ValueError('dataset does not have uid/iid, thus can not be converted to sparse matrix.')
+
+        self.logger.warning('Loading the interaction matrix may lead to label leakage from the testing phase; '
+                            'this implementation only provides the interactions of the current phase.')
+        local_inter_feat = self.inter_feat[self.uid_list]
+        return self._create_sparse_matrix(local_inter_feat, self.uid_field, self.iid_field, form, value_field)
+
+    def build(self, eval_setting):
+        ordering_args = eval_setting.ordering_args
+        if ordering_args['strategy'] == 'shuffle':
+            raise ValueError('Ordering strategy `shuffle` is not supported in sequential models.')
+        elif ordering_args['strategy'] == 'by':
+            if ordering_args['field'] != self.time_field:
+                raise ValueError('Sequential models require `TO` (time ordering) strategy.')
+            if ordering_args['ascending'] is not True:
+                raise ValueError('Sequential models require `time_field` to sort in ascending order.')
+
+        group_field = eval_setting.group_field
+
+        split_args = eval_setting.split_args
+        if split_args['strategy'] == 'loo':
+            return self.leave_one_out(group_by=group_field, leave_one_num=split_args['leave_one_num'])
+        else:
+            raise ValueError('Sequential models require `loo` (leave one out) split strategy.')
diff --git a/recbole/data/dataset/social_dataset.py b/recbole/data/dataset/social_dataset.py
index 9cf4a93c9..f53016ccf 100644 --- a/recbole/data/dataset/social_dataset.py +++ b/recbole/data/dataset/social_dataset.py @@ -14,9 +14,6 @@ import os -import numpy as np -from scipy.sparse import coo_matrix - from recbole.data.dataset import Dataset from recbole.data.utils import dlapi from recbole.utils import FeatureSource @@ -37,6 +34,7 @@ class SocialDataset(Dataset): net_feat (pandas.DataFrame): Internal data structure stores the network features. It's loaded from file ``.net``. """ + def __init__(self, config, saved_dataset=None): super().__init__(config, saved_dataset=saved_dataset) @@ -47,8 +45,8 @@ def _get_field_from_config(self): self.target_field = self.config['TARGET_ID_FIELD'] self._check_field('source_field', 'target_field') - self.logger.debug('source_id_field: {}'.format(self.source_field)) - self.logger.debug('target_id_field: {}'.format(self.target_field)) + self.logger.debug(f'source_id_field: {self.source_field}') + self.logger.debug(f'target_id_field: {self.target_field}') def _load_data(self, token, dataset_path): """Load ``.net`` additionally. 
@@ -56,22 +54,22 @@ def _load_data(self, token, dataset_path):
         super()._load_data(token, dataset_path)
         self.net_feat = self._load_net(self.dataset_name, self.dataset_path)

-    def _build_feat_list(self):
-        feat_list = super()._build_feat_list()
+    def _build_feat_name_list(self):
+        feat_name_list = super()._build_feat_name_list()
         if self.net_feat is not None:
-            feat_list.append(self.net_feat)
-        return feat_list
+            feat_name_list.append('net_feat')
+        return feat_name_list

-    def _load_net(self, dataset_name, dataset_path):
-        net_file_path = os.path.join(dataset_path, '{}.{}'.format(dataset_name, 'net'))
+    def _load_net(self, dataset_name, dataset_path):
+        net_file_path = os.path.join(dataset_path, f'{dataset_name}.net')
         if os.path.isfile(net_file_path):
             net_feat = self._load_feat(net_file_path, FeatureSource.NET)
             if net_feat is None:
                 raise ValueError('.net file exist, but net_feat is None, please check your load_col')
             return net_feat
         else:
-            raise ValueError('File {} not exist'.format(net_file_path))
-
+            raise ValueError(f'File {net_file_path} does not exist.')
+
     def _get_fields_in_same_space(self):
         """Parsing ``config['fields_in_same_space']``.
         See :doc:`../user_guide/data/data_args` for detail arg setting.
@@ -82,8 +80,9 @@ def _get_fields_in_same_space(self):
         - ``source_id`` and ``target_id`` should be remapped with ``user_id``.
         """
         fields_in_same_space = super()._get_fields_in_same_space()
-        fields_in_same_space = [_ for _ in fields_in_same_space if (self.source_field not in _) and
-                                (self.target_field not in _)]
+        fields_in_same_space = [
+            _ for _ in fields_in_same_space if (self.source_field not in _) and (self.target_field not in _)
+        ]
         for field_set in fields_in_same_space:
             if self.uid_field in field_set:
                 field_set.update({self.source_field, self.target_field})
@@ -98,7 +97,7 @@ def net_graph(self, form='coo', value_field=None):
         else ``graph[src, tgt] = self.net_feat[value_field][src, tgt]``.
 Currently, we support graph in `DGL`_ and `PyG`_,
-        and two type of sparse matrixes, ``coo`` and ``csr``.
+        and two types of sparse matrices, ``coo`` and ``csr``.

         Args:
             form (str, optional): Format of sparse matrix, or library of graph data structure.
@@ -124,6 +123,5 @@
             raise NotImplementedError('net graph format [{}] has not been implemented.')

     def __str__(self):
-        info = [super().__str__(),
-                'The number of connections of social network: {}'.format(len(self.net_feat))]
+        info = [super().__str__(), f'The number of connections of social network: {len(self.net_feat)}']
         return '\n'.join(info)
diff --git a/recbole/data/dataset/xgboost_dataset.py b/recbole/data/dataset/xgboost_dataset.py
new file mode 100644
index 000000000..f74782284
--- /dev/null
+++ b/recbole/data/dataset/xgboost_dataset.py
@@ -0,0 +1,95 @@
+# @Time   : 2020/12/17
+# @Author : Chen Yang
+# @Email  : 254170321@qq.com
+
+"""
+recbole.data.xgboost_dataset
+############################
+"""
+
+from recbole.data.dataset import Dataset
+from recbole.utils import FeatureType
+
+
+class XgboostDataset(Dataset):
+    """:class:`XgboostDataset` is based on :class:`~recbole.data.dataset.dataset.Dataset`,
+    and converts token-type features into hash form so that they can be fed to XGBoost.
+
+    Attributes:
+        hash_map (dict): Maps the token values of each converted column to consecutive integer ids.
+        hash_count (dict): Number of distinct token values seen in each converted column.
+        convert_col_list (list): Names of the columns that have been converted.
+    """
+
+    def __init__(self, config, saved_dataset=None):
+        super().__init__(config, saved_dataset=saved_dataset)
+
+    def _judge_token_and_convert(self, feat):
+        # get columns whose type is token
+        col_list = []
+        for col_name in feat:
+            if col_name == self.uid_field or col_name == self.iid_field:
+                continue
+            if self.field2type[col_name] == FeatureType.TOKEN:
+                col_list.append(col_name)
+            elif self.field2type[col_name] in {FeatureType.TOKEN_SEQ, FeatureType.FLOAT_SEQ}:
+                feat = feat.drop([col_name], axis=1, inplace=False)
+
+        # get hash map
+        for col in col_list:
+            self.hash_map[col] = dict({})
+            self.hash_count[col] = 0
+
+        del_col = []
+        for col in self.hash_map:
+            if col in feat.keys():
+                for value in feat[col]:
+                    if value
not in self.hash_map[col]: + self.hash_map[col][value] = self.hash_count[col] + self.hash_count[col] = self.hash_count[col] + 1 + if self.hash_count[col] > self.config['token_num_threshold']: + del_col.append(col) + break + + for col in del_col: + del self.hash_count[col] + del self.hash_map[col] + col_list.remove(col) + self.convert_col_list.extend(col_list) + + # transform the original data + for col in self.hash_map.keys(): + if col in feat.keys(): + feat[col] = feat[col].map(self.hash_map[col]) + + return feat + + def _convert_token_to_hash(self): + """Convert the data of token type to hash form + + """ + self.hash_map = {} + self.hash_count = {} + self.convert_col_list = [] + if self.config['convert_token_to_onehot']: + for feat_name in ['inter_feat', 'user_feat', 'item_feat']: + feat = getattr(self, feat_name) + if feat is not None: + feat = self._judge_token_and_convert(feat) + setattr(self, feat_name, feat) + + def _from_scratch(self): + """Load dataset from scratch. + Initialize attributes firstly, then load data from atomic files, pre-process the dataset lastly. 
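The vocabulary check in `_judge_token_and_convert` boils down to assigning consecutive integer ids to token values and abandoning a column once its vocabulary exceeds `token_num_threshold`. A stand-alone sketch of that core loop (a simplification, not the actual method):

```python
def build_hash_map(values, threshold):
    """Map raw token values to consecutive integer ids; return None as soon
    as the vocabulary grows past `threshold` (the column is then dropped)."""
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
            if len(mapping) > threshold:
                return None
    return mapping
```

A kept column is then converted with `feat[col] = feat[col].map(mapping)`, as in the patch.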
+ """ + self.logger.debug(f'Loading {self.__class__} from scratch.') + + self._get_preset() + self._get_field_from_config() + self._load_data(self.dataset_name, self.dataset_path) + self._data_processing() + self._convert_token_to_hash() + self._change_feat_format() + + def __getitem__(self, index, join=True): + df = self.inter_feat[index] + return self.join(df) if join else df diff --git a/recbole/data/interaction.py b/recbole/data/interaction.py index 1787bd30b..218b8b035 100644 --- a/recbole/data/interaction.py +++ b/recbole/data/interaction.py @@ -13,6 +13,7 @@ """ import numpy as np +import torch class Interaction(object): @@ -81,13 +82,20 @@ class Interaction(object): def __init__(self, interaction, pos_len_list=None, user_len_list=None): self.interaction = interaction + self.pos_len_list = self.user_len_list = None + self.set_additional_info(pos_len_list, user_len_list) + for k in self.interaction: + if not isinstance(self.interaction[k], torch.Tensor): + raise ValueError(f'Interaction [{interaction}] should only contains torch.Tensor') + self.length = -1 + for k in self.interaction: + self.length = max(self.length, self.interaction[k].shape[0]) + + def set_additional_info(self, pos_len_list=None, user_len_list=None): self.pos_len_list = pos_len_list self.user_len_list = user_len_list if (self.pos_len_list is None) ^ (self.user_len_list is None): raise ValueError('pos_len_list and user_len_list should be both None or valued.') - for k in self.interaction: - self.length = self.interaction[k].shape[0] - break def __iter__(self): return self.interaction.__iter__() @@ -101,13 +109,17 @@ def __getitem__(self, index): ret[k] = self.interaction[k][index] return Interaction(ret) + def __contains__(self, item): + return item in self.interaction + def __len__(self): return self.length def __str__(self): - info = ['The batch_size of interaction: {}'.format(self.length)] + info = [f'The batch_size of interaction: {self.length}'] for k in self.interaction: - temp_str = " 
{}, {}, {}".format(k, self.interaction[k].shape, self.interaction[k].device.type) + inter = self.interaction[k] + temp_str = f" {k}, {inter.shape}, {inter.device.type}, {inter.dtype}" info.append(temp_str) info.append('\n') return '\n'.join(info) @@ -115,6 +127,14 @@ def __str__(self): def __repr__(self): return self.__str__() + @property + def columns(self): + """ + Returns: + list of str: The columns of interaction. + """ + return list(self.interaction.keys()) + def to(self, device, selected_field=None): """Transfer Tensors in this Interaction object to the specified device. @@ -124,20 +144,21 @@ def to(self, device, selected_field=None): with keys in selected_field will be sent to device. Returns: - Interaction: a copyed Interaction object with Tensors which are sented to + Interaction: a coped Interaction object with Tensors which are sent to the specified device. """ ret = {} if isinstance(selected_field, str): selected_field = [selected_field] - try: + + if selected_field is not None: selected_field = set(selected_field) for k in self.interaction: if k in selected_field: ret[k] = self.interaction[k].to(device) else: ret[k] = self.interaction[k] - except: + else: for k in self.interaction: ret[k] = self.interaction[k].to(device) return Interaction(ret) @@ -146,7 +167,7 @@ def cpu(self): """Transfer Tensors in this Interaction object to cpu. Returns: - Interaction: a copyed Interaction object with Tensors which are sented to cpu. + Interaction: a coped Interaction object with Tensors which are sent to cpu. """ ret = {} for k in self.interaction: @@ -214,8 +235,113 @@ def repeat_interleave(self, repeats, dim=0): def update(self, new_inter): """Similar to ``dict.update()`` + + Args: + new_inter (Interaction): current interaction will be updated by new_inter. 
""" for k in new_inter.interaction: self.interaction[k] = new_inter.interaction[k] - self.pos_len_list = new_inter.pos_len_list - self.user_len_list = new_inter.user_len_list + if new_inter.pos_len_list is not None: + self.pos_len_list = new_inter.pos_len_list + if new_inter.user_len_list is not None: + self.user_len_list = new_inter.user_len_list + + def drop(self, column): + """Drop column in interaction. + + Args: + column (str): the column to be dropped. + """ + if column not in self.interaction: + raise ValueError(f'Column [{column}] is not in [{self}].') + del self.interaction[column] + + def _reindex(self, index): + """Reset the index of interaction inplace. + + Args: + index: the new index of current interaction. + """ + for k in self.interaction: + self.interaction[k] = self.interaction[k][index] + if self.pos_len_list is not None: + self.pos_len_list = self.pos_len_list[index] + if self.user_len_list is not None: + self.user_len_list = self.user_len_list[index] + + def shuffle(self): + """Shuffle current interaction inplace. + """ + index = torch.randperm(self.length) + self._reindex(index) + + def sort(self, by, ascending=True): + """Sort the current interaction inplace. + + Args: + by (str or list of str): Field that as the key in the sorting process. + ascending (bool or list of bool, optional): Results are ascending if ``True``, otherwise descending. 
+                Defaults to ``True``
+        """
+        if isinstance(by, str):
+            if by not in self.interaction:
+                raise ValueError(f'[{by}] does not exist in interaction [{self}].')
+            by = [by]
+        elif isinstance(by, (list, tuple)):
+            for b in by:
+                if b not in self.interaction:
+                    raise ValueError(f'[{b}] does not exist in interaction [{self}].')
+        else:
+            raise TypeError(f'Wrong type of by [{by}].')
+
+        if isinstance(ascending, bool):
+            ascending = [ascending]
+        elif isinstance(ascending, (list, tuple)):
+            for a in ascending:
+                if not isinstance(a, bool):
+                    raise TypeError(f'Wrong type of ascending [{ascending}].')
+        else:
+            raise TypeError(f'Wrong type of ascending [{ascending}].')
+
+        if len(by) != len(ascending):
+            if len(ascending) == 1:
+                ascending = ascending * len(by)
+            else:
+                raise ValueError(f'by [{by}] and ascending [{ascending}] should have same length.')
+
+        for b, a in zip(by[::-1], ascending[::-1]):
+            index = np.argsort(self.interaction[b], kind='stable')
+            if not a:
+                index = index[::-1]
+            self._reindex(index)
+
+    def add_prefix(self, prefix):
+        """Add prefix to current interaction's columns.
+
+        Args:
+            prefix (str): The prefix to be added.
+        """
+        self.interaction = {prefix + key: value for key, value in self.interaction.items()}
+
+
+def cat_interactions(interactions):
+    """Concatenate list of interactions to single interaction.
+
+    Args:
+        interactions (list of :class:`Interaction`): List of interactions to be concatenated.
+
+    Returns:
+        :class:`Interaction`: Concatenated interaction.
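`cat_interactions` is a column-wise concatenation guarded by a same-columns check. The idea, with NumPy arrays standing in for the tensors (a sketch, not the real function):

```python
import numpy as np

def cat_dicts(batches):
    """Concatenate {column: array} batches column-wise; every batch must
    expose exactly the same set of columns, mirroring cat_interactions."""
    if not batches:
        raise ValueError('need at least one batch')
    columns = set(batches[0])
    for batch in batches:
        if set(batch) != columns:
            raise ValueError('all batches should have the same columns')
    return {col: np.concatenate([batch[col] for batch in batches]) for col in columns}
```

The real function does the same with `torch.cat` and wraps the result back into an `Interaction`.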
+ """ + if not isinstance(interactions, (list, tuple)): + raise TypeError(f'Interactions [{interactions}] should be list or tuple.') + if len(interactions) == 0: + raise ValueError(f'Interactions [{interactions}] should have some interactions.') + + columns_set = set(interactions[0].columns) + for inter in interactions: + if columns_set != set(inter.columns): + raise ValueError(f'Interactions [{interactions}] should have some interactions.') + + new_inter = {col: torch.cat([inter[col] for inter in interactions]) for col in columns_set} + return Interaction(new_inter) diff --git a/recbole/data/utils.py b/recbole/data/utils.py index 810c96e0d..3f44d9a7d 100644 --- a/recbole/data/utils.py +++ b/recbole/data/utils.py @@ -13,13 +13,13 @@ """ import copy -import os import importlib +import os from recbole.config import EvalSetting -from recbole.sampler import KGSampler, Sampler, RepeatableSampler -from recbole.utils import ModelType from recbole.data.dataloader import * +from recbole.sampler import KGSampler, Sampler, RepeatableSampler +from recbole.utils import ModelType, ensure_dir def create_dataset(config): @@ -31,9 +31,10 @@ def create_dataset(config): Returns: Dataset: Constructed dataset. 
""" - try: - return getattr(importlib.import_module('recbole.data.dataset'), config['model'] + 'Dataset')(config) - except AttributeError: + dataset_module = importlib.import_module('recbole.data.dataset') + if hasattr(dataset_module, config['model'] + 'Dataset'): + return getattr(dataset_module, config['model'] + 'Dataset')(config) + else: model_type = config['MODEL_TYPE'] if model_type == ModelType.SEQUENTIAL: from .dataset import SequentialDataset @@ -44,6 +45,9 @@ def create_dataset(config): elif model_type == ModelType.SOCIAL: from .dataset import SocialDataset return SocialDataset(config) + elif model_type == ModelType.XGBOOST: + from .dataset import XgboostDataset + return XgboostDataset(config) else: from .dataset import Dataset return Dataset(config) @@ -69,34 +73,27 @@ def data_preparation(config, dataset, save=False): es_str = [_.strip() for _ in config['eval_setting'].split(',')] es = EvalSetting(config) + es.set_ordering_and_splitting(es_str[0]) - kwargs = {} - if 'RS' in es_str[0]: - kwargs['ratios'] = config['split_ratio'] - if kwargs['ratios'] is None: - raise ValueError('`ratios` should be set if `RS` is set') - if 'LS' in es_str[0]: - kwargs['leave_one_num'] = config['leave_one_num'] - if kwargs['leave_one_num'] is None: - raise ValueError('`leave_one_num` should be set if `LS` is set') - kwargs['group_by_user'] = config['group_by_user'] - getattr(es, es_str[0])(**kwargs) - - if es.split_args['strategy'] != 'loo' and model_type == ModelType.SEQUENTIAL: - raise ValueError('Sequential models require "loo" split strategy.') - - builded_datasets = dataset.build(es) - train_dataset, valid_dataset, test_dataset = builded_datasets + built_datasets = dataset.build(es) + train_dataset, valid_dataset, test_dataset = built_datasets phases = ['train', 'valid', 'test'] + sampler = None if save: - save_datasets(config['checkpoint_dir'], name=phases, dataset=builded_datasets) + save_datasets(config['checkpoint_dir'], name=phases, dataset=built_datasets) kwargs = 
{}
     if config['training_neg_sample_num']:
-        es.neg_sample_by(config['training_neg_sample_num'])
+        if dataset.label_field in dataset.inter_feat:
+            raise ValueError(
+                f'`training_neg_sample_num` should be 0 '
+                f'if inter_feat has label_field [{dataset.label_field}].'
+            )
+        train_distribution = config['training_neg_sample_distribution'] or 'uniform'
+        es.neg_sample_by(by=config['training_neg_sample_num'], distribution=train_distribution)
         if model_type != ModelType.SEQUENTIAL:
-            sampler = Sampler(phases, builded_datasets, es.neg_sample_args['distribution'])
+            sampler = Sampler(phases, built_datasets, es.neg_sample_args['distribution'])
         else:
             sampler = RepeatableSampler(phases, dataset, es.neg_sample_args['distribution'])
         kwargs['sampler'] = sampler.set_phase('train')
@@ -117,9 +114,18 @@

     kwargs = {}
     if len(es_str) > 1 and getattr(es, es_str[1], None):
+        if dataset.label_field in dataset.inter_feat:
+            raise ValueError(
+                f'Cannot validate with `{es_str[1]}` '
+                f'when inter_feat has label_field [{dataset.label_field}].'
+ ) getattr(es, es_str[1])() - if 'sampler' not in locals(): - sampler = Sampler(phases, builded_datasets, es.neg_sample_args['distribution']) + if sampler is None: + if model_type != ModelType.SEQUENTIAL: + sampler = Sampler(phases, built_datasets, es.neg_sample_args['distribution']) + else: + sampler = RepeatableSampler(phases, dataset, es.neg_sample_args['distribution']) + sampler.set_distribution(es.neg_sample_args['distribution']) kwargs['sampler'] = [sampler.set_phase('valid'), sampler.set_phase('test')] kwargs['neg_sample_args'] = copy.deepcopy(es.neg_sample_args) valid_data, test_data = dataloader_construct( @@ -134,9 +140,9 @@ def data_preparation(config, dataset, save=False): return train_data, valid_data, test_data -def dataloader_construct(name, config, eval_setting, dataset, - dl_format=InputType.POINTWISE, - batch_size=1, shuffle=False, **kwargs): +def dataloader_construct( + name, config, eval_setting, dataset, dl_format=InputType.POINTWISE, batch_size=1, shuffle=False, **kwargs +): """Get a correct dataloader class by calling :func:`get_data_loader` to construct dataloader. 
Args: @@ -161,36 +167,30 @@ def dataloader_construct(name, config, eval_setting, dataset, batch_size = [batch_size] * len(dataset) if len(dataset) != len(batch_size): - raise ValueError('dataset {} and batch_size {} should have the same length'.format(dataset, batch_size)) + raise ValueError(f'Dataset {dataset} and batch_size {batch_size} should have the same length.') - kwargs_list = [{} for i in range(len(dataset))] + kwargs_list = [{} for _ in range(len(dataset))] for key, value in kwargs.items(): key = [key] * len(dataset) if not isinstance(value, list): value = [value] * len(dataset) if len(dataset) != len(value): - raise ValueError('dataset {} and {} {} should have the same length'.format(dataset, key, value)) + raise ValueError(f'Dataset {dataset} and {key} {value} should have the same length.') for kw, k, w in zip(kwargs_list, key, value): kw[k] = w model_type = config['MODEL_TYPE'] logger = getLogger() - logger.info('Build [{}] DataLoader for [{}] with format [{}]'.format(model_type, name, dl_format)) + logger.info(f'Build [{model_type}] DataLoader for [{name}] with format [{dl_format}]') logger.info(eval_setting) - logger.info('batch_size = [{}], shuffle = [{}]\n'.format(batch_size, shuffle)) + logger.info(f'batch_size = [{batch_size}], shuffle = [{shuffle}]\n') - DataLoader = get_data_loader(name, config, eval_setting) + dataloader = get_data_loader(name, config, eval_setting) try: ret = [ - DataLoader( - config=config, - dataset=ds, - batch_size=bs, - dl_format=dl_format, - shuffle=shuffle, - **kw - ) for ds, bs, kw in zip(dataset, batch_size, kwargs_list) + dataloader(config=config, dataset=ds, batch_size=bs, dl_format=dl_format, shuffle=shuffle, **kw) + for ds, bs, kw in zip(dataset, batch_size, kwargs_list) ] except TypeError: raise ValueError('training_neg_sample_num should be 0') @@ -213,11 +213,10 @@ def save_datasets(save_path, name, dataset): name = [name] dataset = [dataset] if len(name) != len(dataset): - raise ValueError('len of name {} 
should equal to len of dataset'.format(name, dataset))
+        raise ValueError(f'Length of name {name} should equal to length of dataset {dataset}.')
     for i, d in enumerate(dataset):
         cur_path = os.path.join(save_path, name[i])
-        if not os.path.isdir(cur_path):
-            os.makedirs(cur_path)
+        ensure_dir(cur_path)
         d.save(cur_path)
@@ -233,7 +232,12 @@ def get_data_loader(name, config, eval_setting):
         type: The dataloader class that meets the requirements in :attr:`config` and :attr:`eval_setting`.
     """
     register_table = {
-        'DIN': _get_DIN_data_loader
+        'DIN': _get_DIN_data_loader,
+        'MultiDAE': _get_AE_data_loader,
+        'MultiVAE': _get_AE_data_loader,
+        'MacridVAE': _get_AE_data_loader,
+        'CDAE': _get_AE_data_loader,
+        'ENMF': _get_AE_data_loader
     }

     if config['model'] in register_table:
@@ -254,7 +258,7 @@ def get_data_loader(name, config, eval_setting):
         elif neg_sample_strategy == 'by':
             return ContextNegSampleDataLoader
         elif neg_sample_strategy == 'full':
-            raise NotImplementedError('context model\'s full_sort has not been implemented')
+            return ContextFullDataLoader
     elif model_type == ModelType.SEQUENTIAL:
         if neg_sample_strategy == 'none':
             return SequentialDataLoader
         elif neg_sample_strategy == 'by':
             return SequentialNegSampleDataLoader
         elif neg_sample_strategy == 'full':
             return SequentialFullDataLoader
+    elif model_type == ModelType.XGBOOST:
+        if neg_sample_strategy == 'none':
+            return XgboostDataLoader
+        elif neg_sample_strategy == 'by':
+            return XgboostNegSampleDataLoader
+        elif neg_sample_strategy == 'full':
+            return XgboostFullDataLoader
     elif model_type == ModelType.KNOWLEDGE:
         if neg_sample_strategy == 'by':
             if name == 'train':
@@ -273,10 +284,11 @@ def get_data_loader(name, config, eval_setting):
         elif neg_sample_strategy == 'none':
             # return GeneralDataLoader
             # TODO Can training's strategy also be `none`? Judging from the general model's logic, it seems both can be None.
-            raise NotImplementedError('The use of external negative sampling for knowledge model '
-                                      'has not been implemented')
+            raise NotImplementedError(
+                'The use of external negative sampling for knowledge model has not been implemented'
+            )
     else:
-        raise NotImplementedError('model_type [{}] has not been implemented'.format(model_type))
+        raise NotImplementedError(f'Model_type [{model_type}] has not been implemented.')


 def _get_DIN_data_loader(name, config, eval_setting):
@@ -299,6 +311,29 @@ def _get_DIN_data_loader(name, config, eval_setting):
         return SequentialFullDataLoader


+def _get_AE_data_loader(name, config, eval_setting):
+    """Customized function for AE-like models (MultiDAE, MultiVAE, MacridVAE, CDAE and ENMF)
+    to get the correct dataloader class.
+
+    Args:
+        name (str): The stage of dataloader. It can only take two values: 'train' or 'evaluation'.
+        config (Config): An instance object of Config, used to record parameter information.
+        eval_setting (EvalSetting): An instance object of EvalSetting, used to record evaluation settings.
+
+    Returns:
+        type: The dataloader class that meets the requirements in :attr:`config` and :attr:`eval_setting`.
+    """
+    neg_sample_strategy = eval_setting.neg_sample_args['strategy']
+    if name == 'train':
+        return UserDataLoader
+    else:
+        if neg_sample_strategy == 'none':
+            return GeneralDataLoader
+        elif neg_sample_strategy == 'by':
+            return GeneralNegSampleDataLoader
+        elif neg_sample_strategy == 'full':
+            return GeneralFullDataLoader
+
+
 class DLFriendlyAPI(object):
     """A Decorator class, which helps copying :class:`Dataset` methods to :class:`DataLoader`.

@@ -306,13 +341,15 @@ class DLFriendlyAPI(object):
     E.g. if ``train_data`` is an object of :class:`DataLoader`,
     and :meth:`~recbole.data.dataset.dataset.Dataset.num` is a method of :class:`~recbole.data.dataset.dataset.Dataset`,
-    Cause it has been decorated, :meth:`~recbole.data.dataset.dataset.Dataset.num` can be called directly by ``train_data``.
+    Because it has been decorated, :meth:`~recbole.data.dataset.dataset.Dataset.num` can be called directly by
+    ``train_data``.

     See the example of :meth:`set` for details.

     Attributes:
         dataloader_apis (set): Register table that saves all the method names of DataLoader Friendly APIs.
     """
+
     def __init__(self):
         self.dataloader_apis = set()
@@ -330,9 +367,11 @@ def set(self):
                 def dataset_meth():
                     ...
         """
+
         def decorator(f):
             self.dataloader_apis.add(f.__name__)
             return f
+
         return decorator
diff --git a/recbole/evaluator/__init__.py b/recbole/evaluator/__init__.py
index 61b72ad22..73f4cad5f 100644
--- a/recbole/evaluator/__init__.py
+++ b/recbole/evaluator/__init__.py
@@ -1,4 +1,4 @@
 from recbole.evaluator.abstract_evaluator import *
-from recbole.evaluator.loss_evaluator import *
+from recbole.evaluator.proxy_evaluator import *
 from recbole.evaluator.metrics import *
-from recbole.evaluator.topk_evaluator import *
+from recbole.evaluator.evaluators import *
diff --git a/recbole/evaluator/abstract_evaluator.py b/recbole/evaluator/abstract_evaluator.py
index b47d4f419..45254c9a6 100644
--- a/recbole/evaluator/abstract_evaluator.py
+++ b/recbole/evaluator/abstract_evaluator.py
@@ -3,18 +3,27 @@
 # @Author : Kaiyuan Li
 # @email  : tsotfsk@outlook.com

+# UPDATE
+# @Time   : 2020/10/21, 2020/12/18
+# @Author : Kaiyuan Li, Zhichao Feng
+# @email  : tsotfsk@outlook.com, fzcbupt@gmail.com
+
 """
 recbole.evaluator.abstract_evaluator
 #####################################
 """

+import numpy as np
+import torch
+from torch.nn.utils.rnn import pad_sequence
+

-class AbstractEvaluator(object):
-    """:class:`AbstractEvaluator` is an abstract object which supports
+class BaseEvaluator(object):
+    """:class:`BaseEvaluator` is an object which supports
     the evaluation of the model. It is called by :class:`Trainer`.
- Note: - If you want to inherit this class and implement your own evalautor class, + Note: + If you want to inherit this class and implement your own evaluator class, you must implement the following functions. Args: @@ -22,25 +31,106 @@ class AbstractEvaluator(object): """ - def __init__(self, config): - self.metrics = config['metrics'] - - def _check_args(self): - """check the correct of the setting""" - raise NotImplementedError + def __init__(self, config, metrics): + self.metrics = metrics + self.full = ('full' in config['eval_setting']) + self.precision = config['metric_decimal_place'] - def collect(self): + def collect(self, *args): """get the intermediate results for each batch, it is called at the end of each batch""" raise NotImplementedError - def evaluate(self): + def evaluate(self, *args): """calculate the metrics of all batches, it is called at the end of each epoch""" raise NotImplementedError - def metrics_info(self): - """get metrics result""" - raise NotImplementedError - - def _calculate_metrics(self): + def _calculate_metrics(self, *args): """ to calculate the metrics""" raise NotImplementedError + + +class GroupedEvaluator(BaseEvaluator): + """:class:`GroupedEvaluator` is an object which supports the evaluation of the model. + + Note: + If you want to implement a new group-based metric, + you may need to inherit this class + + """ + + def __init__(self, config, metrics): + super().__init__(config, metrics) + pass + + def sample_collect(self, scores_tensor, user_len_list): + """padding scores_tensor. It is called when evaluation sample distribution is `uniform` or `popularity`. + + """ + scores_list = torch.split(scores_tensor, user_len_list, dim=0) + padding_score = pad_sequence(scores_list, batch_first=True, padding_value=-np.inf) # n_users x items + return padding_score + + def full_sort_collect(self, scores_tensor, user_len_list): + """it is called when evaluation sample distribution is `full`. 
+ + """ + return scores_tensor.view(len(user_len_list), -1) + + def get_score_matrix(self, scores_tensor, user_len_list): + """get score matrix. + + Args: + scores_tensor (tensor): the tensor of model output with size of `(N, )` + user_len_list(list): number of all items + + """ + if self.full: + scores_matrix = self.full_sort_collect(scores_tensor, user_len_list) + else: + scores_matrix = self.sample_collect(scores_tensor, user_len_list) + return scores_matrix + + +class IndividualEvaluator(BaseEvaluator): + """:class:`IndividualEvaluator` is an object which supports the evaluation of the model. + + Note: + If you want to implement a new non-group-based metric, + you may need to inherit this class + + """ + + def __init__(self, config, metrics): + super().__init__(config, metrics) + self._check_args() + + def sample_collect(self, true_scores, pred_scores): + """It is called when evaluation sample distribution is `uniform` or `popularity`. + + """ + return torch.stack((true_scores, pred_scores.detach()), dim=1) + + def full_sort_collect(self, true_scores, pred_scores): + """it is called when evaluation sample distribution is `full`. 
+ + """ + raise NotImplementedError('full sort can\'t use IndividualEvaluator') + + def get_score_matrix(self, true_scores, pred_scores): + """get score matrix + + Args: + true_scores (tensor): the label of predicted items + pred_scores (tensor): the tensor of model output with a size of `(N, )` + + """ + if self.full: + scores_matrix = self.full_sort_collect(true_scores, pred_scores) + else: + scores_matrix = self.sample_collect(true_scores, pred_scores) + + return scores_matrix + + def _check_args(self): + if self.full: + raise NotImplementedError('full sort can\'t use IndividualEvaluator') diff --git a/recbole/evaluator/evaluators.py b/recbole/evaluator/evaluators.py new file mode 100644 index 000000000..8b0d97238 --- /dev/null +++ b/recbole/evaluator/evaluators.py @@ -0,0 +1,370 @@ +# -*- encoding: utf-8 -*- +# @Time : 2020/08/04 +# @Author : Kaiyuan Li +# @email : tsotfsk@outlook.com + +# UPDATE +# @Time : 2021/01/07, 2020/08/11, 2020/12/18 +# @Author : Kaiyuan Li, Yupeng Hou, Zhichao Feng +# @email : tsotfsk@outlook.com, houyupeng@ruc.edu.cn, fzcbupt@gmail.com + +""" +recbole.evaluator.evaluators +##################################### +""" + +from collections import ChainMap + +import numpy as np +import torch + +from recbole.evaluator.abstract_evaluator import GroupedEvaluator, IndividualEvaluator +from recbole.evaluator.metrics import metrics_dict + +# These metrics are typical in topk recommendations +topk_metrics = {metric.lower(): metric for metric in ['Hit', 'Recall', 'MRR', 'Precision', 'NDCG', 'MAP']} +# These metrics are typical in loss recommendations +loss_metrics = {metric.lower(): metric for metric in ['AUC', 'RMSE', 'MAE', 'LOGLOSS']} +# For GAUC +rank_metrics = {metric.lower(): metric for metric in ['GAUC']} + +# group-based metrics +group_metrics = ChainMap(topk_metrics, rank_metrics) +# not group-based metrics +individual_metrics = ChainMap(loss_metrics) + + +class TopKEvaluator(GroupedEvaluator): + r"""TopK Evaluator is mainly used in 
ranking tasks. Now, we support six topk metrics which
+    contain `'Hit', 'Recall', 'MRR', 'Precision', 'NDCG', 'MAP'`.
+
+    Note:
+        These metrics are group-based: the metric scores are first computed for each user
+        and then averaged across users. Some of them are also limited to the top k.
+
+    """
+
+    def __init__(self, config, metrics):
+        super().__init__(config, metrics)
+
+        self.topk = config['topk']
+        self._check_args()
+
+    def collect(self, interaction, scores_tensor):
+        """collect the topk intermediate result of one batch, this function mainly
+        implements padding and TopK finding. It is called at the end of each batch
+
+        Args:
+            interaction (Interaction): the interaction of the batch
+            scores_tensor (tensor): the tensor of model output with size of `(N, )`
+
+        Returns:
+            torch.Tensor : a matrix containing the topk matrix and the shape matrix
+
+        """
+        user_len_list = interaction.user_len_list
+
+        scores_matrix = self.get_score_matrix(scores_tensor, user_len_list)
+        scores_matrix = torch.flip(scores_matrix, dims=[-1])
+        shape_matrix = torch.full((len(user_len_list), 1), scores_matrix.shape[1], device=scores_matrix.device)
+
+        # get topk
+        _, topk_idx = torch.topk(scores_matrix, max(self.topk), dim=-1)  # n_users x k
+
+        # pack top_idx and shape_matrix
+        result = torch.cat((topk_idx, shape_matrix), dim=1)
+        return result
+
+    def evaluate(self, batch_matrix_list, eval_data):
+        """calculate the metrics of all batches. It is called at the end of each epoch
+
+        Args:
+            batch_matrix_list (list): the results of all batches
+            eval_data (Dataset): the test data
+
+        Returns:
+            dict: such as ``{'Hit@20': 0.3824, 'Recall@20': 0.0527, 'Hit@10': 0.3153, 'Recall@10': 0.0329}``
+
+        """
+        pos_len_list = eval_data.get_pos_len_list()
+        batch_result = torch.cat(batch_matrix_list, dim=0).cpu().numpy()
+
+        # unpack top_idx and shape_matrix
+        topk_idx = batch_result[:, :-1]
+        shapes = batch_result[:, -1]
+
+        assert len(pos_len_list) == len(topk_idx)
+        # get metrics
+        metric_dict = {}
+        result_list = self._calculate_metrics(pos_len_list, topk_idx, shapes)
+        for metric, value in zip(self.metrics, result_list):
+            for k in self.topk:
+                key = '{}@{}'.format(metric, k)
+                metric_dict[key] = round(value[k - 1], self.precision)
+
+        return metric_dict
+
+    def _check_args(self):
+
+        # Check topk:
+        if isinstance(self.topk, (int, list)):
+            if isinstance(self.topk, int):
+                self.topk = [self.topk]
+            for topk in self.topk:
+                if topk <= 0:
+                    raise ValueError(
+                        'topk must be a positive integer or a list of positive integers, '
+                        'but got `{}`'.format(topk)
+                    )
+        else:
+            raise TypeError('`topk` must be an integer or a list of integers.')
+
+    def _calculate_metrics(self, pos_len_list, topk_idx, shapes):
+        """integrate the results of each batch and evaluate the topk metrics by users
+
+        Args:
+            pos_len_list (numpy.ndarray): the number of positive items for each user
+            topk_idx (numpy.ndarray): a matrix which contains the index of the topk items for users
+            shapes (numpy.ndarray): a list which contains the columns of the padded batch matrix
+
+        Returns:
+            numpy.ndarray: a matrix which contains the metrics result
+
+        """
+        pos_idx_matrix = (topk_idx >= (shapes - pos_len_list).reshape(-1, 1))
+        result_list = []
+        for metric in self.metrics:
+            metric_fuc = metrics_dict[metric.lower()]
+            result = metric_fuc(pos_idx_matrix, pos_len_list)
+            result_list.append(result)  # n_users x len(metrics) x len(ranks)
+        result = np.stack(result_list,
axis=0).mean(axis=1) # len(metrics) x len(ranks) + return result + + def __str__(self): + msg = 'The TopK Evaluator Info:\n' + \ + '\tMetrics:[' + \ + ', '.join([topk_metrics[metric.lower()] for metric in self.metrics]) + \ + '], TopK:[' + \ + ', '.join(map(str, self.topk)) + \ + ']' + return msg + + +class RankEvaluator(GroupedEvaluator): + r"""Rank Evaluator is mainly used in ranking tasks except for topk tasks. Now, we support one + rank metric containing `'GAUC'`. + + Note: + The metrics used calculate group-based metrics which considers the metrics scores averaged + across users except for top-k metrics. + + """ + + def __init__(self, config, metrics): + super().__init__(config, metrics) + pass + + def get_user_pos_len_list(self, interaction, scores_tensor): + """get number of positive items and all items in test set of each user + + Args: + interaction (Interaction): :class:`AbstractEvaluator` of the batch + scores_tensor (tensor): the tensor of model output with size of `(N, )` + + Returns: + list: number of positive items, + list: number of all items + """ + pos_len_list = torch.Tensor(interaction.pos_len_list).to(scores_tensor.device) + user_len_list = interaction.user_len_list + return pos_len_list, user_len_list + + def average_rank(self, scores): + """Get the ranking of an ordered tensor, and take the average of the ranking for positions with equal values. 
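The tie-averaged ranking used by `RankEvaluator.average_rank` can be sketched outside of RecBole as well. The following is a hedged NumPy sketch (the name `average_rank_np` is invented here, not part of RecBole): it reproduces the same run-boundary trick for rows that are already sorted.

```python
import numpy as np

def average_rank_np(scores):
    """Tie-averaged ranks for each row of an already sorted 2-D array.

    `obs` marks where a new run of equal values starts, `dense` is the dense
    rank of each entry, and `count` holds run boundaries used to average the
    ranks of tied entries (the same idea as scipy's rankdata(method='average')).
    """
    out = np.empty(scores.shape, dtype=float)
    for i, row in enumerate(scores):
        obs = np.r_[True, row[1:] != row[:-1]]       # start of each run of equal values
        dense = np.cumsum(obs)                       # dense rank per entry
        count = np.r_[np.nonzero(obs)[0], len(row)]  # run boundaries
        out[i] = 0.5 * (count[dense] + count[dense - 1] + 1)
    return out

average_rank_np(np.array([[1, 2, 2, 2, 3, 3, 6],
                          [2, 2, 2, 2, 4, 5, 5]]))
# [[1.  3.  3.  3.  5.5 5.5 7. ]
#  [2.5 2.5 2.5 2.5 5.  6.5 6.5]]
```

The three tied `2`s in the first row occupy positions 2-4, so each receives the average rank 3; the two tied `5`s in the second row occupy positions 6-7 and receive 6.5.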
+ + Args: + scores(tensor): an ordered tensor, with size of `(N, )` + + Returns: + torch.Tensor: average_rank + + Example: + >>> average_rank(tensor([[1,2,2,2,3,3,6],[2,2,2,2,4,5,5]])) + tensor([[1.0000, 3.0000, 3.0000, 3.0000, 5.5000, 5.5000, 7.0000], + [2.5000, 2.5000, 2.5000, 2.5000, 5.0000, 6.5000, 6.5000]]) + + Reference: + https://github.com/scipy/scipy/blob/v0.17.1/scipy/stats/stats.py#L5262-L5352 + + """ + length, width = scores.shape + device = scores.device + true_tensor = torch.full((length, 1), True, dtype=torch.bool, device=device) + + obs = torch.cat([true_tensor, scores[:, 1:] != scores[:, :-1]], dim=1) + # bias added to dense + bias = torch.arange(0, length, device=device).repeat(width).reshape(width, -1). \ + transpose(1, 0).reshape(-1) + dense = obs.view(-1).cumsum(0) + bias + + # cumulative counts of each unique value + count = torch.where(torch.cat([obs, true_tensor], dim=1))[1] + # get average rank + avg_rank = .5 * (count[dense] + count[dense - 1] + 1).view(length, -1) + + return avg_rank + + def collect(self, interaction, scores_tensor): + """collect the rank intermediate result of one batch, this function mainly implements ranking + and calculating the sum of rank for positive items. 
It is called at the end of each batch
+
+        Args:
+            interaction (Interaction): the interaction of the batch
+            scores_tensor (tensor): the tensor of model output with size of `(N, )`
+
+        """
+        pos_len_list, user_len_list = self.get_user_pos_len_list(interaction, scores_tensor)
+        scores_matrix = self.get_score_matrix(scores_tensor, user_len_list)
+        desc_scores, desc_index = torch.sort(scores_matrix, dim=-1, descending=True)
+
+        # get the index of positive items in the ranking list
+        pos_index = (desc_index < pos_len_list.reshape(-1, 1))
+
+        avg_rank = self.average_rank(desc_scores)
+        pos_rank_sum = torch.where(pos_index, avg_rank, torch.zeros_like(avg_rank)).sum(axis=-1).reshape(-1, 1)
+
+        return pos_rank_sum
+
+    def evaluate(self, batch_matrix_list, eval_data):
+        """calculate the metrics of all batches. It is called at the end of each epoch
+
+        Args:
+            batch_matrix_list (list): the results of all batches
+            eval_data (Dataset): the test data
+
+        Returns:
+            dict: such as ``{'GAUC': 0.9286}``
+
+        """
+        pos_len_list = eval_data.get_pos_len_list()
+        user_len_list = eval_data.get_user_len_list()
+        pos_rank_sum = torch.cat(batch_matrix_list, dim=0).cpu().numpy()
+        assert len(pos_len_list) == len(pos_rank_sum)
+
+        # get metrics
+        metric_dict = {}
+        result_list = self._calculate_metrics(user_len_list, pos_len_list, pos_rank_sum)
+        for metric, value in zip(self.metrics, result_list):
+            key = '{}'.format(metric)
+            metric_dict[key] = round(value, self.precision)
+
+        return metric_dict
+
+    def _calculate_metrics(self, user_len_list, pos_len_list, pos_rank_sum):
+        """integrate the results of each batch and evaluate the rank metrics by users
+
+        Args:
+            user_len_list (numpy.ndarray): the number of items for each user
+            pos_len_list (numpy.ndarray): the number of positive items for each user
+            pos_rank_sum (numpy.ndarray): the sum of ranks of the positive items for each user
+
+        Returns:
+            numpy.ndarray: a matrix which contains the metrics result
+
+        """
+        result_list = []
+        for metric in self.metrics:
+            metric_fuc =
metrics_dict[metric.lower()] + result = metric_fuc(user_len_list, pos_len_list, pos_rank_sum) + result_list.append(result) + return result_list + + def __str__(self): + msg = 'The Rank Evaluator Info:\n' + \ + '\tMetrics:[' + \ + ', '.join([rank_metrics[metric.lower()] for metric in self.metrics]) + \ + ']' + return msg + + +class LossEvaluator(IndividualEvaluator): + r"""Loss Evaluator is mainly used in rating prediction and click through rate prediction. Now, we support four + loss metrics which contain `'AUC', 'RMSE', 'MAE', 'LOGLOSS'`. + + Note: + The metrics used do not calculate group-based metrics which considers the metrics scores averaged + across users. They are also not limited to k. Instead, they calculate the scores on the entire + prediction results regardless the users. + + """ + + def __init__(self, config, metrics): + super().__init__(config, metrics) + + self.label_field = config['LABEL_FIELD'] + + def collect(self, interaction, pred_scores): + """collect the loss intermediate result of one batch, this function mainly + implements concatenating preds and trues. It is called at the end of each batch + + Args: + interaction (Interaction): :class:`AbstractEvaluator` of the batch + pred_scores (tensor): the tensor of model output with a size of `(N, )` + + Returns: + tensor : a batch of scores with a size of `(N, 2)` + + """ + true_scores = interaction[self.label_field].to(pred_scores.device) + assert len(true_scores) == len(pred_scores) + return self.get_score_matrix(true_scores, pred_scores) + + def evaluate(self, batch_matrix_list, *args): + """calculate the metrics of all batches. 
It is called at the end of each epoch + + Args: + batch_matrix_list (list): the results of all batches + + Returns: + dict: such as {'AUC': 0.83} + + """ + concat = torch.cat(batch_matrix_list, dim=0).cpu().numpy() + + trues = concat[:, 0] + preds = concat[:, 1] + + # get metrics + metric_dict = {} + result_list = self._calculate_metrics(trues, preds) + for metric, value in zip(self.metrics, result_list): + key = '{}'.format(metric) + metric_dict[key] = round(value, self.precision) + return metric_dict + + def _calculate_metrics(self, trues, preds): + """get metrics result + + Args: + trues (numpy.ndarray): the true scores' list + preds (numpy.ndarray): the predict scores' list + + Returns: + list: a list of metrics result + + """ + result_list = [] + for metric in self.metrics: + metric_fuc = metrics_dict[metric.lower()] + result = metric_fuc(trues, preds) + result_list.append(result) + return result_list + + def __str__(self): + msg = 'The Loss Evaluator Info:\n' + \ + '\tMetrics:[' + \ + ', '.join([loss_metrics[metric.lower()] for metric in self.metrics]) + \ + ']' + return msg + + +metric_eval_bind = [(topk_metrics, TopKEvaluator), (loss_metrics, LossEvaluator), (rank_metrics, RankEvaluator)] diff --git a/recbole/evaluator/loss_evaluator.py b/recbole/evaluator/loss_evaluator.py deleted file mode 100644 index 54822ca46..000000000 --- a/recbole/evaluator/loss_evaluator.py +++ /dev/null @@ -1,117 +0,0 @@ -# -*- encoding: utf-8 -*- -# @Time : 2020/08/04 -# @Author : Kaiyuan Li -# @email : tsotfsk@outlook.com - -# UPDATE -# @Time : 2020/08/04 2020/08/09 -# @Author : Kaiyuan Li Zhichao Feng -# @email : tsotfsk@outlook.com fzcbupt@gmail.com - -""" -recbole.evaluator.loss_evaluator -################################ -""" - -import numpy as np -import torch -from recbole.evaluator.abstract_evaluator import AbstractEvaluator -from recbole.evaluator.metrics import metrics_dict - -# These metrics are typical in loss recommendations -loss_metrics = {metric.lower(): metric 
for metric in ['AUC', 'RMSE', 'MAE', 'LOGLOSS']} - - -class LossEvaluator(AbstractEvaluator): - r"""Loss Evaluator is mainly used in rating prediction and click through rate prediction. Now, we support four - loss metrics which contain `'AUC', 'RMSE', 'MAE', 'LOGLOSS'`. - - Note: - The metrics used do not calculate group-based metrics which considers the metrics scores averaged across users. - They are also not limited to k. Instead, they calculate the scores on the entire prediction results regardless the users. - - """ - def __init__(self, config): - super().__init__(config) - - self.label_field = config['LABEL_FIELD'] - self._check_args() - - def collect(self, interaction, pred_scores): - """collect the loss intermediate result of one batch, this function mainly - implements concatenating preds and trues. It is called at the end of each batch - - Args: - interaction (Interaction): :class:`AbstractEvaluator` of the batch - pred_scores (tensor): the tensor of model output with a size of `(N, )` - - Returns: - tensor : a batch of socres with a size of `(N, 2)` - - """ - true_scores = interaction[self.label_field].to(pred_scores.device) - assert len(true_scores) == len(pred_scores) - return torch.stack((true_scores, pred_scores.detach()), dim=1) - - def evaluate(self, batch_matrix_list, *args): - """calculate the metrics of all batches. 
It is called at the end of each epoch - - Args: - batch_matrix_list (list): the results of all batches - - Returns: - dict: such as {'AUC': 0.83} - - """ - concat = torch.cat(batch_matrix_list, dim=0).cpu().numpy() - - trues = concat[:, 0] - preds = concat[:, 1] - - # get metrics - metric_dict = {} - result_list = self._calculate_metrics(trues, preds) - for metric, value in zip(self.metrics, result_list): - key = '{}'.format(metric) - metric_dict[key] = round(value, 4) - return metric_dict - - def _check_args(self): - - # Check metrics - if isinstance(self.metrics, (str, list)): - if isinstance(self.metrics, str): - self.metrics = [self.metrics] - else: - raise TypeError('metrics must be str or list') - - # Convert metric to lowercase - for m in self.metrics: - if m.lower() not in loss_metrics: - raise ValueError("There is no loss metric named {}!".format(m)) - self.metrics = [metric.lower() for metric in self.metrics] - - def metrics_info(self, trues, preds): - """get metrics result - - Args: - trues (np.ndarray): the true scores' list - preds (np.ndarray): the predict scores' list - - Returns: - list: a list of metrics result - - """ - result_list = [] - for metric in self.metrics: - metric_fuc = metrics_dict[metric.lower()] - result = metric_fuc(trues, preds) - result_list.append(result) - return result_list - - def _calculate_metrics(self, trues, preds): - return self.metrics_info(trues, preds) - - def __str__(self): - mesg = 'The Loss Evaluator Info:\n' + '\tMetrics:[' + ', '.join([loss_metrics[metric.lower()] for metric in self.metrics]) + ']' - return mesg diff --git a/recbole/evaluator/metrics.py b/recbole/evaluator/metrics.py index 976ac9880..9c470e12b 100644 --- a/recbole/evaluator/metrics.py +++ b/recbole/evaluator/metrics.py @@ -4,7 +4,7 @@ # @email : tsotfsk@outlook.com # UPDATE -# @Time : 2020/08/12, 2020/08/21, 2020/9/16 +# @Time : 2020/08/12, 2020/12/21, 2020/9/16 # @Author : Kaiyuan Li, Zhichao Feng, Xingyu Pan # @email : tsotfsk@outlook.com, 
fzcbupt@gmail.com, panxy@ruc.edu.cn
@@ -16,9 +16,10 @@
 from logging import getLogger

 import numpy as np
-from recbole.evaluator.utils import _binary_clf_curve
 from sklearn.metrics import auc as sk_auc
-from sklearn.metrics import log_loss, mean_absolute_error, mean_squared_error
+from sklearn.metrics import mean_absolute_error, mean_squared_error
+
+from recbole.evaluator.utils import _binary_clf_curve

 # TopK Metrics #
@@ -85,7 +86,7 @@ def map_(pos_index, pos_len):
     actual_len = np.where(pos_len > len_rank, len_rank, pos_len)
     result = np.zeros_like(pos_index, dtype=np.float)
     for row, lens in enumerate(actual_len):
-        ranges = np.arange(1, pos_index.shape[1]+1)
+        ranges = np.arange(1, pos_index.shape[1] + 1)
         ranges[lens:] = ranges[lens - 1]
         result[row] = sum_pre[row] / ranges
     return result
@@ -100,7 +101,7 @@ def recall_(pos_index, pos_len):
     .. math::
        \mathrm {Recall@K} = \frac{|Rel_u\cap Rec_u|}{Rel_u}

-    :math:`Rel_u` is the set of items relavent to user :math:`U`,
+    :math:`Rel_u` is the set of items relevant to user :math:`U`,
     :math:`Rec_u` is the top K items recommended to users.
     We obtain the result by calculating the average :math:`Recall@K` of each user.

@@ -119,16 +120,15 @@ def ndcg_(pos_index, pos_len):
         \mathrm {DCG@K}=\sum_{i=1}^{K} \frac{2^{rel_i}-1}{\log_{2}{(i+1)}}\\
         \mathrm {IDCG@K}=\sum_{i=1}^{K}\frac{1}{\log_{2}{(i+1)}}\\
         \mathrm {NDCG_u@K}=\frac{DCG_u@K}{IDCG_u@K}\\
-        \mathrm {NDCG@K}=\frac{\sum \nolimits_{u \in u^{te}NDCG_u@K}}{|u^{te}|}
+        \mathrm {NDCG@K}=\frac{\sum \nolimits_{u \in U^{te}} NDCG_u@K}{|U^{te}|}
     \end{gather}

     :math:`K` stands for recommending :math:`K` items.
     And the :math:`rel_i` is the relevance of the item in position :math:`i` in the recommendation list.
-    :math:`2^{rel_i}` equals to 1 if the item hits otherwise 0.
-    :math:`U^{te}` is for all users in the test set.
+    :math:`{rel_i}` equals 1 if the item is in the ground truth and 0 otherwise.
+    :math:`U^{te}` stands for all users in the test set.
""" - len_rank = np.full_like(pos_len, pos_index.shape[1]) idcg_len = np.where(pos_len > len_rank, len_rank, pos_len) @@ -156,7 +156,7 @@ def precision_(pos_index, pos_len): .. math:: \mathrm {Precision@K} = \frac{|Rel_u \cap Rec_u|}{Rec_u} - :math:`Rel_u` is the set of items relavent to user :math:`U`, + :math:`Rel_u` is the set of items relevant to user :math:`U`, :math:`Rec_u` is the top K items recommended to users. We obtain the result by calculating the average :math:`Precision@K` of each user. @@ -164,8 +164,63 @@ def precision_(pos_index, pos_len): return pos_index.cumsum(axis=1) / np.arange(1, pos_index.shape[1] + 1) -# CTR Metrics # +def gauc_(user_len_list, pos_len_list, pos_rank_sum): + r"""GAUC_ (also known as Group Area Under Curve) is used to evaluate the two-class model, referring to + the area under the ROC curve grouped by user. + + .. _GAUC: https://dl.acm.org/doi/10.1145/3219819.3219823 + + Note: + It calculates the AUC score of each user, and finally obtains GAUC by weighting the user AUC. + It is also not limited to k. Due to our padding for `scores_tensor` in `RankEvaluator` with + `-np.inf`, the padding value will influence the ranks of origin items. Therefore, we use + descending sort here and make an identity transformation to the formula of `AUC`, which is + shown in `auc_` function. For readability, we didn't do simplification in the code. + + .. math:: + \mathrm {GAUC} = \frac {{{M} \times {(M+N+1)} - \frac{M \times (M+1)}{2}} - + \sum\limits_{i=1}^M rank_{i}} {{M} \times {N}} + :math:`M` is the number of positive samples. + :math:`N` is the number of negative samples. + :math:`rank_i` is the descending rank of the ith positive sample. 
+ + """ + neg_len_list = user_len_list - pos_len_list + + # check positive and negative samples + any_without_pos = np.any(pos_len_list == 0) + any_without_neg = np.any(neg_len_list == 0) + non_zero_idx = np.full(len(user_len_list), True, dtype=np.bool) + if any_without_pos: + logger = getLogger() + logger.warning( + "No positive samples in some users, " + "true positive value should be meaningless, " + "these users have been removed from GAUC calculation" + ) + non_zero_idx *= (pos_len_list != 0) + if any_without_neg: + logger = getLogger() + logger.warning( + "No negative samples in some users, " + "false positive value should be meaningless, " + "these users have been removed from GAUC calculation" + ) + non_zero_idx *= (neg_len_list != 0) + if any_without_pos or any_without_neg: + item_list = user_len_list, neg_len_list, pos_len_list, pos_rank_sum + user_len_list, neg_len_list, pos_len_list, pos_rank_sum = \ + map(lambda x: x[non_zero_idx], item_list) + + pair_num = (user_len_list + 1) * pos_len_list - pos_len_list * (pos_len_list + 1) / 2 - np.squeeze(pos_rank_sum) + user_auc = pair_num / (neg_len_list * pos_len_list) + result = (user_auc * pos_len_list).sum() / pos_len_list.sum() + + return result + + +# CTR Metrics # def auc_(trues, preds): r"""AUC_ (also known as Area Under Curve) is used to evaluate the two-class model, referring to the area under the ROC curve @@ -179,11 +234,11 @@ def auc_(trues, preds): .. math:: \mathrm {AUC} = \frac{\sum\limits_{i=1}^M rank_{i} - - {{M} \times {(M+1)}}} {{M} \times {N}} + - \frac {{M} \times {(M+1)}}{2}} {{{M} \times {N}}} :math:`M` is the number of positive samples. :math:`N` is the number of negative samples. - :math:`rank_i` is the rank of the ith positive sample. + :math:`rank_i` is the ascending rank of the ith positive sample. 
""" fps, tps = _binary_clf_curve(trues, preds) @@ -198,16 +253,14 @@ def auc_(trues, preds): if fps[-1] <= 0: logger = getLogger() - logger.warning("No negative samples in y_true, " - "false positive value should be meaningless") + logger.warning("No negative samples in y_true, " "false positive value should be meaningless") fpr = np.repeat(np.nan, fps.shape) else: fpr = fps / fps[-1] if tps[-1] <= 0: logger = getLogger() - logger.warning("No positive samples in y_true, " - "true positive value should be meaningless") + logger.warning("No positive samples in y_true, " "true positive value should be meaningless") tpr = np.repeat(np.nan, tps.shape) else: tpr = tps / tps[-1] @@ -217,6 +270,7 @@ def auc_(trues, preds): # Loss based Metrics # + def mae_(trues, preds): r"""`Mean absolute error regression loss`__ @@ -262,7 +316,7 @@ def log_loss_(trues, preds): eps = 1e-15 preds = np.float64(preds) preds = np.clip(preds, eps, 1 - eps) - loss = np.sum(- trues * np.log(preds) - (1 - trues) * np.log(1 - preds)) + loss = np.sum(-trues * np.log(preds) - (1 - trues) * np.log(1 - preds)) return loss / len(preds) @@ -273,19 +327,15 @@ def log_loss_(trues, preds): # def coverage_(): # raise NotImplementedError - # def gini_index_(): # raise NotImplementedError - # def shannon_entropy_(): # raise NotImplementedError - # def diversity_(): # raise NotImplementedError - """Function name and function mapper. 
Useful when we have to serialize evaluation metric names
and call the functions based on deserialized names
@@ -300,5 +350,6 @@ def log_loss_(trues, preds):
     'rmse': rmse_,
     'mae': mae_,
     'logloss': log_loss_,
-    'auc': auc_
+    'auc': auc_,
+    'gauc': gauc_
 }
diff --git a/recbole/evaluator/proxy_evaluator.py b/recbole/evaluator/proxy_evaluator.py
new file mode 100644
index 000000000..a0df5246e
--- /dev/null
+++ b/recbole/evaluator/proxy_evaluator.py
@@ -0,0 +1,109 @@
+# -*- encoding: utf-8 -*-
+# @Time   : 2020/12/9
+# @Author : Zhichao Feng
+# @email  : fzcbupt@gmail.com

+# UPDATE
+# @Time   : 2020/12/9
+# @Author : Zhichao Feng
+# @email  : fzcbupt@gmail.com

+"""
+recbole.evaluator.proxy_evaluator
+#####################################
+"""

+from collections import ChainMap

+from recbole.evaluator.evaluators import metric_eval_bind, group_metrics, individual_metrics


+class ProxyEvaluator(object):
+    r"""ProxyEvaluator is used to assign the corresponding evaluator according to the evaluation metrics,
+    for example, TopkEvaluator for top-k metrics, and summarize the results of all evaluators.

+    """

+    def __init__(self, config):
+        self.config = config
+        self.valid_metrics = ChainMap(group_metrics, individual_metrics)
+        self.metrics = self.config['metrics']
+        self._check_args()
+        self.evaluators = self.build()

+    def build(self):
+        """assign evaluators according to metrics.

+        Returns:
+            list: a list of evaluators.

+        """
+        evaluator_list = []
+        metrics_list = [metric.lower() for metric in self.metrics]
+        for metrics, evaluator in metric_eval_bind:
+            used_metrics = [metric for metric in metrics_list if metric in metrics]
+            if used_metrics:
+                evaluator_list.append(evaluator(self.config, used_metrics))
+        return evaluator_list

+    def collect(self, interaction, scores):
+        """collect all used evaluators' intermediate results of one batch.
+
+        Args:
+            interaction (Interaction): the interaction data of the batch
+            scores (tensor): the tensor of model output with size of `(N, )`

+        """
+        results = []
+        for evaluator in self.evaluators:
+            results.append(evaluator.collect(interaction, scores))
+        return results

+    def merge_batch_result(self, batch_matrix_list):
+        """merge all the intermediate results obtained in `self.collect` for each used evaluator separately.

+        Args:
+            batch_matrix_list (list): the unseparated results of all batches

+        Returns:
+            dict: the used evaluators' results of all batches

+        """
+        matrix_dict = {}
+        for collect_list in batch_matrix_list:
+            for i, value in enumerate(collect_list):
+                matrix_dict.setdefault(i, []).append(value)

+        return matrix_dict

+    def evaluate(self, batch_matrix_list, eval_data):
+        """calculate the metrics of all batches. It is called at the end of each epoch

+        Args:
+            batch_matrix_list (list): the results of all batches
+            eval_data (Dataset): the class of test data

+        Returns:
+            dict: such as ``{'Hit@20': 0.3824, 'Recall@20': 0.0527, 'Hit@10': 0.3153, 'GAUC': 0.9236}``

+        """
+        matrix_dict = self.merge_batch_result(batch_matrix_list)
+        result_dict = {}
+        for i, evaluator in enumerate(self.evaluators):
+            res = evaluator.evaluate(matrix_dict[i], eval_data)
+            result_dict.update(res)
+        return result_dict

+    def _check_args(self):

+        # Check metrics
+        if isinstance(self.metrics, (str, list)):
+            if isinstance(self.metrics, str):
+                self.metrics = [self.metrics]
+        else:
+            raise TypeError('metrics must be str or list')

+        # Convert metric to lowercase
+        for m in self.metrics:
+            if m.lower() not in self.valid_metrics:
+                raise ValueError("There is no metric named {}!".format(m))
diff --git a/recbole/evaluator/topk_evaluator.py b/recbole/evaluator/topk_evaluator.py
deleted file mode 100644
index b62789b0f..000000000
--- a/recbole/evaluator/topk_evaluator.py
+++ /dev/null
@@ -1,151 +0,0 @@
-# -*- encoding: utf-8 -*-
-# @Time    : 2020/08/04
-# @Author  : Kaiyuan Li
-# @email : tsotfsk@outlook.com - -# UPDATE -# @Time : 2020/08/04, 2020/08/11 -# @Author : Kaiyuan Li, Yupeng Hou -# @email : tsotfsk@outlook.com, houyupeng@ruc.edu.cn - -""" -recbole.evaluator.topk_evaluator -################################ -""" - -import numpy as np -import torch -from recbole.evaluator.abstract_evaluator import AbstractEvaluator -from recbole.evaluator.metrics import metrics_dict -from torch.nn.utils.rnn import pad_sequence - -# These metrics are typical in topk recommendations -topk_metrics = {metric.lower(): metric for metric in ['Hit', 'Recall', 'MRR', 'Precision', 'NDCG', 'MAP']} - - -class TopKEvaluator(AbstractEvaluator): - r"""TopK Evaluator is mainly used in ranking tasks. Now, we support six topk metrics which - contain `'Hit', 'Recall', 'MRR', 'Precision', 'NDCG', 'MAP'`. - - Note: - The metrics used calculate group-based metrics which considers the metrics scores averaged - across users. Some of them are also limited to k. - - """ - def __init__(self, config): - super().__init__(config) - - self.topk = config['topk'] - self._check_args() - - def collect(self, interaction, scores_tensor, full=False): - """collect the topk intermediate result of one batch, this function mainly - implements padding and TopK finding. It is called at the end of each batch - - Args: - interaction (Interaction): :class:`AbstractEvaluator` of the batch - scores_tensor (tensor): the tensor of model output with size of `(N, )` - full (bool, optional): whether it is full sort. Default: False. 
- - """ - user_len_list = interaction.user_len_list - if full is True: - scores_matrix = scores_tensor.view(len(user_len_list), -1) - else: - scores_list = torch.split(scores_tensor, user_len_list, dim=0) - scores_matrix = pad_sequence(scores_list, batch_first=True, padding_value=-np.inf) # nusers x items - - # get topk - _, topk_index = torch.topk(scores_matrix, max(self.topk), dim=-1) # nusers x k - - return topk_index - - def evaluate(self, batch_matrix_list, eval_data): - """calculate the metrics of all batches. It is called at the end of each epoch - - Args: - batch_matrix_list (list): the results of all batches - eval_data (Dataset): the class of test data - - Returns: - dict: such as ``{'Hit@20': 0.3824, 'Recall@20': 0.0527, 'Hit@10': 0.3153, 'Recall@10': 0.0329}`` - - """ - pos_len_list = eval_data.get_pos_len_list() - topk_index = torch.cat(batch_matrix_list, dim=0).cpu().numpy() - - assert len(pos_len_list) == len(topk_index) - # get metrics - metric_dict = {} - result_list = self._calculate_metrics(pos_len_list, topk_index) - for metric, value in zip(self.metrics, result_list): - for k in self.topk: - key = '{}@{}'.format(metric, k) - metric_dict[key] = round(value[k - 1], 4) - return metric_dict - - def _check_args(self): - - # Check metrics - if isinstance(self.metrics, (str, list)): - if isinstance(self.metrics, str): - self.metrics = [self.metrics] - else: - raise TypeError('metrics must be str or list') - - # Convert metric to lowercase - for m in self.metrics: - if m.lower() not in topk_metrics: - raise ValueError("There is no user grouped topk metric named {}!".format(m)) - self.metrics = [metric.lower() for metric in self.metrics] - - # Check topk: - if isinstance(self.topk, (int, list)): - if isinstance(self.topk, int): - self.topk = [self.topk] - for topk in self.topk: - if topk <= 0: - raise ValueError('topk must be a positive integer or a list of positive integers, but get `{}`'.format(topk)) - else: - raise TypeError('The topk must be a 
integer, list') - - def metrics_info(self, pos_idx, pos_len): - """get metrics result - - Args: - pos_idx (np.ndarray): the bool index of all users' topk items that indicating the postive items are - topk items or not - pos_len (list): the length of all users' postivite items - - Returns: - list: a list of matrix which record the results from `1` to `max(topk)` - - """ - result_list = [] - for metric in self.metrics: - metric_fuc = metrics_dict[metric.lower()] - result = metric_fuc(pos_idx, pos_len) - result_list.append(result) - return result_list - - def _calculate_metrics(self, pos_len_list, topk_index): - """integrate the results of each batch and evaluate the topk metrics by users - - Args: - pos_len_list (list): a list of users' positive items - topk_index (np.ndarray): a matrix which contains the index of the topk items for users - - Returns: - np.ndarray: a matrix which contains the metrics result - - """ - - pos_idx_matrix = (topk_index < pos_len_list.reshape(-1, 1)) - result_list = self.metrics_info(pos_idx_matrix, pos_len_list) # n_users x len(metrics) x len(ranks) - result = np.stack(result_list, axis=0).mean(axis=1) # len(metrics) x len(ranks) - return result - - def __str__(self): - mesg = 'The TopK Evaluator Info:\n' + '\tMetrics:[' + ', '.join([topk_metrics[metric.lower()] for metric in self.metrics]) \ - + '], TopK:[' + ', '.join(map(str, self.topk)) +']' - return mesg diff --git a/recbole/evaluator/utils.py b/recbole/evaluator/utils.py index 4f14b2ae8..392873493 100644 --- a/recbole/evaluator/utils.py +++ b/recbole/evaluator/utils.py @@ -14,7 +14,6 @@ """ import itertools -from enum import Enum import numpy as np import torch @@ -54,20 +53,20 @@ def trunc(scores, method): """Round the scores by using the given method Args: - scores (np.ndarray): scores + scores (numpy.ndarray): scores method (str): one of ['ceil', 'floor', 'around'] Raises: NotImplementedError: method error Returns: - np.ndarray: processed scores + numpy.ndarray: processed scores 
""" try: cut_method = getattr(np, method) except NotImplementedError: - raise NotImplementedError("module 'numpy' has no fuction named '{}'".format(method)) + raise NotImplementedError("module 'numpy' has no function named '{}'".format(method)) scores = cut_method(scores) return scores @@ -76,11 +75,11 @@ def cutoff(scores, threshold): """cut of the scores based on threshold Args: - scores (np.ndarray): scores + scores (numpy.ndarray): scores threshold (float): between 0 and 1 Returns: - np.ndarray: processed scores + numpy.ndarray: processed scores """ return np.where(scores > threshold, 1, 0) @@ -93,7 +92,7 @@ def _binary_clf_curve(trues, preds): preds (numpy.ndarray): the predict scores' list Returns: - fps (np.ndarray): A count of false positives, at index i being the number of negative + fps (numpy.ndarray): A count of false positives, at index i being the number of negative samples assigned a score >= thresholds[i] preds (numpy.ndarray): An increasing count of true positives, at index i being the number of positive samples assigned a score >= thresholds[i]. 
diff --git a/recbole/model/abstract_recommender.py b/recbole/model/abstract_recommender.py index 18983b90a..d5d7f8dc0 100644 --- a/recbole/model/abstract_recommender.py +++ b/recbole/model/abstract_recommender.py @@ -7,24 +7,29 @@ # @Author : Shanlei Mu, Yupeng Hou # @Email : slmu@ruc.edu.cn, houyupeng@ruc.edu.cn - """ recbole.model.abstract_recommender ################################## """ +from logging import getLogger + import numpy as np import torch import torch.nn as nn -from recbole.utils import ModelType, InputType, FeatureSource, FeatureType from recbole.model.layers import FMEmbedding, FMFirstOrderLinear +from recbole.utils import ModelType, InputType, FeatureSource, FeatureType class AbstractRecommender(nn.Module): r"""Base class for all models """ + def __init__(self): + self.logger = getLogger() + super(AbstractRecommender, self).__init__() + def calculate_loss(self, interaction): r"""Calculate the training loss for a batch data. @@ -86,7 +91,6 @@ def __init__(self, config, dataset): self.n_items = dataset.num(self.ITEM_ID) # load parameters info - self.batch_size = config['train_batch_size'] self.device = config['device'] @@ -110,7 +114,7 @@ def __init__(self, config, dataset): self.n_items = dataset.num(self.ITEM_ID) def gather_indexes(self, output, gather_index): - """Gathers the vectors at the spexific positions over a minibatch""" + """Gathers the vectors at the specific positions over a minibatch""" gather_index = gather_index.view(-1, 1, 1).expand(-1, -1, output.shape[-1]) output_tensor = output.gather(dim=1, index=gather_index) return output_tensor.squeeze(1) @@ -140,7 +144,6 @@ def __init__(self, config, dataset): self.n_relations = dataset.num(self.RELATION_ID) # load parameters info - self.batch_size = config['train_batch_size'] self.device = config['device'] @@ -215,11 +218,13 @@ def __init__(self, config, dataset): self.num_feature_field += 1 if len(self.token_field_dims) > 0: self.token_field_offsets = np.array((0, 
*np.cumsum(self.token_field_dims)[:-1]), dtype=np.long) - self.token_embedding_table = FMEmbedding(self.token_field_dims, self.token_field_offsets, - self.embedding_size) + self.token_embedding_table = FMEmbedding( + self.token_field_dims, self.token_field_offsets, self.embedding_size + ) if len(self.float_field_dims) > 0: - self.float_embedding_table = nn.Embedding(np.sum(self.float_field_dims, dtype=np.int32), - self.embedding_size) + self.float_embedding_table = nn.Embedding( + np.sum(self.float_field_dims, dtype=np.int32), self.embedding_size + ) if len(self.token_seq_field_dims) > 0: self.token_seq_embedding_table = nn.ModuleList() for token_seq_field_dim in self.token_seq_field_dims: @@ -330,8 +335,10 @@ def double_tower_embed_input_fields(self, interaction): first_dense_embedding, second_dense_embedding = None, None if sparse_embedding is not None: - sizes = [self.user_token_seq_field_num, self.item_token_seq_field_num, - self.user_token_field_num, self.item_token_field_num] + sizes = [ + self.user_token_seq_field_num, self.item_token_seq_field_num, self.user_token_field_num, + self.item_token_field_num + ] first_token_seq_embedding, second_token_seq_embedding, first_token_embedding, second_token_embedding = \ torch.split(sparse_embedding, sizes, dim=1) first_sparse_embedding = torch.cat([first_token_seq_embedding, first_token_embedding], dim=1) @@ -341,6 +348,15 @@ def double_tower_embed_input_fields(self, interaction): return first_sparse_embedding, first_dense_embedding, second_sparse_embedding, second_dense_embedding + def concat_embed_input_fields(self, interaction): + sparse_embedding, dense_embedding = self.embed_input_fields(interaction) + all_embeddings = [] + if sparse_embedding is not None: + all_embeddings.append(sparse_embedding) + if dense_embedding is not None and len(dense_embedding.shape) == 3: + all_embeddings.append(dense_embedding) + return torch.cat(all_embeddings, dim=1) # [batch_size, num_field, embed_dim] + def 
embed_input_fields(self, interaction): """Embed the whole feature columns. @@ -353,8 +369,10 @@ def embed_input_fields(self, interaction): """ float_fields = [] for field_name in self.float_field_names: - float_fields.append(interaction[field_name] - if len(interaction[field_name].shape) == 2 else interaction[field_name].unsqueeze(1)) + if len(interaction[field_name].shape) == 2: + float_fields.append(interaction[field_name]) + else: + float_fields.append(interaction[field_name].unsqueeze(1)) if len(float_fields) > 0: float_fields = torch.cat(float_fields, dim=1) # [batch_size, num_float_field] else: diff --git a/recbole/model/context_aware_recommender/afm.py b/recbole/model/context_aware_recommender/afm.py index 5c615e42c..e1fe1c03c 100644 --- a/recbole/model/context_aware_recommender/afm.py +++ b/recbole/model/context_aware_recommender/afm.py @@ -16,8 +16,8 @@ import torch.nn as nn from torch.nn.init import xavier_normal_, constant_ -from recbole.model.layers import AttLayer from recbole.model.abstract_recommender import ContextRecommender +from recbole.model.layers import AttLayer class AFM(ContextRecommender): @@ -99,15 +99,7 @@ def afm_layer(self, infeature): return att_pooling def forward(self, interaction): - # sparse_embedding shape: [batch_size, num_token_seq_field+num_token_field, embed_dim] or None - # dense_embedding shape: [batch_size, num_float_field] or [batch_size, num_float_field, embed_dim] or None - sparse_embedding, dense_embedding = self.embed_input_fields(interaction) - all_embeddings = [] - if sparse_embedding is not None: - all_embeddings.append(sparse_embedding) - if dense_embedding is not None and len(dense_embedding.shape) == 3: - all_embeddings.append(dense_embedding) - afm_all_embeddings = torch.cat(all_embeddings, dim=1) # [batch_size, num_field, embed_dim] + afm_all_embeddings = self.concat_embed_input_fields(interaction) # [batch_size, num_field, embed_dim] output = self.sigmoid(self.first_order_linear(interaction) + 
self.afm_layer(afm_all_embeddings)) return output.squeeze() diff --git a/recbole/model/context_aware_recommender/autoint.py b/recbole/model/context_aware_recommender/autoint.py index b22e331ee..893d5d8f2 100644 --- a/recbole/model/context_aware_recommender/autoint.py +++ b/recbole/model/context_aware_recommender/autoint.py @@ -13,12 +13,12 @@ """ import torch -import torch.nn.functional as F import torch.nn as nn +import torch.nn.functional as F from torch.nn.init import xavier_normal_, constant_ -from recbole.model.layers import MLPLayers from recbole.model.abstract_recommender import ContextRecommender +from recbole.model.layers import MLPLayers class AutoInt(ContextRecommender): @@ -95,15 +95,7 @@ def autoint_layer(self, infeature): return att_output def forward(self, interaction): - # sparse_embedding shape: [batch_size, num_token_seq_field+num_token_field, embed_dim] or None - # dense_embedding shape: [batch_size, num_float_field] or [batch_size, num_float_field, embed_dim] or None - sparse_embedding, dense_embedding = self.embed_input_fields(interaction) - all_embeddings = [] - if sparse_embedding is not None: - all_embeddings.append(sparse_embedding) - if dense_embedding is not None and len(dense_embedding.shape) == 3: - all_embeddings.append(dense_embedding) - autoint_all_embeddings = torch.cat(all_embeddings, dim=1) # [batch_size, num_field, embed_dim] + autoint_all_embeddings = self.concat_embed_input_fields(interaction) # [batch_size, num_field, embed_dim] output = self.first_order_linear(interaction) + self.autoint_layer(autoint_all_embeddings) return self.sigmoid(output.squeeze(1)) diff --git a/recbole/model/context_aware_recommender/dcn.py b/recbole/model/context_aware_recommender/dcn.py index 3c9c66ecd..0edfc6ad8 100644 --- a/recbole/model/context_aware_recommender/dcn.py +++ b/recbole/model/context_aware_recommender/dcn.py @@ -22,9 +22,9 @@ import torch.nn as nn from torch.nn.init import xavier_normal_, constant_ -from recbole.model.loss import 
RegLoss -from recbole.model.layers import MLPLayers from recbole.model.abstract_recommender import ContextRecommender +from recbole.model.layers import MLPLayers +from recbole.model.loss import RegLoss class DCN(ContextRecommender): @@ -32,6 +32,7 @@ class DCN(ContextRecommender): automatically construct limited high-degree cross features, and learns the corresponding weights. """ + def __init__(self, config, dataset): super(DCN, self).__init__(config, dataset) @@ -43,13 +44,14 @@ def __init__(self, config, dataset): # define layers and loss # init weight and bias of each cross layer - self.cross_layer_parameter = [nn.Parameter(torch.empty(self.num_feature_field * self.embedding_size, - device=self.device)) - for _ in range(self.cross_layer_num * 2)] self.cross_layer_w = nn.ParameterList( - self.cross_layer_parameter[:self.cross_layer_num]) + nn.Parameter(torch.randn(self.num_feature_field * self.embedding_size).to(self.device)) + for _ in range(self.cross_layer_num) + ) self.cross_layer_b = nn.ParameterList( - self.cross_layer_parameter[self.cross_layer_num:]) + nn.Parameter(torch.zeros(self.num_feature_field * self.embedding_size).to(self.device)) + for _ in range(self.cross_layer_num) + ) # size of mlp hidden layer size_list = [self.embedding_size * self.num_feature_field] + self.mlp_hidden_size @@ -97,16 +99,7 @@ def cross_network(self, x_0): return x_l def forward(self, interaction): - # sparse_embedding shape: [batch_size, num_token_seq_field+num_token_field, embed_dim] or None - # dense_embedding shape: [batch_size, num_float_field] or [batch_size, num_float_field, embed_dim] or None - sparse_embedding, dense_embedding = self.embed_input_fields(interaction) - all_embeddings = [] - if sparse_embedding is not None: - all_embeddings.append(sparse_embedding) - if dense_embedding is not None and len(dense_embedding.shape) == 3: - all_embeddings.append(dense_embedding) - - dcn_all_embeddings = torch.cat(all_embeddings, dim=1) # [batch_size, num_field, embed_dim] + 
dcn_all_embeddings = self.concat_embed_input_fields(interaction) # [batch_size, num_field, embed_dim] batch_size = dcn_all_embeddings.shape[0] dcn_all_embeddings = dcn_all_embeddings.view(batch_size, -1) diff --git a/recbole/model/context_aware_recommender/deepfm.py b/recbole/model/context_aware_recommender/deepfm.py index 5ff10298b..f1791564e 100644 --- a/recbole/model/context_aware_recommender/deepfm.py +++ b/recbole/model/context_aware_recommender/deepfm.py @@ -16,12 +16,11 @@ Huifeng Guo et al. "DeepFM: A Factorization-Machine based Neural Network for CTR Prediction." in IJCAI 2017. """ -import torch import torch.nn as nn from torch.nn.init import xavier_normal_, constant_ -from recbole.model.layers import BaseFactorizationMachine, MLPLayers from recbole.model.abstract_recommender import ContextRecommender +from recbole.model.layers import BaseFactorizationMachine, MLPLayers class DeepFM(ContextRecommender): @@ -29,6 +28,7 @@ class DeepFM(ContextRecommender): Also DeepFM can be seen as a combination of FNN and FM. 
""" + def __init__(self, config, dataset): super(DeepFM, self).__init__(config, dataset) @@ -56,20 +56,11 @@ def _init_weights(self, module): constant_(module.bias.data, 0) def forward(self, interaction): - # sparse_embedding shape: [batch_size, num_token_seq_field+num_token_field, embed_dim] or None - # dense_embedding shape: [batch_size, num_float_field] or [batch_size, num_float_field, embed_dim] or None - sparse_embedding, dense_embedding = self.embed_input_fields(interaction) - all_embeddings = [] - if sparse_embedding is not None: - all_embeddings.append(sparse_embedding) - if dense_embedding is not None and len(dense_embedding.shape) == 3: - all_embeddings.append(dense_embedding) - deepfm_all_embeddings = torch.cat(all_embeddings, dim=1) # [batch_size, num_field, embed_dim] + deepfm_all_embeddings = self.concat_embed_input_fields(interaction) # [batch_size, num_field, embed_dim] batch_size = deepfm_all_embeddings.shape[0] y_fm = self.first_order_linear(interaction) + self.fm(deepfm_all_embeddings) - y_deep = self.deep_predict_layer( - self.mlp_layers(deepfm_all_embeddings.view(batch_size, -1))) + y_deep = self.deep_predict_layer(self.mlp_layers(deepfm_all_embeddings.view(batch_size, -1))) y = self.sigmoid(y_fm + y_deep) return y.squeeze() diff --git a/recbole/model/context_aware_recommender/dssm.py b/recbole/model/context_aware_recommender/dssm.py index 8261be2eb..59d951af0 100644 --- a/recbole/model/context_aware_recommender/dssm.py +++ b/recbole/model/context_aware_recommender/dssm.py @@ -4,7 +4,6 @@ # @Email : gmqszyq@qq.com # @File : dssm.py - """ DSSM ################################################ @@ -12,13 +11,12 @@ PS Huang et al. "Learning Deep Structured Semantic Models for Web Search using Clickthrough Data" in CIKM 2013. 
""" - import torch import torch.nn as nn from torch.nn.init import xavier_normal_, constant_ -from recbole.model.layers import MLPLayers from recbole.model.abstract_recommender import ContextRecommender +from recbole.model.layers import MLPLayers class DSSM(ContextRecommender): @@ -26,6 +24,7 @@ class DSSM(ContextRecommender): and uses cosine distance to calculate the distance between the two semantic vectors. """ + def __init__(self, config, dataset): super(DSSM, self).__init__(config, dataset) @@ -43,7 +42,7 @@ def __init__(self, config, dataset): self.item_mlp_layers = MLPLayers(item_size_list, self.dropout_prob, activation='tanh', bn=True) self.loss = nn.BCELoss() - self.sigmod = nn.Sigmoid() + self.sigmoid = nn.Sigmoid() # parameters initialization self.apply(self._init_weights) @@ -86,7 +85,7 @@ def forward(self, interaction): item_dnn_out = self.item_mlp_layers(embed_item.view(batch_size, -1)) score = torch.cosine_similarity(user_dnn_out, item_dnn_out, dim=1) - sig_score = self.sigmod(score) + sig_score = self.sigmoid(score) return sig_score.squeeze() def calculate_loss(self, interaction): diff --git a/recbole/model/context_aware_recommender/ffm.py b/recbole/model/context_aware_recommender/ffm.py index 617860ea1..6879adaef 100644 --- a/recbole/model/context_aware_recommender/ffm.py +++ b/recbole/model/context_aware_recommender/ffm.py @@ -8,7 +8,7 @@ FFM ##################################################### Reference: - Yuchin Juan et al. "Field-aware Factorization Machines for CTR Prediction" in RecSys 2016. + Yuchin Juan et al. "Field-aware Factorization Machines for CTR Prediction" in RecSys 2016. 
Reference code: https://github.com/rixwew/pytorch-fm @@ -37,19 +37,22 @@ def __init__(self, config, dataset): super(FFM, self).__init__(config, dataset) # load parameters info - self.fields = config['fields'] # a dict; key: field_id; value: feature_list + self.fields = config['fields'] # a dict; key: field_id; value: feature_list self.sigmoid = nn.Sigmoid() self.feature2id = {} self.feature2field = {} - + self.feature_names = (self.token_field_names, self.float_field_names, self.token_seq_field_names) self.feature_dims = (self.token_field_dims, self.float_field_dims, self.token_seq_field_dims) self._get_feature2field() self.num_fields = len(set(self.feature2field.values())) # the number of fields - self.ffm = FieldAwareFactorizationMachine(self.feature_names, self.feature_dims, self.feature2id, self.feature2field, self.num_fields, self.embedding_size, self.device) + self.ffm = FieldAwareFactorizationMachine( + self.feature_names, self.feature_dims, self.feature2id, self.feature2field, self.num_fields, + self.embedding_size, self.device + ) self.loss = nn.BCELoss() # parameters initialization @@ -73,7 +76,7 @@ def _get_feature2field(self): for name in names: self.feature2id[name] = fea_id fea_id += 1 - + if self.fields is None: field_id = 0 for key, value in self.feature2id.items(): @@ -96,25 +99,25 @@ def get_ffm_input(self, interaction): for tn in self.token_field_names: token_ffm_input.append(torch.unsqueeze(interaction[tn], 1)) if len(token_ffm_input) > 0: - token_ffm_input = torch.cat(token_ffm_input, dim=1) # [batch_size, num_token_features] + token_ffm_input = torch.cat(token_ffm_input, dim=1) # [batch_size, num_token_features] float_ffm_input = [] if self.float_field_names is not None: for fn in self.float_field_names: float_ffm_input.append(torch.unsqueeze(interaction[fn], 1)) if len(float_ffm_input) > 0: - float_ffm_input = torch.cat(float_ffm_input, dim=1) # [batch_size, num_float_features] + float_ffm_input = torch.cat(float_ffm_input, dim=1) # 
[batch_size, num_float_features] token_seq_ffm_input = [] if self.token_seq_field_names is not None: for tsn in self.token_seq_field_names: - token_seq_ffm_input.append(interaction[tsn]) # a list + token_seq_ffm_input.append(interaction[tsn]) # a list - return (token_ffm_input, float_ffm_input, token_seq_ffm_input) + return token_ffm_input, float_ffm_input, token_seq_ffm_input def forward(self, interaction): ffm_input = self.get_ffm_input(interaction) ffm_output = torch.sum(torch.sum(self.ffm(ffm_input), dim=1), dim=1, keepdim=True) output = self.sigmoid(self.first_order_linear(interaction) + ffm_output) - + return output.squeeze() def calculate_loss(self, interaction): @@ -144,7 +147,8 @@ def __init__(self, feature_names, feature_dims, feature2id, feature2field, num_f self.feature2id = feature2id self.feature2field = feature2field - self.num_features = len(self.token_feature_names) + len(self.float_feature_names) + len(self.token_seq_feature_names) + self.num_features = len(self.token_feature_names) + len(self.float_feature_names) \ + + len(self.token_seq_feature_names) self.num_fields = num_fields self.embed_dim = embed_dim self.device = device @@ -201,7 +205,7 @@ def forward(self, input_x): token_input_x_emb = self._emb_token_ffm_input(token_ffm_input) float_input_x_emb = self._emb_float_ffm_input(float_ffm_input) token_seq_input_x_emb = self._emb_token_seq_ffm_input(token_seq_ffm_input) - + input_x_emb = self._get_input_x_emb(token_input_x_emb, float_input_x_emb, token_seq_input_x_emb) output = list() @@ -215,24 +219,17 @@ def forward(self, input_x): def _get_input_x_emb(self, token_input_x_emb, float_input_x_emb, token_seq_input_x_emb): # merge different types of field-aware embeddings input_x_emb = [] # [num_fields: [batch_size, num_fields, emb_dim]] - if len(self.token_feature_names) > 0 and len(self.float_feature_names) > 0 and len(self.token_seq_feature_names) > 0: - for i in range(self.num_fields): - input_x_emb.append(torch.cat([token_input_x_emb[i], 
float_input_x_emb[i], token_seq_input_x_emb[i]], dim=1)) - elif len(self.token_feature_names) > 0 and len(self.float_feature_names) > 0: - for i in range(self.num_fields): - input_x_emb.append(torch.cat([token_input_x_emb[i], float_input_x_emb[i]], dim=1)) - elif len(self.float_feature_names) > 0 and len(self.token_seq_feature_names) > 0: - for i in range(self.num_fields): - input_x_emb.append(torch.cat([float_input_x_emb[i], token_seq_input_x_emb[i]], dim=1)) - elif len(self.token_feature_names) > 0 and len(self.token_seq_feature_names) > 0: - for i in range(self.num_fields): - input_x_emb.append(torch.cat([token_input_x_emb[i], token_seq_input_x_emb[i]], dim=1)) - elif len(self.token_feature_names) > 0: - input_x_emb = token_input_x_emb - elif len(self.float_feature_names) > 0: - input_x_emb = float_input_x_emb - elif len(self.token_seq_feature_names) > 0: - input_x_emb = token_seq_input_x_emb + + zip_args = [] + if len(self.token_feature_names) > 0: + zip_args.append(token_input_x_emb) + if len(self.float_feature_names) > 0: + zip_args.append(float_input_x_emb) + if len(self.token_seq_feature_names) > 0: + zip_args.append(token_seq_input_x_emb) + + for tensors in zip(*zip_args): + input_x_emb.append(torch.cat(tensors, dim=1)) return input_x_emb @@ -241,7 +238,9 @@ def _emb_token_ffm_input(self, token_ffm_input): token_input_x_emb = [] if len(self.token_feature_names) > 0: token_input_x = token_ffm_input + token_ffm_input.new_tensor(self.token_offsets).unsqueeze(0) - token_input_x_emb = [self.token_embeddings[i](token_input_x) for i in range(self.num_fields)] # [num_fields: [batch_size, num_token_features, emb_dim]] + token_input_x_emb = [ + self.token_embeddings[i](token_input_x) for i in range(self.num_fields) + ] # [num_fields: [batch_size, num_token_features, emb_dim]] return token_input_x_emb @@ -249,8 +248,13 @@ def _emb_float_ffm_input(self, float_ffm_input): # get float field-aware embeddings float_input_x_emb = [] if len(self.float_feature_names) > 0: - 
index = torch.arange(0, self.num_float_features).unsqueeze(0).expand_as(float_ffm_input).long().to(self.device) # [batch_size, num_float_features] - float_input_x_emb = [torch.mul(self.float_embeddings[i](index), float_ffm_input.unsqueeze(2)) for i in range(self.num_fields)] # [num_fields: [batch_size, num_float_features, emb_dim]] + index = torch.arange(0, self.num_float_features).unsqueeze(0).expand_as(float_ffm_input).long().to( + self.device + ) # [batch_size, num_float_features] + float_input_x_emb = [ + torch.mul(self.float_embeddings[i](index), float_ffm_input.unsqueeze(2)) + for i in range(self.num_fields) + ] # [num_fields: [batch_size, num_float_features, emb_dim]] return float_input_x_emb @@ -276,6 +280,8 @@ def _emb_token_seq_ffm_input(self, token_seq_ffm_input): result = result.unsqueeze(1) # [batch_size, 1, embed_dim] token_seq_result.append(result) - token_seq_input_x_emb.append(torch.cat(token_seq_result, dim=1)) # [num_fields: batch_size, num_token_seq_features, embed_dim] + token_seq_input_x_emb.append( + torch.cat(token_seq_result, dim=1) + ) # [num_fields: batch_size, num_token_seq_features, embed_dim] return token_seq_input_x_emb diff --git a/recbole/model/context_aware_recommender/fm.py b/recbole/model/context_aware_recommender/fm.py index ae64f9bf5..c43b2aa21 100644 --- a/recbole/model/context_aware_recommender/fm.py +++ b/recbole/model/context_aware_recommender/fm.py @@ -16,12 +16,11 @@ Steffen Rendle et al. "Factorization Machines." in ICDM 2010. 
""" -import torch import torch.nn as nn from torch.nn.init import xavier_normal_ -from recbole.model.layers import BaseFactorizationMachine from recbole.model.abstract_recommender import ContextRecommender +from recbole.model.layers import BaseFactorizationMachine class FM(ContextRecommender): @@ -46,15 +45,7 @@ def _init_weights(self, module): xavier_normal_(module.weight.data) def forward(self, interaction): - # sparse_embedding shape: [batch_size, num_token_seq_field+num_token_field, embed_dim] or None - # dense_embedding shape: [batch_size, num_float_field] or [batch_size, num_float_field, embed_dim] or None - sparse_embedding, dense_embedding = self.embed_input_fields(interaction) - all_embeddings = [] - if sparse_embedding is not None: - all_embeddings.append(sparse_embedding) - if dense_embedding is not None and len(dense_embedding.shape) == 3: - all_embeddings.append(dense_embedding) - fm_all_embeddings = torch.cat(all_embeddings, dim=1) + fm_all_embeddings = self.concat_embed_input_fields(interaction) # [batch_size, num_field, embed_dim] y = self.sigmoid(self.first_order_linear(interaction) + self.fm(fm_all_embeddings)) return y.squeeze() diff --git a/recbole/model/context_aware_recommender/fnn.py b/recbole/model/context_aware_recommender/fnn.py index 70cda0373..d8e2a1987 100644 --- a/recbole/model/context_aware_recommender/fnn.py +++ b/recbole/model/context_aware_recommender/fnn.py @@ -11,12 +11,11 @@ Weinan Zhang1 et al. 
"Deep Learning over Multi-field Categorical Data" in ECIR 2016 """ -import torch import torch.nn as nn from torch.nn.init import xavier_normal_, constant_ -from recbole.model.layers import MLPLayers from recbole.model.abstract_recommender import ContextRecommender +from recbole.model.layers import MLPLayers class FNN(ContextRecommender): @@ -59,15 +58,7 @@ def _init_weights(self, module): constant_(module.bias.data, 0) def forward(self, interaction): - # sparse_embedding shape: [batch_size, num_token_seq_field+num_token_field, embed_dim] or None - # dense_embedding shape: [batch_size, num_float_field] or [batch_size, num_float_field, embed_dim] or None - sparse_embedding, dense_embedding = self.embed_input_fields(interaction) - all_embeddings = [] - if sparse_embedding is not None: - all_embeddings.append(sparse_embedding) - if dense_embedding is not None and len(dense_embedding.shape) == 3: - all_embeddings.append(dense_embedding) - fnn_all_embeddings = torch.cat(all_embeddings, dim=1) # [batch_size, num_field, embed_dim] + fnn_all_embeddings = self.concat_embed_input_fields(interaction) # [batch_size, num_field, embed_dim] batch_size = fnn_all_embeddings.shape[0] output = self.predict_layer(self.mlp_layers(fnn_all_embeddings.view(batch_size, -1))) diff --git a/recbole/model/context_aware_recommender/fwfm.py b/recbole/model/context_aware_recommender/fwfm.py index 5de226476..5c1280f2d 100644 --- a/recbole/model/context_aware_recommender/fwfm.py +++ b/recbole/model/context_aware_recommender/fwfm.py @@ -35,20 +35,20 @@ def __init__(self, config, dataset): # load parameters info self.dropout_prob = config['dropout_prob'] - self.fields = config['fields'] # a dict; key: field_id; value: feature_list + self.fields = config['fields'] # a dict; key: field_id; value: feature_list self.num_features = self.num_feature_field - + self.dropout_layer = nn.Dropout(p=self.dropout_prob) self.sigmoid = nn.Sigmoid() self.feature2id = {} self.feature2field = {} - + self.feature_names = 
(self.token_field_names, self.token_seq_field_names, self.float_field_names) self.feature_dims = (self.token_field_dims, self.token_seq_field_dims, self.float_field_dims) self._get_feature2field() - self.num_fields = len(set(self.feature2field.values())) # the number of fields + self.num_fields = len(set(self.feature2field.values())) # the number of fields self.num_pair = self.num_fields * self.num_fields self.loss = nn.BCELoss() @@ -71,11 +71,10 @@ def _get_feature2field(self): fea_id = 0 for names in self.feature_names: if names is not None: - print(names) for name in names: self.feature2id[name] = fea_id fea_id += 1 - + if self.fields is None: field_id = 0 for key, value in self.feature2id.items(): @@ -101,33 +100,27 @@ def fwfm_layer(self, infeature): """ # get r(Fi, Fj) batch_size = infeature.shape[0] - para = torch.randn(self.num_fields*self.num_fields*self.embedding_size).expand(batch_size, self.num_fields*self.num_fields*self.embedding_size).to(self.device) # [batch_size*num_pairs*emb_dim] - para = torch.reshape(para, (batch_size, self.num_fields, self.num_fields, self.embedding_size)) - r = nn.Parameter(para, requires_grad=True) # [batch_size, num_fields, num_fields, emb_dim] + para = torch.randn(self.num_fields * self.num_fields * self.embedding_size).\ + expand(batch_size, self.num_fields * self.num_fields * self.embedding_size).\ + to(self.device) # [batch_size*num_pairs*emb_dim] + para = para.reshape(batch_size, self.num_fields, self.num_fields, self.embedding_size) + r = nn.Parameter(para, requires_grad=True) # [batch_size, num_fields, num_fields, emb_dim] - fwfm_inter = list() # [batch_size, num_fields, emb_dim] + fwfm_inter = list() # [batch_size, num_fields, emb_dim] for i in range(self.num_features - 1): for j in range(i + 1, self.num_features): Fi, Fj = self.feature2field[i], self.feature2field[j] fwfm_inter.append(infeature[:, i] * infeature[:, j] * r[:, Fi, Fj]) fwfm_inter = torch.stack(fwfm_inter, dim=1) - fwfm_inter = torch.sum(fwfm_inter, 
dim=1) # [batch_size, emb_dim] - fwfm_inter = self.dropout_layer(fwfm_inter) + fwfm_inter = torch.sum(fwfm_inter, dim=1) # [batch_size, emb_dim] + fwfm_inter = self.dropout_layer(fwfm_inter) fwfm_output = torch.sum(fwfm_inter, dim=1, keepdim=True) # [batch_size, 1] return fwfm_output def forward(self, interaction): - # sparse_embedding shape: [batch_size, num_token_seq_field+num_token_field, embed_dim] or None - # dense_embedding shape: [batch_size, num_float_field] or [batch_size, num_float_field, embed_dim] or None - sparse_embedding, dense_embedding = self.embed_input_fields(interaction) - all_embeddings = [] - if sparse_embedding is not None: - all_embeddings.append(sparse_embedding) - if dense_embedding is not None and len(dense_embedding.shape) == 3: - all_embeddings.append(dense_embedding) - fwfm_all_embeddings = torch.cat(all_embeddings, dim=1) # [batch_size, num_field, embed_dim] + fwfm_all_embeddings = self.concat_embed_input_fields(interaction) # [batch_size, num_field, embed_dim] output = self.sigmoid(self.first_order_linear(interaction) + self.fwfm_layer(fwfm_all_embeddings)) diff --git a/recbole/model/context_aware_recommender/lr.py b/recbole/model/context_aware_recommender/lr.py index e8d35f347..0c3a6a759 100644 --- a/recbole/model/context_aware_recommender/lr.py +++ b/recbole/model/context_aware_recommender/lr.py @@ -27,6 +27,7 @@ class LR(ContextRecommender): Z = \sum_{i} {w_i}{x_i} """ + def __init__(self, config, dataset): super(LR, self).__init__(config, dataset) diff --git a/recbole/model/context_aware_recommender/nfm.py b/recbole/model/context_aware_recommender/nfm.py index d967b9daf..4e0e1750f 100644 --- a/recbole/model/context_aware_recommender/nfm.py +++ b/recbole/model/context_aware_recommender/nfm.py @@ -11,18 +11,18 @@ He X, Chua T S. 
"Neural factorization machines for sparse predictive analytics" in SIGIR 2017 """ -import torch import torch.nn as nn from torch.nn.init import xavier_normal_, constant_ -from recbole.model.layers import BaseFactorizationMachine, MLPLayers from recbole.model.abstract_recommender import ContextRecommender +from recbole.model.layers import BaseFactorizationMachine, MLPLayers class NFM(ContextRecommender): """ NFM replace the fm part as a mlp to model the feature interaction. """ + def __init__(self, config, dataset): super(NFM, self).__init__(config, dataset) @@ -51,18 +51,11 @@ def _init_weights(self, module): constant_(module.bias.data, 0) def forward(self, interaction): - # sparse_embedding shape: [batch_size, num_token_seq_field+num_token_field, embed_dim] or None - # dense_embedding shape: [batch_size, num_float_field] or [batch_size, num_float_field, embed_dim] or None - sparse_embedding, dense_embedding = self.embed_input_fields(interaction) - all_embeddings = [] - if sparse_embedding is not None: - all_embeddings.append(sparse_embedding) - if dense_embedding is not None and len(dense_embedding.shape) == 3: - all_embeddings.append(dense_embedding) - nfm_all_embeddings = torch.cat(all_embeddings, dim=1) # [batch_size, num_field, embed_dim] + nfm_all_embeddings = self.concat_embed_input_fields(interaction) # [batch_size, num_field, embed_dim] bn_nfm_all_embeddings = self.bn(self.fm(nfm_all_embeddings)) - output = self.sigmoid(self.predict_layer(self.mlp_layers(bn_nfm_all_embeddings)) + self.first_order_linear(interaction)) + output = self.predict_layer(self.mlp_layers(bn_nfm_all_embeddings)) + self.first_order_linear(interaction) + output = self.sigmoid(output) return output.squeeze() def calculate_loss(self, interaction): diff --git a/recbole/model/context_aware_recommender/pnn.py b/recbole/model/context_aware_recommender/pnn.py index 13933fe2a..bac4941cf 100644 --- a/recbole/model/context_aware_recommender/pnn.py +++ 
b/recbole/model/context_aware_recommender/pnn.py @@ -20,8 +20,8 @@ import torch.nn as nn from torch.nn.init import xavier_normal_, constant_ -from recbole.model.layers import MLPLayers from recbole.model.abstract_recommender import ContextRecommender +from recbole.model.layers import MLPLayers class PNN(ContextRecommender): @@ -50,8 +50,7 @@ def __init__(self, config, dataset): if self.use_outer: product_out_dim += self.num_pair - self.outer_product = OuterProductLayer( - self.num_feature_field, self.embedding_size, device=self.device) + self.outer_product = OuterProductLayer(self.num_feature_field, self.embedding_size, device=self.device) size_list = [product_out_dim] + self.mlp_hidden_size self.mlp_layers = MLPLayers(size_list, self.dropout_prob, bn=False) self.predict_layer = nn.Linear(self.mlp_hidden_size[-1], 1) @@ -64,7 +63,7 @@ def __init__(self, config, dataset): def reg_loss(self): """Calculate the L2 normalization loss of model parameters. - Including weight matrixes of mlp layers. + Including weight matrices of mlp layers. Returns: loss(torch.FloatTensor): The L2 Loss tensor. 
shape of [1,] @@ -84,15 +83,7 @@ def _init_weights(self, module): constant_(module.bias.data, 0) def forward(self, interaction): - # sparse_embedding shape: [batch_size, num_token_seq_field+num_token_field, embed_dim] or None - # dense_embedding shape: [batch_size, num_float_field] or [batch_size, num_float_field, embed_dim] or None - sparse_embedding, dense_embedding = self.embed_input_fields(interaction) - all_embeddings = [] - if sparse_embedding is not None: - all_embeddings.append(sparse_embedding) - if dense_embedding is not None and len(dense_embedding.shape) == 3: - all_embeddings.append(dense_embedding) - pnn_all_embeddings = torch.cat(all_embeddings, dim=1) # [batch_size, num_field, embed_dim] + pnn_all_embeddings = self.concat_embed_input_fields(interaction) # [batch_size, num_field, embed_dim] batch_size = pnn_all_embeddings.shape[0] # linear part linear_part = pnn_all_embeddings.view(batch_size, -1) # [batch_size,num_field*embed_dim] @@ -104,7 +95,7 @@ def forward(self, interaction): if self.use_outer: outer_product = self.outer_product(pnn_all_embeddings).view(batch_size, -1) # [batch_size,num_pairs] output.append(outer_product) - output = torch.cat(output, dim=1) # [batch_size,d] + output = torch.cat(output, dim=1) # [batch_size,d] output = self.predict_layer(self.mlp_layers(output)) # [batch_size,1] output = self.sigmoid(output) @@ -125,6 +116,7 @@ class InnerProductLayer(nn.Module): product or inner product between feature vectors. """ + def __init__(self, num_feature_field, device): """ Args: @@ -159,7 +151,7 @@ def forward(self, feat_emb): class OuterProductLayer(nn.Module): - """OutterProduct Layer used in PNN. This implemention is + """OuterProduct Layer used in PNN. This implementation is adapted from code that the author of the paper published on https://github.com/Atomu2014/product-nets. 
""" diff --git a/recbole/model/context_aware_recommender/widedeep.py b/recbole/model/context_aware_recommender/widedeep.py index 72a36a7f8..e1c820efd 100644 --- a/recbole/model/context_aware_recommender/widedeep.py +++ b/recbole/model/context_aware_recommender/widedeep.py @@ -11,12 +11,11 @@ Heng-Tze Cheng et al. "Wide & Deep Learning for Recommender Systems." in RecSys 2016. """ -import torch import torch.nn as nn from torch.nn.init import xavier_normal_, constant_ -from recbole.model.layers import MLPLayers from recbole.model.abstract_recommender import ContextRecommender +from recbole.model.layers import MLPLayers class WideDeep(ContextRecommender): @@ -54,20 +53,11 @@ def _init_weights(self, module): constant_(module.bias.data, 0) def forward(self, interaction): - # sparse_embedding shape: [batch_size, num_token_seq_field+num_token_field, embed_dim] or None - # dense_embedding shape: [batch_size, num_float_field] or [batch_size, num_float_field, embed_dim] or None - sparse_embedding, dense_embedding = self.embed_input_fields(interaction) - all_embeddings = [] - if sparse_embedding is not None: - all_embeddings.append(sparse_embedding) - if dense_embedding is not None and len(dense_embedding.shape) == 3: - all_embeddings.append(dense_embedding) - widedeep_all_embeddings = torch.cat(all_embeddings, dim=1) # [batch_size, num_field, embed_dim] + widedeep_all_embeddings = self.concat_embed_input_fields(interaction) # [batch_size, num_field, embed_dim] batch_size = widedeep_all_embeddings.shape[0] fm_output = self.first_order_linear(interaction) - deep_output = self.deep_predict_layer( - self.mlp_layers(widedeep_all_embeddings.view(batch_size, -1))) + deep_output = self.deep_predict_layer(self.mlp_layers(widedeep_all_embeddings.view(batch_size, -1))) output = self.sigmoid(fm_output + deep_output) return output.squeeze() diff --git a/recbole/model/context_aware_recommender/xdeepfm.py b/recbole/model/context_aware_recommender/xdeepfm.py index 2a5d60326..0af6ef770 
100644 --- a/recbole/model/context_aware_recommender/xdeepfm.py +++ b/recbole/model/context_aware_recommender/xdeepfm.py @@ -23,10 +23,9 @@ import torch import torch.nn as nn from torch.nn.init import xavier_normal_, constant_ -from logging import getLogger -from recbole.model.layers import MLPLayers, activation_layer from recbole.model.abstract_recommender import ContextRecommender +from recbole.model.layers import MLPLayers, activation_layer class xDeepFM(ContextRecommender): @@ -49,9 +48,10 @@ def __init__(self, config, dataset): if not self.direct: self.cin_layer_size = list(map(lambda x: int(x // 2 * 2), temp_cin_size)) if self.cin_layer_size[:-1] != temp_cin_size[:-1]: - logger = getLogger() - logger.warning('Layer size of CIN should be even except for the last layer when direct is True.' - 'It is changed to {}'.format(self.cin_layer_size)) + self.logger.warning( + 'Layer size of CIN should be even except for the last layer when direct is True.' + 'It is changed to {}'.format(self.cin_layer_size) + ) # Create a convolutional layer for each CIN layer self.conv1d_list = [] @@ -65,8 +65,7 @@ def __init__(self, config, dataset): self.field_nums.append(layer_size // 2) # Create MLP layer - size_list = [self.embedding_size * self.num_feature_field - ] + self.mlp_hidden_size + [1] + size_list = [self.embedding_size * self.num_feature_field] + self.mlp_hidden_size + [1] self.mlp_layers = MLPLayers(size_list, dropout=self.dropout_prob) # Get the output size of CIN @@ -102,7 +101,7 @@ def reg_loss(self, parameters): def calculate_reg_loss(self): """Calculate the final L2 normalization loss of model parameters. - Including weight matrixes of mlp layers, linear layer and convolutional layers. + Including weight matrices of mlp layers, linear layer and convolutional layers. Returns: loss(torch.FloatTensor): The L2 Loss tensor. 
shape of [1,] @@ -158,8 +157,7 @@ def compressed_interaction_network(self, input_features, activation='identity'): next_hidden = output else: if i != len(self.cin_layer_size) - 1: - next_hidden, direct_connect = torch.split( - output, 2 * [layer_size // 2], 1) + next_hidden, direct_connect = torch.split(output, 2 * [layer_size // 2], 1) else: direct_connect = output next_hidden = 0 @@ -171,15 +169,8 @@ def compressed_interaction_network(self, input_features, activation='identity'): return result def forward(self, interaction): - sparse_embedding, dense_embedding = self.embed_input_fields(interaction) - all_embeddings = [] - if sparse_embedding is not None: - all_embeddings.append(sparse_embedding) - if dense_embedding is not None and len(dense_embedding.shape) == 3: - all_embeddings.append(dense_embedding) - # Get the output of CIN. - xdeepfm_input = torch.cat(all_embeddings, dim=1) # [batch_size, num_field, embed_dim] + xdeepfm_input = self.concat_embed_input_fields(interaction) # [batch_size, num_field, embed_dim] cin_output = self.compressed_interaction_network(xdeepfm_input) cin_output = self.cin_linear(cin_output) diff --git a/recbole/model/exlib_recommender/xgboost.py b/recbole/model/exlib_recommender/xgboost.py new file mode 100644 index 000000000..82d8d6db1 --- /dev/null +++ b/recbole/model/exlib_recommender/xgboost.py @@ -0,0 +1,26 @@ +# -*- coding: utf-8 -*- +# @Time : 2020/11/19 +# @Author : Chen Yang +# @Email : 254170321@qq.com + +r""" +recbole.model.exlib_recommender.xgboost +######################################## +""" + +import xgboost as xgb +from recbole.utils import ModelType, InputType + + +class xgboost(xgb.Booster): + r"""xgboost is inherited from xgb.Booster + + """ + type = ModelType.XGBOOST + input_type = InputType.POINTWISE + + def __init__(self, config, dataset): + super().__init__(params=None, cache=(), model_file=None) + + def to(self, device): + return self diff --git a/recbole/model/general_recommender/__init__.py 
b/recbole/model/general_recommender/__init__.py index 7766bf1c3..7d029f280 100644 --- a/recbole/model/general_recommender/__init__.py +++ b/recbole/model/general_recommender/__init__.py @@ -1,4 +1,5 @@ from recbole.model.general_recommender.bpr import BPR +from recbole.model.general_recommender.cdae import CDAE from recbole.model.general_recommender.convncf import ConvNCF from recbole.model.general_recommender.dgcf import DGCF from recbole.model.general_recommender.dmf import DMF @@ -6,6 +7,10 @@ from recbole.model.general_recommender.gcmc import GCMC from recbole.model.general_recommender.itemknn import ItemKNN from recbole.model.general_recommender.lightgcn import LightGCN +from recbole.model.general_recommender.line import LINE +from recbole.model.general_recommender.macridvae import MacridVAE +from recbole.model.general_recommender.multidae import MultiDAE +from recbole.model.general_recommender.multivae import MultiVAE from recbole.model.general_recommender.nais import NAIS from recbole.model.general_recommender.neumf import NeuMF from recbole.model.general_recommender.ngcf import NGCF diff --git a/recbole/model/general_recommender/bpr.py b/recbole/model/general_recommender/bpr.py index 586f83df2..8886ab25d 100644 --- a/recbole/model/general_recommender/bpr.py +++ b/recbole/model/general_recommender/bpr.py @@ -8,7 +8,6 @@ # @Author : Shanlei Mu # @Email : slmu@ruc.edu.cn - r""" BPR ################################################ @@ -19,10 +18,10 @@ import torch import torch.nn as nn -from recbole.utils import InputType from recbole.model.abstract_recommender import GeneralRecommender -from recbole.model.loss import BPRLoss from recbole.model.init import xavier_normal_initialization +from recbole.model.loss import BPRLoss +from recbole.utils import InputType class BPR(GeneralRecommender): diff --git a/recbole/model/general_recommender/cdae.py b/recbole/model/general_recommender/cdae.py new file mode 100644 index 000000000..627b01abd --- /dev/null +++ 
b/recbole/model/general_recommender/cdae.py @@ -0,0 +1,131 @@ +# -*- coding: utf-8 -*- +# @Time : 2020/12/12 +# @Author : Xingyu Pan +# @Email : panxy@ruc.edu.cn + +r""" +CDAE +################################################ +Reference: + Yao Wu et al., Collaborative denoising auto-encoders for top-n recommender systems. WSDM 2016. + +Reference code: + https://github.com/jasonyaw/CDAE +""" + +import torch +import torch.nn as nn + +from recbole.model.abstract_recommender import GeneralRecommender +from recbole.model.init import xavier_normal_initialization +from recbole.utils import InputType + + +class CDAE(GeneralRecommender): + r"""Collaborative Denoising Auto-Encoder (CDAE) is a recommendation model + for top-N recommendation that utilizes the idea of Denoising Auto-Encoders. + We implement the CDAE model with only a user dataloader. + """ + input_type = InputType.POINTWISE + + def __init__(self, config, dataset): + super(CDAE, self).__init__(config, dataset) + + self.reg_weight_1 = config['reg_weight_1'] + self.reg_weight_2 = config['reg_weight_2'] + self.loss_type = config['loss_type'] + self.hid_activation = config['hid_activation'] + self.out_activation = config['out_activation'] + self.embedding_size = config['embedding_size'] + self.corruption_ratio = config['corruption_ratio'] + + self.history_item_id, self.history_item_value, _ = dataset.history_item_matrix() + self.history_item_id = self.history_item_id.to(self.device) + self.history_item_value = self.history_item_value.to(self.device) + + if self.hid_activation == 'sigmoid': + self.h_act = nn.Sigmoid() + elif self.hid_activation == 'relu': + self.h_act = nn.ReLU() + elif self.hid_activation == 'tanh': + self.h_act = nn.Tanh() + else: + raise ValueError('Invalid hidden layer activation function') + + if self.out_activation == 'sigmoid': + self.o_act = nn.Sigmoid() + elif self.out_activation == 'relu': + self.o_act = nn.ReLU() + else: + raise ValueError('Invalid output layer activation
function') + + self.dropout = nn.Dropout(p=self.corruption_ratio) + + self.h_user = nn.Embedding(self.n_users, self.embedding_size) + self.h_item = nn.Linear(self.n_items, self.embedding_size) + self.out_layer = nn.Linear(self.embedding_size, self.n_items) + + # parameters initialization + self.apply(xavier_normal_initialization) + + def forward(self, x_items, x_users): + h_i = self.dropout(x_items) + h_i = self.h_item(h_i) + h_u = self.h_user(x_users) + h = torch.add(h_u, h_i) + h = self.h_act(h) + out = self.out_layer(h) + return self.o_act(out) + + def get_rating_matrix(self, user): + r"""Get a batch of user's feature with the user's id and history interaction matrix. + + Args: + user (torch.LongTensor): The input tensor that contains user's id, shape: [batch_size, ] + + Returns: + torch.FloatTensor: The user's feature of a batch of user, shape: [batch_size, n_items] + """ + # Following lines construct tensor of shape [B,n_items] using the tensor of shape [B,H] + col_indices = self.history_item_id[user].flatten() + row_indices = torch.arange(user.shape[0]).to(self.device) \ + .repeat_interleave(self.history_item_id.shape[1], dim=0) + rating_matrix = torch.zeros(1).to(self.device).repeat(user.shape[0], self.n_items) + rating_matrix.index_put_((row_indices, col_indices), self.history_item_value[user].flatten()) + return rating_matrix + + def calculate_loss(self, interaction): + x_users = interaction[self.USER_ID] + x_items = self.get_rating_matrix(x_users) + predict = self.forward(x_items, x_users) + + if self.loss_type == 'MSE': + loss_func = nn.MSELoss(reduction='sum') + elif self.loss_type == 'BCE': + loss_func = nn.BCELoss(reduction='sum') + else: + raise ValueError('Invalid loss_type, loss_type must in [MSE, BCE]') + + loss = loss_func(predict, x_items) + # l1-regularization + loss += self.reg_weight_1 * (self.h_user.weight.norm(p=1) + self.h_item.weight.norm(p=1)) + # l2-regularization + loss += self.reg_weight_2 * (self.h_user.weight.norm() + 
self.h_item.weight.norm()) + + return loss + + def predict(self, interaction): + users = interaction[self.USER_ID] + predict_items = interaction[self.ITEM_ID] + + items = self.get_rating_matrix(users) + scores = self.forward(items, users) + + return scores[[users, predict_items]] + + def full_sort_predict(self, interaction): + users = interaction[self.USER_ID] + + items = self.get_rating_matrix(users) + predict = self.forward(items, users) + return predict.view(-1) diff --git a/recbole/model/general_recommender/convncf.py b/recbole/model/general_recommender/convncf.py index 2ee9714e1..e90808f1d 100644 --- a/recbole/model/general_recommender/convncf.py +++ b/recbole/model/general_recommender/convncf.py @@ -16,13 +16,12 @@ import torch import torch.nn as nn -from recbole.utils import InputType from recbole.model.abstract_recommender import GeneralRecommender from recbole.model.layers import MLPLayers, CNNLayers +from recbole.utils import InputType class ConvNCFBPRLoss(nn.Module): - """ ConvNCFBPRLoss, based on Bayesian Personalized Ranking, Shape: @@ -38,9 +37,10 @@ class ConvNCFBPRLoss(nn.Module): >>> output = loss(pos_score, neg_score) >>> output.backward() """ + def __init__(self): super(ConvNCFBPRLoss, self).__init__() - + def forward(self, pos_score, neg_score): distance = pos_score - neg_score loss = torch.sum(torch.log((1 + torch.exp(-distance)))) @@ -94,7 +94,7 @@ def forward(self, user, item): def reg_loss(self): r"""Calculate the L2 normalization loss of model parameters. - Including embedding matrixes and weight matrixes of model. + Including embedding matrices and weight matrices of model. Returns: loss(torch.FloatTensor): The L2 Loss tensor. 
shape of [1,] @@ -118,7 +118,7 @@ def calculate_loss(self, interaction): pos_item_score = self.forward(user, pos_item) neg_item_score = self.forward(user, neg_item) - + loss = self.loss(pos_item_score, neg_item_score) opt_loss = loss + self.reg_loss() diff --git a/recbole/model/general_recommender/dgcf.py b/recbole/model/general_recommender/dgcf.py index 56d688e8d..3560dd2b9 100644 --- a/recbole/model/general_recommender/dgcf.py +++ b/recbole/model/general_recommender/dgcf.py @@ -18,17 +18,18 @@ https://github.com/xiangwang1223/disentangled_graph_collaborative_filtering """ -import numpy as np import random as rd + +import numpy as np import torch import torch.nn as nn import torch.nn.functional as F from torch.autograd import Variable -from recbole.utils import InputType from recbole.model.abstract_recommender import GeneralRecommender -from recbole.model.loss import BPRLoss, EmbLoss from recbole.model.init import xavier_normal_initialization +from recbole.model.loss import BPRLoss, EmbLoss +from recbole.utils import InputType def sample_cor_samples(n_users, n_items, cor_batch_size): @@ -44,7 +45,7 @@ def sample_cor_samples(n_users, n_items, cor_batch_size): Note: We have to sample some embedded representations out of all nodes. - Becasue we have no way to store cor-distance for each pair. + Because we have no way to store cor-distance for each pair. 
""" cor_users = rd.sample(list(range(n_users)), cor_batch_size) cor_items = rd.sample(list(range(n_items)), cor_batch_size) @@ -73,7 +74,7 @@ def __init__(self, config, dataset): self.n_layers = config['n_layers'] self.reg_weight = config['reg_weight'] self.cor_weight = config['cor_weight'] - n_batch = dataset.dataset.inter_num // self.batch_size + 1 + n_batch = dataset.dataset.inter_num // config['train_batch_size'] + 1 self.cor_batch_size = int(max(self.n_users / n_batch, self.n_items / n_batch)) # ensure embedding can be divided into <n_factors> intent assert self.embedding_size % self.n_factors == 0 @@ -117,9 +118,9 @@ def _build_sparse_tensor(self, indices, values, size): def _get_ego_embeddings(self): # concat of user embeddings and item embeddings - user_embd = self.user_embedding.weight - item_embd = self.item_embedding.weight - ego_embeddings = torch.cat([user_embd, item_embd], dim=0) + user_emb = self.user_embedding.weight + item_emb = self.item_embedding.weight + ego_embeddings = torch.cat([user_emb, item_emb], dim=0) return ego_embeddings def build_matrix(self, A_values): @@ -148,7 +149,7 @@ def build_matrix(self, A_values): try: assert not torch.isnan(d_values).any() except AssertionError: - print("d_values", torch.min(d_values), torch.max(d_values)) + self.logger.info("d_values", torch.min(d_values), torch.max(d_values)) d_values = 1.0 / torch.sqrt(d_values) head_term = torch.sparse.mm(self.head2edge_mat, d_values) @@ -169,7 +170,8 @@ def forward(self): layer_embeddings = [] # split the input embedding table - # .... ego_layer_embeddings is a (n_factors)-leng list of embeddings [n_users+n_items, embed_size/n_factors] + # .... ego_layer_embeddings is a (n_factors)-length list of embeddings + # [n_users+n_items, embed_size/n_factors] ego_layer_embeddings = torch.chunk(ego_embeddings, self.n_factors, 1) for t in range(0, self.n_iterations): iter_embeddings = [] @@ -194,19 +196,20 @@ def forward(self): # get the factor-wise embeddings # .... 
head_factor_embeddings is a dense tensor with the size of [all_h_list, embed_size/n_factors] # .... analogous to tail_factor_embeddings - head_factor_embedings = torch.index_select(factor_embeddings, dim=0, index=self.all_h_list) - tail_factor_embedings = torch.index_select(ego_layer_embeddings[i], dim=0, index=self.all_t_list) + head_factor_embeddings = torch.index_select(factor_embeddings, dim=0, index=self.all_h_list) + tail_factor_embeddings = torch.index_select(ego_layer_embeddings[i], dim=0, index=self.all_t_list) # .... constrain the vector length # .... make the following attentive weights within the range of (0,1) # to adapt to torch version - head_factor_embedings = F.normalize(head_factor_embedings, p=2, dim=1) - tail_factor_embedings = F.normalize(tail_factor_embedings, p=2, dim=1) + head_factor_embeddings = F.normalize(head_factor_embeddings, p=2, dim=1) + tail_factor_embeddings = F.normalize(tail_factor_embeddings, p=2, dim=1) # get the attentive weights # .... A_factor_values is a dense tensor with the size of [num_edge, 1] - A_factor_values = torch.sum(head_factor_embedings * torch.tanh(tail_factor_embedings), - dim=1, keepdim=True) + A_factor_values = torch.sum( + head_factor_embeddings * torch.tanh(tail_factor_embeddings), dim=1, keepdim=True + ) # update the attentive weights A_iter_values.append(A_factor_values) @@ -243,18 +246,18 @@ def calculate_loss(self, interaction): user_all_embeddings, item_all_embeddings = self.forward() u_embeddings = user_all_embeddings[user] - posi_embeddings = item_all_embeddings[pos_item] - negi_embeddings = item_all_embeddings[neg_item] + pos_embeddings = item_all_embeddings[pos_item] + neg_embeddings = item_all_embeddings[neg_item] - pos_scores = torch.mul(u_embeddings, posi_embeddings).sum(dim=1) - neg_scores = torch.mul(u_embeddings, negi_embeddings).sum(dim=1) + pos_scores = torch.mul(u_embeddings, pos_embeddings).sum(dim=1) + neg_scores = torch.mul(u_embeddings, neg_embeddings).sum(dim=1) mf_loss = 
self.mf_loss(pos_scores, neg_scores)
- # cul regularizer
+ # calculate regularization loss
 u_ego_embeddings = self.user_embedding(user)
- posi_ego_embeddings = self.item_embedding(pos_item)
- negi_ego_embeddings = self.item_embedding(neg_item)
- reg_loss = self.reg_loss(u_ego_embeddings, posi_ego_embeddings, negi_ego_embeddings)
+ pos_ego_embeddings = self.item_embedding(pos_item)
+ neg_ego_embeddings = self.item_embedding(neg_item)
+ reg_loss = self.reg_loss(u_ego_embeddings, pos_ego_embeddings, neg_ego_embeddings)
 if self.n_factors > 1 and self.cor_weight > 1e-9:
 cor_users, cor_items = sample_cor_samples(self.n_users, self.n_items, self.cor_batch_size)
diff --git a/recbole/model/general_recommender/dmf.py b/recbole/model/general_recommender/dmf.py
index ce04bbdf0..7c0fbe57d 100644
--- a/recbole/model/general_recommender/dmf.py
+++ b/recbole/model/general_recommender/dmf.py
@@ -20,9 +20,9 @@
 import torch.nn as nn
 from torch.nn.init import normal_
-from recbole.utils import InputType
 from recbole.model.abstract_recommender import GeneralRecommender
 from recbole.model.layers import MLPLayers
+from recbole.utils import InputType
 class DMF(GeneralRecommender):
@@ -102,7 +102,8 @@ def forward(self, user, item):
 # Following lines construct tensor of shape [B,n_users] using the tensor of shape [B,H]
 col_indices = self.history_user_id[item].flatten()
- row_indices = torch.arange(item.shape[0]).to(self.device).repeat_interleave(self.history_user_id.shape[1], dim=0)
+ row_indices = torch.arange(item.shape[0]).to(self.device).
\ + repeat_interleave(self.history_user_id.shape[1], dim=0) matrix_01 = torch.zeros(1).to(self.device).repeat(item.shape[0], self.n_users) matrix_01.index_put_((row_indices, col_indices), self.history_user_value[item].flatten()) item = self.item_linear(matrix_01) @@ -149,8 +150,8 @@ def get_user_embedding(self, user): """ # Following lines construct tensor of shape [B,n_items] using the tensor of shape [B,H] col_indices = self.history_item_id[user].flatten() - row_indices = torch.arange(user.shape[0]).to(self.device).repeat_interleave(self.history_item_id.shape[1], - dim=0) + row_indices = torch.arange(user.shape[0]).to(self.device) + row_indices = row_indices.repeat_interleave(self.history_item_id.shape[1], dim=0) matrix_01 = torch.zeros(1).to(self.device).repeat(user.shape[0], self.n_items) matrix_01.index_put_((row_indices, col_indices), self.history_item_value[user].flatten()) user = self.user_linear(matrix_01) @@ -170,7 +171,8 @@ def get_item_embedding(self): col = interaction_matrix.col i = torch.LongTensor([row, col]) data = torch.FloatTensor(interaction_matrix.data) - item_matrix = torch.sparse.FloatTensor(i, data, torch.Size(interaction_matrix.shape)).to(self.device).transpose(0, 1) + item_matrix = torch.sparse.FloatTensor(i, data, torch.Size(interaction_matrix.shape)).to(self.device).\ + transpose(0, 1) item = torch.sparse.mm(item_matrix, self.item_linear.weight.t()) item = self.item_fc_layers(item) diff --git a/recbole/model/general_recommender/fism.py b/recbole/model/general_recommender/fism.py index 56744a32a..fdfecc216 100644 --- a/recbole/model/general_recommender/fism.py +++ b/recbole/model/general_recommender/fism.py @@ -3,7 +3,6 @@ # @Author : Kaiyuan Li # @email : tsotfsk@outlook.com - """ FISM ####################################### @@ -14,13 +13,12 @@ https://github.com/AaronHeee/Neural-Attentive-Item-Similarity-Model """ -from logging import getLogger - import torch import torch.nn as nn +from torch.nn.init import normal_ + from 
recbole.model.abstract_recommender import GeneralRecommender
 from recbole.utils import InputType
-from torch.nn.init import normal_
 class FISM(GeneralRecommender):
@@ -37,9 +35,7 @@ def __init__(self, config, dataset):
 # load dataset info
 self.LABEL = config['LABEL_FIELD']
- self.logger = getLogger()
-
- # get all users's history interaction information.the history item
+ # get all users' history interaction information. The history item
 # matrix is padding by the maximum number of a user's interactions
 self.history_item_matrix, self.history_lens, self.mask_mat = self.get_history_info(dataset)
@@ -57,7 +53,6 @@ def __init__(self, config, dataset):
 'you need to increase it \n\t\t\tuntil the error disappears. For example, ' + \
 'you can append it in the command line such as `--split_to=5`')
-
 # define layers and loss
 # construct source and destination item embedding matrix
 self.item_src_embedding = nn.Embedding(self.n_items, self.embedding_size, padding_idx=0)
@@ -92,7 +87,7 @@ def reg_loss(self):
 Returns:
 torch.Tensor: reg loss
- """
+ """
 reg_1, reg_2 = self.reg_weights
 loss_1 = reg_1 * self.item_src_embedding.weight.norm(2)
 loss_2 = reg_2 * self.item_dst_embedding.weight.norm(2)
@@ -132,7 +127,7 @@ def user_forward(self, user_input, item_num, user_bias, repeats=None, pred_slc=N
 Args:
 user_input (torch.Tensor): user input tensor
- item_num (torch.Tensor): user hitory interaction lens
+ item_num (torch.Tensor): user history interaction lens
 repeats (int, optional): the number of items to be evaluated
 pred_slc (torch.Tensor, optional): continuous index which controls the current evaluation items,
 if pred_slc is None, it will evaluate all items
@@ -180,7 +175,9 @@ def full_sort_predict(self, interaction):
 else:
 output = []
 for mask in self.group:
- tmp_output = self.user_forward(user_input[:item_num], item_num, user_bias, repeats=len(mask), pred_slc=mask)
+ tmp_output = self.user_forward(
+ user_input[:item_num], item_num, user_bias, repeats=len(mask), pred_slc=mask
+
) output.append(tmp_output) output = torch.cat(output, dim=0) scores.append(output) diff --git a/recbole/model/general_recommender/gcmc.py b/recbole/model/general_recommender/gcmc.py index b6619b39d..e3715493d 100644 --- a/recbole/model/general_recommender/gcmc.py +++ b/recbole/model/general_recommender/gcmc.py @@ -19,15 +19,16 @@ https://github.com/riannevdberg/gc-mc """ - import math + +import numpy as np +import scipy.sparse as sp import torch import torch.nn as nn -import scipy.sparse as sp -import numpy as np -from recbole.utils import InputType from recbole.model.abstract_recommender import GeneralRecommender +from recbole.model.layers import SparseDropout +from recbole.utils import InputType class GCMC(GeneralRecommender): @@ -54,8 +55,7 @@ def __init__(self, config, dataset): # load dataset info self.num_all = self.n_users + self.n_items - self.interaction_matrix = dataset.inter_matrix( - form='coo').astype(np.float32) # csr + self.interaction_matrix = dataset.inter_matrix(form='coo').astype(np.float32) # csr # load parameters info self.dropout_prob = config['dropout_prob'] @@ -70,19 +70,20 @@ def __init__(self, config, dataset): features = self.get_sparse_eye_mat(self.num_all) i = features._indices() v = features._values() - self.user_features = torch.sparse.FloatTensor(i[:, :self.n_users], v[:self.n_users], - torch.Size([self.n_users, self.num_all])).to(self.device) + self.user_features = torch.sparse.FloatTensor( + i[:, :self.n_users], v[:self.n_users], torch.Size([self.n_users, self.num_all]) + ).to(self.device) item_i = i[:, self.n_users:] item_i[0, :] = item_i[0, :] - self.n_users - self.item_features = torch.sparse.FloatTensor(item_i, v[self.n_users:], - torch.Size([self.n_items, self.num_all])).to(self.device) + self.item_features = torch.sparse.FloatTensor( + item_i, v[self.n_users:], torch.Size([self.n_items, self.num_all]) + ).to(self.device) else: features = torch.eye(self.num_all).to(self.device) - self.user_features, self.item_features 
= torch.split( - features, [self.n_users, self.n_items]) + self.user_features, self.item_features = torch.split(features, [self.n_users, self.n_items]) self.input_dim = self.user_features.shape[1] - # adj matrixs for each relation are stored in self.support + # adj matrices for each relation are stored in self.support self.Graph = self.get_norm_adj_mat().to(self.device) self.support = [self.Graph] @@ -91,26 +92,32 @@ def __init__(self, config, dataset): if self.accum == 'stack': div = self.gcn_output_dim // len(self.support) if self.gcn_output_dim % len(self.support) != 0: - print("""\nWARNING: HIDDEN[0] (=%d) of stack layer is adjusted to %d (in %d splits).\n""" - % (self.gcn_output_dim, len(self.support) * div, len(self.support))) + self.logger.warning( + "HIDDEN[0] (=%d) of stack layer is adjusted to %d (in %d splits)." % + (self.gcn_output_dim, len(self.support) * div, len(self.support)) + ) self.gcn_output_dim = len(self.support) * div # define layers and loss - self.GcEncoder = GcEncoder(accum=self.accum, - num_user=self.n_users, - num_item=self.n_items, - support=self.support, - input_dim=self.input_dim, - gcn_output_dim=self.gcn_output_dim, - dense_output_dim=self.dense_output_dim, - drop_prob=self.dropout_prob, - device=self.device, - sparse_feature=self.sparse_feature).to(self.device) - self.BiDecoder = BiDecoder(input_dim=self.dense_output_dim, - output_dim=self.n_class, - drop_prob=0., - device=self.device, - num_weights=self.num_basis_functions).to(self.device) + self.GcEncoder = GcEncoder( + accum=self.accum, + num_user=self.n_users, + num_item=self.n_items, + support=self.support, + input_dim=self.input_dim, + gcn_output_dim=self.gcn_output_dim, + dense_output_dim=self.dense_output_dim, + drop_prob=self.dropout_prob, + device=self.device, + sparse_feature=self.sparse_feature + ).to(self.device) + self.BiDecoder = BiDecoder( + input_dim=self.dense_output_dim, + output_dim=self.n_class, + drop_prob=0., + device=self.device, + 
num_weights=self.num_basis_functions + ).to(self.device) self.loss_function = nn.CrossEntropyLoss() def get_sparse_eye_mat(self, num): @@ -141,18 +148,15 @@ def get_norm_adj_mat(self): Sparse tensor of the normalized interaction matrix. """ # build adj matrix - A = sp.dok_matrix((self.n_users + self.n_items, - self.n_users + self.n_items), dtype=np.float32) + A = sp.dok_matrix((self.n_users + self.n_items, self.n_users + self.n_items), dtype=np.float32) inter_M = self.interaction_matrix inter_M_t = self.interaction_matrix.transpose() - data_dict = dict(zip(zip(inter_M.row, inter_M.col+self.n_users), - [1]*inter_M.nnz)) - data_dict.update(dict(zip(zip(inter_M_t.row+self.n_users, inter_M_t.col), - [1]*inter_M_t.nnz))) + data_dict = dict(zip(zip(inter_M.row, inter_M.col + self.n_users), [1] * inter_M.nnz)) + data_dict.update(dict(zip(zip(inter_M_t.row + self.n_users, inter_M_t.col), [1] * inter_M_t.nnz))) A._update(data_dict) # norm adj matrix sumArr = (A > 0).sum(axis=1) - # add epsilon to avoid Devide by zero Warning + # add epsilon to avoid divide by zero Warning diag = np.array(sumArr.flatten())[0] + 1e-7 diag = np.power(diag, -0.5) D = sp.diags(diag) @@ -169,8 +173,7 @@ def get_norm_adj_mat(self): def forward(self, user_X, item_X, user, item): # Graph autoencoders are comprised of a graph encoder model and a pairwise decoder model. 
user_embedding, item_embedding = self.GcEncoder(user_X, item_X) - predict_score = self.BiDecoder( - user_embedding, item_embedding, user, item) + predict_score = self.BiDecoder(user_embedding, item_embedding, user, item) return predict_score def calculate_loss(self, interaction): @@ -210,15 +213,29 @@ def full_sort_predict(self, interaction): class GcEncoder(nn.Module): - """Graph Convolutional Encoder + r"""Graph Convolutional Encoder GcEncoder take as input an :math:`N \times D` feature matrix :math:`X` and a graph adjacency matrix :math:`A`, and produce an :math:`N \times E` node embedding matrix; Note that :math:`N` denotes the number of nodes, :math:`D` the number of input features, and :math:`E` the embedding size. """ - def __init__(self, accum, num_user, num_item, support, input_dim, gcn_output_dim, dense_output_dim, drop_prob, device, - sparse_feature=True, act_dense=lambda x: x, share_user_item_weights=True, bias=False): + def __init__( + self, + accum, + num_user, + num_item, + support, + input_dim, + gcn_output_dim, + dense_output_dim, + drop_prob, + device, + sparse_feature=True, + act_dense=lambda x: x, + share_user_item_weights=True, + bias=False + ): super(GcEncoder, self).__init__() self.num_users = num_user self.num_items = num_item @@ -246,58 +263,58 @@ def __init__(self, accum, num_user, num_item, support, input_dim, gcn_output_dim # gcn layer if self.accum == 'sum': - self.weights_u = nn.ParameterList( - [nn.Parameter(torch.FloatTensor(self.input_dim, self.gcn_output_dim).to(self.device), - requires_grad=True) - for _ in range(self.num_support)]) + self.weights_u = nn.ParameterList([ + nn.Parameter( + torch.FloatTensor(self.input_dim, self.gcn_output_dim).to(self.device), requires_grad=True + ) for _ in range(self.num_support) + ]) if share_user_item_weights: self.weights_v = self.weights_u else: - self.weights_v = nn.ParameterList( - [nn.Parameter(torch.FloatTensor(self.input_dim, self.gcn_output_dim).to(self.device), - requires_grad=True) for 
_ in range(self.num_support)]) + self.weights_v = nn.ParameterList([ + nn.Parameter( + torch.FloatTensor(self.input_dim, self.gcn_output_dim).to(self.device), requires_grad=True + ) for _ in range(self.num_support) + ]) else: assert self.gcn_output_dim % self.num_support == 0, 'output_dim must be multiple of num_support for stackGC' self.sub_hidden_dim = self.gcn_output_dim // self.num_support - self.weights_u = nn.ParameterList( - [nn.Parameter(torch.FloatTensor(self.input_dim, self.sub_hidden_dim).to(self.device), - requires_grad=True) - for _ in range(self.num_support)]) + self.weights_u = nn.ParameterList([ + nn.Parameter( + torch.FloatTensor(self.input_dim, self.sub_hidden_dim).to(self.device), requires_grad=True + ) for _ in range(self.num_support) + ]) if share_user_item_weights: self.weights_v = self.weights_u else: - self.weights_v = nn.ParameterList( - [nn.Parameter(torch.FloatTensor(self.input_dim, self.sub_hidden_dim).to(self.device), - requires_grad=True) for _ in range(self.num_support)]) + self.weights_v = nn.ParameterList([ + nn.Parameter( + torch.FloatTensor(self.input_dim, self.sub_hidden_dim).to(self.device), requires_grad=True + ) for _ in range(self.num_support) + ]) # dense layer - self.dense_layer_u = nn.Linear( - self.gcn_output_dim, self.dense_output_dim, bias=self.bias) + self.dense_layer_u = nn.Linear(self.gcn_output_dim, self.dense_output_dim, bias=self.bias) if share_user_item_weights: self.dense_layer_v = self.dense_layer_u else: - self.dense_layer_v = nn.Linear( - self.gcn_output_dim, self.dense_output_dim, bias=self.bias) + self.dense_layer_v = nn.Linear(self.gcn_output_dim, self.dense_output_dim, bias=self.bias) self._init_weights() def _init_weights(self): - init_range = math.sqrt((self.num_support + 1) / - (self.input_dim + self.gcn_output_dim)) + init_range = math.sqrt((self.num_support + 1) / (self.input_dim + self.gcn_output_dim)) for w in range(self.num_support): self.weights_u[w].data.uniform_(-init_range, init_range) if not 
self.share_weights: for w in range(self.num_support): self.weights_v[w].data.uniform_(-init_range, init_range) - dense_init_range = math.sqrt( - (self.num_support + 1) / (self.dense_output_dim + self.gcn_output_dim)) - self.dense_layer_u.weight.data.uniform_( - -dense_init_range, dense_init_range) + dense_init_range = math.sqrt((self.num_support + 1) / (self.dense_output_dim + self.gcn_output_dim)) + self.dense_layer_u.weight.data.uniform_(-dense_init_range, dense_init_range) if not self.share_weights: - self.dense_layer_v.weight.data.uniform_( - -dense_init_range, dense_init_range) + self.dense_layer_v.weight.data.uniform_(-dense_init_range, dense_init_range) if self.bias: self.dense_layer_u.bias.data.fill_(0) @@ -353,8 +370,7 @@ def forward(self, user_X, item_X): embeddings = torch.cat(embeddings, dim=1) - users, items = torch.split( - embeddings, [self.num_users, self.num_items]) + users, items = torch.split(embeddings, [self.num_users, self.num_items]) u_hidden = self.activate(users) v_hidden = self.activate(items) @@ -374,12 +390,11 @@ def forward(self, user_X, item_X): class BiDecoder(nn.Module): - """Bilinear decoder + """Bi-linear decoder BiDecoder takes pairs of node embeddings and predicts respective entries in the adjacency matrix. 
""" - def __init__(self, input_dim, output_dim, drop_prob, device, - num_weights=3, act=lambda x: x): + def __init__(self, input_dim, output_dim, drop_prob, device, num_weights=3, act=lambda x: x): super(BiDecoder, self).__init__() self.input_dim = input_dim self.output_dim = output_dim @@ -390,18 +405,15 @@ def __init__(self, input_dim, output_dim, drop_prob, device, self.dropout_prob = drop_prob self.dropout = nn.Dropout(p=self.dropout_prob) - self.weights = nn.ParameterList( - [nn.Parameter(orthogonal([self.input_dim, self.input_dim]).to(self.device)) - for _ in range(self.num_weights)]) - self.dense_layer = nn.Linear( - self.num_weights, self.output_dim, bias=False) + self.weights = nn.ParameterList([ + nn.Parameter(orthogonal([self.input_dim, self.input_dim]).to(self.device)) for _ in range(self.num_weights) + ]) + self.dense_layer = nn.Linear(self.num_weights, self.output_dim, bias=False) self._init_weights() def _init_weights(self): - dense_init_range = math.sqrt( - self.output_dim / (self.num_weights + self.output_dim)) - self.dense_layer.weight.data.uniform_( - -dense_init_range, dense_init_range) + dense_init_range = math.sqrt(self.output_dim / (self.num_weights + self.output_dim)) + self.dense_layer.weight.data.uniform_(-dense_init_range, dense_init_range) def forward(self, u_inputs, i_inputs, users, items=None): u_inputs = self.dropout(u_inputs) @@ -434,25 +446,6 @@ def forward(self, u_inputs, i_inputs, users, items=None): return output -class SparseDropout(nn.Module): - """ - This is a Module that execute Dropout on Pytorch sparse tensor. 
- """ - - def __init__(self, p=0.5): - super(SparseDropout, self).__init__() - # p is ratio of dropout - # convert to keep probability - self.kprob = 1 - p - - def forward(self, x): - mask = ((torch.rand(x._values().size()) + - self.kprob).floor()).type(torch.bool) - rc = x._indices()[:, mask] - val = x._values()[mask] * (1.0 / self.kprob) - return torch.sparse.FloatTensor(rc, val, x.shape) - - def orthogonal(shape, scale=1.1): """ Initialization function for weights in class GCMC. diff --git a/recbole/model/general_recommender/itemknn.py b/recbole/model/general_recommender/itemknn.py index c98c075dd..37ace63c1 100644 --- a/recbole/model/general_recommender/itemknn.py +++ b/recbole/model/general_recommender/itemknn.py @@ -15,8 +15,8 @@ import scipy.sparse as sp import torch -from recbole.utils import InputType, ModelType from recbole.model.abstract_recommender import GeneralRecommender +from recbole.utils import InputType, ModelType class ComputeSimilarity: @@ -123,14 +123,11 @@ def compute_similarity(self, block_size=100): # End while on columns - W_sparse = sp.csr_matrix((values, (rows, cols)), - shape=(self.n_columns, self.n_columns), - dtype=np.float32) + W_sparse = sp.csr_matrix((values, (rows, cols)), shape=(self.n_columns, self.n_columns), dtype=np.float32) return W_sparse.tocsc() class ItemKNN(GeneralRecommender): - r"""ItemKNN is a basic model that compute item similarity with the interaction matrix. 
""" diff --git a/recbole/model/general_recommender/lightgcn.py b/recbole/model/general_recommender/lightgcn.py index 4cf3604d9..b57c9b1d2 100644 --- a/recbole/model/general_recommender/lightgcn.py +++ b/recbole/model/general_recommender/lightgcn.py @@ -23,10 +23,10 @@ import scipy.sparse as sp import torch -from recbole.utils import InputType from recbole.model.abstract_recommender import GeneralRecommender -from recbole.model.loss import BPRLoss, EmbLoss from recbole.model.init import xavier_uniform_initialization +from recbole.model.loss import BPRLoss, EmbLoss +from recbole.utils import InputType class LightGCN(GeneralRecommender): @@ -45,19 +45,16 @@ def __init__(self, config, dataset): super(LightGCN, self).__init__(config, dataset) # load dataset info - self.interaction_matrix = dataset.inter_matrix( - form='coo').astype(np.float32) + self.interaction_matrix = dataset.inter_matrix(form='coo').astype(np.float32) # load parameters info self.latent_dim = config['embedding_size'] # int type:the embedding size of lightGCN self.n_layers = config['n_layers'] # int type:the layer num of lightGCN - self.reg_weight = config['reg_weight'] # float32 type: the weight decay for l2 normalizaton + self.reg_weight = config['reg_weight'] # float32 type: the weight decay for l2 normalization # define layers and loss - self.user_embedding = torch.nn.Embedding( - num_embeddings=self.n_users, embedding_dim=self.latent_dim) - self.item_embedding = torch.nn.Embedding( - num_embeddings=self.n_items, embedding_dim=self.latent_dim) + self.user_embedding = torch.nn.Embedding(num_embeddings=self.n_users, embedding_dim=self.latent_dim) + self.item_embedding = torch.nn.Embedding(num_embeddings=self.n_items, embedding_dim=self.latent_dim) self.mf_loss = BPRLoss() self.reg_loss = EmbLoss() @@ -84,18 +81,15 @@ def get_norm_adj_mat(self): Sparse tensor of the normalized interaction matrix. 
""" # build adj matrix - A = sp.dok_matrix((self.n_users + self.n_items, - self.n_users + self.n_items), dtype=np.float32) + A = sp.dok_matrix((self.n_users + self.n_items, self.n_users + self.n_items), dtype=np.float32) inter_M = self.interaction_matrix inter_M_t = self.interaction_matrix.transpose() - data_dict = dict(zip(zip(inter_M.row, inter_M.col+self.n_users), - [1]*inter_M.nnz)) - data_dict.update(dict(zip(zip(inter_M_t.row+self.n_users, inter_M_t.col), - [1]*inter_M_t.nnz))) + data_dict = dict(zip(zip(inter_M.row, inter_M.col + self.n_users), [1] * inter_M.nnz)) + data_dict.update(dict(zip(zip(inter_M_t.row + self.n_users, inter_M_t.col), [1] * inter_M_t.nnz))) A._update(data_dict) # norm adj matrix sumArr = (A > 0).sum(axis=1) - # add epsilon to avoid Devide by zero Warning + # add epsilon to avoid divide by zero Warning diag = np.array(sumArr.flatten())[0] + 1e-7 diag = np.power(diag, -0.5) D = sp.diags(diag) @@ -125,14 +119,12 @@ def forward(self): embeddings_list = [all_embeddings] for layer_idx in range(self.n_layers): - all_embeddings = torch.sparse.mm( - self.norm_adj_matrix, all_embeddings) + all_embeddings = torch.sparse.mm(self.norm_adj_matrix, all_embeddings) embeddings_list.append(all_embeddings) lightgcn_all_embeddings = torch.stack(embeddings_list, dim=1) lightgcn_all_embeddings = torch.mean(lightgcn_all_embeddings, dim=1) - user_all_embeddings, item_all_embeddings = torch.split( - lightgcn_all_embeddings, [self.n_users, self.n_items]) + user_all_embeddings, item_all_embeddings = torch.split(lightgcn_all_embeddings, [self.n_users, self.n_items]) return user_all_embeddings, item_all_embeddings def calculate_loss(self, interaction): @@ -146,21 +138,20 @@ def calculate_loss(self, interaction): user_all_embeddings, item_all_embeddings = self.forward() u_embeddings = user_all_embeddings[user] - posi_embeddings = item_all_embeddings[pos_item] - negi_embeddings = item_all_embeddings[neg_item] + pos_embeddings = item_all_embeddings[pos_item] + 
neg_embeddings = item_all_embeddings[neg_item] # calculate BPR Loss - pos_scores = torch.mul(u_embeddings, posi_embeddings).sum(dim=1) - neg_scores = torch.mul(u_embeddings, negi_embeddings).sum(dim=1) + pos_scores = torch.mul(u_embeddings, pos_embeddings).sum(dim=1) + neg_scores = torch.mul(u_embeddings, neg_embeddings).sum(dim=1) mf_loss = self.mf_loss(pos_scores, neg_scores) # calculate BPR Loss u_ego_embeddings = self.user_embedding(user) - posi_ego_embeddings = self.item_embedding(pos_item) - negi_ego_embeddings = self.item_embedding(neg_item) + pos_ego_embeddings = self.item_embedding(pos_item) + neg_ego_embeddings = self.item_embedding(neg_item) - reg_loss = self.reg_loss( - u_ego_embeddings, posi_ego_embeddings, negi_ego_embeddings) + reg_loss = self.reg_loss(u_ego_embeddings, pos_ego_embeddings, neg_ego_embeddings) loss = mf_loss + self.reg_weight * reg_loss return loss @@ -184,7 +175,6 @@ def full_sort_predict(self, interaction): u_embeddings = self.restore_user_e[user] # dot with all item embedding to accelerate - scores = torch.matmul( - u_embeddings, self.restore_item_e.transpose(0, 1)) + scores = torch.matmul(u_embeddings, self.restore_item_e.transpose(0, 1)) return scores.view(-1) diff --git a/recbole/model/general_recommender/line.py b/recbole/model/general_recommender/line.py new file mode 100644 index 000000000..7c183c2f4 --- /dev/null +++ b/recbole/model/general_recommender/line.py @@ -0,0 +1,183 @@ +# -*- coding: utf-8 -*- +# @Time : 2020/12/8 +# @Author : Yihong Guo +# @Email : gyihong@hotmail.com + +r""" +LINE +################################################ +Reference: + Jian Tang et al. "LINE: Large-scale Information Network Embedding." in WWW 2015. 
+
+Reference code:
+ https://github.com/shenweichen/GraphEmbedding
+"""
+
+import random
+
+import numpy as np
+import torch
+import torch.nn as nn
+
+from recbole.model.abstract_recommender import GeneralRecommender
+from recbole.model.init import xavier_normal_initialization
+from recbole.utils import InputType
+
+
+class NegSamplingLoss(nn.Module):
+
+ def __init__(self):
+ super(NegSamplingLoss, self).__init__()
+
+ def forward(self, score, sign):
+ return -torch.mean(torch.sigmoid(sign * score))
+
+
+class LINE(GeneralRecommender):
+ r"""LINE is a graph embedding model.
+
+ We implement the model to train user and item embeddings for recommendation.
+ """
+ input_type = InputType.PAIRWISE
+
+ def __init__(self, config, dataset):
+ super(LINE, self).__init__(config, dataset)
+
+ self.embedding_size = config['embedding_size']
+ self.order = config['order']
+ self.second_order_loss_weight = config['second_order_loss_weight']
+ self.training_neg_sample_num = config['training_neg_sample_num']
+
+ self.interaction_feat = dataset.dataset.inter_feat
+
+ self.user_embedding = nn.Embedding(self.n_users, self.embedding_size)
+ self.item_embedding = nn.Embedding(self.n_items, self.embedding_size)
+
+ if self.order == 2:
+ self.user_context_embedding = nn.Embedding(self.n_users, self.embedding_size)
+ self.item_context_embedding = nn.Embedding(self.n_items, self.embedding_size)
+
+ self.loss_fct = NegSamplingLoss()
+
+ self.used_ids = self.get_used_ids()
+ self.random_list = self.get_user_id_list()
+ np.random.shuffle(self.random_list)
+ self.random_pr = 0
+ self.random_list_length = len(self.random_list)
+
+ self.apply(xavier_normal_initialization)
+
+ def get_used_ids(self):
+ # map each item id to the set of users who interacted with it
+ cur = np.array([set() for _ in range(self.n_items)])
+ for uid, iid in zip(self.interaction_feat[self.USER_ID].numpy(), self.interaction_feat[self.ITEM_ID].numpy()):
+ cur[iid].add(uid)
+ return cur
+
+ def sampler(self, key_ids):
+
+ key_ids = np.array(key_ids.cpu())
+ key_num =
len(key_ids) + total_num = key_num + value_ids = np.zeros(total_num, dtype=np.int64) + check_list = np.arange(total_num) + key_ids = np.tile(key_ids, 1) + while len(check_list) > 0: + value_ids[check_list] = self.random_num(len(check_list)) + check_list = np.array([ + i for i, used, v in zip(check_list, self.used_ids[key_ids[check_list]], value_ids[check_list]) + if v in used + ]) + + return torch.tensor(value_ids, device=self.device) + + def random_num(self, num): + value_id = [] + self.random_pr %= self.random_list_length + while True: + if self.random_pr + num <= self.random_list_length: + value_id.append(self.random_list[self.random_pr:self.random_pr + num]) + self.random_pr += num + break + else: + value_id.append(self.random_list[self.random_pr:]) + num -= self.random_list_length - self.random_pr + self.random_pr = 0 + np.random.shuffle(self.random_list) + return np.concatenate(value_id) + + def get_user_id_list(self): + return np.arange(1, self.n_users) + + def forward(self, h, t): + + h_embedding = self.user_embedding(h) + t_embedding = self.item_embedding(t) + + return torch.sum(h_embedding.mul(t_embedding), dim=1) + + def context_forward(self, h, t, field): + + if field == "uu": + h_embedding = self.user_embedding(h) + t_embedding = self.item_context_embedding(t) + else: + h_embedding = self.item_embedding(h) + t_embedding = self.user_context_embedding(t) + + return torch.sum(h_embedding.mul(t_embedding), dim=1) + + def calculate_loss(self, interaction): + + user = interaction[self.USER_ID] + pos_item = interaction[self.ITEM_ID] + neg_item = interaction[self.NEG_ITEM_ID] + + score_pos = self.forward(user, pos_item) + + ones = torch.ones(len(score_pos), device=self.device) + + if self.order == 1: + if random.random() < 0.5: + score_neg = self.forward(user, neg_item) + else: + neg_user = self.sampler(pos_item) + score_neg = self.forward(neg_user, pos_item) + return self.loss_fct(ones, score_pos) + self.loss_fct(-1 * ones, score_neg) + + else: + # randomly 
train i-i relation and u-u relation with u-i relation + if random.random() < 0.5: + score_neg = self.forward(user, neg_item) + score_pos_con = self.context_forward(user, pos_item, 'uu') + score_neg_con = self.context_forward(user, neg_item, 'uu') + else: + # sample negative user for item + neg_user = self.sampler(pos_item) + score_neg = self.forward(neg_user, pos_item) + score_pos_con = self.context_forward(pos_item, user, 'ii') + score_neg_con = self.context_forward(pos_item, neg_user, 'ii') + + return self.loss_fct(ones, score_pos) \ + + self.loss_fct(-1 * ones, score_neg) \ + + self.loss_fct(ones, score_pos_con) * self.second_order_loss_weight \ + + self.loss_fct(-1 * ones, score_neg_con) * self.second_order_loss_weight + + def predict(self, interaction): + + user = interaction[self.USER_ID] + item = interaction[self.ITEM_ID] + + scores = self.forward(user, item) + + return scores + + def full_sort_predict(self, interaction): + user = interaction[self.USER_ID] + + # get user embedding from storage variable + u_embeddings = self.user_embedding(user) + i_embedding = self.item_embedding.weight + # dot with all item embedding to accelerate + scores = torch.matmul(u_embeddings, i_embedding.transpose(0, 1)) + + return scores.view(-1) diff --git a/recbole/model/general_recommender/macridvae.py b/recbole/model/general_recommender/macridvae.py new file mode 100644 index 000000000..50ae69eb5 --- /dev/null +++ b/recbole/model/general_recommender/macridvae.py @@ -0,0 +1,204 @@ +# -*- coding: utf-8 -*- +# @Time : 2020/12/23 +# @Author : Yihong Guo +# @Email : gyihong@hotmail.com + +r""" +MacridVAE +################################################ +Reference: + Jianxin Ma et al. "Learning Disentangled Representations for Recommendation." in NeurIPS 2019. 
+
+Reference code:
+ https://jianxinma.github.io/disentangle-recsys.html
+"""
+
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+from torch.distributions.one_hot_categorical import OneHotCategorical
+
+from recbole.model.abstract_recommender import GeneralRecommender
+from recbole.model.init import xavier_normal_initialization
+from recbole.model.loss import EmbLoss
+from recbole.utils import InputType
+
+
+class MacridVAE(GeneralRecommender):
+ r"""MacridVAE is an item-based collaborative filtering model that learns disentangled representations from user
+ behavior and simultaneously ranks all items for each user.
+
+ We implement the model following the original author's implementation.
+ """
+ input_type = InputType.PAIRWISE
+
+ def __init__(self, config, dataset):
+ super(MacridVAE, self).__init__(config, dataset)
+
+ self.layers = config['encoder_hidden_size']
+ self.embedding_size = config['embedding_size']
+ self.drop_out = config['drop_out']
+ self.kfac = config['kfac']
+ self.tau = config['tau']
+ self.nogb = config['nogb']
+ self.anneal_cap = config['anneal_cap']
+ self.total_anneal_steps = config['total_anneal_steps']
+ self.regs = config['reg_weights']
+ self.std = config['std']
+
+ self.update = 0
+
+ self.history_item_id, self.history_item_value, _ = dataset.history_item_matrix()
+ self.history_item_id = self.history_item_id.to(self.device)
+ self.history_item_value = self.history_item_value.to(self.device)
+ self.encode_layer_dims = [self.n_items] + self.layers + [self.embedding_size * 2]
+
+ self.encoder = self.mlp_layers(self.encode_layer_dims)
+
+ self.item_embedding = nn.Embedding(self.n_items, self.embedding_size)
+ self.k_embedding = nn.Embedding(self.kfac, self.embedding_size)
+
+ self.l2_loss = EmbLoss()
+ # parameters initialization
+ self.apply(xavier_normal_initialization)
+
+ def get_rating_matrix(self, user):
+ r"""Get a batch of user's feature with the user's id and history interaction matrix.
+ + Args: + user (torch.LongTensor): The input tensor that contains user's id, shape: [batch_size, ] + + Returns: + torch.FloatTensor: The user's feature of a batch of user, shape: [batch_size, n_items] + """ + # Following lines construct tensor of shape [B,n_items] using the tensor of shape [B,H] + col_indices = self.history_item_id[user].flatten() + row_indices = torch.arange(user.shape[0]).to(self.device) \ + .repeat_interleave(self.history_item_id.shape[1], dim=0) + rating_matrix = torch.zeros(1).to(self.device).repeat(user.shape[0], self.n_items) + rating_matrix.index_put_((row_indices, col_indices), self.history_item_value[user].flatten()) + return rating_matrix + + def mlp_layers(self, layer_dims): + mlp_modules = [] + for i, (d_in, d_out) in enumerate(zip(layer_dims[:-1], layer_dims[1:])): + mlp_modules.append(nn.Linear(d_in, d_out)) + if i != len(layer_dims[:-1]) - 1: + mlp_modules.append(nn.Tanh()) + return nn.Sequential(*mlp_modules) + + def reparameterize(self, mu, logvar): + if self.training: + std = torch.exp(0.5 * logvar) + epsilon = torch.zeros_like(std).normal_(mean=0, std=self.std) + return mu + epsilon * std + else: + return mu + + def forward(self, rating_matrix): + + cores = F.normalize(self.k_embedding.weight, dim=1) + items = F.normalize(self.item_embedding.weight, dim=1) + + rating_matrix = F.normalize(rating_matrix) + rating_matrix = F.dropout(rating_matrix, self.drop_out, training=self.training) + + cates_logits = torch.matmul(items, cores.transpose(0, 1)) / self.tau + + if self.nogb: + cates = torch.softmax(cates_logits, dim=1) + else: + cates_dist = OneHotCategorical(logits=cates_logits) + cates_sample = cates_dist.sample() + cates_mode = torch.softmax(cates_logits, dim=1) + cates = (self.training * cates_sample + (1 - self.training) * cates_mode) + + probs = None + mulist = [] + logvarlist = [] + for k in range(self.kfac): + cates_k = cates[:, k].reshape(1, -1) + # encoder + x_k = rating_matrix * cates_k + h = self.encoder(x_k) + mu = 
h[:, :self.embedding_size] + mu = F.normalize(mu, dim=1) + logvar = h[:, self.embedding_size:] + + mulist.append(mu) + logvarlist.append(logvar) + + z = self.reparameterize(mu, logvar) + + # decoder + z_k = F.normalize(z, dim=1) + logits_k = torch.matmul(z_k, items.transpose(0, 1)) / self.tau + probs_k = torch.exp(logits_k) + probs_k = probs_k * cates_k + probs = (probs_k if (probs is None) else (probs + probs_k)) + + logits = torch.log(probs) + + return logits, mulist, logvarlist + + def calculate_loss(self, interaction): + + user = interaction[self.USER_ID] + + rating_matrix = self.get_rating_matrix(user) + + self.update += 1 + if self.total_anneal_steps > 0: + anneal = min(self.anneal_cap, 1. * self.update / self.total_anneal_steps) + else: + anneal = self.anneal_cap + + z, mu, logvar = self.forward(rating_matrix) + kl_loss = None + for i in range(self.kfac): + kl_ = -0.5 * torch.mean(torch.sum(1 + logvar[i] - logvar[i].exp(), dim=1)) + kl_loss = (kl_ if (kl_loss is None) else (kl_loss + kl_)) + + # CE loss + ce_loss = -(F.log_softmax(z, 1) * rating_matrix).sum(1).mean() + + if self.regs[0] != 0 or self.regs[1] != 0: + return ce_loss + kl_loss * anneal + self.reg_loss() + + return ce_loss + kl_loss * anneal + + def reg_loss(self): + r"""Calculate the L2 normalization loss of model parameters. + Including embedding matrices and weight matrices of model. + + Returns: + loss(torch.FloatTensor): The L2 Loss tensor. 
shape of [1,]
+        """
+        reg_1, reg_2 = self.regs[:2]
+        loss_1 = reg_1 * self.item_embedding.weight.norm(2)
+        loss_2 = reg_1 * self.k_embedding.weight.norm(2)
+        loss_3 = 0
+        for name, parm in self.encoder.named_parameters():
+            if name.endswith('weight'):
+                loss_3 = loss_3 + reg_2 * parm.norm(2)
+        return loss_1 + loss_2 + loss_3
+
+    def predict(self, interaction):
+
+        user = interaction[self.USER_ID]
+        item = interaction[self.ITEM_ID]
+
+        rating_matrix = self.get_rating_matrix(user)
+
+        scores, _, _ = self.forward(rating_matrix)
+
+        return scores[[torch.arange(len(item)).to(self.device), item]]
+
+    def full_sort_predict(self, interaction):
+        user = interaction[self.USER_ID]
+
+        rating_matrix = self.get_rating_matrix(user)
+
+        scores, _, _ = self.forward(rating_matrix)
+
+        return scores.view(-1)
diff --git a/recbole/model/general_recommender/multidae.py b/recbole/model/general_recommender/multidae.py
new file mode 100644
index 000000000..3b3658b64
--- /dev/null
+++ b/recbole/model/general_recommender/multidae.py
@@ -0,0 +1,117 @@
+# -*- coding: utf-8 -*-
+# @Time   : 2020/12/14
+# @Author : Yihong Guo
+# @Email  : gyihong@hotmail.com
+
+r"""
+MultiDAE
+################################################
+Reference:
+    Dawen Liang et al. "Variational Autoencoders for Collaborative Filtering." in WWW 2018.
+
+"""
+
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+
+from recbole.model.abstract_recommender import GeneralRecommender
+from recbole.model.init import xavier_normal_initialization
+from recbole.model.layers import MLPLayers
+from recbole.utils import InputType
+
+
+class MultiDAE(GeneralRecommender):
+    r"""MultiDAE is an item-based collaborative filtering model that simultaneously ranks all items for each user.
+
+    We implement the MultiDAE model with only a user dataloader. 
+ """ + input_type = InputType.PAIRWISE + + def __init__(self, config, dataset): + super(MultiDAE, self).__init__(config, dataset) + + self.layers = config["mlp_hidden_size"] + self.lat_dim = config['latent_dimension'] + self.drop_out = config['dropout_prob'] + + self.history_item_id, self.history_item_value, _ = dataset.history_item_matrix() + self.history_item_id = self.history_item_id.to(self.device) + self.history_item_value = self.history_item_value.to(self.device) + + self.encode_layer_dims = [self.n_items] + self.layers + [self.lat_dim] + self.decode_layer_dims = [self.lat_dim] + self.encode_layer_dims[::-1][1:] + + self.encoder = MLPLayers(self.encode_layer_dims, activation='tanh') + self.decoder = self.mlp_layers(self.decode_layer_dims) + + # parameters initialization + self.apply(xavier_normal_initialization) + + def get_rating_matrix(self, user): + r"""Get a batch of user's feature with the user's id and history interaction matrix. + + Args: + user (torch.LongTensor): The input tensor that contains user's id, shape: [batch_size, ] + + Returns: + torch.FloatTensor: The user's feature of a batch of user, shape: [batch_size, n_items] + """ + # Following lines construct tensor of shape [B,n_items] using the tensor of shape [B,H] + col_indices = self.history_item_id[user].flatten() + row_indices = torch.arange(user.shape[0]).to(self.device) \ + .repeat_interleave(self.history_item_id.shape[1], dim=0) + rating_matrix = torch.zeros(1).to(self.device).repeat(user.shape[0], self.n_items) + rating_matrix.index_put_((row_indices, col_indices), self.history_item_value[user].flatten()) + return rating_matrix + + def mlp_layers(self, layer_dims): + mlp_modules = [] + for i, (d_in, d_out) in enumerate(zip(layer_dims[:-1], layer_dims[1:])): + mlp_modules.append(nn.Linear(d_in, d_out)) + if i != len(layer_dims[:-1]) - 1: + mlp_modules.append(nn.Tanh()) + return nn.Sequential(*mlp_modules) + + def forward(self, rating_matrix): + + h = F.normalize(rating_matrix) + + h = 
F.dropout(h, self.drop_out, training=self.training)
+
+        h = self.encoder(h)
+        return self.decoder(h)
+
+    def calculate_loss(self, interaction):
+
+        user = interaction[self.USER_ID]
+
+        rating_matrix = self.get_rating_matrix(user)
+
+        z = self.forward(rating_matrix)
+
+        # CE loss
+        ce_loss = -(F.log_softmax(z, 1) * rating_matrix).sum(1).mean()
+
+        return ce_loss
+
+    def predict(self, interaction):
+
+        user = interaction[self.USER_ID]
+        item = interaction[self.ITEM_ID]
+
+        rating_matrix = self.get_rating_matrix(user)
+
+        scores = self.forward(rating_matrix)
+
+        return scores[[torch.arange(len(item)).to(self.device), item]]
+
+    def full_sort_predict(self, interaction):
+
+        user = interaction[self.USER_ID]
+
+        rating_matrix = self.get_rating_matrix(user)
+
+        scores = self.forward(rating_matrix)
+
+        return scores.view(-1)
diff --git a/recbole/model/general_recommender/multivae.py b/recbole/model/general_recommender/multivae.py
new file mode 100644
index 000000000..83495fa12
--- /dev/null
+++ b/recbole/model/general_recommender/multivae.py
@@ -0,0 +1,141 @@
+# -*- coding: utf-8 -*-
+# @Time   : 2020/12/14
+# @Author : Yihong Guo
+# @Email  : gyihong@hotmail.com
+
+r"""
+MultiVAE
+################################################
+Reference:
+    Dawen Liang et al. "Variational Autoencoders for Collaborative Filtering." in WWW 2018.
+
+"""
+
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+
+from recbole.model.abstract_recommender import GeneralRecommender
+from recbole.model.init import xavier_normal_initialization
+from recbole.utils import InputType
+
+
+class MultiVAE(GeneralRecommender):
+    r"""MultiVAE is an item-based collaborative filtering model that simultaneously ranks all items for each user.
+
+    We implement the MultiVAE model with only a user dataloader. 
+ """ + input_type = InputType.PAIRWISE + + def __init__(self, config, dataset): + super(MultiVAE, self).__init__(config, dataset) + + self.layers = config["mlp_hidden_size"] + self.lat_dim = config['latent_dimension'] + self.drop_out = config['dropout_prob'] + self.anneal_cap = config['anneal_cap'] + self.total_anneal_steps = config["total_anneal_steps"] + + self.history_item_id, self.history_item_value, _ = dataset.history_item_matrix() + self.history_item_id = self.history_item_id.to(self.device) + self.history_item_value = self.history_item_value.to(self.device) + + self.update = 0 + + self.encode_layer_dims = [self.n_items] + self.layers + [self.lat_dim] + self.decode_layer_dims = [int(self.lat_dim / 2)] + self.encode_layer_dims[::-1][1:] + + self.encoder = self.mlp_layers(self.encode_layer_dims) + self.decoder = self.mlp_layers(self.decode_layer_dims) + + # parameters initialization + self.apply(xavier_normal_initialization) + + def get_rating_matrix(self, user): + r"""Get a batch of user's feature with the user's id and history interaction matrix. 
+ + Args: + user (torch.LongTensor): The input tensor that contains user's id, shape: [batch_size, ] + + Returns: + torch.FloatTensor: The user's feature of a batch of user, shape: [batch_size, n_items] + """ + # Following lines construct tensor of shape [B,n_items] using the tensor of shape [B,H] + col_indices = self.history_item_id[user].flatten() + row_indices = torch.arange(user.shape[0]).to(self.device) \ + .repeat_interleave(self.history_item_id.shape[1], dim=0) + rating_matrix = torch.zeros(1).to(self.device).repeat(user.shape[0], self.n_items) + rating_matrix.index_put_((row_indices, col_indices), self.history_item_value[user].flatten()) + return rating_matrix + + def mlp_layers(self, layer_dims): + mlp_modules = [] + for i, (d_in, d_out) in enumerate(zip(layer_dims[:-1], layer_dims[1:])): + mlp_modules.append(nn.Linear(d_in, d_out)) + if i != len(layer_dims[:-1]) - 1: + mlp_modules.append(nn.Tanh()) + return nn.Sequential(*mlp_modules) + + def reparameterize(self, mu, logvar): + if self.training: + std = torch.exp(0.5 * logvar) + epsilon = torch.zeros_like(std).normal_(mean=0, std=0.01) + return mu + epsilon * std + else: + return mu + + def forward(self, rating_matrix): + + h = F.normalize(rating_matrix) + + h = F.dropout(h, self.drop_out, training=self.training) + + h = self.encoder(h) + + mu = h[:, :int(self.lat_dim / 2)] + logvar = h[:, int(self.lat_dim / 2):] + + z = self.reparameterize(mu, logvar) + z = self.decoder(z) + return z, mu, logvar + + def calculate_loss(self, interaction): + + user = interaction[self.USER_ID] + rating_matrix = self.get_rating_matrix(user) + + self.update += 1 + if self.total_anneal_steps > 0: + anneal = min(self.anneal_cap, 1. 
* self.update / self.total_anneal_steps) + else: + anneal = self.anneal_cap + + z, mu, logvar = self.forward(rating_matrix) + + # KL loss + kl_loss = -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1)) * anneal + + # CE loss + ce_loss = -(F.log_softmax(z, 1) * rating_matrix).sum(1).mean() + + return ce_loss + kl_loss + + def predict(self, interaction): + + user = interaction[self.USER_ID] + item = interaction[self.ITEM_ID] + + rating_matrix = self.get_rating_matrix(user) + + scores, _, _ = self.forward(rating_matrix) + + return scores[[user, item]] + + def full_sort_predict(self, interaction): + user = interaction[self.USER_ID] + + rating_matrix = self.get_rating_matrix(user) + + scores, _, _ = self.forward(rating_matrix) + + return scores.view(-1) diff --git a/recbole/model/general_recommender/nais.py b/recbole/model/general_recommender/nais.py index 9c408a3b9..c59cf594b 100644 --- a/recbole/model/general_recommender/nais.py +++ b/recbole/model/general_recommender/nais.py @@ -18,14 +18,13 @@ https://github.com/AaronHeee/Neural-Attentive-Item-Similarity-Model """ -from logging import getLogger - import torch import torch.nn as nn +from torch.nn.init import constant_, normal_, xavier_normal_ + from recbole.model.abstract_recommender import GeneralRecommender from recbole.model.layers import MLPLayers from recbole.utils import InputType -from torch.nn.init import constant_, normal_, xavier_normal_ class NAIS(GeneralRecommender): @@ -45,9 +44,8 @@ def __init__(self, config, dataset): # load dataset info self.LABEL = config['LABEL_FIELD'] - self.logger = getLogger() - # get all users's history interaction information.the history item + # get all users' history interaction information.the history item # matrix is padding by the maximum number of a user's interactions self.history_item_matrix, self.history_lens, self.mask_mat = self.get_history_info(dataset) @@ -76,7 +74,7 @@ def __init__(self, config, dataset): self.item_dst_embedding = 
nn.Embedding(self.n_items, self.embedding_size, padding_idx=0) self.bias = nn.Parameter(torch.zeros(self.n_items)) if self.algorithm == 'concat': - self.mlp_layers = MLPLayers([self.embedding_size*2, self.weight_size]) + self.mlp_layers = MLPLayers([self.embedding_size * 2, self.weight_size]) elif self.algorithm == 'prod': self.mlp_layers = MLPLayers([self.embedding_size, self.weight_size]) else: @@ -89,7 +87,7 @@ def __init__(self, config, dataset): self.logger.info('use pretrain from [{}]...'.format(self.pretrain_path)) self._load_pretrain() else: - self.logger.info('unuse pretrain...') + self.logger.info('unused pretrain...') self.apply(self._init_weights) def _init_weights(self, module): @@ -143,7 +141,7 @@ def reg_loss(self): Returns: torch.Tensor: reg loss - """ + """ reg_1, reg_2, reg_3 = self.reg_weights loss_1 = reg_1 * self.item_src_embedding.weight.norm(2) loss_2 = reg_2 * self.item_dst_embedding.weight.norm(2) @@ -167,7 +165,8 @@ def attention_mlp(self, inter, target): if self.algorithm == 'prod': mlp_input = inter * target.unsqueeze(1) # batch_size x max_len x embedding_size else: - mlp_input = torch.cat([inter, target.unsqueeze(1).expand_as(inter)], dim=2) # batch_size x max_len x embedding_size*2 + mlp_input = torch.cat([inter, target.unsqueeze(1).expand_as(inter)], + dim=2) # batch_size x max_len x embedding_size*2 mlp_output = self.mlp_layers(mlp_input) # batch_size x max_len x weight_size logits = torch.matmul(mlp_output, self.weight_layer).squeeze(2) # batch_size x max_len @@ -177,9 +176,9 @@ def mask_softmax(self, similarity, logits, bias, item_num, batch_mask_mat): """softmax the unmasked user history items and get the final output Args: - similarity (torch.Tensor): the similarity between the histoy items and target items + similarity (torch.Tensor): the similarity between the history items and target items logits (torch.Tensor): the initial weights of the history items - item_num (torch.Tensor): user hitory interaction lengths + item_num 
(torch.Tensor): user history interaction lengths bias (torch.Tensor): bias batch_mask_mat (torch.Tensor): the mask of user history interactions @@ -189,7 +188,7 @@ def mask_softmax(self, similarity, logits, bias, item_num, batch_mask_mat): """ exp_logits = torch.exp(logits) # batch_size x max_len - exp_logits = batch_mask_mat * exp_logits # batch_size x max_len + exp_logits = batch_mask_mat * exp_logits # batch_size x max_len exp_sum = torch.sum(exp_logits, dim=1, keepdim=True) exp_sum = torch.pow(exp_sum, self.beta) weights = torch.div(exp_logits, exp_sum) @@ -203,9 +202,9 @@ def softmax(self, similarity, logits, item_num, bias): """softmax the user history features and get the final output Args: - similarity (torch.Tensor): the similarity between the histoy items and target items + similarity (torch.Tensor): the similarity between the history items and target items logits (torch.Tensor): the initial weights of the history items - item_num (torch.Tensor): user hitory interaction lengths + item_num (torch.Tensor): user history interaction lengths bias (torch.Tensor): bias Returns: @@ -241,7 +240,7 @@ def user_forward(self, user_input, item_num, repeats=None, pred_slc=None): Args: user_input (torch.Tensor): user input tensor - item_num (torch.Tensor): user hitory interaction lens + item_num (torch.Tensor): user history interaction lens repeats (int, optional): the number of items to be evaluated pred_slc (torch.Tensor, optional): continuous index which controls the current evaluation items, if pred_slc is None, it will evaluate all items diff --git a/recbole/model/general_recommender/neumf.py b/recbole/model/general_recommender/neumf.py index bd95a4bab..33d11d46f 100644 --- a/recbole/model/general_recommender/neumf.py +++ b/recbole/model/general_recommender/neumf.py @@ -19,9 +19,9 @@ import torch.nn as nn from torch.nn.init import normal_ -from recbole.utils import InputType from recbole.model.abstract_recommender import GeneralRecommender from recbole.model.layers 
import MLPLayers +from recbole.utils import InputType class NeuMF(GeneralRecommender): @@ -57,10 +57,10 @@ def __init__(self, config, dataset): self.item_mf_embedding = nn.Embedding(self.n_items, self.mf_embedding_size) self.user_mlp_embedding = nn.Embedding(self.n_users, self.mlp_embedding_size) self.item_mlp_embedding = nn.Embedding(self.n_items, self.mlp_embedding_size) - self.mlp_layers = MLPLayers([2 * self.mlp_embedding_size] + self.mlp_hidden_size) + self.mlp_layers = MLPLayers([2 * self.mlp_embedding_size] + self.mlp_hidden_size, self.dropout_prob) self.mlp_layers.logger = None # remove logger to use torch.save() if self.mf_train and self.mlp_train: - self.predict_layer = nn.Linear(self.mf_embedding_size + self.mlp_hidden_size[-1], 1, self.dropout_prob) + self.predict_layer = nn.Linear(self.mf_embedding_size + self.mlp_hidden_size[-1], 1) elif self.mf_train: self.predict_layer = nn.Linear(self.mf_embedding_size, 1) elif self.mlp_train: @@ -90,8 +90,7 @@ def load_pretrain(self): m1.weight.data.copy_(m2.weight) m1.bias.data.copy_(m2.bias) - predict_weight = torch.cat([mf.predict_layer.weight, - mlp.predict_layer.weight], dim=1) + predict_weight = torch.cat([mf.predict_layer.weight, mlp.predict_layer.weight], dim=1) predict_bias = mf.predict_layer.bias + mlp.predict_layer.bias self.predict_layer.weight.data.copy_(0.5 * predict_weight) @@ -107,9 +106,9 @@ def forward(self, user, item): user_mlp_e = self.user_mlp_embedding(user) item_mlp_e = self.item_mlp_embedding(item) if self.mf_train: - mf_output = torch.mul(user_mf_e, item_mf_e) # [batch_size, embedding_size] + mf_output = torch.mul(user_mf_e, item_mf_e) # [batch_size, embedding_size] if self.mlp_train: - mlp_output = self.mlp_layers(torch.cat((user_mlp_e, item_mlp_e), -1)) # [batch_size, layers[-1]] + mlp_output = self.mlp_layers(torch.cat((user_mlp_e, item_mlp_e), -1)) # [batch_size, layers[-1]] if self.mf_train and self.mlp_train: output = self.sigmoid(self.predict_layer(torch.cat((mf_output, 
mlp_output), -1))) elif self.mf_train: diff --git a/recbole/model/general_recommender/ngcf.py b/recbole/model/general_recommender/ngcf.py index 6416d0e25..8352aec53 100644 --- a/recbole/model/general_recommender/ngcf.py +++ b/recbole/model/general_recommender/ngcf.py @@ -25,42 +25,11 @@ import torch.nn as nn import torch.nn.functional as F -from recbole.utils import InputType from recbole.model.abstract_recommender import GeneralRecommender -from recbole.model.loss import BPRLoss, EmbLoss -from recbole.model.layers import BiGNNLayer from recbole.model.init import xavier_normal_initialization - - -def sparse_dropout(x, rate, noise_shape): - r"""This is a function that execute Dropout on Pytorch sparse tensor. - - A random dropout will be applied to the input sparse tensor. - - Note: - input tensor SHOULD be a sparse float tensor. - we suggest to use '._nnz()' as the shape of sparse tensor for an easy calling. - - Args: - x (torch.sparse.FloatTensor): The input sparse tensor. - rate (float): Dropout rate which should in [0,1]. - noise_shape(tuple): Shape of the input sparse tensor. suggest '._nnz()' - - Returns: - torch.sparse.FloatTensor: The result sparse tensor after dropout. - - """ - random_tensor = 1 - rate - random_tensor += torch.rand(noise_shape).to(x.device) - dropout_mask = torch.floor(random_tensor).type(torch.bool) - i = x._indices() - v = x._values() - - i = i[:, dropout_mask] - v = v[dropout_mask] - - out = torch.sparse.FloatTensor(i, v, x.shape).to(x.device) - return out * (1. 
/ (1 - rate)) +from recbole.model.layers import BiGNNLayer, SparseDropout +from recbole.model.loss import BPRLoss, EmbLoss +from recbole.utils import InputType class NGCF(GeneralRecommender): @@ -84,6 +53,7 @@ def __init__(self, config, dataset): self.reg_weight = config['reg_weight'] # define layers and loss + self.sparse_dropout = SparseDropout(self.node_dropout) self.user_embedding = nn.Embedding(self.n_users, self.embedding_size) self.item_embedding = nn.Embedding(self.n_items, self.embedding_size) self.GNNlayers = torch.nn.ModuleList() @@ -119,14 +89,12 @@ def get_norm_adj_mat(self): A = sp.dok_matrix((self.n_users + self.n_items, self.n_users + self.n_items), dtype=np.float32) inter_M = self.interaction_matrix inter_M_t = self.interaction_matrix.transpose() - data_dict = dict(zip(zip(inter_M.row, inter_M.col + self.n_users), - [1] * inter_M.nnz)) - data_dict.update(dict(zip(zip(inter_M_t.row + self.n_users, inter_M_t.col), - [1] * inter_M_t.nnz))) + data_dict = dict(zip(zip(inter_M.row, inter_M.col + self.n_users), [1] * inter_M.nnz)) + data_dict.update(dict(zip(zip(inter_M_t.row + self.n_users, inter_M_t.col), [1] * inter_M_t.nnz))) A._update(data_dict) # norm adj matrix sumArr = (A > 0).sum(axis=1) - diag = np.array(sumArr.flatten())[0] + 1e-7 # add epsilon to avoid Devide by zero Warning + diag = np.array(sumArr.flatten())[0] + 1e-7 # add epsilon to avoid divide by zero Warning diag = np.power(diag, -0.5) D = sp.diags(diag) L = D * A * D @@ -162,9 +130,8 @@ def get_ego_embeddings(self): return ego_embeddings def forward(self): - # A_hat: spare tensor with shape of [n_items+n_users,n_items+n_users] - A_hat = sparse_dropout(self.norm_adj_matrix, self.node_dropout, - self.norm_adj_matrix._nnz()) if self.node_dropout != 0 else self.norm_adj_matrix + + A_hat = self.sparse_dropout(self.norm_adj_matrix) if self.node_dropout != 0 else self.norm_adj_matrix all_embeddings = self.get_ego_embeddings() embeddings_list = [all_embeddings] for gnn in self.GNNlayers: @@ 
-190,14 +157,14 @@ def calculate_loss(self, interaction): user_all_embeddings, item_all_embeddings = self.forward() u_embeddings = user_all_embeddings[user] - posi_embeddings = item_all_embeddings[pos_item] - negi_embeddings = item_all_embeddings[neg_item] + pos_embeddings = item_all_embeddings[pos_item] + neg_embeddings = item_all_embeddings[neg_item] - pos_scores = torch.mul(u_embeddings, posi_embeddings).sum(dim=1) - neg_scores = torch.mul(u_embeddings, negi_embeddings).sum(dim=1) + pos_scores = torch.mul(u_embeddings, pos_embeddings).sum(dim=1) + neg_scores = torch.mul(u_embeddings, neg_embeddings).sum(dim=1) mf_loss = self.mf_loss(pos_scores, neg_scores) # calculate BPR Loss - reg_loss = self.reg_loss(u_embeddings, posi_embeddings, negi_embeddings) # L2 regularization of embeddings + reg_loss = self.reg_loss(u_embeddings, pos_embeddings, neg_embeddings) # L2 regularization of embeddings return mf_loss + self.reg_weight * reg_loss diff --git a/recbole/model/general_recommender/pop.py b/recbole/model/general_recommender/pop.py index 879283d5a..6a0cc261e 100644 --- a/recbole/model/general_recommender/pop.py +++ b/recbole/model/general_recommender/pop.py @@ -6,6 +6,7 @@ # @Time : 2020/11/9 # @Author : Zihan Lin # @Email : zhlin@ruc.edu.cn + r""" Pop ################################################ @@ -14,8 +15,8 @@ import torch -from recbole.utils import InputType, ModelType from recbole.model.abstract_recommender import GeneralRecommender +from recbole.utils import InputType, ModelType class Pop(GeneralRecommender): @@ -36,7 +37,6 @@ def forward(self): pass def calculate_loss(self, interaction): - item = interaction[self.ITEM_ID] self.item_cnt[item, :] = self.item_cnt[item, :] + 1 @@ -45,7 +45,6 @@ def calculate_loss(self, interaction): return torch.nn.Parameter(torch.zeros(1)) def predict(self, interaction): - item = interaction[self.ITEM_ID] result = torch.true_divide(self.item_cnt[item, :], self.max_cnt) return result.squeeze() diff --git 
a/recbole/model/general_recommender/spectralcf.py b/recbole/model/general_recommender/spectralcf.py index f15715097..83e17e076 100644 --- a/recbole/model/general_recommender/spectralcf.py +++ b/recbole/model/general_recommender/spectralcf.py @@ -18,10 +18,10 @@ import scipy.sparse as sp import torch -from recbole.utils import InputType from recbole.model.abstract_recommender import GeneralRecommender -from recbole.model.loss import BPRLoss, EmbLoss from recbole.model.init import xavier_uniform_initialization +from recbole.model.loss import BPRLoss, EmbLoss +from recbole.utils import InputType class SpectralCF(GeneralRecommender): @@ -58,22 +58,21 @@ def __init__(self, config, dataset): # generate intermediate data # "A_hat = I + L" is equivalent to "A_hat = U U^T + U \Lambda U^T" - self.interaction_matrix = dataset.inter_matrix( - form='coo').astype(np.float32) + self.interaction_matrix = dataset.inter_matrix(form='coo').astype(np.float32) I = self.get_eye_mat(self.n_items + self.n_users) L = self.get_laplacian_matrix() A_hat = I + L self.A_hat = A_hat.to(self.device) # define layers and loss - self.user_embedding = torch.nn.Embedding( - num_embeddings=self.n_users, embedding_dim=self.emb_dim) - self.item_embedding = torch.nn.Embedding( - num_embeddings=self.n_items, embedding_dim=self.emb_dim) - self.filters = torch.nn.ParameterList( - [torch.nn.Parameter(torch.normal(mean=0.01, std=0.02, size=(self.emb_dim, self.emb_dim)).to(self.device), - requires_grad=True) - for _ in range(self.n_layers)]) + self.user_embedding = torch.nn.Embedding(num_embeddings=self.n_users, embedding_dim=self.emb_dim) + self.item_embedding = torch.nn.Embedding(num_embeddings=self.n_items, embedding_dim=self.emb_dim) + self.filters = torch.nn.ParameterList([ + torch.nn.Parameter( + torch.normal(mean=0.01, std=0.02, size=(self.emb_dim, self.emb_dim)).to(self.device), + requires_grad=True + ) for _ in range(self.n_layers) + ]) self.sigmoid = torch.nn.Sigmoid() self.mf_loss = BPRLoss() @@ 
-94,14 +93,11 @@ def get_laplacian_matrix(self): Sparse tensor of the laplacian matrix. """ # build adj matrix - A = sp.dok_matrix((self.n_users + self.n_items, - self.n_users + self.n_items), dtype=np.float32) + A = sp.dok_matrix((self.n_users + self.n_items, self.n_users + self.n_items), dtype=np.float32) inter_M = self.interaction_matrix inter_M_t = self.interaction_matrix.transpose() - data_dict = dict(zip(zip(inter_M.row, inter_M.col+self.n_users), - [1]*inter_M.nnz)) - data_dict.update(dict(zip(zip(inter_M_t.row+self.n_users, inter_M_t.col), - [1]*inter_M_t.nnz))) + data_dict = dict(zip(zip(inter_M.row, inter_M.col + self.n_users), [1] * inter_M.nnz)) + data_dict.update(dict(zip(zip(inter_M_t.row + self.n_users, inter_M_t.col), [1] * inter_M_t.nnz))) A._update(data_dict) # norm adj matrix @@ -153,13 +149,11 @@ def forward(self): for k in range(self.n_layers): all_embeddings = torch.sparse.mm(self.A_hat, all_embeddings) - all_embeddings = self.sigmoid( - torch.mm(all_embeddings, self.filters[k])) + all_embeddings = self.sigmoid(torch.mm(all_embeddings, self.filters[k])) embeddings_list.append(all_embeddings) new_embeddings = torch.cat(embeddings_list, dim=1) - user_all_embeddings, item_all_embeddings = torch.split( - new_embeddings, [self.n_users, self.n_items]) + user_all_embeddings, item_all_embeddings = torch.split(new_embeddings, [self.n_users, self.n_items]) return user_all_embeddings, item_all_embeddings def calculate_loss(self, interaction): @@ -172,14 +166,13 @@ def calculate_loss(self, interaction): user_all_embeddings, item_all_embeddings = self.forward() u_embeddings = user_all_embeddings[user] - posi_embeddings = item_all_embeddings[pos_item] - negi_embeddings = item_all_embeddings[neg_item] - pos_scores = torch.mul(u_embeddings, posi_embeddings).sum(dim=1) - neg_scores = torch.mul(u_embeddings, negi_embeddings).sum(dim=1) + pos_embeddings = item_all_embeddings[pos_item] + neg_embeddings = item_all_embeddings[neg_item] + pos_scores = 
torch.mul(u_embeddings, pos_embeddings).sum(dim=1) + neg_scores = torch.mul(u_embeddings, neg_embeddings).sum(dim=1) mf_loss = self.mf_loss(pos_scores, neg_scores) - reg_loss = self.reg_loss( - u_embeddings, posi_embeddings, negi_embeddings) + reg_loss = self.reg_loss(u_embeddings, pos_embeddings, neg_embeddings) loss = mf_loss + self.reg_weight * reg_loss return loss @@ -201,6 +194,5 @@ def full_sort_predict(self, interaction): self.restore_user_e, self.restore_item_e = self.forward() u_embeddings = self.restore_user_e[user] - scores = torch.matmul( - u_embeddings, self.restore_item_e.transpose(0, 1)) + scores = torch.matmul(u_embeddings, self.restore_item_e.transpose(0, 1)) return scores.view(-1) diff --git a/recbole/model/knowledge_aware_recommender/cfkg.py b/recbole/model/knowledge_aware_recommender/cfkg.py index 489c79e69..dc21aa935 100644 --- a/recbole/model/knowledge_aware_recommender/cfkg.py +++ b/recbole/model/knowledge_aware_recommender/cfkg.py @@ -14,9 +14,9 @@ import torch.nn as nn import torch.nn.functional as F -from recbole.utils import InputType from recbole.model.abstract_recommender import KnowledgeRecommender from recbole.model.init import xavier_normal_initialization +from recbole.utils import InputType class CFKG(KnowledgeRecommender): @@ -83,7 +83,7 @@ def _get_kg_embedding(self, head, pos_tail, neg_tail, relation): def _get_score(self, h_e, t_e, r_e): if self.loss_function == 'transe': - return - torch.norm(h_e + r_e - t_e, p=2, dim=1) + return -torch.norm(h_e + r_e - t_e, p=2, dim=1) else: return torch.mul(h_e + r_e, t_e).sum(dim=1) @@ -124,4 +124,4 @@ def __init__(self): def forward(self, anchor, positive, negative): pos_score = torch.mul(anchor, positive).sum(dim=1) neg_score = torch.mul(anchor, negative).sum(dim=1) - return (F.softplus(- pos_score) + F.softplus(neg_score)).mean() + return (F.softplus(-pos_score) + F.softplus(neg_score)).mean() diff --git a/recbole/model/knowledge_aware_recommender/cke.py 
b/recbole/model/knowledge_aware_recommender/cke.py index 21738a653..180aea446 100644 --- a/recbole/model/knowledge_aware_recommender/cke.py +++ b/recbole/model/knowledge_aware_recommender/cke.py @@ -14,10 +14,10 @@ import torch.nn as nn import torch.nn.functional as F -from recbole.utils import InputType from recbole.model.abstract_recommender import KnowledgeRecommender -from recbole.model.loss import BPRLoss, EmbLoss from recbole.model.init import xavier_normal_initialization +from recbole.model.loss import BPRLoss, EmbLoss +from recbole.utils import InputType class CKE(KnowledgeRecommender): diff --git a/recbole/model/knowledge_aware_recommender/kgat.py b/recbole/model/knowledge_aware_recommender/kgat.py index 13dd1a01e..02f4571d6 100644 --- a/recbole/model/knowledge_aware_recommender/kgat.py +++ b/recbole/model/knowledge_aware_recommender/kgat.py @@ -13,17 +13,16 @@ https://github.com/xiangwang1223/knowledge_graph_attention_network """ -import copy +import numpy as np +import scipy.sparse as sp import torch import torch.nn as nn import torch.nn.functional as F -import numpy as np -import scipy.sparse as sp -from recbole.utils import InputType from recbole.model.abstract_recommender import KnowledgeRecommender -from recbole.model.loss import BPRLoss, EmbLoss from recbole.model.init import xavier_normal_initialization +from recbole.model.loss import BPRLoss, EmbLoss +from recbole.utils import InputType class Aggregator(nn.Module): @@ -99,7 +98,7 @@ def __init__(self, config, dataset): self.reg_weight = config['reg_weight'] # generate intermediate data - self.A_in = self.init_graph() # init the attention matrix by the structure of ckg + self.A_in = self.init_graph() # init the attention matrix by the structure of ckg # define layers and loss self.user_embedding = nn.Embedding(self.n_users, self.embedding_size) @@ -128,7 +127,7 @@ def init_graph(self): adj_list = [] for rel_type in range(1, self.n_relations, 1): edge_idxs = self.ckg.filter_edges(lambda 
edge: edge.data['relation_id'] == rel_type) - sub_graph = dgl.edge_subgraph(self.ckg, edge_idxs, preserve_nodes=True).\ + sub_graph = dgl.edge_subgraph(self.ckg, edge_idxs, preserve_nodes=True). \ adjacency_matrix(transpose=False, scipy_fmt='coo').astype('float') rowsum = np.array(sub_graph.sum(1)) d_inv = np.power(rowsum, -1).flatten() @@ -184,13 +183,13 @@ def calculate_loss(self, interaction): user_all_embeddings, entity_all_embeddings = self.forward() u_embeddings = user_all_embeddings[user] - posi_embeddings = entity_all_embeddings[pos_item] - negi_embeddings = entity_all_embeddings[neg_item] + pos_embeddings = entity_all_embeddings[pos_item] + neg_embeddings = entity_all_embeddings[neg_item] - pos_scores = torch.mul(u_embeddings, posi_embeddings).sum(dim=1) - neg_scores = torch.mul(u_embeddings, negi_embeddings).sum(dim=1) + pos_scores = torch.mul(u_embeddings, pos_embeddings).sum(dim=1) + neg_scores = torch.mul(u_embeddings, neg_embeddings).sum(dim=1) mf_loss = self.mf_loss(pos_scores, neg_scores) - reg_loss = self.reg_loss(u_embeddings, posi_embeddings, negi_embeddings) + reg_loss = self.reg_loss(u_embeddings, pos_embeddings, neg_embeddings) loss = mf_loss + self.reg_weight * reg_loss return loss @@ -268,7 +267,7 @@ def update_attentive_A(self): # Current PyTorch version does not support softmax on SparseCUDA, temporarily move to CPU to calculate softmax A_in = torch.sparse.FloatTensor(indices, kg_score, self.matrix_size).cpu() A_in = torch.sparse.softmax(A_in, dim=1).to(self.device) - self.A_in = copy.copy(A_in) + self.A_in = A_in def predict(self, interaction): user = interaction[self.USER_ID] diff --git a/recbole/model/knowledge_aware_recommender/kgcn.py b/recbole/model/knowledge_aware_recommender/kgcn.py index ad48a7b86..b8e76b025 100644 --- a/recbole/model/knowledge_aware_recommender/kgcn.py +++ b/recbole/model/knowledge_aware_recommender/kgcn.py @@ -14,14 +14,14 @@ https://github.com/hwwang55/KGCN """ +import numpy as np import torch import 
torch.nn as nn -import numpy as np -from recbole.utils import InputType from recbole.model.abstract_recommender import KnowledgeRecommender -from recbole.model.loss import BPRLoss, EmbLoss from recbole.model.init import xavier_normal_initialization +from recbole.model.loss import EmbLoss +from recbole.utils import InputType class KGCN(KnowledgeRecommender): @@ -46,24 +46,24 @@ def __init__(self, config, dataset): # define embedding self.user_embedding = nn.Embedding(self.n_users, self.embedding_size) - self.entity_embedding = nn.Embedding( - self.n_entities, self.embedding_size) - self.relation_embedding = nn.Embedding( - self.n_relations + 1, self.embedding_size) + self.entity_embedding = nn.Embedding(self.n_entities, self.embedding_size) + self.relation_embedding = nn.Embedding(self.n_relations + 1, self.embedding_size) # sample neighbors kg_graph = dataset.kg_graph(form='coo', value_field='relation_id') adj_entity, adj_relation = self.construct_adj(kg_graph) - self.adj_entity, self.adj_relation = adj_entity.to( - self.device), adj_relation.to(self.device) + self.adj_entity, self.adj_relation = adj_entity.to(self.device), adj_relation.to(self.device) # define function self.softmax = nn.Softmax(dim=-1) self.linear_layers = torch.nn.ModuleList() for i in range(self.n_iter): - self.linear_layers.append(nn.Linear( - self.embedding_size if not self.aggregator_class == 'concat' else self.embedding_size * 2, - self.embedding_size)) + self.linear_layers.append( + nn.Linear( + self.embedding_size if not self.aggregator_class == 'concat' else self.embedding_size * 2, + self.embedding_size + ) + ) self.ReLU = nn.ReLU() self.Tanh = nn.Tanh() @@ -86,7 +86,7 @@ def construct_adj(self, kg_graph): - adj_relation(torch.LongTensor): each line stores the corresponding sampled neighbor relations, shape: [n_entities, neighbor_sample_size] """ - # print('constructing knowledge graph ...') + # self.logger.info('constructing knowledge graph ...') # treat the KG as an undirected graph 
kg_dict = dict() for triple in zip(kg_graph.row, kg_graph.data, kg_graph.col): @@ -100,34 +100,30 @@ def construct_adj(self, kg_graph): kg_dict[tail] = [] kg_dict[tail].append((head, relation)) - # print('constructing adjacency matrix ...') + # self.logger.info('constructing adjacency matrix ...') # each line of adj_entity stores the sampled neighbor entities for a given entity # each line of adj_relation stores the corresponding sampled neighbor relations entity_num = kg_graph.shape[0] - adj_entity = np.zeros( - [entity_num, self.neighbor_sample_size], dtype=np.int64) - adj_relation = np.zeros( - [entity_num, self.neighbor_sample_size], dtype=np.int64) + adj_entity = np.zeros([entity_num, self.neighbor_sample_size], dtype=np.int64) + adj_relation = np.zeros([entity_num, self.neighbor_sample_size], dtype=np.int64) for entity in range(entity_num): if entity not in kg_dict.keys(): - adj_entity[entity] = np.array( - [entity] * self.neighbor_sample_size) - adj_relation[entity] = np.array( - [0] * self.neighbor_sample_size) + adj_entity[entity] = np.array([entity] * self.neighbor_sample_size) + adj_relation[entity] = np.array([0] * self.neighbor_sample_size) continue neighbors = kg_dict[entity] n_neighbors = len(neighbors) if n_neighbors >= self.neighbor_sample_size: - sampled_indices = np.random.choice(list(range(n_neighbors)), size=self.neighbor_sample_size, - replace=False) + sampled_indices = np.random.choice( + list(range(n_neighbors)), size=self.neighbor_sample_size, replace=False + ) else: - sampled_indices = np.random.choice(list(range(n_neighbors)), size=self.neighbor_sample_size, - replace=True) - adj_entity[entity] = np.array( - [neighbors[i][0] for i in sampled_indices]) - adj_relation[entity] = np.array( - [neighbors[i][1] for i in sampled_indices]) + sampled_indices = np.random.choice( + list(range(n_neighbors)), size=self.neighbor_sample_size, replace=True + ) + adj_entity[entity] = np.array([neighbors[i][0] for i in sampled_indices]) + 
adj_relation[entity] = np.array([neighbors[i][1] for i in sampled_indices]) return torch.from_numpy(adj_entity), torch.from_numpy(adj_relation) @@ -153,10 +149,8 @@ def get_neighbors(self, items): relations = [] for i in range(self.n_iter): index = torch.flatten(entities[i]) - neighbor_entities = torch.reshape(torch.index_select( - self.adj_entity, 0, index), (self.batch_size, -1)) - neighbor_relations = torch.reshape(torch.index_select( - self.adj_relation, 0, index), (self.batch_size, -1)) + neighbor_entities = torch.index_select(self.adj_entity, 0, index).reshape(self.batch_size, -1) + neighbor_relations = torch.index_select(self.adj_relation, 0, index).reshape(self.batch_size, -1) entities.append(neighbor_entities) relations.append(neighbor_relations) return entities, relations @@ -178,20 +172,22 @@ def mix_neighbor_vectors(self, neighbor_vectors, neighbor_relations, user_embedd """ avg = False if not avg: - user_embeddings = torch.reshape(user_embeddings, - (self.batch_size, 1, 1, self.embedding_size)) # [batch_size, 1, 1, dim] - user_relation_scores = torch.mean(user_embeddings * neighbor_relations, - dim=-1) # [batch_size, -1, n_neighbor] - user_relation_scores_normalized = self.softmax( - user_relation_scores) # [batch_size, -1, n_neighbor] - - user_relation_scores_normalized = torch.unsqueeze(user_relation_scores_normalized, - dim=-1) # [batch_size, -1, n_neighbor, 1] - neighbors_aggregated = torch.mean(user_relation_scores_normalized * neighbor_vectors, - dim=2) # [batch_size, -1, dim] - else: + user_embeddings = user_embeddings.reshape( + self.batch_size, 1, 1, self.embedding_size + ) # [batch_size, 1, 1, dim] + user_relation_scores = torch.mean( + user_embeddings * neighbor_relations, dim=-1 + ) # [batch_size, -1, n_neighbor] + user_relation_scores_normalized = self.softmax(user_relation_scores) # [batch_size, -1, n_neighbor] + + user_relation_scores_normalized = torch.unsqueeze( + user_relation_scores_normalized, dim=-1 + ) # [batch_size, -1, 
n_neighbor, 1] neighbors_aggregated = torch.mean( - neighbor_vectors, dim=2) # [batch_size, -1, dim] + user_relation_scores_normalized * neighbor_vectors, dim=2 + ) # [batch_size, -1, dim] + else: + neighbors_aggregated = torch.mean(neighbor_vectors, dim=2) # [batch_size, -1, dim] return neighbors_aggregated def aggregate(self, user_embeddings, entities, relations): @@ -218,36 +214,29 @@ def aggregate(self, user_embeddings, entities, relations): for i in range(self.n_iter): entity_vectors_next_iter = [] for hop in range(self.n_iter - i): - shape = (self.batch_size, -1, - self.neighbor_sample_size, self.embedding_size) + shape = (self.batch_size, -1, self.neighbor_sample_size, self.embedding_size) self_vectors = entity_vectors[hop] - neighbor_vectors = torch.reshape( - entity_vectors[hop + 1], shape) - neighbor_relations = torch.reshape( - relation_vectors[hop], shape) + neighbor_vectors = entity_vectors[hop + 1].reshape(shape) + neighbor_relations = relation_vectors[hop].reshape(shape) - neighbors_agg = self.mix_neighbor_vectors(neighbor_vectors, neighbor_relations, - user_embeddings) # [batch_size, -1, dim] + neighbors_agg = self.mix_neighbor_vectors( + neighbor_vectors, neighbor_relations, user_embeddings + ) # [batch_size, -1, dim] if self.aggregator_class == 'sum': - output = torch.reshape( - self_vectors + neighbors_agg, (-1, self.embedding_size)) # [-1, dim] + output = (self_vectors + neighbors_agg).reshape(-1, self.embedding_size) # [-1, dim] elif self.aggregator_class == 'neighbor': - output = torch.reshape( - neighbors_agg, (-1, self.embedding_size)) # [-1, dim] + output = neighbors_agg.reshape(-1, self.embedding_size) # [-1, dim] elif self.aggregator_class == 'concat': # [batch_size, -1, dim * 2] output = torch.cat([self_vectors, neighbors_agg], dim=-1) - output = torch.reshape( - output, (-1, self.embedding_size * 2)) # [-1, dim * 2] + output = output.reshape(-1, self.embedding_size * 2) # [-1, dim * 2] else: - raise Exception("Unknown aggregator: " + - 
self.aggregator_class) + raise Exception("Unknown aggregator: " + self.aggregator_class) output = self.linear_layers[i](output) # [batch_size, -1, dim] - output = torch.reshape( - output, [self.batch_size, -1, self.embedding_size]) + output = output.reshape(self.batch_size, -1, self.embedding_size) if i == self.n_iter - 1: vector = self.Tanh(output) @@ -257,8 +246,7 @@ def aggregate(self, user_embeddings, entities, relations): entity_vectors_next_iter.append(vector) entity_vectors = entity_vectors_next_iter - item_embeddings = torch.reshape( - entity_vectors[0], (self.batch_size, self.embedding_size)) + item_embeddings = entity_vectors[0].reshape(self.batch_size, self.embedding_size) return item_embeddings @@ -286,8 +274,7 @@ def calculate_loss(self, interaction): neg_item_score = torch.mul(user_e, neg_item_e).sum(dim=1) predict = torch.cat((pos_item_score, neg_item_score)) - target = torch.zeros( - len(user) * 2, dtype=torch.float32).to(self.device) + target = torch.zeros(len(user) * 2, dtype=torch.float32).to(self.device) target[:len(user)] = 1 rec_loss = self.bce_loss(predict, target) @@ -306,11 +293,9 @@ def full_sort_predict(self, interaction): user_index = interaction[self.USER_ID] item_index = torch.tensor(range(self.n_items)).to(self.device) - user = torch.unsqueeze(user_index, dim=1).repeat( - 1, item_index.shape[0]) + user = torch.unsqueeze(user_index, dim=1).repeat(1, item_index.shape[0]) user = torch.flatten(user) - item = torch.unsqueeze(item_index, dim=0).repeat( - user_index.shape[0], 1) + item = torch.unsqueeze(item_index, dim=0).repeat(user_index.shape[0], 1) item = torch.flatten(item) user_e, item_e = self.forward(user, item) diff --git a/recbole/model/knowledge_aware_recommender/kgnnls.py b/recbole/model/knowledge_aware_recommender/kgnnls.py index 8d18f30de..0e262a8cf 100644 --- a/recbole/model/knowledge_aware_recommender/kgnnls.py +++ b/recbole/model/knowledge_aware_recommender/kgnnls.py @@ -15,15 +15,16 @@ 
https://github.com/hwwang55/KGNN-LS """ +import random + +import numpy as np import torch import torch.nn as nn -import numpy as np -import random -from recbole.utils import InputType from recbole.model.abstract_recommender import KnowledgeRecommender -from recbole.model.loss import BPRLoss, EmbLoss from recbole.model.init import xavier_normal_initialization +from recbole.model.loss import EmbLoss +from recbole.utils import InputType class KGNNLS(KnowledgeRecommender): @@ -51,33 +52,31 @@ def __init__(self, config, dataset): # define embedding self.user_embedding = nn.Embedding(self.n_users, self.embedding_size) - self.entity_embedding = nn.Embedding( - self.n_entities, self.embedding_size) - self.relation_embedding = nn.Embedding( - self.n_relations + 1, self.embedding_size) + self.entity_embedding = nn.Embedding(self.n_entities, self.embedding_size) + self.relation_embedding = nn.Embedding(self.n_relations + 1, self.embedding_size) # sample neighbors and construct interaction table kg_graph = dataset.kg_graph(form='coo', value_field='relation_id') adj_entity, adj_relation = self.construct_adj(kg_graph) - self.adj_entity, self.adj_relation = adj_entity.to( - self.device), adj_relation.to(self.device) + self.adj_entity, self.adj_relation = adj_entity.to(self.device), adj_relation.to(self.device) - inter_feat = dataset.dataset.inter_feat.values - pos_users = torch.from_numpy(inter_feat[:, 0]) - pos_items = torch.from_numpy(inter_feat[:, 1]) + inter_feat = dataset.dataset.inter_feat + pos_users = inter_feat[dataset.dataset.uid_field] + pos_items = inter_feat[dataset.dataset.iid_field] pos_label = torch.ones(pos_items.shape) - pos_interaction_table, self.offset = self.get_interaction_table( - pos_users, pos_items, pos_label) - self.interaction_table = self.sample_neg_interaction( - pos_interaction_table, self.offset) + pos_interaction_table, self.offset = self.get_interaction_table(pos_users, pos_items, pos_label) + self.interaction_table = 
self.sample_neg_interaction(pos_interaction_table, self.offset) # define function self.softmax = nn.Softmax(dim=-1) self.linear_layers = torch.nn.ModuleList() for i in range(self.n_iter): - self.linear_layers.append(nn.Linear( - self.embedding_size if not self.aggregator_class == 'concat' else self.embedding_size * 2, - self.embedding_size)) + self.linear_layers.append( + nn.Linear( + self.embedding_size if not self.aggregator_class == 'concat' else self.embedding_size * 2, + self.embedding_size + ) + ) self.ReLU = nn.ReLU() self.Tanh = nn.Tanh() @@ -145,7 +144,7 @@ def construct_adj(self, kg_graph): - adj_relation (torch.LongTensor): each line stores the corresponding sampled neighbor relations, shape: [n_entities, neighbor_sample_size] """ - # print('constructing knowledge graph ...') + # self.logger.info('constructing knowledge graph ...') # treat the KG as an undirected graph kg_dict = dict() for triple in zip(kg_graph.row, kg_graph.data, kg_graph.col): @@ -159,34 +158,30 @@ def construct_adj(self, kg_graph): kg_dict[tail] = [] kg_dict[tail].append((head, relation)) - # print('constructing adjacency matrix ...') + # self.logger.info('constructing adjacency matrix ...') # each line of adj_entity stores the sampled neighbor entities for a given entity # each line of adj_relation stores the corresponding sampled neighbor relations entity_num = kg_graph.shape[0] - adj_entity = np.zeros( - [entity_num, self.neighbor_sample_size], dtype=np.int64) - adj_relation = np.zeros( - [entity_num, self.neighbor_sample_size], dtype=np.int64) + adj_entity = np.zeros([entity_num, self.neighbor_sample_size], dtype=np.int64) + adj_relation = np.zeros([entity_num, self.neighbor_sample_size], dtype=np.int64) for entity in range(entity_num): if entity not in kg_dict.keys(): - adj_entity[entity] = np.array( - [entity] * self.neighbor_sample_size) - adj_relation[entity] = np.array( - [0] * self.neighbor_sample_size) + adj_entity[entity] = np.array([entity] * self.neighbor_sample_size) + 
adj_relation[entity] = np.array([0] * self.neighbor_sample_size) continue neighbors = kg_dict[entity] n_neighbors = len(neighbors) if n_neighbors >= self.neighbor_sample_size: - sampled_indices = np.random.choice(list(range(n_neighbors)), size=self.neighbor_sample_size, - replace=False) + sampled_indices = np.random.choice( + list(range(n_neighbors)), size=self.neighbor_sample_size, replace=False + ) else: - sampled_indices = np.random.choice(list(range(n_neighbors)), size=self.neighbor_sample_size, - replace=True) - adj_entity[entity] = np.array( - [neighbors[i][0] for i in sampled_indices]) - adj_relation[entity] = np.array( - [neighbors[i][1] for i in sampled_indices]) + sampled_indices = np.random.choice( + list(range(n_neighbors)), size=self.neighbor_sample_size, replace=True + ) + adj_entity[entity] = np.array([neighbors[i][0] for i in sampled_indices]) + adj_relation[entity] = np.array([neighbors[i][1] for i in sampled_indices]) return torch.from_numpy(adj_entity), torch.from_numpy(adj_relation) @@ -212,10 +207,8 @@ def get_neighbors(self, items): relations = [] for i in range(self.n_iter): index = torch.flatten(entities[i]) - neighbor_entities = torch.reshape(torch.index_select( - self.adj_entity, 0, index), (self.batch_size, -1)) - neighbor_relations = torch.reshape(torch.index_select( - self.adj_relation, 0, index), (self.batch_size, -1)) + neighbor_entities = torch.index_select(self.adj_entity, 0, index).reshape(self.batch_size, -1) + neighbor_relations = torch.index_select(self.adj_relation, 0, index).reshape(self.batch_size, -1) entities.append(neighbor_entities) relations.append(neighbor_relations) return entities, relations @@ -244,43 +237,39 @@ def aggregate(self, user_embeddings, entities, relations): for i in range(self.n_iter): entity_vectors_next_iter = [] for hop in range(self.n_iter - i): - shape = (self.batch_size, -1, - self.neighbor_sample_size, self.embedding_size) + shape = (self.batch_size, -1, self.neighbor_sample_size, 
self.embedding_size) self_vectors = entity_vectors[hop] - neighbor_vectors = torch.reshape( - entity_vectors[hop + 1], shape) - neighbor_relations = torch.reshape( - relation_vectors[hop], shape) + neighbor_vectors = entity_vectors[hop + 1].reshape(shape) + neighbor_relations = relation_vectors[hop].reshape(shape) # mix_neighbor_vectors - user_embeddings = torch.reshape(user_embeddings, - (self.batch_size, 1, 1, self.embedding_size)) # [batch_size, 1, 1, dim] - user_relation_scores = torch.mean(user_embeddings * neighbor_relations, - dim=-1) # [batch_size, -1, n_neighbor] - user_relation_scores_normalized = torch.unsqueeze(self.softmax(user_relation_scores), - dim=-1) # [batch_size, -1, n_neighbor, 1] - neighbors_agg = torch.mean(user_relation_scores_normalized * neighbor_vectors, - dim=2) # [batch_size, -1, dim] + user_embeddings = user_embeddings.reshape( + self.batch_size, 1, 1, self.embedding_size + ) # [batch_size, 1, 1, dim] + user_relation_scores = torch.mean( + user_embeddings * neighbor_relations, dim=-1 + ) # [batch_size, -1, n_neighbor] + user_relation_scores_normalized = torch.unsqueeze( + self.softmax(user_relation_scores), dim=-1 + ) # [batch_size, -1, n_neighbor, 1] + neighbors_agg = torch.mean( + user_relation_scores_normalized * neighbor_vectors, dim=2 + ) # [batch_size, -1, dim] if self.aggregator_class == 'sum': - output = torch.reshape( - self_vectors + neighbors_agg, (-1, self.embedding_size)) # [-1, dim] + output = (self_vectors + neighbors_agg).reshape(-1, self.embedding_size) # [-1, dim] elif self.aggregator_class == 'neighbor': - output = torch.reshape( - neighbors_agg, (-1, self.embedding_size)) # [-1, dim] + output = neighbors_agg.reshape(-1, self.embedding_size) # [-1, dim] elif self.aggregator_class == 'concat': # [batch_size, -1, dim * 2] output = torch.cat([self_vectors, neighbors_agg], dim=-1) - output = torch.reshape( - output, (-1, self.embedding_size * 2)) # [-1, dim * 2] + output = output.reshape(-1, self.embedding_size * 2) # 
[-1, dim * 2] else: - raise Exception("Unknown aggregator: " + - self.aggregator_class) + raise Exception("Unknown aggregator: " + self.aggregator_class) output = self.linear_layers[i](output) # [batch_size, -1, dim] - output = torch.reshape( - output, [self.batch_size, -1, self.embedding_size]) + output = output.reshape(self.batch_size, -1, self.embedding_size) if i == self.n_iter - 1: vector = self.Tanh(output) @@ -290,8 +279,7 @@ def aggregate(self, user_embeddings, entities, relations): entity_vectors_next_iter.append(vector) entity_vectors = entity_vectors_next_iter - res = torch.reshape( - entity_vectors[0], (self.batch_size, self.embedding_size)) + res = entity_vectors[0].reshape(self.batch_size, self.embedding_size) return res def label_smoothness_predict(self, user_embeddings, user, entities, relations): @@ -320,8 +308,7 @@ def label_smoothness_predict(self, user_embeddings, user, entities, relations): for entities_per_iter in entities: users = torch.unsqueeze(user, dim=1) # [batch_size, 1] - user_entity_concat = users * self.offset + \ - entities_per_iter # [batch_size, n_neighbor^i] + user_entity_concat = users * self.offset + entities_per_iter # [batch_size, n_neighbor^i] # the first one in entities is the items to be held out if holdout_item_for_user is None: @@ -340,10 +327,9 @@ def lookup_interaction_table(x, _): holdout_mask = (holdout_item_for_user - user_entity_concat).bool() # True if the entity is a labeled item reset_mask = (initial_label - 0.5).bool() - reset_mask = torch.logical_and( - reset_mask, holdout_mask) # remove held-out items - initial_label = holdout_mask.float() * initial_label + torch.logical_not( - holdout_mask).float() * 0.5 # label initialization + reset_mask = torch.logical_and(reset_mask, holdout_mask) # remove held-out items + initial_label = holdout_mask.float() * initial_label + \ + torch.logical_not(holdout_mask).float() * 0.5 # label initialization reset_masks.append(reset_mask) entity_labels.append(initial_label) @@ 
-357,24 +343,25 @@ def lookup_interaction_table(x, _): for hop in range(self.n_iter - i): masks = reset_masks[hop] self_labels = entity_labels[hop] - neighbor_labels = torch.reshape(entity_labels[hop + 1], - [self.batch_size, -1, self.neighbor_sample_size]) - neighbor_relations = torch.reshape(relation_vectors[hop], - [self.batch_size, -1, self.neighbor_sample_size, - self.embedding_size]) + neighbor_labels = entity_labels[hop + 1].reshape(self.batch_size, -1, self.neighbor_sample_size) + neighbor_relations = relation_vectors[hop].reshape( + self.batch_size, -1, self.neighbor_sample_size, self.embedding_size + ) # mix_neighbor_labels - user_embeddings = torch.reshape(user_embeddings, - [self.batch_size, 1, 1, self.embedding_size]) # [batch_size, 1, 1, dim] - user_relation_scores = torch.mean(user_embeddings * neighbor_relations, - dim=-1) # [batch_size, -1, n_neighbor] - user_relation_scores_normalized = self.softmax( - user_relation_scores) # [batch_size, -1, n_neighbor] - - neighbors_aggregated_label = torch.mean(user_relation_scores_normalized * neighbor_labels, - dim=2) # [batch_size, -1, dim] # [batch_size, -1] - output = masks.float() * self_labels + torch.logical_not(masks).float() * \ - neighbors_aggregated_label + user_embeddings = user_embeddings.reshape( + self.batch_size, 1, 1, self.embedding_size + ) # [batch_size, 1, 1, dim] + user_relation_scores = torch.mean( + user_embeddings * neighbor_relations, dim=-1 + ) # [batch_size, -1, n_neighbor] + user_relation_scores_normalized = self.softmax(user_relation_scores) # [batch_size, -1, n_neighbor] + + neighbors_aggregated_label = torch.mean( + user_relation_scores_normalized * neighbor_labels, dim=2 + ) # [batch_size, -1, dim] # [batch_size, -1] + output = masks.float() * self_labels + \ + torch.logical_not(masks).float() * neighbors_aggregated_label entity_labels_next_iter.append(output) entity_labels = entity_labels_next_iter @@ -408,8 +395,7 @@ def calculate_ls_loss(self, user, item, target): user_e = 
self.user_embedding(user) entities, relations = self.get_neighbors(item) - predicted_labels = self.label_smoothness_predict( - user_e, user, entities, relations) + predicted_labels = self.label_smoothness_predict(user_e, user, entities, relations) ls_loss = self.bce_loss(predicted_labels, target) return ls_loss @@ -417,8 +403,7 @@ def calculate_loss(self, interaction): user = interaction[self.USER_ID] pos_item = interaction[self.ITEM_ID] neg_item = interaction[self.NEG_ITEM_ID] - target = torch.zeros( - len(user) * 2, dtype=torch.float32).to(self.device) + target = torch.zeros(len(user) * 2, dtype=torch.float32).to(self.device) target[:len(user)] = 1 users = torch.cat((user, user)) @@ -444,11 +429,9 @@ def full_sort_predict(self, interaction): user_index = interaction[self.USER_ID] item_index = torch.tensor(range(self.n_items)).to(self.device) - user = torch.unsqueeze(user_index, dim=1).repeat( - 1, item_index.shape[0]) + user = torch.unsqueeze(user_index, dim=1).repeat(1, item_index.shape[0]) user = torch.flatten(user) - item = torch.unsqueeze(item_index, dim=0).repeat( - user_index.shape[0], 1) + item = torch.unsqueeze(item_index, dim=0).repeat(user_index.shape[0], 1) item = torch.flatten(item) user_e, item_e = self.forward(user, item) diff --git a/recbole/model/knowledge_aware_recommender/ktup.py b/recbole/model/knowledge_aware_recommender/ktup.py index 72d7034ed..adafa68e6 100644 --- a/recbole/model/knowledge_aware_recommender/ktup.py +++ b/recbole/model/knowledge_aware_recommender/ktup.py @@ -18,10 +18,10 @@ import torch.nn.functional as F from torch.autograd import Variable -from recbole.utils import InputType from recbole.model.abstract_recommender import KnowledgeRecommender -from recbole.model.loss import BPRLoss, EmbMarginLoss from recbole.model.init import xavier_uniform_initialization +from recbole.model.loss import BPRLoss, EmbMarginLoss +from recbole.utils import InputType class KTUP(KnowledgeRecommender): @@ -74,7 +74,7 @@ def __init__(self, config, 
dataset): self.relation_norm_embedding.weight.data = normalize_rel_norm_emb def _masked_softmax(self, logits): - probs = F.softmax(logits, dim=len(logits.shape)-1) + probs = F.softmax(logits, dim=len(logits.shape) - 1) return probs def convert_to_one_hot(self, indices, num_classes): @@ -120,14 +120,14 @@ def st_gumbel_softmax(self, logits, temperature=1.0): y = logits + gumbel_noise y = self._masked_softmax(logits=y / temperature) y_argmax = y.max(len(y.shape) - 1)[1] - y_hard = self.convert_to_one_hot( - indices=y_argmax, - num_classes=y.size(len(y.shape) - 1)).float() + y_hard = self.convert_to_one_hot(indices=y_argmax, num_classes=y.size(len(y.shape) - 1)).float() y = (y_hard - y).detach() + y return y def _get_preferences(self, user_e, item_e, use_st_gumbel=False): - pref_probs = torch.matmul(user_e + item_e, torch.t(self.pref_embedding.weight + self.relation_embedding.weight)) / 2 + pref_probs = torch.matmul( + user_e + item_e, torch.t(self.pref_embedding.weight + self.relation_embedding.weight) + ) / 2 if use_st_gumbel: # todo: different torch versions may cause the st_gumbel_softmax to report errors, wait to be test pref_probs = self.st_gumbel_softmax(pref_probs) @@ -141,9 +141,9 @@ def _transH_projection(original, norm): def _get_score(self, h_e, r_e, t_e): if self.L1_flag: - score = - torch.sum(torch.abs(h_e + r_e - t_e), 1) + score = -torch.sum(torch.abs(h_e + r_e - t_e), 1) else: - score = - torch.sum((h_e + r_e - t_e) ** 2, 1) + score = -torch.sum((h_e + r_e - t_e) ** 2, 1) return score def forward(self, user, item): @@ -209,7 +209,9 @@ def calculate_kg_loss(self, interaction): loss = self.kg_weight * (kg_loss + orthogonal_loss + reg_loss) entity = torch.cat([h, pos_t, neg_t]) entity = entity[entity < self.n_items] - align_loss = self.align_weight * alignLoss(self.item_embedding(entity), self.entity_embedding(entity), self.L1_flag) + align_loss = self.align_weight * alignLoss( + self.item_embedding(entity), self.entity_embedding(entity), self.L1_flag + 
) return loss, align_loss @@ -221,8 +223,10 @@ def predict(self, interaction): def orthogonalLoss(rel_embeddings, norm_embeddings): - return torch.sum(torch.sum(norm_embeddings * rel_embeddings, dim=1, keepdim=True) ** 2 / - torch.sum(rel_embeddings ** 2, dim=1, keepdim=True)) + return torch.sum( + torch.sum(norm_embeddings * rel_embeddings, dim=1, keepdim=True) ** 2 / + torch.sum(rel_embeddings ** 2, dim=1, keepdim=True) + ) def alignLoss(emb1, emb2, L1_flag=False): diff --git a/recbole/model/knowledge_aware_recommender/mkr.py b/recbole/model/knowledge_aware_recommender/mkr.py index 7354ced6e..7250dcc81 100644 --- a/recbole/model/knowledge_aware_recommender/mkr.py +++ b/recbole/model/knowledge_aware_recommender/mkr.py @@ -16,10 +16,10 @@ import torch import torch.nn as nn -from recbole.utils import InputType -from recbole.model.layers import MLPLayers from recbole.model.abstract_recommender import KnowledgeRecommender from recbole.model.init import xavier_normal_initialization +from recbole.model.layers import MLPLayers +from recbole.utils import InputType class MKR(KnowledgeRecommender): @@ -38,8 +38,8 @@ def __init__(self, config, dataset): self.LABEL = config['LABEL_FIELD'] self.embedding_size = config['embedding_size'] self.kg_embedding_size = config['kg_embedding_size'] - self.L = config['low_layers_num'] # the number of low layers - self.H = config['high_layers_num'] # the number of high layers + self.L = config['low_layers_num'] # the number of low layers + self.H = config['high_layers_num'] # the number of high layers self.reg_weight = config['reg_weight'] self.use_inner_product = config['use_inner_product'] self.dropout_prob = config['dropout_prob'] @@ -75,24 +75,28 @@ def __init__(self, config, dataset): # parameters initialization self.apply(xavier_normal_initialization) - def forward(self, user_indices=None, item_indices=None, head_indices=None, - relation_indices=None, tail_indices=None): + def forward( + self, user_indices=None, item_indices=None, 
head_indices=None, relation_indices=None, tail_indices=None + ): self.item_embeddings = self.item_embeddings_lookup(item_indices) self.head_embeddings = self.entity_embeddings_lookup(head_indices) - self.item_embeddings, self.head_embeddings = self.cc_unit([self.item_embeddings, self.head_embeddings]) # calculate feature interactions between items and entities + self.item_embeddings, self.head_embeddings = self.cc_unit( + [self.item_embeddings, self.head_embeddings] + ) # calculate feature interactions between items and entities if user_indices is not None: # RS self.user_embeddings = self.user_embeddings_lookup(user_indices) self.user_embeddings = self.user_mlp(self.user_embeddings) - - if self.use_inner_product: # get scores by inner product. - self.scores = torch.sum(self.user_embeddings * self.item_embeddings, 1) # [batch_size] - else: # get scores by mlp layers - self.user_item_concat = torch.cat([self.user_embeddings, self.item_embeddings], 1) # [batch_size, emb_dim*2] + + if self.use_inner_product: # get scores by inner product. 
+ self.scores = torch.sum(self.user_embeddings * self.item_embeddings, 1) # [batch_size] + else: # get scores by mlp layers + self.user_item_concat = torch.cat([self.user_embeddings, self.item_embeddings], + 1) # [batch_size, emb_dim*2] self.user_item_concat = self.rs_mlp(self.user_item_concat) - - self.scores = torch.squeeze(self.rs_pred_mlp(self.user_item_concat)) # [batch_size] + + self.scores = torch.squeeze(self.rs_pred_mlp(self.user_item_concat)) # [batch_size] self.scores_normalized = torch.sigmoid(self.scores) outputs = [self.user_embeddings, self.item_embeddings, self.scores, self.scores_normalized] @@ -101,16 +105,17 @@ def forward(self, user_indices=None, item_indices=None, head_indices=None, self.tail_embeddings = self.entity_embeddings_lookup(tail_indices) self.relation_embeddings = self.relation_embeddings_lookup(relation_indices) self.tail_embeddings = self.tail_mlp(self.tail_embeddings) - - self.head_relation_concat = torch.cat([self.head_embeddings, self.relation_embeddings], 1) # [batch_size, emb_dim*2] + + self.head_relation_concat = torch.cat([self.head_embeddings, self.relation_embeddings], + 1) # [batch_size, emb_dim*2] self.head_relation_concat = self.kge_mlp(self.head_relation_concat) - self.tail_pred = self.kge_pred_mlp(self.head_relation_concat) # [batch_size, 1] + self.tail_pred = self.kge_pred_mlp(self.head_relation_concat) # [batch_size, 1] self.tail_pred = torch.sigmoid(self.tail_pred) self.scores_kge = torch.sigmoid(torch.sum(self.tail_embeddings * self.tail_pred, 1)) self.rmse = torch.mean( - torch.sqrt(torch.sum(torch.pow(self.tail_embeddings - - self.tail_pred, 2), 1) / self.embedding_size)) + torch.sqrt(torch.sum(torch.pow(self.tail_embeddings - self.tail_pred, 2), 1) / self.embedding_size) + ) outputs = [self.head_embeddings, self.tail_embeddings, self.scores_kge, self.rmse] return outputs @@ -179,6 +184,7 @@ class CrossCompressUnit(nn.Module): r"""This is Cross&Compress Unit for MKR model to model feature interactions between 
items and entities. """ + def __init__(self, dim): super(CrossCompressUnit, self).__init__() self.dim = dim @@ -194,7 +200,7 @@ def forward(self, inputs): e = torch.unsqueeze(e, 1) # [batch_size, dim, dim] c_matrix = torch.matmul(v, e) - c_matrix_transpose = c_matrix.permute(0,2,1) + c_matrix_transpose = c_matrix.permute(0, 2, 1) # [batch_size * dim, dim] c_matrix = c_matrix.view(-1, self.dim) c_matrix_transpose = c_matrix_transpose.contiguous().view(-1, self.dim) diff --git a/recbole/model/knowledge_aware_recommender/ripplenet.py b/recbole/model/knowledge_aware_recommender/ripplenet.py index 65c0e9508..899d74bbc 100644 --- a/recbole/model/knowledge_aware_recommender/ripplenet.py +++ b/recbole/model/knowledge_aware_recommender/ripplenet.py @@ -3,7 +3,6 @@ # @Author : gaole he # @Email : hegaole@ruc.edu.cn - r""" RippleNet ##################################################### @@ -12,15 +11,16 @@ in CIKM 2018. """ +import collections + +import numpy as np import torch import torch.nn as nn -import numpy as np -import collections -from recbole.utils import InputType from recbole.model.abstract_recommender import KnowledgeRecommender -from recbole.model.loss import BPRLoss, EmbLoss from recbole.model.init import xavier_normal_initialization +from recbole.model.loss import BPRLoss, EmbLoss +from recbole.utils import InputType class RippleNet(KnowledgeRecommender): @@ -112,12 +112,12 @@ def _build_ripple_set(self): # we simply copy the ripple set of the last hop here if len(memories_h) == 0: if h == 0: - # print("user {} without 1-hop kg facts, fill with padding".format(user)) + # self.logger.info("user {} without 1-hop kg facts, fill with padding".format(user)) # raise AssertionError("User without facts in 1st hop") n_padding += 1 - memories_h = [0 for i in range(self.n_memory)] - memories_r = [0 for i in range(self.n_memory)] - memories_t = [0 for i in range(self.n_memory)] + memories_h = [0 for _ in range(self.n_memory)] + memories_r = [0 for _ in 
range(self.n_memory)] + memories_t = [0 for _ in range(self.n_memory)] memories_h = torch.LongTensor(memories_h).to(self.device) memories_r = torch.LongTensor(memories_r).to(self.device) memories_t = torch.LongTensor(memories_t).to(self.device) @@ -135,7 +135,7 @@ def _build_ripple_set(self): memories_r = torch.LongTensor(memories_r).to(self.device) memories_t = torch.LongTensor(memories_t).to(self.device) ripple_set[user].append((memories_h, memories_r, memories_t)) - print("{} among {} users are padded".format(n_padding, len(self.user_dict))) + self.logger.info("{} among {} users are padded".format(n_padding, len(self.user_dict))) return ripple_set def forward(self, interaction): @@ -161,7 +161,7 @@ def forward(self, interaction): head_ent = torch.cat(memories_h[i], dim=0) relation = torch.cat(memories_r[i], dim=0) tail_ent = torch.cat(memories_t[i], dim=0) - # print("Hop {}, size {}".format(i, head_ent.size(), relation.size(), tail_ent.size())) + # self.logger.info("Hop {}, size {}".format(i, head_ent.size(), relation.size(), tail_ent.size())) # [batch size * n_memory, dim] self.h_emb_list.append(self.entity_embedding(head_ent)) @@ -259,7 +259,8 @@ def _key_addressing_full(self): r"""Conduct reasoning for specific item and user ripple set Returns: - o_list (dict -> torch.cuda.FloatTensor): list of torch.cuda.FloatTensor n_hop * [batch_size, n_item, embedding_size] + o_list (dict -> torch.cuda.FloatTensor): list of torch.cuda.FloatTensor + n_hop * [batch_size, n_item, embedding_size] """ o_list = [] for hop in range(self.n_hop): @@ -332,7 +333,7 @@ def full_sort_predict(self, interaction): head_ent = torch.cat(memories_h[i], dim=0) relation = torch.cat(memories_r[i], dim=0) tail_ent = torch.cat(memories_t[i], dim=0) - # print("Hop {}, size {}".format(i, head_ent.size(), relation.size(), tail_ent.size())) + # self.logger.info("Hop {}, size {}".format(i, head_ent.size(), relation.size(), tail_ent.size())) # [batch size * n_memory, dim] 
self.h_emb_list.append(self.entity_embedding(head_ent)) diff --git a/recbole/model/layers.py b/recbole/model/layers.py index 81cbe75aa..3edb31568 100644 --- a/recbole/model/layers.py +++ b/recbole/model/layers.py @@ -15,15 +15,16 @@ Common Layers in recommender system """ -from logging import getLogger -import numpy as np import copy import math + +import numpy as np import torch import torch.nn as nn import torch.nn.functional as fn from torch.nn.init import normal_ -from recbole.utils import ModelType, InputType, FeatureType + +from recbole.utils import FeatureType class MLPLayers(nn.Module): @@ -32,8 +33,8 @@ class MLPLayers(nn.Module): Args: - layers(list): a list contains the size of each layer in mlp layers - dropout(float): probability of an element to be zeroed. Default: 0 - - activation(str): activation function after each layer in mlp layers. Default: 'relu' - candidates: 'sigmoid', 'tanh', 'relu', 'leekyrelu', 'none' + - activation(str): activation function after each layer in mlp layers. Default: 'relu'. 
+ candidates: 'sigmoid', 'tanh', 'relu', 'leekyrelu', 'none' Shape: @@ -50,7 +51,7 @@ class MLPLayers(nn.Module): >>> torch.Size([128, 16]) """ - def __init__(self, layers, dropout=0, activation='relu', bn=False, init_method=None): + def __init__(self, layers, dropout=0., activation='relu', bn=False, init_method=None): super(MLPLayers, self).__init__() self.layers = layers self.dropout = dropout @@ -214,14 +215,14 @@ def __init__(self, in_dim, att_dim): self.h = nn.Parameter(torch.randn(att_dim), requires_grad=True) def forward(self, infeatures): - att_singal = self.w(infeatures) # [batch_size, M, att_dim] - att_singal = fn.relu(att_singal) # [batch_size, M, att_dim] + att_signal = self.w(infeatures) # [batch_size, M, att_dim] + att_signal = fn.relu(att_signal) # [batch_size, M, att_dim] - att_singal = torch.mul(att_singal, self.h) # [batch_size, M, att_dim] - att_singal = torch.sum(att_singal, dim=2) # [batch_size, M] - att_singal = fn.softmax(att_singal, dim=1) # [batch_size, M] + att_signal = torch.mul(att_signal, self.h) # [batch_size, M, att_dim] + att_signal = torch.sum(att_signal, dim=2) # [batch_size, M] + att_signal = fn.softmax(att_signal, dim=1) # [batch_size, M] - return att_singal + return att_signal class Dice(nn.Module): @@ -259,8 +260,9 @@ class SequenceAttLayer(nn.Module): torch.Tensor: result """ - def __init__(self, mask_mat, att_hidden_size=(80, 40), activation='sigmoid', softmax_stag=False, - return_seq_weight=True): + def __init__( + self, mask_mat, att_hidden_size=(80, 40), activation='sigmoid', softmax_stag=False, return_seq_weight=True + ): super(SequenceAttLayer, self).__init__() self.att_hidden_size = att_hidden_size self.activation = activation @@ -271,11 +273,11 @@ def __init__(self, mask_mat, att_hidden_size=(80, 40), activation='sigmoid', sof self.dense = nn.Linear(self.att_hidden_size[-1], 1) def forward(self, queries, keys, keys_length): - embbedding_size = queries.shape[-1] # H + embedding_size = queries.shape[-1] # H hist_len = 
keys.shape[1] # T queries = queries.repeat(1, hist_len) - queries = queries.view(-1, hist_len, embbedding_size) + queries = queries.view(-1, hist_len, embedding_size) # MLP Layer input_tensor = torch.cat([queries, keys, queries - keys, queries * keys], dim=-1) @@ -295,7 +297,7 @@ def forward(self, queries, keys, keys_length): output = output.masked_fill(mask=mask, value=torch.tensor(mask_value)) output = output.unsqueeze(1) - output = output / (embbedding_size ** 0.5) + output = output / (embedding_size ** 0.5) # get the weight of each user's history list about the target item if self.softmax_stag: @@ -319,13 +321,10 @@ class VanillaAttention(nn.Module): weights (torch.Tensor): the attention weights """ + def __init__(self, hidden_dim, attn_dim): super().__init__() - self.projection = nn.Sequential( - nn.Linear(hidden_dim, attn_dim), - nn.ReLU(True), - nn.Linear(attn_dim, 1) - ) + self.projection = nn.Sequential(nn.Linear(hidden_dim, attn_dim), nn.ReLU(True), nn.Linear(attn_dim, 1)) def forward(self, input_tensor): # (B, Len, num, H) -> (B, Len, num, 1) @@ -348,12 +347,14 @@ class MultiHeadAttention(nn.Module): hidden_states (torch.Tensor): the output of the multi-head self-attention layer """ + def __init__(self, n_heads, hidden_size, hidden_dropout_prob, attn_dropout_prob, layer_norm_eps): super(MultiHeadAttention, self).__init__() if hidden_size % n_heads != 0: raise ValueError( "The hidden size (%d) is not a multiple of the number of attention " - "heads (%d)" % (hidden_size, n_heads)) + "heads (%d)" % (hidden_size, n_heads) + ) self.num_attention_heads = n_heads self.attention_head_size = int(hidden_size / n_heads) @@ -420,6 +421,7 @@ class FeedForward(nn.Module): hidden_states (torch.Tensor): the output of the point-wise feed-forward layer """ + def __init__(self, hidden_size, inner_size, hidden_dropout_prob, hidden_act, layer_norm_eps): super(FeedForward, self).__init__() self.dense_1 = nn.Linear(hidden_size, inner_size) @@ -473,16 +475,20 @@ class 
TransformerLayer(nn.Module): attention_mask (torch.Tensor): the attention mask for the multi-head self-attention sublayer Returns: - feedforward_output (torch.Tensor): the output of the point-wise feed-forward sublayer, is the output of the transformer layer + feedforward_output (torch.Tensor): The output of the point-wise feed-forward sublayer, + is the output of the transformer layer. """ - def __init__(self, n_heads, hidden_size, intermediate_size, - hidden_dropout_prob, attn_dropout_prob, hidden_act, layer_norm_eps): + + def __init__( + self, n_heads, hidden_size, intermediate_size, hidden_dropout_prob, attn_dropout_prob, hidden_act, + layer_norm_eps + ): super(TransformerLayer, self).__init__() - self.multi_head_attention = MultiHeadAttention(n_heads, hidden_size, - hidden_dropout_prob, attn_dropout_prob, layer_norm_eps) - self.feed_forward = FeedForward(hidden_size, intermediate_size, - hidden_dropout_prob, hidden_act, layer_norm_eps) + self.multi_head_attention = MultiHeadAttention( + n_heads, hidden_size, hidden_dropout_prob, attn_dropout_prob, layer_norm_eps + ) + self.feed_forward = FeedForward(hidden_size, intermediate_size, hidden_dropout_prob, hidden_act, layer_norm_eps) def forward(self, hidden_states, attention_mask): attention_output = self.multi_head_attention(hidden_states, attention_mask) @@ -504,32 +510,35 @@ class TransformerEncoder(nn.Module): - layer_norm_eps(float): a value added to the denominator for numerical stability. 
Default: 1e-12 """ - def __init__(self, - n_layers=2, - n_heads=2, - hidden_size=64, - inner_size=256, - hidden_dropout_prob=0.5, - attn_dropout_prob=0.5, - hidden_act='gelu', - layer_norm_eps=1e-12): + + def __init__( + self, + n_layers=2, + n_heads=2, + hidden_size=64, + inner_size=256, + hidden_dropout_prob=0.5, + attn_dropout_prob=0.5, + hidden_act='gelu', + layer_norm_eps=1e-12 + ): super(TransformerEncoder, self).__init__() - layer = TransformerLayer(n_heads, hidden_size, inner_size, - hidden_dropout_prob, attn_dropout_prob, hidden_act, layer_norm_eps) - self.layer = nn.ModuleList([copy.deepcopy(layer) - for _ in range(n_layers)]) + layer = TransformerLayer( + n_heads, hidden_size, inner_size, hidden_dropout_prob, attn_dropout_prob, hidden_act, layer_norm_eps + ) + self.layer = nn.ModuleList([copy.deepcopy(layer) for _ in range(n_layers)]) def forward(self, hidden_states, attention_mask, output_all_encoded_layers=True): """ Args: - hidden_states (torch.Tensor): the input of the TrandformerEncoder + hidden_states (torch.Tensor): the input of the TransformerEncoder attention_mask (torch.Tensor): the attention mask for the input hidden_states output_all_encoded_layers (Bool): whether output all transformer layers' output Returns: - all_encoder_layers (list): if output_all_encoded_layers is True, return a list consists of all transformer layers' output, - otherwise return a list only consists of the output of last transformer layer. + all_encoder_layers (list): if output_all_encoded_layers is True, return a list consisting of all transformer + layers' outputs; otherwise return a list containing only the output of the last transformer layer.
""" all_encoder_layers = [] @@ -586,17 +595,19 @@ def get_embedding(self): self.token_field_offsets[type] = np.array((0, *np.cumsum(self.token_field_dims[type])[:-1]), dtype=np.long) - self.token_embedding_table[type] = FMEmbedding(self.token_field_dims[type], - self.token_field_offsets[type], - self.embedding_size).to(self.device) + self.token_embedding_table[type] = FMEmbedding( + self.token_field_dims[type], self.token_field_offsets[type], self.embedding_size + ).to(self.device) if len(self.float_field_dims[type]) > 0: - self.float_embedding_table[type] = nn.Embedding(np.sum(self.float_field_dims[type], dtype=np.int32), - self.embedding_size).to(self.device) + self.float_embedding_table[type] = nn.Embedding( + np.sum(self.float_field_dims[type], dtype=np.int32), self.embedding_size + ).to(self.device) if len(self.token_seq_field_dims) > 0: self.token_seq_embedding_table[type] = nn.ModuleList() for token_seq_field_dim in self.token_seq_field_dims[type]: self.token_seq_embedding_table[type].append( - nn.Embedding(token_seq_field_dim, self.embedding_size).to(self.device)) + nn.Embedding(token_seq_field_dim, self.embedding_size).to(self.device) + ) def embed_float_fields(self, float_fields, type, embed=True): """Get the embedding of float fields. 
@@ -626,7 +637,7 @@ def embed_float_fields(self, float_fields, type, embed=True): return float_embedding def embed_token_fields(self, token_fields, type): - """Get the embedding of toekn fields + """Get the embedding of token fields Args: token_fields(torch.Tensor): input, [batch_size, max_item_length, num_token_field] @@ -667,18 +678,18 @@ def embed_token_seq_fields(self, token_seq_fields, type): mask = mask.float() value_cnt = torch.sum(mask, dim=-1, keepdim=True) # [batch_size, max_item_length, 1] token_seq_embedding = embedding_table(token_seq_field) # [batch_size, max_item_length, seq_len, embed_dim] - mask = mask.unsqueeze(-1).expand_as( - token_seq_embedding) # [batch_size, max_item_length, seq_len, embed_dim] + mask = mask.unsqueeze(-1).expand_as(token_seq_embedding) if self.pooling_mode == 'max': - masked_token_seq_embedding = token_seq_embedding - ( - 1 - mask) * 1e9 # [batch_size, max_item_length, seq_len, embed_dim] - result = torch.max(masked_token_seq_embedding, dim=-2, - keepdim=True) # [batch_size, max_item_length, 1, embed_dim] + masked_token_seq_embedding = token_seq_embedding - (1 - mask) * 1e9 + result = torch.max( + masked_token_seq_embedding, dim=-2, keepdim=True + ) # [batch_size, max_item_length, 1, embed_dim] result = result.values elif self.pooling_mode == 'sum': masked_token_seq_embedding = token_seq_embedding * mask.float() - result = torch.sum(masked_token_seq_embedding, dim=-2, - keepdim=True) # [batch_size, max_item_length, 1, embed_dim] + result = torch.sum( + masked_token_seq_embedding, dim=-2, keepdim=True + ) # [batch_size, max_item_length, 1, embed_dim] else: masked_token_seq_embedding = token_seq_embedding * mask.float() result = torch.sum(masked_token_seq_embedding, dim=-2) # [batch_size, max_item_length, embed_dim] @@ -715,9 +726,7 @@ def embed_input_fields(self, user_idx, item_idx): float_fields = [] for field_name in self.float_field_names[type]: feature = user_item_feat[type][field_name][user_item_idx[type]] - 
float_fields.append(feature - if len(feature.shape) == (2 + (type == 'item')) - else feature.unsqueeze(-1)) + float_fields.append(feature if len(feature.shape) == (2 + (type == 'item')) else feature.unsqueeze(-1)) if len(float_fields) > 0: float_fields = torch.cat(float_fields, dim=1) # [batch_size, max_item_length, num_float_field] else: @@ -750,12 +759,15 @@ def embed_input_fields(self, user_idx, item_idx): if token_seq_fields_embedding[type] is None: sparse_embedding[type] = token_fields_embedding[type] else: - sparse_embedding[type] = torch.cat([token_fields_embedding[type], - token_seq_fields_embedding[type]], dim=-2) + sparse_embedding[type] = torch.cat([token_fields_embedding[type], token_seq_fields_embedding[type]], + dim=-2) dense_embedding[type] = float_fields_embedding[type] - # sparse_embedding[type] shape: [batch_size, max_item_length, num_token_seq_field+num_token_field, embed_dim] or None - # dense_embedding[type] shape: [batch_size, max_item_length, num_float_field] or [batch_size, max_item_length, num_float_field, embed_dim] or None + # sparse_embedding[type] + # shape: [batch_size, max_item_length, num_token_seq_field+num_token_field, embed_dim] or None + # dense_embedding[type] + # shape: [batch_size, max_item_length, num_float_field] + # or [batch_size, max_item_length, num_float_field, embed_dim] or None return sparse_embedding, dense_embedding def forward(self, user_idx, item_idx): @@ -773,8 +785,10 @@ def __init__(self, dataset, embedding_size, pooling_mode, device): self.user_feat = self.dataset.get_user_feature().to(self.device) self.item_feat = self.dataset.get_item_feature().to(self.device) - self.field_names = {'user': list(self.user_feat.interaction.keys()), - 'item': list(self.item_feat.interaction.keys())} + self.field_names = { + 'user': list(self.user_feat.interaction.keys()), + 'item': list(self.item_feat.interaction.keys()) + } self.types = ['user', 'item'] self.pooling_mode = pooling_mode @@ -857,7 +871,9 @@ def __init__(self, 
channels, kernels, strides, activation='relu', init_method=No cnn_modules = [] for i in range(self.num_of_nets): - cnn_modules.append(nn.Conv2d(self.channels[i], self.channels[i + 1], self.kernels[i], stride=self.strides[i])) + cnn_modules.append( + nn.Conv2d(self.channels[i], self.channels[i + 1], self.kernels[i], stride=self.strides[i]) + ) if self.activation.lower() == 'sigmoid': cnn_modules.append(nn.Sigmoid()) elif self.activation.lower() == 'tanh': @@ -1006,8 +1022,10 @@ def forward(self, interaction): total_fields_embedding = [] float_fields = [] for field_name in self.float_field_names: - float_fields.append(interaction[field_name] - if len(interaction[field_name].shape) == 2 else interaction[field_name].unsqueeze(1)) + if len(interaction[field_name].shape) == 2: + float_fields.append(interaction[field_name]) + else: + float_fields.append(interaction[field_name].unsqueeze(1)) if len(float_fields) > 0: float_fields = torch.cat(float_fields, dim=1) # [batch_size, num_float_field] @@ -1041,3 +1059,24 @@ def forward(self, interaction): total_fields_embedding.append(token_seq_fields_embedding) return torch.sum(torch.cat(total_fields_embedding, dim=1), dim=1) + self.bias # [batch_size, output_dim] + + +class SparseDropout(nn.Module): + """ + This is a Module that executes Dropout on a PyTorch sparse tensor.
+ """ + + def __init__(self, p=0.5): + super(SparseDropout, self).__init__() + # p is ratio of dropout + # convert to keep probability + self.kprob = 1 - p + + def forward(self, x): + if not self.training: + return x + + mask = ((torch.rand(x._values().size()) + self.kprob).floor()).type(torch.bool) + rc = x._indices()[:, mask] + val = x._values()[mask] * (1.0 / self.kprob) + return torch.sparse.FloatTensor(rc, val, x.shape) diff --git a/recbole/model/loss.py b/recbole/model/loss.py index ebfc84ca1..78c9cf8b2 100644 --- a/recbole/model/loss.py +++ b/recbole/model/loss.py @@ -7,20 +7,17 @@ # @Author : Shanlei Mu # @Email : slmu@ruc.edu.cn - """ recbole.model.loss ####################### Common Loss in recommender system """ - import torch import torch.nn as nn class BPRLoss(nn.Module): - """ BPRLoss, based on Bayesian Personalized Ranking Args: @@ -39,12 +36,13 @@ class BPRLoss(nn.Module): >>> output = loss(pos_score, neg_score) >>> output.backward() """ + def __init__(self, gamma=1e-10): super(BPRLoss, self).__init__() self.gamma = gamma def forward(self, pos_score, neg_score): - loss = - torch.log(self.gamma + torch.sigmoid(pos_score - neg_score)).mean() + loss = -torch.log(self.gamma + torch.sigmoid(pos_score - neg_score)).mean() return loss @@ -52,6 +50,7 @@ class RegLoss(nn.Module): """ RegLoss, L2 regularization on model parameters """ + def __init__(self): super(RegLoss, self).__init__() @@ -69,6 +68,7 @@ class EmbLoss(nn.Module): """ EmbLoss, regularization on embeddings """ + def __init__(self, norm=2): super(EmbLoss, self).__init__() self.norm = norm diff --git a/recbole/model/sequential_recommender/bert4rec.py b/recbole/model/sequential_recommender/bert4rec.py index 86d20dafb..1a923b7db 100644 --- a/recbole/model/sequential_recommender/bert4rec.py +++ b/recbole/model/sequential_recommender/bert4rec.py @@ -50,13 +50,18 @@ def __init__(self, config, dataset): self.mask_item_length = int(self.mask_ratio * self.max_seq_length) # define layers and loss - 
self.item_embedding = nn.Embedding(self.n_items+1, self.hidden_size, padding_idx=0) # mask token add 1 - self.position_embedding = nn.Embedding(self.max_seq_length+1, self.hidden_size) # add mask_token at the last - self.trm_encoder = TransformerEncoder(n_layers=self.n_layers, n_heads=self.n_heads, - hidden_size=self.hidden_size, inner_size=self.inner_size, - hidden_dropout_prob=self.hidden_dropout_prob, - attn_dropout_prob=self.attn_dropout_prob, - hidden_act=self.hidden_act, layer_norm_eps=self.layer_norm_eps) + self.item_embedding = nn.Embedding(self.n_items + 1, self.hidden_size, padding_idx=0) # mask token add 1 + self.position_embedding = nn.Embedding(self.max_seq_length + 1, self.hidden_size) # add mask_token at the last + self.trm_encoder = TransformerEncoder( + n_layers=self.n_layers, + n_heads=self.n_heads, + hidden_size=self.hidden_size, + inner_size=self.inner_size, + hidden_dropout_prob=self.hidden_dropout_prob, + attn_dropout_prob=self.attn_dropout_prob, + hidden_act=self.hidden_act, + layer_norm_eps=self.layer_norm_eps + ) self.LayerNorm = nn.LayerNorm(self.hidden_size, eps=self.layer_norm_eps) self.dropout = nn.Dropout(self.hidden_dropout_prob) @@ -99,8 +104,8 @@ def _neg_sample(self, item_set): def _padding_sequence(self, sequence, max_length): pad_len = max_length - len(sequence) - sequence = [0]*pad_len + sequence - sequence = sequence[-max_length:] # truncate according to the max_length + sequence = [0] * pad_len + sequence + sequence = sequence[-max_length:] # truncate according to the max_length return sequence def reconstruct_train_data(self, item_seq): @@ -169,9 +174,7 @@ def forward(self, item_seq): input_emb = self.LayerNorm(input_emb) input_emb = self.dropout(input_emb) extended_attention_mask = self.get_attention_mask(item_seq) - trm_output = self.trm_encoder(input_emb, - extended_attention_mask, - output_all_encoded_layers=True) + trm_output = self.trm_encoder(input_emb, extended_attention_mask, output_all_encoded_layers=True) output = 
trm_output[-1] return output # [B L H] @@ -193,7 +196,7 @@ def multi_hot_embed(self, masked_index, max_length): multi_hot_embed: [[0 1 0 0 0], [0 0 0 1 0]] """ masked_index = masked_index.view(-1) - multi_hot = torch.zeros(masked_index.size(0), max_length).cuda() + multi_hot = torch.zeros(masked_index.size(0), max_length, device=masked_index.device) multi_hot[torch.arange(masked_index.size(0)), masked_index] = 1 return multi_hot @@ -237,7 +240,7 @@ def predict(self, interaction): test_item = interaction[self.ITEM_ID] item_seq = self.reconstruct_test_data(item_seq, item_seq_len) seq_output = self.forward(item_seq) - seq_output = self.gather_indexes(seq_output, item_seq_len-1) # [B H] + seq_output = self.gather_indexes(seq_output, item_seq_len - 1) # [B H] test_item_emb = self.item_embedding(test_item) scores = torch.mul(seq_output, test_item_emb).sum(dim=1) # [B] return scores @@ -247,7 +250,7 @@ def full_sort_predict(self, interaction): item_seq_len = interaction[self.ITEM_SEQ_LEN] item_seq = self.reconstruct_test_data(item_seq, item_seq_len) seq_output = self.forward(item_seq) - seq_output = self.gather_indexes(seq_output, item_seq_len-1) # [B H] + seq_output = self.gather_indexes(seq_output, item_seq_len - 1) # [B H] test_items_emb = self.item_embedding.weight[:self.n_items] # delete masked token scores = torch.matmul(seq_output, test_items_emb.transpose(0, 1)) # [B, item_num] return scores diff --git a/recbole/model/sequential_recommender/caser.py b/recbole/model/sequential_recommender/caser.py index 0cb018ba4..229816deb 100644 --- a/recbole/model/sequential_recommender/caser.py +++ b/recbole/model/sequential_recommender/caser.py @@ -25,8 +25,8 @@ from torch.nn import functional as F from torch.nn.init import normal_, xavier_normal_, constant_ -from recbole.model.loss import RegLoss, BPRLoss from recbole.model.abstract_recommender import SequentialRecommender +from recbole.model.loss import RegLoss, BPRLoss class Caser(SequentialRecommender): @@ -36,6 +36,7 @@ 
class Caser(SequentialRecommender): We did not use the sliding window to generate training instances as in the paper, in order that the generation method we used is common to other sequential models. For comparison with other models, we set the parameter T in the paper as 1. + In addition, to prevent an excessive number of CNN layers (ValueError: Training loss is nan), please make sure the parameter MAX_ITEM_LIST_LENGTH is small, such as 10. """ def __init__(self, config, dataset): @@ -61,7 +62,9 @@ def __init__(self, config, dataset): # horizontal conv layer lengths = [i + 1 for i in range(self.max_seq_length)] - self.conv_h = nn.ModuleList([nn.Conv2d(in_channels=1, out_channels=self.n_h, kernel_size=(i, self.embedding_size)) for i in lengths]) + self.conv_h = nn.ModuleList([ + nn.Conv2d(in_channels=1, out_channels=self.n_h, kernel_size=(i, self.embedding_size)) for i in lengths + ]) # fully-connected layer self.fc1_dim_v = self.n_v * self.embedding_size @@ -95,7 +98,7 @@ def _init_weights(self, module): def forward(self, user, item_seq): # Embedding Look-up - # use unsqueeze() to get a 4-D input for convolution layers. (batchsize * 1 * max_length * embedding_size) + # use unsqueeze() to get a 4-D input for convolution layers.
(batch_size * 1 * max_length * embedding_size) item_seq_emb = self.item_embedding(item_seq).unsqueeze(1) user_emb = self.user_embedding(user).squeeze(1) @@ -154,8 +157,9 @@ def calculate_loss(self, interaction): logits = torch.matmul(seq_output, test_item_emb.transpose(0, 1)) loss = self.loss_fct(logits, pos_items) - reg_loss = self.reg_loss([self.user_embedding.weight, self.item_embedding.weight, - self.conv_v.weight,self.fc1.weight, self.fc2.weight]) + reg_loss = self.reg_loss([ + self.user_embedding.weight, self.item_embedding.weight, self.conv_v.weight, self.fc1.weight, self.fc2.weight + ]) loss = loss + self.reg_weight * reg_loss + self.reg_loss_conv_h() return loss diff --git a/recbole/model/sequential_recommender/din.py b/recbole/model/sequential_recommender/din.py index 161454684..dcb7ce395 100644 --- a/recbole/model/sequential_recommender/din.py +++ b/recbole/model/sequential_recommender/din.py @@ -24,9 +24,9 @@ import torch.nn as nn from torch.nn.init import xavier_normal_, constant_ -from recbole.utils import InputType -from recbole.model.layers import MLPLayers, SequenceAttLayer, ContextSeqEmbLayer from recbole.model.abstract_recommender import SequentialRecommender +from recbole.model.layers import MLPLayers, SequenceAttLayer, ContextSeqEmbLayer +from recbole.utils import InputType class DIN(SequentialRecommender): @@ -61,24 +61,14 @@ def __init__(self, config, dataset): # self.dnn_list = [(3 * self.num_feature_field['item'] + self.num_feature_field['user']) # * self.embedding_size] + self.mlp_hidden_size num_item_feature = len(self.item_feat.interaction.keys()) - self.dnn_list = [ - (3 * num_item_feature) * self.embedding_size - ] + self.mlp_hidden_size - self.att_list = [ - 4 * num_item_feature * self.embedding_size - ] + self.mlp_hidden_size - - mask_mat = torch.arange(self.max_seq_length).to(self.device).view( - 1, -1) # init mask - self.attention = SequenceAttLayer(mask_mat, - self.att_list, - activation='Sigmoid', - softmax_stag=False, - 
return_seq_weight=False) - self.dnn_mlp_layers = MLPLayers(self.dnn_list, - activation='Dice', - dropout=self.dropout_prob, - bn=True) + self.dnn_list = [3 * num_item_feature * self.embedding_size] + self.mlp_hidden_size + self.att_list = [4 * num_item_feature * self.embedding_size] + self.mlp_hidden_size + + mask_mat = torch.arange(self.max_seq_length).to(self.device).view(1, -1) # init mask + self.attention = SequenceAttLayer( + mask_mat, self.att_list, activation='Sigmoid', softmax_stag=False, return_seq_weight=False + ) + self.dnn_mlp_layers = MLPLayers(self.dnn_list, activation='Dice', dropout=self.dropout_prob, bn=True) self.embedding_layer = ContextSeqEmbLayer(dataset, self.embedding_size, self.pooling_mode, self.device) self.dnn_predict_layers = nn.Linear(self.mlp_hidden_size[-1], 1) @@ -128,8 +118,7 @@ def forward(self, user, item_seq, item_seq_len, next_items): user_emb = user_emb.squeeze() # input the DNN to get the prediction score - din_in = torch.cat([user_emb, target_item_feat_emb, - user_emb * target_item_feat_emb], dim=-1) + din_in = torch.cat([user_emb, target_item_feat_emb, user_emb * target_item_feat_emb], dim=-1) din_out = self.dnn_mlp_layers(din_in) preds = self.dnn_predict_layers(din_out) preds = self.sigmoid(preds) diff --git a/recbole/model/sequential_recommender/fdsa.py b/recbole/model/sequential_recommender/fdsa.py index 41bbae25d..0fba79689 100644 --- a/recbole/model/sequential_recommender/fdsa.py +++ b/recbole/model/sequential_recommender/fdsa.py @@ -17,14 +17,14 @@ from torch import nn from recbole.model.abstract_recommender import SequentialRecommender -from recbole.model.loss import BPRLoss from recbole.model.layers import TransformerEncoder, FeatureSeqEmbLayer, VanillaAttention +from recbole.model.loss import BPRLoss class FDSA(SequentialRecommender): r""" FDSA is similar with the GRU4RecF implemented in RecBole, which uses two different Transformer encoders to - encode items and features respectively and concatenates the two 
subparts's outputs as the final output. + encode items and features respectively and concatenates the two subparts' outputs as the final output. """ @@ -53,22 +53,33 @@ def __init__(self, config, dataset): self.item_embedding = nn.Embedding(self.n_items, self.hidden_size, padding_idx=0) self.position_embedding = nn.Embedding(self.max_seq_length, self.hidden_size) - self.feature_embed_layer = FeatureSeqEmbLayer(dataset, self.hidden_size, self.selected_features, - self.pooling_mode, self.device) - - self.item_trm_encoder = TransformerEncoder(n_layers=self.n_layers, n_heads=self.n_heads, - hidden_size=self.hidden_size, inner_size=self.inner_size, - hidden_dropout_prob=self.hidden_dropout_prob, - attn_dropout_prob=self.attn_dropout_prob, - hidden_act=self.hidden_act, layer_norm_eps=self.layer_norm_eps) + self.feature_embed_layer = FeatureSeqEmbLayer( + dataset, self.hidden_size, self.selected_features, self.pooling_mode, self.device + ) + + self.item_trm_encoder = TransformerEncoder( + n_layers=self.n_layers, + n_heads=self.n_heads, + hidden_size=self.hidden_size, + inner_size=self.inner_size, + hidden_dropout_prob=self.hidden_dropout_prob, + attn_dropout_prob=self.attn_dropout_prob, + hidden_act=self.hidden_act, + layer_norm_eps=self.layer_norm_eps + ) self.feature_att_layer = VanillaAttention(self.hidden_size, self.hidden_size) # For simplicity, we use same architecture for item_trm and feature_trm - self.feature_trm_encoder = TransformerEncoder(n_layers=self.n_layers, n_heads=self.n_heads, - hidden_size=self.hidden_size, inner_size=self.inner_size, - hidden_dropout_prob=self.hidden_dropout_prob, - attn_dropout_prob=self.attn_dropout_prob, - hidden_act=self.hidden_act, layer_norm_eps=self.layer_norm_eps) + self.feature_trm_encoder = TransformerEncoder( + n_layers=self.n_layers, + n_heads=self.n_heads, + hidden_size=self.hidden_size, + inner_size=self.inner_size, + hidden_dropout_prob=self.hidden_dropout_prob, + attn_dropout_prob=self.attn_dropout_prob, + 
hidden_act=self.hidden_act, + layer_norm_eps=self.layer_norm_eps + ) self.LayerNorm = nn.LayerNorm(self.hidden_size, eps=self.layer_norm_eps) self.dropout = nn.Dropout(self.hidden_dropout_prob) @@ -149,14 +160,12 @@ def forward(self, item_seq, item_seq_len): extended_attention_mask = self.get_attention_mask(item_seq) - item_trm_output = self.item_trm_encoder(item_trm_input, - extended_attention_mask, - output_all_encoded_layers=True) + item_trm_output = self.item_trm_encoder(item_trm_input, extended_attention_mask, output_all_encoded_layers=True) item_output = item_trm_output[-1] - feature_trm_output = self.feature_trm_encoder(feature_trm_input, - extended_attention_mask, - output_all_encoded_layers=True) # [B Len H] + feature_trm_output = self.feature_trm_encoder( + feature_trm_input, extended_attention_mask, output_all_encoded_layers=True + ) # [B Len H] feature_output = feature_trm_output[-1] item_output = self.gather_indexes(item_output, item_seq_len - 1) # [B H] diff --git a/recbole/model/sequential_recommender/fossil.py b/recbole/model/sequential_recommender/fossil.py new file mode 100644 index 000000000..0432174bf --- /dev/null +++ b/recbole/model/sequential_recommender/fossil.py @@ -0,0 +1,188 @@ +# -*- coding: utf-8 -*- +# @Time : 2020/11/21 20:00 +# @Author : Shao Weiqi +# @Reviewer : Lin Kun +# @Email : shaoweiqi@ruc.edu.cn + +r""" +FOSSIL +################################################ + +Reference: + Ruining He et al. "Fusing Similarity Models with Markov Chains for Sparse Sequential Recommendation." in ICDM 2016. 
+ + +""" + +import torch +import torch.nn as nn +from torch.nn.init import xavier_normal_ + +from recbole.model.abstract_recommender import SequentialRecommender +from recbole.model.loss import BPRLoss + + +class FOSSIL(SequentialRecommender): + r""" + FOSSIL uses similarity of the items as main purpose and uses high MC as a way of sequential preference improve of + ability of sequential recommendation + + """ + + def __init__(self, config, dataset): + super(FOSSIL, self).__init__(config, dataset) + + # load the dataset information + self.n_users = dataset.num(self.USER_ID) + self.device = config['device'] + + # load the parameters + self.embedding_size = config["embedding_size"] + self.order_len = config["order_len"] + assert self.order_len <= self.max_seq_length, "order_len can't longer than the max_seq_length" + self.reg_weight = config["reg_weight"] + self.alpha = config["alpha"] + + # define the layers and loss type + self.item_embedding = nn.Embedding(self.n_items, self.embedding_size, padding_idx=0) + self.user_lambda = nn.Embedding(self.n_users, self.order_len) + self.lambda_ = nn.Parameter(torch.zeros(self.order_len)).to(self.device) + + self.loss_type = config['loss_type'] + if self.loss_type == 'BPR': + self.loss_fct = BPRLoss() + elif self.loss_type == 'CE': + self.loss_fct = nn.CrossEntropyLoss() + else: + raise NotImplementedError("Make sure 'loss_type' in ['BPR', 'CE']!") + + # init the parameters of the model + self.apply(self.init_weights) + + def inverse_seq_item_embedding(self, seq_item_embedding, seq_item_len): + """ + inverse seq_item_embedding like this (simple to 2-dim): + + [1,2,3,0,0,0] -- ??? 
-- >> [0,0,0,1,2,3] + + first: [0,0,0,0,0,0] concat [1,2,3,0,0,0] + + using gather_indexes: to get one by one + + first get 3,then 2,last 1 + """ + zeros = torch.zeros_like(seq_item_embedding, dtype=torch.float).to(self.device) + # batch_size * seq_len * embedding_size + item_embedding_zeros = torch.cat([zeros, seq_item_embedding], dim=1) + # batch_size * 2_mul_seq_len * embedding_size + embedding_list = list() + for i in range(self.order_len): + embedding = self.gather_indexes( + item_embedding_zeros, self.max_seq_length + seq_item_len - self.order_len + i + ) + embedding_list.append(embedding.unsqueeze(1)) + short_item_embedding = torch.cat(embedding_list, dim=1) + # batch_size * short_len * embedding_size + + return short_item_embedding + + def reg_loss(self, user_embedding, item_embedding, seq_output): + + reg_1 = self.reg_weight + loss_1 = reg_1 * torch.norm(user_embedding, p=2) \ + + reg_1 * torch.norm(item_embedding, p=2) \ + + reg_1 * torch.norm(seq_output, p=2) + + return loss_1 + + def init_weights(self, module): + + if isinstance(module, nn.Embedding) or isinstance(module, nn.Linear): + xavier_normal_(module.weight.data) + + def forward(self, seq_item, seq_item_len, user): + + seq_item_embedding = self.item_embedding(seq_item) + + high_order_seq_item_embedding = self.inverse_seq_item_embedding(seq_item_embedding, seq_item_len) + # batch_size * order_len * embedding + + high_order = self.get_high_order_Markov(high_order_seq_item_embedding, user) + similarity = self.get_similarity(seq_item_embedding, seq_item_len) + + return high_order + similarity + + def get_high_order_Markov(self, high_order_item_embedding, user): + """ + + in order to get the inference of past items and the user's taste to the current predict item + """ + + user_lambda = self.user_lambda(user).unsqueeze(dim=2) + # batch_size * order_len * 1 + lambda_ = self.lambda_.unsqueeze(dim=0).unsqueeze(dim=2) + # 1 * order_len * 1 + lambda_ = torch.add(user_lambda, lambda_) + # batch_size * 
order_len * 1
+        high_order_item_embedding = torch.mul(high_order_item_embedding, lambda_)
+        # batch_size * order_len * embedding_size
+        high_order_item_embedding = high_order_item_embedding.sum(dim=1)
+        # batch_size * embedding_size
+
+        return high_order_item_embedding
+
+    def get_similarity(self, seq_item_embedding, seq_item_len):
+        """
+        in order to get the influence of the past items on the current predicted item
+        """
+        coeff = torch.pow(seq_item_len.unsqueeze(1), -self.alpha).float()
+        # batch_size * 1
+        similarity = torch.mul(coeff, seq_item_embedding.sum(dim=1))
+        # batch_size * embedding_size
+
+        return similarity
+
+    def calculate_loss(self, interaction):
+
+        seq_item = interaction[self.ITEM_SEQ]
+        user = interaction[self.USER_ID]
+        seq_item_len = interaction[self.ITEM_SEQ_LEN]
+        seq_output = self.forward(seq_item, seq_item_len, user)
+        pos_items = interaction[self.POS_ITEM_ID]
+        pos_items_emb = self.item_embedding(pos_items)
+
+        user_lambda = self.user_lambda(user)
+        if self.loss_type == 'BPR':
+            neg_items = interaction[self.NEG_ITEM_ID]
+            neg_items_emb = self.item_embedding(neg_items)
+            pos_score = torch.sum(seq_output * pos_items_emb, dim=-1)
+            neg_score = torch.sum(seq_output * neg_items_emb, dim=-1)
+            loss = self.loss_fct(pos_score, neg_score)
+            return loss + self.reg_loss(user_lambda, pos_items_emb, seq_output)
+        else:  # self.loss_type = 'CE'
+            test_item_emb = self.item_embedding.weight
+            logits = torch.matmul(seq_output, test_item_emb.transpose(0, 1))
+            loss = self.loss_fct(logits, pos_items)
+            return loss + self.reg_loss(user_lambda, pos_items_emb, seq_output)
+
+    def predict(self, interaction):
+
+        item_seq = interaction[self.ITEM_SEQ]
+        item_seq_len = interaction[self.ITEM_SEQ_LEN]
+        test_item = interaction[self.ITEM_ID]
+        user = interaction[self.USER_ID]
+        seq_output = self.forward(item_seq, item_seq_len, user)
+        test_item_emb = self.item_embedding(test_item)
+        scores = 
torch.mul(seq_output, test_item_emb).sum(dim=1) + return scores + + def full_sort_predict(self, interaction): + + item_seq = interaction[self.ITEM_SEQ] + user = interaction[self.USER_ID] + item_seq_len = interaction[self.ITEM_SEQ_LEN] + seq_output = self.forward(item_seq, item_seq_len, user) + test_items_emb = self.item_embedding.weight + scores = torch.matmul(seq_output, test_items_emb.transpose(0, 1)) + return scores diff --git a/recbole/model/sequential_recommender/fpmc.py b/recbole/model/sequential_recommender/fpmc.py index ce1ba3373..6654b59d1 100644 --- a/recbole/model/sequential_recommender/fpmc.py +++ b/recbole/model/sequential_recommender/fpmc.py @@ -20,9 +20,9 @@ from torch import nn from torch.nn.init import xavier_normal_ -from recbole.utils import InputType -from recbole.model.loss import BPRLoss from recbole.model.abstract_recommender import SequentialRecommender +from recbole.model.loss import BPRLoss +from recbole.utils import InputType class FPMC(SequentialRecommender): @@ -66,19 +66,18 @@ def _init_weights(self, module): xavier_normal_(module.weight.data) def forward(self, user, item_seq, item_seq_len, next_item): - item_last_click_index = item_seq_len - 1 item_last_click = torch.gather(item_seq, dim=1, index=item_last_click_index.unsqueeze(1)) - item_seq_emb = self.LI_emb(item_last_click) # [b,1,emb] + item_seq_emb = self.LI_emb(item_last_click) # [b,1,emb] user_emb = self.UI_emb(user) - user_emb = torch.unsqueeze(user_emb, dim=1) # [b,1,emb] + user_emb = torch.unsqueeze(user_emb, dim=1) # [b,1,emb] iu_emb = self.IU_emb(next_item) - iu_emb = torch.unsqueeze(iu_emb, dim=1) # [b,n,emb] in here n = 1 + iu_emb = torch.unsqueeze(iu_emb, dim=1) # [b,n,emb] in here n = 1 il_emb = self.IL_emb(next_item) - il_emb = torch.unsqueeze(il_emb, dim=1) # [b,n,emb] in here n = 1 + il_emb = torch.unsqueeze(il_emb, dim=1) # [b,n,emb] in here n = 1 # This is the core part of the FPMC model,can be expressed by a combination of a MF and a FMC model # MF @@ -119,13 
+118,13 @@ def full_sort_predict(self, interaction): user_emb = self.UI_emb(user) all_iu_emb = self.IU_emb.weight - mf = torch.matmul(user_emb, all_iu_emb.transpose(0,1)) + mf = torch.matmul(user_emb, all_iu_emb.transpose(0, 1)) all_il_emb = self.IL_emb.weight item_last_click_index = item_seq_len - 1 item_last_click = torch.gather(item_seq, dim=1, index=item_last_click_index.unsqueeze(1)) item_seq_emb = self.LI_emb(item_last_click) # [b,1,emb] - fmc = torch.matmul(item_seq_emb, all_il_emb.transpose(0,1)) + fmc = torch.matmul(item_seq_emb, all_il_emb.transpose(0, 1)) fmc = torch.squeeze(fmc, dim=1) score = mf + fmc return score diff --git a/recbole/model/sequential_recommender/gcsan.py b/recbole/model/sequential_recommender/gcsan.py index e527c7c3b..e64381473 100644 --- a/recbole/model/sequential_recommender/gcsan.py +++ b/recbole/model/sequential_recommender/gcsan.py @@ -3,7 +3,6 @@ # @Author : Yujie Lu # @Email : yujielu1998@gmail.com - r""" GCSAN ################################################ @@ -14,16 +13,16 @@ """ import math -import numpy as np +import numpy as np import torch from torch import nn from torch.nn import Parameter from torch.nn import functional as F -from recbole.model.loss import EmbLoss, BPRLoss from recbole.model.abstract_recommender import SequentialRecommender from recbole.model.layers import TransformerEncoder +from recbole.model.loss import EmbLoss, BPRLoss class GNN(nn.Module): @@ -68,20 +67,20 @@ def GNNCell(self, A, hidden): """ input_in = torch.matmul(A[:, :, :A.size(1)], self.linear_edge_in(hidden)) - input_out = torch.matmul(A[:, :, A.size(1): 2 * A.size(1)], self.linear_edge_out(hidden)) + input_out = torch.matmul(A[:, :, A.size(1):2 * A.size(1)], self.linear_edge_out(hidden)) # [batch_size, max_session_len, embedding_size * 2] inputs = torch.cat([input_in, input_out], 2) - # gi.size equals to gh.size, shape of [batch_size, max_session_len, embdding_size * 3] + # gi.size equals to gh.size, shape of [batch_size, max_session_len, 
embedding_size * 3] gi = F.linear(inputs, self.w_ih, self.b_ih) gh = F.linear(hidden, self.w_hh, self.b_hh) # (batch_size, max_session_len, embedding_size) i_r, i_i, i_n = gi.chunk(3, 2) h_r, h_i, h_n = gh.chunk(3, 2) - resetgate = torch.sigmoid(i_r + h_r) - inputgate = torch.sigmoid(i_i + h_i) - newgate = torch.tanh(i_n + resetgate * h_n) - hy = (1 - inputgate) * hidden + inputgate * newgate + reset_gate = torch.sigmoid(i_r + h_r) + input_gate = torch.sigmoid(i_i + h_i) + new_gate = torch.tanh(i_n + reset_gate * h_n) + hy = (1 - input_gate) * hidden + input_gate * new_gate return hy def forward(self, A, hidden): @@ -91,7 +90,7 @@ def forward(self, A, hidden): class GCSAN(SequentialRecommender): - r"""GCSAN captures rich local dependencies via graph nerual network, + r"""GCSAN captures rich local dependencies via graph neural network, and learns long-range dependencies by applying the self-attention mechanism. Note: @@ -124,11 +123,16 @@ def __init__(self, config, dataset): # define layers and loss self.item_embedding = nn.Embedding(self.n_items, self.hidden_size, padding_idx=0) self.gnn = GNN(self.hidden_size, self.step) - self.self_attention = TransformerEncoder(n_layers=self.n_layers, n_heads=self.n_heads, - hidden_size=self.hidden_size, inner_size=self.inner_size, - hidden_dropout_prob=self.hidden_dropout_prob, - attn_dropout_prob=self.attn_dropout_prob, - hidden_act=self.hidden_act, layer_norm_eps=self.layer_norm_eps) + self.self_attention = TransformerEncoder( + n_layers=self.n_layers, + n_heads=self.n_heads, + hidden_size=self.hidden_size, + inner_size=self.inner_size, + hidden_dropout_prob=self.hidden_dropout_prob, + attn_dropout_prob=self.attn_dropout_prob, + hidden_act=self.hidden_act, + layer_norm_eps=self.layer_norm_eps + ) self.reg_loss = EmbLoss() if self.loss_type == 'BPR': self.loss_fct = BPRLoss() diff --git a/recbole/model/sequential_recommender/gru4rec.py b/recbole/model/sequential_recommender/gru4rec.py index d59d50fcf..0ea93e331 100644 --- 
a/recbole/model/sequential_recommender/gru4rec.py +++ b/recbole/model/sequential_recommender/gru4rec.py @@ -17,13 +17,12 @@ """ - import torch from torch import nn from torch.nn.init import xavier_uniform_, xavier_normal_ -from recbole.model.loss import BPRLoss from recbole.model.abstract_recommender import SequentialRecommender +from recbole.model.loss import BPRLoss class GRU4Rec(SequentialRecommender): @@ -70,9 +69,9 @@ def __init__(self, config, dataset): def _init_weights(self, module): if isinstance(module, nn.Embedding): xavier_normal_(module.weight) - elif isinstance(module,nn.GRU): - xavier_uniform_(self.gru_layers.weight_hh_l0) - xavier_uniform_(self.gru_layers.weight_ih_l0) + elif isinstance(module, nn.GRU): + xavier_uniform_(module.weight_hh_l0) + xavier_uniform_(module.weight_ih_l0) def forward(self, item_seq, item_seq_len): item_seq_emb = self.item_embedding(item_seq) @@ -92,8 +91,8 @@ def calculate_loss(self, interaction): neg_items = interaction[self.NEG_ITEM_ID] pos_items_emb = self.item_embedding(pos_items) neg_items_emb = self.item_embedding(neg_items) - pos_score = torch.sum(seq_output * pos_items_emb, dim=-1) # [B] - neg_score = torch.sum(seq_output * neg_items_emb, dim=-1) # [B] + pos_score = torch.sum(seq_output * pos_items_emb, dim=-1) # [B] + neg_score = torch.sum(seq_output * neg_items_emb, dim=-1) # [B] loss = self.loss_fct(pos_score, neg_score) return loss else: # self.loss_type = 'CE' diff --git a/recbole/model/sequential_recommender/gru4recf.py b/recbole/model/sequential_recommender/gru4recf.py index 3e4abd855..55f8d2360 100644 --- a/recbole/model/sequential_recommender/gru4recf.py +++ b/recbole/model/sequential_recommender/gru4recf.py @@ -17,9 +17,9 @@ from torch import nn from recbole.model.abstract_recommender import SequentialRecommender -from recbole.model.loss import BPRLoss -from recbole.model.layers import FeatureSeqEmbLayer from recbole.model.init import xavier_normal_initialization +from recbole.model.layers import 
FeatureSeqEmbLayer +from recbole.model.loss import BPRLoss class GRU4RecF(SequentialRecommender): @@ -34,7 +34,7 @@ class GRU4RecF(SequentialRecommender): (3) Weighted sum of outputs from two different RNNs. We implemented the optimal parallel version(2), which uses different RNNs to - encode items and features respectively and concatenates the two subparts's + encode items and features respectively and concatenates the two subparts' outputs as the final output. The different RNN encoders are trained simultaneously. """ @@ -56,8 +56,9 @@ def __init__(self, config, dataset): # define layers and loss self.item_embedding = nn.Embedding(self.n_items, self.embedding_size, padding_idx=0) - self.feature_embed_layer = FeatureSeqEmbLayer(dataset, self.embedding_size, self.selected_features, - self.pooling_mode, self.device) + self.feature_embed_layer = FeatureSeqEmbLayer( + dataset, self.embedding_size, self.selected_features, self.pooling_mode, self.device + ) self.item_gru_layers = nn.GRU( input_size=self.embedding_size, hidden_size=self.hidden_size, @@ -106,12 +107,12 @@ def forward(self, item_seq, item_seq_len): feat_num, embedding_size = table_shape[-2], table_shape[-1] feature_emb = feature_table.view(table_shape[:-2] + (feat_num * embedding_size,)) - feature_gru_output, _ = self.feature_gru_layers(feature_emb) # [B Len H] + feature_gru_output, _ = self.feature_gru_layers(feature_emb) # [B Len H] output_concat = torch.cat((item_gru_output, feature_gru_output), -1) # [B Len 2*H] output = self.dense_layer(output_concat) output = self.gather_indexes(output, item_seq_len - 1) # [B H] - return output # [B H] + return output # [B H] def calculate_loss(self, interaction): item_seq = interaction[self.ITEM_SEQ] @@ -120,10 +121,10 @@ def calculate_loss(self, interaction): pos_items = interaction[self.POS_ITEM_ID] if self.loss_type == 'BPR': neg_items = interaction[self.NEG_ITEM_ID] - pos_items_emb = self.item_embedding(pos_items) # [B H] - neg_items_emb = 
self.item_embedding(neg_items) # [B H] - pos_score = torch.sum(seq_output*pos_items_emb, dim=-1) # [B] - neg_score = torch.sum(seq_output*neg_items_emb, dim=-1) # [B] + pos_items_emb = self.item_embedding(pos_items) # [B H] + neg_items_emb = self.item_embedding(neg_items) # [B H] + pos_score = torch.sum(seq_output * pos_items_emb, dim=-1) # [B] + neg_score = torch.sum(seq_output * neg_items_emb, dim=-1) # [B] loss = self.loss_fct(pos_score, neg_score) return loss else: # self.loss_type = 'CE' diff --git a/recbole/model/sequential_recommender/gru4reckg.py b/recbole/model/sequential_recommender/gru4reckg.py index a275e7d19..e8e68d5d8 100644 --- a/recbole/model/sequential_recommender/gru4reckg.py +++ b/recbole/model/sequential_recommender/gru4reckg.py @@ -7,7 +7,6 @@ # @Author : Yupeng Hou # @Email : houyupeng@ruc.edu.cn - r""" GRU4RecKG ################################################ diff --git a/recbole/model/sequential_recommender/hgn.py b/recbole/model/sequential_recommender/hgn.py new file mode 100644 index 000000000..dc5f6ec49 --- /dev/null +++ b/recbole/model/sequential_recommender/hgn.py @@ -0,0 +1,205 @@ +# -*- coding: utf-8 -*- +# @Time : 2020/11/21 16:36 +# @Author : Shao Weiqi +# @Reviewer : Lin Kun +# @Email : shaoweiqi@ruc.edu.cn + +r""" +HGN +################################################ + +Reference: + Chen Ma et al. 
"Hierarchical Gating Networks for Sequential Recommendation." in SIGKDD 2019
+
+
+"""
+
+import torch
+import torch.nn as nn
+from torch.nn.init import xavier_uniform_, constant_, normal_
+
+from recbole.model.abstract_recommender import SequentialRecommender
+from recbole.model.loss import BPRLoss
+
+
+class HGN(SequentialRecommender):
+    r"""
+    HGN applies feature gating and instance gating to select the important features and items for
+    predicting the next item.
+
+    """
+
+    def __init__(self, config, dataset):
+        super(HGN, self).__init__(config, dataset)
+
+        # load the dataset information
+        self.n_user = dataset.num(self.USER_ID)
+        self.device = config["device"]
+
+        # load the parameter information
+        self.embedding_size = config["embedding_size"]
+        self.reg_weight = config["reg_weight"]
+        self.pool_type = config["pooling_type"]
+
+        if self.pool_type not in ["max", "average"]:
+            raise NotImplementedError("Make sure 'pooling_type' in ['max', 'average']!")
+
+        # define the layers and loss function
+        self.item_embedding = nn.Embedding(self.n_items, self.embedding_size, padding_idx=0)
+        self.user_embedding = nn.Embedding(self.n_user, self.embedding_size)
+
+        # define the module feature gating need
+        self.w1 = nn.Linear(self.embedding_size, self.embedding_size)
+        self.w2 = nn.Linear(self.embedding_size, self.embedding_size)
+        # create the tensor on the target device before wrapping it in nn.Parameter;
+        # nn.Parameter(...).to(device) would return a plain tensor that is no longer
+        # registered as a parameter of the module
+        self.b = nn.Parameter(torch.zeros(self.embedding_size, device=self.device), requires_grad=True)
+
+        # define the module instance gating need
+        self.w3 = nn.Linear(self.embedding_size, 1, bias=False)
+        self.w4 = nn.Linear(self.embedding_size, self.max_seq_length, bias=False)
+
+        # define item_embedding for prediction
+        self.item_embedding_for_prediction = nn.Embedding(self.n_items, self.embedding_size)
+
+        self.sigmoid = nn.Sigmoid()
+
+        self.loss_type = config['loss_type']
+        if self.loss_type == 'BPR':
+            self.loss_fct = BPRLoss()
+        elif self.loss_type == 'CE':
+            self.loss_fct = nn.CrossEntropyLoss()
+        else:
+            raise NotImplementedError("Make sure 'loss_type' in 
['BPR', 'CE']!")
+
+        # init the parameters of the model
+        self.apply(self._init_weights)
+
+    def reg_loss(self, user_embedding, item_embedding, seq_item_embedding):
+
+        reg_1, reg_2 = self.reg_weight
+        loss_1_part_1 = reg_1 * torch.norm(self.w1.weight, p=2)
+        loss_1_part_2 = reg_1 * torch.norm(self.w2.weight, p=2)
+        loss_1_part_3 = reg_1 * torch.norm(self.w3.weight, p=2)
+        loss_1_part_4 = reg_1 * torch.norm(self.w4.weight, p=2)
+        loss_1 = loss_1_part_1 + loss_1_part_2 + loss_1_part_3 + loss_1_part_4
+
+        loss_2_part_1 = reg_2 * torch.norm(user_embedding, p=2)
+        loss_2_part_2 = reg_2 * torch.norm(item_embedding, p=2)
+        loss_2_part_3 = reg_2 * torch.norm(seq_item_embedding, p=2)
+        loss_2 = loss_2_part_1 + loss_2_part_2 + loss_2_part_3
+
+        return loss_1 + loss_2
+
+    def _init_weights(self, module):
+        if isinstance(module, nn.Embedding):
+            normal_(module.weight.data, 0., 1 / self.embedding_size)
+        elif isinstance(module, nn.Linear):
+            xavier_uniform_(module.weight.data)
+            if module.bias is not None:
+                constant_(module.bias.data, 0)
+
+    def feature_gating(self, seq_item_embedding, user_embedding):
+        """
+        choose the features that will be sent to the next stage (the more important a feature is, the more focus it gets)
+        """
+
+        batch_size, seq_len, embedding_size = seq_item_embedding.size()
+        seq_item_embedding_value = seq_item_embedding
+
+        seq_item_embedding = self.w1(seq_item_embedding)
+        # batch_size * seq_len * embedding_size
+        user_embedding = self.w2(user_embedding)
+        # batch_size * embedding_size
+        user_embedding = user_embedding.unsqueeze(1).repeat(1, seq_len, 1)
+        # batch_size * seq_len * embedding_size
+
+        user_item = self.sigmoid(seq_item_embedding + user_embedding + self.b)
+        # batch_size * seq_len * embedding_size
+
+        user_item = torch.mul(seq_item_embedding_value, user_item)
+        # batch_size * seq_len * embedding_size
+
+        return user_item
+
+    def instance_gating(self, user_item, user_embedding):
+        """
+        choose the clicked items that will influence the prediction (the more important an item is, the more attention it gets)
+        """
+
+        user_embedding_value = user_item
+
+        user_item = self.w3(user_item)
+        # batch_size * seq_len * 1
+
+        user_embedding = self.w4(user_embedding).unsqueeze(2)
+        # batch_size * seq_len * 1
+
+        instance_score = self.sigmoid(user_item + user_embedding).squeeze(-1)
+        # batch_size * seq_len
+        output = torch.mul(instance_score.unsqueeze(2), user_embedding_value)
+        # batch_size * seq_len * embedding_size
+
+        if self.pool_type == "average":
+            output = torch.div(output.sum(dim=1), instance_score.sum(dim=1).unsqueeze(1))
+            # batch_size * embedding_size
+        else:
+            # for max_pooling
+            index = torch.max(instance_score, dim=1)[1]
+            # batch_size * 1
+            output = self.gather_indexes(output, index)
+            # batch_size * seq_len * embedding_size ==>> batch_size * embedding_size
+
+        return output
+
+    def forward(self, seq_item, user):
+
+        seq_item_embedding = self.item_embedding(seq_item)
+        user_embedding = self.user_embedding(user)
+        feature_gating = self.feature_gating(seq_item_embedding, user_embedding)
+        instance_gating = self.instance_gating(feature_gating, user_embedding)
+        # batch_size * embedding_size
+        item_item = torch.sum(seq_item_embedding, dim=1)
+        # batch_size * embedding_size
+
+        return user_embedding + instance_gating + item_item
+
+    def calculate_loss(self, interaction):
+
+        seq_item = interaction[self.ITEM_SEQ]
+        seq_item_embedding = self.item_embedding(seq_item)
+        user = interaction[self.USER_ID]
+        user_embedding = self.user_embedding(user)
+        seq_output = self.forward(seq_item, user)
+        pos_items = interaction[self.POS_ITEM_ID]
+        pos_items_emb = self.item_embedding_for_prediction(pos_items)
+        if self.loss_type == 'BPR':
+            neg_items = interaction[self.NEG_ITEM_ID]
+            # score negatives with the same embedding table used for prediction
+            neg_items_emb = self.item_embedding_for_prediction(neg_items)
+            pos_score = torch.sum(seq_output * pos_items_emb, dim=-1)
+            neg_score = torch.sum(seq_output * neg_items_emb, dim=-1)
+            loss = self.loss_fct(pos_score, neg_score)
+            return loss + self.reg_loss(user_embedding, 
pos_items_emb, seq_item_embedding) + else: # self.loss_type = 'CE' + test_item_emb = self.item_embedding_for_prediction.weight + logits = torch.matmul(seq_output, test_item_emb.transpose(0, 1)) + loss = self.loss_fct(logits, pos_items) + return loss + self.reg_loss(user_embedding, pos_items_emb, seq_item_embedding) + + def predict(self, interaction): + + item_seq = interaction[self.ITEM_SEQ] + test_item = interaction[self.ITEM_ID] + user = interaction[self.USER_ID] + seq_output = self.forward(item_seq, user) + test_item_emb = self.item_embedding_for_prediction(test_item) + scores = torch.mul(seq_output, test_item_emb).sum(dim=1) + return scores + + def full_sort_predict(self, interaction): + + item_seq = interaction[self.ITEM_SEQ] + user = interaction[self.USER_ID] + seq_output = self.forward(item_seq, user) + test_items_emb = self.item_embedding_for_prediction.weight + scores = torch.matmul(seq_output, test_items_emb.transpose(0, 1)) + return scores diff --git a/recbole/model/sequential_recommender/hrm.py b/recbole/model/sequential_recommender/hrm.py new file mode 100644 index 000000000..421835266 --- /dev/null +++ b/recbole/model/sequential_recommender/hrm.py @@ -0,0 +1,174 @@ +# -*- coding: utf-8 -*- +# @Time : 2020/11/22 12:08 +# @Author : Shao Weiqi +# @Reviewer : Lin Kun +# @Email : shaoweiqi@ruc.edu.cn + +r""" +HRM +################################################ + +Reference: + Pengfei Wang et al. "Learning Hierarchical Representation Model for Next Basket Recommendation." in SIGIR 2015. + +Reference code: + https://github.com/wubinzzu/NeuRec + +""" + +import torch +import torch.nn as nn +from torch.nn.init import xavier_normal_ + +from recbole.model.abstract_recommender import SequentialRecommender +from recbole.model.loss import BPRLoss + + +class HRM(SequentialRecommender): + r""" + HRM can well capture both sequential behavior and users’ general taste by involving transaction and + user representations in prediction. 
+
+    HRM uses max- and average-pooling as aggregation operations.
+    """
+
+    def __init__(self, config, dataset):
+        super(HRM, self).__init__(config, dataset)
+
+        # load the dataset information
+        self.n_user = dataset.num(self.USER_ID)
+        self.device = config["device"]
+
+        # load the parameters information
+        self.embedding_size = config["embedding_size"]
+        self.pooling_type_layer_1 = config["pooling_type_layer_1"]
+        self.pooling_type_layer_2 = config["pooling_type_layer_2"]
+        self.high_order = config["high_order"]
+        assert self.high_order <= self.max_seq_length, "high_order can't be longer than the max_seq_length"
+        self.reg_weight = config["reg_weight"]
+        self.dropout_prob = config["dropout_prob"]
+
+        # define the layers and loss type
+        self.item_embedding = nn.Embedding(self.n_items, self.embedding_size, padding_idx=0)
+        self.user_embedding = nn.Embedding(self.n_user, self.embedding_size)
+        self.dropout = nn.Dropout(self.dropout_prob)
+
+        self.loss_type = config['loss_type']
+        if self.loss_type == 'BPR':
+            self.loss_fct = BPRLoss()
+        elif self.loss_type == 'CE':
+            self.loss_fct = nn.CrossEntropyLoss()
+        else:
+            raise NotImplementedError("Make sure 'loss_type' in ['BPR', 'CE']!")
+
+        # init the parameters of the model
+        self.apply(self._init_weights)
+
+    def inverse_seq_item(self, seq_item, seq_item_len):
+        """
+        inverse the seq_item, like this
+        [1,2,3,0,0,0,0] -- after inverse -->> [0,0,0,0,1,2,3]
+        """
+        seq_item = seq_item.cpu().numpy()
+        seq_item_len = seq_item_len.cpu().numpy()
+        new_seq_item = []
+        for items, length in zip(seq_item, seq_item_len):
+            item = list(items[:length])
+            zeros = list(items[length:])
+            seqs = zeros + item
+            new_seq_item.append(seqs)
+        seq_item = torch.tensor(new_seq_item, dtype=torch.long, device=self.device)
+
+        return seq_item
+
+    def _init_weights(self, module):
+
+        if isinstance(module, nn.Embedding):
+            xavier_normal_(module.weight.data)
+
+    def forward(self, seq_item, user, seq_item_len):
+
+        
seq_item = self.inverse_seq_item(seq_item, seq_item_len)
+
+        seq_item_embedding = self.item_embedding(seq_item)
+        # batch_size * seq_len * embedding_size
+
+        high_order_item_embedding = seq_item_embedding[:, -self.high_order:, :]
+        # batch_size * high_order * embedding_size
+
+        user_embedding = self.dropout(self.user_embedding(user))
+        # batch_size * embedding_size
+
+        # layer 1
+        if self.pooling_type_layer_1 == "max":
+            high_order_item_embedding = torch.max(high_order_item_embedding, dim=1).values
+            # batch_size * embedding_size
+        else:
+            # cap each sequence length at high_order
+            for idx, length in enumerate(seq_item_len):
+                if length > self.high_order:
+                    seq_item_len[idx] = self.high_order
+            high_order_item_embedding = torch.sum(seq_item_embedding, dim=1)
+            high_order_item_embedding = torch.div(high_order_item_embedding, seq_item_len.unsqueeze(1).float())
+            # batch_size * embedding_size
+        hybrid_user_embedding = self.dropout(
+            torch.cat([user_embedding.unsqueeze(dim=1),
+                       high_order_item_embedding.unsqueeze(dim=1)], dim=1)
+        )
+        # batch_size * 2 * embedding_size
+
+        # layer 2
+        if self.pooling_type_layer_2 == "max":
+            hybrid_user_embedding = torch.max(hybrid_user_embedding, dim=1).values
+            # batch_size * embedding_size
+        else:
+            hybrid_user_embedding = torch.mean(hybrid_user_embedding, dim=1)
+            # batch_size * embedding_size
+
+        return hybrid_user_embedding
+
+    def calculate_loss(self, interaction):
+
+        seq_item = interaction[self.ITEM_SEQ]
+        seq_item_len = interaction[self.ITEM_SEQ_LEN]
+        user = interaction[self.USER_ID]
+        seq_output = self.forward(seq_item, user, seq_item_len)
+        pos_items = interaction[self.POS_ITEM_ID]
+        pos_items_emb = self.item_embedding(pos_items)
+        if self.loss_type == 'BPR':
+            neg_items = interaction[self.NEG_ITEM_ID]
+            neg_items_emb = self.item_embedding(neg_items)
+            pos_score = torch.sum(seq_output * pos_items_emb, dim=-1)
+            neg_score = torch.sum(seq_output * neg_items_emb, dim=-1)
+            loss = self.loss_fct(pos_score, neg_score)
+            return loss
+        else:  # self.loss_type = 'CE'
+            
test_item_emb = self.item_embedding.weight.t()
+            logits = torch.matmul(seq_output, test_item_emb)
+            loss = self.loss_fct(logits, pos_items)
+
+            return loss
+
+    def predict(self, interaction):
+
+        item_seq = interaction[self.ITEM_SEQ]
+        seq_item_len = interaction[self.ITEM_SEQ_LEN]
+        test_item = interaction[self.ITEM_ID]
+        user = interaction[self.USER_ID]
+        seq_output = self.forward(item_seq, user, seq_item_len)
+        # seq_output is already [batch_size, embedding_size]; no repeat is needed
+        # (repeating it would break the element-wise product below)
+        test_item_emb = self.item_embedding(test_item)
+        scores = torch.mul(seq_output, test_item_emb).sum(dim=1)
+
+        return scores
+
+    def full_sort_predict(self, interaction):
+
+        item_seq = interaction[self.ITEM_SEQ]
+        seq_item_len = interaction[self.ITEM_SEQ_LEN]
+        user = interaction[self.USER_ID]
+        seq_output = self.forward(item_seq, user, seq_item_len)
+        test_items_emb = self.item_embedding.weight
+        scores = torch.matmul(seq_output, test_items_emb.transpose(0, 1))
+
+        return scores
diff --git a/recbole/model/sequential_recommender/ksr.py b/recbole/model/sequential_recommender/ksr.py
index c788b9730..bf48c6719 100644
--- a/recbole/model/sequential_recommender/ksr.py
+++ b/recbole/model/sequential_recommender/ksr.py
@@ -3,7 +3,6 @@
 # @Author : Jin Huang and Shanlei Mu
 # @Email : Betsyj.huang@gmail.com and slmu@ruc.edu.cn
 
-
 r"""
 KSR
 ################################################
@@ -14,14 +13,12 @@
 """
 
-
 import torch
 from torch import nn
 from torch.nn.init import xavier_uniform_, xavier_normal_
 
-from recbole.utils import InputType
-from recbole.model.loss import BPRLoss
 from recbole.model.abstract_recommender import SequentialRecommender
+from recbole.model.loss import BPRLoss
 
 
 class KSR(SequentialRecommender):
@@ -48,13 +45,13 @@ def __init__(self, config, dataset):
         self.loss_type = config['loss_type']
         self.num_layers = config['num_layers']
         self.dropout_prob = config['dropout_prob']
-        self.gamma = config['gamma'] # Scaling factor
+        self.gamma = config['gamma']  # Scaling factor
         self.device = 
config['device']
         self.loss_type = config['loss_type']
         self.freeze_kg = config['freeze_kg']
 
         # define layers and loss
-        self.item_embedding = nn.Embedding(self.n_items, self.embedding_size, padding_idx=0)
+        self.item_embedding = nn.Embedding(self.n_items, self.embedding_size, padding_idx=0)
         self.entity_embedding = nn.Embedding(self.n_items, self.embedding_size, padding_idx=0)
         self.entity_embedding.weight.requires_grad = not self.freeze_kg
@@ -75,31 +72,35 @@ def __init__(self, config, dataset):
             self.loss_fct = nn.CrossEntropyLoss()
         else:
             raise NotImplementedError("Make sure 'loss_type' in ['BPR', 'CE']!")
-
+
         # parameters initialization
         self.apply(self._init_weights)
         self.entity_embedding.weight.data.copy_(torch.from_numpy(self.entity_embedding_matrix[:self.n_items]))
-        self.relation_Matrix = torch.from_numpy(self.relation_embedding_matrix[:self.n_relations]).to(self.device) # [R H]
+        self.relation_Matrix = torch.from_numpy(self.relation_embedding_matrix[:self.n_relations]
+                                                ).to(self.device)  # [R H]
 
     def _init_weights(self, module):
         """ Initialize the weights """
         if isinstance(module, nn.Embedding):
             xavier_normal_(module.weight)
-        elif isinstance(module,nn.GRU):
-            xavier_uniform_(self.gru_layers.weight_hh_l0)
-            xavier_uniform_(self.gru_layers.weight_ih_l0)
-
+        elif isinstance(module, nn.GRU):
+            xavier_uniform_(module.weight_hh_l0)
+            xavier_uniform_(module.weight_ih_l0)
+
     def _get_kg_embedding(self, head):
-        """Difference: We generate the embeddings of the tail entities on every relations only for head due to the 1-N problems. """
-        head_e = self.entity_embedding(head) # [B H]
-        relation_Matrix = self.relation_Matrix.repeat(head_e.size()[0], 1, 1) # [B R H]
-        head_Matrix = torch.unsqueeze(head_e, 1).repeat(1, self.n_relations, 1) # [B R H]
+        """Difference:
+        We generate the embeddings of the tail entities on every relation only for the head entity due to the 1-N problem.
+ """ + head_e = self.entity_embedding(head) # [B H] + relation_Matrix = self.relation_Matrix.repeat(head_e.size()[0], 1, 1) # [B R H] + head_Matrix = torch.unsqueeze(head_e, 1).repeat(1, self.n_relations, 1) # [B R H] tail_Matrix = head_Matrix + relation_Matrix - + return head_e, tail_Matrix - + def _memory_update_cell(self, user_memory, update_memory): - z = torch.sigmoid(torch.mul(user_memory, update_memory).sum(-1).float()).unsqueeze(-1) # [B R 1], the gate vector + z = torch.sigmoid(torch.mul(user_memory, + update_memory).sum(-1).float()).unsqueeze(-1) # [B R 1], the gate vector updated_user_memory = (1.0 - z) * user_memory + z * update_memory return updated_user_memory @@ -108,19 +109,20 @@ def memory_update(self, item_seq, item_seq_len): step_length = item_seq.size()[1] last_item = item_seq_len - 1 # init user memory with 0s - user_memory = torch.zeros(item_seq.size()[0], self.n_relations, self.embedding_size).float().to(self.device) # [B R H] + user_memory = torch.zeros(item_seq.size()[0], self.n_relations, + self.embedding_size).float().to(self.device) # [B R H] last_user_memory = torch.zeros_like(user_memory) - for i in range(step_length): # [len] - _, update_memory = self._get_kg_embedding(item_seq[:, i]) # [B R H] - user_memory = self._memory_update_cell(user_memory, update_memory) # [B R H] + for i in range(step_length): # [len] + _, update_memory = self._get_kg_embedding(item_seq[:, i]) # [B R H] + user_memory = self._memory_update_cell(user_memory, update_memory) # [B R H] last_user_memory[last_item == i] = user_memory[last_item == i].float() return last_user_memory - + def memory_read(self, user_memory): """ define read operator """ attrs = self.relation_Matrix - attentions = nn.functional.softmax(self.gamma * torch.mul(user_memory, attrs).sum(-1).float(), -1) # [B R] - u_m = torch.mul(user_memory, attentions.unsqueeze(-1)).sum(1) # [B H] + attentions = nn.functional.softmax(self.gamma * torch.mul(user_memory, attrs).sum(-1).float(), -1) # [B R] + 
u_m = torch.mul(user_memory, attentions.unsqueeze(-1)).sum(1) # [B H] return u_m def forward(self, item_seq, item_seq_len): @@ -141,7 +143,7 @@ def forward(self, item_seq, item_seq_len): # combine them together p_u = self.dense_layer_u(torch.cat((seq_output, u_m), -1)) # [B H] return p_u - + def _get_item_comb_embedding(self, item): h_e, _ = self._get_kg_embedding(item) i_e = self.item_embedding(item) @@ -157,13 +159,15 @@ def calculate_loss(self, interaction): neg_items = interaction[self.NEG_ITEM_ID] pos_items_emb = self._get_item_comb_embedding(pos_items) neg_items_emb = self._get_item_comb_embedding(neg_items) - pos_score = torch.sum(seq_output * pos_items_emb, dim=-1) # [B] - neg_score = torch.sum(seq_output * neg_items_emb, dim=-1) # [B] + pos_score = torch.sum(seq_output * pos_items_emb, dim=-1) # [B] + neg_score = torch.sum(seq_output * neg_items_emb, dim=-1) # [B] loss = self.loss_fct(pos_score, neg_score) return loss - else: # self.loss_type = 'CE' - test_items_emb = self.dense_layer_i(torch.cat((self.item_embedding.weight, self.entity_embedding.weight), -1)) # [n_items H] - logits = torch.matmul(seq_output, test_items_emb.transpose(0, 1)) + else: # self.loss_type = 'CE' + test_items_emb = self.dense_layer_i( + torch.cat((self.item_embedding.weight, self.entity_embedding.weight), -1) + ) # [n_items H] + logits = torch.matmul(seq_output, test_items_emb.transpose(0, 1)) loss = self.loss_fct(logits, pos_items) return loss @@ -180,6 +184,8 @@ def full_sort_predict(self, interaction): item_seq = interaction[self.ITEM_SEQ] item_seq_len = interaction[self.ITEM_SEQ_LEN] seq_output = self.forward(item_seq, item_seq_len) - test_items_emb = self.dense_layer_i(torch.cat((self.item_embedding.weight, self.entity_embedding.weight), -1)) # [n_items H] + test_items_emb = self.dense_layer_i( + torch.cat((self.item_embedding.weight, self.entity_embedding.weight), -1) + ) # [n_items H] scores = torch.matmul(seq_output, test_items_emb.transpose(0, 1)) # [B, n_items] return 
scores diff --git a/recbole/model/sequential_recommender/narm.py b/recbole/model/sequential_recommender/narm.py index 562f74a12..76f5594b7 100644 --- a/recbole/model/sequential_recommender/narm.py +++ b/recbole/model/sequential_recommender/narm.py @@ -24,8 +24,8 @@ from torch import nn from torch.nn.init import xavier_normal_, constant_ -from recbole.model.loss import BPRLoss from recbole.model.abstract_recommender import SequentialRecommender +from recbole.model.loss import BPRLoss class NARM(SequentialRecommender): @@ -52,7 +52,7 @@ def __init__(self, config, dataset): self.a_2 = nn.Linear(self.hidden_size, self.hidden_size, bias=False) self.v_t = nn.Linear(self.hidden_size, 1, bias=False) self.ct_dropout = nn.Dropout(self.dropout_probs[1]) - self.b = nn.Linear(2*self.hidden_size, self.embedding_size, bias=False) + self.b = nn.Linear(2 * self.hidden_size, self.embedding_size, bias=False) self.loss_type = config['loss_type'] if self.loss_type == 'BPR': self.loss_fct = BPRLoss() diff --git a/recbole/model/sequential_recommender/nextitnet.py b/recbole/model/sequential_recommender/nextitnet.py index c55851d6f..e0ad3def4 100644 --- a/recbole/model/sequential_recommender/nextitnet.py +++ b/recbole/model/sequential_recommender/nextitnet.py @@ -3,7 +3,6 @@ # @Author : Jingsen Zhang # @Email : zhangjingsen@ruc.edu.cn - r""" NextItNet ################################################ @@ -22,8 +21,8 @@ from torch.nn import functional as F from torch.nn.init import uniform_, xavier_normal_, constant_ -from recbole.model.loss import RegLoss, BPRLoss from recbole.model.abstract_recommender import SequentialRecommender +from recbole.model.loss import RegLoss, BPRLoss class NextItNet(SequentialRecommender): @@ -36,6 +35,7 @@ class NextItNet(SequentialRecommender): and then stop the generating process. Although the number of parameters in residual block (a) is less than it in residual block (b), the performance of b is better than a. So in our model, we use residual block (b). 
+ In addition, when the dilations are not equal to 1, training may be slow. To speed it up, please set the parameter "reproducibility" to False. """ def __init__(self, config, dataset): @@ -54,8 +54,11 @@ def __init__(self, config, dataset): self.item_embedding = nn.Embedding(self.n_items, self.embedding_size, padding_idx=0) # residual blocks dilations in blocks:[1,2,4,8,1,2,4,8,...] - rb = [ResidualBlock_b(self.residual_channels, self.residual_channels, kernel_size=self.kernel_size, - dilation=dilation) for dilation in self.dilations] + rb = [ + ResidualBlock_b( + self.residual_channels, self.residual_channels, kernel_size=self.kernel_size, dilation=dilation + ) for dilation in self.dilations + ] self.residual_blocks = nn.Sequential(*rb) # fully-connected layer @@ -82,11 +85,11 @@ def _init_weights(self, module): constant_(module.bias.data, 0.1) def forward(self, item_seq): - item_seq_emb = self.item_embedding(item_seq) # [batch_size, seq_len, embed_size] + item_seq_emb = self.item_embedding(item_seq) # [batch_size, seq_len, embed_size] # Residual locks dilate_outputs = self.residual_blocks(item_seq_emb) hidden = dilate_outputs[:, -1, :].view(-1, self.residual_channels) # [batch_size, embed_size] - seq_output = self.final_layer(hidden) # [batch_size, embedding_size] + seq_output = self.final_layer(hidden) # [batch_size, embedding_size] return seq_output def reg_loss_rb(self): @@ -97,7 +100,7 @@ def reg_loss_rb(self): if self.reg_weight > 0.0: for name, parm in self.residual_blocks.named_parameters(): if name.endswith('weight'): - loss_rb += torch.norm(parm,2) + loss_rb += torch.norm(parm, 2) return self.reg_weight * loss_rb def calculate_loss(self, interaction): @@ -116,7 +119,7 @@ def calculate_loss(self, interaction): test_item_emb = self.item_embedding.weight logits = torch.matmul(seq_output, test_item_emb.transpose(0, 1)) loss = self.loss_fct(logits, pos_items) - reg_loss = self.reg_loss([self.item_embedding.weight,self.final_layer.weight]) + reg_loss
= self.reg_loss([self.item_embedding.weight, self.final_layer.weight]) loss = loss + self.reg_weight * reg_loss + self.reg_loss_rb() return loss @@ -141,14 +144,14 @@ class ResidualBlock_a(nn.Module): r""" Residual block (a) in the paper """ + def __init__(self, in_channel, out_channel, kernel_size=3, dilation=None): super(ResidualBlock_a, self).__init__() - half_channel = out_channel//2 + half_channel = out_channel // 2 self.ln1 = nn.LayerNorm(out_channel, eps=1e-8) self.conv1 = nn.Conv2d(in_channel, half_channel, kernel_size=(1, 1), padding=0) - self.ln2 = nn.LayerNorm(half_channel, eps=1e-8) self.conv2 = nn.Conv2d(half_channel, half_channel, kernel_size=(1, kernel_size), padding=0, dilation=dilation) @@ -176,12 +179,13 @@ def forward(self, x): # x: [batch_size, seq_len, embed_size] def conv_pad(self, x, dilation): # x: [batch_size, seq_len, embed_size] r""" Dropout-mask: To avoid the future information leakage problem, this paper proposed a masking-based dropout trick for the 1D dilated convolution to prevent the network from seeing the future items. - Also the One-dimensional transformation is completed in this funtion. + Also the One-dimensional transformation is completed in this function. 
""" - inputs_pad = x.permute(0, 2, 1) # [batch_size, embed_size, seq_len] + inputs_pad = x.permute(0, 2, 1) # [batch_size, embed_size, seq_len] inputs_pad = inputs_pad.unsqueeze(2) # [batch_size, embed_size, 1, seq_len] - pad = nn.ZeroPad2d(((self.kernel_size - 1) * dilation, 0, 0, 0)) # padding opration args:(left,right,top,bottom) - inputs_pad = pad(inputs_pad) # [batch_size, embed_size, 1, seq_len+(self.kernel_size-1)*dilations] + pad = nn.ZeroPad2d(((self.kernel_size - 1) * dilation, 0, 0, 0)) + # padding operation args:(left,right,top,bottom) + inputs_pad = pad(inputs_pad) # [batch_size, embed_size, 1, seq_len+(self.kernel_size-1)*dilations] return inputs_pad @@ -189,22 +193,24 @@ class ResidualBlock_b(nn.Module): r""" Residual block (b) in the paper """ + def __init__(self, in_channel, out_channel, kernel_size=3, dilation=None): super(ResidualBlock_b, self).__init__() self.conv1 = nn.Conv2d(in_channel, out_channel, kernel_size=(1, kernel_size), padding=0, dilation=dilation) self.ln1 = nn.LayerNorm(out_channel, eps=1e-8) - self.conv2 = nn.Conv2d(out_channel, out_channel, kernel_size=(1, kernel_size), padding=0, dilation=dilation*2) + self.conv2 = nn.Conv2d(out_channel, out_channel, kernel_size=(1, kernel_size), padding=0, dilation=dilation * 2) self.ln2 = nn.LayerNorm(out_channel, eps=1e-8) self.dilation = dilation self.kernel_size = kernel_size def forward(self, x): # x: [batch_size, seq_len, embed_size] - x_pad = self.conv_pad(x, self.dilation) # [batch_size, embed_size, 1, seq_len+(self.kernel_size-1)*dilations] - out = self.conv1(x_pad).squeeze(2).permute(0, 2, 1) # [batch_size, seq_len+(self.kernel_size-1)*dilations-kernel_size+1, embed_size] + x_pad = self.conv_pad(x, self.dilation) # [batch_size, embed_size, 1, seq_len+(self.kernel_size-1)*dilations] + out = self.conv1(x_pad).squeeze(2).permute(0, 2, 1) + # [batch_size, seq_len+(self.kernel_size-1)*dilations-kernel_size+1, embed_size] out = F.relu(self.ln1(out)) - out_pad = self.conv_pad(out, 
self.dilation*2) + out_pad = self.conv_pad(out, self.dilation * 2) out2 = self.conv2(out_pad).squeeze(2).permute(0, 2, 1) out2 = F.relu(self.ln2(out2)) return out2 + x @@ -212,7 +218,7 @@ def forward(self, x): # x: [batch_size, seq_len, embed_size] def conv_pad(self, x, dilation): r""" Dropout-mask: To avoid the future information leakage problem, this paper proposed a masking-based dropout trick for the 1D dilated convolution to prevent the network from seeing the future items. - Also the One-dimensional transformation is completed in this funtion. + Also the One-dimensional transformation is completed in this function. """ inputs_pad = x.permute(0, 2, 1) inputs_pad = inputs_pad.unsqueeze(2) diff --git a/recbole/model/sequential_recommender/npe.py b/recbole/model/sequential_recommender/npe.py new file mode 100644 index 000000000..29674241a --- /dev/null +++ b/recbole/model/sequential_recommender/npe.py @@ -0,0 +1,115 @@ +# -*- coding: utf-8 -*- +# @Time : 2020/11/22 14:56 +# @Author : Shao Weiqi +# @Reviewer : Lin Kun +# @Email : shaoweiqi@ruc.edu.cn + +r""" +NPE +################################################ + +Reference: + ThaiBinh Nguyen, et al. 
"NPE: Neural Personalized Embedding for Collaborative Filtering" in IJCAI 2018 + +Reference code: + https://github.com/wubinzzu/NeuRec + +""" + +import torch +import torch.nn as nn +from torch.nn.init import xavier_normal_ + +from recbole.model.abstract_recommender import SequentialRecommender +from recbole.model.loss import BPRLoss + + +class NPE(SequentialRecommender): + r""" + models a user’s click on an item in two terms: the personal preference of the user for the item, + and the relationships between this item and other items clicked by the user + + """ + + def __init__(self, config, dataset): + super(NPE, self).__init__(config, dataset) + + # load the dataset information + self.n_user = dataset.num(self.USER_ID) + self.device = config["device"] + + # load the parameters information + self.embedding_size = config["embedding_size"] + self.dropout_prob = config["dropout_prob"] + + # define layers and loss type + self.user_embedding = nn.Embedding(self.n_user, self.embedding_size) + self.item_embedding = nn.Embedding(self.n_items, self.embedding_size) + self.embedding_seq_item = nn.Embedding(self.n_items, self.embedding_size, padding_idx=0) + self.relu = nn.ReLU() + self.dropout = nn.Dropout(self.dropout_prob) + + self.loss_type = config['loss_type'] + if self.loss_type == 'BPR': + self.loss_fct = BPRLoss() + elif self.loss_type == 'CE': + self.loss_fct = nn.CrossEntropyLoss() + else: + raise NotImplementedError("Make sure 'loss_type' in ['BPR', 'CE']!") + + # init the parameters of the module + self.apply(self._init_weights) + + def _init_weights(self, module): + if isinstance(module, nn.Embedding): + xavier_normal_(module.weight.data) + + def forward(self, seq_item, user): + + user_embedding = self.dropout(self.relu(self.user_embedding(user))) + # batch_size * embedding_size + seq_item_embedding = self.item_embedding(seq_item).sum(dim=1) + seq_item_embedding = self.dropout(self.relu(seq_item_embedding)) + # batch_size * embedding_size + + return
user_embedding + seq_item_embedding + + def calculate_loss(self, interaction): + + seq_item = interaction[self.ITEM_SEQ] + user = interaction[self.USER_ID] + seq_output = self.forward(seq_item, user) + pos_items = interaction[self.POS_ITEM_ID] + pos_items_embs = self.item_embedding(pos_items) + if self.loss_type == 'BPR': + neg_items = interaction[self.NEG_ITEM_ID] + neg_items_emb = self.relu(self.item_embedding(neg_items)) + pos_items_emb = self.relu(pos_items_embs) + pos_score = torch.sum(seq_output * pos_items_emb, dim=-1) + neg_score = torch.sum(seq_output * neg_items_emb, dim=-1) + loss = self.loss_fct(pos_score, neg_score) + return loss + else: # self.loss_type = 'CE' + test_item_emb = self.relu(self.item_embedding.weight) + logits = torch.matmul(seq_output, test_item_emb.transpose(0, 1)) + loss = self.loss_fct(logits, pos_items) + return loss + + def predict(self, interaction): + + item_seq = interaction[self.ITEM_SEQ] + test_item = interaction[self.ITEM_ID] + user = interaction[self.USER_ID] + seq_output = self.forward(item_seq, user) + test_item_emb = self.relu(self.item_embedding(test_item)) + scores = torch.mul(seq_output, test_item_emb).sum(dim=1) + return scores + + def full_sort_predict(self, interaction): + + item_seq = interaction[self.ITEM_SEQ] + user = interaction[self.USER_ID] + seq_output = self.forward(item_seq, user) + test_items_emb = self.relu(self.item_embedding.weight) + scores = torch.matmul(seq_output, test_items_emb.transpose(0, 1)) + return scores diff --git a/recbole/model/sequential_recommender/repeatnet.py b/recbole/model/sequential_recommender/repeatnet.py new file mode 100644 index 000000000..98cc28583 --- /dev/null +++ b/recbole/model/sequential_recommender/repeatnet.py @@ -0,0 +1,330 @@ +# -*- coding: utf-8 -*- +# @Time : 2020/11/22 8:30 +# @Author : Shao Weiqi +# @Reviewer : Lin Kun, Fan xinyan +# @Email : shaoweiqi@ruc.edu.cn, xinyan.fan@ruc.edu.cn + +r""" +RepeatNet +################################################ + 
+Reference: + Pengjie Ren et al. "RepeatNet: A Repeat Aware Neural Recommendation Machine for Session-based Recommendation." + in AAAI 2019 + +Reference code: + https://github.com/PengjieRen/RepeatNet. + +""" + +import torch +from torch import nn +from torch.nn import functional as F +from torch.nn.init import xavier_normal_, constant_ + +from recbole.model.abstract_recommender import SequentialRecommender +from recbole.utils import InputType + + +class RepeatNet(SequentialRecommender): + r""" + RepeatNet explores a hybrid encoder with a repeat module and an explore module. + The repeat module is used for finding out repeated consumption in sequential recommendation, + and the explore module is used for exploring new items for recommendation. + + """ + + input_type = InputType.POINTWISE + + def __init__(self, config, dataset): + + super(RepeatNet, self).__init__(config, dataset) + + # load the dataset information + self.device = config["device"] + + # load parameters + self.embedding_size = config["embedding_size"] + self.hidden_size = config["hidden_size"] + self.joint_train = config["joint_train"] + self.dropout_prob = config["dropout_prob"] + + # define the layers and loss function + self.item_matrix = nn.Embedding(self.n_items, self.embedding_size, padding_idx=0) + self.gru = nn.GRU(self.embedding_size, self.hidden_size, batch_first=True) + self.repeat_explore_mechanism = Repeat_Explore_Mechanism( + self.device, hidden_size=self.hidden_size, seq_len=self.max_seq_length, dropout_prob=self.dropout_prob + ) + self.repeat_recommendation_decoder = Repeat_Recommendation_Decoder( + self.device, + hidden_size=self.hidden_size, + seq_len=self.max_seq_length, + num_item=self.n_items, + dropout_prob=self.dropout_prob + ) + self.explore_recommendation_decoder = Explore_Recommendation_Decoder( + hidden_size=self.hidden_size, + seq_len=self.max_seq_length, + num_item=self.n_items, + device=self.device, + dropout_prob=self.dropout_prob + ) + + self.loss_fct = F.nll_loss + + # init the
weight of the module + self.apply(self._init_weights) + + def _init_weights(self, module): + + if isinstance(module, nn.Embedding): + xavier_normal_(module.weight.data) + elif isinstance(module, nn.Linear): + xavier_normal_(module.weight.data) + if module.bias is not None: + constant_(module.bias.data, 0) + + def forward(self, item_seq, item_seq_len): + + batch_seq_item_embedding = self.item_matrix(item_seq) + # batch_size * seq_len == embedding ==>> batch_size * seq_len * embedding_size + + all_memory, _ = self.gru(batch_seq_item_embedding) + last_memory = self.gather_indexes(all_memory, item_seq_len - 1) + # all_memory: batch_size * item_seq * hidden_size + # last_memory: batch_size * hidden_size + timeline_mask = (item_seq == 0) + + self.repeat_explore = self.repeat_explore_mechanism.forward(all_memory=all_memory, last_memory=last_memory) + # batch_size * 2 + repeat_recommendation_decoder = self.repeat_recommendation_decoder.forward( + all_memory=all_memory, last_memory=last_memory, item_seq=item_seq, mask=timeline_mask + ) + # batch_size * num_item + explore_recommendation_decoder = self.explore_recommendation_decoder.forward( + all_memory=all_memory, last_memory=last_memory, item_seq=item_seq, mask=timeline_mask + ) + # batch_size * num_item + prediction = repeat_recommendation_decoder * self.repeat_explore[:, 0].unsqueeze(1) \ + + explore_recommendation_decoder * self.repeat_explore[:, 1].unsqueeze(1) + # batch_size * num_item + + return prediction + + def calculate_loss(self, interaction): + + item_seq = interaction[self.ITEM_SEQ] + item_seq_len = interaction[self.ITEM_SEQ_LEN] + pos_item = interaction[self.POS_ITEM_ID] + prediction = self.forward(item_seq, item_seq_len) + loss = self.loss_fct((prediction + 1e-8).log(), pos_item, ignore_index=0) + if self.joint_train is True: + loss += self.repeat_explore_loss(item_seq, pos_item) + + return loss + + def repeat_explore_loss(self, item_seq, pos_item): + + batch_size = item_seq.size(0) + repeat, explore = 
torch.zeros(batch_size).to(self.device), torch.ones(batch_size).to(self.device) + index = 0 + for seq_item_ex, pos_item_ex in zip(item_seq, pos_item): + if pos_item_ex in seq_item_ex: + repeat[index] = 1 + explore[index] = 0 + index += 1 + repeat_loss = torch.mul(repeat.unsqueeze(1), torch.log(self.repeat_explore[:, 0] + 1e-8)).mean() + explore_loss = torch.mul(explore.unsqueeze(1), torch.log(self.repeat_explore[:, 1] + 1e-8)).mean() + + return (-repeat_loss - explore_loss) / 2 + + def full_sort_predict(self, interaction): + + item_seq = interaction[self.ITEM_SEQ] + item_seq_len = interaction[self.ITEM_SEQ_LEN] + prediction = self.forward(item_seq, item_seq_len) + + return prediction + + def predict(self, interaction): + + item_seq = interaction[self.ITEM_SEQ] + test_item = interaction[self.ITEM_ID] + item_seq_len = interaction[self.ITEM_SEQ_LEN] + seq_output = self.forward(item_seq, item_seq_len) + # batch_size * num_items + seq_output = seq_output.unsqueeze(-1) + # batch_size * num_items * 1 + scores = self.gather_indexes(seq_output, test_item).squeeze() + + return scores + + +class Repeat_Explore_Mechanism(nn.Module): + + def __init__(self, device, hidden_size, seq_len, dropout_prob): + super(Repeat_Explore_Mechanism, self).__init__() + self.dropout = nn.Dropout(dropout_prob) + self.hidden_size = hidden_size + self.device = device + self.seq_len = seq_len + self.Wre = nn.Linear(hidden_size, hidden_size, bias=False) + self.Ure = nn.Linear(hidden_size, hidden_size, bias=False) + self.tanh = nn.Tanh() + self.Vre = nn.Linear(hidden_size, 1, bias=False) + self.Wcre = nn.Linear(hidden_size, 2, bias=False) + + def forward(self, all_memory, last_memory): + """ + calculate the probability of Repeat and explore + """ + all_memory_values = all_memory + + all_memory = self.dropout(self.Ure(all_memory)) + + last_memory = self.dropout(self.Wre(last_memory)) + last_memory = last_memory.unsqueeze(1) + last_memory = last_memory.repeat(1, self.seq_len, 1) + + output_ere = 
self.tanh(all_memory + last_memory) + + output_ere = self.Vre(output_ere) + alpha_are = nn.Softmax(dim=1)(output_ere) + alpha_are = alpha_are.repeat(1, 1, self.hidden_size) + output_cre = alpha_are * all_memory_values + output_cre = output_cre.sum(dim=1) + + output_cre = self.Wcre(output_cre) + + repeat_explore_mechanism = nn.Softmax(dim=-1)(output_cre) + + return repeat_explore_mechanism + + +class Repeat_Recommendation_Decoder(nn.Module): + + def __init__(self, device, hidden_size, seq_len, num_item, dropout_prob): + super(Repeat_Recommendation_Decoder, self).__init__() + self.dropout = nn.Dropout(dropout_prob) + self.hidden_size = hidden_size + self.device = device + self.seq_len = seq_len + self.num_item = num_item + self.Wr = nn.Linear(hidden_size, hidden_size, bias=False) + self.Ur = nn.Linear(hidden_size, hidden_size, bias=False) + self.tanh = nn.Tanh() + self.Vr = nn.Linear(hidden_size, 1) + + def forward(self, all_memory, last_memory, item_seq, mask=None): + """ + calculate the force of repeat + """ + all_memory = self.dropout(self.Ur(all_memory)) + + last_memory = self.dropout(self.Wr(last_memory)) + last_memory = last_memory.unsqueeze(1) + last_memory = last_memory.repeat(1, self.seq_len, 1) + + output_er = self.tanh(last_memory + all_memory) + + output_er = self.Vr(output_er).squeeze(2) + + if mask is not None: + output_er.masked_fill_(mask, -1e9) + + output_er = nn.Softmax(dim=-1)(output_er) + output_er = output_er.unsqueeze(1) + + map_matrix = build_map(item_seq, self.device, max_index=self.num_item) + output_er = torch.matmul(output_er, map_matrix).squeeze(1).to(self.device) + repeat_recommendation_decoder = output_er.squeeze(1).to(self.device) + + return repeat_recommendation_decoder.to(self.device) + + +class Explore_Recommendation_Decoder(nn.Module): + + def __init__(self, hidden_size, seq_len, num_item, device, dropout_prob): + super(Explore_Recommendation_Decoder, self).__init__() + self.dropout = nn.Dropout(dropout_prob) + self.hidden_size
= hidden_size + self.seq_len = seq_len + self.num_item = num_item + self.device = device + self.We = nn.Linear(hidden_size, hidden_size) + self.Ue = nn.Linear(hidden_size, hidden_size) + self.tanh = nn.Tanh() + self.Ve = nn.Linear(hidden_size, 1) + self.matrix_for_explore = nn.Linear(2 * self.hidden_size, self.num_item, bias=False) + + def forward(self, all_memory, last_memory, item_seq, mask=None): + """ + calculate the force of explore + """ + all_memory_values, last_memory_values = all_memory, last_memory + + all_memory = self.dropout(self.Ue(all_memory)) + + last_memory = self.dropout(self.We(last_memory)) + last_memory = last_memory.unsqueeze(1) + last_memory = last_memory.repeat(1, self.seq_len, 1) + + output_ee = self.tanh(all_memory + last_memory) + output_ee = self.Ve(output_ee).squeeze(-1) + + if mask is not None: + output_ee.masked_fill_(mask, -1e9) + + output_ee = output_ee.unsqueeze(-1) + + alpha_e = nn.Softmax(dim=1)(output_ee) + alpha_e = alpha_e.repeat(1, 1, self.hidden_size) + output_e = (alpha_e * all_memory_values).sum(dim=1) + output_e = torch.cat([output_e, last_memory_values], dim=1) + output_e = self.dropout(self.matrix_for_explore(output_e)) + + map_matrix = build_map(item_seq, self.device, max_index=self.num_item) + explore_mask = torch.bmm((item_seq > 0).float().unsqueeze(1), map_matrix).squeeze(1) + output_e = output_e.masked_fill(explore_mask.bool(), float('-inf')) + explore_recommendation_decoder = nn.Softmax(1)(output_e) + + return explore_recommendation_decoder + + +def build_map(b_map, device, max_index=None): + """ + project the b_map to the place where it should be, like this: + item_seq A: [3,4,5] n_items: 6 + + after map: A + + [0,0,1,0,0,0] + + [0,0,0,1,0,0] + + [0,0,0,0,1,0] + + batch_size * seq_len ==>> batch_size * seq_len * n_item + + use in RepeatNet: + + [3,4,5] matmul [0,0,1,0,0,0] + + [0,0,0,1,0,0] + + [0,0,0,0,1,0] + + ==>>> [0,0,3,4,5,0] it works in RepeatNet when projecting the seq items into all items + + 
batch_size * 1 * seq_len matmul batch_size * seq_len * n_item ==>> batch_size * 1 * n_item + """ + batch_size, b_len = b_map.size() + if max_index is None: + max_index = b_map.max() + 1 + if torch.cuda.is_available(): + b_map_ = torch.FloatTensor(batch_size, b_len, max_index).fill_(0).to(device) + else: + b_map_ = torch.zeros(batch_size, b_len, max_index) + b_map_.scatter_(2, b_map.unsqueeze(2), 1.) + b_map_.requires_grad = False + return b_map_ diff --git a/recbole/model/sequential_recommender/s3rec.py b/recbole/model/sequential_recommender/s3rec.py index 7a9617886..1aa86184f 100644 --- a/recbole/model/sequential_recommender/s3rec.py +++ b/recbole/model/sequential_recommender/s3rec.py @@ -23,8 +23,8 @@ from torch import nn from recbole.model.abstract_recommender import SequentialRecommender -from recbole.model.loss import BPRLoss from recbole.model.layers import TransformerEncoder +from recbole.model.loss import BPRLoss class S3Rec(SequentialRecommender): @@ -75,11 +75,16 @@ def __init__(self, config, dataset): self.position_embedding = nn.Embedding(self.max_seq_length, self.hidden_size) self.feature_embedding = nn.Embedding(self.n_features, self.hidden_size, padding_idx=0) - self.trm_encoder = TransformerEncoder(n_layers=self.n_layers, n_heads=self.n_heads, - hidden_size=self.hidden_size, inner_size=self.inner_size, - hidden_dropout_prob=self.hidden_dropout_prob, - attn_dropout_prob=self.attn_dropout_prob, - hidden_act=self.hidden_act, layer_norm_eps=self.layer_norm_eps) + self.trm_encoder = TransformerEncoder( + n_layers=self.n_layers, + n_heads=self.n_heads, + hidden_size=self.hidden_size, + inner_size=self.inner_size, + hidden_dropout_prob=self.hidden_dropout_prob, + attn_dropout_prob=self.attn_dropout_prob, + hidden_act=self.hidden_act, + layer_norm_eps=self.layer_norm_eps + ) self.LayerNorm = nn.LayerNorm(self.hidden_size, eps=self.layer_norm_eps) self.dropout = nn.Dropout(self.hidden_dropout_prob) @@ -107,7 +112,7 @@ def __init__(self, config, dataset): 
else: # load pretrained model for finetune pretrained = torch.load(self.pre_model_path) - print('Load pretrained model from', self.pre_model_path) + self.logger.info('Load pretrained model from %s', self.pre_model_path) self.load_state_dict(pretrained['state_dict']) @@ -176,14 +181,13 @@ def forward(self, item_seq, bidirectional=True): input_emb = self.LayerNorm(input_emb) input_emb = self.dropout(input_emb) attention_mask = self.get_attention_mask(item_seq, bidirectional=bidirectional) - trm_output = self.trm_encoder(input_emb, - attention_mask, - output_all_encoded_layers=True) - seq_output = trm_output[-1] # [B L H] + trm_output = self.trm_encoder(input_emb, attention_mask, output_all_encoded_layers=True) + seq_output = trm_output[-1] # [B L H] return seq_output - def pretrain(self, features, masked_item_sequence, pos_items, neg_items, - masked_segment_sequence, pos_segment, neg_segment): + def pretrain( + self, features, masked_item_sequence, pos_items, neg_items, masked_segment_sequence, pos_segment, neg_segment + ): """Pretrain out model using four pre-training tasks: 1.
Associated Attribute Prediction @@ -231,13 +235,12 @@ def pretrain(self, features, masked_item_sequence, pos_items, neg_items, pos_segment_score = self._segment_prediction(segment_context, pos_segment_emb) neg_segment_score = self._segment_prediction(segment_context, neg_segment_emb) sp_distance = torch.sigmoid(pos_segment_score - neg_segment_score) - sp_loss = torch.sum(self.loss_fct(sp_distance, - torch.ones_like(sp_distance, dtype=torch.float32))) + sp_loss = torch.sum(self.loss_fct(sp_distance, torch.ones_like(sp_distance, dtype=torch.float32))) - pretrain_loss = self.aap_weight*aap_loss \ - + self.mip_weight*mip_loss \ - + self.map_weight*map_loss \ - + self.sp_weight*sp_loss + pretrain_loss = self.aap_weight * aap_loss \ + + self.mip_weight * mip_loss \ + + self.map_weight * map_loss \ + + self.sp_weight * sp_loss return pretrain_loss @@ -250,7 +253,7 @@ def _neg_sample(self, item_set): # [ , ] def _padding_zero_at_left(self, sequence): # had truncated according to the max_length pad_len = self.max_seq_length - len(sequence) - sequence = [0]*pad_len + sequence + sequence = [0] * pad_len + sequence return sequence def reconstruct_pretrain_data(self, item_seq, item_seq_len): @@ -260,7 +263,7 @@ def reconstruct_pretrain_data(self, item_seq, item_seq_len): # We don't need padding for features item_feature_seq = self.item_feat[self.FEATURE_FIELD][item_seq] - 1 - + end_index = item_seq_len.cpu().numpy().tolist() item_seq = item_seq.cpu().numpy().tolist() item_feature_seq = item_feature_seq.cpu().numpy().tolist() @@ -268,7 +271,7 @@ def reconstruct_pretrain_data(self, item_seq, item_seq_len): # we will padding zeros at the left side # these will be train_instances, after will be reshaped to batch sequence_instances = [] - associated_features = [] # For Associated Attribute Prediction and Masked Attribute Prediction + associated_features = [] # For Associated Attribute Prediction and Masked Attribute Prediction long_sequence = [] for i, end_i in enumerate(end_index): 
sequence_instances.append(item_seq[i][:end_i]) @@ -318,14 +321,14 @@ def reconstruct_pretrain_data(self, item_seq, item_seq_len): sample_length = random.randint(1, len(instance) // 2) start_id = random.randint(0, len(instance) - sample_length) neg_start_id = random.randint(0, len(long_sequence) - sample_length) - pos_segment = instance[start_id: start_id + sample_length] + pos_segment = instance[start_id:start_id + sample_length] neg_segment = long_sequence[neg_start_id:neg_start_id + sample_length] masked_segment = instance[:start_id] + [self.mask_token] * sample_length \ + instance[start_id + sample_length:] - pos_segment = [self.mask_token] * start_id + pos_segment + [self.mask_token] * ( - len(instance) - (start_id + sample_length)) - neg_segment = [self.mask_token] * start_id + neg_segment + [self.mask_token] * ( - len(instance) - (start_id + sample_length)) + pos_segment = [self.mask_token] * start_id + pos_segment + \ + [self.mask_token] * (len(instance) - (start_id + sample_length)) + neg_segment = [self.mask_token] * start_id + neg_segment + \ + [self.mask_token] * (len(instance) - (start_id + sample_length)) masked_segment_list.append(self._padding_zero_at_left(masked_segment)) pos_segment_list.append(self._padding_zero_at_left(pos_segment)) neg_segment_list.append(self._padding_zero_at_left(neg_segment)) @@ -343,7 +346,6 @@ def reconstruct_pretrain_data(self, item_seq, item_seq_len): return associated_features, masked_item_sequence, pos_items, neg_items, \ masked_segment_list, pos_segment_list, neg_segment_list - def calculate_loss(self, interaction): item_seq = interaction[self.ITEM_SEQ] item_seq_len = interaction[self.ITEM_SEQ_LEN] @@ -353,8 +355,9 @@ def calculate_loss(self, interaction): masked_segment_sequence, pos_segment, neg_segment \ = self.reconstruct_pretrain_data(item_seq, item_seq_len) - loss = self.pretrain(features, masked_item_sequence, pos_items, neg_items, - masked_segment_sequence, pos_segment, neg_segment) + loss = self.pretrain( + 
features, masked_item_sequence, pos_items, neg_items, masked_segment_sequence, pos_segment, neg_segment + ) # finetune else: pos_items = interaction[self.POS_ITEM_ID] @@ -390,6 +393,6 @@ def full_sort_predict(self, interaction): item_seq_len = interaction[self.ITEM_SEQ_LEN] seq_output = self.forward(item_seq, bidirectional=False) seq_output = self.gather_indexes(seq_output, item_seq_len - 1) - test_items_emb = self.item_embedding.weight[:self.n_items-1] # delete masked token + test_items_emb = self.item_embedding.weight[:self.n_items - 1] # delete masked token scores = torch.matmul(seq_output, test_items_emb.transpose(0, 1)) # [B, n_items] return scores diff --git a/recbole/model/sequential_recommender/sasrec.py b/recbole/model/sequential_recommender/sasrec.py index 564250dbf..ea58a8fdf 100644 --- a/recbole/model/sequential_recommender/sasrec.py +++ b/recbole/model/sequential_recommender/sasrec.py @@ -19,8 +19,8 @@ from torch import nn from recbole.model.abstract_recommender import SequentialRecommender -from recbole.model.loss import BPRLoss from recbole.model.layers import TransformerEncoder +from recbole.model.loss import BPRLoss class SASRec(SequentialRecommender): @@ -29,7 +29,7 @@ class SASRec(SequentialRecommender): NOTE: In the author's implementation, the Point-Wise Feed-Forward Network (PFFN) is implemented - by CNN with 1x1 kernel. In this implementation, we follows the original BERT implmentation + by CNN with 1x1 kernel. In this implementation, we follow the original BERT implementation using Fully Connected Layer to implement the PFFN.
""" @@ -39,7 +39,7 @@ def __init__(self, config, dataset): # load parameters info self.n_layers = config['n_layers'] self.n_heads = config['n_heads'] - self.hidden_size = config['hidden_size'] # same as embedding_size + self.hidden_size = config['hidden_size'] # same as embedding_size self.inner_size = config['inner_size'] # the dimensionality in feed-forward layer self.hidden_dropout_prob = config['hidden_dropout_prob'] self.attn_dropout_prob = config['attn_dropout_prob'] @@ -50,13 +50,18 @@ def __init__(self, config, dataset): self.loss_type = config['loss_type'] # define layers and loss - self.item_embedding = nn.Embedding(self.n_items, self.hidden_size , padding_idx=0) + self.item_embedding = nn.Embedding(self.n_items, self.hidden_size, padding_idx=0) self.position_embedding = nn.Embedding(self.max_seq_length, self.hidden_size) - self.trm_encoder = TransformerEncoder(n_layers=self.n_layers, n_heads=self.n_heads, - hidden_size=self.hidden_size, inner_size=self.inner_size, - hidden_dropout_prob=self.hidden_dropout_prob, - attn_dropout_prob=self.attn_dropout_prob, - hidden_act=self.hidden_act, layer_norm_eps=self.layer_norm_eps) + self.trm_encoder = TransformerEncoder( + n_layers=self.n_layers, + n_heads=self.n_heads, + hidden_size=self.hidden_size, + inner_size=self.inner_size, + hidden_dropout_prob=self.hidden_dropout_prob, + attn_dropout_prob=self.attn_dropout_prob, + hidden_act=self.hidden_act, + layer_norm_eps=self.layer_norm_eps + ) self.LayerNorm = nn.LayerNorm(self.hidden_size, eps=self.layer_norm_eps) self.dropout = nn.Dropout(self.hidden_dropout_prob) @@ -111,12 +116,10 @@ def forward(self, item_seq, item_seq_len): extended_attention_mask = self.get_attention_mask(item_seq) - trm_output = self.trm_encoder(input_emb, - extended_attention_mask, - output_all_encoded_layers=True) + trm_output = self.trm_encoder(input_emb, extended_attention_mask, output_all_encoded_layers=True) output = trm_output[-1] output = self.gather_indexes(output, item_seq_len - 1) - 
return output # [B H] + return output # [B H] def calculate_loss(self, interaction): item_seq = interaction[self.ITEM_SEQ] diff --git a/recbole/model/sequential_recommender/sasrecf.py b/recbole/model/sequential_recommender/sasrecf.py index 8c7f90c17..b6c4feb4f 100644 --- a/recbole/model/sequential_recommender/sasrecf.py +++ b/recbole/model/sequential_recommender/sasrecf.py @@ -12,8 +12,8 @@ from torch import nn from recbole.model.abstract_recommender import SequentialRecommender -from recbole.model.loss import BPRLoss from recbole.model.layers import TransformerEncoder, FeatureSeqEmbLayer +from recbole.model.loss import BPRLoss class SASRecF(SequentialRecommender): @@ -45,14 +45,20 @@ def __init__(self, config, dataset): # define layers and loss self.item_embedding = nn.Embedding(self.n_items, self.hidden_size, padding_idx=0) self.position_embedding = nn.Embedding(self.max_seq_length, self.hidden_size) - self.feature_embed_layer = FeatureSeqEmbLayer(dataset, self.hidden_size, self.selected_features, - self.pooling_mode, self.device) - - self.trm_encoder = TransformerEncoder(n_layers=self.n_layers, n_heads=self.n_heads, - hidden_size=self.hidden_size, inner_size=self.inner_size, - hidden_dropout_prob=self.hidden_dropout_prob, - attn_dropout_prob=self.attn_dropout_prob, - hidden_act=self.hidden_act, layer_norm_eps=self.layer_norm_eps) + self.feature_embed_layer = FeatureSeqEmbLayer( + dataset, self.hidden_size, self.selected_features, self.pooling_mode, self.device + ) + + self.trm_encoder = TransformerEncoder( + n_layers=self.n_layers, + n_heads=self.n_heads, + hidden_size=self.hidden_size, + inner_size=self.inner_size, + hidden_dropout_prob=self.hidden_dropout_prob, + attn_dropout_prob=self.attn_dropout_prob, + hidden_act=self.hidden_act, + layer_norm_eps=self.layer_norm_eps + ) self.concat_layer = nn.Linear(self.hidden_size * (1 + self.num_feature_field), self.hidden_size) @@ -126,11 +132,10 @@ def forward(self, item_seq, item_seq_len): input_emb = 
self.dropout(input_emb) extended_attention_mask = self.get_attention_mask(item_seq) - trm_output = self.trm_encoder(input_emb, extended_attention_mask, - output_all_encoded_layers=True) + trm_output = self.trm_encoder(input_emb, extended_attention_mask, output_all_encoded_layers=True) output = trm_output[-1] seq_output = self.gather_indexes(output, item_seq_len - 1) - return seq_output # [B H] + return seq_output # [B H] def calculate_loss(self, interaction): item_seq = interaction[self.ITEM_SEQ] diff --git a/recbole/model/sequential_recommender/shan.py b/recbole/model/sequential_recommender/shan.py new file mode 100644 index 000000000..19d7d435d --- /dev/null +++ b/recbole/model/sequential_recommender/shan.py @@ -0,0 +1,224 @@ +# -*- coding: utf-8 -*- +# @Time : 2020/11/20 22:33 +# @Author : Shao Weiqi +# @Reviewer : Lin Kun +# @Email : shaoweiqi@ruc.edu.cn + +r""" +SHAN +################################################ + +Reference: + Ying, H. et al. "Sequential Recommender System based on Hierarchical Attention Network." in IJCAI 2018 + + +""" +import numpy as np +import torch +import torch.nn as nn +from torch.nn.init import normal_, uniform_ + +from recbole.model.abstract_recommender import SequentialRecommender +from recbole.model.loss import BPRLoss + + +class SHAN(SequentialRecommender): + r""" + SHAN exploits the Hierarchical Attention Network to get the long- and short-term preference: + it first gets the long-term purpose and then fuses it with the recent items to get the combined long- and short-term preference + + """ + + def __init__(self, config, dataset): + + super(SHAN, self).__init__(config, dataset) + + # load the dataset information + self.n_users = dataset.num(self.USER_ID) + self.device = config['device'] + + # load the parameter information + self.embedding_size = config["embedding_size"] + self.short_item_length = config["short_item_length"] # the length of the short session items + assert self.short_item_length <= self.max_seq_length, "short_item_length can't be longer
than the max_seq_length" + self.reg_weight = config["reg_weight"] + + # define layers and loss + self.item_embedding = nn.Embedding(self.n_items, self.embedding_size, padding_idx=0) + self.user_embedding = nn.Embedding(self.n_users, self.embedding_size) + + self.long_w = nn.Linear(self.embedding_size, self.embedding_size) + self.long_b = nn.Parameter( + uniform_( + tensor=torch.zeros(self.embedding_size), + a=-np.sqrt(3 / self.embedding_size), + b=np.sqrt(3 / self.embedding_size) + ), + requires_grad=True + ).to(self.device) + self.long_short_w = nn.Linear(self.embedding_size, self.embedding_size) + self.long_short_b = nn.Parameter( + uniform_( + tensor=torch.zeros(self.embedding_size), + a=-np.sqrt(3 / self.embedding_size), + b=np.sqrt(3 / self.embedding_size) + ), + requires_grad=True + ).to(self.device) + + self.relu = nn.ReLU() + + self.loss_type = config['loss_type'] + if self.loss_type == 'BPR': + self.loss_fct = BPRLoss() + elif self.loss_type == 'CE': + self.loss_fct = nn.CrossEntropyLoss() + else: + raise NotImplementedError("Make sure 'loss_type' in ['BPR', 'CE']!") + + # init the parameter of the model + self.apply(self.init_weights) + + def reg_loss(self, user_embedding, item_embedding): + + reg_1, reg_2 = self.reg_weight + loss_1 = reg_1 * torch.norm(self.long_w.weight, p=2) + reg_1 * torch.norm(self.long_short_w.weight, p=2) + loss_2 = reg_2 * torch.norm(user_embedding, p=2) + reg_2 * torch.norm(item_embedding, p=2) + + return loss_1 + loss_2 + + def inverse_seq_item(self, seq_item, seq_item_len): + """ + move the padding in seq_item from the right side to the left, like this: + [1,2,3,0,0,0,0] -- after inverse -->> [0,0,0,0,1,2,3] + """ + seq_item = seq_item.cpu().numpy() + seq_item_len = seq_item_len.cpu().numpy() + new_seq_item = [] + for items, length in zip(seq_item, seq_item_len): + item = list(items[:length]) + zeros = list(items[length:]) + seqs = zeros + item + new_seq_item.append(seqs) + seq_item = torch.tensor(new_seq_item, dtype=torch.long, device=self.device) + + return seq_item
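The left-padding inversion performed by `inverse_seq_item` can be sketched standalone. This is a minimal plain-Python illustration (the helper name and list inputs are assumptions for the sketch; SHAN itself operates on torch tensors):

```python
def move_padding_to_left(sequences, lengths):
    """Move right-side padding to the left, e.g. [1, 2, 3, 0, 0] -> [0, 0, 1, 2, 3].

    `sequences` is a list of equal-length, right-padded item-id lists and
    `lengths` holds the number of real items in each sequence.
    """
    inverted = []
    for items, length in zip(sequences, lengths):
        # trailing padding first, then the real items
        inverted.append(items[length:] + items[:length])
    return inverted


print(move_padding_to_left([[1, 2, 3, 0, 0, 0, 0]], [3]))
# [[0, 0, 0, 0, 1, 2, 3]]
```

This keeps the most recent items at the right edge of the tensor, which is what lets the model slice the short-term window with `[:, -self.short_item_length:, :]`.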
+ + def init_weights(self, module): + if isinstance(module, nn.Embedding): + normal_(module.weight.data, 0., 0.01) + elif isinstance(module, nn.Linear): + uniform_(module.weight.data, -np.sqrt(3 / self.embedding_size), np.sqrt(3 / self.embedding_size)) + elif isinstance(module, nn.Parameter): + uniform_(module.data, -np.sqrt(3 / self.embedding_size), np.sqrt(3 / self.embedding_size)) + + def forward(self, seq_item, user, seq_item_len): + + seq_item = self.inverse_seq_item(seq_item, seq_item_len) + + seq_item_embedding = self.item_embedding(seq_item) + user_embedding = self.user_embedding(user) + + # get the mask + mask = seq_item.data.eq(0) + long_term_attention_based_pooling_layer = self.long_term_attention_based_pooling_layer( + seq_item_embedding, user_embedding, mask + ) + # batch_size * 1 * embedding_size + + short_item_embedding = seq_item_embedding[:, -self.short_item_length:, :] + mask_long_short = mask[:, -self.short_item_length:] + batch_size = mask_long_short.size(0) + x = torch.zeros(size=(batch_size, 1)).eq(1).to(self.device) + mask_long_short = torch.cat([x, mask_long_short], dim=1) + # batch_size * short_item_length * embedding_size + long_short_item_embedding = torch.cat([long_term_attention_based_pooling_layer, short_item_embedding], dim=1) + # batch_size * 1_plus_short_item_length * embedding_size + + long_short_item_embedding = self.long_and_short_term_attention_based_pooling_layer( + long_short_item_embedding, user_embedding, mask_long_short + ) + # batch_size * embedding_size + + return long_short_item_embedding + + def calculate_loss(self, interaction): + + seq_item = interaction[self.ITEM_SEQ] + seq_item_len = interaction[self.ITEM_SEQ_LEN] + user = interaction[self.USER_ID] + user_embedding = self.user_embedding(user) + seq_output = self.forward(seq_item, user, seq_item_len) + pos_items = interaction[self.POS_ITEM_ID] + pos_items_emb = self.item_embedding(pos_items) + if self.loss_type == 'BPR': + neg_items =
interaction[self.NEG_ITEM_ID] + neg_items_emb = self.item_embedding(neg_items) + pos_score = torch.sum(seq_output * pos_items_emb, dim=-1) + neg_score = torch.sum(seq_output * neg_items_emb, dim=-1) + loss = self.loss_fct(pos_score, neg_score) + return loss + self.reg_loss(user_embedding, pos_items_emb) + else: # self.loss_type = 'CE' + test_item_emb = self.item_embedding.weight + logits = torch.matmul(seq_output, test_item_emb.transpose(0, 1)) + loss = self.loss_fct(logits, pos_items) + return loss + self.reg_loss(user_embedding, pos_items_emb) + + def predict(self, interaction): + + item_seq = interaction[self.ITEM_SEQ] + test_item = interaction[self.ITEM_ID] + seq_item_len = interaction[self.ITEM_SEQ_LEN] + user = interaction[self.USER_ID] + seq_output = self.forward(item_seq, user, seq_item_len) + test_item_emb = self.item_embedding(test_item) + scores = torch.mul(seq_output, test_item_emb).sum(dim=1) + return scores + + def full_sort_predict(self, interaction): + + item_seq = interaction[self.ITEM_SEQ] + seq_item_len = interaction[self.ITEM_SEQ_LEN] + user = interaction[self.USER_ID] + seq_output = self.forward(item_seq, user, seq_item_len) + test_items_emb = self.item_embedding.weight + scores = torch.matmul(seq_output, test_items_emb.transpose(0, 1)) + return scores + + def long_and_short_term_attention_based_pooling_layer(self, long_short_item_embedding, user_embedding, mask=None): + """ + + fusing the long term purpose with the short-term preference + """ + long_short_item_embedding_value = long_short_item_embedding + + long_short_item_embedding = self.relu(self.long_short_w(long_short_item_embedding) + self.long_short_b) + long_short_item_embedding = torch.matmul(long_short_item_embedding, user_embedding.unsqueeze(2)).squeeze(-1) + # batch_size * seq_len + if mask is not None: + long_short_item_embedding.masked_fill_(mask, -1e9) + long_short_item_embedding = nn.Softmax(dim=-1)(long_short_item_embedding) + long_short_item_embedding = 
torch.mul(long_short_item_embedding_value, + long_short_item_embedding.unsqueeze(2)).sum(dim=1) + + return long_short_item_embedding + + def long_term_attention_based_pooling_layer(self, seq_item_embedding, user_embedding, mask=None): + """ + + get the long-term purpose of the user + """ + seq_item_embedding_value = seq_item_embedding + + seq_item_embedding = self.relu(self.long_w(seq_item_embedding) + self.long_b) + user_item_embedding = torch.matmul(seq_item_embedding, user_embedding.unsqueeze(2)).squeeze(-1) + # batch_size * seq_len + if mask is not None: + user_item_embedding.masked_fill_(mask, -1e9) + user_item_embedding = nn.Softmax(dim=1)(user_item_embedding) + user_item_embedding = torch.mul(seq_item_embedding_value, + user_item_embedding.unsqueeze(2)).sum(dim=1, keepdim=True) + # batch_size * 1 * embedding_size + + return user_item_embedding diff --git a/recbole/model/sequential_recommender/srgnn.py b/recbole/model/sequential_recommender/srgnn.py index 665e8791a..0147f1499 100644 --- a/recbole/model/sequential_recommender/srgnn.py +++ b/recbole/model/sequential_recommender/srgnn.py @@ -3,7 +3,6 @@ # @Author : Yujie Lu # @Email : yujielu1998@gmail.com - r""" SRGNN ################################################ @@ -15,16 +14,16 @@ https://github.com/CRIPAC-DIG/SR-GNN """ -import numpy as np import math +import numpy as np import torch from torch import nn from torch.nn import Parameter from torch.nn import functional as F -from recbole.model.loss import BPRLoss from recbole.model.abstract_recommender import SequentialRecommender +from recbole.model.loss import BPRLoss class GNN(nn.Module): @@ -58,25 +57,25 @@ def GNNCell(self, A, hidden): [batch_size, max_session_len, embedding_size] Returns: - torch.FloatTensor:Latent vectors of nodes,shape of [batch_size, max_session_len, embedding_size] + torch.FloatTensor: Latent vectors of nodes, shape of [batch_size, max_session_len, embedding_size] """ input_in = torch.matmul(A[:, :, :A.size(1)],
self.linear_edge_in(hidden)) + self.b_iah - input_out = torch.matmul(A[:, :, A.size(1): 2 * A.size(1)], self.linear_edge_out(hidden)) + self.b_ioh + input_out = torch.matmul(A[:, :, A.size(1):2 * A.size(1)], self.linear_edge_out(hidden)) + self.b_ioh # [batch_size, max_session_len, embedding_size * 2] inputs = torch.cat([input_in, input_out], 2) - # gi.size equals to gh.size, shape of [batch_size, max_session_len, embdding_size * 3] + # gi.size equals to gh.size, shape of [batch_size, max_session_len, embedding_size * 3] gi = F.linear(inputs, self.w_ih, self.b_ih) gh = F.linear(hidden, self.w_hh, self.b_hh) # (batch_size, max_session_len, embedding_size) i_r, i_i, i_n = gi.chunk(3, 2) h_r, h_i, h_n = gh.chunk(3, 2) - resetgate = torch.sigmoid(i_r + h_r) - inputgate = torch.sigmoid(i_i + h_i) - newgate = torch.tanh(i_n + resetgate * h_n) - hy = (1 - inputgate) * hidden + inputgate * newgate + reset_gate = torch.sigmoid(i_r + h_r) + input_gate = torch.sigmoid(i_i + h_i) + new_gate = torch.tanh(i_n + reset_gate * h_n) + hy = (1 - input_gate) * hidden + input_gate * new_gate return hy def forward(self, A, hidden): @@ -90,7 +89,7 @@ class SRGNN(SequentialRecommender): In addition to considering the connection between the item and the adjacent item, it also considers the connection with other interactive items. 
- Such as: A example of a session sequence(eg:item1, item2, item3, item2, item4) and the connecion matrix A + Such as: An example of a session sequence (e.g. item1, item2, item3, item2, item4) and the connection matrix A Outgoing edges: === ===== ===== ===== ===== diff --git a/recbole/model/sequential_recommender/stamp.py b/recbole/model/sequential_recommender/stamp.py index 71fbce412..f9982734e 100644 --- a/recbole/model/sequential_recommender/stamp.py +++ b/recbole/model/sequential_recommender/stamp.py @@ -21,8 +21,8 @@ from torch import nn from torch.nn.init import normal_ -from recbole.model.loss import BPRLoss from recbole.model.abstract_recommender import SequentialRecommender +from recbole.model.loss import BPRLoss class STAMP(SequentialRecommender): diff --git a/recbole/model/sequential_recommender/transrec.py b/recbole/model/sequential_recommender/transrec.py index fd4a20c0f..5e431c8bd 100644 --- a/recbole/model/sequential_recommender/transrec.py +++ b/recbole/model/sequential_recommender/transrec.py @@ -15,10 +15,10 @@ import torch from torch import nn -from recbole.utils import InputType from recbole.model.abstract_recommender import SequentialRecommender -from recbole.model.loss import BPRLoss, EmbLoss, RegLoss from recbole.model.init import xavier_normal_initialization +from recbole.model.loss import BPRLoss, EmbLoss, RegLoss +from recbole.utils import InputType class TransRec(SequentialRecommender): @@ -41,8 +41,8 @@ def __init__(self, config, dataset): self.user_embedding = nn.Embedding(self.n_users, self.embedding_size, padding_idx=0) self.item_embedding = nn.Embedding(self.n_items, self.embedding_size, padding_idx=0) - self.bias = nn.Embedding(self.n_items, 1, padding_idx=0) # Beta popularity bias - self.T = nn.Parameter(torch.zeros(self.embedding_size)) # average user representation 'global' + self.bias = nn.Embedding(self.n_items, 1, padding_idx=0) # Beta popularity bias + self.T = nn.Parameter(torch.zeros(self.embedding_size)) # average user
representation 'global' self.bpr_loss = BPRLoss() self.emb_loss = EmbLoss() @@ -52,21 +52,21 @@ def __init__(self, config, dataset): self.apply(xavier_normal_initialization) def _l2_distance(self, x, y): - return torch.sqrt(torch.sum((x - y)**2, dim=-1, keepdim=True)) # [B 1] + return torch.sqrt(torch.sum((x - y) ** 2, dim=-1, keepdim=True)) # [B 1] def gather_last_items(self, item_seq, gather_index): - "Gathers the last_item at the spexific positions over a minibatch" + """Gathers the last_item at the specific positions over a minibatch""" gather_index = gather_index.view(-1, 1) - last_items = item_seq.gather(index=gather_index, dim=1) # [B 1] - return last_items.squeeze(-1) # [B] + last_items = item_seq.gather(index=gather_index, dim=1) # [B 1] + return last_items.squeeze(-1) # [B] def forward(self, user, item_seq, item_seq_len): # the last item at the last position - last_items = self.gather_last_items(item_seq, item_seq_len - 1) # [B] - user_emb = self.user_embedding(user) # [B H] + last_items = self.gather_last_items(item_seq, item_seq_len - 1) # [B] + user_emb = self.user_embedding(user) # [B H] last_items_emb = self.item_embedding(last_items) # [B H] - T = self.T.expand_as(user_emb) # [B H] - seq_output = user_emb + T + last_items_emb # [B H] + T = self.T.expand_as(user_emb) # [B H] + seq_output = user_emb + T + last_items_emb # [B H] return seq_output def calculate_loss(self, interaction): @@ -117,13 +117,13 @@ def full_sort_predict(self, interaction): seq_output = self.forward(user, item_seq, item_seq_len) # [B H] - test_items_emb = self.item_embedding.weight # [item_num H] - test_items_emb = test_items_emb.repeat(seq_output.size(0), 1, 1) # [user_num item_num H] + test_items_emb = self.item_embedding.weight # [item_num H] + test_items_emb = test_items_emb.repeat(seq_output.size(0), 1, 1) # [user_num item_num H] - user_hidden = seq_output.unsqueeze(1).expand_as(test_items_emb) # [user_num item_num H] - test_bias = self.bias.weight # [item_num 1] - 
test_bias = test_bias.repeat(user_hidden.size(0), 1, 1) # [user_num item_num 1] + user_hidden = seq_output.unsqueeze(1).expand_as(test_items_emb) # [user_num item_num H] + test_bias = self.bias.weight # [item_num 1] + test_bias = test_bias.repeat(user_hidden.size(0), 1, 1) # [user_num item_num 1] - scores = test_bias - self._l2_distance(user_hidden, test_items_emb) # [user_num item_num 1] + scores = test_bias - self._l2_distance(user_hidden, test_items_emb) # [user_num item_num 1] scores = scores.squeeze(-1) # [B n_items] return scores diff --git a/recbole/properties/dataset/ml-100k.yaml b/recbole/properties/dataset/ml-100k.yaml index c513530cd..9e0beff79 100644 --- a/recbole/properties/dataset/ml-100k.yaml +++ b/recbole/properties/dataset/ml-100k.yaml @@ -27,22 +27,22 @@ ENTITY_ID_FIELD: entity_id load_col: inter: [user_id, item_id, rating, timestamp] unload_col: ~ +unused_col: ~ # Filtering -max_user_inter_num: ~ -min_user_inter_num: ~ -max_item_inter_num: ~ -min_item_inter_num: ~ +rm_dup_inter: ~ lowest_val: ~ highest_val: ~ equal_val: ~ not_equal_val: ~ -drop_filter_field : False +filter_inter_by_user_or_item: True +max_user_inter_num: ~ +min_user_inter_num: ~ +max_item_inter_num: ~ +min_item_inter_num: ~ # Preprocessing fields_in_same_space: ~ -fill_nan: True preload_weight: ~ -drop_preload_weight: True normalize_field: ~ normalize_all: True diff --git a/recbole/properties/dataset/sample.yaml b/recbole/properties/dataset/sample.yaml index 8dcb9cc23..d9869e5e2 100644 --- a/recbole/properties/dataset/sample.yaml +++ b/recbole/properties/dataset/sample.yaml @@ -21,27 +21,26 @@ load_col: inter: [user_id, item_id] # the others unload_col: ~ +unused_col: ~ additional_feat_suffix: ~ # Filtering rm_dup_inter: ~ -max_user_inter_num: ~ -min_user_inter_num: 0 -max_item_inter_num: ~ -min_item_inter_num: 0 lowest_val: ~ highest_val: ~ equal_val: ~ not_equal_val: ~ -drop_filter_field : True +filter_inter_by_user_or_item: True +max_user_inter_num: ~ +min_user_inter_num: 0 
+max_item_inter_num: ~ +min_item_inter_num: 0 # Preprocessing fields_in_same_space: ~ -fill_nan: True preload_weight: ~ -drop_preload_weight: True normalize_field: ~ -normalize_all: True +normalize_all: ~ # Sequential Model Needed ITEM_LIST_LENGTH_FIELD: item_length diff --git a/recbole/properties/model/CDAE.yaml b/recbole/properties/model/CDAE.yaml new file mode 100644 index 000000000..e1acdffbf --- /dev/null +++ b/recbole/properties/model/CDAE.yaml @@ -0,0 +1,7 @@ +loss_type: BCE +hid_activation: relu +out_activation: sigmoid +corruption_ratio: 0.5 +embedding_size: 64 +reg_weight_1: 0. +reg_weight_2: 0.01 diff --git a/recbole/properties/model/DMF.yaml b/recbole/properties/model/DMF.yaml index 56040f182..d2ca2cfa4 100644 --- a/recbole/properties/model/DMF.yaml +++ b/recbole/properties/model/DMF.yaml @@ -1,6 +1,6 @@ # WARNING: -# 1.if you set inter_matrix_type='rating', you must set drop_filter_field=False in your data config files. -# 2.The dimensions of the last layer of users and items must be the same +# 1. if you set inter_matrix_type='rating', you must set `unused_col: ~` in your data config files. +# 2. 
The dimensions of the last layer of users and items must be the same inter_matrix_type: '01' user_embedding_size: 64 diff --git a/recbole/properties/model/FOSSIL.yaml b/recbole/properties/model/FOSSIL.yaml new file mode 100644 index 000000000..1eb00d1bb --- /dev/null +++ b/recbole/properties/model/FOSSIL.yaml @@ -0,0 +1,5 @@ +embedding_size: 64 +loss_type: "CE" +reg_weight: 0.00 +order_len: 3 +alpha: 0.6 \ No newline at end of file diff --git a/recbole/properties/model/HGN.yaml b/recbole/properties/model/HGN.yaml new file mode 100644 index 000000000..69b95a440 --- /dev/null +++ b/recbole/properties/model/HGN.yaml @@ -0,0 +1,4 @@ +embedding_size: 64 +loss_type: 'BPR' +pooling_type: "average" +reg_weight: [0.00,0.00] \ No newline at end of file diff --git a/recbole/properties/model/HRM.yaml b/recbole/properties/model/HRM.yaml new file mode 100644 index 000000000..531e93c3e --- /dev/null +++ b/recbole/properties/model/HRM.yaml @@ -0,0 +1,6 @@ +embedding_size: 64 +high_order: 2 +loss_type: "CE" +dropout_prob: 0.2 +pooling_type_layer_1: "max" +pooling_type_layer_2: "max" \ No newline at end of file diff --git a/recbole/properties/model/LINE.yaml b/recbole/properties/model/LINE.yaml new file mode 100644 index 000000000..eefa06f81 --- /dev/null +++ b/recbole/properties/model/LINE.yaml @@ -0,0 +1,3 @@ +embedding_size: 64 +order: 2 +second_order_loss_weight: 1 \ No newline at end of file diff --git a/recbole/properties/model/MacridVAE.yaml b/recbole/properties/model/MacridVAE.yaml new file mode 100644 index 000000000..db3b426db --- /dev/null +++ b/recbole/properties/model/MacridVAE.yaml @@ -0,0 +1,10 @@ +embedding_size: 64 +drop_out: 0.5 +kfac: 10 +nogb: False +std: 0.01 +encoder_hidden_size: [600] +tau: 0.1 +anneal_cap: 0.2 +total_anneal_steps: 200000 +reg_weights: [0, 0] \ No newline at end of file diff --git a/recbole/properties/model/MultiDAE.yaml b/recbole/properties/model/MultiDAE.yaml new file mode 100644 index 000000000..38e1f44e3 --- /dev/null +++ 
b/recbole/properties/model/MultiDAE.yaml @@ -0,0 +1,3 @@ +mlp_hidden_size: [600] +latent_dimension: 64 +dropout_prob: 0.5 \ No newline at end of file diff --git a/recbole/properties/model/MultiVAE.yaml b/recbole/properties/model/MultiVAE.yaml new file mode 100644 index 000000000..63021d9b0 --- /dev/null +++ b/recbole/properties/model/MultiVAE.yaml @@ -0,0 +1,5 @@ +mlp_hidden_size: [600] +latent_dimension: 128 +dropout_prob: 0.5 +anneal_cap: 0.2 +total_anneal_steps: 200000 \ No newline at end of file diff --git a/recbole/properties/model/NPE.yaml b/recbole/properties/model/NPE.yaml new file mode 100644 index 000000000..cb93282a7 --- /dev/null +++ b/recbole/properties/model/NPE.yaml @@ -0,0 +1,3 @@ +embedding_size: 64 +loss_type: "CE" +dropout_prob: 0.3 \ No newline at end of file diff --git a/recbole/properties/model/RepeatNet.yaml b/recbole/properties/model/RepeatNet.yaml new file mode 100644 index 000000000..dac46bacc --- /dev/null +++ b/recbole/properties/model/RepeatNet.yaml @@ -0,0 +1,5 @@ +embedding_size: 64 +loss_type: "CE" +hidden_size: 64 +joint_train: False +dropout_prob: 0.5 diff --git a/recbole/properties/model/SHAN.yaml b/recbole/properties/model/SHAN.yaml new file mode 100644 index 000000000..9c83bf5c8 --- /dev/null +++ b/recbole/properties/model/SHAN.yaml @@ -0,0 +1,4 @@ +embedding_size: 64 +short_item_length: 2 +loss_type: "CE" +reg_weight: [0.01,0.0001] \ No newline at end of file diff --git a/recbole/properties/model/xgboost.yaml b/recbole/properties/model/xgboost.yaml new file mode 100644 index 000000000..6a6620bf1 --- /dev/null +++ b/recbole/properties/model/xgboost.yaml @@ -0,0 +1,36 @@ +# Type of training method +convert_token_to_onehot: False +token_num_threshold: 10000 + +# DMatrix +xgb_weight: ~ +xgb_base_margin: ~ +xgb_missing: ~ +xgb_silent: ~ +xgb_feature_names: ~ +xgb_feature_types: ~ +xgb_nthread: ~ + +xgb_model: ~ +xgb_params: + booster: gbtree + objective: binary:logistic + eval_metric: ['auc','logloss'] + # gamma: 0.1 + max_depth: 3 
+ # lambda: 1 + # subsample: 0.7 + # colsample_bytree: 0.7 + # min_child_weight: 3 + eta: 1 + seed: 2020 + # nthread: -1 +xgb_num_boost_round: 500 +# xgb_evals: ~ +xgb_obj: ~ +xgb_feval: ~ +xgb_maximize: ~ +xgb_early_stopping_rounds: ~ +# xgb_evals_result: ~ +xgb_verbose_eval: 100 + diff --git a/recbole/properties/overall.yaml b/recbole/properties/overall.yaml index 855804ee3..faa8591ba 100644 --- a/recbole/properties/overall.yaml +++ b/recbole/properties/overall.yaml @@ -6,6 +6,7 @@ state: INFO reproducibility: True data_path: 'dataset/' checkpoint_dir: 'saved' +show_progress: True # training settings epochs: 300 @@ -13,16 +14,22 @@ train_batch_size: 2048 learner: adam learning_rate: 0.001 training_neg_sample_num: 1 +training_neg_sample_distribution: uniform eval_step: 1 stopping_step: 10 +clip_grad_norm: ~ +# clip_grad_norm: {'max_norm': 5, 'norm_type': 2} +weight_decay: 0.0 # evaluation settings eval_setting: RO_RS,full group_by_user: True split_ratio: [0.8,0.1,0.1] leave_one_num: 2 -real_time_process: True +real_time_process: False metrics: ["Recall", "MRR","NDCG","Hit","Precision"] topk: [10] valid_metric: MRR@10 eval_batch_size: 4096 +loss_decimal_place: 4 +metric_decimal_place: 4 diff --git a/recbole/properties/quick_start_config/context-aware.yaml b/recbole/properties/quick_start_config/context-aware.yaml new file mode 100644 index 000000000..cdb71f098 --- /dev/null +++ b/recbole/properties/quick_start_config/context-aware.yaml @@ -0,0 +1,5 @@ +eval_setting: RO_RS +group_by_user: False +training_neg_sample_num: 0 +metrics: ['AUC', 'LogLoss'] +valid_metric: AUC \ No newline at end of file diff --git a/recbole/properties/quick_start_config/context-aware_ml-100k.yaml b/recbole/properties/quick_start_config/context-aware_ml-100k.yaml new file mode 100644 index 000000000..150dd4d18 --- /dev/null +++ b/recbole/properties/quick_start_config/context-aware_ml-100k.yaml @@ -0,0 +1,5 @@ +threshold: {'rating': 4} +load_col: + inter: ['user_id', 'item_id', 'rating', 
'timestamp'] + user: ['user_id', 'age', 'gender', 'occupation'] + item: ['item_id', 'release_year', 'class'] \ No newline at end of file diff --git a/recbole/properties/quick_start_config/knowledge_base.yaml b/recbole/properties/quick_start_config/knowledge_base.yaml new file mode 100644 index 000000000..379341326 --- /dev/null +++ b/recbole/properties/quick_start_config/knowledge_base.yaml @@ -0,0 +1,4 @@ +load_col: + inter: ['user_id', 'item_id', 'rating', 'timestamp'] + kg: ['head_id', 'relation_id', 'tail_id'] + link: ['item_id', 'entity_id'] \ No newline at end of file diff --git a/recbole/properties/quick_start_config/sequential.yaml b/recbole/properties/quick_start_config/sequential.yaml new file mode 100644 index 000000000..87c0fa053 --- /dev/null +++ b/recbole/properties/quick_start_config/sequential.yaml @@ -0,0 +1 @@ +eval_setting: TO_LS,full \ No newline at end of file diff --git a/recbole/properties/quick_start_config/sequential_DIN.yaml b/recbole/properties/quick_start_config/sequential_DIN.yaml new file mode 100644 index 000000000..58b8db955 --- /dev/null +++ b/recbole/properties/quick_start_config/sequential_DIN.yaml @@ -0,0 +1,3 @@ +eval_setting: TO_LS, uni100 +metrics: ['AUC', 'LogLoss'] +valid_metric: AUC \ No newline at end of file diff --git a/recbole/properties/quick_start_config/sequential_DIN_on_ml-100k.yaml b/recbole/properties/quick_start_config/sequential_DIN_on_ml-100k.yaml new file mode 100644 index 000000000..702a7a862 --- /dev/null +++ b/recbole/properties/quick_start_config/sequential_DIN_on_ml-100k.yaml @@ -0,0 +1,4 @@ +load_col: + inter: ['user_id', 'item_id', 'rating', 'timestamp'] + user: ['user_id', 'age', 'gender', 'occupation'] + item: ['item_id', 'release_year'] \ No newline at end of file diff --git a/recbole/properties/quick_start_config/sequential_embedding_model.yaml b/recbole/properties/quick_start_config/sequential_embedding_model.yaml new file mode 100644 index 000000000..59b920994 --- /dev/null +++ 
b/recbole/properties/quick_start_config/sequential_embedding_model.yaml @@ -0,0 +1,4 @@ +load_col: + inter: ['user_id', 'item_id', 'rating', 'timestamp'] + ent: ['ent_id', 'ent_emb'] +additional_feat_suffix: ent \ No newline at end of file diff --git a/recbole/properties/quick_start_config/special_sequential_on_ml-100k.yaml b/recbole/properties/quick_start_config/special_sequential_on_ml-100k.yaml new file mode 100644 index 000000000..1fe509fe6 --- /dev/null +++ b/recbole/properties/quick_start_config/special_sequential_on_ml-100k.yaml @@ -0,0 +1,3 @@ +load_col: + inter: ['user_id', 'item_id', 'rating', 'timestamp'] + item: ['item_id', 'release_year', 'class'] \ No newline at end of file diff --git a/recbole/quick_start/quick_start.py b/recbole/quick_start/quick_start.py index 2294ca15e..66aae1c1c 100644 --- a/recbole/quick_start/quick_start.py +++ b/recbole/quick_start/quick_start.py @@ -8,9 +8,10 @@ """ import logging from logging import getLogger -from recbole.utils import init_logger, get_model, get_trainer, init_seed + from recbole.config import Config from recbole.data import create_dataset, data_preparation +from recbole.utils import init_logger, get_model, get_trainer, init_seed def run_recbole(model=None, dataset=None, config_file_list=None, config_dict=None, saved=True): @@ -49,10 +50,12 @@ def run_recbole(model=None, dataset=None, config_file_list=None, config_dict=Non trainer = get_trainer(config['MODEL_TYPE'], config['model'])(config, model) # model training - best_valid_score, best_valid_result = trainer.fit(train_data, valid_data, saved=saved) + best_valid_score, best_valid_result = trainer.fit( + train_data, valid_data, saved=saved, show_progress=config['show_progress'] + ) # model evaluation - test_result = trainer.evaluate(test_data, load_best_model=saved) + test_result = trainer.evaluate(test_data, load_best_model=saved, show_progress=config['show_progress']) logger.info('best valid result: {}'.format(best_valid_result)) logger.info('test result: 
{}'.format(test_result)) diff --git a/recbole/sampler/sampler.py b/recbole/sampler/sampler.py index 031482be3..e0e1a0d9b 100644 --- a/recbole/sampler/sampler.py +++ b/recbole/sampler/sampler.py @@ -13,9 +13,10 @@ ######################## """ -import random import copy + import numpy as np +import torch class AbstractSampler(object): @@ -32,27 +33,40 @@ class AbstractSampler(object): random_list (list or numpy.ndarray): The shuffled result of :meth:`get_random_list`. used_ids (numpy.ndarray): The result of :meth:`get_used_ids`. """ + def __init__(self, distribution): - self.distribution = distribution + self.distribution = '' + self.random_list = [] + self.random_pr = 0 + self.random_list_length = 0 + self.set_distribution(distribution) + self.used_ids = self.get_used_ids() + + def set_distribution(self, distribution): + """Set the distribution of sampler. + Args: + distribution (str): Distribution of the negative items. + """ + if self.distribution == distribution: + return + self.distribution = distribution self.random_list = self.get_random_list() - random.shuffle(self.random_list) + np.random.shuffle(self.random_list) self.random_pr = 0 self.random_list_length = len(self.random_list) - self.used_ids = self.get_used_ids() - def get_random_list(self): """ Returns: - np.ndarray or list: Random list of value_id. + numpy.ndarray or list: Random list of value_id. """ raise NotImplementedError('method [get_random_list] should be implemented') def get_used_ids(self): """ Returns: - np.ndarray: Used ids. Index is key_id, and element is a set of value_ids. + numpy.ndarray: Used ids. Index is key_id, and element is a set of value_ids. """ raise NotImplementedError('method [get_used_ids] should be implemented') @@ -65,36 +79,84 @@ def random(self): self.random_pr += 1 return value_id - def sample_by_key_ids(self, key_ids, num, used_ids): + def random_num(self, num): + """ + Args: + num (int): Number of random value_ids. 
+ + Returns: + value_ids (numpy.ndarray): Random value_ids. Generated by :attr:`random_list`. + """ + value_id = [] + self.random_pr %= self.random_list_length + while True: + if self.random_pr + num <= self.random_list_length: + value_id.append(self.random_list[self.random_pr:self.random_pr + num]) + self.random_pr += num + break + else: + value_id.append(self.random_list[self.random_pr:]) + num -= self.random_list_length - self.random_pr + self.random_pr = 0 + return np.concatenate(value_id) + + def sample_by_key_ids(self, key_ids, num): """Sampling by key_ids. Args: - key_ids (np.ndarray or list): Input key_ids. + key_ids (numpy.ndarray or list): Input key_ids. num (int): Number of sampled value_ids for each key_id. - used_ids (np.ndarray): Used ids. index is key_id, and element is a set of value_ids. Returns: - np.ndarray: Sampled value_ids. + torch.tensor: Sampled value_ids. value_ids[0], value_ids[len(key_ids)], value_ids[len(key_ids) * 2], ..., value_id[len(key_ids) * (num - 1)] is sampled for key_ids[0]; value_ids[1], value_ids[len(key_ids) + 1], value_ids[len(key_ids) * 2 + 1], ..., value_id[len(key_ids) * (num - 1) + 1] is sampled for key_ids[1]; ...; and so on. 
""" + key_ids = np.array(key_ids) key_num = len(key_ids) total_num = key_num * num - value_ids = np.zeros(total_num, dtype=np.int64) - used_id_list = np.tile(used_ids, num) - for i, used_ids in enumerate(used_id_list): - cur = self.random() - while cur in used_ids: - cur = self.random() - value_ids[i] = cur - return value_ids + if (key_ids == key_ids[0]).all(): + key_id = key_ids[0] + used = np.array(list(self.used_ids[key_id])) + value_ids = self.random_num(total_num) + check_list = np.arange(total_num)[np.isin(value_ids, used)] + while len(check_list) > 0: + value_ids[check_list] = value = self.random_num(len(check_list)) + perm = value.argsort(kind='quicksort') + aux = value[perm] + mask = np.empty(aux.shape, dtype=np.bool_) + mask[:1] = True + mask[1:] = aux[1:] != aux[:-1] + value = aux[mask] + rev_idx = np.empty(mask.shape, dtype=np.intp) + rev_idx[perm] = np.cumsum(mask) - 1 + ar = np.concatenate((value, used)) + order = ar.argsort(kind='mergesort') + sar = ar[order] + bool_ar = (sar[1:] == sar[:-1]) + flag = np.concatenate((bool_ar, [False])) + ret = np.empty(ar.shape, dtype=bool) + ret[order] = flag + mask = ret[rev_idx] + check_list = check_list[mask] + else: + value_ids = np.zeros(total_num, dtype=np.int64) + check_list = np.arange(total_num) + key_ids = np.tile(key_ids, num) + while len(check_list) > 0: + value_ids[check_list] = self.random_num(len(check_list)) + check_list = np.array([ + i for i, used, v in zip(check_list, self.used_ids[key_ids[check_list]], value_ids[check_list]) + if v in used + ]) + return torch.tensor(value_ids) class Sampler(AbstractSampler): """:class:`Sampler` is used to sample negative items for each input user. 
In order to avoid positive items - in train-phase to be sampled in vaild-phase, and positive items in train-phase or vaild-phase to be sampled + in train-phase to be sampled in valid-phase, and positive items in train-phase or valid-phase to be sampled in test-phase, we need to input the datasets of all phases for pre-processing. And, before using this sampler, it is needed to call :meth:`set_phase` to get the sampler of corresponding phase. @@ -106,13 +168,14 @@ class Sampler(AbstractSampler): Attributes: phase (str): the phase of sampler. It will not be set until :meth:`set_phase` is called. """ + def __init__(self, phases, datasets, distribution='uniform'): if not isinstance(phases, list): phases = [phases] if not isinstance(datasets, list): datasets = [datasets] if len(phases) != len(datasets): - raise ValueError('phases {} and datasets {} should have the same length'.format(phases, datasets)) + raise ValueError(f'Phases {phases} and datasets {datasets} should have the same length.') self.phases = phases self.datasets = datasets @@ -128,31 +191,39 @@ def __init__(self, phases, datasets, distribution='uniform'): def get_random_list(self): """ Returns: - np.ndarray or list: Random list of item_id. + numpy.ndarray or list: Random list of item_id. """ if self.distribution == 'uniform': - return list(range(1, self.n_items)) + return np.arange(1, self.n_items) elif self.distribution == 'popularity': random_item_list = [] for dataset in self.datasets: - random_item_list.extend(dataset.inter_feat[self.iid_field].values) + random_item_list.extend(dataset.inter_feat[self.iid_field].numpy()) return random_item_list else: - raise NotImplementedError('Distribution [{}] has not been implemented'.format(self.distribution)) + raise NotImplementedError(f'Distribution [{self.distribution}] has not been implemented.') def get_used_ids(self): """ Returns: dict: Used item_ids is the same as positive item_ids. 
- Key is phase, and value is a np.ndarray which index is user_id, and element is a set of item_ids. + Key is phase, and value is a numpy.ndarray whose index is user_id, and element is a set of item_ids. """ used_item_id = dict() - last = [set() for i in range(self.n_users)] + last = [set() for _ in range(self.n_users)] for phase, dataset in zip(self.phases, self.datasets): cur = np.array([set(s) for s in last]) - for uid, iid in dataset.inter_feat[[self.uid_field, self.iid_field]].values: + for uid, iid in zip(dataset.inter_feat[self.uid_field].numpy(), dataset.inter_feat[self.iid_field].numpy()): cur[uid].add(iid) last = used_item_id[phase] = cur + + for used_item_set in used_item_id[self.phases[-1]]: + if len(used_item_set) + 1 == self.n_items:  # [pad] is an item. + raise ValueError( + 'Some users have interacted with all items, ' + 'so we cannot sample negative items for them. ' + 'Please set `max_user_inter_num` to filter those users.' + ) return used_item_id def set_phase(self, phase): @@ -166,7 +237,7 @@ def set_phase(self, phase): is set to the value of corresponding phase. """ if phase not in self.phases: - raise ValueError('phase [{}] not exist'.format(phase)) + raise ValueError(f'Phase [{phase}] does not exist.') new_sampler = copy.copy(self) new_sampler.phase = phase new_sampler.used_ids = new_sampler.used_ids[phase] @@ -176,22 +247,22 @@ def sample_by_user_ids(self, user_ids, num): """Sampling by user_ids. Args: - user_ids (np.ndarray or list): Input user_ids. + user_ids (numpy.ndarray or list): Input user_ids. num (int): Number of sampled item_ids for each user_id. Returns: - np.ndarray: Sampled item_ids. + torch.tensor: Sampled item_ids. item_ids[0], item_ids[len(user_ids)], item_ids[len(user_ids) * 2], ..., item_id[len(user_ids) * (num - 1)] is sampled for user_ids[0]; item_ids[1], item_ids[len(user_ids) + 1], item_ids[len(user_ids) * 2 + 1], ..., item_id[len(user_ids) * (num - 1) + 1] is sampled for user_ids[1]; ...; and so on.
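The phase-cumulative `used_ids` built in `get_used_ids` above (valid excludes train positives, test excludes train and valid positives) reduces to the following toy sketch, with hypothetical interaction pairs in place of `dataset.inter_feat`:

```python
n_users = 3
phases = ['train', 'valid', 'test']
inters = {                        # (user_id, item_id) pairs per phase
    'train': [(1, 10), (2, 20)],
    'valid': [(1, 11)],
    'test':  [(2, 21)],
}

used_ids = {}
last = [set() for _ in range(n_users)]
for phase in phases:
    cur = [set(s) for s in last]  # copy, so each phase extends the previous one
    for uid, iid in inters[phase]:
        cur[uid].add(iid)
    last = used_ids[phase] = cur
```

After this loop, `used_ids['valid'][1]` contains both user 1's train and valid positives, which is exactly what prevents train items from being sampled as valid-phase negatives.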
""" try: - return self.sample_by_key_ids(user_ids, num, self.used_ids[user_ids]) + return self.sample_by_key_ids(user_ids, num) except IndexError: for user_id in user_ids: if user_id < 0 or user_id >= self.n_users: - raise ValueError('user_id [{}] not exist'.format(user_id)) + raise ValueError(f'user_id [{user_id}] not exist.') class KGSampler(AbstractSampler): @@ -201,6 +272,7 @@ class KGSampler(AbstractSampler): dataset (Dataset): The knowledge graph dataset, which contains triplets in a knowledge graph. distribution (str, optional): Distribution of the negative entities. Defaults to 'uniform'. """ + def __init__(self, dataset, distribution='uniform'): self.dataset = dataset @@ -217,46 +289,53 @@ def __init__(self, dataset, distribution='uniform'): def get_random_list(self): """ Returns: - np.ndarray or list: Random list of entity_id. + numpy.ndarray or list: Random list of entity_id. """ if self.distribution == 'uniform': - return list(range(1, self.entity_num)) + return np.arange(1, self.entity_num) elif self.distribution == 'popularity': return list(self.hid_list) + list(self.tid_list) else: - raise NotImplementedError('Distribution [{}] has not been implemented'.format(self.distribution)) + raise NotImplementedError(f'Distribution [{self.distribution}] has not been implemented.') def get_used_ids(self): """ Returns: - np.ndarray: Used entity_ids is the same as tail_entity_ids in knowledge graph. + numpy.ndarray: Used entity_ids is the same as tail_entity_ids in knowledge graph. Index is head_entity_id, and element is a set of tail_entity_ids. """ - used_tail_entity_id = np.array([set() for i in range(self.entity_num)]) + used_tail_entity_id = np.array([set() for _ in range(self.entity_num)]) for hid, tid in zip(self.hid_list, self.tid_list): used_tail_entity_id[hid].add(tid) + + for used_tail_set in used_tail_entity_id: + if len(used_tail_set) + 1 == self.entity_num: # [pad] is a entity. 
+ raise ValueError( + 'Some head entities are related to all entities, ' + 'so we cannot sample negative entities for them.' + ) return used_tail_entity_id def sample_by_entity_ids(self, head_entity_ids, num=1): """Sampling by head_entity_ids. Args: - head_entity_ids (np.ndarray or list): Input head_entity_ids. + head_entity_ids (numpy.ndarray or list): Input head_entity_ids. num (int, optional): Number of sampled entity_ids for each head_entity_id. Defaults to ``1``. Returns: - np.ndarray: Sampled entity_ids. + torch.tensor: Sampled entity_ids. entity_ids[0], entity_ids[len(head_entity_ids)], entity_ids[len(head_entity_ids) * 2], ..., entity_id[len(head_entity_ids) * (num - 1)] is sampled for head_entity_ids[0]; entity_ids[1], entity_ids[len(head_entity_ids) + 1], entity_ids[len(head_entity_ids) * 2 + 1], ..., entity_id[len(head_entity_ids) * (num - 1) + 1] is sampled for head_entity_ids[1]; ...; and so on. """ try: - return self.sample_by_key_ids(head_entity_ids, num, self.used_ids[head_entity_ids]) + return self.sample_by_key_ids(head_entity_ids, num) except IndexError: for head_entity_id in head_entity_ids: if head_entity_id not in self.head_entities: - raise ValueError('head_entity_id [{}] not exist'.format(head_entity_id)) + raise ValueError(f'head_entity_id [{head_entity_id}] does not exist.') class RepeatableSampler(AbstractSampler): @@ -271,6 +350,7 @@ class RepeatableSampler(AbstractSampler): Attributes: phase (str): the phase of sampler. It will not be set until :meth:`set_phase` is called.
""" + def __init__(self, phases, dataset, distribution='uniform'): if not isinstance(phases, list): phases = [phases] @@ -278,51 +358,51 @@ def __init__(self, phases, dataset, distribution='uniform'): self.dataset = dataset self.iid_field = dataset.iid_field - self.user_num = dataset.user_num - self.item_num = dataset.item_num + self.n_users = dataset.user_num + self.n_items = dataset.item_num super().__init__(distribution=distribution) def get_random_list(self): """ Returns: - np.ndarray or list: Random list of item_id. + numpy.ndarray or list: Random list of item_id. """ if self.distribution == 'uniform': - return list(range(1, self.item_num)) + return np.arange(1, self.n_items) elif self.distribution == 'popularity': - return self.dataset.inter_feat[self.iid_field].values + return self.dataset.inter_feat[self.iid_field].numpy() else: - raise NotImplementedError('Distribution [{}] has not been implemented'.format(self.distribution)) + raise NotImplementedError(f'Distribution [{self.distribution}] has not been implemented.') def get_used_ids(self): """ Returns: - np.ndarray: Used item_ids is the same as positive item_ids. + numpy.ndarray: Used item_ids is the same as positive item_ids. Index is user_id, and element is a set of item_ids. """ - return np.array([set() for i in range(self.user_num)]) + return np.array([set() for _ in range(self.n_users)]) def sample_by_user_ids(self, user_ids, num): """Sampling by user_ids. Args: - user_ids (np.ndarray or list): Input user_ids. + user_ids (numpy.ndarray or list): Input user_ids. num (int): Number of sampled item_ids for each user_id. Returns: - np.ndarray: Sampled item_ids. + torch.tensor: Sampled item_ids. item_ids[0], item_ids[len(user_ids)], item_ids[len(user_ids) * 2], ..., item_id[len(user_ids) * (num - 1)] is sampled for user_ids[0]; item_ids[1], item_ids[len(user_ids) + 1], item_ids[len(user_ids) * 2 + 1], ..., item_id[len(user_ids) * (num - 1) + 1] is sampled for user_ids[1]; ...; and so on. 
""" try: - return self.sample_by_key_ids(user_ids, num, self.used_ids[user_ids]) + return self.sample_by_key_ids(user_ids, num) except IndexError: for user_id in user_ids: if user_id < 0 or user_id >= self.n_users: - raise ValueError('user_id [{}] not exist'.format(user_id)) + raise ValueError(f'user_id [{user_id}] not exist.') def set_phase(self, phase): """Get the sampler of corresponding phase. @@ -334,7 +414,7 @@ def set_phase(self, phase): Sampler: the copy of this sampler, and :attr:`phase` is set the same as input phase. """ if phase not in self.phases: - raise ValueError('phase [{}] not exist'.format(phase)) + raise ValueError(f'Phase [{phase}] not exist.') new_sampler = copy.copy(self) new_sampler.phase = phase return new_sampler diff --git a/recbole/trainer/hyper_tuning.py b/recbole/trainer/hyper_tuning.py index 1d7f55dc0..dda0c49e5 100644 --- a/recbole/trainer/hyper_tuning.py +++ b/recbole/trainer/hyper_tuning.py @@ -9,9 +9,10 @@ ############################ """ -import numpy as np from functools import partial +import numpy as np + from recbole.utils.utils import dict2str @@ -43,7 +44,6 @@ def _parameters(space): if isinstance(space, dict): space = list(space.values()) for node in _recursiveFindNodes(space, 'switch'): - # Find the name of this parameter paramNode = node.pos_args[0] assert paramNode.name == 'hyperopt_param' @@ -75,8 +75,10 @@ def _validate_space_exhaustive_search(space): for node in dfs(as_apply(space)): if node.name in implicit_stochastic_symbols: if node.name not in supported_stochastic_symbols: - raise ExhaustiveSearchError('Exhaustive search is only possible with the following stochastic symbols: ' - '' + ', '.join(supported_stochastic_symbols)) + raise ExhaustiveSearchError( + 'Exhaustive search is only possible with the following stochastic symbols: ' + '' + ', '.join(supported_stochastic_symbols) + ) def exhaustive_search(new_ids, domain, trials, seed, nbMaxSucessiveFailures=1000): @@ -86,8 +88,12 @@ def exhaustive_search(new_ids, 
domain, trials, seed, nbMaxSucessiveFailures=1000 from hyperopt import pyll from hyperopt.base import miscs_update_idxs_vals # Build a hash set for previous trials - hashset = set([hash(frozenset([(key, value[0]) if len(value) > 0 else ((key, None)) - for key, value in trial['misc']['vals'].items()])) for trial in trials.trials]) + hashset = set([ + hash( + frozenset([(key, value[0]) if len(value) > 0 else ((key, None)) + for key, value in trial['misc']['vals'].items()]) + ) for trial in trials.trials + ]) rng = np.random.RandomState(seed) rval = [] @@ -96,19 +102,16 @@ def exhaustive_search(new_ids, domain, trials, seed, nbMaxSucessiveFailures=1000 nbSucessiveFailures = 0 while not newSample: # -- sample new specs, idxs, vals - idxs, vals = pyll.rec_eval( - domain.s_idxs_vals, - memo={ - domain.s_new_ids: [new_id], - domain.s_rng: rng, - }) + idxs, vals = pyll.rec_eval(domain.s_idxs_vals, memo={ + domain.s_new_ids: [new_id], + domain.s_rng: rng, + }) new_result = domain.new_result() new_misc = dict(tid=new_id, cmd=domain.cmd, workdir=domain.workdir) miscs_update_idxs_vals([new_misc], idxs, vals) # Compare with previous hashes - h = hash(frozenset([(key, value[0]) if len(value) > 0 else ( - (key, None)) for key, value in vals.items()])) + h = hash(frozenset([(key, value[0]) if len(value) > 0 else ((key, None)) for key, value in vals.items()])) if h not in hashset: newSample = True else: @@ -119,8 +122,7 @@ def exhaustive_search(new_ids, domain, trials, seed, nbMaxSucessiveFailures=1000 # No more samples to produce return [] - rval.extend(trials.new_trial_docs([new_id], - [None], [new_result], [new_misc])) + rval.extend(trials.new_trial_docs([new_id], [None], [new_result], [new_misc])) return rval @@ -136,8 +138,16 @@ class HyperTuning(object): https://github.com/hyperopt/hyperopt/issues/200 """ - def __init__(self, objective_function, space=None, params_file=None, fixed_config_file_list=None, - algo='exhaustive', max_evals=100): + def __init__( + self, + 
objective_function, + space=None, + params_file=None, + params_dict=None, + fixed_config_file_list=None, + algo='exhaustive', + max_evals=100 + ): self.best_score = None self.best_params = None self.best_test_result = None @@ -150,8 +160,10 @@ def __init__(self, objective_function, space=None, params_file=None, fixed_confi self.space = space elif params_file: self.space = self._build_space_from_file(params_file) + elif params_dict: + self.space = self._build_space_from_dict(params_dict) else: - raise ValueError('at least one of `space` and `params_file` is provided') + raise ValueError('at least one of `space`, `params_file` and `params_dict` is provided') if isinstance(algo, str): if algo == 'exhaustive': self.algo = partial(exhaustive_search, nbMaxSucessiveFailures=1000) @@ -187,6 +199,38 @@ def _build_space_from_file(file): raise ValueError('Illegal param type [{}]'.format(para_type)) return space + @staticmethod + def _build_space_from_dict(config_dict): + from hyperopt import hp + space = {} + for para_type in config_dict: + if para_type == 'choice': + for para_name in config_dict['choice']: + para_value = config_dict['choice'][para_name] + space[para_name] = hp.choice(para_name, para_value) + elif para_type == 'uniform': + for para_name in config_dict['uniform']: + para_value = config_dict['uniform'][para_name] + low = para_value[0] + high = para_value[1] + space[para_name] = hp.uniform(para_name, float(low), float(high)) + elif para_type == 'quniform': + for para_name in config_dict['quniform']: + para_value = config_dict['quniform'][para_name] + low = para_value[0] + high = para_value[1] + q = para_value[2] + space[para_name] = hp.quniform(para_name, float(low), float(high), float(q)) + elif para_type == 'loguniform': + for para_name in config_dict['loguniform']: + para_value = config_dict['loguniform'][para_name] + low = para_value[0] + high = para_value[1] + space[para_name] = hp.loguniform(para_name, float(low), float(high)) + else: + raise 
ValueError('Illegal param type [{}]'.format(para_type)) + return space + @staticmethod def params2str(params): r""" convert dict to str @@ -254,7 +298,7 @@ def trial(self, params): self._print_result(result_dict) if bigger: - score = - score + score = -score return {'loss': score, 'status': hyperopt.STATUS_OK} def run(self): diff --git a/recbole/trainer/trainer.py b/recbole/trainer/trainer.py index 34a762c42..9d2c3a8ae 100644 --- a/recbole/trainer/trainer.py +++ b/recbole/trainer/trainer.py @@ -3,9 +3,9 @@ # @Email : slmu@ruc.edu.cn # UPDATE: -# @Time : 2020/8/7, 2020/9/26, 2020/9/26, 2020/10/01, 2020/9/16, 2020/10/8, 2020/10/15 -# @Author : Zihan Lin, Yupeng Hou, Yushuo Chen, Shanlei Mu, Xingyu Pan, Hui Wang, Xinyan Fan -# @Email : linzihan.super@foxmail.com, houyupeng@ruc.edu.cn, chenyushuo@ruc.edu.cn, slmu@ruc.edu.cn, panxy@ruc.edu.cn, hui.wang@ruc.edu.cn, xinyan.fan@ruc.edu.cn +# @Time : 2020/8/7, 2020/9/26, 2020/9/26, 2020/10/01, 2020/9/16, 2020/10/8, 2020/10/15, 2020/11/20 +# @Author : Zihan Lin, Yupeng Hou, Yushuo Chen, Shanlei Mu, Xingyu Pan, Hui Wang, Xinyan Fan, Chen Yang +# @Email : linzihan.super@foxmail.com, houyupeng@ruc.edu.cn, chenyushuo@ruc.edu.cn, slmu@ruc.edu.cn, panxy@ruc.edu.cn, hui.wang@ruc.edu.cn, xinyan.fan@ruc.edu.cn, 254170321@qq.com r""" recbole.trainer.trainer @@ -13,20 +13,19 @@ """ import os -import itertools +from logging import getLogger +from time import time + +import numpy as np import torch import torch.optim as optim from torch.nn.utils.clip_grad import clip_grad_norm_ -import numpy as np -import matplotlib.pyplot as plt +from tqdm import tqdm -from time import time -from logging import getLogger - -from recbole.evaluator import TopKEvaluator, LossEvaluator from recbole.data.interaction import Interaction +from recbole.evaluator import ProxyEvaluator from recbole.utils import ensure_dir, get_local_time, early_stopping, calculate_valid_score, dict2str, \ - DataLoaderType, KGDataLoaderState, EvaluatorType + DataLoaderType, 
KGDataLoaderState class AbstractTrainer(object): @@ -64,7 +63,7 @@ class Trainer(AbstractTrainer): Initializing the Trainer needs two parameters: `config` and `model`. `config` records the parameters information for controlling training and evaluation, such as `learning_rate`, `epochs`, `eval_step` and so on. - More information can be found in [placeholder]. `model` is the instantiated object of a Model Class. + `model` is the instantiated object of a Model Class. """ @@ -86,6 +85,7 @@ def __init__(self, config, model): ensure_dir(self.checkpoint_dir) saved_model_file = '{}-{}.pth'.format(self.config['model'], get_local_time()) self.saved_model_file = os.path.join(self.checkpoint_dir, saved_model_file) + self.weight_decay = config['weight_decay'] self.start_epoch = 0 self.cur_step = 0 @@ -94,11 +94,7 @@ def __init__(self, config, model): self.train_loss_dict = dict() self.optimizer = self._build_optimizer() self.eval_type = config['eval_type'] - if self.eval_type == EvaluatorType.INDIVIDUAL: - self.evaluator = LossEvaluator(config) - else: - self.evaluator = TopKEvaluator(config) - + self.evaluator = ProxyEvaluator(config) self.item_tensor = None self.tot_item_num = None @@ -109,19 +105,23 @@ def _build_optimizer(self): torch.optim: the optimizer """ if self.learner.lower() == 'adam': - optimizer = optim.Adam(self.model.parameters(), lr=self.learning_rate) + optimizer = optim.Adam(self.model.parameters(), lr=self.learning_rate, weight_decay=self.weight_decay) elif self.learner.lower() == 'sgd': - optimizer = optim.SGD(self.model.parameters(), lr=self.learning_rate) + optimizer = optim.SGD(self.model.parameters(), lr=self.learning_rate, weight_decay=self.weight_decay) elif self.learner.lower() == 'adagrad': - optimizer = optim.Adagrad(self.model.parameters(), lr=self.learning_rate) + optimizer = optim.Adagrad(self.model.parameters(), lr=self.learning_rate, weight_decay=self.weight_decay) elif self.learner.lower() == 'rmsprop': - optimizer = 
optim.RMSprop(self.model.parameters(), lr=self.learning_rate) + optimizer = optim.RMSprop(self.model.parameters(), lr=self.learning_rate, weight_decay=self.weight_decay) + elif self.learner.lower() == 'sparse_adam': + optimizer = optim.SparseAdam(self.model.parameters(), lr=self.learning_rate) + if self.weight_decay > 0: + self.logger.warning(f'Sparse Adam does not support weight_decay; the received argument [{self.weight_decay}] will be ignored.') else: self.logger.warning('Received unrecognized optimizer, set default Adam optimizer') optimizer = optim.Adam(self.model.parameters(), lr=self.learning_rate) return optimizer - def _train_epoch(self, train_data, epoch_idx, loss_func=None): + def _train_epoch(self, train_data, epoch_idx, loss_func=None, show_progress=False): r"""Train the model in an epoch Args: @@ -129,16 +129,24 @@ def _train_epoch(self, train_data, epoch_idx, loss_func=None): epoch_idx (int): The current epoch id. loss_func (function): The loss function of :attr:`model`. If it is ``None``, the loss function will be :attr:`self.model.calculate_loss`. Defaults to ``None``. + show_progress (bool): Show the progress of training epoch. Defaults to ``False``. Returns: float/tuple: The sum of loss returned by all batches in this epoch. If the loss in each batch contains - multiple parts and the model return these multiple parts loss instead of the sum of loss, It will return a + multiple parts and the model returns these parts instead of their sum, it will return a tuple which includes the sum of loss in each part.
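The `_build_optimizer` change above just forwards `weight_decay` to the chosen `torch.optim` class; `SparseAdam` is the exception, since its constructor takes no `weight_decay` argument, so the value must be dropped (with a warning). A minimal sketch of the dispatch, with hypothetical `learner`/`lr` values in place of RecBole's config:

```python
import torch
import torch.optim as optim

model = torch.nn.Linear(4, 1)
learner, lr, weight_decay = 'adam', 1e-3, 1e-5

if learner == 'adam':
    optimizer = optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)
elif learner == 'sparse_adam':
    # SparseAdam accepts no weight_decay keyword; it has to be silently dropped
    optimizer = optim.SparseAdam(model.parameters(), lr=lr)
else:
    optimizer = optim.SGD(model.parameters(), lr=lr, weight_decay=weight_decay)
```

For the dense optimizers, the value ends up in `optimizer.param_groups[0]['weight_decay']` and is applied as L2 regularization at every step.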
""" self.model.train() loss_func = loss_func or self.model.calculate_loss total_loss = None - for batch_idx, interaction in enumerate(train_data): + iter_data = ( + tqdm( + enumerate(train_data), + total=len(train_data), + desc=f"Train {epoch_idx:>5}", + ) if show_progress else enumerate(train_data) + ) + for batch_idx, interaction in iter_data: interaction = interaction.to(self.device) self.optimizer.zero_grad() losses = loss_func(interaction) @@ -156,17 +164,18 @@ def _train_epoch(self, train_data, epoch_idx, loss_func=None): self.optimizer.step() return total_loss - def _valid_epoch(self, valid_data): + def _valid_epoch(self, valid_data, show_progress=False): r"""Valid the model with valid data Args: - valid_data (DataLoader): the valid data + valid_data (DataLoader): the valid data. + show_progress (bool): Show the progress of evaluate epoch. Defaults to ``False``. Returns: float: valid score dict: valid result """ - valid_result = self.evaluate(valid_data, load_best_model=False) + valid_result = self.evaluate(valid_data, load_best_model=False, show_progress=show_progress) valid_score = calculate_valid_score(valid_result, self.valid_metric) return valid_score, valid_result @@ -202,8 +211,10 @@ def resume_checkpoint(self, resume_file): # load architecture params from checkpoint if checkpoint['config']['model'].lower() != self.config['model'].lower(): - self.logger.warning('Architecture configuration given in config file is different from that of checkpoint. ' - 'This may yield an exception while state_dict is being loaded.') + self.logger.warning( + 'Architecture configuration given in config file is different from that of checkpoint. ' + 'This may yield an exception while state_dict is being loaded.' 
+ ) self.model.load_state_dict(checkpoint['state_dict']) # load optimizer state from checkpoint only when optimizer type is not changed @@ -216,14 +227,17 @@ def _check_nan(self, loss): raise ValueError('Training loss is nan') def _generate_train_loss_output(self, epoch_idx, s_time, e_time, losses): + des = self.config['loss_decimal_place'] or 4 train_loss_output = 'epoch %d training [time: %.2fs, ' % (epoch_idx, e_time - s_time) if isinstance(losses, tuple): - train_loss_output = ', '.join('train_loss%d: %.4f' % (idx + 1, loss) for idx, loss in enumerate(losses)) + des = 'train_loss%d: %.' + str(des) + 'f' + train_loss_output += ', '.join(des % (idx + 1, loss) for idx, loss in enumerate(losses)) else: - train_loss_output += 'train loss: %.4f' % losses + des = '%.' + str(des) + 'f' + train_loss_output += 'train loss: ' + des % losses return train_loss_output + ']' - def fit(self, train_data, valid_data=None, verbose=True, saved=True): + def fit(self, train_data, valid_data=None, verbose=True, saved=True, show_progress=False, callback_fn=None): r"""Train the model based on the train data and the valid data. Args: @@ -232,6 +246,9 @@ def fit(self, train_data, valid_data=None, verbose=True, saved=True): If it's None, the early_stopping is invalid. verbose (bool, optional): whether to write training and evaluation information to logger, default: True saved (bool, optional): whether to save the model parameters, default: True + show_progress (bool): Show the progress of training epoch and evaluate epoch. Defaults to ``False``. + callback_fn (callable): Optional callback function executed at the end of each epoch. + It receives (epoch_idx, valid_score) as input arguments. Returns: (float, dict): best valid score and best valid result.
If valid_data is None, it returns (-1, None) @@ -242,7 +259,7 @@ def fit(self, train_data, valid_data=None, verbose=True, saved=True): for epoch_idx in range(self.start_epoch, self.epochs): # train training_start_time = time() - train_loss = self._train_epoch(train_data, epoch_idx) + train_loss = self._train_epoch(train_data, epoch_idx, show_progress=show_progress) self.train_loss_dict[epoch_idx] = sum(train_loss) if isinstance(train_loss, tuple) else train_loss training_end_time = time() train_loss_output = \ @@ -260,10 +277,14 @@ def fit(self, train_data, valid_data=None, verbose=True, saved=True): continue if (epoch_idx + 1) % self.eval_step == 0: valid_start_time = time() - valid_score, valid_result = self._valid_epoch(valid_data) + valid_score, valid_result = self._valid_epoch(valid_data, show_progress=show_progress) self.best_valid_score, self.cur_step, stop_flag, update_flag = early_stopping( - valid_score, self.best_valid_score, self.cur_step, - max_step=self.stopping_step, bigger=self.valid_metric_bigger) + valid_score, + self.best_valid_score, + self.cur_step, + max_step=self.stopping_step, + bigger=self.valid_metric_bigger + ) valid_end_time = time() valid_score_output = "epoch %d evaluating [time: %.2fs, valid_score: %f]" % \ (epoch_idx, valid_end_time - valid_start_time, valid_score) @@ -279,6 +300,9 @@ def fit(self, train_data, valid_data=None, verbose=True, saved=True): self.logger.info(update_output) self.best_valid_result = valid_result + if callback_fn: + callback_fn(epoch_idx, valid_score) + if stop_flag: stop_output = 'Finished training, best eval result in epoch %d' % \ (epoch_idx - self.cur_step * self.eval_step) @@ -288,50 +312,33 @@ def fit(self, train_data, valid_data=None, verbose=True, saved=True): return self.best_valid_score, self.best_valid_result def _full_sort_batch_eval(self, batched_data): - # Note: interaction without item ids - interaction, pos_idx, used_idx, pos_len_list, neg_len_list = batched_data - - batch_size = 
interaction.length * self.tot_item_num - used_idx = torch.cat([used_idx, torch.arange(interaction.length) * self.tot_item_num]) # remove [pad] item - neg_len_list = list(np.subtract(neg_len_list, 1)) + interaction, history_index, swap_row, swap_col_after, swap_col_before = batched_data try: # Note: interaction without item ids - scores = self.model.full_sort_predict(interaction.to(self.device)).flatten() + scores = self.model.full_sort_predict(interaction.to(self.device)) except NotImplementedError: - interaction = interaction.to(self.device).repeat_interleave(self.tot_item_num) - interaction.update(self.item_tensor[:batch_size]) + new_inter = interaction.to(self.device).repeat_interleave(self.tot_item_num) + batch_size = len(new_inter) + new_inter.update(self.item_tensor[:batch_size]) if batch_size <= self.test_batch_size: - scores = self.model.predict(interaction) + scores = self.model.predict(new_inter) else: - scores = self._spilt_predict(interaction, batch_size) - pos_idx = pos_idx.to(self.device) - used_idx = used_idx.to(self.device) - - pos_scores = scores.index_select(dim=0, index=pos_idx) - pos_scores = torch.split(pos_scores, pos_len_list, dim=0) + scores = self._spilt_predict(new_inter, batch_size) - ones_tensor = torch.ones(batch_size, dtype=torch.bool, device=self.device) - used_mask = ones_tensor.index_fill(dim=0, index=used_idx, value=0) - neg_scores = scores.masked_select(used_mask) - neg_scores = torch.split(neg_scores, neg_len_list, dim=0) + scores = scores.view(-1, self.tot_item_num) + scores[:, 0] = -np.inf + if history_index is not None: + scores[history_index] = -np.inf - tmp_len_list = np.add(pos_len_list, neg_len_list).tolist() - final_scores_width = max(self.tot_item_num, max(tmp_len_list)) - extra_len_list = np.subtract(final_scores_width, tmp_len_list).tolist() - padding_nums = final_scores_width * len(tmp_len_list) - np.sum(tmp_len_list) - padding_tensor = torch.tensor([-np.inf], dtype=scores.dtype, 
device=self.device).repeat(padding_nums)
-            padding_scores = torch.split(padding_tensor, extra_len_list)
+        swap_row = swap_row.to(self.device)
+        swap_col_after = swap_col_after.to(self.device)
+        swap_col_before = swap_col_before.to(self.device)
+        scores[swap_row, swap_col_after] = scores[swap_row, swap_col_before]

-            final_scores = list(itertools.chain.from_iterable(zip(pos_scores, neg_scores, padding_scores)))
-            final_scores = torch.cat(final_scores)
-
-        setattr(interaction, 'pos_len_list', pos_len_list)
-        setattr(interaction, 'user_len_list', len(tmp_len_list) * [final_scores_width])
-
-        return interaction, final_scores
+        return interaction, scores

     @torch.no_grad()
-    def evaluate(self, eval_data, load_best_model=True, model_file=None):
+    def evaluate(self, eval_data, load_best_model=True, model_file=None, show_progress=False):
         r"""Evaluate the model based on the eval data.

         Args:
@@ -340,10 +347,14 @@ def evaluate(self, eval_data, load_best_model=True, model_file=None):
             It should be set True, if users want to test the model after training.
             model_file (str, optional): the saved model file, default: None.
                 If users want to test the previously trained model file, they can set this parameter.
+            show_progress (bool): Show the progress of the evaluation epoch. Defaults to ``False``.

         Returns:
-            dict: eval result, key is the eval metric and value in the corresponding metric value
+            dict: eval result, key is the eval metric and value is the corresponding metric value. 
""" + if not eval_data: + return + if load_best_model: if model_file: checkpoint_file = model_file @@ -362,22 +373,25 @@ def evaluate(self, eval_data, load_best_model=True, model_file=None): self.tot_item_num = eval_data.dataset.item_num batch_matrix_list = [] - for batch_idx, batched_data in enumerate(eval_data): + iter_data = ( + tqdm( + enumerate(eval_data), + total=len(eval_data), + desc=f"Evaluate ", + ) if show_progress else enumerate(eval_data) + ) + for batch_idx, batched_data in iter_data: if eval_data.dl_type == DataLoaderType.FULL: - if self.eval_type == EvaluatorType.INDIVIDUAL: - raise ValueError('full sort can\'t use LossEvaluator') interaction, scores = self._full_sort_batch_eval(batched_data) - batch_matrix = self.evaluator.collect(interaction, scores, full=True) else: interaction = batched_data batch_size = interaction.length - if batch_size <= self.test_batch_size: scores = self.model.predict(interaction.to(self.device)) else: scores = self._spilt_predict(interaction, batch_size) - batch_matrix = self.evaluator.collect(interaction, scores) + batch_matrix = self.evaluator.collect(interaction, scores) batch_matrix_list.append(batch_matrix) result = self.evaluator.evaluate(batch_matrix_list, eval_data) @@ -403,10 +417,11 @@ def plot_train_loss(self, show=True, save_path=None): r"""Plot the train loss in each epoch Args: - show (bool, optional): whether to show this figure, default: True - save_path (str, optional): the data path to save the figure, default: None. + show (bool, optional): Whether to show this figure, default: True + save_path (str, optional): The data path to save the figure, default: None. If it's None, it will not be saved. 
""" + import matplotlib.pyplot as plt epochs = list(self.train_loss_dict.keys()) epochs.sort() values = [float(self.train_loss_dict[epoch]) for epoch in epochs] @@ -432,7 +447,7 @@ def __init__(self, config, model): self.train_rec_step = config['train_rec_step'] self.train_kg_step = config['train_kg_step'] - def _train_epoch(self, train_data, epoch_idx, loss_func=None): + def _train_epoch(self, train_data, epoch_idx, loss_func=None, show_progress=False): if self.train_rec_step is None or self.train_kg_step is None: interaction_state = KGDataLoaderState.RSKG elif epoch_idx % (self.train_rec_step + self.train_kg_step) < self.train_rec_step: @@ -441,9 +456,11 @@ def _train_epoch(self, train_data, epoch_idx, loss_func=None): interaction_state = KGDataLoaderState.KG train_data.set_mode(interaction_state) if interaction_state in [KGDataLoaderState.RSKG, KGDataLoaderState.RS]: - return super()._train_epoch(train_data, epoch_idx) + return super()._train_epoch(train_data, epoch_idx, show_progress=show_progress) elif interaction_state in [KGDataLoaderState.KG]: - return super()._train_epoch(train_data, epoch_idx, self.model.calculate_kg_loss) + return super()._train_epoch( + train_data, epoch_idx, loss_func=self.model.calculate_kg_loss, show_progress=show_progress + ) return None @@ -455,17 +472,21 @@ class KGATTrainer(Trainer): def __init__(self, config, model): super(KGATTrainer, self).__init__(config, model) - def _train_epoch(self, train_data, epoch_idx, loss_func=None): + def _train_epoch(self, train_data, epoch_idx, loss_func=None, show_progress=False): # train rs train_data.set_mode(KGDataLoaderState.RS) - rs_total_loss = super()._train_epoch(train_data, epoch_idx) + rs_total_loss = super()._train_epoch(train_data, epoch_idx, show_progress=show_progress) # train kg train_data.set_mode(KGDataLoaderState.KG) - kg_total_loss = super()._train_epoch(train_data, epoch_idx, self.model.calculate_kg_loss) + kg_total_loss = super()._train_epoch( + train_data, epoch_idx, 
loss_func=self.model.calculate_kg_loss, show_progress=show_progress + ) # update A - self.model.update_attentive_A() + self.model.eval() + with torch.no_grad(): + self.model.update_attentive_A() return rs_total_loss, kg_total_loss @@ -495,12 +516,12 @@ def save_pretrained_model(self, epoch, saved_model_file): } torch.save(state, saved_model_file) - def pretrain(self, train_data, verbose=True): + def pretrain(self, train_data, verbose=True, show_progress=False): for epoch_idx in range(self.start_epoch, self.epochs): # train training_start_time = time() - train_loss = self._train_epoch(train_data, epoch_idx) + train_loss = self._train_epoch(train_data, epoch_idx, show_progress=show_progress) self.train_loss_dict[epoch_idx] = sum(train_loss) if isinstance(train_loss, tuple) else train_loss training_end_time = time() train_loss_output = \ @@ -509,9 +530,10 @@ def pretrain(self, train_data, verbose=True): self.logger.info(train_loss_output) if (epoch_idx + 1) % self.config['save_step'] == 0: - saved_model_file = os.path.join(self.checkpoint_dir, - '{}-{}-{}.pth'.format(self.config['model'], self.config['dataset'], - str(epoch_idx + 1))) + saved_model_file = os.path.join( + self.checkpoint_dir, + '{}-{}-{}.pth'.format(self.config['model'], self.config['dataset'], str(epoch_idx + 1)) + ) self.save_pretrained_model(epoch_idx, saved_model_file) update_output = 'Saving current: %s' % saved_model_file if verbose: @@ -519,11 +541,11 @@ def pretrain(self, train_data, verbose=True): return self.best_valid_score, self.best_valid_result - def fit(self, train_data, valid_data=None, verbose=True, saved=True): + def fit(self, train_data, valid_data=None, verbose=True, saved=True, show_progress=False, callback_fn=None): if self.model.train_stage == 'pretrain': - return self.pretrain(train_data, verbose) + return self.pretrain(train_data, verbose, show_progress) elif self.model.train_stage == 'finetune': - return super().fit(train_data, valid_data, verbose, saved) + return 
super().fit(train_data, valid_data, verbose, saved, show_progress, callback_fn) else: raise ValueError("Please make sure that the 'train_stage' is 'pretrain' or 'finetune' ") @@ -537,19 +559,23 @@ def __init__(self, config, model): super(MKRTrainer, self).__init__(config, model) self.kge_interval = config['kge_interval'] - def _train_epoch(self, train_data, epoch_idx, loss_func=None): + def _train_epoch(self, train_data, epoch_idx, loss_func=None, show_progress=False): rs_total_loss, kg_total_loss = 0., 0. # train rs self.logger.info('Train RS') train_data.set_mode(KGDataLoaderState.RS) - rs_total_loss = super()._train_epoch(train_data, epoch_idx, self.model.calculate_rs_loss) + rs_total_loss = super()._train_epoch( + train_data, epoch_idx, loss_func=self.model.calculate_rs_loss, show_progress=show_progress + ) # train kg if epoch_idx % self.kge_interval == 0: self.logger.info('Train KG') train_data.set_mode(KGDataLoaderState.KG) - kg_total_loss = super()._train_epoch(train_data, epoch_idx, self.model.calculate_kg_loss) + kg_total_loss = super()._train_epoch( + train_data, epoch_idx, loss_func=self.model.calculate_kg_loss, show_progress=show_progress + ) return rs_total_loss, kg_total_loss @@ -562,3 +588,182 @@ class TraditionalTrainer(Trainer): def __init__(self, config, model): super(TraditionalTrainer, self).__init__(config, model) self.epochs = 1 # Set the epoch to 1 when running memory based model + + +class xgboostTrainer(AbstractTrainer): + """xgboostTrainer is designed for XGBOOST. 
+ + """ + + def __init__(self, config, model): + super(xgboostTrainer, self).__init__(config, model) + + self.xgb = __import__('xgboost') + + self.logger = getLogger() + self.label_field = config['LABEL_FIELD'] + self.xgb_model = config['xgb_model'] + self.convert_token_to_onehot = self.config['convert_token_to_onehot'] + + # DMatrix params + self.weight = config['xgb_weight'] + self.base_margin = config['xgb_base_margin'] + self.missing = config['xgb_missing'] + self.silent = config['xgb_silent'] + self.feature_names = config['xgb_feature_names'] + self.feature_types = config['xgb_feature_types'] + self.nthread = config['xgb_nthread'] + + # train params + self.params = config['xgb_params'] + self.num_boost_round = config['xgb_num_boost_round'] + self.evals = () + self.obj = config['xgb_obj'] + self.feval = config['xgb_feval'] + self.maximize = config['xgb_maximize'] + self.early_stopping_rounds = config['xgb_early_stopping_rounds'] + self.evals_result = {} + self.verbose_eval = config['xgb_verbose_eval'] + self.callbacks = None + + # evaluator + self.eval_type = config['eval_type'] + self.epochs = config['epochs'] + self.eval_step = min(config['eval_step'], self.epochs) + self.valid_metric = config['valid_metric'].lower() + + self.evaluator = ProxyEvaluator(config) + + # model saved + self.checkpoint_dir = config['checkpoint_dir'] + ensure_dir(self.checkpoint_dir) + saved_model_file = '{}-{}.pth'.format(self.config['model'], get_local_time()) + self.saved_model_file = os.path.join(self.checkpoint_dir, saved_model_file) + + def _interaction_to_DMatrix(self, dataloader): + r"""Convert data format from interaction to DMatrix + + Args: + dataloader (XgboostDataLoader): xgboost dataloader. + Returns: + DMatrix: Data in the form of 'DMatrix'. 
+ """ + interaction = dataloader.dataset[:] + interaction_np = interaction.numpy() + cur_data = np.array([]) + columns = [] + for key, value in interaction_np.items(): + value = np.resize(value, (value.shape[0], 1)) + if key != self.label_field: + columns.append(key) + if cur_data.shape[0] == 0: + cur_data = value + else: + cur_data = np.hstack((cur_data, value)) + + if self.convert_token_to_onehot == True: + from scipy import sparse + from scipy.sparse import dok_matrix + convert_col_list = dataloader.dataset.convert_col_list + hash_count = dataloader.dataset.hash_count + + new_col = cur_data.shape[1] - len(convert_col_list) + for key, values in hash_count.items(): + new_col = new_col + values + onehot_data = dok_matrix((cur_data.shape[0], new_col)) + + cur_j = 0 + new_j = 0 + + for key in columns: + if key in convert_col_list: + for i in range(cur_data.shape[0]): + onehot_data[i, int(new_j + cur_data[i, cur_j])] = 1 + new_j = new_j + hash_count[key] - 1 + else: + for i in range(cur_data.shape[0]): + onehot_data[i, new_j] = cur_data[i, cur_j] + cur_j = cur_j + 1 + new_j = new_j + 1 + + cur_data = sparse.csc_matrix(onehot_data) + + return self.xgb.DMatrix( + data=cur_data, + label=interaction_np[self.label_field], + weight=self.weight, + base_margin=self.base_margin, + missing=self.missing, + silent=self.silent, + feature_names=self.feature_names, + feature_types=self.feature_types, + nthread=self.nthread + ) + + def _train_at_once(self, train_data, valid_data): + r""" + + Args: + train_data (XgboostDataLoader): XgboostDataLoader, which is the same with GeneralDataLoader. + valid_data (XgboostDataLoader): XgboostDataLoader, which is the same with GeneralDataLoader. 
+ """ + self.dtrain = self._interaction_to_DMatrix(train_data) + self.dvalid = self._interaction_to_DMatrix(valid_data) + self.evals = [(self.dtrain, 'train'), (self.dvalid, 'valid')] + self.model = self.xgb.train( + self.params, self.dtrain, self.num_boost_round, self.evals, self.obj, self.feval, self.maximize, + self.early_stopping_rounds, self.evals_result, self.verbose_eval, self.xgb_model, self.callbacks + ) + + self.model.save_model(self.saved_model_file) + self.xgb_model = self.saved_model_file + + def _valid_epoch(self, valid_data): + r""" + + Args: + valid_data (XgboostDataLoader): XgboostDataLoader, which is the same with GeneralDataLoader. + """ + valid_result = self.evaluate(valid_data) + valid_score = calculate_valid_score(valid_result, self.valid_metric) + return valid_result, valid_score + + def fit(self, train_data, valid_data=None, verbose=True, saved=True, show_progress=False): + # load model + if self.xgb_model is not None: + self.model.load_model(self.xgb_model) + + self.best_valid_score = 0. + self.best_valid_result = 0. 
+ + for epoch_idx in range(self.epochs): + self._train_at_once(train_data, valid_data) + + if (epoch_idx + 1) % self.eval_step == 0: + # evaluate + valid_start_time = time() + valid_result, valid_score = self._valid_epoch(valid_data) + valid_end_time = time() + valid_score_output = "epoch %d evaluating [time: %.2fs, valid_score: %f]" % \ + (epoch_idx, valid_end_time - valid_start_time, valid_score) + valid_result_output = 'valid result: \n' + dict2str(valid_result) + if verbose: + self.logger.info(valid_score_output) + self.logger.info(valid_result_output) + + self.best_valid_score = valid_score + self.best_valid_result = valid_result + + return self.best_valid_score, self.best_valid_result + + def evaluate(self, eval_data, load_best_model=True, model_file=None, show_progress=False): + self.eval_pred = torch.Tensor() + self.eval_true = torch.Tensor() + + self.deval = self._interaction_to_DMatrix(eval_data) + self.eval_true = torch.Tensor(self.deval.get_label()) + self.eval_pred = torch.Tensor(self.model.predict(self.deval)) + + batch_matrix_list = [[torch.stack((self.eval_true, self.eval_pred), 1)]] + result = self.evaluator.evaluate(batch_matrix_list, eval_data) + return result diff --git a/recbole/utils/__init__.py b/recbole/utils/__init__.py index d6f8bac2e..22240e0aa 100644 --- a/recbole/utils/__init__.py +++ b/recbole/utils/__init__.py @@ -4,8 +4,9 @@ from recbole.utils.enum_type import * from recbole.utils.argument_list import * - -__all__ = ['init_logger', 'get_local_time', 'ensure_dir', 'get_model', 'get_trainer', 'early_stopping', - 'calculate_valid_score', 'dict2str', 'Enum', 'ModelType', 'DataLoaderType', 'KGDataLoaderState', - 'EvaluatorType', 'InputType', 'FeatureType', 'FeatureSource', 'init_seed', - 'general_arguments', 'training_arguments', 'evaluation_arguments', 'dataset_arguments'] +__all__ = [ + 'init_logger', 'get_local_time', 'ensure_dir', 'get_model', 'get_trainer', 'early_stopping', + 'calculate_valid_score', 'dict2str', 'Enum', 'ModelType', 
'DataLoaderType', 'KGDataLoaderState', 'EvaluatorType', + 'InputType', 'FeatureType', 'FeatureSource', 'init_seed', 'general_arguments', 'training_arguments', + 'evaluation_arguments', 'dataset_arguments' +] diff --git a/recbole/utils/argument_list.py b/recbole/utils/argument_list.py index a1af91f4a..95c7fe2c7 100644 --- a/recbole/utils/argument_list.py +++ b/recbole/utils/argument_list.py @@ -2,36 +2,51 @@ # @Author : Shanlei Mu # @Email : slmu@ruc.edu.cn +# yapf: disable -general_arguments = ['gpu_id', 'use_gpu', - 'seed', - 'reproducibility', - 'state', - 'data_path'] +general_arguments = [ + 'gpu_id', 'use_gpu', + 'seed', + 'reproducibility', + 'state', + 'data_path', + 'show_progress', +] -training_arguments = ['epochs', 'train_batch_size', - 'learner', 'learning_rate', - 'training_neg_sample_num', - 'eval_step', 'stopping_step', - 'checkpoint_dir'] +training_arguments = [ + 'epochs', 'train_batch_size', + 'learner', 'learning_rate', + 'training_neg_sample_num', + 'training_neg_sample_distribution', + 'eval_step', 'stopping_step', + 'checkpoint_dir', + 'clip_grad_norm', + 'loss_decimal_place', + 'weight_decay' +] -evaluation_arguments = ['eval_setting', - 'group_by_user', - 'split_ratio', 'leave_one_num', - 'real_time_process', - 'metrics', 'topk', 'valid_metric', - 'eval_batch_size'] +evaluation_arguments = [ + 'eval_setting', + 'group_by_user', + 'split_ratio', 'leave_one_num', + 'real_time_process', + 'metrics', 'topk', 'valid_metric', + 'eval_batch_size', + 'metric_decimal_place' +] -dataset_arguments = ['field_separator', 'seq_separator', - 'USER_ID_FIELD', 'ITEM_ID_FIELD', 'RATING_FIELD', 'TIME_FIELD' - 'seq_len', - 'LABEL_FIELD', 'threshold', - 'NEG_PREFIX', - 'ITEM_LIST_LENGTH_FIELD', 'LIST_SUFFIX', 'MAX_ITEM_LIST_LENGTH', 'POSITION_FIELD', - 'HEAD_ENTITY_ID_FIELD', 'TAIL_ENTITY_ID_FIELD', 'RELATION_ID_FIELD', 'ENTITY_ID_FIELD', - 'load_col', 'unload_col', 'additional_feat_suffix', - 'max_user_inter_num', 'min_user_inter_num', 'max_item_inter_num', 
'min_item_inter_num', - 'lowest_val', 'highest_val', 'equal_val', 'not_equal_val', 'drop_filter_field', - 'fields_in_same_space', 'fill_nan', - 'preload_weight', 'drop_preload_weight', - 'normalize_field', 'normalize_all'] +dataset_arguments = [ + 'field_separator', 'seq_separator', + 'USER_ID_FIELD', 'ITEM_ID_FIELD', 'RATING_FIELD', 'TIME_FIELD', + 'seq_len', + 'LABEL_FIELD', 'threshold', + 'NEG_PREFIX', + 'ITEM_LIST_LENGTH_FIELD', 'LIST_SUFFIX', 'MAX_ITEM_LIST_LENGTH', 'POSITION_FIELD', + 'HEAD_ENTITY_ID_FIELD', 'TAIL_ENTITY_ID_FIELD', 'RELATION_ID_FIELD', 'ENTITY_ID_FIELD', + 'load_col', 'unload_col', 'unused_col', 'additional_feat_suffix', + 'max_user_inter_num', 'min_user_inter_num', 'max_item_inter_num', 'min_item_inter_num', + 'lowest_val', 'highest_val', 'equal_val', 'not_equal_val', + 'fields_in_same_space', + 'preload_weight', + 'normalize_field', 'normalize_all' +] diff --git a/recbole/utils/case_study.py b/recbole/utils/case_study.py new file mode 100644 index 000000000..d8a3b38de --- /dev/null +++ b/recbole/utils/case_study.py @@ -0,0 +1,88 @@ +# @Time : 2020/12/25 +# @Author : Yushuo Chen +# @Email : chenyushuo@ruc.edu.cn + +# UPDATE +# @Time : 2020/12/25 +# @Author : Yushuo Chen +# @email : chenyushuo@ruc.edu.cn + +""" +recbole.utils.case_study +##################################### +""" + +import numpy as np +import torch + +from recbole.data.dataloader.general_dataloader import GeneralFullDataLoader +from recbole.data.dataloader.sequential_dataloader import SequentialFullDataLoader + + +@torch.no_grad() +def full_sort_scores(uid_series, model, test_data): + """Calculate the scores of all items for each user in uid_series. + + Note: + The score of [pad] and history items will be set into -inf. + + Args: + uid_series (numpy.ndarray): User id series + model (AbstractRecommender): Model to predict + test_data (AbstractDataLoader): The test_data of model + + Returns: + torch.Tensor: the scores of all items for each user in uid_series. 
+ """ + uid_field = test_data.dataset.uid_field + dataset = test_data.dataset + model.eval() + + if isinstance(test_data, GeneralFullDataLoader): + index = np.isin(test_data.user_df[uid_field].numpy(), uid_series) + input_interaction = test_data.user_df[index] + history_item = test_data.uid2history_item[input_interaction[uid_field].numpy()] + history_row = torch.cat([torch.full_like(hist_iid, i) for i, hist_iid in enumerate(history_item)]) + history_col = torch.cat(list(history_item)) + history_index = history_row, history_col + elif isinstance(test_data, SequentialFullDataLoader): + index = np.isin(test_data.uid_list, uid_series) + input_interaction = test_data.augmentation( + test_data.item_list_index[index], test_data.target_index[index], test_data.item_list_length[index] + ) + history_index = None + else: + raise NotImplementedError + + # Get scores of all items + try: + scores = model.full_sort_predict(input_interaction) + except NotImplementedError: + input_interaction = input_interaction.repeat(dataset.item_num) + input_interaction.update(test_data.get_item_feature().repeat(len(uid_series))) + scores = model.predict(input_interaction) + + scores = scores.view(-1, dataset.item_num) + scores[:, 0] = -np.inf # set scores of [pad] to -inf + if history_index is not None: + scores[history_index] = -np.inf # set scores of history items to -inf + + return scores + + +def full_sort_topk(uid_series, model, test_data, k): + """Calculate the top-k items' scores and ids for each user in uid_series. + + Args: + uid_series (numpy.ndarray): User id series + model (AbstractRecommender): Model to predict + test_data (AbstractDataLoader): The test_data of model + k (int): The top-k items. + + Returns: + tuple: + - topk_scores (torch.Tensor): The scores of topk items. + - topk_index (torch.Tensor): The index of topk items, which is also the internal ids of items. 
+ """ + scores = full_sort_scores(uid_series, model, test_data) + return torch.topk(scores, k) diff --git a/recbole/utils/enum_type.py b/recbole/utils/enum_type.py index 62a11ee7c..84e15b812 100644 --- a/recbole/utils/enum_type.py +++ b/recbole/utils/enum_type.py @@ -26,6 +26,7 @@ class ModelType(Enum): KNOWLEDGE = 4 SOCIAL = 5 TRADITIONAL = 6 + XGBOOST = 7 class DataLoaderType(Enum): diff --git a/recbole/utils/logger.py b/recbole/utils/logger.py index 15e67e47a..76559710f 100644 --- a/recbole/utils/logger.py +++ b/recbole/utils/logger.py @@ -11,7 +11,7 @@ import logging import os -from recbole.utils.utils import get_local_time +from recbole.utils.utils import get_local_time, ensure_dir def init_logger(config): @@ -30,8 +30,7 @@ def init_logger(config): """ LOGROOT = './log/' dir_name = os.path.dirname(LOGROOT) - if not os.path.exists(dir_name): - os.makedirs(dir_name) + ensure_dir(dir_name) logfilename = '{}-{}.log'.format(config['model'], get_local_time()) @@ -64,8 +63,4 @@ def init_logger(config): sh.setLevel(level) sh.setFormatter(sformatter) - logging.basicConfig( - level=level, - handlers=[fh, sh] - ) - + logging.basicConfig(level=level, handlers=[fh, sh]) diff --git a/recbole/utils/utils.py b/recbole/utils/utils.py index c2df49040..9e2012372 100644 --- a/recbole/utils/utils.py +++ b/recbole/utils/utils.py @@ -8,12 +8,14 @@ ################################ """ -import os import datetime import importlib +import os import random -import torch + import numpy as np +import torch + from recbole.utils.enum_type import ModelType @@ -50,18 +52,20 @@ def get_model(model_name): Recommender: model class """ model_submodule = [ - 'general_recommender', - 'context_aware_recommender', - 'sequential_recommender', - 'knowledge_aware_recommender' + 'general_recommender', 'context_aware_recommender', 'sequential_recommender', 'knowledge_aware_recommender', + 'exlib_recommender' ] model_file_name = model_name.lower() + model_module = None for submodule in model_submodule: - 
module_path = '.'.join(['...model', submodule, model_file_name]) + module_path = '.'.join(['recbole.model', submodule, model_file_name]) if importlib.util.find_spec(module_path, __name__): model_module = importlib.import_module(module_path, __name__) + break + if model_module is None: + raise ValueError('`model_name` [{}] is not the name of an existing model.'.format(model_name)) model_class = getattr(model_module, model_name) return model_class @@ -159,7 +163,7 @@ def dict2str(result_dict): result_str = '' for metric, value in result_dict.items(): - result_str += str(metric) + ' : ' + '%.04f' % value + ' ' + result_str += str(metric) + ' : ' + str(value) + ' ' return result_str diff --git a/requirements.txt b/requirements.txt index 94a327fc4..46f740e11 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,7 +1,7 @@ matplotlib>=3.1.3 -torch>=1.6.0 +torch>=1.7.0 numpy>=1.17.2 -scipy>=1.3.1 +scipy==1.6.0 hyperopt>=0.2.4 pandas>=1.0.5 tqdm>=4.48.2 diff --git a/run_test.sh b/run_test.sh new file mode 100644 index 000000000..6d70c5171 --- /dev/null +++ b/run_test.sh @@ -0,0 +1,22 @@ +#!/bin/bash + + +python -m pytest -v tests/metrics +echo "metrics tests finished" + +python -m pytest -v tests/config/test_config.py +python -m pytest -v tests/config/test_overall.py +export PYTHONPATH=. 
+python tests/config/test_command_line.py --use_gpu=False --valid_metric=Recall@10 --split_ratio=[0.7,0.2,0.1] --metrics=['Recall@10'] --epochs=200 --eval_setting='LO_RS' --learning_rate=0.3
+echo "config tests finished"
+
+python -m pytest -v tests/evaluation_setting
+echo "evaluation_setting tests finished"
+
+python -m pytest -v tests/model/test_model_auto.py
+python -m pytest -v tests/model/test_model_manual.py
+echo "model tests finished"
+
+python -m pytest -v tests/data/test_dataset.py
+python -m pytest -v tests/data/test_dataloader.py
+echo "data tests finished"
\ No newline at end of file
diff --git a/run_test_example.py b/run_test_example.py
index 4f7b68cc0..82aacfd92 100644
--- a/run_test_example.py
+++ b/run_test_example.py
@@ -130,6 +130,22 @@
         'model': 'ConvNCF',
         'dataset': 'ml-100k',
     },
+    'Test LINE': {
+        'model': 'LINE',
+        'dataset': 'ml-100k',
+    },
+    'Test MultiDAE': {
+        'model': 'MultiDAE',
+        'dataset': 'ml-100k',
+    },
+    'Test MultiVAE': {
+        'model': 'MultiVAE',
+        'dataset': 'ml-100k',
+    },
+    'Test MacridVAE': {
+        'model': 'MacridVAE',
+        'dataset': 'ml-100k',
+    },

     # Context-aware Recommendation
     'Test FM': {
diff --git a/setup.py b/setup.py
index a8c0a358a..4cd4676cf 100644
--- a/setup.py
+++ b/setup.py
@@ -6,7 +6,7 @@
 from setuptools import setup, find_packages

-install_requires = ['numpy>=1.17.2', 'torch>=1.6.0', 'scipy>=1.3.1', 'pandas>=1.0.5', 'tqdm>=4.48.2',
+install_requires = ['numpy>=1.17.2', 'torch>=1.7.0', 'scipy>=1.3.1', 'pandas>=1.0.5', 'tqdm>=4.48.2',
                     'scikit_learn>=0.23.2', 'pyyaml>=5.1.0', 'matplotlib>=3.1.3']

 setup_requires = []
@@ -20,7 +20,7 @@
 long_description = 'RecBole is developed based on Python and PyTorch for ' \
                    'reproducing and developing recommendation algorithms in ' \
                    'a unified, comprehensive and efficient framework for ' \
-                   'research purpose. In the first version, Our library ' \
+                   'research purpose. 
In the first version, our library ' \ 'includes 53 recommendation algorithms, covering four ' \ 'major categories: General Recommendation, Sequential ' \ 'Recommendation, Context-aware Recommendation and ' \ @@ -36,7 +36,7 @@ setup( name='recbole', version= - '0.1.1', # please remember to edit recbole/__init__.py in response, once updating the version + '0.2.0', # please remember to edit recbole/__init__.py in response, once updating the version description='A unified, comprehensive and efficient recommendation library', long_description=long_description, long_description_content_type="text/markdown", diff --git a/style.cfg b/style.cfg new file mode 100644 index 000000000..606c7ccbd --- /dev/null +++ b/style.cfg @@ -0,0 +1,394 @@ +[style] +# Align closing bracket with visual indentation. +align_closing_bracket_with_visual_indent=True + +# Allow dictionary keys to exist on multiple lines. For example: +# +# x = { +# ('this is the first element of a tuple', +# 'this is the second element of a tuple'): +# value, +# } +allow_multiline_dictionary_keys=False + +# Allow lambdas to be formatted on more than one line. +allow_multiline_lambdas=False + +# Allow splitting before a default / named assignment in an argument list. +allow_split_before_default_or_named_assigns=True + +# Allow splits before the dictionary value. +allow_split_before_dict_value=True + +# Let spacing indicate operator precedence. For example: +# +# a = 1 * 2 + 3 / 4 +# b = 1 / 2 - 3 * 4 +# c = (1 + 2) * (3 - 4) +# d = (1 - 2) / (3 + 4) +# e = 1 * 2 - 3 +# f = 1 + 2 + 3 + 4 +# +# will be formatted as follows to indicate precedence: +# +# a = 1*2 + 3/4 +# b = 1/2 - 3*4 +# c = (1+2) * (3-4) +# d = (1-2) / (3+4) +# e = 1*2 - 3 +# f = 1 + 2 + 3 + 4 +# +arithmetic_precedence_indication=False + +# Number of blank lines surrounding top-level function and class +# definitions. +blank_lines_around_top_level_definition=2 + +# Insert a blank line before a class-level docstring. 
+blank_line_before_class_docstring=False + +# Insert a blank line before a module docstring. +blank_line_before_module_docstring=True + +# Insert a blank line before a 'def' or 'class' immediately nested +# within another 'def' or 'class'. For example: +# +# class Foo: +# # <------ this blank line +# def method(): +# ... +blank_line_before_nested_class_or_def=True + +# Do not split consecutive brackets. Only relevant when +# dedent_closing_brackets is set. For example: +# +# call_func_that_takes_a_dict( +# { +# 'key1': 'value1', +# 'key2': 'value2', +# } +# ) +# +# would reformat to: +# +# call_func_that_takes_a_dict({ +# 'key1': 'value1', +# 'key2': 'value2', +# }) +coalesce_brackets=True + +# The column limit. +column_limit=120 + +# The style for continuation alignment. Possible values are: +# +# - SPACE: Use spaces for continuation alignment. This is default behavior. +# - FIXED: Use fixed number (CONTINUATION_INDENT_WIDTH) of columns +# (ie: CONTINUATION_INDENT_WIDTH/INDENT_WIDTH tabs or +# CONTINUATION_INDENT_WIDTH spaces) for continuation alignment. +# - VALIGN-RIGHT: Vertically align continuation lines to multiple of +# INDENT_WIDTH columns. Slightly right (one tab or a few spaces) if +# cannot vertically align continuation lines with indent characters. +continuation_align_style=SPACE + +# Indent width used for line continuations. +continuation_indent_width=4 + +# Put closing brackets on a separate line, dedented, if the bracketed +# expression can't fit in a single line. Applies to all kinds of brackets, +# including function definitions and calls. 
For example: +# +# config = { +# 'key1': 'value1', +# 'key2': 'value2', +# } # <--- this bracket is dedented and on a separate line +# +# time_series = self.remote_client.query_entity_counters( +# entity='dev3246.region1', +# key='dns.query_latency_tcp', +# transform=Transformation.AVERAGE(window=timedelta(seconds=60)), +# start_ts=now()-timedelta(days=3), +# end_ts=now(), +# ) # <--- this bracket is dedented and on a separate line +dedent_closing_brackets=True + +# Disable the heuristic which places each list element on a separate line +# if the list is comma-terminated. +disable_ending_comma_heuristic=False + +# Place each dictionary entry onto its own line. +each_dict_entry_on_separate_line=True + +# Require multiline dictionary even if it would normally fit on one line. +# For example: +# +# config = { +# 'key1': 'value1' +# } +force_multiline_dict=False + +# The regex for an i18n comment. The presence of this comment stops +# reformatting of that line, because the comments are required to be +# next to the string they translate. +i18n_comment= + +# The i18n function call names. The presence of this function stops +# reformattting on that line, because the string it has cannot be moved +# away from the i18n comment. +i18n_function_call= + +# Indent blank lines. +indent_blank_lines=False + +# Put closing brackets on a separate line, indented, if the bracketed +# expression can't fit in a single line. Applies to all kinds of brackets, +# including function definitions and calls. 
For example: +# +# config = { +# 'key1': 'value1', +# 'key2': 'value2', +# } # <--- this bracket is indented and on a separate line +# +# time_series = self.remote_client.query_entity_counters( +# entity='dev3246.region1', +# key='dns.query_latency_tcp', +# transform=Transformation.AVERAGE(window=timedelta(seconds=60)), +# start_ts=now()-timedelta(days=3), +# end_ts=now(), +# ) # <--- this bracket is indented and on a separate line +indent_closing_brackets=False + +# Indent the dictionary value if it cannot fit on the same line as the +# dictionary key. For example: +# +# config = { +# 'key1': +# 'value1', +# 'key2': value1 + +# value2, +# } +indent_dictionary_value=False + +# The number of columns to use for indentation. +indent_width=4 + +# Join short lines into one line. E.g., single line 'if' statements. +join_multiple_lines=True + +# Do not include spaces around selected binary operators. For example: +# +# 1 + 2 * 3 - 4 / 5 +# +# will be formatted as follows when configured with "*,/": +# +# 1 + 2*3 - 4/5 +no_spaces_around_selected_binary_operators= + +# Use spaces around default or named assigns. +spaces_around_default_or_named_assign=False + +# Adds a space after the opening '{' and before the ending '}' dict delimiters. +# +# {1: 2} +# +# will be formatted as: +# +# { 1: 2 } +spaces_around_dict_delimiters=False + +# Adds a space after the opening '[' and before the ending ']' list delimiters. +# +# [1, 2] +# +# will be formatted as: +# +# [ 1, 2 ] +spaces_around_list_delimiters=False + +# Use spaces around the power operator. +spaces_around_power_operator=True + +# Use spaces around the subscript / slice operator. For example: +# +# my_list[1 : 10 : 2] +spaces_around_subscript_colon=False + +# Adds a space after the opening '(' and before the ending ')' tuple delimiters. +# +# (1, 2, 3) +# +# will be formatted as: +# +# ( 1, 2, 3 ) +spaces_around_tuple_delimiters=False + +# The number of spaces required before a trailing comment. 
+# This can be a single value (representing the number of spaces +# before each trailing comment) or list of values (representing +# alignment column values; trailing comments within a block will +# be aligned to the first column value that is greater than the maximum +# line length within the block). For example: +# +# With spaces_before_comment=5: +# +# 1 + 1 # Adding values +# +# will be formatted as: +# +# 1 + 1 # Adding values <-- 5 spaces between the end of the statement and comment +# +# With spaces_before_comment=15, 20: +# +# 1 + 1 # Adding values +# two + two # More adding +# +# longer_statement # This is a longer statement +# short # This is a shorter statement +# +# a_very_long_statement_that_extends_beyond_the_final_column # Comment +# short # This is a shorter statement +# +# will be formatted as: +# +# 1 + 1 # Adding values <-- end of line comments in block aligned to col 15 +# two + two # More adding +# +# longer_statement # This is a longer statement <-- end of line comments in block aligned to col 20 +# short # This is a shorter statement +# +# a_very_long_statement_that_extends_beyond_the_final_column # Comment <-- the end of line comments are aligned based on the line length +# short # This is a shorter statement +# +spaces_before_comment=2 + +# Insert a space between the ending comma and closing bracket of a list, +# etc. +space_between_ending_comma_and_closing_bracket=False + +# Use spaces inside brackets, braces, and parentheses. For example: +# +# method_call( 1 ) +# my_dict[ 3 ][ 1 ][ get_index( *args, **kwargs ) ] +# my_set = { 1, 2, 3 } +space_inside_brackets=False + +# Split before arguments +split_all_comma_separated_values=False + +# Split before arguments, but do not split all subexpressions recursively +# (unless needed). +split_all_top_level_comma_separated_values=False + +# Split before arguments if the argument list is terminated by a +# comma. 
+split_arguments_when_comma_terminated=False + +# Set to True to prefer splitting before '+', '-', '*', '/', '//', or '@' +# rather than after. +split_before_arithmetic_operator=False + +# Set to True to prefer splitting before '&', '|' or '^' rather than +# after. +split_before_bitwise_operator=True + +# Split before the closing bracket if a list or dict literal doesn't fit on +# a single line. +split_before_closing_bracket=True + +# Split before a dictionary or set generator (comp_for). For example, note +# the split before the 'for': +# +# foo = { +# variable: 'Hello world, have a nice day!' +# for variable in bar if variable != 42 +# } +split_before_dict_set_generator=True + +# Split before the '.' if we need to split a longer expression: +# +# foo = ('This is a really long string: {}, {}, {}, {}'.format(a, b, c, d)) +# +# would reformat to something like: +# +# foo = ('This is a really long string: {}, {}, {}, {}' +# .format(a, b, c, d)) +split_before_dot=False + +# Split after the opening paren which surrounds an expression if it doesn't +# fit on a single line. +split_before_expression_after_opening_paren=False + +# If an argument / parameter list is going to be split, then split before +# the first argument. +split_before_first_argument=False + +# Set to True to prefer splitting before 'and' or 'or' rather than +# after. +split_before_logical_operator=True + +# Split named assignments onto individual lines. +split_before_named_assigns=True + +# Set to True to split list comprehensions and generators that have +# non-trivial expressions and multiple clauses before each of these +# clauses. For example: +# +# result = [ +# a_long_var + 100 for a_long_var in xrange(1000) +# if a_long_var % 10] +# +# would reformat to something like: +# +# result = [ +# a_long_var + 100 +# for a_long_var in xrange(1000) +# if a_long_var % 10] +split_complex_comprehension=False + +# The penalty for splitting right after the opening bracket. 
+split_penalty_after_opening_bracket=300 + +# The penalty for splitting the line after a unary operator. +split_penalty_after_unary_operator=10000 + +# The penalty of splitting the line around the '+', '-', '*', '/', '//', +# ``%``, and '@' operators. +split_penalty_arithmetic_operator=300 + +# The penalty for splitting right before an if expression. +split_penalty_before_if_expr=0 + +# The penalty of splitting the line around the '&', '|', and '^' +# operators. +split_penalty_bitwise_operator=300 + +# The penalty for splitting a list comprehension or generator +# expression. +split_penalty_comprehension=80 + +# The penalty for characters over the column limit. +split_penalty_excess_character=7000 + +# The penalty incurred by adding a line split to the unwrapped line. The +# more line splits added the higher the penalty. +split_penalty_for_added_line_split=30 + +# The penalty of splitting a list of "import as" names. For example: +# +# from a_very_long_or_indented_module_name_yada_yad import (long_argument_1, +# long_argument_2, +# long_argument_3) +# +# would reformat to something like: +# +# from a_very_long_or_indented_module_name_yada_yad import ( +# long_argument_1, long_argument_2, long_argument_3) +split_penalty_import_names=0 + +# The penalty of splitting the line around the 'and' and 'or' +# operators. +split_penalty_logical_operator=300 + +# Use the Tab character for indentation. 
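As an aside on the style file above: yapf style files are plain INI text, so their settings can be inspected programmatically. The sketch below reads a few of the options documented above with the standard-library `configparser`; the `[style]` section header and the inline string (rather than a real `.style.yapf` path) are illustrative assumptions, not copied from this repository's file.

```python
# Minimal sketch: inspect yapf-style INI options with configparser.
# The "[style]" section header and the option values here are
# illustrative assumptions, not the repository's actual style file.
import configparser
from io import StringIO

STYLE_FILE = """\
[style]
based_on_style = pep8
indent_width = 4
split_before_logical_operator = True
split_penalty_after_opening_bracket = 300
use_tabs = False
"""

parser = configparser.ConfigParser()
parser.read_file(StringIO(STYLE_FILE))

# Typed accessors convert the raw strings for us.
indent_width = parser.getint('style', 'indent_width')
split_logical = parser.getboolean('style', 'split_before_logical_operator')
print(indent_width, split_logical)  # prints "4 True"
```

The same pattern works for any of the boolean, integer, or penalty knobs listed above, which is convenient when auditing which defaults a project has overridden.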
+use_tabs=False + diff --git a/tests/data/build_dataset/build_dataset.inter b/tests/data/build_dataset/build_dataset.inter new file mode 100644 index 000000000..519234283 --- /dev/null +++ b/tests/data/build_dataset/build_dataset.inter @@ -0,0 +1,21 @@ +user_id:token item_id:token timestamp:float +1 1 1 +1 2 2 +1 3 3 +1 4 4 +1 5 5 +1 6 6 +1 7 7 +1 8 8 +1 9 9 +1 10 10 +1 11 11 +1 12 12 +1 13 13 +1 14 14 +1 15 15 +1 16 16 +1 17 17 +1 18 18 +1 19 19 +1 20 20 \ No newline at end of file diff --git a/tests/data/filter_by_field_value/filter_by_field_value.inter b/tests/data/filter_by_field_value/filter_by_field_value.inter new file mode 100644 index 000000000..830b16236 --- /dev/null +++ b/tests/data/filter_by_field_value/filter_by_field_value.inter @@ -0,0 +1,11 @@ +user_id:token item_id:token timestamp:float rating:float +1 1 4 2 +1 1 6 0 +1 1 0 0 +1 1 8 3 +1 1 3 3 +1 1 1 0 +1 1 9 3 +1 1 2 1 +1 1 5 2 +1 1 7 4 \ No newline at end of file diff --git a/tests/data/filter_by_inter_num/filter_by_inter_num.inter b/tests/data/filter_by_inter_num/filter_by_inter_num.inter new file mode 100644 index 000000000..88c40e08c --- /dev/null +++ b/tests/data/filter_by_inter_num/filter_by_inter_num.inter @@ -0,0 +1,14 @@ +user_id:token item_id:token +1 1 +2 1 +2 2 +3 3 +3 4 +3 5 +3 6 +4 3 +4 4 +5 5 +5 6 +6 5 +6 6 \ No newline at end of file diff --git a/tests/data/filter_inter_by_ui_and_inter_num/filter_inter_by_ui_and_inter_num.inter b/tests/data/filter_inter_by_ui_and_inter_num/filter_inter_by_ui_and_inter_num.inter new file mode 100644 index 000000000..ce484721c --- /dev/null +++ b/tests/data/filter_inter_by_ui_and_inter_num/filter_inter_by_ui_and_inter_num.inter @@ -0,0 +1,9 @@ +user_id:token item_id:token +1 1 +1 2 +2 1 +2 2 +3 3 +3 4 +4 3 +4 4 \ No newline at end of file diff --git a/tests/data/filter_inter_by_ui_and_inter_num/filter_inter_by_ui_and_inter_num.item b/tests/data/filter_inter_by_ui_and_inter_num/filter_inter_by_ui_and_inter_num.item new file mode 100644 index 
000000000..c8acfcf74 --- /dev/null +++ b/tests/data/filter_inter_by_ui_and_inter_num/filter_inter_by_ui_and_inter_num.item @@ -0,0 +1,4 @@ +item_id:token price:float +1 0 +3 0 +4 0 \ No newline at end of file diff --git a/tests/data/filter_inter_by_ui_and_inter_num/filter_inter_by_ui_and_inter_num.user b/tests/data/filter_inter_by_ui_and_inter_num/filter_inter_by_ui_and_inter_num.user new file mode 100644 index 000000000..548fb05b5 --- /dev/null +++ b/tests/data/filter_inter_by_ui_and_inter_num/filter_inter_by_ui_and_inter_num.user @@ -0,0 +1,4 @@ +user_id:token age:float +1 0 +3 0 +4 0 \ No newline at end of file diff --git a/tests/data/filter_inter_by_user_or_item/filter_inter_by_user_or_item.inter b/tests/data/filter_inter_by_user_or_item/filter_inter_by_user_or_item.inter new file mode 100644 index 000000000..51b0e8ad2 --- /dev/null +++ b/tests/data/filter_inter_by_user_or_item/filter_inter_by_user_or_item.inter @@ -0,0 +1,3 @@ +user_id:token item_id:token +1 1 +2 2 \ No newline at end of file diff --git a/tests/data/filter_inter_by_user_or_item/filter_inter_by_user_or_item.item b/tests/data/filter_inter_by_user_or_item/filter_inter_by_user_or_item.item new file mode 100644 index 000000000..5a0cb21b9 --- /dev/null +++ b/tests/data/filter_inter_by_user_or_item/filter_inter_by_user_or_item.item @@ -0,0 +1,2 @@ +item_id:token price:float +1 1 \ No newline at end of file diff --git a/tests/data/filter_inter_by_user_or_item/filter_inter_by_user_or_item.user b/tests/data/filter_inter_by_user_or_item/filter_inter_by_user_or_item.user new file mode 100644 index 000000000..0c5348fe1 --- /dev/null +++ b/tests/data/filter_inter_by_user_or_item/filter_inter_by_user_or_item.user @@ -0,0 +1,2 @@ +user_id:token age:float +1 1 \ No newline at end of file diff --git a/tests/data/filter_nan_user_or_item/filter_nan_user_or_item.inter b/tests/data/filter_nan_user_or_item/filter_nan_user_or_item.inter new file mode 100644 index 000000000..bc0d57222 --- /dev/null +++ 
b/tests/data/filter_nan_user_or_item/filter_nan_user_or_item.inter @@ -0,0 +1,5 @@ +user_id:token item_id:token timestamp:float +1 0 + 1 1 + 2 +1 1 3 \ No newline at end of file diff --git a/tests/data/filter_nan_user_or_item/filter_nan_user_or_item.item b/tests/data/filter_nan_user_or_item/filter_nan_user_or_item.item new file mode 100644 index 000000000..13197d312 --- /dev/null +++ b/tests/data/filter_nan_user_or_item/filter_nan_user_or_item.item @@ -0,0 +1,5 @@ +item_id:token price:float + 0 +1 1 + 2 +2 3 \ No newline at end of file diff --git a/tests/data/filter_nan_user_or_item/filter_nan_user_or_item.user b/tests/data/filter_nan_user_or_item/filter_nan_user_or_item.user new file mode 100644 index 000000000..0ea7fa39a --- /dev/null +++ b/tests/data/filter_nan_user_or_item/filter_nan_user_or_item.user @@ -0,0 +1,4 @@ +user_id:token age:float +1 0 + 1 +2 0 \ No newline at end of file diff --git a/tests/data/filter_value_and_filter_inter_by_ui/filter_value_and_filter_inter_by_ui.inter b/tests/data/filter_value_and_filter_inter_by_ui/filter_value_and_filter_inter_by_ui.inter new file mode 100644 index 000000000..eff0339c2 --- /dev/null +++ b/tests/data/filter_value_and_filter_inter_by_ui/filter_value_and_filter_inter_by_ui.inter @@ -0,0 +1,6 @@ +user_id:token item_id:token +1 1 +1 2 +2 2 +2 3 +3 3 \ No newline at end of file diff --git a/tests/data/filter_value_and_filter_inter_by_ui/filter_value_and_filter_inter_by_ui.item b/tests/data/filter_value_and_filter_inter_by_ui/filter_value_and_filter_inter_by_ui.item new file mode 100644 index 000000000..28bd72865 --- /dev/null +++ b/tests/data/filter_value_and_filter_inter_by_ui/filter_value_and_filter_inter_by_ui.item @@ -0,0 +1,4 @@ +item_id:token price:float +1 3 +2 2 +3 1 \ No newline at end of file diff --git a/tests/data/filter_value_and_filter_inter_by_ui/filter_value_and_filter_inter_by_ui.user b/tests/data/filter_value_and_filter_inter_by_ui/filter_value_and_filter_inter_by_ui.user new file mode 100644 index 
000000000..72a75cda7 --- /dev/null +++ b/tests/data/filter_value_and_filter_inter_by_ui/filter_value_and_filter_inter_by_ui.user @@ -0,0 +1,4 @@ +user_id:token age:float +1 1 +2 2 +3 3 \ No newline at end of file diff --git a/tests/data/filter_value_and_inter_num/filter_value_and_inter_num.inter b/tests/data/filter_value_and_inter_num/filter_value_and_inter_num.inter new file mode 100644 index 000000000..387d8f007 --- /dev/null +++ b/tests/data/filter_value_and_inter_num/filter_value_and_inter_num.inter @@ -0,0 +1,15 @@ +user_id:token item_id:token rating:float +1 1 0 +1 2 0 +2 1 0 +2 3 0 +3 2 0 +3 3 0 +4 4 1 +4 5 0 +5 4 0 +5 5 0 +6 6 0 +6 7 0 +7 6 0 +7 7 0 \ No newline at end of file diff --git a/tests/data/filter_value_and_inter_num/filter_value_and_inter_num.item b/tests/data/filter_value_and_inter_num/filter_value_and_inter_num.item new file mode 100644 index 000000000..ed2edf288 --- /dev/null +++ b/tests/data/filter_value_and_inter_num/filter_value_and_inter_num.item @@ -0,0 +1,8 @@ +item_id:token price:float +1 0 +2 0 +3 1 +4 0 +5 0 +6 0 +7 0 \ No newline at end of file diff --git a/tests/data/filter_value_and_inter_num/filter_value_and_inter_num.user b/tests/data/filter_value_and_inter_num/filter_value_and_inter_num.user new file mode 100644 index 000000000..71b1f48b0 --- /dev/null +++ b/tests/data/filter_value_and_inter_num/filter_value_and_inter_num.user @@ -0,0 +1,8 @@ +user_id:token age:float +1 0 +2 0 +3 1 +4 0 +5 0 +6 0 +7 0 \ No newline at end of file diff --git a/tests/data/general_dataloader/general_dataloader.inter b/tests/data/general_dataloader/general_dataloader.inter new file mode 100644 index 000000000..0b7e82c22 --- /dev/null +++ b/tests/data/general_dataloader/general_dataloader.inter @@ -0,0 +1,51 @@ +user_id:token item_id:token timestamp:float +1 1 1 +1 2 2 +1 3 3 +1 4 4 +1 5 5 +1 6 6 +1 7 7 +1 8 8 +1 9 9 +1 10 10 +1 11 11 +1 12 12 +1 13 13 +1 14 14 +1 15 15 +1 16 16 +1 17 17 +1 18 18 +1 19 19 +1 20 20 +1 21 21 +1 22 22 +1 23 23 +1 24 24 
+1 25 25 +1 26 26 +1 27 27 +1 28 28 +1 29 29 +1 30 30 +1 31 31 +1 32 32 +1 33 33 +1 34 34 +1 35 35 +1 36 36 +1 37 37 +1 38 38 +1 39 39 +1 40 40 +1 41 41 +1 42 42 +1 43 43 +1 44 44 +1 45 45 +1 46 46 +1 47 47 +1 48 48 +1 49 49 +1 50 50 \ No newline at end of file diff --git a/tests/data/general_dataloader/general_dataloader.item b/tests/data/general_dataloader/general_dataloader.item new file mode 100644 index 000000000..e3239d7df --- /dev/null +++ b/tests/data/general_dataloader/general_dataloader.item @@ -0,0 +1,101 @@ +item_id:token price:float +1 1 +2 2 +3 3 +4 4 +5 5 +6 6 +7 7 +8 8 +9 9 +10 10 +11 11 +12 12 +13 13 +14 14 +15 15 +16 16 +17 17 +18 18 +19 19 +20 20 +21 21 +22 22 +23 23 +24 24 +25 25 +26 26 +27 27 +28 28 +29 29 +30 30 +31 31 +32 32 +33 33 +34 34 +35 35 +36 36 +37 37 +38 38 +39 39 +40 40 +41 41 +42 42 +43 43 +44 44 +45 45 +46 46 +47 47 +48 48 +49 49 +50 50 +51 51 +52 52 +53 53 +54 54 +55 55 +56 56 +57 57 +58 58 +59 59 +60 60 +61 61 +62 62 +63 63 +64 64 +65 65 +66 66 +67 67 +68 68 +69 69 +70 70 +71 71 +72 72 +73 73 +74 74 +75 75 +76 76 +77 77 +78 78 +79 79 +80 80 +81 81 +82 82 +83 83 +84 84 +85 85 +86 86 +87 87 +88 88 +89 89 +90 90 +91 91 +92 92 +93 93 +94 94 +95 95 +96 96 +97 97 +98 98 +99 99 +100 100 \ No newline at end of file diff --git a/tests/data/general_full_dataloader/general_full_dataloader.inter b/tests/data/general_full_dataloader/general_full_dataloader.inter new file mode 100644 index 000000000..0d59a5e00 --- /dev/null +++ b/tests/data/general_full_dataloader/general_full_dataloader.inter @@ -0,0 +1,111 @@ +user_id:token item_id:token timestamp:float +1 1 1 +1 2 2 +1 3 3 +1 4 4 +1 5 5 +1 6 6 +1 7 7 +1 8 8 +1 9 9 +1 10 10 +1 11 11 +1 12 12 +1 13 13 +1 14 14 +1 15 15 +1 16 16 +1 17 17 +1 18 18 +1 19 19 +1 20 20 +1 21 21 +1 22 22 +1 23 23 +1 24 24 +1 25 25 +1 26 26 +1 27 27 +1 28 28 +1 29 29 +1 30 30 +1 31 31 +1 32 32 +1 33 33 +1 34 34 +1 35 35 +1 36 36 +1 37 37 +1 38 38 +1 39 39 +1 40 40 +1 41 41 +1 42 42 +1 43 43 +1 44 44 +1 45 45 +1 46 
46 +1 47 47 +1 48 48 +1 49 49 +1 50 50 +2 1 1 +2 2 2 +2 3 3 +2 4 4 +2 5 5 +2 6 6 +2 7 7 +2 8 8 +2 9 9 +2 10 10 +2 11 11 +2 12 12 +2 13 13 +2 14 14 +2 15 15 +2 16 16 +2 17 17 +2 18 18 +2 19 19 +2 20 20 +2 21 21 +2 22 22 +2 23 23 +2 24 24 +2 25 25 +2 26 26 +2 27 27 +2 28 28 +2 29 29 +2 30 30 +2 31 31 +2 32 32 +2 33 33 +2 34 34 +2 35 35 +2 36 36 +2 37 37 +2 38 38 +2 39 39 +2 40 40 +2 38 41 +2 39 42 +2 40 43 +2 41 44 +2 42 45 +2 36 46 +2 37 47 +2 38 48 +2 39 49 +2 40 50 +3 1 1 +3 1 2 +3 1 3 +3 1 4 +3 1 5 +3 1 6 +3 1 7 +3 1 8 +3 1 9 +3 1 10 \ No newline at end of file diff --git a/tests/data/general_full_dataloader/general_full_dataloader.item b/tests/data/general_full_dataloader/general_full_dataloader.item new file mode 100644 index 000000000..e3239d7df --- /dev/null +++ b/tests/data/general_full_dataloader/general_full_dataloader.item @@ -0,0 +1,101 @@ +item_id:token price:float +1 1 +2 2 +3 3 +4 4 +5 5 +6 6 +7 7 +8 8 +9 9 +10 10 +11 11 +12 12 +13 13 +14 14 +15 15 +16 16 +17 17 +18 18 +19 19 +20 20 +21 21 +22 22 +23 23 +24 24 +25 25 +26 26 +27 27 +28 28 +29 29 +30 30 +31 31 +32 32 +33 33 +34 34 +35 35 +36 36 +37 37 +38 38 +39 39 +40 40 +41 41 +42 42 +43 43 +44 44 +45 45 +46 46 +47 47 +48 48 +49 49 +50 50 +51 51 +52 52 +53 53 +54 54 +55 55 +56 56 +57 57 +58 58 +59 59 +60 60 +61 61 +62 62 +63 63 +64 64 +65 65 +66 66 +67 67 +68 68 +69 69 +70 70 +71 71 +72 72 +73 73 +74 74 +75 75 +76 76 +77 77 +78 78 +79 79 +80 80 +81 81 +82 82 +83 83 +84 84 +85 85 +86 86 +87 87 +88 88 +89 89 +90 90 +91 91 +92 92 +93 93 +94 94 +95 95 +96 96 +97 97 +98 98 +99 99 +100 100 \ No newline at end of file diff --git a/tests/data/general_uni100_dataloader/general_uni100_dataloader.inter b/tests/data/general_uni100_dataloader/general_uni100_dataloader.inter new file mode 100644 index 000000000..038b59c3d --- /dev/null +++ b/tests/data/general_uni100_dataloader/general_uni100_dataloader.inter @@ -0,0 +1,41 @@ +user_id:token item_id:token timestamp:float +1 1 1 +1 2 2 +1 3 3 +1 4 4 +1 5 5 +1 6 6 +1 
7 7 +1 8 8 +1 9 9 +1 10 10 +2 1 1 +2 1 2 +2 1 3 +2 1 4 +2 1 5 +2 1 6 +2 1 7 +2 1 8 +2 1 9 +2 1 10 +3 1 1 +3 2 2 +3 3 3 +3 4 4 +3 5 5 +3 6 6 +3 7 7 +3 8 8 +3 9 9 +3 10 10 +3 11 11 +3 12 12 +3 13 13 +3 14 14 +3 15 15 +3 16 16 +3 17 17 +3 18 18 +3 19 19 +3 20 20 \ No newline at end of file diff --git a/tests/data/general_uni100_dataloader/general_uni100_dataloader.item b/tests/data/general_uni100_dataloader/general_uni100_dataloader.item new file mode 100644 index 000000000..e3239d7df --- /dev/null +++ b/tests/data/general_uni100_dataloader/general_uni100_dataloader.item @@ -0,0 +1,101 @@ +item_id:token price:float +1 1 +2 2 +3 3 +4 4 +5 5 +6 6 +7 7 +8 8 +9 9 +10 10 +11 11 +12 12 +13 13 +14 14 +15 15 +16 16 +17 17 +18 18 +19 19 +20 20 +21 21 +22 22 +23 23 +24 24 +25 25 +26 26 +27 27 +28 28 +29 29 +30 30 +31 31 +32 32 +33 33 +34 34 +35 35 +36 36 +37 37 +38 38 +39 39 +40 40 +41 41 +42 42 +43 43 +44 44 +45 45 +46 46 +47 47 +48 48 +49 49 +50 50 +51 51 +52 52 +53 53 +54 54 +55 55 +56 56 +57 57 +58 58 +59 59 +60 60 +61 61 +62 62 +63 63 +64 64 +65 65 +66 66 +67 67 +68 68 +69 69 +70 70 +71 71 +72 72 +73 73 +74 74 +75 75 +76 76 +77 77 +78 78 +79 79 +80 80 +81 81 +82 82 +83 83 +84 84 +85 85 +86 86 +87 87 +88 88 +89 89 +90 90 +91 91 +92 92 +93 93 +94 94 +95 95 +96 96 +97 97 +98 98 +99 99 +100 100 \ No newline at end of file diff --git a/tests/data/normalize/normalize.inter b/tests/data/normalize/normalize.inter new file mode 100644 index 000000000..a3ede2f78 --- /dev/null +++ b/tests/data/normalize/normalize.inter @@ -0,0 +1,6 @@ +user_id:token item_id:token rating:float star:float +1 1 0 4 +2 2 1 2 +3 3 4 0 +4 4 3 1 +5 5 2 3 \ No newline at end of file diff --git a/tests/data/remap_id/remap_id.inter b/tests/data/remap_id/remap_id.inter new file mode 100644 index 000000000..7ce9e9b8b --- /dev/null +++ b/tests/data/remap_id/remap_id.inter @@ -0,0 +1,5 @@ +user_id:token item_id:token add_user:token add_item:token user_list:token_seq +ua ia ub ie uc ue +ub ib ue ic +uc ic ud if ua 
ub uc +ud id uf ia uf \ No newline at end of file diff --git a/tests/data/remove_duplication/remove_duplication.inter b/tests/data/remove_duplication/remove_duplication.inter new file mode 100644 index 000000000..c7356667d --- /dev/null +++ b/tests/data/remove_duplication/remove_duplication.inter @@ -0,0 +1,4 @@ +user_id:token item_id:token timestamp:float +1 1 1 +1 1 0 +1 1 2 \ No newline at end of file diff --git a/tests/data/rm_dup_and_filter_by_inter_num/rm_dup_and_filter_by_inter_num.inter b/tests/data/rm_dup_and_filter_by_inter_num/rm_dup_and_filter_by_inter_num.inter new file mode 100644 index 000000000..0415e73eb --- /dev/null +++ b/tests/data/rm_dup_and_filter_by_inter_num/rm_dup_and_filter_by_inter_num.inter @@ -0,0 +1,9 @@ +user_id:token item_id:token +1 1 +1 2 +2 1 +2 2 +3 3 +3 3 +3 4 +4 4 \ No newline at end of file diff --git a/tests/data/rm_dup_and_filter_value/rm_dup_and_filter_value.inter b/tests/data/rm_dup_and_filter_value/rm_dup_and_filter_value.inter new file mode 100644 index 000000000..ce83c964f --- /dev/null +++ b/tests/data/rm_dup_and_filter_value/rm_dup_and_filter_value.inter @@ -0,0 +1,5 @@ +user_id:token item_id:token timestamp:float rating:float +1 1 1 1 +1 1 0 5 +1 1 2 3 +2 2 0 3 \ No newline at end of file diff --git a/tests/data/set_label_by_threshold/set_label_by_threshold.inter b/tests/data/set_label_by_threshold/set_label_by_threshold.inter new file mode 100644 index 000000000..e1e20b2a5 --- /dev/null +++ b/tests/data/set_label_by_threshold/set_label_by_threshold.inter @@ -0,0 +1,5 @@ +user_id:token item_id:token rating:float +1 1 5 +2 2 3 +3 3 4 +4 4 2 \ No newline at end of file diff --git a/tests/data/test_dataloader.py b/tests/data/test_dataloader.py new file mode 100644 index 000000000..e79e86030 --- /dev/null +++ b/tests/data/test_dataloader.py @@ -0,0 +1,373 @@ +# -*- coding: utf-8 -*- +# @Time : 2021/1/5 +# @Author : Yushuo Chen +# @Email : chenyushuo@ruc.edu.cn + +# UPDATE +# @Time : 2020/1/5 +# @Author : Yushuo Chen +# 
@email : chenyushuo@ruc.edu.cn + +import logging +import os + +import pytest + +from recbole.config import Config +from recbole.data import create_dataset, data_preparation +from recbole.utils import init_seed + +current_path = os.path.dirname(os.path.realpath(__file__)) + + +def new_dataloader(config_dict=None, config_file_list=None): + config = Config(config_dict=config_dict, config_file_list=config_file_list) + init_seed(config['seed'], config['reproducibility']) + logging.basicConfig(level=logging.ERROR) + dataset = create_dataset(config) + return data_preparation(config, dataset) + + +class TestGeneralDataloader: + def test_general_dataloader(self): + train_batch_size = 6 + eval_batch_size = 2 + config_dict = { + 'model': 'BPR', + 'dataset': 'general_dataloader', + 'data_path': current_path, + 'load_col': None, + 'eval_setting': 'TO_RS', + 'training_neg_sample_num': 0, + 'split_ratio': [0.8, 0.1, 0.1], + 'train_batch_size': train_batch_size, + 'eval_batch_size': eval_batch_size, + } + train_data, valid_data, test_data = new_dataloader(config_dict=config_dict) + + def check_dataloader(data, item_list, batch_size): + data.shuffle = False + pr = 0 + for batch_data in data: + batch_item_list = item_list[pr: pr + batch_size] + assert (batch_data['item_id'].numpy() == batch_item_list).all() + pr += batch_size + + check_dataloader(train_data, list(range(1, 41)), train_batch_size) + check_dataloader(valid_data, list(range(41, 46)), eval_batch_size) + check_dataloader(test_data, list(range(46, 51)), eval_batch_size) + + def test_general_neg_sample_dataloader_in_pair_wise(self): + train_batch_size = 6 + eval_batch_size = 100 + config_dict = { + 'model': 'BPR', + 'dataset': 'general_dataloader', + 'data_path': current_path, + 'load_col': None, + 'eval_setting': 'TO_RS,full', + 'training_neg_sample_num': 1, + 'split_ratio': [0.8, 0.1, 0.1], + 'train_batch_size': train_batch_size, + 'eval_batch_size': eval_batch_size, + } + train_data, valid_data, test_data = 
new_dataloader(config_dict=config_dict) + + train_data.shuffle = False + train_item_list = list(range(1, 41)) + pr = 0 + for batch_data in train_data: + batch_item_list = train_item_list[pr: pr + train_batch_size] + assert (batch_data['item_id'].numpy() == batch_item_list).all() + assert (batch_data['item_id'] == batch_data['price']).all() + assert (40 < batch_data['neg_item_id']).all() + assert (batch_data['neg_item_id'] <= 100).all() + assert (batch_data['neg_item_id'] == batch_data['neg_price']).all() + pr += train_batch_size + + def test_general_neg_sample_dataloader_in_point_wise(self): + train_batch_size = 6 + eval_batch_size = 100 + config_dict = { + 'model': 'DMF', + 'dataset': 'general_dataloader', + 'data_path': current_path, + 'load_col': None, + 'eval_setting': 'TO_RS,full', + 'training_neg_sample_num': 1, + 'split_ratio': [0.8, 0.1, 0.1], + 'train_batch_size': train_batch_size, + 'eval_batch_size': eval_batch_size, + } + train_data, valid_data, test_data = new_dataloader(config_dict=config_dict) + + train_data.shuffle = False + train_item_list = list(range(1, 41)) + pr = 0 + for batch_data in train_data: + step = len(batch_data) // 2 + batch_item_list = train_item_list[pr: pr + step] + assert (batch_data['item_id'][: step].numpy() == batch_item_list).all() + assert (40 < batch_data['item_id'][step:]).all() + assert (batch_data['item_id'][step:] <= 100).all() + assert (batch_data['item_id'] == batch_data['price']).all() + pr += step + + def test_general_full_dataloader(self): + train_batch_size = 6 + eval_batch_size = 100 + config_dict = { + 'model': 'BPR', + 'dataset': 'general_full_dataloader', + 'data_path': current_path, + 'load_col': None, + 'eval_setting': 'TO_RS,full', + 'training_neg_sample_num': 1, + 'split_ratio': [0.8, 0.1, 0.1], + 'train_batch_size': train_batch_size, + 'eval_batch_size': eval_batch_size, + } + train_data, valid_data, test_data = new_dataloader(config_dict=config_dict) + + def check_result(data, result): + assert len(data) 
== len(result) + for i, batch_data in enumerate(data): + user_df, history_index, swap_row, swap_col_after, swap_col_before = batch_data + history_row, history_col = history_index + assert len(user_df) == result[i]['len_user_df'] + assert (user_df['user_id'].numpy() == result[i]['user_df_user_id']).all() + assert (user_df.pos_len_list == result[i]['pos_len_list']).all() + assert (user_df.user_len_list == result[i]['user_len_list']).all() + assert len(history_row) == len(history_col) == result[i]['history_len'] + assert (history_row.numpy() == result[i]['history_row']).all() + assert (history_col.numpy() == result[i]['history_col']).all() + assert len(swap_row) == len(swap_col_after) == len(swap_col_before) == result[i]['swap_len'] + assert (swap_row.numpy() == result[i]['swap_row']).all() + assert (swap_col_after.numpy() == result[i]['swap_col_after']).all() + assert (swap_col_before.numpy() == result[i]['swap_col_before']).all() + + valid_result = [ + { + 'len_user_df': 1, + 'user_df_user_id': [1], + 'pos_len_list': [5], + 'user_len_list': [101], + 'history_len': 40, + 'history_row': 0, + 'history_col': list(range(1, 41)), + 'swap_len': 10, + 'swap_row': 0, + 'swap_col_after': [0, 1, 2, 3, 4, 41, 42, 43, 44, 45], + 'swap_col_before': [45, 44, 43, 42, 41, 4, 3, 2, 1, 0], + }, + { + 'len_user_df': 1, + 'user_df_user_id': [2], + 'pos_len_list': [5], + 'user_len_list': [101], + 'history_len': 37, + 'history_row': 0, + 'history_col': list(range(1, 38)), + 'swap_len': 10, + 'swap_row': 0, + 'swap_col_after': [0, 1, 2, 3, 4, 38, 39, 40, 41, 42], + 'swap_col_before': [42, 41, 40, 39, 38, 4, 3, 2, 1, 0], + }, + { + 'len_user_df': 1, + 'user_df_user_id': [3], + 'pos_len_list': [1], + 'user_len_list': [101], + 'history_len': 0, + 'history_row': [], + 'history_col': [], + 'swap_len': 2, + 'swap_row': 0, + 'swap_col_after': [0, 1], + 'swap_col_before': [1, 0], + }, + ] + check_result(valid_data, valid_result) + + test_result = [ + { + 'len_user_df': 1, + 'user_df_user_id': [1], 
+ 'pos_len_list': [5], + 'user_len_list': [101], + 'history_len': 45, + 'history_row': 0, + 'history_col': list(range(1, 46)), + 'swap_len': 10, + 'swap_row': 0, + 'swap_col_after': [0, 1, 2, 3, 4, 46, 47, 48, 49, 50], + 'swap_col_before': [50, 49, 48, 47, 46, 4, 3, 2, 1, 0], + }, + { + 'len_user_df': 1, + 'user_df_user_id': [2], + 'pos_len_list': [5], + 'user_len_list': [101], + 'history_len': 37, + 'history_row': 0, + 'history_col': list(range(1, 36)) + [41, 42], + 'swap_len': 10, + 'swap_row': 0, + 'swap_col_after': [0, 1, 2, 3, 4, 36, 37, 38, 39, 40], + 'swap_col_before': [40, 39, 38, 37, 36, 4, 3, 2, 1, 0], + }, + { + 'len_user_df': 1, + 'user_df_user_id': [3], + 'pos_len_list': [1], + 'user_len_list': [101], + 'history_len': 0, + 'history_row': [], + 'history_col': [], + 'swap_len': 2, + 'swap_row': 0, + 'swap_col_after': [0, 1], + 'swap_col_before': [1, 0], + }, + ] + check_result(test_data, test_result) + + def test_general_uni100_dataloader_with_batch_size_in_101(self): + train_batch_size = 6 + eval_batch_size = 101 + config_dict = { + 'model': 'BPR', + 'dataset': 'general_uni100_dataloader', + 'data_path': current_path, + 'load_col': None, + 'eval_setting': 'TO_RS,uni100', + 'training_neg_sample_num': 1, + 'split_ratio': [0.8, 0.1, 0.1], + 'train_batch_size': train_batch_size, + 'eval_batch_size': eval_batch_size, + } + train_data, valid_data, test_data = new_dataloader(config_dict=config_dict) + + def check_result(data, result): + assert data.batch_size == 202 + assert len(data) == len(result) + for i, batch_data in enumerate(data): + assert result[i]['item_id_check'](batch_data['item_id']) + assert batch_data.pos_len_list == result[i]['pos_len_list'] + assert batch_data.user_len_list == result[i]['user_len_list'] + + valid_result = [ + { + 'item_id_check': lambda data: data[0] == 9 + and (8 < data[1:]).all() + and (data[1:] <= 100).all(), + 'pos_len_list': [1], + 'user_len_list': [101], + }, + { + 'item_id_check': lambda data: data[0] == 1 + and 
(data[1:] != 1).all(), + 'pos_len_list': [1], + 'user_len_list': [101], + }, + { + 'item_id_check': lambda data: (data[0: 2].numpy() == [17, 18]).all() + and (16 < data[2:]).all() + and (data[2:] <= 100).all(), + 'pos_len_list': [2], + 'user_len_list': [202], + }, + ] + check_result(valid_data, valid_result) + + test_result = [ + { + 'item_id_check': lambda data: data[0] == 10 + and (9 < data[1:]).all() + and (data[1:] <= 100).all(), + 'pos_len_list': [1], + 'user_len_list': [101], + }, + { + 'item_id_check': lambda data: data[0] == 1 + and (data[1:] != 1).all(), + 'pos_len_list': [1], + 'user_len_list': [101], + }, + { + 'item_id_check': lambda data: (data[0: 2].numpy() == [19, 20]).all() + and (18 < data[2:]).all() + and (data[2:] <= 100).all(), + 'pos_len_list': [2], + 'user_len_list': [202], + }, + ] + check_result(test_data, test_result) + + def test_general_uni100_dataloader_with_batch_size_in_303(self): + train_batch_size = 6 + eval_batch_size = 303 + config_dict = { + 'model': 'BPR', + 'dataset': 'general_uni100_dataloader', + 'data_path': current_path, + 'load_col': None, + 'eval_setting': 'TO_RS,uni100', + 'training_neg_sample_num': 1, + 'split_ratio': [0.8, 0.1, 0.1], + 'train_batch_size': train_batch_size, + 'eval_batch_size': eval_batch_size, + } + train_data, valid_data, test_data = new_dataloader(config_dict=config_dict) + + def check_result(data, result): + assert data.batch_size == 303 + assert len(data) == len(result) + for i, batch_data in enumerate(data): + assert result[i]['item_id_check'](batch_data['item_id']) + assert batch_data.pos_len_list == result[i]['pos_len_list'] + assert batch_data.user_len_list == result[i]['user_len_list'] + + valid_result = [ + { + 'item_id_check': lambda data: data[0] == 9 + and (8 < data[1: 101]).all() + and (data[1: 101] <= 100).all() + and data[101] == 1 + and (data[102:202] != 1).all(), + 'pos_len_list': [1, 1], + 'user_len_list': [101, 101], + }, + { + 'item_id_check': lambda data: (data[0: 2].numpy() == 
[17, 18]).all() + and (16 < data[2:]).all() + and (data[2:] <= 100).all(), + 'pos_len_list': [2], + 'user_len_list': [202], + }, + ] + check_result(valid_data, valid_result) + + test_result = [ + { + 'item_id_check': lambda data: data[0] == 10 + and (9 < data[1:101]).all() + and (data[1:101] <= 100).all() + and data[101] == 1 + and (data[102:202] != 1).all(), + 'pos_len_list': [1, 1], + 'user_len_list': [101, 101], + }, + { + 'item_id_check': lambda data: (data[0: 2].numpy() == [19, 20]).all() + and (18 < data[2:]).all() + and (data[2:] <= 100).all(), + 'pos_len_list': [2], + 'user_len_list': [202], + }, + ] + check_result(test_data, test_result) + + +if __name__ == '__main__': + pytest.main() diff --git a/tests/data/test_dataset.py b/tests/data/test_dataset.py new file mode 100644 index 000000000..9b78cbd9d --- /dev/null +++ b/tests/data/test_dataset.py @@ -0,0 +1,570 @@ +# -*- coding: utf-8 -*- +# @Time : 2021/1/3 +# @Author : Yushuo Chen +# @Email : chenyushuo@ruc.edu.cn + +# UPDATE +# @Time : 2020/1/3 +# @Author : Yushuo Chen +# @email : chenyushuo@ruc.edu.cn + +import logging +import os + +import pytest + +from recbole.config import Config, EvalSetting +from recbole.data import create_dataset +from recbole.utils import init_seed + +current_path = os.path.dirname(os.path.realpath(__file__)) + + +def new_dataset(config_dict=None, config_file_list=None): + config = Config(config_dict=config_dict, config_file_list=config_file_list) + init_seed(config['seed'], config['reproducibility']) + logging.basicConfig(level=logging.ERROR) + return create_dataset(config) + + +def split_dataset(config_dict=None, config_file_list=None): + dataset = new_dataset(config_dict=config_dict, config_file_list=config_file_list) + config = dataset.config + es_str = [_.strip() for _ in config['eval_setting'].split(',')] + es = EvalSetting(config) + es.set_ordering_and_splitting(es_str[0]) + return dataset.build(es) + + +class TestDataset: + def test_filter_nan_user_or_item(self): + 
config_dict = { + 'model': 'BPR', + 'dataset': 'filter_nan_user_or_item', + 'data_path': current_path, + 'load_col': None, + } + dataset = new_dataset(config_dict=config_dict) + assert len(dataset.inter_feat) == 1 + assert len(dataset.user_feat) == 3 + assert len(dataset.item_feat) == 3 + + def test_remove_duplication_by_first(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'remove_duplication', + 'data_path': current_path, + 'load_col': None, + 'rm_dup_inter': 'first', + } + dataset = new_dataset(config_dict=config_dict) + assert dataset.inter_feat[dataset.time_field][0] == 0 + + def test_remove_duplication_by_last(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'remove_duplication', + 'data_path': current_path, + 'load_col': None, + 'rm_dup_inter': 'last', + } + dataset = new_dataset(config_dict=config_dict) + assert dataset.inter_feat[dataset.time_field][0] == 2 + + def test_filter_by_field_value_with_lowest_val(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'filter_by_field_value', + 'data_path': current_path, + 'load_col': None, + 'lowest_val': { + 'timestamp': 4, + }, + } + dataset = new_dataset(config_dict=config_dict) + assert len(dataset.inter_feat) == 6 + + def test_filter_by_field_value_with_highest_val(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'filter_by_field_value', + 'data_path': current_path, + 'load_col': None, + 'highest_val': { + 'timestamp': 4, + }, + } + dataset = new_dataset(config_dict=config_dict) + assert len(dataset.inter_feat) == 5 + + def test_filter_by_field_value_with_equal_val(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'filter_by_field_value', + 'data_path': current_path, + 'load_col': None, + 'equal_val': { + 'rating': 0, + }, + } + dataset = new_dataset(config_dict=config_dict) + assert len(dataset.inter_feat) == 3 + + def test_filter_by_field_value_with_not_equal_val(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'filter_by_field_value', + 'data_path': current_path, + 
'load_col': None, + 'not_equal_val': { + 'rating': 4, + }, + } + dataset = new_dataset(config_dict=config_dict) + assert len(dataset.inter_feat) == 9 + + def test_filter_by_field_value_in_same_field(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'filter_by_field_value', + 'data_path': current_path, + 'load_col': None, + 'lowest_val': { + 'timestamp': 3, + }, + 'highest_val': { + 'timestamp': 8, + }, + } + dataset = new_dataset(config_dict=config_dict) + assert len(dataset.inter_feat) == 6 + + def test_filter_by_field_value_in_different_field(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'filter_by_field_value', + 'data_path': current_path, + 'load_col': None, + 'lowest_val': { + 'timestamp': 3, + }, + 'highest_val': { + 'timestamp': 8, + }, + 'not_equal_val': { + 'rating': 4, + } + } + dataset = new_dataset(config_dict=config_dict) + assert len(dataset.inter_feat) == 5 + + def test_filter_inter_by_user_or_item_is_true(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'filter_inter_by_user_or_item', + 'data_path': current_path, + 'load_col': None, + 'filter_inter_by_user_or_item': True, + } + dataset = new_dataset(config_dict=config_dict) + assert len(dataset.inter_feat) == 1 + + def test_filter_inter_by_user_or_item_is_false(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'filter_inter_by_user_or_item', + 'data_path': current_path, + 'load_col': None, + 'filter_inter_by_user_or_item': False, + } + dataset = new_dataset(config_dict=config_dict) + assert len(dataset.inter_feat) == 2 + + def test_filter_by_inter_num_in_min_user_inter_num(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'filter_by_inter_num', + 'data_path': current_path, + 'load_col': None, + 'min_user_inter_num': 2, + } + dataset = new_dataset(config_dict=config_dict) + assert dataset.user_num == 6 + assert dataset.item_num == 7 + + def test_filter_by_inter_num_in_min_item_inter_num(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'filter_by_inter_num', + 
'data_path': current_path, + 'load_col': None, + 'min_item_inter_num': 2, + } + dataset = new_dataset(config_dict=config_dict) + assert dataset.user_num == 7 + assert dataset.item_num == 6 + + def test_filter_by_inter_num_in_max_user_inter_num(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'filter_by_inter_num', + 'data_path': current_path, + 'load_col': None, + 'max_user_inter_num': 2, + } + dataset = new_dataset(config_dict=config_dict) + assert dataset.user_num == 6 + assert dataset.item_num == 7 + + def test_filter_by_inter_num_in_max_item_inter_num(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'filter_by_inter_num', + 'data_path': current_path, + 'load_col': None, + 'max_item_inter_num': 2, + } + dataset = new_dataset(config_dict=config_dict) + assert dataset.user_num == 5 + assert dataset.item_num == 5 + + def test_filter_by_inter_num_in_min_inter_num(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'filter_by_inter_num', + 'data_path': current_path, + 'load_col': None, + 'min_user_inter_num': 2, + 'min_item_inter_num': 2, + } + dataset = new_dataset(config_dict=config_dict) + assert dataset.user_num == 5 + assert dataset.item_num == 5 + + def test_filter_by_inter_num_in_complex_way(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'filter_by_inter_num', + 'data_path': current_path, + 'load_col': None, + 'max_user_inter_num': 3, + 'min_user_inter_num': 2, + 'min_item_inter_num': 2, + } + dataset = new_dataset(config_dict=config_dict) + assert dataset.user_num == 3 + assert dataset.item_num == 3 + + def test_rm_dup_by_first_and_filter_value(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'rm_dup_and_filter_value', + 'data_path': current_path, + 'load_col': None, + 'rm_dup_inter': 'first', + 'highest_val': { + 'rating': 4, + }, + } + dataset = new_dataset(config_dict=config_dict) + assert len(dataset.inter_feat) == 1 + + def test_rm_dup_by_last_and_filter_value(self): + config_dict = { + 'model': 'BPR', + 'dataset': 
'rm_dup_and_filter_value', + 'data_path': current_path, + 'load_col': None, + 'rm_dup_inter': 'last', + 'highest_val': { + 'rating': 4, + }, + } + dataset = new_dataset(config_dict=config_dict) + assert len(dataset.inter_feat) == 2 + + def test_rm_dup_and_filter_by_inter_num(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'rm_dup_and_filter_by_inter_num', + 'data_path': current_path, + 'load_col': None, + 'rm_dup_inter': 'first', + 'min_user_inter_num': 2, + 'min_item_inter_num': 2, + } + dataset = new_dataset(config_dict=config_dict) + assert len(dataset.inter_feat) == 4 + assert dataset.user_num == 3 + assert dataset.item_num == 3 + + def test_filter_value_and_filter_inter_by_ui(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'filter_value_and_filter_inter_by_ui', + 'data_path': current_path, + 'load_col': None, + 'highest_val': { + 'age': 2, + }, + 'not_equal_val': { + 'price': 2, + }, + 'filter_inter_by_user_or_item': True, + } + dataset = new_dataset(config_dict=config_dict) + assert len(dataset.inter_feat) == 2 + assert dataset.user_num == 3 + assert dataset.item_num == 3 + + def test_filter_value_and_inter_num(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'filter_value_and_inter_num', + 'data_path': current_path, + 'load_col': None, + 'highest_val': { + 'rating': 0, + 'age': 0, + 'price': 0, + }, + 'min_user_inter_num': 2, + 'min_item_inter_num': 2, + } + dataset = new_dataset(config_dict=config_dict) + assert len(dataset.inter_feat) == 4 + assert dataset.user_num == 3 + assert dataset.item_num == 3 + + def test_filter_inter_by_ui_and_inter_num(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'filter_inter_by_ui_and_inter_num', + 'data_path': current_path, + 'load_col': None, + 'filter_inter_by_user_or_item': True, + 'min_user_inter_num': 2, + 'min_item_inter_num': 2, + } + dataset = new_dataset(config_dict=config_dict) + assert len(dataset.inter_feat) == 4 + assert dataset.user_num == 3 + assert dataset.item_num == 3 + + def 
test_remap_id(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'remap_id', + 'data_path': current_path, + 'load_col': None, + 'fields_in_same_space': None, + } + dataset = new_dataset(config_dict=config_dict) + user_list = dataset.token2id('user_id', ['ua', 'ub', 'uc', 'ud']) + item_list = dataset.token2id('item_id', ['ia', 'ib', 'ic', 'id']) + assert (user_list == [1, 2, 3, 4]).all() + assert (item_list == [1, 2, 3, 4]).all() + assert (dataset.inter_feat['user_id'].numpy() == [1, 2, 3, 4]).all() + assert (dataset.inter_feat['item_id'].numpy() == [1, 2, 3, 4]).all() + assert (dataset.inter_feat['add_user'].numpy() == [1, 2, 3, 4]).all() + assert (dataset.inter_feat['add_item'].numpy() == [1, 2, 3, 4]).all() + assert (dataset.inter_feat['user_list'].numpy() == [[1, 2, 0], + [0, 0, 0], + [3, 4, 1], + [5, 0, 0]]).all() + + def test_remap_id_with_fields_in_same_space(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'remap_id', + 'data_path': current_path, + 'load_col': None, + 'fields_in_same_space': [ + ['user_id', 'add_user', 'user_list'], + ['item_id', 'add_item'], + ], + } + dataset = new_dataset(config_dict=config_dict) + user_list = dataset.token2id('user_id', ['ua', 'ub', 'uc', 'ud', 'ue', 'uf']) + item_list = dataset.token2id('item_id', ['ia', 'ib', 'ic', 'id', 'ie', 'if']) + assert (user_list == [1, 2, 3, 4, 5, 6]).all() + assert (item_list == [1, 2, 3, 4, 5, 6]).all() + assert (dataset.inter_feat['user_id'].numpy() == [1, 2, 3, 4]).all() + assert (dataset.inter_feat['item_id'].numpy() == [1, 2, 3, 4]).all() + assert (dataset.inter_feat['add_user'].numpy() == [2, 5, 4, 6]).all() + assert (dataset.inter_feat['add_item'].numpy() == [5, 3, 6, 1]).all() + assert (dataset.inter_feat['user_list'].numpy() == [[3, 5, 0], + [0, 0, 0], + [1, 2, 3], + [6, 0, 0]]).all() + + def test_ui_feat_preparation_and_fill_nan(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'ui_feat_preparation_and_fill_nan', + 'data_path': current_path, + 'load_col': None, + 
'filter_inter_by_user_or_item': False, + 'normalize_field': None, + 'normalize_all': None, + } + dataset = new_dataset(config_dict=config_dict) + user_token_list = dataset.id2token('user_id', dataset.user_feat['user_id']) + item_token_list = dataset.id2token('item_id', dataset.item_feat['item_id']) + assert (user_token_list == ['[PAD]', 'ua', 'ub', 'uc', 'ud', 'ue']).all() + assert (item_token_list == ['[PAD]', 'ia', 'ib', 'ic', 'id', 'ie']).all() + assert dataset.inter_feat['rating'][3] == 1.0 + assert dataset.user_feat['age'][4] == 1.5 + assert dataset.item_feat['price'][4] == 1.5 + assert (dataset.inter_feat['time_list'].numpy() == [[1., 2., 3.], + [2., 0., 0.], + [0., 0., 0.], + [5., 4., 0.]]).all() + assert (dataset.user_feat['profile'].numpy() == [[0, 0, 0], + [1, 2, 3], + [0, 0, 0], + [3, 0, 0], + [0, 0, 0], + [3, 2, 0]]).all() + + def test_set_label_by_threshold(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'set_label_by_threshold', + 'data_path': current_path, + 'load_col': None, + 'threshold': { + 'rating': 4, + }, + 'normalize_field': None, + 'normalize_all': None, + } + dataset = new_dataset(config_dict=config_dict) + assert (dataset.inter_feat['label'].numpy() == [1., 0., 1., 0.]).all() + + def test_normalize_all(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'normalize', + 'data_path': current_path, + 'load_col': None, + 'normalize_all': True, + } + dataset = new_dataset(config_dict=config_dict) + assert (dataset.inter_feat['rating'].numpy() == [0., .25, 1., .75, .5]).all() + assert (dataset.inter_feat['star'].numpy() == [1., .5, 0., .25, 0.75]).all() + + def test_normalize_field(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'normalize', + 'data_path': current_path, + 'load_col': None, + 'normalize_field': ['rating'], + 'normalize_all': False, + } + dataset = new_dataset(config_dict=config_dict) + assert (dataset.inter_feat['rating'].numpy() == [0., .25, 1., .75, .5]).all() + assert (dataset.inter_feat['star'].numpy() == 
[4., 2., 0., 1., 3.]).all() + + def test_TO_RS_811(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'build_dataset', + 'data_path': current_path, + 'load_col': None, + 'eval_setting': 'TO_RS', + 'split_ratio': [0.8, 0.1, 0.1], + } + train_dataset, valid_dataset, test_dataset = split_dataset(config_dict=config_dict) + assert (train_dataset.inter_feat['item_id'].numpy() == list(range(1, 17))).all() + assert (valid_dataset.inter_feat['item_id'].numpy() == list(range(17, 19))).all() + assert (test_dataset.inter_feat['item_id'].numpy() == list(range(19, 21))).all() + + def test_TO_RS_820(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'build_dataset', + 'data_path': current_path, + 'load_col': None, + 'eval_setting': 'TO_RS', + 'split_ratio': [0.8, 0.2, 0.0], + } + train_dataset, valid_dataset, test_dataset = split_dataset(config_dict=config_dict) + assert (train_dataset.inter_feat['item_id'].numpy() == list(range(1, 17))).all() + assert (valid_dataset.inter_feat['item_id'].numpy() == list(range(17, 21))).all() + assert len(test_dataset.inter_feat) == 0 + + def test_TO_RS_802(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'build_dataset', + 'data_path': current_path, + 'load_col': None, + 'eval_setting': 'TO_RS', + 'split_ratio': [0.8, 0.0, 0.2], + } + train_dataset, valid_dataset, test_dataset = split_dataset(config_dict=config_dict) + assert (train_dataset.inter_feat['item_id'].numpy() == list(range(1, 17))).all() + assert len(valid_dataset.inter_feat) == 0 + assert (test_dataset.inter_feat['item_id'].numpy() == list(range(17, 21))).all() + + def test_TO_LS(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'build_dataset', + 'data_path': current_path, + 'load_col': None, + 'eval_setting': 'TO_LS', + 'leave_one_num': 2, + } + train_dataset, valid_dataset, test_dataset = split_dataset(config_dict=config_dict) + assert (train_dataset.inter_feat['item_id'].numpy() == list(range(1, 19))).all() + assert 
(valid_dataset.inter_feat['item_id'].numpy() == list(range(19, 20))).all() + assert (test_dataset.inter_feat['item_id'].numpy() == list(range(20, 21))).all() + + def test_RO_RS_811(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'build_dataset', + 'data_path': current_path, + 'load_col': None, + 'eval_setting': 'RO_RS', + 'split_ratio': [0.8, 0.1, 0.1], + } + train_dataset, valid_dataset, test_dataset = split_dataset(config_dict=config_dict) + assert len(train_dataset.inter_feat) == 16 + assert len(valid_dataset.inter_feat) == 2 + assert len(test_dataset.inter_feat) == 2 + + def test_RO_RS_820(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'build_dataset', + 'data_path': current_path, + 'load_col': None, + 'eval_setting': 'RO_RS', + 'split_ratio': [0.8, 0.2, 0.0], + } + train_dataset, valid_dataset, test_dataset = split_dataset(config_dict=config_dict) + assert len(train_dataset.inter_feat) == 16 + assert len(valid_dataset.inter_feat) == 4 + assert len(test_dataset.inter_feat) == 0 + + def test_RO_RS_802(self): + config_dict = { + 'model': 'BPR', + 'dataset': 'build_dataset', + 'data_path': current_path, + 'load_col': None, + 'eval_setting': 'RO_RS', + 'split_ratio': [0.8, 0.0, 0.2], + } + train_dataset, valid_dataset, test_dataset = split_dataset(config_dict=config_dict) + assert len(train_dataset.inter_feat) == 16 + assert len(valid_dataset.inter_feat) == 0 + assert len(test_dataset.inter_feat) == 4 + + +if __name__ == "__main__": + pytest.main() diff --git a/tests/data/ui_feat_preparation_and_fill_nan/ui_feat_preparation_and_fill_nan.inter b/tests/data/ui_feat_preparation_and_fill_nan/ui_feat_preparation_and_fill_nan.inter new file mode 100644 index 000000000..ea5f045a5 --- /dev/null +++ b/tests/data/ui_feat_preparation_and_fill_nan/ui_feat_preparation_and_fill_nan.inter @@ -0,0 +1,5 @@ +user_id:token item_id:token rating:float time_list:float_seq +ua ia 0 1 2 3 +ub ib 1 2 +uc ic 2 +ud id 5 4 \ No newline at end of file diff --git
a/tests/data/ui_feat_preparation_and_fill_nan/ui_feat_preparation_and_fill_nan.item b/tests/data/ui_feat_preparation_and_fill_nan/ui_feat_preparation_and_fill_nan.item new file mode 100644 index 000000000..00c3fd7cd --- /dev/null +++ b/tests/data/ui_feat_preparation_and_fill_nan/ui_feat_preparation_and_fill_nan.item @@ -0,0 +1,5 @@ +item_id:token price:float +ia 0 +ib 1 +ic 2 +ie 3 \ No newline at end of file diff --git a/tests/data/ui_feat_preparation_and_fill_nan/ui_feat_preparation_and_fill_nan.user b/tests/data/ui_feat_preparation_and_fill_nan/ui_feat_preparation_and_fill_nan.user new file mode 100644 index 000000000..6baab598f --- /dev/null +++ b/tests/data/ui_feat_preparation_and_fill_nan/ui_feat_preparation_and_fill_nan.user @@ -0,0 +1,5 @@ +user_id:token age:float profile:token_seq +ua 0 a b c +ub 1 +uc 2 c +ue 3 c b \ No newline at end of file diff --git a/tests/evaluation_setting/test_evaluation_setting.py b/tests/evaluation_setting/test_evaluation_setting.py index 0ba2e719b..059f6f010 100644 --- a/tests/evaluation_setting/test_evaluation_setting.py +++ b/tests/evaluation_setting/test_evaluation_setting.py @@ -45,6 +45,7 @@ def test_rols_full(self): objective_function(config_dict=config_dict, config_file_list=config_file_list, saved=False) ''' + def test_tols_full(self): config_dict = { 'eval_setting': 'TO_LS,full', @@ -72,6 +73,7 @@ def test_tols_full(self): objective_function(config_dict=config_dict, config_file_list=config_file_list, saved=False) ''' + def test_tors_full(self): config_dict = { 'eval_setting': 'TO_RS,full', @@ -182,7 +184,7 @@ def test_tors_uni100(self): 'model': 'BPR', } objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + config_file_list=config_file_list, saved=False) # config_dict = { # 'eval_setting': 'TO_RS,uni100', # 'model': 'NeuMF', @@ -208,10 +210,11 @@ class TestContextRecommender(unittest.TestCase): def test_tors(self): config_dict = { 'eval_setting': 'TO_RS', + 'threshold': 
{'rating': 4}, 'model': 'FM', } objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + config_file_list=config_file_list, saved=False) # config_dict = { # 'eval_setting': 'TO_RS', # 'model': 'DeepFM', diff --git a/tests/metrics/test_rank_metrics.py b/tests/metrics/test_rank_metrics.py new file mode 100644 index 000000000..c60a2c485 --- /dev/null +++ b/tests/metrics/test_rank_metrics.py @@ -0,0 +1,74 @@ +# -*- encoding: utf-8 -*- +# @Time : 2020/12/21 +# @Author : Zhichao Feng +# @email : fzcbupt@gmail.com + + +import os +import sys +import unittest + +sys.path.append(os.getcwd()) +import numpy as np +import torch +from recbole.config import Config +from recbole.data.interaction import Interaction +from recbole.evaluator.metrics import metrics_dict +from recbole.evaluator.evaluators import RankEvaluator + +parameters_dict = { + 'model': 'BPR', + 'eval_setting': 'RO_RS,uni100', +} + + +class MetricsTestCases(object): + user_len_list0 = np.array([2, 3, 5]) + pos_len_list0 = np.array([1, 2, 3]) + pos_rank_sum0 = np.array([1, 4, 9]) + + user_len_list1 = np.array([3, 6, 4]) + pos_len_list1 = np.array([1, 0, 4]) + pos_rank_sum1 = np.array([3, 0, 6]) + + +class CollectTestCases(object): + interaction0 = Interaction({}, [0, 2, 3, 4], [2, 3, 4, 5]) + scores_tensor0 = torch.Tensor([0.1, 0.2, + 0.1, 0.1, 0.2, + 0.2, 0.2, 0.2, 0.2, + 0.3, 0.2, 0.1, 0.4, 0.3]) + + +def get_metric_result(name, case=0): + func = metrics_dict[name] + return func(getattr(MetricsTestCases, f'user_len_list{case}'), + getattr(MetricsTestCases, f'pos_len_list{case}'), + getattr(MetricsTestCases, f'pos_rank_sum{case}')) + + +def get_collect_result(evaluator, case=0): + func = evaluator.collect + return func(getattr(CollectTestCases, f'interaction{case}'), + getattr(CollectTestCases, f'scores_tensor{case}')) + + +class TestRankMetrics(unittest.TestCase): + def test_gauc(self): + name = 'gauc' + self.assertEqual(get_metric_result(name, case=0), (1 * ((2 - (1 - 1) / 
2 - 1 / 1) / (2 - 1)) + + 2 * ((3 - (2 - 1) / 2 - 4 / 2) / (3 - 2)) + + 3 * ((5 - (3 - 1) / 2 - 9 / 3) / (5 - 3))) + / (1 + 2 + 3)) + self.assertEqual(get_metric_result(name, case=1), (3 - 0 - 3 / 1) / (3 - 1)) + + def test_collect(self): + config = Config('BPR', 'ml-100k', config_dict=parameters_dict) + metrics = ['GAUC'] + rank_evaluator = RankEvaluator(config, metrics) + self.assertEqual(get_collect_result(rank_evaluator, case=0).squeeze().cpu().numpy().tolist(), + np.array([0, (2 + 3) / 2 * 2, (1 + 2 + 3 + 4) / 4 * 3, 1 + (2 + 3) / 2 + 4 + 5]).tolist()) + + +if __name__ == "__main__": + unittest.main() diff --git a/tests/model/test_model.yaml b/tests/model/test_model.yaml index 46ffeb24a..4604e8a65 100644 --- a/tests/model/test_model.yaml +++ b/tests/model/test_model.yaml @@ -30,7 +30,7 @@ ENTITY_ID_FIELD: entity_id # Selectively Loading load_col: - inter: [user_id, item_id, rating, timestamp, label] + inter: [user_id, item_id, rating, timestamp] user: [user_id, age, gender, occupation] item: [item_id, movie_title, release_year, class] link: [item_id, entity_id] @@ -47,12 +47,9 @@ lowest_val: ~ highest_val: ~ equal_val: ~ not_equal_val: ~ -drop_filter_field : False # Preprocessing fields_in_same_space: ~ -fill_nan: True preload_weight: ~ -drop_preload_weight: True normalize_field: ~ normalize_all: True diff --git a/tests/model/test_model_auto.py b/tests/model/test_model_auto.py index 74e4c8e79..dbd05f1ae 100644 --- a/tests/model/test_model_auto.py +++ b/tests/model/test_model_auto.py @@ -6,7 +6,7 @@ # UPDATE # @Time : 2020/11/17 # @Author : Xingyu Pan -# @email : panxy@ruc.edu.cn +# @email : panxy@ruc.edu.cn import os import unittest @@ -16,98 +16,157 @@ current_path = os.path.dirname(os.path.realpath(__file__)) config_file_list = [os.path.join(current_path, 'test_model.yaml')] + +def quick_test(config_dict): + objective_function(config_dict=config_dict, config_file_list=config_file_list, saved=False) + + class TestGeneralRecommender(unittest.TestCase): def 
test_pop(self): config_dict = { 'model': 'Pop', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_itemknn(self): config_dict = { 'model': 'ItemKNN', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_bpr(self): config_dict = { 'model': 'BPR', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_neumf(self): config_dict = { 'model': 'NeuMF', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_convncf(self): config_dict = { 'model': 'ConvNCF', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_dmf(self): config_dict = { 'model': 'DMF', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) + + def test_dmf_with_rating(self): + config_dict = { + 'model': 'DMF', + 'inter_matrix_type': 'rating', + } + quick_test(config_dict) def test_fism(self): config_dict = { 'model': 'FISM', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) + + def test_fism_with_split_to_and_alpha(self): + config_dict = { + 'model': 'FISM', + 'split_to': 10, + 'alpha': 0.5, + } + quick_test(config_dict) def test_nais(self): config_dict = { 'model': 'NAIS', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) + + def test_nais_with_concat(self): + config_dict = { + 'model': 'NAIS', + 'algorithm': 'concat', + 'split_to': 10, + 'alpha': 0.5, + 'beta': 0.1, + } + quick_test(config_dict) def test_spectralcf(self): config_dict = { 'model': 'SpectralCF', } - 
objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_gcmc(self): config_dict = { 'model': 'GCMC', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) + + def test_gcmc_with_stack(self): + config_dict = { + 'model': 'GCMC', + 'accum': 'stack', + 'sparse_feature': False, + } + quick_test(config_dict) def test_ngcf(self): config_dict = { 'model': 'NGCF', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_lightgcn(self): config_dict = { 'model': 'LightGCN', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_dgcf(self): config_dict = { 'model': 'DGCF', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) + + def test_line(self): + config_dict = { + 'model': 'LINE', + } + quick_test(config_dict) + + def test_MultiDAE(self): + config_dict = { + 'model': 'MultiDAE', + 'training_neg_sample_num': 0 + } + quick_test(config_dict) + + def test_MultiVAE(self): + config_dict = { + 'model': 'MultiVAE', + 'training_neg_sample_num': 0 + } + quick_test(config_dict) + + def test_MacridVAE(self): + config_dict = { + 'model': 'MacridVAE', + 'training_neg_sample_num': 0 + } + quick_test(config_dict) + + def test_CDAE(self): + config_dict = { + 'model': 'CDAE', + 'training_neg_sample_num': 0 + } + quick_test(config_dict) class TestContextRecommender(unittest.TestCase): @@ -116,138 +175,193 @@ class TestContextRecommender(unittest.TestCase): def test_lr(self): config_dict = { 'model': 'LR', + 'threshold': {'rating': 4}, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_fm(self): config_dict = { 'model': 'FM', + 'threshold': 
{'rating': 4}, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_nfm(self): config_dict = { 'model': 'NFM', + 'threshold': {'rating': 4}, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_deepfm(self): config_dict = { 'model': 'DeepFM', + 'threshold': {'rating': 4}, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_xdeepfm(self): config_dict = { 'model': 'xDeepFM', + 'threshold': {'rating': 4}, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) + + def test_xdeepfm_with_direct(self): + config_dict = { + 'model': 'xDeepFM', + 'threshold': {'rating': 4}, + 'direct': True, + } + quick_test(config_dict) def test_afm(self): config_dict = { 'model': 'AFM', + 'threshold': {'rating': 4}, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_fnn(self): config_dict = { 'model': 'FNN', + 'threshold': {'rating': 4}, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_pnn(self): config_dict = { 'model': 'PNN', + 'threshold': {'rating': 4}, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) + + def test_pnn_with_use_inner_and_use_outer(self): + config_dict = { + 'model': 'PNN', + 'threshold': {'rating': 4}, + 'use_inner': True, + 'use_outer': True, + } + quick_test(config_dict) + + def test_pnn_without_use_inner_and_use_outer(self): + config_dict = { + 'model': 'PNN', + 'threshold': {'rating': 4}, + 'use_inner': False, + 'use_outer': False, + } + quick_test(config_dict) def test_dssm(self): config_dict = { 'model': 'DSSM', + 
'threshold': {'rating': 4}, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_widedeep(self): config_dict = { 'model': 'WideDeep', + 'threshold': {'rating': 4}, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) - - # def test_dcn(self): - # config_dict = { - # 'model': 'DCN', - # } - # objective_function(config_dict=config_dict, - # config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_autoint(self): config_dict = { 'model': 'AutoInt', + 'threshold': {'rating': 4}, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_ffm(self): config_dict = { 'model': 'FFM', + 'threshold': {'rating': 4}, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_fwfm(self): config_dict = { 'model': 'FwFM', + 'threshold': {'rating': 4}, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - # def test_din(self): - # config_dict = { - # 'model': 'DIN', - # } - # objective_function(config_dict=config_dict, - # config_file_list=config_file_list, saved=False) + def test_dcn(self): + config_dict = { + 'model': 'DCN', + 'threshold': {'rating': 4}, + } + quick_test(config_dict) + + def test_xgboost(self): + config_dict = { + 'model': 'xgboost', + 'threshold': {'rating': 4}, + 'xgb_params': { + 'booster': 'gbtree', + 'objective': 'binary:logistic', + 'eval_metric': ['auc', 'logloss'] + }, + 'xgb_num_boost_round': 1, + } + quick_test(config_dict) class TestSequentialRecommender(unittest.TestCase): + def test_din(self): + config_dict = { + 'model': 'DIN', + } + quick_test(config_dict) + def test_fpmc(self): config_dict = { 'model': 'FPMC', } - objective_function(config_dict=config_dict, - 
config_file_list=config_file_list, saved=False) + quick_test(config_dict) def test_gru4rec(self): config_dict = { 'model': 'GRU4Rec', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) + + def test_gru4rec_with_BPR_loss(self): + config_dict = { + 'model': 'GRU4Rec', + 'loss_type': 'BPR', + } + quick_test(config_dict) def test_narm(self): config_dict = { 'model': 'NARM', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) + + def test_narm_with_BPR_loss(self): + config_dict = { + 'model': 'NARM', + 'loss_type': 'BPR', + } + quick_test(config_dict) def test_stamp(self): config_dict = { 'model': 'STAMP', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) + + def test_stamp_with_BPR_loss(self): + config_dict = { + 'model': 'STAMP', + 'loss_type': 'BPR', + } + quick_test(config_dict) def test_caser(self): config_dict = { @@ -255,429 +369,370 @@ def test_caser(self): 'MAX_ITEM_LIST_LENGTH': 10, 'reproducibility': False, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) + + def test_caser_with_BPR_loss(self): + config_dict = { + 'model': 'Caser', + 'loss_type': 'BPR', + 'MAX_ITEM_LIST_LENGTH': 10, + 'reproducibility': False, + } + quick_test(config_dict) def test_nextitnet(self): config_dict = { 'model': 'NextItNet', 'reproducibility': False, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) + + def test_nextitnet_with_BPR_loss(self): + config_dict = { + 'model': 'NextItNet', + 'loss_type': 'BPR', + 'reproducibility': False, + } + quick_test(config_dict) def test_transrec(self): config_dict = { 'model': 'TransRec', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + 
quick_test(config_dict) def test_sasrec(self): config_dict = { 'model': 'SASRec', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - # def test_bert4rec(self): - # config_dict = { - # 'model': 'BERT4Rec', - # } - # objective_function(config_dict=config_dict, - # config_file_list=config_file_list, saved=False) + def test_sasrec_with_BPR_loss_and_relu(self): + config_dict = { + 'model': 'SASRec', + 'loss_type': 'BPR', + 'hidden_act': 'relu' + } + quick_test(config_dict) + + def test_sasrec_with_BPR_loss_and_sigmoid(self): + config_dict = { + 'model': 'SASRec', + 'loss_type': 'BPR', + 'hidden_act': 'sigmoid' + } + quick_test(config_dict) def test_srgnn(self): config_dict = { 'model': 'SRGNN', 'MAX_ITEM_LIST_LENGTH': 3, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) + + def test_srgnn_with_BPR_loss(self): + config_dict = { + 'model': 'SRGNN', + 'loss_type': 'BPR', + 'MAX_ITEM_LIST_LENGTH': 3, + } + quick_test(config_dict) def test_gcsan(self): config_dict = { 'model': 'GCSAN', 'MAX_ITEM_LIST_LENGTH': 3, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) + + def test_gcsan_with_BPR_loss_and_tanh(self): + config_dict = { + 'model': 'GCSAN', + 'loss_type': 'BPR', + 'hidden_act': 'tanh', + 'MAX_ITEM_LIST_LENGTH': 3, + } + quick_test(config_dict) def test_gru4recf(self): config_dict = { 'model': 'GRU4RecF', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) + + def test_gru4recf_with_max_pooling(self): + config_dict = { + 'model': 'GRU4RecF', + 'pooling_mode': 'max', + } + quick_test(config_dict) + + def test_gru4recf_with_sum_pooling(self): + config_dict = { + 'model': 'GRU4RecF', + 'pooling_mode': 'sum', + } + quick_test(config_dict) def test_sasrecf(self): config_dict = { 
'model': 'SASRecF', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - # def test_fdsa(self): - # config_dict = { - # 'model': 'FDSA', - # } - # objective_function(config_dict=config_dict, - # config_file_list=config_file_list, saved=False) - - # def test_gru4reckg(self): - # config_dict = { - # 'model': 'GRU4RecKG', - # } - # objective_function(config_dict=config_dict, - # config_file_list=config_file_list, saved=False) + def test_sasrecf_with_max_pooling(self): + config_dict = { + 'model': 'SASRecF', + 'pooling_mode': 'max', + } + quick_test(config_dict) - # def test_s3rec(self): - # config_dict = { - # 'model': 'S3Rec', - # 'train_stage': 'pretrain', - # 'save_step': 1, - # } - # objective_function(config_dict=config_dict, - # config_file_list=config_file_list, saved=False) + def test_sasrecf_with_sum_pooling(self): + config_dict = { + 'model': 'SASRecF', + 'pooling_mode': 'sum', + } + quick_test(config_dict) - # config_dict = { - # 'model': 'S3Rec', - # 'train_stage': 'finetune', - # 'pre_model_path': './saved/S3Rec-test-1.pth', - # } - # objective_function(config_dict=config_dict, - # config_file_list=config_file_list, saved=False) + def test_hrm(self): + config_dict = { + 'model': 'HRM', + } + quick_test(config_dict) + def test_hrm_with_BPR_loss(self): + config_dict = { + 'model': 'HRM', + 'loss_type': 'BPR', + } + quick_test(config_dict) -class TestKnowledgeRecommender(unittest.TestCase): + def test_npe(self): + config_dict = { + 'model': 'NPE', + } + quick_test(config_dict) - def test_cke(self): + def test_npe_with_BPR_loss(self): config_dict = { - 'model': 'CKE', + 'model': 'NPE', + 'loss_type': 'BPR', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_cfkg(self): + def test_shan(self): config_dict = { - 'model': 'CFKG', + 'model': 'SHAN', } - objective_function(config_dict=config_dict, - 
config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_ktup(self): + def test_shan_with_BPR_loss(self): config_dict = { - 'model': 'KTUP', - 'train_rec_step': 1, - 'train_kg_step': 1, - 'epochs': 2, + 'model': 'SHAN', + 'loss_type': 'BPR', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_kgat(self): + def test_hgn(self): config_dict = { - 'model': 'KGAT', + 'model': 'HGN', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_ripplenet(self): + def test_hgn_with_CE_loss(self): config_dict = { - 'model': 'RippleNet', + 'model': 'HGN', + 'loss_type': 'CE', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_mkr(self): + def test_fossil(self): config_dict = { - 'model': 'MKR', + 'model': 'FOSSIL', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_kgcn(self): + def test_repeat_net(self): config_dict = { - 'model': 'KGCN', + 'model': 'RepeatNet', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_kgnnls(self): + def test_fdsa(self): config_dict = { - 'model': 'KGNNLS', + 'model': 'FDSA', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) - - -class TestGeneralRecommender2(unittest.TestCase): + quick_test(config_dict) - def test_dmf(self): + def test_fdsa_with_max_pooling(self): config_dict = { - 'model': 'DMF', - 'inter_matrix_type': 'rating', + 'model': 'FDSA', + 'pooling_mode': 'max', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_fism(self): + def test_fdsa_with_sum_pooling(self): config_dict = 
{ - 'model': 'FISM', - 'split_to': 10, - 'alpha': 0.5, + 'model': 'FDSA', + 'pooling_mode': 'sum', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_nais(self): + def test_bert4rec(self): config_dict = { - 'model': 'NAIS', - 'algorithm': 'concat', - 'split_to': 10, - 'alpha': 0.5, - 'beta': 0.1, + 'model': 'BERT4Rec', } objective_function(config_dict=config_dict, config_file_list=config_file_list, saved=False) - def test_gcmc(self): + def test_bert4rec_with_BPR_loss_and_swish(self): config_dict = { - 'model': 'GCMC', - 'accum': 'stack', - 'sparse_feature': False, + 'model': 'BERT4Rec', + 'loss_type': 'BPR', + 'hidden_act': 'swish' } objective_function(config_dict=config_dict, config_file_list=config_file_list, saved=False) + # def test_gru4reckg(self): + # config_dict = { + # 'model': 'GRU4RecKG', + # } + # quick_test(config_dict) -class TestKnowledgeRecommender2(unittest.TestCase): + # def test_s3rec(self): + # config_dict = { + # 'model': 'S3Rec', + # 'train_stage': 'pretrain', + # 'save_step': 1, + # } + # quick_test(config_dict) + # + # config_dict = { + # 'model': 'S3Rec', + # 'train_stage': 'finetune', + # 'pre_model_path': './saved/S3Rec-test-1.pth', + # } + # quick_test(config_dict) - def test_cfkg(self): - config_dict = { - 'model': 'CFKG', - 'loss_function': 'transe', - } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) - def test_ktup(self): - config_dict = { - 'model': 'KTUP', - 'use_st_gumbel': False, - 'L1_flag': True, - } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) +class TestKnowledgeRecommender(unittest.TestCase): - def test_kgat(self): - config_dict = { - 'model': 'KGAT', - 'aggregator_type': 'gcn', - } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + def test_cke(self): config_dict = { - 'model': 'KGAT', - 
'aggregator_type': 'graphsage', + 'model': 'CKE', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_mkr(self): + def test_cfkg(self): config_dict = { - 'model': 'MKR', - 'use_inner_product': False, + 'model': 'CFKG', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_kgcn(self): + def test_cfkg_with_transe(self): config_dict = { - 'model': 'KGCN', - 'aggregator': 'neighbor', - } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) - config_dict = { - 'model': 'KGCN', - 'aggregator': 'concat', + 'model': 'CFKG', + 'loss_function': 'transe', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_kgnnls(self): - config_dict = { - 'model': 'KGNNLS', - 'aggregator': 'neighbor', - } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + def test_ktup(self): config_dict = { - 'model': 'KGNNLS', - 'aggregator': 'concat', + 'model': 'KTUP', + 'train_rec_step': 1, + 'train_kg_step': 1, + 'epochs': 2, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) - - -class TestContextRecommender2(unittest.TestCase): + quick_test(config_dict) - def test_xdeepfm(self): + def test_ktup_with_L1_flag(self): config_dict = { - 'model': 'xDeepFM', - 'direct': True, + 'model': 'KTUP', + 'use_st_gumbel': False, + 'L1_flag': True, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_pnn(self): - config_dict = { - 'model': 'PNN', - 'use_inner': True, - 'use_outer': True, - } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + def test_kgat(self): config_dict = { - 'model': 'PNN', - 'use_inner': 
False, - 'use_outer': False, + 'model': 'KGAT', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - -class TestSequentialRecommender2(unittest.TestCase): - - def test_gru4rec(self): + def test_kgat_with_gcn(self): config_dict = { - 'model': 'GRU4Rec', - 'loss_type': 'BPR', + 'model': 'KGAT', + 'aggregator_type': 'gcn', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_narm(self): + def test_kgat_with_graphsage(self): config_dict = { - 'model': 'NARM', - 'loss_type': 'BPR', + 'model': 'KGAT', + 'aggregator_type': 'graphsage', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_stamp(self): + def test_ripplenet(self): config_dict = { - 'model': 'STAMP', - 'loss_type': 'BPR', + 'model': 'RippleNet', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_caser(self): + def test_mkr(self): config_dict = { - 'model': 'Caser', - 'loss_type': 'BPR', - 'MAX_ITEM_LIST_LENGTH': 10, - 'reproducibility': False, + 'model': 'MKR', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_nextitnet(self): + def test_mkr_without_use_inner_product(self): config_dict = { - 'model': 'NextItNet', - 'loss_type': 'BPR', - 'reproducibility': False, + 'model': 'MKR', + 'use_inner_product': False, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_srgnn(self): + def test_kgcn(self): config_dict = { - 'model': 'SRGNN', - 'loss_type': 'BPR', - 'MAX_ITEM_LIST_LENGTH': 3, + 'model': 'KGCN', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def 
test_sasrec(self): - config_dict = { - 'model': 'SASRec', - 'loss_type': 'BPR', - 'hidden_act': 'relu' - } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + def test_kgcn_with_neighbor(self): config_dict = { - 'model': 'SASRec', - 'loss_type': 'BPR', - 'hidden_act': 'sigmoid' + 'model': 'KGCN', + 'aggregator': 'neighbor', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - # def test_bert4rec(self): - # config_dict = { - # 'model': 'BERT4Rec', - # 'loss_type': 'BPR', - # 'hidden_act': 'swish' - # } - # objective_function(config_dict=config_dict, - # config_file_list=config_file_list, saved=False) - - def test_gcsan(self): + def test_kgcn_with_concat(self): config_dict = { - 'model': 'GCSAN', - 'loss_type': 'BPR', - 'hidden_act': 'tanh', - 'MAX_ITEM_LIST_LENGTH': 3, + 'model': 'KGCN', + 'aggregator': 'concat', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_gru4recf(self): - config_dict = { - 'model': 'GRU4RecF', - 'pooling_mode': 'max', - } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + def test_kgnnls(self): config_dict = { - 'model': 'GRU4RecF', - 'pooling_mode': 'sum', + 'model': 'KGNNLS', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_sasrecf(self): + def test_kgnnls_with_neighbor(self): config_dict = { - 'model': 'SASRecF', - 'pooling_mode': 'max', - } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) - config_dict = { - 'model': 'SASRecF', - 'pooling_mode': 'sum', + 'model': 'KGNNLS', + 'aggregator': 'neighbor', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) - def test_fdsa(self): + def 
test_kgnnls_with_concat(self): config_dict = { - 'model': 'FDSA', - 'pooling_mode': 'max', - } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) - config_dict = { - 'model': 'FDSA', - 'pooling_mode': 'sum', + 'model': 'KGNNLS', + 'aggregator': 'concat', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) + if __name__ == '__main__': unittest.main() diff --git a/tests/model/test_model_manual.py b/tests/model/test_model_manual.py index b3ce5a2bf..6f9f2a0f9 100644 --- a/tests/model/test_model_manual.py +++ b/tests/model/test_model_manual.py @@ -13,46 +13,17 @@ config_file_list = [os.path.join(current_path, 'test_model.yaml')] -class TestContextRecommender(unittest.TestCase): - # todo: more complex context information should be test, such as criteo dataset - - def test_dcn(self): - config_dict = { - 'model': 'DCN', - } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) - - # def test_din(self): - # config_dict = { - # 'model': 'DIN', - # } - # objective_function(config_dict=config_dict, - # config_file_list=config_file_list, saved=False) +def quick_test(config_dict): + objective_function(config_dict=config_dict, config_file_list=config_file_list, saved=False) class TestSequentialRecommender(unittest.TestCase): - def test_bert4rec(self): - config_dict = { - 'model': 'BERT4Rec', - } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) - - def test_fdsa(self): - config_dict = { - 'model': 'FDSA', - } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) - # def test_gru4reckg(self): # config_dict = { # 'model': 'GRU4RecKG', # } - # objective_function(config_dict=config_dict, - # config_file_list=config_file_list, saved=False) + # quick_test(config_dict) def test_s3rec(self): config_dict = { @@ -60,29 +31,15 @@ def 
test_s3rec(self): 'train_stage': 'pretrain', 'save_step': 1, } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) + quick_test(config_dict) config_dict = { 'model': 'S3Rec', 'train_stage': 'finetune', 'pre_model_path': './saved/S3Rec-test-1.pth', } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) - + quick_test(config_dict) -class TestSequentialRecommender2(unittest.TestCase): - - def test_bert4rec(self): - config_dict = { - 'model': 'BERT4Rec', - 'loss_type': 'BPR', - 'hidden_act': 'swish' - } - objective_function(config_dict=config_dict, - config_file_list=config_file_list, saved=False) - if __name__ == '__main__': unittest.main() diff --git a/tests/test_data/test/test.inter b/tests/test_data/test/test.inter index d46a0f745..278c83243 100644 --- a/tests/test_data/test/test.inter +++ b/tests/test_data/test/test.inter @@ -1,6018 +1,6018 @@ -user_id:token item_id:token rating:float timestamp:float label:float -6 86 3 883603013 1 -38 95 5 892430094 0 -97 194 3 884238860 0 -7 32 4 891350932 1 -10 16 4 877888877 0 -99 4 5 886519097 0 -25 181 5 885853415 1 -59 196 5 888205088 0 -115 20 3 881171009 0 -138 26 5 879024232 1 -194 165 4 879546723 0 -11 111 4 891903862 0 -162 25 4 877635573 1 -135 23 4 879857765 1 -160 174 5 876860807 0 -42 96 5 881107178 1 -168 151 5 884288058 0 -58 144 4 884304936 0 -62 21 3 879373460 1 -44 195 5 878347874 1 -72 195 5 880037702 0 -82 135 3 878769629 1 -59 23 5 888205300 0 -43 14 2 883955745 1 -160 135 4 876860807 0 -90 98 5 891383204 0 -68 117 4 876973939 0 -172 177 4 875537965 1 -19 4 4 885412840 0 -5 2 3 875636053 0 -43 137 4 875975656 1 -99 79 4 885680138 0 -13 98 4 881515011 1 -1 61 4 878542420 0 -72 48 4 880036718 1 -92 77 3 875654637 1 -194 181 3 879521396 1 -151 10 5 879524921 1 -6 14 5 883599249 1 -54 106 3 880937882 0 -62 65 4 879374686 1 -92 172 4 875653271 1 -14 98 3 890881335 1 -194 54 3 879525876 1 -38 153 5 892430369 0 -193 96 1 
889124507 1 -158 177 4 880134407 1 -181 3 2 878963441 0 -13 198 3 881515193 0 -1 189 3 888732928 1 -16 64 5 877720297 1 -95 135 3 879197562 0 -145 15 2 875270655 0 -187 64 5 879465631 1 -184 153 3 889911285 0 -1 33 4 878542699 1 -1 160 4 875072547 0 -82 183 3 878769848 1 -13 56 5 881515011 0 -18 26 4 880129731 1 -144 89 3 888105691 0 -200 96 5 884129409 1 -16 197 5 877726146 0 -142 169 5 888640356 0 -87 40 3 879876917 1 -10 175 3 877888677 0 -197 96 5 891409839 1 -194 66 3 879527264 1 -104 117 2 888465972 0 -7 163 4 891353444 1 -13 186 4 890704999 0 -83 78 2 880309089 0 -151 197 5 879528710 1 -5 17 4 875636198 1 -125 163 5 879454956 0 -23 196 2 874786926 0 -128 15 4 879968827 1 -60 60 5 883327734 1 -99 111 1 885678886 1 -65 47 2 879216672 1 -137 144 5 881433689 0 -1 20 4 887431883 0 -96 156 4 884402860 1 -72 182 5 880036515 1 -187 135 4 879465653 0 -184 187 4 889909024 1 -92 168 4 875653723 1 -72 54 3 880036854 0 -117 150 4 880125101 0 -94 184 2 891720862 1 -130 109 3 874953794 1 -151 176 2 879524293 0 -45 25 4 881014015 1 -131 126 4 883681514 1 -109 8 3 880572642 1 -198 58 3 884208173 0 -157 25 3 886890787 1 -56 121 5 892679480 1 -62 12 4 879373613 1 -10 7 4 877892210 0 -6 98 5 883600680 0 -118 200 5 875384647 1 -10 100 5 877891747 1 -189 56 5 893265263 0 -56 71 4 892683275 1 -185 23 4 883524249 1 -109 127 2 880563471 1 -18 86 4 880129731 0 -22 128 5 878887983 0 -8 22 5 879362183 0 -1 171 5 889751711 1 -181 121 4 878962623 1 -200 11 5 884129542 0 -90 25 5 891384789 1 -22 80 4 878887227 1 -15 25 3 879456204 0 -16 55 5 877717956 0 -189 20 5 893264466 0 -125 80 4 892838865 0 -43 120 4 884029430 1 -42 44 3 881108548 0 -102 70 3 888803537 1 -77 172 3 884752562 0 -62 68 1 879374969 1 -85 51 2 879454782 0 -87 82 5 879875774 0 -194 172 3 879521474 0 -94 62 3 891722933 1 -108 100 4 879879720 1 -90 22 4 891384357 1 -92 121 5 875640679 0 -194 23 4 879522819 0 -188 143 5 875072674 0 -161 48 1 891170745 0 -59 92 5 888204997 0 -21 129 4 874951382 1 -58 9 4 884304328 0 -194 152 
3 879549996 1 -7 200 5 891353543 0 -113 126 5 875076827 1 -16 194 5 877720733 0 -79 50 4 891271545 1 -125 190 5 892836309 1 -150 181 5 878746685 1 -5 110 1 875636493 0 -1 155 2 878542201 1 -24 64 5 875322758 0 -82 56 3 878769410 0 -56 91 4 892683275 1 -16 8 5 877722736 0 -145 56 5 875271896 1 -17 13 3 885272654 1 -148 1 4 877019411 1 -21 164 5 874951695 1 -1 117 3 874965739 0 -60 162 4 883327734 1 -6 69 3 883601277 0 -110 38 3 886988574 1 -13 72 4 882141727 0 -194 77 3 879527421 1 -109 178 3 880572950 1 -62 182 5 879375169 0 -65 125 4 879217509 0 -90 12 5 891383241 0 -130 105 4 876251160 1 -96 87 4 884403531 0 -84 121 4 883452307 0 -198 118 2 884206513 0 -26 125 4 891371676 0 -151 13 3 879542688 1 -24 191 5 875323003 1 -13 181 5 882140354 0 -2 50 5 888552084 0 -144 125 4 888104191 0 -57 79 5 883698495 0 -121 180 3 891388286 0 -62 86 2 879374640 1 -194 187 4 879520813 0 -109 97 3 880578711 1 -8 50 5 879362124 0 -186 148 4 891719774 1 -175 127 5 877107640 0 -153 174 1 881371140 0 -62 59 4 879373821 1 -83 97 4 880308690 1 -63 100 5 875747319 0 -16 178 5 877719333 1 -85 25 2 879452769 0 -42 98 4 881106711 1 -184 98 4 889908539 0 -72 196 4 880036747 1 -128 182 4 879967225 1 -7 171 3 891351287 1 -181 14 1 878962392 0 -158 128 2 880134296 1 -1 47 4 875072125 1 -95 68 4 879196231 1 -6 23 4 883601365 1 -66 181 5 883601425 1 -76 61 4 875028123 1 -13 147 3 882397502 0 -16 89 2 877717833 1 -94 155 2 891723807 1 -136 89 4 882848925 0 -82 194 4 878770027 0 -178 199 4 882826306 0 -185 114 4 883524320 0 -94 24 4 885873423 0 -83 43 4 880308690 1 -59 177 4 888204349 1 -161 168 1 891171174 0 -43 40 3 883956468 1 -49 68 1 888069513 1 -44 15 4 878341343 1 -190 117 4 891033697 0 -29 189 4 882821942 1 -94 174 4 885870231 0 -117 181 5 880124648 0 -194 191 4 879521856 1 -158 24 4 880134261 0 -188 96 5 875073128 0 -58 173 5 884305353 1 -151 12 5 879524368 1 -14 174 5 890881294 1 -66 1 3 883601324 0 -5 1 4 875635748 1 -160 160 5 876862078 1 -109 1 4 880563619 0 -152 111 5 880148782 0 -194 
160 2 879551380 1 -77 91 3 884752924 1 -181 1 3 878962392 1 -18 182 4 880130640 1 -87 177 5 879875940 1 -177 69 1 880131088 0 -125 134 5 879454532 0 -59 77 4 888206254 0 -38 161 5 892432062 1 -121 14 5 891390014 1 -117 15 5 880125887 1 -85 187 5 879454235 0 -59 54 4 888205921 0 -13 195 3 881515296 0 -144 153 5 888105823 1 -1 113 5 878542738 0 -76 175 4 875028853 1 -121 117 1 891388600 1 -85 13 3 879452866 0 -184 191 4 889908716 1 -13 121 5 882397503 1 -43 5 4 875981421 0 -11 38 3 891905936 0 -37 117 4 880915674 0 -70 82 4 884068075 1 -5 98 3 875720691 0 -56 184 4 892679088 0 -45 109 5 881012356 0 -65 100 3 879217558 0 -184 86 5 889908694 1 -72 28 4 880036824 0 -115 8 5 881171982 0 -95 1 5 879197329 1 -151 58 4 879524849 1 -45 118 4 881014550 1 -145 22 5 875273021 1 -71 89 5 880864462 0 -182 69 5 876435435 1 -64 160 4 889739288 1 -28 79 4 881961003 0 -18 113 5 880129628 1 -83 82 5 887665423 0 -87 196 5 879877681 1 -150 129 4 878746946 1 -161 98 4 891171357 0 -51 182 3 883498790 0 -92 176 5 875652981 0 -92 180 5 875653016 0 -90 187 4 891383561 0 -66 7 3 883601355 1 -144 182 3 888105743 1 -85 83 4 886282959 0 -197 55 3 891409982 1 -25 25 5 885853415 0 -103 24 4 880415847 0 -87 9 4 879877931 0 -49 47 5 888068715 1 -44 95 4 878347569 1 -135 39 3 879857931 1 -13 66 3 882141485 0 -184 161 2 889909640 0 -142 82 4 888640356 1 -99 50 5 885679998 0 -16 56 5 877719863 0 -62 132 5 879375022 1 -13 59 4 882140425 1 -102 161 2 888801876 1 -56 172 5 892737191 1 -65 196 5 879216637 0 -92 115 3 875654125 0 -32 151 3 883717850 1 -180 68 5 877127721 0 -184 36 3 889910195 1 -73 94 1 888625754 1 -198 7 4 884205317 1 -189 197 5 893265291 1 -73 56 4 888626041 0 -5 102 3 875721196 0 -13 150 5 882140588 0 -104 7 3 888465972 1 -42 176 3 881107178 1 -92 15 3 875640189 1 -79 100 5 891271652 0 -1 17 3 875073198 0 -7 81 5 891352626 0 -59 148 3 888203175 0 -82 14 4 876311280 1 -195 154 3 888737525 0 -92 81 3 875654929 1 -94 58 5 891720540 1 -117 151 4 880126373 0 -91 28 4 891439243 1 -64 176 4 
889737567 1 -62 111 3 879372670 1 -95 172 4 879196847 1 -148 140 1 877019882 0 -185 199 4 883526268 1 -174 80 1 886515210 0 -42 195 5 881107949 0 -81 169 4 876534751 1 -62 114 4 879373568 0 -49 7 4 888067307 0 -58 100 5 884304553 1 -160 56 5 876770222 1 -103 127 4 880416331 0 -11 110 3 891905324 1 -87 2 4 879876074 0 -161 162 2 891171413 1 -23 172 4 874785889 1 -7 151 4 891352749 0 -84 12 5 883452874 0 -94 168 5 891721378 1 -144 106 3 888104684 0 -103 121 3 880415766 0 -200 24 2 884127370 0 -160 117 4 876767822 0 -158 72 3 880135118 0 -92 24 3 875640448 0 -164 117 5 889401816 0 -21 103 1 874951245 0 -1 90 4 878542300 1 -49 38 1 888068289 1 -151 89 5 879524491 1 -198 100 1 884207325 0 -194 4 4 879521397 1 -177 56 5 880130618 0 -57 28 4 883698324 0 -159 127 5 880989744 1 -16 155 3 877719157 1 -21 98 5 874951657 1 -77 195 5 884733695 1 -108 50 4 879879739 1 -184 181 4 889907426 0 -28 95 3 881956917 1 -181 16 1 878962996 1 -97 89 5 884238939 1 -109 101 1 880578186 0 -148 114 5 877016735 0 -94 9 5 885872684 1 -106 107 4 883876961 1 -67 64 5 875379211 1 -184 155 3 889912656 0 -68 7 3 876974096 1 -13 14 4 884538727 1 -71 134 3 885016614 0 -198 135 5 884208061 0 -98 47 4 880498898 0 -53 24 3 879442538 1 -7 106 4 891353892 0 -63 20 3 875748004 0 -42 185 4 881107449 1 -148 70 5 877021271 1 -184 71 4 889911552 1 -158 190 5 880134332 0 -83 118 3 880307071 1 -116 7 2 876453915 0 -52 95 4 882922927 1 -160 187 5 876770168 1 -26 25 3 891373727 0 -99 181 5 885680138 1 -56 196 2 892678628 0 -43 151 4 875975613 0 -62 24 4 879372633 1 -194 82 2 879524216 1 -42 69 4 881107375 0 -125 152 1 879454892 0 -63 50 4 875747292 1 -7 127 5 891351728 1 -6 143 2 883601053 0 -5 62 4 875637575 0 -184 100 5 889907652 1 -1 64 5 875072404 0 -142 181 5 888640317 0 -69 174 5 882145548 1 -49 17 2 888068651 1 -7 196 5 891351432 0 -175 96 3 877108051 1 -44 120 4 878346977 0 -83 139 3 880308959 1 -43 52 4 883955224 1 -174 160 5 886514377 0 -94 89 3 885870284 1 -7 44 5 891351728 0 -158 85 4 880135118 1 -196 
67 5 881252017 0 -99 182 4 886518810 1 -175 71 4 877107942 0 -11 190 3 891904174 0 -162 181 4 877635798 1 -59 70 3 888204758 1 -131 100 5 883681418 1 -22 79 4 878887765 1 -115 127 5 881171760 1 -178 73 5 882827985 1 -56 69 4 892678893 0 -13 144 4 882397146 1 -15 127 2 879455505 0 -37 55 3 880915942 1 -16 191 5 877719454 0 -97 98 4 884238728 1 -58 109 4 884304396 1 -189 1 5 893264174 0 -67 147 3 875379357 1 -81 3 4 876592546 1 -151 186 4 879524222 0 -53 174 5 879442561 1 -123 135 5 879872868 1 -151 15 4 879524879 0 -59 12 5 888204260 0 -59 170 4 888204430 0 -92 106 3 875640609 1 -97 50 5 884239471 0 -150 121 2 878747322 1 -23 170 4 874785348 1 -13 97 4 882399357 0 -28 98 5 881961531 0 -28 173 3 881956220 1 -38 139 2 892432786 0 -44 123 4 878346532 0 -18 154 4 880131358 1 -7 28 5 891352341 0 -115 92 4 881172049 1 -62 138 1 879376709 1 -41 28 4 890687353 0 -117 50 5 880126022 1 -178 106 2 882824983 0 -198 179 4 884209264 0 -99 168 5 885680374 1 -109 31 4 880577844 1 -43 64 5 875981247 1 -89 197 5 879459859 0 -7 153 5 891352220 1 -70 50 4 884064188 0 -43 66 4 875981506 1 -60 47 4 883326399 1 -92 79 4 875653198 0 -97 115 5 884239525 1 -123 192 5 879873119 1 -49 49 2 888068990 1 -21 184 4 874951797 0 -145 183 5 875272009 0 -76 92 4 882606108 1 -48 174 5 879434723 0 -5 24 4 879198229 0 -64 93 2 889739025 1 -96 153 4 884403624 1 -150 100 2 878746636 1 -93 15 5 888705388 0 -13 167 4 882141659 1 -18 58 4 880130613 1 -145 13 5 875270507 0 -145 1 3 882181396 1 -7 188 5 891352778 1 -109 100 4 880563080 0 -7 78 3 891354165 0 -82 73 4 878769888 0 -145 50 5 885557660 0 -85 175 4 879828912 1 -124 50 3 890287508 0 -151 162 5 879528779 1 -187 116 5 879464978 0 -69 12 5 882145567 0 -85 133 4 879453876 0 -114 175 5 881259955 0 -42 121 4 881110578 1 -94 186 4 891722278 1 -85 98 4 879453716 1 -116 185 3 876453519 0 -123 13 3 879873988 1 -95 174 5 879196231 0 -178 148 4 882824325 1 -138 121 4 879023558 0 -30 82 4 875060217 1 -69 175 3 882145586 0 -16 144 5 877721142 0 -128 140 4 879968308 
1 -95 128 3 879196354 0 -124 11 5 890287645 1 -7 133 5 891353192 0 -28 7 5 881961531 0 -7 93 5 891351042 1 -154 175 5 879138784 0 -44 56 2 878348601 0 -130 161 4 875802058 1 -98 163 3 880499053 0 -128 79 4 879967692 1 -195 186 3 888737240 0 -189 91 3 893265684 0 -95 143 4 880571951 0 -94 157 5 891725332 0 -7 174 5 891350757 1 -177 79 4 880130758 1 -77 168 4 884752721 1 -144 31 3 888105823 0 -94 33 3 891721919 1 -178 125 4 882824431 0 -138 151 4 879023389 1 -189 30 4 893266205 0 -198 24 2 884205385 0 -125 173 5 879454100 1 -128 143 5 879967300 0 -65 56 3 879217816 0 -144 91 2 888106106 0 -197 176 5 891409798 1 -26 15 4 891386369 1 -7 182 4 891350965 0 -109 154 2 880578121 0 -161 174 2 891170800 0 -109 89 4 880573263 1 -195 181 5 875771440 1 -7 193 5 892135346 0 -77 125 3 884733014 0 -85 58 4 879829689 1 -1 92 3 876892425 1 -90 31 4 891384673 0 -158 1 4 880132443 0 -42 143 4 881108229 0 -43 26 5 883954901 1 -130 200 5 875217392 1 -68 118 2 876974248 0 -102 118 3 888801465 0 -189 120 1 893264954 1 -20 11 2 879669401 1 -20 176 2 879669152 0 -49 148 1 888068195 0 -160 3 3 876770124 1 -152 147 3 880149045 1 -162 121 4 877636000 0 -178 121 5 882824291 0 -76 135 5 875028792 0 -75 121 4 884050450 1 -44 174 5 878347662 1 -145 172 5 882181632 0 -188 191 3 875073128 1 -37 183 4 880930042 0 -125 150 1 879454892 0 -56 194 5 892676908 1 -16 92 4 877721905 0 -60 79 4 883326620 1 -1 121 4 875071823 1 -62 4 4 879374640 1 -26 7 3 891350826 0 -121 86 5 891388286 1 -198 180 3 884207298 0 -1 114 5 875072173 0 -180 79 3 877442037 1 -67 1 3 875379445 1 -1 132 4 878542889 0 -1 74 1 889751736 0 -22 173 5 878886368 1 -1 134 4 875073067 0 -94 45 5 886008764 1 -6 180 4 883601311 0 -188 88 4 875075300 1 -137 55 5 881433689 0 -91 172 4 891439208 1 -150 13 4 878746889 1 -151 25 4 879528496 1 -181 123 2 878963276 1 -194 196 3 879524007 0 -109 5 3 880580637 0 -16 168 4 877721142 0 -74 9 4 888333458 1 -144 66 4 888106078 1 -195 14 4 890985390 0 -18 199 3 880129769 1 -174 41 1 886515063 1 -109 159 4 
880578121 0 -56 68 3 892910913 0 -109 195 5 880578038 0 -183 96 3 891463617 1 -178 131 4 882827947 1 -119 54 4 886176814 0 -1 98 4 875072404 0 -64 187 5 889737395 1 -82 15 3 876311365 1 -1 186 4 875073128 1 -181 20 1 878962919 1 -87 135 5 879875649 0 -87 157 3 879877799 1 -87 163 4 879877083 0 -96 91 5 884403250 0 -24 153 4 875323368 1 -43 114 5 883954950 0 -42 48 5 881107821 0 -125 97 3 879454385 1 -108 13 3 879879834 1 -144 62 2 888105902 1 -148 172 5 877016513 1 -188 159 3 875074589 1 -44 88 2 878348885 1 -190 147 4 891033863 0 -185 127 5 883525183 1 -150 1 4 878746441 0 -60 179 4 883326566 1 -75 147 3 884050134 0 -59 121 4 888203313 1 -7 22 5 891351121 1 -85 53 3 882995643 1 -95 176 3 879196298 1 -144 64 5 888105140 0 -56 29 3 892910913 1 -200 72 4 884129542 1 -130 56 5 875216283 1 -49 102 2 888067164 1 -177 89 5 880131088 1 -42 102 5 881108873 1 -180 67 1 877127591 0 -23 183 3 874785728 0 -65 97 5 879216605 0 -92 134 4 875656623 0 -152 25 3 880149045 0 -62 28 3 879375169 1 -64 77 3 889737420 0 -15 20 3 879455541 0 -14 22 3 890881521 0 -62 157 3 879374686 0 -59 13 5 888203415 1 -73 12 5 888624976 0 -6 95 2 883602133 0 -87 70 5 879876448 0 -1 84 4 875072923 0 -22 186 5 878886368 0 -72 129 4 880035588 0 -1 31 3 875072144 1 -22 96 5 878887680 1 -85 97 2 879829667 1 -181 7 4 878963037 1 -94 180 5 885870284 1 -16 70 4 877720118 1 -58 45 5 884305295 1 -151 191 3 879524326 1 -158 38 4 880134607 1 -181 124 1 878962550 1 -145 182 5 885622510 0 -44 11 3 878347915 0 -49 10 3 888066086 1 -17 151 4 885272751 1 -59 47 5 888205574 0 -14 111 3 876965165 1 -195 100 5 875771440 0 -130 172 5 875801530 1 -177 124 3 880130881 0 -1 70 3 875072895 0 -13 178 4 882139829 1 -30 181 4 875060217 1 -8 182 5 879362183 0 -7 162 5 891353444 1 -56 63 3 892910268 1 -92 175 4 875653549 0 -18 196 3 880131297 0 -158 79 4 880134332 0 -87 67 4 879877007 0 -90 11 4 891384113 0 -1 60 5 875072370 0 -119 154 5 874782022 1 -83 186 4 880308601 1 -1 177 5 876892701 1 -59 10 4 888203234 0 -10 48 4 877889058 
0 -99 124 2 885678886 1 -152 132 5 882475496 1 -189 45 3 893265657 0 -91 193 3 891439057 1 -14 56 5 879119579 1 -13 42 4 882141393 1 -159 111 4 880556981 0 -137 195 5 881433689 1 -152 97 5 882475618 0 -63 150 4 875747292 0 -200 103 2 891825521 0 -13 94 3 882142057 1 -14 93 3 879119311 0 -38 122 1 892434801 0 -148 177 2 877020715 0 -184 47 4 889909640 0 -145 25 2 875270655 0 -59 132 5 888205744 0 -1 27 2 876892946 1 -104 122 3 888465739 1 -60 178 5 883326399 1 -200 191 5 884128554 0 -148 185 1 877398385 1 -13 180 5 882141248 0 -25 174 5 885853415 1 -157 150 5 874813703 1 -106 69 4 881449886 0 -80 50 3 887401533 1 -56 174 5 892737191 0 -82 69 4 878769948 0 -83 95 4 880308453 1 -17 9 3 885272558 1 -82 147 3 876311473 1 -62 135 4 879375080 1 -5 167 2 875636281 0 -118 174 5 875385007 1 -13 29 2 882397833 1 -125 158 4 892839066 0 -43 15 5 875975546 0 -193 195 1 889124507 0 -117 1 4 880126083 1 -103 117 4 880416313 1 -104 100 4 888465166 0 -95 96 4 879196298 1 -49 1 2 888068651 0 -1 145 2 875073067 1 -1 174 5 875073198 1 -10 124 5 877888545 1 -81 118 2 876533764 1 -136 117 4 882694498 1 -115 11 4 881171348 0 -64 2 3 889737609 1 -28 50 4 881957090 0 -1 159 3 875073180 0 -60 172 4 883326339 1 -18 69 3 880129527 0 -184 132 5 889913687 0 -151 169 5 879524268 1 -110 79 4 886988480 1 -128 111 3 879969215 1 -1 82 5 878542589 0 -13 45 3 882139863 1 -94 185 5 885873684 1 -128 83 5 879967691 1 -142 189 4 888640317 1 -1 56 4 875072716 0 -184 14 4 889907738 1 -198 156 3 884207058 0 -194 153 3 879546723 1 -136 14 5 882693338 0 -73 127 5 888625200 1 -116 187 5 886310197 1 -28 12 4 881956853 1 -85 86 4 879454189 0 -151 7 4 879524610 0 -1 80 4 876893008 1 -44 153 4 878347234 0 -94 79 4 885882967 0 -109 62 3 880578711 0 -49 173 3 888067691 1 -121 121 2 891388501 0 -60 183 5 883326399 1 -198 51 3 884208455 1 -13 2 3 882397650 0 -44 55 4 878347455 0 -37 56 5 880915810 1 -194 162 3 879549899 1 -130 71 5 875801695 1 -130 50 5 874953665 0 -125 22 5 892836395 0 -69 56 5 882145428 1 -110 188 4 
886988574 1 -106 45 3 881453290 1 -151 66 4 879524974 0 -123 22 4 879809943 0 -198 148 3 884206401 1 -56 79 4 892676303 1 -151 175 5 879524244 1 -152 125 5 880149165 0 -123 165 5 879872672 1 -169 174 4 891359418 0 -63 109 4 875747731 0 -72 89 3 880037164 0 -80 87 4 887401307 0 -85 56 4 879453587 0 -194 56 5 879521936 1 -110 82 4 886988480 0 -7 195 5 891352626 0 -12 82 4 879959610 0 -109 90 3 880583192 1 -13 64 5 882140037 0 -82 64 5 878770169 0 -42 70 3 881109148 1 -10 4 4 877889130 1 -14 175 5 879119497 1 -6 134 5 883602283 1 -28 153 3 881961214 0 -62 96 4 879374835 0 -102 195 4 888801360 1 -8 79 4 879362286 1 -28 184 4 881961671 1 -51 148 3 883498623 0 -186 53 1 879023882 0 -141 125 5 884585642 1 -23 88 3 874787410 1 -72 79 4 880037119 0 -82 13 2 878768615 0 -83 77 4 880308426 0 -43 7 4 875975520 0 -23 90 2 874787370 1 -106 97 5 881450810 0 -109 147 4 880564679 1 -156 58 4 888185906 0 -16 151 5 877721905 0 -94 99 3 891721815 1 -154 137 3 879138657 0 -158 144 4 880134445 1 -11 120 2 891903935 0 -197 181 5 891409893 0 -65 70 1 879216529 0 -128 77 3 879968447 1 -167 48 1 892738277 0 -56 143 3 892910182 0 -115 69 1 881171825 1 -145 109 4 875270903 1 -59 127 5 888204430 1 -58 42 4 884304936 0 -77 23 4 884753173 1 -95 15 4 879195062 0 -184 172 4 889908497 1 -13 168 4 881515193 0 -158 8 5 880134948 1 -92 87 3 876175077 0 -20 118 4 879668442 0 -95 33 3 880571704 0 -130 125 5 875801963 1 -174 107 5 886434361 1 -97 7 5 884238939 0 -125 143 5 879454793 1 -160 126 3 876769148 0 -32 117 3 883717555 0 -1 140 1 878543133 1 -5 173 4 875636675 1 -49 117 1 888069459 1 -25 127 3 885853030 1 -92 85 3 875812364 0 -187 70 4 879465394 1 -194 62 2 879524504 1 -70 71 3 884066399 0 -49 72 2 888069246 0 -194 132 3 879520991 0 -175 31 4 877108051 0 -138 100 5 879022956 0 -63 6 3 875747439 1 -180 121 5 877127830 1 -148 98 3 877017714 0 -102 66 3 892992129 0 -158 42 3 880134913 0 -70 151 3 884148603 1 -103 144 4 880420510 0 -95 173 5 879198547 1 -102 67 1 892993706 0 -160 93 5 876767572 1 -99 
118 2 885679237 0 -70 152 4 884149877 0 -41 31 3 890687473 1 -178 179 2 882828320 0 -6 19 4 883602965 1 -130 55 5 875216507 1 -136 56 4 882848783 0 -74 15 4 888333542 1 -1 120 1 875241637 1 -64 100 4 879365558 1 -6 154 3 883602730 0 -60 152 4 883328033 1 -161 14 4 891171413 0 -18 82 3 880131236 1 -22 29 1 878888228 1 -96 8 5 884403020 0 -72 176 2 880037203 0 -102 89 4 888801315 1 -60 151 5 883326995 1 -13 90 3 882141872 0 -7 92 5 891352010 0 -91 195 5 891439057 0 -62 8 5 879373820 0 -197 68 2 891410082 1 -26 9 4 891386369 0 -119 193 4 874781872 1 -117 174 4 881011393 1 -189 129 3 893264378 0 -1 125 3 878542960 1 -23 83 4 874785926 1 -6 175 4 883601426 0 -184 89 4 889908572 0 -44 155 3 878348947 1 -90 199 5 891384423 1 -130 90 4 875801920 0 -20 186 3 879669040 0 -37 79 4 880915810 1 -163 56 4 891220097 1 -72 82 3 880037242 1 -117 176 5 881012028 0 -121 174 3 891388063 0 -20 172 3 879669181 1 -108 125 3 879879864 1 -49 53 4 888067405 0 -106 165 5 881450536 0 -85 71 4 879456308 0 -151 91 2 879542796 0 -116 195 4 876453626 1 -144 172 4 888105312 1 -74 126 3 888333428 0 -45 127 5 881007272 1 -109 4 2 880572756 0 -12 96 4 879959583 1 -109 42 1 880572756 0 -174 82 1 886515472 0 -180 83 5 877128388 0 -150 127 5 878746889 1 -102 83 3 888803487 1 -128 97 3 879968125 1 -11 90 2 891905298 0 -194 52 4 879525876 0 -177 87 4 880130931 0 -68 178 5 876974755 1 -90 179 5 891385389 1 -13 88 4 882141485 0 -120 25 5 889490370 1 -138 98 5 879024043 0 -160 124 4 876767360 0 -94 133 4 885882685 0 -121 122 2 891390501 1 -19 153 4 885412840 1 -90 132 5 891384673 0 -49 40 1 888069222 0 -7 90 3 891352984 1 -21 56 5 874951658 0 -184 126 3 889907971 0 -26 100 5 891386368 0 -21 106 2 874951447 0 -90 9 4 891385787 1 -31 135 4 881548030 0 -62 89 5 879374640 1 -1 6 5 887431973 1 -10 22 5 877888812 1 -90 30 5 891385843 1 -1 104 1 875241619 1 -76 100 5 875028391 1 -11 97 4 891904300 0 -83 125 5 880306811 0 -16 22 5 877721071 0 -10 155 4 877889186 1 -92 132 3 875812211 0 -18 25 3 880131591 1 -12 172 4 
879959088 0 -57 56 3 883698646 1 -73 196 4 888626177 0 -7 10 4 891352864 1 -118 176 5 875384793 1 -77 153 5 884732685 0 -151 196 4 879542670 1 -102 186 4 888803487 0 -14 100 5 876965165 0 -130 148 4 876251127 1 -158 100 5 880132401 1 -59 14 5 888203234 0 -1 49 3 878542478 0 -94 109 4 891721974 0 -102 62 3 888801812 1 -118 156 5 875384946 0 -81 93 3 876533657 1 -79 124 5 891271870 0 -106 15 3 883876518 1 -73 7 4 888625956 1 -187 28 4 879465597 0 -15 137 4 879455939 0 -77 4 3 884752721 1 -184 92 3 889908657 0 -6 188 3 883602462 1 -194 51 4 879549793 1 -56 1 4 892683248 1 -177 182 5 880130684 1 -1 76 4 878543176 1 -106 64 4 881449830 0 -157 127 5 886890541 1 -56 31 4 892679259 1 -60 28 5 883326155 0 -12 143 5 879959635 0 -102 121 3 888801673 1 -92 123 2 875640251 1 -22 117 4 878887869 0 -18 190 4 880130155 0 -72 64 5 880036549 0 -1 72 4 878542678 0 -48 187 5 879434954 1 -94 153 5 891725333 1 -128 64 5 879966954 1 -62 153 4 879374686 0 -53 100 5 879442537 0 -174 94 2 886515062 1 -5 154 3 875636691 0 -200 7 4 876042451 1 -65 121 4 879217458 0 -63 111 3 875747896 1 -198 11 4 884207392 1 -91 99 2 891439386 0 -42 131 2 881108548 0 -152 98 2 882473974 1 -55 144 5 878176398 1 -125 175 2 879455184 1 -82 178 4 878769629 0 -1 185 4 875072631 1 -184 15 3 889907812 1 -152 167 5 882477430 0 -144 50 5 888103929 1 -97 28 5 884238778 1 -114 195 4 881260861 0 -188 69 4 875072009 1 -106 77 4 881451716 1 -188 7 5 875073477 1 -96 64 5 884403336 1 -160 79 4 876859413 0 -18 191 4 880130193 0 -162 42 3 877636675 1 -95 26 3 880571951 0 -58 8 4 884304955 0 -110 22 4 886987826 0 -1 96 5 875072716 0 -89 127 5 879441335 0 -95 137 3 879192404 1 -17 1 4 885272579 1 -87 154 4 879876564 1 -135 54 3 879858003 1 -14 151 5 876964725 0 -148 71 5 877019251 0 -6 156 3 883602212 1 -130 58 2 876251619 1 -76 12 3 882606060 1 -95 32 1 888954726 0 -130 47 3 875801470 1 -12 97 5 879960826 0 -38 99 5 892430829 1 -198 188 5 884208200 1 -72 45 5 880037853 0 -44 82 4 878348885 0 -198 97 3 884207112 0 -189 60 3 
893265773 0 -28 100 5 881957425 1 -119 86 4 874782068 0 -174 117 5 886434136 0 -14 13 4 880929778 0 -103 126 5 880420002 1 -94 101 2 891720996 0 -92 42 4 875653664 0 -45 121 4 881013563 0 -175 56 2 877107790 1 -185 196 4 883524172 0 -49 168 5 888068686 0 -72 68 3 880037242 0 -72 12 5 880036664 0 -49 56 5 888067307 1 -82 191 4 878769748 0 -151 100 3 879524514 0 -20 194 3 879669152 0 -145 185 4 875271838 1 -169 172 5 891359317 1 -65 191 4 879216797 0 -121 125 2 891388600 0 -59 7 4 888202941 1 -52 116 4 882922328 1 -59 100 5 888202899 0 -24 129 3 875246185 1 -92 48 4 875653307 0 -158 68 3 880134532 1 -145 174 5 882181728 1 -64 8 4 889737968 0 -7 168 5 891351509 0 -161 56 3 891171257 1 -96 100 5 884403758 1 -91 131 2 891439471 0 -178 135 2 882826915 1 -135 176 4 879857765 1 -102 173 3 888803602 0 -194 30 3 879524504 1 -11 47 4 891904551 1 -162 174 4 877636772 0 -5 42 5 875636360 0 -82 11 4 878769992 1 -178 193 4 882826868 0 -193 117 4 889125913 1 -117 168 5 881012550 1 -162 50 5 877635662 0 -77 181 3 884732278 1 -177 1 3 880130699 0 -89 117 5 879441357 1 -28 174 5 881956334 0 -188 173 5 875075118 0 -48 50 4 879434723 1 -7 54 3 892132380 1 -200 121 5 876042268 0 -7 89 5 891351082 0 -151 193 4 879524491 1 -38 67 4 892434312 0 -156 12 3 888185853 1 -42 142 4 881109271 1 -59 126 5 888202899 1 -109 69 4 880572561 1 -28 143 4 881956564 1 -23 28 3 874786793 0 -1 81 5 875072865 1 -124 166 5 890287645 0 -198 15 3 884205185 0 -113 100 4 875935610 1 -156 64 3 888185677 0 -64 56 5 889737542 1 -6 133 4 883601459 0 -130 158 5 875801897 1 -18 14 5 880130431 1 -95 132 3 880570993 0 -10 64 4 877886598 0 -164 125 5 889402071 1 -141 50 4 884584735 1 -114 191 3 881309511 0 -82 127 2 878769777 1 -55 56 4 878176397 1 -160 21 1 876769480 0 -23 177 4 884550003 1 -32 100 3 883717662 0 -59 134 5 888204841 1 -43 117 4 883954853 0 -1 78 1 878543176 1 -6 70 3 883601427 0 -18 89 3 880130065 1 -197 187 5 891409798 1 -46 127 5 883616133 1 -62 100 4 879372276 0 -130 3 5 876250897 0 -83 22 5 880307724 
1 -59 188 4 888205188 0 -145 200 4 877343121 0 -160 175 4 876860808 0 -13 25 1 882141686 0 -7 142 3 891354090 1 -72 181 1 880037203 1 -7 156 5 891351653 0 -49 129 2 888068079 1 -23 188 3 877817151 0 -59 48 5 888204502 0 -49 3 3 888068877 1 -56 98 4 892679067 0 -130 183 5 875801369 1 -18 194 3 880129816 0 -69 109 3 882145428 0 -42 25 3 881110670 0 -144 22 5 888105439 0 -102 183 4 888801360 0 -121 9 5 891390013 0 -90 6 4 891384357 1 -98 70 3 880499018 1 -189 173 5 893265160 0 -169 181 5 891359276 1 -95 24 3 879192542 1 -56 82 4 892676314 1 -23 99 4 874786098 0 -118 185 5 875384979 0 -18 71 4 880131236 0 -130 49 4 875802236 1 -14 7 5 876965061 0 -10 200 5 877889261 1 -119 144 4 887038665 0 -72 70 4 880036691 0 -94 31 4 891721286 0 -130 53 3 876251972 0 -95 88 4 880571016 1 -58 156 5 884304955 1 -13 161 5 882397741 1 -65 197 5 879216769 0 -42 99 5 881108346 0 -81 7 4 876533545 1 -119 87 5 874781829 1 -8 89 4 879362124 0 -6 151 3 883599558 1 -177 150 4 880130807 0 -117 121 4 880126038 1 -194 1 4 879539127 1 -184 88 3 889909551 0 -142 28 4 888640404 1 -99 123 3 885678997 0 -1 143 1 875072631 1 -195 99 3 888737277 1 -59 25 4 888203270 1 -64 173 5 889737454 0 -59 65 4 888205265 1 -174 63 4 886514985 0 -1 151 4 875072865 0 -56 94 4 892910292 0 -59 175 4 888205300 1 -164 148 5 889402203 1 -116 180 5 886310197 1 -1 51 4 878543275 0 -130 12 4 875216340 1 -90 185 5 891384959 0 -12 132 5 879959465 1 -5 139 3 875721260 1 -192 127 4 881367456 0 -135 77 4 879858003 0 -94 39 3 891721317 0 -177 175 5 880130972 1 -162 151 3 877636191 0 -87 55 4 879875774 1 -190 118 3 891033906 0 -106 8 4 881452405 0 -188 195 3 875073179 1 -177 179 5 880131057 0 -53 181 4 879443046 1 -117 12 5 881011350 0 -162 117 4 877635869 0 -114 157 2 881260611 1 -184 52 4 889910034 0 -99 196 4 885680578 1 -123 127 5 879809943 0 -70 176 4 884066573 1 -96 170 5 884403866 0 -13 190 4 882397145 0 -94 34 1 891723558 0 -18 12 5 880129991 1 -178 58 5 882827134 0 -114 183 5 881260545 1 -13 137 5 882139804 1 -79 137 4 
891271870 1 -18 181 3 880131631 1 -84 31 4 883453755 1 -76 59 4 875027981 0 -200 25 4 876042234 0 -197 195 5 891409798 0 -64 181 4 889737420 1 -132 137 4 891278996 1 -145 120 2 888398563 0 -51 132 4 883498655 1 -130 84 4 876252497 0 -8 190 4 879362183 1 -24 25 4 875246258 1 -116 199 4 876454174 1 -109 9 3 880564607 0 -200 143 5 884128499 1 -99 11 5 885680138 0 -145 159 4 875272299 1 -200 82 5 884129656 0 -85 124 5 882813248 1 -6 131 5 883602048 0 -156 192 4 888185735 0 -130 22 5 875217265 0 -12 157 5 879959138 0 -151 114 5 879524268 1 -130 63 4 876252521 0 -144 129 4 888104234 0 -16 96 5 877717833 0 -1 175 5 875072547 1 -80 45 4 887401585 0 -12 71 4 879959635 1 -59 141 4 888206605 0 -56 118 4 892679460 1 -198 23 4 884208491 1 -77 179 5 884752806 0 -89 26 3 879459909 1 -53 199 5 879442384 0 -32 118 3 883717967 0 -18 180 4 880130252 0 -55 89 5 878176398 1 -177 197 4 880130758 1 -44 168 5 878347504 0 -90 42 4 891384885 0 -137 50 5 881432937 1 -109 117 5 880564457 1 -85 199 5 879829438 0 -62 183 4 879374893 0 -95 2 2 888955909 1 -153 64 5 881371005 0 -62 173 5 879374732 1 -160 4 4 876861754 0 -12 15 5 879959670 1 -62 78 2 879376612 0 -89 151 5 879441507 0 -120 9 4 889489886 0 -73 28 3 888626468 1 -87 88 5 879876672 0 -175 176 3 877107255 1 -185 197 5 883524428 0 -130 150 5 874953558 0 -109 176 5 880577868 1 -94 28 4 885873159 1 -178 70 4 882827083 1 -7 172 4 891350965 0 -44 106 2 878347076 1 -184 13 3 889907839 1 -73 156 4 888625835 0 -18 179 4 880129877 1 -200 29 4 884130540 0 -6 28 2 883603013 0 -154 182 5 879138783 1 -154 50 5 879138657 1 -94 118 3 891723295 0 -44 185 4 878347569 0 -102 176 3 888801360 1 -82 25 2 878768435 0 -14 70 1 879119692 0 -122 70 5 879270606 1 -23 32 3 874785809 1 -12 191 5 879960801 0 -6 136 5 883600842 0 -77 176 4 884752757 1 -200 33 4 884129602 0 -119 12 3 874781915 1 -90 178 5 891384611 0 -181 21 1 878963381 0 -156 137 4 888185735 0 -181 112 1 878962955 0 -14 14 3 879119311 1 -57 173 5 883698408 1 -89 83 4 879459884 0 -2 13 4 888551922 1 
-131 1 4 883681384 1 -6 117 2 883599431 1 -1 107 4 875241619 1 -6 32 4 883601311 0 -72 124 4 880035636 1 -123 50 3 879873726 1 -181 148 2 878963204 1 -83 28 4 880308284 0 -92 183 4 875653960 0 -12 196 5 879959553 0 -94 64 5 885870362 1 -87 182 4 879875737 1 -58 20 1 884304538 1 -44 9 5 878341196 1 -180 111 5 877127747 0 -108 181 3 879879985 0 -153 22 2 881371140 0 -119 188 4 874781742 1 -189 21 2 893264619 1 -14 181 5 889666215 1 -91 82 5 891439386 0 -32 122 2 883718250 1 -6 15 3 883599302 0 -87 79 5 879875856 0 -195 61 3 888737277 1 -158 11 4 880134398 0 -13 48 5 882139863 1 -189 121 2 893264816 0 -94 50 5 891720996 1 -153 127 3 881371140 1 -200 45 3 884128372 1 -82 103 2 878768665 1 -64 83 3 889737654 0 -59 102 2 888205956 0 -161 127 3 891171698 1 -69 9 4 882126086 1 -95 14 5 879197329 1 -42 12 4 881107502 0 -67 121 4 875379683 1 -188 148 4 875074667 0 -119 111 5 886176779 1 -13 21 3 882399040 0 -184 77 3 889910217 1 -92 196 4 875654222 0 -95 83 5 880573288 0 -11 135 4 891904335 0 -178 178 4 882826395 1 -189 143 5 893266027 0 -188 13 4 875073408 0 -124 157 2 890287936 1 -6 135 5 883600747 0 -69 48 5 882145428 1 -57 7 4 883697105 0 -7 8 5 891351328 1 -106 1 4 881449487 1 -180 69 4 877355568 1 -144 194 5 888105287 0 -73 48 2 888625785 0 -189 100 4 893263994 0 -194 117 3 879535704 1 -42 82 4 881107449 1 -174 49 4 886513788 0 -75 108 4 884050661 1 -41 170 4 890687713 0 -174 196 5 886514108 0 -137 172 5 881433719 0 -60 176 4 883326057 0 -115 172 4 881171273 1 -13 61 4 882140552 1 -108 121 3 879880190 1 -62 33 1 879374785 1 -200 151 3 876042204 0 -180 56 5 877127130 0 -60 194 4 883326425 1 -14 121 3 876965061 0 -18 136 5 880129421 1 -144 33 5 888105902 0 -200 38 3 884130348 0 -5 40 4 879198109 0 -99 7 4 885678784 0 -90 166 4 891383423 1 -184 196 4 889908985 0 -197 92 1 891410082 1 -5 90 3 875636297 1 -80 58 4 887401677 1 -178 76 3 882827288 0 -62 147 3 879372870 1 -63 13 4 875747439 0 -194 124 4 879539229 0 -71 56 5 885016930 1 -10 135 5 877889004 0 -54 121 4 880936669 
0 -138 111 4 879022890 1 -67 151 4 875379619 0 -16 183 5 877720733 0 -13 40 2 886302815 0 -5 153 5 875636375 1 -168 7 1 884287559 0 -109 200 2 880577734 0 -128 173 5 879966756 0 -197 33 2 891409981 0 -16 27 2 877726390 1 -13 73 3 882141485 1 -84 151 4 883449993 1 -189 96 5 893265971 1 -66 117 3 883601787 0 -101 118 3 877136424 0 -94 63 3 891723908 0 -43 118 4 883955546 0 -42 88 5 881108425 1 -158 182 5 880134296 0 -157 3 3 886890734 1 -65 135 4 879216567 0 -62 179 4 879374969 0 -43 54 3 883956494 0 -94 144 3 891721168 0 -151 47 3 879528459 0 -184 34 2 889913568 0 -200 15 4 884127745 0 -5 94 3 878844651 1 -99 56 5 885679833 1 -42 28 5 881108187 1 -184 70 4 889908657 0 -77 50 4 884732345 1 -144 73 3 888105636 0 -56 186 3 892676933 1 -69 151 5 882072998 1 -1 108 5 875240920 0 -174 118 2 886434186 1 -145 44 5 875272132 1 -186 71 5 879024535 1 -82 109 1 884714204 0 -200 173 5 884128554 1 -177 195 4 880130699 0 -62 121 4 879372916 0 -49 122 2 888069138 1 -90 96 4 891384754 0 -56 95 4 892683274 0 -38 71 5 892430516 1 -135 33 3 879857930 1 -182 172 5 876435435 1 -130 4 2 875801778 0 -1 12 5 878542960 0 -13 118 4 882397581 1 -10 164 4 877889333 1 -109 96 5 880572614 0 -76 150 5 875028880 1 -5 109 5 875635350 0 -56 179 3 892678669 0 -59 195 5 888204757 1 -90 86 5 891383626 1 -94 156 5 891725332 1 -60 71 3 883327948 0 -198 172 4 884207206 1 -10 191 5 877888613 1 -130 134 5 875801750 1 -15 18 1 879455681 1 -43 161 4 883955467 0 -176 100 5 886047918 0 -124 79 3 890287395 0 -188 98 5 875071957 0 -96 173 3 884402791 1 -118 23 5 875384979 0 -188 38 3 875073828 0 -188 77 4 875072328 0 -184 124 5 889907652 1 -125 28 4 879454385 1 -177 196 3 880130881 0 -145 105 2 875271442 1 -58 182 4 884304701 0 -16 164 5 877724438 0 -1 14 5 874965706 0 -151 65 4 879528729 0 -109 131 1 880579757 0 -125 64 5 879454139 1 -41 98 4 890687374 1 -54 147 5 880935959 0 -125 25 1 879454987 1 -92 88 3 875656349 0 -194 26 3 879522240 1 -92 181 4 876175052 1 -148 169 5 877020297 0 -56 181 5 892737154 1 -64 7 4 
889737542 0 -1 97 3 875073128 0 -62 155 1 879376633 0 -90 197 5 891383319 0 -193 174 4 889125720 1 -54 127 4 880933834 0 -128 56 3 879966785 0 -49 151 5 888067727 0 -59 125 3 888203658 1 -1 44 5 878543541 1 -8 172 5 879362123 0 -56 96 5 892676429 0 -74 100 4 888333428 1 -92 32 3 875653363 1 -18 57 4 880130930 0 -43 50 4 875975211 0 -59 136 3 888205336 1 -131 14 5 883681313 0 -95 117 4 879193619 1 -85 8 4 879454952 0 -25 135 3 885852059 0 -1 53 3 876893206 1 -49 52 2 888066647 1 -97 168 4 884238693 1 -84 64 5 883450066 1 -60 186 4 883326566 0 -43 1 5 875975579 1 -178 22 5 882826187 0 -104 25 3 888465634 1 -6 125 3 883599670 1 -137 183 5 881433689 0 -194 185 4 879521254 1 -1 163 4 875072442 0 -181 149 1 878962719 0 -18 195 3 880131236 1 -163 64 4 891220161 1 -22 121 3 878887925 1 -77 174 5 884733587 0 -128 190 4 879967016 0 -158 163 4 880135044 1 -178 83 4 882826556 1 -16 69 5 877724846 1 -168 123 3 884287822 0 -90 177 5 891384516 1 -20 1 3 879667963 0 -56 73 4 892677094 1 -43 47 1 883955415 0 -7 82 3 891351471 1 -64 38 3 889740415 0 -25 151 4 885853335 1 -181 125 3 878962816 1 -97 97 5 884239525 0 -20 69 1 879668979 0 -92 189 4 875653519 1 -92 191 4 875653050 0 -152 162 5 882474898 1 -106 86 3 881451355 1 -68 50 5 876973969 1 -9 6 5 886960055 1 -194 58 4 879522917 1 -168 25 5 884287885 0 -142 89 3 888640489 0 -58 193 3 884305220 1 -77 69 3 884752997 1 -18 185 3 880129388 0 -174 29 2 886514469 1 -178 89 4 882826514 1 -10 156 4 877886846 1 -200 174 5 884128426 0 -62 118 2 879373007 0 -198 184 3 884209003 1 -6 199 4 883601203 1 -150 50 5 878746719 0 -92 190 4 876174729 1 -174 66 5 886513706 0 -56 51 3 892677186 0 -21 121 1 874951416 0 -92 129 4 886443161 1 -177 47 3 880131187 0 -49 101 3 888067164 1 -92 31 4 875654321 0 -59 169 4 888204757 1 -75 137 4 884050102 0 -92 11 4 875653363 0 -15 148 3 879456049 0 -18 186 4 880131699 1 -1 184 4 875072956 0 -87 96 5 879875734 1 -178 99 4 882827574 1 -158 176 4 880134398 0 -22 176 5 878887765 0 -6 183 4 883601311 0 -1 157 4 
876892918 0 -181 10 2 878962955 1 -90 100 5 891383241 0 -11 9 5 891902970 1 -43 49 4 883956387 1 -79 6 4 891271901 1 -37 24 4 880915674 0 -49 143 3 888067726 1 -38 94 5 892432030 1 -92 98 5 875652934 0 -76 64 5 875498777 0 -193 33 3 889125912 1 -178 183 4 882826347 1 -122 191 5 879270128 0 -121 126 3 891388936 1 -89 93 2 879441307 1 -125 116 4 892838322 1 -45 15 4 881012184 1 -56 56 5 892676376 0 -41 69 4 890687145 0 -172 183 5 875538864 0 -80 194 3 887401763 0 -13 124 5 884538663 1 -99 100 5 885678813 0 -89 121 5 879441657 1 -6 197 5 883601203 1 -128 151 3 879968921 0 -7 177 4 891352904 0 -87 39 3 879875995 0 -85 108 2 880838201 0 -26 117 3 891351590 1 -119 109 5 874775580 1 -168 117 5 884287318 1 -1 150 5 876892196 1 -65 173 3 879217851 0 -193 111 1 889126375 1 -94 38 2 891722482 0 -74 150 3 888333458 1 -178 195 4 882826944 0 -90 190 5 891383687 1 -56 189 4 892683248 0 -196 111 4 881251793 1 -178 8 4 882826556 0 -158 149 3 880132383 1 -94 1 4 885870323 1 -11 185 4 891905783 0 -169 133 4 891359171 1 -25 189 5 885852488 0 -95 111 4 879194012 1 -158 62 5 880134759 1 -24 178 5 875323676 0 -73 100 4 888626120 1 -74 137 3 888333458 0 -125 73 5 892838288 0 -60 98 4 883326463 1 -84 7 4 883452155 0 -165 69 3 879525799 1 -114 182 3 881259994 1 -91 181 5 891439243 0 -1 183 5 875072262 1 -136 19 4 882693529 1 -138 150 3 879023131 1 -128 48 4 879967767 1 -85 45 3 879455197 0 -14 172 5 890881521 0 -13 153 4 882139901 0 -109 91 4 880582384 1 -49 116 4 888066109 0 -152 191 5 880149963 1 -186 44 5 879023529 0 -119 147 4 886176486 1 -176 13 4 886047994 1 -121 98 5 891388210 0 -128 65 4 879968512 1 -41 100 4 890687242 0 -145 5 3 875272196 0 -167 136 4 892738418 0 -6 195 4 883602283 1 -151 83 5 879524611 1 -108 21 3 879880141 0 -8 144 5 879362286 1 -5 100 5 875635349 1 -13 154 5 882141335 1 -119 174 4 874781303 0 -135 185 4 879857797 1 -38 1 5 892430636 0 -157 137 5 886889876 0 -10 99 5 877889130 0 -44 148 4 878346946 1 -159 103 1 880557604 0 -11 100 4 891902718 0 -5 143 3 875636815 
0 -10 194 4 877886661 1 -167 133 5 892738453 0 -50 9 4 877052297 0 -131 19 4 883681418 1 -180 156 5 877127747 1 -60 163 4 883327566 0 -193 2 3 890860198 1 -174 28 5 886434547 1 -38 145 1 892433062 0 -118 184 5 875385057 1 -195 67 2 874825826 0 -122 175 5 879270084 1 -1 128 4 875072573 0 -188 79 5 875072393 1 -186 117 5 879023607 1 -87 7 4 879875735 0 -128 1 4 879966919 1 -64 151 3 879366214 1 -194 161 4 879523576 0 -96 1 5 884403574 1 -122 187 4 879270424 1 -151 172 5 879524325 1 -158 50 4 880133306 0 -51 64 4 883498936 0 -7 183 4 891351624 0 -178 117 4 882824467 1 -94 68 4 891722432 1 -59 131 4 888205410 0 -197 89 5 891409798 1 -198 193 4 884207833 1 -60 82 3 883327493 0 -178 98 5 882826944 1 -183 88 3 891466760 0 -199 111 3 883783042 1 -7 101 5 891350966 1 -125 136 5 879454309 1 -60 61 4 883326652 0 -160 32 5 876859413 0 -5 176 3 875635962 1 -7 136 5 891351813 1 -102 47 2 888803636 0 -64 161 3 889739779 0 -160 109 2 876857844 1 -16 160 4 877722001 0 -76 197 5 875028563 1 -52 15 5 882922204 1 -128 58 3 879968008 0 -92 159 4 875810543 0 -178 25 3 888514710 0 -13 100 5 882140166 1 -102 98 4 888802939 1 -6 193 3 883601529 0 -163 98 4 891220196 0 -167 169 1 892738419 0 -121 137 5 891388501 1 -13 71 4 882398654 1 -59 45 5 888204465 1 -182 121 3 885613117 1 -64 64 4 889737454 0 -151 49 3 879543055 1 -83 122 1 886534501 1 -139 127 5 879538578 0 -110 77 4 886988202 0 -130 94 5 875802058 1 -200 196 4 884126833 0 -16 99 5 877720733 1 -75 100 5 884049875 0 -95 151 4 879193353 1 -182 100 3 885613067 1 -150 93 4 878746889 0 -164 118 5 889401852 0 -169 127 4 891359354 1 -196 25 4 881251955 1 -151 200 3 879525002 0 -60 88 4 883327684 1 -60 143 3 883327441 0 -191 86 5 891562417 0 -99 69 4 885679833 1 -125 198 3 879454385 1 -75 125 3 884050164 0 -95 64 5 879197685 0 -1 148 2 875240799 0 -141 151 2 884585039 0 -145 7 5 875270429 1 -5 69 1 875721555 1 -130 66 5 875802173 1 -43 63 3 883956353 1 -70 128 4 884067339 0 -119 24 4 886177076 0 -50 125 2 877052502 0 -157 1 5 874813703 1 -1 
112 1 878542441 0 -144 96 5 888105691 0 -165 181 5 879525738 0 -109 94 4 880579787 1 -37 161 5 880915902 1 -187 86 4 879465478 1 -145 39 4 875271838 0 -70 48 4 884064574 0 -92 161 2 875654125 0 -21 118 1 874951382 1 -7 181 3 891351287 0 -94 100 5 885872942 1 -7 7 5 891352220 1 -194 175 3 879521595 0 -187 175 2 879465241 0 -43 17 3 883956417 1 -60 21 3 883327923 0 -94 82 4 891721777 1 -30 28 4 885941321 1 -160 118 3 876768828 0 -18 188 3 880129388 0 -43 98 5 875981220 1 -151 79 4 879524642 0 -85 89 4 879454075 0 -1 193 4 876892654 1 -128 118 5 879968896 0 -15 9 4 879455635 0 -135 183 4 879857723 0 -90 79 4 891383912 1 -25 50 5 885852150 0 -87 87 4 879877931 0 -195 46 3 891762441 0 -151 183 3 879524642 1 -42 183 4 881107821 0 -175 183 4 877107942 1 -18 47 3 880131262 1 -50 123 4 877052958 1 -79 7 5 891272016 0 -184 69 3 889908694 0 -188 56 4 875071658 0 -83 63 4 880327970 1 -73 180 4 888626577 0 -101 121 4 877137015 1 -180 28 3 877355568 1 -199 117 3 883782879 1 -45 100 5 881010742 1 -117 109 4 880126336 0 -60 132 4 883325944 0 -197 62 2 891410039 1 -144 193 4 888105287 1 -115 32 5 881171348 0 -130 39 4 875801496 0 -84 148 4 883452274 0 -87 25 4 879876811 1 -178 187 4 882826049 0 -90 14 5 891383987 0 -87 64 5 879875649 1 -156 124 3 888185677 1 -22 110 1 878887157 0 -152 67 5 882477689 1 -18 193 5 880131358 1 -189 15 2 893264335 0 -144 181 4 888104032 1 -125 63 3 892838558 1 -7 154 5 891353124 0 -186 31 4 879023529 0 -64 9 4 889738085 0 -94 170 5 891725362 0 -72 127 5 880037702 0 -72 177 4 880037204 1 -181 25 5 878962675 1 -124 96 4 890399864 1 -8 56 5 879362183 0 -194 44 4 879524007 0 -87 63 4 879876848 1 -64 17 3 889739733 0 -174 21 1 886515209 0 -14 9 4 879119260 0 -92 96 4 875656025 1 -167 126 3 892738141 0 -69 150 5 882072920 0 -119 199 5 874781994 0 -18 169 5 880130252 1 -148 116 5 877398648 1 -101 109 2 877136360 0 -7 166 3 891351585 0 -44 5 4 878347598 0 -73 89 5 888625685 1 -185 28 5 883524428 1 -198 175 3 884207239 0 -38 118 5 892431151 0 -25 8 4 885852150 0 
-18 170 5 880130515 1 -72 121 3 880036048 0 -37 22 5 880915810 0 -69 100 5 882072892 1 -117 98 4 881012430 1 -25 169 5 885852301 1 -7 185 5 892135346 1 -92 102 2 875813376 0 -128 14 5 879967341 0 -67 7 5 875379794 1 -87 97 5 879877825 1 -58 64 5 884305295 0 -46 151 4 883616218 1 -27 121 4 891543191 1 -12 28 5 879958969 0 -60 180 4 883326028 0 -7 191 5 891351201 0 -57 151 3 883697585 1 -167 73 2 892738452 1 -156 180 5 888185777 0 -72 100 5 880035680 1 -56 195 5 892676429 0 -117 143 1 881012472 0 -46 181 4 883616254 1 -164 181 5 889401906 0 -95 90 2 880572166 0 -197 127 5 891409839 0 -29 98 4 882821942 1 -7 139 3 891354729 1 -92 46 4 875653867 0 -101 24 4 877136391 0 -77 52 5 884753203 0 -200 2 4 884130046 0 -77 144 3 884752853 0 -48 170 4 879434886 1 -136 42 3 882848866 1 -10 160 4 877888944 1 -25 13 4 885852381 0 -42 79 5 881108040 1 -94 96 3 885872942 1 -109 68 3 880582469 0 -144 32 4 888105287 1 -109 196 4 880578358 0 -152 51 4 882476486 1 -92 109 3 886443351 1 -25 197 3 885852059 1 -102 167 2 892993927 0 -110 28 4 886987979 1 -64 71 3 879365670 1 -91 64 4 891439243 0 -163 97 4 891220019 0 -184 22 3 889908985 0 -109 183 5 880572528 1 -160 123 4 876768949 1 -95 142 4 880572249 0 -63 106 2 875748139 0 -6 81 4 883602283 0 -95 185 3 879197886 1 -62 176 5 879373768 1 -128 136 5 879967080 1 -141 117 4 884584929 0 -184 91 3 889909988 0 -144 93 1 888104032 0 -77 89 5 884733839 1 -10 176 4 877889130 0 -119 105 2 874775849 0 -144 191 4 888105081 1 -48 195 5 879434954 1 -70 89 4 884150202 1 -64 156 4 889737506 0 -102 50 4 888801315 1 -70 169 4 884149688 1 -59 118 5 888203234 1 -1 200 3 876893098 1 -174 14 5 886433771 1 -66 15 3 883601456 0 -175 9 4 877108146 0 -62 180 4 879373984 0 -151 160 4 879542670 1 -1 180 3 875072573 0 -151 64 5 879524536 0 -194 98 4 879521329 1 -125 120 1 892839312 1 -56 38 2 892683533 1 -178 134 3 882826983 0 -102 184 2 888801465 1 -23 13 4 874784497 0 -43 91 3 883956260 0 -41 174 4 890687264 1 -43 153 5 883955135 1 -48 132 5 879434886 0 -184 137 5 
889907685 1 -38 82 5 892429903 0 -194 12 5 879520916 0 -109 172 5 880572528 1 -177 100 5 880130600 0 -59 95 2 888204758 1 -92 94 3 875812876 0 -83 106 4 887665549 0 -125 194 5 879454986 0 -194 195 3 879521657 0 -106 22 4 881449830 1 -115 82 4 881172117 1 -160 161 3 876861185 1 -8 7 3 879362287 0 -91 161 3 891439353 1 -70 121 3 884148728 0 -138 116 2 879022956 1 -94 102 3 891721462 1 -103 50 5 880416864 0 -144 19 4 888103929 0 -43 95 4 875975687 0 -18 64 5 880132501 1 -99 12 5 885680458 0 -18 99 5 880130829 0 -16 51 4 877726390 1 -17 125 1 885272538 0 -151 87 4 879524420 1 -5 79 3 875635895 0 -145 3 3 875271562 1 -115 89 5 881172049 0 -117 56 5 881011807 1 -125 1 4 879454699 0 -37 195 5 880915874 0 -187 196 4 879465507 0 -85 94 3 882995966 1 -94 88 3 891721942 1 -130 33 5 876252087 1 -48 172 5 879434791 0 -23 71 3 874789299 0 -148 163 4 877021402 0 -20 95 3 879669181 1 -81 124 3 876534594 0 -85 157 3 879454400 1 -95 161 3 879196298 1 -65 48 5 879217689 0 -174 197 5 886434547 1 -23 191 3 877817113 0 -83 1 4 880306903 1 -1 85 3 875073180 0 -90 17 4 891384721 1 -59 140 1 888206445 1 -145 38 3 888398747 0 -87 183 4 879875734 1 -92 173 3 875656535 1 -58 61 5 884305271 1 -43 175 2 875981304 1 -13 196 4 882140552 1 -87 73 3 879877083 0 -194 198 3 879522021 1 -152 151 4 880148735 0 -102 164 3 888803002 1 -1 91 5 876892636 1 -198 197 4 884208200 1 -22 118 4 878887983 0 -49 111 2 888068686 0 -72 96 5 880037203 1 -92 53 3 875656392 0 -148 7 5 877017054 0 -49 95 2 888067031 1 -70 197 4 884149469 1 -160 24 5 876769689 0 -95 3 1 879193881 1 -83 117 5 880307000 0 -18 19 3 880130582 1 -97 79 5 884238817 0 -49 123 1 888068195 0 -119 182 4 874781303 1 -91 174 5 891439090 1 -158 82 5 880134398 1 -181 103 1 878962586 1 -60 197 4 883326620 1 -16 161 5 877726390 0 -70 139 3 884150656 0 -130 176 5 881536127 0 -15 7 1 879455506 0 -130 28 4 875217172 1 -92 135 4 875652981 1 -92 67 3 875907436 0 -200 183 5 884128554 0 -200 8 4 884128904 1 -85 160 3 879454075 0 -38 79 3 892430309 0 -130 174 5 
875216249 0 -37 11 4 880915838 0 -87 33 3 879876488 1 -185 86 5 883524428 1 -6 59 5 883601713 1 -90 149 3 891384754 0 -197 190 3 891410082 1 -183 159 4 892323452 0 -102 101 4 883748488 0 -7 79 4 891352261 1 -83 181 4 880306786 1 -130 99 5 875216786 1 -117 195 5 881012255 1 -119 83 4 886176922 0 -28 145 3 881961904 0 -99 3 3 885679237 0 -106 88 3 881453097 1 -178 181 5 882823832 0 -16 76 5 877719863 1 -57 100 5 883698581 0 -1 10 3 875693118 0 -67 122 3 875379566 1 -178 55 4 882826394 1 -151 121 5 879525054 1 -121 57 5 891390014 0 -174 124 5 886514168 1 -198 95 3 884207612 1 -184 64 4 889909045 1 -6 124 5 883599228 0 -7 131 5 891352383 0 -85 70 4 879828328 1 -80 199 2 887401353 0 -95 48 4 879197500 1 -44 118 3 878341197 1 -1 129 5 887431908 0 -18 131 4 880131004 1 -16 182 5 877719863 1 -44 91 2 878348573 1 -115 12 5 881171982 1 -7 121 5 891352904 1 -135 79 3 879857843 0 -200 112 3 884127370 1 -101 50 4 877135944 0 -121 192 4 891388250 0 -178 96 4 882826782 1 -184 116 4 889910481 1 -66 21 1 883601939 0 -137 15 4 881432965 0 -92 184 3 877383934 0 -153 56 5 881371140 0 -10 168 4 877888812 0 -70 189 4 884150202 0 -116 65 2 876454052 0 -136 100 5 882693338 0 -5 144 3 875636141 0 -16 31 5 877717956 0 -194 188 4 879522158 1 -44 191 4 878347234 0 -198 176 4 884207136 0 -49 172 1 888067691 1 -94 76 4 891720827 1 -83 110 4 880309185 0 -6 56 4 883601277 1 -23 98 5 874786016 1 -193 29 3 889126055 1 -125 174 5 879454309 1 -158 137 5 880132443 1 -137 51 1 881433605 0 -95 101 1 879198800 1 -56 70 4 892676996 0 -1 130 3 875072002 1 -152 80 5 882477572 1 -41 153 4 890687087 1 -12 200 1 879959610 1 -130 128 4 876251728 0 -49 11 3 888069458 0 -76 121 2 882607017 0 -130 184 4 875801695 1 -5 185 3 875720692 0 -43 191 5 875981247 1 -99 107 3 885679138 1 -200 148 4 876042340 0 -62 125 4 879372347 0 -144 105 2 888104767 1 -82 140 3 878769668 0 -16 156 4 877719863 1 -72 161 5 880037703 0 -94 70 4 891722511 1 -92 148 2 877383934 0 -125 98 5 879454345 1 -130 195 5 875801470 0 -7 126 3 
891353254 1 -75 190 5 884051948 1 -102 99 2 883748488 0 -92 43 3 875813314 1 -178 28 5 882826806 1 -75 151 5 884050502 0 -81 151 2 876533946 1 -49 175 5 888068715 0 -59 186 5 888205660 1 -76 23 5 875027355 0 -49 185 5 888067307 1 -44 164 4 878348035 0 -18 1 5 880130802 1 -128 86 5 879966919 0 -24 56 4 875323240 1 -72 172 1 880037119 1 -77 100 3 884732716 1 -14 15 4 879119390 0 -189 79 3 893265478 1 -23 143 3 874786066 1 -49 55 4 888068057 1 -99 66 3 886519047 0 -18 97 4 880131525 1 -144 180 4 888105873 0 -14 42 4 879119579 1 -102 163 2 892993190 0 -198 79 3 884208518 0 -130 69 5 875216718 0 -118 22 5 875385136 0 -48 28 2 879434653 1 -14 176 1 890881484 1 -186 100 4 879023115 1 -23 133 4 874786220 1 -60 13 4 883327539 0 -82 185 3 878769334 1 -64 1 4 879366214 1 -102 94 2 892993545 1 -115 187 5 881171203 1 -11 194 4 891904920 1 -59 172 5 888204552 0 -60 200 4 883326710 0 -85 127 5 879829301 0 -196 94 3 881252172 0 -144 65 4 888106182 0 -184 58 4 889908984 1 -189 31 3 893266027 0 -142 55 2 888640489 0 -5 89 5 875636033 1 -70 185 4 884149753 1 -13 173 2 882139863 1 -151 164 5 879542984 0 -117 117 5 880126461 1 -145 69 5 882181632 0 -8 183 5 879362233 0 -71 151 1 877319446 1 -145 79 5 875271838 1 -198 82 3 884209451 0 -119 117 5 874775535 0 -181 150 1 878962465 1 -130 147 4 876250746 0 -109 158 1 880579916 0 -42 196 5 881107718 1 -97 174 4 884238817 0 -6 187 4 883600914 1 -1 103 1 878542845 0 -85 154 4 879828777 1 -101 122 1 877136928 0 -194 83 3 879521254 0 -90 191 5 891384424 0 -125 87 5 892836464 1 -188 127 4 875072799 1 -16 28 5 877727122 1 -94 12 4 886008625 1 -87 68 3 879876074 1 -174 40 4 886514985 1 -69 129 3 882072778 1 -67 123 4 875379322 1 -178 15 5 882823858 0 -59 71 3 888205574 1 -92 124 4 886440530 1 -144 197 4 888106106 0 -79 13 3 891271676 0 -44 96 4 878347633 0 -150 147 4 878746442 0 -168 100 4 884287362 1 -1 118 3 875071927 0 -197 161 4 891410039 0 -177 22 4 880130847 0 -102 144 3 888801360 0 -158 127 5 880132356 0 -60 138 2 883327287 0 -187 191 5 
879465566 0 -189 135 4 893265535 0 -145 100 5 875270458 0 -82 70 4 878769888 1 -194 144 4 879547671 1 -197 79 5 891409839 1 -58 69 1 884663351 1 -64 69 4 889739091 0 -90 182 3 891383599 1 -42 172 5 881107220 0 -83 105 2 891182288 1 -137 117 5 881433015 0 -45 1 5 881013176 1 -110 195 2 886988480 0 -49 108 2 888068957 1 -194 25 2 879540807 1 -174 162 5 886514108 0 -87 186 5 879876734 0 -45 21 3 881014193 1 -18 126 5 880130680 0 -21 100 5 874951292 1 -92 164 4 875656201 0 -94 61 5 891720761 0 -184 72 3 889909988 0 -90 150 3 891385250 0 -194 7 3 879538898 0 -1 54 3 878543308 0 -27 100 5 891543129 0 -90 131 5 891384066 1 -1 24 3 875071713 1 -172 178 3 875538027 1 -198 196 3 884208098 1 -64 72 4 889740056 0 -11 109 3 891903836 1 -56 122 2 892911494 0 -144 176 4 888105338 0 -132 124 4 891278996 0 -42 194 5 881107329 0 -24 100 5 875323637 0 -193 127 5 890860351 0 -62 181 4 879372418 1 -7 190 5 891351728 1 -16 174 5 877719504 0 -5 80 2 875636511 0 -64 95 4 889737691 0 -72 180 4 880036579 1 -145 42 5 882181785 1 -92 101 2 875656624 1 -145 51 3 875272786 1 -168 15 5 884287362 0 -94 193 5 891720498 1 -156 197 5 888185777 1 -177 172 5 880130990 0 -62 20 4 879372696 0 -10 195 4 877889130 1 -130 168 3 875216786 0 -87 192 3 879877741 1 -46 7 4 883616155 0 -43 181 4 875975211 0 -59 82 5 888205660 0 -18 162 4 880131326 1 -193 155 4 889126376 1 -59 18 4 888203313 0 -92 66 3 875812279 1 -128 50 4 879967268 1 -110 68 2 886988631 1 -64 58 3 889739625 1 -1 86 5 878543541 0 -49 39 2 888068194 1 -102 181 2 888801406 0 -130 173 3 875216593 1 -198 182 4 884207946 1 -60 161 4 883327265 0 -200 50 5 884128400 1 -115 93 3 881170332 0 -158 183 3 880134332 1 -58 50 4 884304328 1 -70 109 3 884066514 1 -184 174 3 889908693 1 -18 70 4 880129668 0 -7 161 3 891352489 1 -14 116 5 876965165 1 -92 93 4 886444049 1 -83 94 4 880308831 0 -54 50 5 880931687 1 -10 13 3 877892050 0 -157 93 3 886890692 1 -177 198 4 880131161 1 -49 70 2 888066614 0 -1 196 5 874965677 0 -197 174 5 891409798 0 -92 89 5 875652981 1 
-59 109 4 888203175 0 -95 7 5 879197329 1 -38 140 5 892430309 1 -16 134 4 877719158 0 -56 168 2 892679209 1 -98 116 5 880499053 1 -43 11 5 875981365 0 -95 69 5 879198210 1 -56 44 4 892679356 0 -18 13 5 880131497 1 -7 72 5 891353977 1 -64 96 4 889737748 0 -23 70 2 874786513 0 -20 121 3 879668227 1 -200 147 5 876042451 1 -1 39 4 875072173 0 -184 11 3 889908694 1 -76 200 5 882606216 1 -106 48 3 881453290 0 -10 183 5 877893020 0 -59 98 5 888204349 1 -59 200 5 888205370 0 -57 199 5 883698646 0 -104 150 5 888465225 0 -106 194 5 881450758 1 -59 39 4 888205033 0 -44 193 3 878348521 0 -108 10 5 879879834 0 -64 12 5 889738085 0 -135 12 4 879857764 1 -156 22 3 888186093 1 -1 164 3 876893171 1 -141 120 4 884585547 0 -87 8 5 879876447 0 -101 123 2 877136186 1 -194 99 3 879524643 0 -28 89 4 881961104 1 -177 168 4 880130807 1 -92 144 4 875810741 0 -58 150 4 884304570 0 -73 81 5 888626415 0 -194 127 5 879520813 0 -41 1 4 890692860 0 -91 134 4 891439353 1 -138 185 4 879023853 0 -104 147 3 888466002 0 -125 69 4 879454628 0 -189 134 5 893265239 0 -58 198 3 884305123 1 -79 150 3 891271652 1 -109 157 4 880577961 1 -181 9 4 878962675 0 -96 50 5 884402977 0 -16 9 5 877722736 0 -94 175 4 885870613 0 -194 94 3 879528000 0 -4 50 5 892003526 1 -8 127 5 879362123 0 -198 65 2 884208241 0 -130 111 5 874953825 1 -8 188 5 879362356 1 -58 123 4 884650140 1 -72 87 4 880036638 1 -189 194 5 893265428 0 -159 117 5 880486047 1 -11 22 4 891904241 1 -95 178 5 879197652 0 -200 123 4 884127568 0 -154 89 5 879138910 1 -95 181 4 879193353 1 -89 14 4 879441357 1 -10 132 5 877893020 0 -74 129 3 888333458 1 -64 199 4 889737654 0 -115 181 4 881172049 1 -189 174 5 893265160 1 -1 36 2 875073180 1 -23 189 5 874785985 1 -92 154 4 875657681 1 -152 22 5 882828490 0 -13 185 3 881515011 0 -128 98 4 879967047 0 -118 164 5 875385386 1 -18 135 3 880130065 0 -184 57 5 889908539 1 -14 23 5 890881216 0 -118 32 5 875384979 0 -189 9 3 893263994 1 -1 23 4 875072895 0 -188 66 3 875075118 1 -186 118 2 879023242 0 -92 62 3 
875660468 1 -14 168 4 879119497 0 -128 99 4 879967840 0 -158 116 5 880132383 0 -94 135 4 885870231 1 -52 93 4 882922357 1 -84 194 5 883453617 1 -85 192 4 879454951 0 -71 65 5 885016961 1 -103 96 4 880422009 0 -188 161 3 875073048 1 -174 67 1 886515130 0 -180 173 5 877128388 1 -13 24 1 882397741 0 -90 148 2 891385787 1 -10 186 4 877886722 1 -189 16 3 893264335 0 -125 83 4 879454345 1 -154 143 3 879139003 1 -15 1 1 879455635 0 -71 50 3 885016784 1 -10 199 4 877892050 0 -59 50 5 888205087 1 -159 121 3 880486071 1 -109 121 5 880571741 1 -118 193 5 875384793 0 -60 64 4 883325994 0 -22 172 4 878887680 1 -11 175 3 891904551 1 -56 90 2 892677147 1 -71 135 4 885016536 0 -174 13 3 891551777 1 -200 135 4 884128400 0 -109 7 4 880563080 0 -1 73 3 876892774 0 -151 153 3 879524326 1 -118 17 3 875385257 0 -42 63 4 881108873 1 -148 78 1 877399018 1 -193 100 5 889124127 0 -176 50 5 886047879 1 -185 15 3 883525255 1 -63 116 5 875747319 0 -59 142 1 888206561 1 -96 23 5 884403123 0 -181 146 1 878962955 0 -82 151 2 876311547 1 -62 164 5 879374946 0 -58 195 4 884305123 1 -194 193 4 879524790 0 -1 67 3 876893054 1 -194 71 4 879524291 1 -160 137 4 876767299 0 -54 118 4 880937813 1 -8 176 5 879362233 1 -56 25 4 892911166 1 -188 181 3 875072148 0 -72 135 4 880037054 1 -38 28 4 892429399 0 -164 121 5 889402203 0 -196 8 5 881251753 0 -14 50 5 890881557 0 -13 27 3 882397833 1 -94 52 5 891721026 0 -158 172 4 880134398 0 -23 1 5 874784615 0 -38 22 5 892429347 1 -31 124 4 881548110 1 -102 5 3 888803002 0 -70 96 4 884066910 0 -119 100 5 874774575 1 -37 176 4 880915942 1 -160 23 5 876859778 1 -24 109 3 875322848 0 -188 185 4 875071710 1 -1 65 4 875072125 0 -200 88 4 884128760 0 -72 117 4 880035588 1 -144 190 5 888105714 1 -18 151 3 880131804 1 -12 50 4 879959044 1 -44 21 2 878346789 1 -130 122 3 876251090 0 -1 190 5 875072125 1 -141 1 3 884584753 1 -60 56 4 883326919 1 -6 189 3 883601365 1 -74 121 4 888333428 1 -25 114 5 885852218 0 -178 71 4 882826577 0 -48 181 5 879434954 1 -22 153 5 878886423 0 
-76 98 5 875028391 0 -10 56 5 877886598 1 -64 175 5 889739415 0 -184 67 3 889912569 0 -125 94 5 892839065 0 -2 19 3 888550871 1 -97 192 1 884238778 0 -69 147 3 882072920 0 -188 164 4 875072674 0 -87 161 5 879875893 1 -110 11 4 886987922 1 -90 180 4 891384065 0 -178 16 4 882823905 1 -18 152 3 880130515 1 -151 51 4 879543055 0 -144 165 4 888105993 0 -56 169 4 892683248 0 -160 7 3 876767822 0 -64 62 2 889740654 1 -189 176 4 893265214 0 -106 196 5 881450578 0 -26 150 3 891350750 0 -90 83 5 891383687 1 -26 127 5 891386368 0 -94 55 4 885873653 0 -181 13 2 878962465 0 -42 118 4 881105505 1 -102 96 3 888801316 0 -22 154 4 878886423 1 -11 40 3 891905279 1 -62 3 3 879372325 1 -81 98 5 876534854 0 -20 144 2 879669401 1 -64 70 5 889739158 1 -123 132 3 879872672 1 -1 100 5 878543541 0 -115 9 5 881171982 1 -43 173 5 875981190 0 -92 22 3 875653121 0 -158 117 3 880132719 1 -42 72 3 881108229 0 -198 33 3 884209291 1 -157 147 5 886890342 0 -178 196 4 882827834 1 -130 143 5 876251922 1 -132 154 4 891278996 1 -70 191 3 884149340 0 -151 163 4 879542723 1 -200 56 4 884128858 0 -94 17 2 891721494 0 -42 95 5 881107220 1 -193 56 1 889125572 1 -38 133 2 892429873 0 -95 79 4 879196231 0 -21 148 1 874951482 0 -72 51 4 880036946 0 -22 194 5 878886607 0 -6 87 4 883602174 1 -103 69 3 880420585 1 -145 195 5 882181728 0 -31 79 2 881548082 0 -114 100 5 881259927 0 -193 147 2 890860290 1 -10 127 5 877886661 1 -198 154 4 884208098 1 -183 54 2 891467546 0 -161 187 3 891170998 1 -22 195 4 878887810 1 -59 101 5 888206605 0 -156 11 2 888185906 0 -65 7 1 879217290 1 -59 33 3 888205265 0 -119 40 4 886176993 0 -109 162 2 880578358 0 -82 8 4 878769292 1 -10 133 5 877891904 1 -108 14 5 879879720 1 -130 44 4 875801662 0 -63 126 3 875747556 0 -95 43 2 880572356 0 -24 9 5 875323745 1 -161 191 2 891171734 1 -165 91 4 879525756 0 -115 50 5 881172049 0 -158 186 3 880134913 0 -56 7 5 892679439 1 -117 25 4 881009470 0 -184 9 5 889907685 0 -174 56 5 886452583 0 -102 79 2 888801316 1 -10 98 4 877889261 0 -200 125 5 
876041895 1 -11 94 3 891905324 1 -64 154 4 889737943 0 -60 77 4 883327040 0 -109 58 4 880572950 1 -92 28 3 875653050 0 -1 154 5 878543541 0 -184 143 3 889908903 0 -74 124 3 888333542 1 -90 143 5 891383204 1 -95 191 5 879198161 1 -114 96 3 881259955 0 -116 137 2 876454308 1 -28 70 4 881961311 1 -114 186 3 881260352 1 -85 163 3 882813312 1 -158 184 3 880134407 0 -59 183 5 888204802 1 -115 178 5 881172246 1 -97 32 5 884239791 0 -198 183 5 884207654 1 -141 106 5 884585195 0 -194 192 5 879521253 0 -38 88 5 892430695 0 -122 46 5 879270567 0 -10 1 4 877888877 0 -87 118 4 879876162 0 -108 137 5 879879941 1 -7 176 3 891350782 0 -62 168 5 879373711 1 -82 199 4 878769888 1 -158 148 4 880132613 1 -134 15 5 891732726 1 -118 134 5 875384916 1 -151 189 5 879528495 1 -189 127 4 893263994 0 -174 138 1 891551778 1 -42 77 5 881108684 1 -130 41 3 875801662 0 -83 35 1 886534501 1 -20 98 3 879669547 1 -41 181 4 890687175 0 -1 161 4 875072303 1 -56 164 4 892910604 0 -45 108 4 881014620 0 -70 69 4 884065733 0 -22 168 5 878886517 1 -144 160 2 888106181 0 -16 195 5 877720298 1 -161 135 2 891170656 0 -56 77 3 892679333 1 -1 62 3 878542282 0 -198 174 5 884208326 1 -156 48 4 888185777 1 -44 147 4 878341343 0 -26 13 3 891373086 0 -195 55 4 888737417 0 -49 100 4 888067307 0 -125 88 5 879455184 0 -90 45 3 891385039 1 -195 132 5 875771441 1 -175 132 3 877107712 1 -43 56 5 875975687 1 -120 148 3 889490499 1 -174 122 1 886434421 1 -13 109 4 882141306 0 -58 13 3 884304503 0 -30 7 4 875140648 0 -64 4 3 889739138 0 -158 154 4 880135069 1 -200 140 4 884129962 0 -160 1 4 876768025 0 -64 52 3 889739625 1 -94 161 3 891721439 1 -43 77 3 883955650 1 -160 50 4 876767572 0 -48 71 3 879434850 0 -87 120 2 879877173 0 -11 51 4 891906439 0 -181 147 1 878963168 1 -87 4 5 879876524 0 -90 33 4 891383600 0 -130 68 5 875216283 1 -71 154 3 877319610 0 -68 125 1 876974096 0 -115 77 2 881171623 0 -194 180 3 879521657 0 -72 38 3 880037307 1 -194 64 5 879521936 0 -58 89 3 884305220 0 -43 155 4 883956518 1 -115 22 3 
881171273 0 -11 191 4 891904270 0 -193 194 4 889125006 1 -81 147 4 876533389 1 -94 92 4 891721142 0 -85 95 4 879455114 1 -23 50 4 874784440 1 -58 120 2 892242765 0 -60 199 5 883326339 0 -62 14 4 879372851 1 -91 97 5 891438947 1 -93 125 1 888705416 0 -62 162 4 879375843 1 -6 100 5 883599176 1 -96 96 4 884403531 1 -125 50 5 892836362 1 -24 117 4 875246216 0 -154 135 5 879139003 1 -64 125 2 889739678 1 -184 164 3 889911434 1 -114 179 5 881260611 0 -73 173 5 888625292 1 -123 143 5 879872406 0 -98 173 1 880498935 1 -62 55 5 879373692 1 -96 79 4 884403500 0 -10 144 4 877892110 1 -194 95 3 879521719 0 -96 198 5 884403465 0 -58 194 3 884304747 1 -182 123 4 885612994 1 -128 54 2 879968415 1 -94 23 5 885870284 1 -70 193 4 884149646 0 -144 195 5 888105081 1 -13 11 1 882397146 1 -76 89 4 875027507 0 -1 188 3 875073128 0 -70 186 4 884065703 1 -92 2 3 875653699 1 -43 71 4 883955675 1 -49 179 5 888066446 1 -44 176 5 883613372 0 -58 32 5 884304812 0 -1 102 2 889751736 1 -1 69 3 875072262 0 -89 150 5 879441452 1 -94 8 5 885873653 0 -158 124 4 880134261 1 -82 174 5 878769478 1 -64 157 4 879365491 0 -62 47 4 879375537 0 -90 155 5 891385040 1 -177 59 4 880130825 0 -121 181 5 891390014 1 -152 157 5 882476486 1 -96 176 4 884403758 0 -14 18 3 879119260 1 -102 102 3 883748488 1 -7 118 2 891353411 0 -92 73 3 875656474 0 -16 7 5 877724066 0 -7 53 5 891354689 0 -11 12 2 891904194 1 -85 179 4 879454272 0 -56 64 5 892678482 1 -194 70 3 879522324 0 -145 122 1 888398307 1 -87 90 2 879877127 0 -75 118 3 884050760 1 -43 51 1 883956562 0 -120 125 4 889490447 0 -186 95 3 879024535 1 -20 87 5 879669746 1 -178 39 2 882827645 0 -59 173 5 888205144 1 -44 161 4 878347634 1 -23 109 3 874784466 0 -1 170 5 876892856 0 -92 82 2 875654846 1 -198 198 4 884207654 1 -72 7 1 880036347 1 -128 196 5 879967550 1 -168 9 1 884287394 1 -59 64 5 888204309 1 -177 23 5 880130758 1 -7 99 5 891352557 0 -189 89 5 893265624 1 -109 67 5 880580719 1 -109 173 5 880572786 1 -90 151 2 891385190 1 -94 7 4 885873089 1 -92 56 5 
875653271 1 -189 198 4 893265657 1 -95 190 4 888954513 0 -117 179 5 881012776 0 -70 175 3 884150422 1 -194 100 4 879539305 0 -1 38 3 878543075 1 -199 1 1 883782854 0 -124 98 4 890287822 1 -96 185 5 884403866 0 -137 121 5 881432881 0 -1 9 5 878543541 1 -144 173 5 888105902 0 -37 68 5 880915902 1 -73 59 5 888625980 1 -73 135 5 888626371 0 -13 89 4 882139717 0 -181 137 2 878962465 1 -82 97 4 878769777 1 -119 52 3 890627339 1 -116 193 4 876453681 0 -62 9 4 879372182 0 -77 133 2 884752997 1 -10 82 4 877886912 1 -12 170 4 879959374 0 -90 52 5 891385522 1 -90 127 4 891383561 0 -17 117 3 885272724 1 -64 168 5 889739243 1 -28 11 4 881956144 1 -174 158 2 886514921 0 -83 64 5 887665422 1 -158 20 4 880134261 1 -81 1 4 876534949 0 -38 112 5 892432751 1 -195 47 5 876632643 0 -200 58 4 884129301 1 -13 23 5 882139937 1 -11 168 3 891904949 0 -37 89 4 880930072 1 -145 12 5 882182917 0 -144 68 2 888105665 1 -197 188 3 891409982 1 -43 88 5 883955702 0 -59 83 4 888204802 1 -17 150 5 885272654 0 -144 24 4 888104541 0 -22 187 5 878887680 0 -94 154 5 886008791 1 -42 1 5 881105633 0 -38 200 5 892432180 1 -38 69 5 892430486 1 -57 111 4 883697679 0 -87 132 5 879877930 0 -151 136 4 879524293 0 -5 99 3 875721216 1 -150 151 4 878746824 1 -189 131 4 893265710 0 -11 70 4 891904573 0 -200 99 5 884128858 1 -145 150 5 875270655 1 -70 181 4 884064416 1 -6 21 3 883600152 0 -18 6 5 880130764 1 -94 11 5 885870231 1 -89 13 2 879441672 1 -176 111 4 886048040 1 -85 190 4 879453845 0 -37 27 4 880915942 0 -117 33 4 881011697 0 -200 188 4 884129160 1 -110 173 1 886988909 1 -159 24 5 880989865 0 -99 28 3 885680578 0 -96 187 5 884402791 1 -26 1 3 891350625 0 -90 162 5 891385190 1 -64 81 4 889739460 0 -121 124 5 891388063 1 -92 167 3 875656557 1 -23 95 4 874786220 1 -194 31 3 879549793 0 -65 65 3 879216672 1 -85 195 3 882995132 0 -177 154 4 880130600 1 -158 173 5 880134913 0 -178 123 4 882824325 1 -137 181 5 881433015 0 -24 127 5 875323879 0 -13 51 3 882399419 1 -131 124 5 883681313 0 -175 100 2 877107712 1 -109 
179 4 880577961 0 -138 13 4 879023345 0 -66 24 3 883601582 1 -194 154 3 879546305 1 -1 22 4 875072404 0 -119 50 5 874774718 0 -5 21 3 875635327 1 -1 21 1 878542772 0 -178 2 4 882827375 0 -83 2 4 881971771 1 -13 4 5 882141306 1 -42 15 4 881105633 1 -168 125 4 884287731 0 -110 96 4 886988449 0 -144 20 4 888104559 0 -193 187 4 890860351 0 -200 1 5 876042340 0 -59 51 5 888206095 0 -198 187 4 884207239 0 -151 98 4 879524088 1 -99 64 5 885680578 0 -178 197 2 882826720 0 -21 123 4 874951382 0 -130 132 5 875802006 1 -27 50 3 891542897 0 -135 173 4 879857723 1 -95 127 4 879195062 0 -85 150 3 890255432 1 -160 169 4 876862077 1 -1 179 3 875072370 0 -56 151 4 892910207 0 -110 69 4 886987860 0 -128 193 3 879967249 0 -198 173 4 884207492 0 -49 91 5 888066979 0 -92 122 3 875907535 0 -37 127 4 880930071 0 -62 188 3 879373638 1 -125 56 1 879454345 1 -13 96 4 882140104 0 -92 153 4 875653605 1 -69 123 4 882126125 1 -186 79 5 879023460 1 -138 187 5 879024043 0 -22 53 3 878888107 1 -118 180 5 875385136 1 -115 7 5 881171982 0 -6 200 3 883602422 1 -101 111 2 877136686 0 -10 162 4 877892210 1 -26 129 4 891350566 0 -25 141 4 885852720 1 -10 161 4 877892050 1 -175 64 5 877107552 1 -189 44 4 893266376 0 -44 143 4 878347392 1 -37 92 4 880930072 1 -92 117 4 875640214 1 -177 161 3 880130915 1 -114 89 5 881260024 0 -81 100 3 876533545 0 -44 1 4 878341315 0 -99 92 4 885680837 1 -59 56 5 888204465 0 -196 70 3 881251842 1 -90 193 4 891383752 0 -18 65 5 880130333 1 -87 38 5 879875940 1 -1 187 4 874965678 0 -2 111 4 888551853 1 -82 111 4 876311423 0 -101 181 4 877137015 1 -18 79 4 880131450 1 -95 98 4 879197385 1 -160 182 5 876770311 1 -128 172 3 879967248 0 -72 147 5 880037702 1 -123 9 5 879873726 0 -70 150 3 884065247 1 -21 17 4 874951695 1 -151 52 5 879524586 0 -178 176 4 882826782 0 -84 98 4 883453755 1 -7 97 5 891351201 1 -23 175 5 874785526 0 -148 69 5 877019101 0 -64 32 1 889739346 1 -151 69 4 879524368 1 -7 135 5 891351547 0 -95 140 3 879199014 1 -97 189 4 884238887 1 -110 55 3 886988449 1 
-22 85 5 878886989 1 -64 143 4 889739051 1 -168 121 4 884287731 0 -115 121 3 881170065 0 -87 167 4 879876703 1 -193 73 3 889127237 1 -1 135 4 875072404 1 -84 15 4 883449993 1 -60 97 3 883326215 1 -59 9 4 888203053 0 -189 196 5 893266204 0 -87 100 5 879876488 1 -41 196 3 890687593 0 -83 66 4 880307898 0 -174 1 3 886433898 0 -24 55 5 875323308 1 -6 165 5 883600747 0 -60 181 4 883326754 0 -49 145 1 888067460 0 -184 117 2 889907995 0 -102 56 3 888801360 0 -89 7 5 879441422 0 -7 192 4 891352010 1 -46 125 4 883616284 1 -128 191 4 879967080 0 -102 182 3 889362833 1 -60 121 4 883327664 0 -95 183 5 879197329 1 -54 7 4 880935294 0 -58 176 4 884304936 1 -186 106 2 879023242 1 -18 60 4 880132055 0 -5 135 4 875637536 0 -184 166 3 889910684 1 -157 50 4 886890541 1 -92 29 3 875656624 1 -95 175 5 879197603 0 -196 66 3 881251911 1 -117 122 2 886022187 1 -125 79 5 879454100 0 -60 144 4 883325944 0 -194 197 4 879522021 1 -194 135 3 879521474 1 -158 120 1 880134014 0 -65 50 5 879217689 1 -185 181 4 883524475 0 -26 151 3 891372429 1 -102 185 3 888802940 0 -184 127 5 889907396 0 -85 10 4 879452898 1 -55 117 3 878176047 1 -158 168 5 880134948 1 -195 127 5 875771441 0 -7 91 3 891353860 1 -54 25 4 880936500 1 -38 84 5 892430937 0 -120 15 4 889490244 1 -95 180 3 880570852 0 -97 1 4 884238911 0 -28 164 4 881960945 0 -1 68 4 875072688 0 -96 174 5 884403020 1 -177 12 5 880130825 0 -95 91 5 880573288 1 -182 191 4 876435434 1 -106 12 4 881451234 0 -55 181 4 878176237 1 -42 173 5 881107220 0 -87 62 5 879875996 1 -115 183 5 881171488 0 -183 77 3 891466405 1 -79 19 5 891271792 0 -11 56 4 891904949 1 -72 134 5 880037793 1 -135 98 5 879857765 0 -44 98 2 878347420 0 -14 12 5 890881216 1 -1 146 4 875071561 1 -115 4 4 881172117 1 -130 54 5 876251895 0 -13 99 4 882398654 1 -58 124 5 884304483 0 -75 123 3 884050164 1 -38 70 5 892432424 1 -42 83 4 881108093 1 -10 50 5 877888545 0 -151 137 5 879528754 0 -58 11 5 884305019 1 -65 185 4 879218449 0 -84 111 4 883453108 1 -1 176 5 876892468 1 -96 42 1 884403214 
1 -89 187 5 879461246 0 -18 4 3 880132150 0 -96 7 5 884403811 0 -141 121 4 884585071 1 -18 45 5 880130739 1 -122 193 4 879270605 1 -194 178 3 879521253 0 -23 14 4 874784440 0 -145 89 4 882181605 1 -195 59 3 888737346 0 -54 24 1 880937311 1 -65 168 4 879217851 0 -151 86 5 879524345 1 -60 195 4 883326086 0 -43 189 5 875981220 1 -1 166 5 874965677 1 -152 120 2 880149686 0 -189 172 5 893265683 0 -43 25 5 875975656 0 -123 197 5 879872066 0 -101 1 3 877136039 1 -1 138 1 878543006 0 -102 175 4 892991117 1 -160 13 4 876768990 1 -98 168 2 880498834 1 -64 97 3 889738085 1 -187 97 3 879465717 0 -119 96 5 874781257 1 -62 56 5 879373711 0 -92 200 3 875811717 0 -181 15 3 878962816 1 -151 118 3 879542588 1 -190 125 3 891033863 1 -60 128 3 883326566 1 -94 190 5 885870231 0 -1 89 5 875072484 0 -110 33 4 886988631 0 -92 198 5 875653016 0 -158 96 4 880134332 1 -132 56 5 891278996 1 -194 90 3 879552841 0 -1 2 3 876893171 1 -175 193 4 877108098 0 -194 194 4 879523575 0 -196 108 4 881252110 1 -160 100 5 876767023 0 -43 82 4 883955498 1 -14 127 2 879644647 0 -162 11 4 877636772 0 -152 71 5 882900320 1 -6 22 3 883602048 1 -44 200 4 878347633 0 -71 64 4 885016536 1 -76 42 3 882606243 1 -13 83 2 886303585 0 -176 151 4 886048305 1 -193 38 3 889126055 1 -77 97 2 884753292 0 -128 132 3 879966785 1 -124 172 3 890287645 0 -90 117 3 891385389 0 -168 126 5 884287962 1 -95 82 3 879196408 0 -37 82 1 880915942 0 -10 157 5 877889004 0 -198 25 2 884205114 0 -90 175 3 891383912 1 -158 118 5 880132638 0 -6 50 4 883600842 0 -192 50 4 881367505 0 -56 183 5 892676314 1 -38 97 5 892430369 0 -94 25 3 891724142 1 -15 14 4 879455659 1 -23 124 5 874784440 1 -59 123 3 888203343 1 -151 152 3 879525075 1 -110 64 4 886987894 1 -104 126 4 888465513 1 -117 172 5 881012623 1 -189 105 2 893264865 0 -6 169 4 883600943 0 -80 100 5 887401453 0 -95 199 5 880570964 0 -56 158 3 892911539 0 -177 121 2 880131123 1 -165 15 5 879525799 1 -104 10 2 888465413 1 -57 125 3 883697223 0 -87 48 4 879875649 0 -144 187 4 888105312 1 -97 
135 5 884238652 1 -110 94 4 886989473 0 -44 135 5 878347259 0 -44 132 4 878347315 0 -59 59 5 888204928 0 -198 168 4 884207654 1 -52 22 5 882922833 1 -64 50 5 889737914 0 -16 143 5 877727192 1 -94 77 3 891721462 1 -92 91 3 875660164 0 -64 162 3 889739262 1 -23 132 4 874785756 1 -18 168 3 880130431 0 -82 168 5 878769748 1 -178 82 5 882826242 0 -200 69 5 884128788 1 -62 70 3 879373960 0 -130 27 4 875802105 0 -7 143 3 892132627 1 -13 200 3 882140552 0 -87 199 5 879875649 1 -18 153 4 880130551 1 -95 31 4 888954513 0 -64 22 4 889737376 0 -200 169 5 884128822 0 -15 13 1 879455940 1 -59 161 3 888205855 1 -59 22 4 888204260 1 -85 57 5 879828107 0 -83 71 3 880328167 1 -16 95 5 877728417 0 -59 99 4 888205033 1 -53 121 4 879443329 1 -184 183 4 889908630 0 -165 176 4 879526007 1 -184 44 4 889909746 1 -95 170 5 880573288 1 -20 181 4 879667904 0 -125 195 5 892836465 0 -144 196 4 888105743 0 -189 99 5 893265684 1 -199 116 5 883782807 1 -60 174 4 883326497 0 -128 121 4 879968278 0 -89 111 4 879441452 1 -180 186 4 877127189 1 -43 111 4 883955745 0 -12 133 4 879959670 1 -114 56 3 881260545 0 -184 176 4 889908740 1 -192 121 2 881368127 0 -85 188 2 879454782 0 -22 167 3 878887023 1 -16 79 5 877727122 0 -60 8 3 883326370 0 -11 57 2 891904552 0 -94 176 4 891720570 0 -198 101 5 884209569 1 -64 11 4 889737376 0 -151 171 5 879524921 0 -188 28 3 875072972 1 -51 83 5 883498937 0 -135 56 4 879857765 0 -77 56 4 884752900 0 -200 177 4 884129656 0 -92 71 5 875654888 1 -92 12 5 875652934 0 -1 30 3 878542515 1 -177 55 3 880131143 1 -123 100 4 879872792 0 -85 170 4 879453748 1 -5 25 3 875635318 1 -85 100 3 879452693 1 -1 63 2 878543196 1 -18 61 4 880130803 1 -151 185 4 879528801 0 -102 168 3 888803537 1 -7 98 4 891351002 0 -5 186 5 875636375 0 -85 28 4 879829301 1 -82 9 4 876311146 0 -141 7 5 884584981 0 -92 92 4 875654846 1 -59 3 4 888203814 1 -49 82 1 888067765 0 -87 22 4 879875817 0 -128 71 4 879967576 1 -110 56 1 886988449 0 -118 7 5 875385198 1 -30 2 3 875061066 0 -16 4 5 877726390 1 -128 197 4 
879966729 1 -174 12 5 886439091 0 -158 89 5 880133189 0 -175 147 3 877108146 1 -7 199 5 892135346 1 -37 174 5 880915810 0 -92 54 3 875656624 1 -94 179 5 885870577 1 -152 69 5 882474000 0 -63 108 2 875748164 0 -113 7 3 875076827 0 -151 70 4 879524947 1 -59 55 5 888204553 0 -66 127 4 883601156 1 -7 23 3 891351383 0 -138 182 4 879023948 0 -58 185 2 884304896 1 -56 200 4 892679088 1 -151 181 5 879524394 0 -42 54 4 881108982 1 -177 50 5 880131216 0 -114 156 4 881309662 1 -90 70 5 891383866 1 -7 175 5 892133057 1 -52 121 4 882922382 1 -177 153 4 880130972 1 -22 105 1 878887347 0 -94 192 4 891721142 1 -44 100 5 878341196 0 -183 55 4 891466266 1 -5 194 4 878845197 1 -18 165 4 880129527 0 -80 154 3 887401307 1 -181 105 1 878963304 0 -95 168 4 879197970 1 -95 28 4 879197603 1 -1 32 5 888732909 1 -94 111 4 891721414 0 -49 159 2 888068245 1 -145 156 5 875271896 1 -90 89 5 891385039 1 -157 100 5 886890650 1 -153 50 1 881371140 0 -96 194 2 884403392 0 -70 24 4 884064743 0 -83 69 4 887665549 1 -83 15 4 880307000 1 -7 187 4 891350757 0 -62 50 5 879372216 0 -53 64 5 879442384 0 -11 79 4 891905783 0 -109 79 5 880572721 0 -177 92 4 882142295 0 -76 7 4 875312133 0 -121 165 4 891388210 1 -193 82 2 889125880 0 -94 187 4 885870362 1 -64 82 3 889740199 1 -38 127 2 892429460 0 -18 91 3 880130393 1 -91 132 3 891439503 1 -178 38 3 882827574 1 -70 8 4 884064986 0 -31 32 5 881548030 0 -182 111 4 885613238 0 -162 144 3 877636746 1 -43 97 5 883955293 0 -5 183 4 875636014 1 -136 137 5 882693339 0 -20 94 2 879669954 1 -1 141 3 878542608 1 -69 42 5 882145548 1 -84 1 2 883452108 1 -178 24 3 882824221 0 -119 56 4 874781198 0 -200 28 5 884128458 1 -5 29 4 875637023 0 -73 32 4 888626220 1 -24 180 5 875322847 1 -109 181 5 880563471 0 -43 196 4 875981190 0 -42 43 2 881109325 1 -97 132 5 884238693 1 -57 11 3 883698454 1 -198 1 4 884205081 1 -90 136 5 891383241 1 -95 70 4 880571951 1 -158 39 5 880134398 1 -85 194 4 879454189 1 -23 100 5 874784557 1 -113 124 3 875076307 1 -118 79 5 875384885 1 -194 121 2 
879539794 0 -167 96 5 892738307 0 -31 175 5 881548053 0 -96 195 5 884403159 0 -57 64 5 883698431 1 -122 180 5 879270327 0 -177 11 4 880131161 1 -148 50 5 877016805 0 -17 137 4 885272606 1 -91 135 4 891439302 1 -94 90 3 891721889 1 -145 23 4 875271896 1 -18 200 3 880131775 0 -59 111 4 888203095 0 -132 175 3 891278807 1 -15 50 5 879455606 1 -118 132 4 875384793 0 -13 155 2 882399615 0 -2 1 4 888550871 0 -63 15 3 875747439 0 -128 133 5 879967248 1 -52 117 4 882922629 1 -193 94 3 889127592 0 -122 69 2 879270511 0 -71 175 4 885016882 0 -109 29 3 880582783 1 -178 95 5 882826514 0 -123 98 4 879872672 1 -62 1 2 879372813 0 -193 72 2 889127301 1 -92 145 2 875654929 1 -117 144 4 881011807 0 -102 91 3 883748488 0 -91 176 5 891439130 1 -44 81 4 878348499 1 -11 69 3 891904270 1 -142 124 4 888640379 1 -95 193 3 879198482 1 -67 25 4 875379420 0 -116 116 3 876453733 1 -26 126 4 891371676 1 -148 89 5 877398587 1 -10 116 4 877888944 1 -43 140 4 883955110 0 -94 66 2 891721889 1 -72 15 5 880035708 0 -115 33 4 881171693 1 -14 96 4 890881433 0 -85 197 5 879455197 1 -94 56 5 891725331 1 -178 90 3 882827985 0 -92 100 5 875640294 0 -130 82 5 875802080 1 -18 9 5 880130550 1 -26 181 4 891386369 1 -189 132 5 893265865 1 -194 69 4 879521595 0 -44 159 3 878347633 0 -145 117 5 875270655 1 -85 30 3 882995290 1 -176 25 3 886048188 1 -92 143 3 875653960 1 -156 178 5 888185777 0 -118 53 5 875385280 1 -200 107 3 884128022 1 -73 171 5 888626199 1 -137 174 5 881433654 0 -128 159 4 879968390 0 -5 101 5 878844510 1 -144 69 5 888105140 0 -161 181 2 891171848 1 -44 25 2 878346431 0 -94 93 4 891724282 1 -92 160 4 875654125 0 -87 21 3 879877173 1 -60 173 4 883326498 0 -1 40 3 876893230 0 -13 191 3 881515193 0 -178 127 5 882823978 1 -43 133 4 875981483 0 -42 58 5 881108040 1 -177 176 4 880130951 0 -161 186 4 891171530 1 -42 125 4 881105462 1 -75 114 4 884051893 0 -102 38 2 888801622 0 -18 94 3 880131676 1 -138 133 4 879024043 1 -26 24 3 891377540 0 -91 182 4 891439439 0 -6 47 3 883600943 0 -198 56 5 884207392 
1 -43 86 4 883955020 1 -1 133 4 876892818 1 -90 26 4 891385842 1 -42 175 2 881107687 0 -144 144 4 888105254 0 -159 72 3 884026946 0 -64 191 4 889740740 0 -116 191 4 876453961 0 -62 91 4 879375196 0 -190 15 4 891033697 1 -97 183 5 884238911 1 -183 176 3 891466266 1 -70 83 4 884065895 0 -197 56 1 891409799 0 -96 181 5 884403687 0 -15 118 1 879456381 0 -44 24 3 878346575 0 -120 121 4 889490290 0 -58 171 5 884663379 1 -58 172 5 884305241 1 -118 56 5 875385198 1 -199 93 4 883782825 0 -102 53 2 888801577 1 -24 69 5 875323051 0 -7 140 5 891353124 0 -53 118 4 879443253 1 -16 11 5 877718755 1 -188 5 4 875074266 1 -8 195 5 879362287 1 -85 27 4 879827488 1 -60 59 5 883326155 1 -64 182 4 889738030 0 -102 29 1 888802677 1 -109 64 2 880572560 1 -124 28 3 890287068 0 -158 194 5 880134913 0 -91 98 5 891439130 1 -7 100 5 891351082 1 -23 82 3 874787449 0 -97 197 3 884239655 0 -118 135 5 875384591 0 -178 97 5 882827020 1 -25 143 3 885852529 1 -43 3 2 884029543 1 -15 15 4 879455939 0 -87 144 4 879875734 1 -130 98 5 875216507 1 -109 77 4 880578388 1 -119 22 4 874781698 0 -99 125 4 885678840 0 -177 200 4 880130951 0 -145 54 5 888398669 0 -141 118 5 884585274 0 -16 200 5 877722736 0 -70 161 3 884067638 1 -152 161 5 882476363 0 -57 24 3 883697459 1 -130 159 4 875802211 0 -18 166 4 880129595 1 -64 179 5 889739460 1 -198 121 3 884206330 1 -85 153 3 879453658 0 -38 188 2 892431953 1 -27 148 3 891543129 0 -97 96 5 884239712 1 -194 50 3 879521396 0 -13 95 5 882140104 1 -65 63 2 879217913 1 -82 99 4 878769949 0 -102 194 3 888803537 0 -109 70 4 880578038 1 -7 27 4 891352692 1 -90 170 5 891383561 0 -71 197 5 885016990 1 -38 105 3 892434217 1 -200 179 4 884129029 0 -59 52 4 888205615 1 -184 82 3 889909934 1 -83 191 4 880308038 0 -83 121 4 880306951 1 -144 87 5 888105548 0 -92 64 4 875653519 1 -184 20 4 889907771 1 -141 127 2 884584735 0 -7 77 5 891353325 1 -130 31 4 875801801 0 -194 9 4 879535704 0 -200 89 5 884128788 0 -18 132 5 880132437 1 -180 153 1 877126182 1 -183 181 2 891463937 1 -49 80 1 
888069117 1 -42 161 4 881108229 1 -72 118 3 880036346 0 -25 195 4 885852008 0 -127 62 5 884364950 0 -13 92 3 882397271 0 -59 194 3 888204841 0 -94 97 4 891721317 0 -11 24 3 891904016 0 -95 94 5 880573288 0 -64 183 5 889737914 0 -2 14 4 888551853 1 -152 15 5 880148843 0 -5 168 3 875636691 1 -12 195 4 879959670 0 -1 194 4 876892743 0 -90 19 3 891384020 0 -59 176 5 888205574 1 -60 95 4 883327799 1 -200 195 5 884128822 1 -82 81 3 878770059 0 -94 183 5 891720921 0 -93 1 5 888705321 1 -94 41 3 891723355 1 -64 195 5 889737914 1 -200 54 4 884129920 0 -200 98 5 884128933 1 -28 200 2 881961671 1 -95 179 3 880570909 1 -45 50 5 881007272 1 -53 96 4 879442514 0 -89 137 1 879441335 0 -125 41 2 892838510 0 -90 18 3 891383687 0 -189 24 4 893264248 1 -185 111 4 883524529 0 -130 79 5 875217392 1 -67 24 4 875379729 1 -125 109 3 892838288 1 -59 149 4 888203313 0 -195 152 3 890589490 0 -94 125 1 891721851 0 -7 56 5 891351432 0 -178 92 3 882827803 1 -158 129 5 880132383 1 -194 182 3 879521475 0 -5 50 4 875635758 0 -115 96 3 881172117 0 -24 176 5 875323595 0 -82 28 3 878769815 1 -49 13 3 888068816 1 -95 63 3 880572218 0 -60 153 3 883326733 0 -184 25 4 889908068 1 -197 39 2 891409982 0 -154 191 4 879138832 1 -119 11 5 874781198 0 -44 71 3 878347633 1 -109 71 4 880578066 0 -174 111 5 886433898 1 -41 175 5 890687526 1 -151 31 3 879524713 0 -94 83 4 885873653 1 -58 175 5 884663324 1 -62 174 4 879374916 0 -128 82 5 879968185 1 -186 121 2 879023074 0 -187 65 5 879465507 1 -13 79 3 882139746 0 -44 69 4 878347711 0 -81 150 3 876533619 1 -193 1 4 890859954 0 -187 197 4 879465597 0 -108 127 4 879879720 1 -72 9 5 880035636 1 -7 62 3 891354499 1 -59 135 5 888204758 0 -55 118 5 878176134 1 -37 147 3 880915749 1 -58 189 3 884304790 1 -73 64 5 888625042 0 -81 121 4 876533586 1 -98 88 3 880499087 0 -151 154 4 879524642 1 -104 181 5 888465972 0 -117 173 5 881011697 0 -7 29 3 891353828 0 -151 131 5 879525075 0 -26 14 3 891371505 0 -188 157 3 875072674 1 -45 13 5 881012356 0 -56 117 5 892679439 0 -110 41 4 
886989399 0 -184 97 2 889908539 1 -85 134 5 879454004 0 -65 28 4 879216734 0 -70 168 4 884065423 1 -132 12 4 891278867 1 -174 100 5 886433788 1 -59 11 5 888205744 1 -13 110 3 882141130 0 -84 25 3 883452462 1 -189 136 4 893265535 1 -70 183 4 884149894 0 -119 25 5 886177013 1 -56 191 4 892678526 1 -90 174 5 891383866 0 -43 172 4 883955135 0 -194 118 3 879539229 1 -109 122 2 880583493 1 -189 97 4 893277579 0 -92 7 4 876175754 0 -96 190 4 884402978 1 -10 33 4 877893020 0 -161 22 2 891171282 0 -48 183 5 879434608 1 -94 49 4 891722174 0 -87 111 4 879876611 1 -194 28 5 879522324 0 -12 168 4 879959513 1 -16 109 4 877719333 0 -85 193 3 879454189 1 -113 116 3 875076246 1 -197 22 5 891409839 0 -182 126 5 885613153 1 -85 99 5 880838306 0 -85 14 4 879452638 1 -56 88 1 892683895 0 -22 184 5 878887869 1 -138 194 5 879024184 1 -59 181 5 888204877 0 -8 174 5 879362183 0 -144 117 4 888103969 0 -24 8 5 875323002 1 -59 174 5 888204553 1 -128 26 4 879969032 0 -70 95 4 884065501 0 -132 127 4 891278937 0 -10 174 4 877886661 1 -57 126 3 883697293 1 -120 117 3 889490979 0 -69 181 5 882072778 1 -13 68 3 882397741 0 -85 182 4 893110061 1 -161 50 2 891170972 0 -184 66 4 889910013 0 -10 129 4 877891966 0 -124 154 5 890287645 0 -87 172 5 879875737 1 -7 178 4 891350932 1 -96 98 5 884403214 0 -54 1 4 880931595 1 -85 191 4 879455021 0 -130 181 5 874953621 1 -75 1 4 884050018 1 -8 82 5 879362356 1 -113 50 5 875076416 1 -115 56 5 881171409 1 -13 170 5 882139774 0 -75 196 4 884051948 1 -94 143 4 891722609 1 -181 118 2 878962955 0 -70 132 4 884067281 1 -23 7 4 874784385 1 -58 70 4 890321652 0 -92 78 3 876175191 1 -178 11 5 882826162 1 -99 121 3 885679261 0 -79 116 5 891271676 1 -60 160 4 883326525 0 -5 162 1 875721572 1 -24 11 5 875323100 1 -114 176 5 881260203 1 -5 95 4 875721168 0 -157 117 5 886890296 1 -101 117 4 877136067 1 -68 111 3 876974276 0 -114 180 3 881309718 0 -151 198 4 879524472 0 -145 17 3 875272132 0 -75 111 4 884050502 0 -25 186 4 885852569 0 -60 168 5 883326837 1 -198 6 2 884206270 0 
-76 77 2 882607017 0 -10 178 5 877888677 1 -28 195 4 881957250 0 -10 11 4 877888677 1 -92 182 4 875653836 1 -95 72 2 880571389 0 -194 86 3 879520991 1 -94 53 4 891721378 1 -158 123 3 880132488 0 -10 182 5 877888876 1 -87 181 5 879876194 0 -13 1 3 882140487 1 -194 22 5 879521474 0 -24 41 5 875323594 0 -58 116 5 884304409 0 -159 96 4 884360539 1 -121 127 5 891388333 0 -115 177 5 881172117 1 -109 177 4 880578358 0 -109 12 4 880577542 0 -28 56 5 881957479 1 -62 44 3 879374142 0 -110 196 4 886987978 1 -52 13 5 882922485 1 -66 50 5 883601236 1 -48 185 4 879434819 1 -152 49 5 882477402 1 -49 42 4 888068791 0 -124 144 4 890287645 0 -41 195 4 890687042 1 -18 42 3 880130713 0 -22 94 3 878887277 0 -10 134 5 877889131 0 -56 11 4 892676376 1 -138 15 4 879023389 0 -52 151 5 882922249 1 -30 172 4 875060742 1 -99 22 5 885679596 1 -10 137 4 877889186 0 -59 168 5 888204641 1 -76 137 5 875498777 1 -121 100 4 891388035 1 -195 198 3 884420000 1 -62 83 5 879375000 1 -194 8 3 879521719 0 -118 55 5 875385099 0 -144 1 4 888104063 0 -1 93 5 875071484 0 -92 120 2 875642089 0 -56 173 4 892737191 0 -84 95 4 883453642 1 -104 13 3 888465634 0 -56 111 2 892683877 1 -95 25 3 879192597 1 -7 47 5 891352692 1 -94 22 4 885872758 0 -186 98 5 891719859 1 -18 177 3 880131297 1 -119 168 5 874781351 1 -60 12 4 883326463 1 -60 166 4 883326593 1 -18 198 3 880130613 1 -125 105 3 892839021 1 -49 90 1 888069194 1 -192 108 4 881368339 0 -130 100 3 874953558 0 -1 8 1 875072484 0 -198 98 4 884207611 1 -56 78 3 892910544 1 -72 194 4 880037793 0 -43 79 4 875981335 1 -188 100 4 875074127 1 -62 195 5 879373960 0 -189 13 4 893264220 0 -44 121 4 878346946 1 -109 164 5 880578066 1 -49 96 1 888069512 1 -188 97 5 875071891 1 -22 175 4 878886682 0 -181 100 3 878962816 0 -59 61 4 888204597 0 -194 73 3 879527145 0 -164 100 5 889401998 0 -95 200 2 888954552 1 -158 92 4 880134407 1 -57 109 4 883697293 1 -13 13 5 882141617 1 -161 132 1 891171458 0 -109 125 5 880564534 1 -95 89 3 879196353 0 -156 187 5 888185778 1 -94 80 2 
891723525 1 -1 105 2 875240739 1 -84 117 4 883450553 1 -1 147 3 875240993 1 -62 98 4 879373543 0 -115 23 5 881171348 0 -125 181 5 879454139 1 -95 77 4 880571746 0 -200 68 5 884129729 0 -83 25 2 883867729 1 -24 173 5 875323474 1 -137 1 3 881433048 0 -151 26 3 879542252 0 -87 127 4 879876194 0 -85 143 4 879456247 0 -83 111 3 884647519 1 -142 176 5 888640455 1 -1 99 3 875072547 1 -77 127 2 884732927 1 -195 143 5 875771441 0 -104 111 1 888465675 1 -64 196 4 889737992 0 -1 1 5 874965758 0 -18 98 5 880129527 0 -92 5 4 875654432 0 -148 151 4 877400124 1 -151 132 5 879524669 1 -177 135 5 880130712 0 -20 174 4 879669087 1 -199 100 3 883782807 0 -193 23 4 889126609 1 -91 127 5 891439018 0 -64 144 3 889737771 1 -73 179 5 888626041 0 -181 117 2 878962918 0 -138 12 5 879024232 1 -200 63 4 884130415 0 -72 77 4 880036945 0 -194 76 2 879549503 1 -6 137 5 883599327 1 -198 191 4 884208682 0 -41 188 4 890687571 1 -64 121 2 889739678 0 -95 188 3 879196354 1 -7 64 5 891350756 1 -145 134 4 882181695 1 -194 13 4 879539410 1 -144 8 4 888105612 1 -57 181 5 883697352 1 -178 56 4 882825767 0 -95 153 5 879197022 0 -187 168 5 879465273 1 -49 50 1 888067691 0 -69 98 5 882145375 1 -178 9 2 882823758 0 -92 195 5 875652981 1 -26 118 3 891385691 0 -90 20 4 891384357 0 -13 138 1 882399218 1 -30 174 5 885941156 0 -71 181 3 877319414 0 -144 61 3 888106182 1 -22 24 5 878888026 1 -13 117 3 882398138 0 -131 127 4 883681418 1 -177 173 4 880130667 1 -77 15 2 884732873 1 -75 13 5 884050102 1 -13 12 5 881515011 1 -54 181 5 880931358 1 -102 187 3 888801232 0 -144 4 4 888105873 1 -49 71 3 888067096 0 -178 87 4 885784558 0 -52 111 4 882922357 0 -178 200 3 882826983 1 -186 56 3 879023460 0 -23 151 3 874784668 0 -189 7 3 893264300 1 -188 64 5 875071891 0 -15 181 5 879455710 1 -101 147 4 877136506 0 -118 171 5 875384825 1 -154 174 5 879138657 1 -25 116 4 885853335 1 -16 12 5 877718168 1 -7 157 5 891352059 1 -6 64 4 883600597 0 -97 175 5 884239616 0 -158 53 1 880134781 1 -48 191 5 879434954 1 -60 73 4 883326995 0 
-194 159 3 879552401 0 -124 168 5 890287645 1 -109 156 5 880573084 0 -156 83 3 888185677 1 -158 111 4 880134261 0 -98 25 5 880499111 0 -94 200 4 891721414 0 -87 50 5 879876194 1 -95 198 5 880570823 0 -82 3 2 878768765 1 -52 19 5 882922407 0 -194 134 2 879521719 0 -60 30 5 883325944 0 -106 25 4 881451016 0 -43 9 4 875975656 1 -124 174 3 890287317 0 -184 175 3 889908985 1 -83 196 5 880307996 1 -115 174 5 881171137 0 -95 141 4 888954631 0 -181 19 1 878962392 1 -196 116 3 881251753 0 -130 11 5 875216545 0 -81 42 4 876534704 0 -174 139 3 886515591 1 -181 129 2 878962279 1 -37 118 2 880915633 1 -159 126 5 880557038 1 -177 64 4 880130736 0 -97 191 5 884239472 0 -195 93 3 891762536 0 -92 171 4 875652981 1 -6 174 4 883600985 0 -130 118 4 874953895 1 -85 79 3 879453845 0 -72 174 5 880037702 0 -96 182 4 884402791 0 -95 121 4 879194114 1 -48 98 5 879434954 0 -91 50 5 891439386 1 -5 172 5 875636130 0 -175 12 4 877108146 0 -167 8 5 892738237 1 -181 18 1 878962623 1 -162 1 4 877635819 1 -189 124 5 893264048 1 -76 60 4 875028007 1 -59 79 5 888204260 0 -125 176 5 879454448 1 -152 117 4 880148782 1 -181 111 3 878962774 0 -92 80 2 875907504 1 -89 66 3 879459980 0 -62 97 2 879373795 1 -119 23 3 874782100 0 -1 197 5 875072956 1 -151 147 2 879524947 1 -161 133 2 891171023 1 -95 78 3 888956901 1 -136 116 5 882693723 0 -1 173 5 878541803 0 -13 7 2 882396790 1 -122 11 1 879270424 1 -89 100 5 879441271 1 -1 75 4 878543238 1 -68 25 4 876974176 0 -18 66 3 880131728 1 -198 117 1 884205114 1 -184 51 4 889909069 0 -198 143 3 884208951 1 -197 172 5 891409839 1 -46 50 4 883616254 0 -118 98 5 875384979 1 -102 127 2 888801316 0 -5 70 4 875636389 0 -29 79 4 882821989 0 -160 185 5 876861185 0 -6 13 2 883599400 1 -130 5 4 876251650 1 -109 55 2 880572756 1 -62 170 3 879373848 0 -194 183 3 879520916 1 -185 9 4 883524396 0 -199 7 4 883782854 1 -115 98 3 881171409 1 -128 25 3 879968185 0 -74 7 4 888333458 0 -59 147 5 888203270 0 -1 34 2 878542869 0 -62 53 2 879376270 1 -27 118 3 891543222 0 -94 132 4 
891720862 1 -184 182 4 889908497 1 -158 181 3 880132383 0 -138 56 5 879024232 0 -69 50 5 882072748 0 -198 55 3 884207525 0 -18 95 4 880131297 0 -181 104 1 878962866 0 -77 175 4 884733655 1 -197 2 3 891409981 0 -62 196 4 879374015 0 -60 185 4 883326682 0 -181 93 1 878962773 1 -59 44 4 888206048 0 -7 80 4 891354381 1 -141 147 4 884584906 0 -119 132 5 874782228 0 -49 98 4 888067307 0 -7 51 2 891352984 1 -23 161 2 874787017 1 -13 193 5 882139937 0 -144 135 5 888105364 0 -18 157 3 880131849 0 -72 2 3 880037376 1 -20 151 3 879668555 1 -188 177 4 875073329 0 -184 121 2 889908026 0 -68 121 1 876974176 0 -151 50 5 879525034 0 -110 2 3 886988536 0 -193 199 5 889125535 1 -181 126 2 878962585 0 -95 71 5 880573288 1 -151 81 5 879524293 1 -198 161 3 884208454 0 -23 185 4 874785756 0 -120 1 4 889490412 0 -114 197 4 881260506 0 -197 11 1 891409893 1 -94 160 4 891721942 0 -87 80 4 879877241 1 -13 177 5 882397271 0 -64 190 4 889737851 1 -125 191 5 879454385 1 -59 58 4 888204389 0 -1 144 4 875073180 1 -97 153 5 884239686 0 -116 20 3 892683858 1 -62 129 3 879372276 0 -174 147 4 886433936 0 -37 50 5 880915838 0 -23 155 3 874787059 1 -81 111 3 876534174 0 -6 186 4 883602730 1 -189 166 4 893265657 1 -58 127 4 884304503 0 -116 47 3 876454238 0 -109 174 5 880572721 0 -70 99 4 884067222 0 -188 54 4 875074589 0 -94 142 3 891721749 1 -92 157 4 875653988 0 -13 183 4 882397271 0 -75 25 5 884049875 1 -11 180 2 891904335 0 -160 15 2 876768609 1 -132 151 3 891278774 0 -6 191 4 883601088 1 -144 15 4 888104150 0 -59 137 5 888203234 1 -193 25 4 889127301 0 -26 109 3 891376987 1 -102 82 2 888801360 0 -5 151 3 875635723 1 -178 153 4 882826347 1 -52 126 5 882922589 1 -95 144 5 879197329 0 -94 98 4 891721192 1 -82 181 4 876311241 1 -72 98 5 880037417 0 -72 25 5 880035588 0 -128 168 4 879966685 1 -174 140 4 886515514 0 -130 62 4 876252175 1 -32 9 3 883717747 0 -200 43 3 884129814 1 -104 3 3 888465739 0 -188 121 4 875073647 0 -200 141 4 884129346 0 -83 4 2 880336655 0 -82 121 4 876311387 1 -56 67 2 
892677114 0 -92 4 4 875654222 1 -18 125 3 880131004 0 -23 79 4 874785957 0 -12 159 4 879959306 0 -24 200 5 875323440 0 -194 87 4 879523104 0 -106 161 3 881452816 0 -184 79 3 889909551 0 -94 29 2 891723883 0 -151 9 4 879524199 0 -59 1 2 888203053 0 -57 105 3 883698009 1 -79 93 2 891271676 0 -189 162 3 893266230 0 -99 1 4 886518459 0 -6 178 4 883600785 1 -25 131 4 885852611 1 -11 86 4 891904551 0 -128 181 4 879966954 1 -64 188 4 889739586 1 -21 145 1 874951761 0 -174 9 5 886439492 0 -91 136 4 891438909 0 -109 82 5 880572680 0 -94 173 4 885872758 1 -87 89 4 879875818 0 -184 93 4 889907771 0 -28 176 5 881956445 0 -197 38 3 891410039 0 -182 181 5 885612967 0 -158 55 4 880134407 1 -65 88 4 879217942 0 -64 135 4 889737889 1 -92 125 4 876175004 1 -73 153 3 888626007 1 -109 88 4 880581942 1 -53 7 3 879442991 0 -1 119 5 876893098 0 -56 176 5 892676377 0 -152 133 5 882474845 1 -1 26 3 875072442 0 -109 118 3 880571801 0 -22 163 1 878886845 0 -115 13 5 881171983 0 -44 90 2 878348784 0 -139 150 4 879538327 1 -42 66 4 881108280 1 -62 159 3 879375762 1 -64 194 5 889737710 1 -10 32 4 877886661 0 -66 121 3 883601834 1 -3 181 4 889237482 0 -23 91 4 884550049 1 -13 91 2 882398724 0 -56 154 2 892911144 1 -23 145 3 874786244 0 -72 97 4 880036638 0 -59 87 4 888205228 1 -174 15 5 886434065 1 -95 186 5 880573288 1 -71 14 5 877319375 1 -162 105 2 877636458 0 -7 134 4 892134959 0 -158 187 5 880134332 0 -2 25 4 888551648 0 -51 173 5 883498844 0 -7 194 5 891351851 1 -178 143 4 882827574 0 -198 70 3 884207691 1 -200 117 5 876042268 0 -198 132 4 884208137 0 -148 175 4 877016259 0 -194 91 3 879524892 0 -27 9 4 891542942 1 -62 62 3 879375781 1 -72 170 3 880037793 1 -23 156 3 877817091 0 -23 174 4 874785652 1 -73 154 5 888625343 1 -83 174 5 880307699 1 -85 69 4 879454582 1 -57 8 4 883698292 0 -104 130 1 888465554 0 -174 151 3 886434013 1 -102 188 2 888801812 1 -1 158 3 878542699 0 -1 37 2 878543030 0 -194 15 4 879539127 1 -23 134 4 874786098 0 -14 32 5 890881485 0 -91 31 5 891438875 1 -76 56 5 
875027739 1 -67 105 4 875379683 1 -198 200 4 884207239 0 -151 111 4 879542775 1 -96 56 5 884403336 1 -44 89 5 878347315 0 -137 89 5 881433719 0 -28 117 4 881957002 0 -85 168 4 879454304 0 -22 109 4 878886710 1 -184 118 2 889908344 1 -13 194 5 882141458 0 -54 151 2 880936670 1 -151 134 4 879524131 1 -174 31 4 886434566 1 -197 29 3 891410170 0 -1 181 5 874965739 1 -21 200 5 874951695 0 -91 187 5 891438908 0 -85 180 4 879454820 1 -128 70 3 879967341 1 -189 191 5 893265402 1 -57 42 5 883698324 0 -194 155 3 879550737 1 -175 172 5 877107339 1 -83 161 4 887665549 0 -6 166 4 883601426 1 -160 192 5 876861185 1 -18 134 5 880129877 1 -130 24 5 874953866 0 -56 97 3 892677186 1 -16 1 5 877717833 1 -93 151 1 888705360 0 -99 117 5 885678784 1 -82 21 1 884714456 1 -187 69 4 879465566 1 -26 121 3 891377540 0 -109 98 4 880572755 0 -189 83 4 893265624 1 -160 168 4 876858091 0 -144 70 4 888105587 1 -94 54 4 891722432 1 -151 14 5 879524325 0 -44 196 4 878348885 0 -11 98 2 891905783 0 -198 185 3 884209264 1 -1 136 3 876893206 1 -178 50 5 882823857 0 -94 191 5 885870175 0 -188 22 5 875072459 1 -180 12 2 877355568 1 -194 174 4 879520916 1 -144 48 5 888105197 1 -26 50 4 891386368 1 -97 82 4 884239552 1 -6 194 4 883601365 0 -185 25 4 883525206 1 -13 32 4 882140286 0 -160 151 4 876769097 0 -194 81 2 879523576 1 -174 69 5 886514201 0 -160 153 3 876860808 0 -102 72 3 888803602 1 -161 100 4 891171127 1 -76 172 5 882606080 1 -21 7 5 874951292 0 -178 155 4 882828021 0 -13 5 1 882396869 0 -21 5 2 874951761 0 -65 15 5 879217138 1 -178 184 5 882827947 1 -159 195 3 884360539 1 -180 98 5 877544444 0 -109 72 5 880577892 1 -48 136 4 879434689 1 -130 65 4 875216786 0 -60 50 5 883326566 1 -141 126 5 884585642 0 -145 123 4 879161848 0 -22 21 4 878886750 1 -6 177 4 883600818 0 -62 199 4 879373692 0 -200 48 2 884129029 0 -99 25 3 885679025 1 -154 61 4 879138657 0 -44 99 4 878348812 0 -2 10 2 888551853 0 -14 124 5 876964936 0 -58 12 5 884304895 0 -121 83 4 891388210 1 -83 70 4 880308256 1 -16 58 4 877720118 1 
-73 152 3 888626496 0 -200 161 4 884128979 0 -90 153 5 891384754 0 -138 117 4 879023245 1 -23 19 4 874784466 1 -101 7 3 877135944 0 -151 88 5 879542645 1 -11 39 3 891905824 1 -59 199 4 888205410 1 -178 62 4 882827083 1 -1 131 1 878542552 1 -13 17 1 882396954 0 -6 12 4 883601053 1 -184 50 4 889907396 0 -95 182 2 879198210 1 -160 195 4 876859413 1 -94 81 4 885870577 1 -180 196 5 877355617 0 -141 25 5 884585105 0 -92 174 5 875654189 0 -189 150 4 893277702 0 -91 192 4 891439302 1 -109 111 4 880564570 0 -92 116 3 875640251 1 -121 118 2 891390501 1 -13 128 1 882397502 0 -29 182 4 882821989 0 -200 193 4 884129209 0 -69 7 5 882126086 1 -60 69 4 883326215 0 -56 42 4 892676933 1 -56 53 3 892679163 0 -58 25 4 884304570 0 -94 164 3 891721528 1 -188 118 3 875072972 1 -159 7 5 880485861 1 -65 179 3 879216605 1 -90 194 5 891383424 1 -30 50 3 875061066 0 -185 116 4 883526268 1 -85 152 5 879454751 1 -72 187 4 880036638 1 -1 109 5 874965739 1 -90 126 2 891384611 1 -152 66 5 886535773 0 -1 182 4 875072520 0 -108 124 4 879879757 0 -96 89 5 884402896 0 -145 88 5 875272833 0 -151 195 3 879524642 1 -1 71 3 876892425 0 -90 56 5 891384516 0 -198 195 3 884207267 0 -49 57 4 888066571 0 -23 102 3 874785957 0 -85 23 4 879454272 0 -44 64 5 878347915 1 -92 72 3 875658159 0 -43 28 4 875981452 0 -11 15 5 891903067 1 -95 67 2 879198109 1 -23 131 4 884550021 0 -142 186 4 888640430 1 -90 154 5 891384516 0 -116 11 5 886310197 0 -77 31 3 884753292 0 -139 100 5 879538199 1 -125 172 5 879454448 1 -95 65 4 879197918 1 -12 161 5 879959553 0 -59 184 4 888206094 1 -72 56 5 880037702 1 -96 127 5 884403214 0 -118 175 5 875384885 0 -148 132 4 877020715 0 -175 133 4 877107390 1 -13 179 2 882140206 1 -58 135 4 884305150 1 -55 79 5 878176398 0 -70 173 4 884149452 0 -125 122 1 892839312 0 -70 28 4 884065757 1 -152 21 3 880149253 1 -109 15 4 880577868 0 -90 10 5 891383987 1 -91 143 4 891439386 1 -75 56 5 884051921 0 -82 1 4 876311241 0 -6 71 4 883601053 0 -58 151 3 884304553 1 -26 111 3 891371437 1 -37 96 4 
880915810 1 -110 43 3 886988100 0 -13 86 1 881515348 0 -200 118 4 876042299 0 -193 69 5 889125287 0 -90 141 5 891385899 1 -23 55 4 874785624 0 -90 134 5 891383204 1 -55 22 5 878176397 1 -200 132 5 884130792 0 -184 40 4 889910326 0 -154 197 5 879139003 0 -136 124 5 882693489 0 -18 197 4 880130109 0 -62 128 2 879374866 0 -99 173 4 885680062 1 -42 135 4 881109148 1 -44 67 3 878348111 1 -59 97 5 888205921 0 -176 117 4 886048305 0 -1 46 4 876893230 0 -130 67 4 876252064 1 -90 64 4 891383912 0 -44 163 4 878348627 0 -109 17 4 880582132 0 -59 106 4 888203959 0 -115 124 5 881170332 0 -81 25 5 876533946 1 -65 73 4 879217998 1 -144 124 4 888104063 1 -46 100 4 883616134 0 -23 8 4 874785474 1 -99 105 2 885679353 0 -190 121 3 891033773 0 -200 91 4 884129814 0 -21 185 5 874951658 1 -106 14 4 881449486 0 -43 174 4 875975687 0 -43 12 5 883955048 0 -178 124 4 882823758 1 -184 165 4 889911178 1 -73 1 2 888626065 1 -56 153 4 892911144 0 -153 181 1 881371140 1 -109 186 3 880572786 0 -144 55 4 888105254 1 -1 169 5 878543541 0 -97 195 5 884238966 0 -125 186 3 879454448 0 -122 135 4 879270327 1 -136 9 5 882693429 1 -18 23 4 880130065 1 -148 181 5 877399135 0 -94 196 4 891721462 1 -77 25 2 884733055 1 -95 102 4 880572474 1 -115 137 5 881169776 1 -151 97 5 879528801 1 -178 144 4 882825768 1 -6 168 4 883602865 0 -70 135 4 884065387 1 -184 185 4 889908843 0 -64 101 2 889740225 1 -193 177 4 890860290 1 -16 98 5 877718107 1 -82 7 3 876311217 1 -163 28 3 891220019 1 -11 125 4 891903108 1 -52 25 5 882922562 0 -85 121 2 879453167 0 -49 25 2 888068791 0 -198 153 4 884207858 0 -41 97 3 890687665 0 -18 178 3 880129628 1 -73 175 5 888625785 1 -84 87 5 883453587 0 -13 33 5 882397581 1 -145 173 5 875272604 1 -148 56 5 877398212 0 -48 194 4 879434819 0 -87 195 5 879875736 0 -92 51 4 875812305 0 -1 41 2 876892818 1 -1 162 4 878542420 0 -70 174 5 884065782 1 -31 136 5 881548030 0 -65 98 4 879218418 0 -188 162 4 875072972 1 -38 155 5 892432090 0 -89 50 5 879461219 0 -152 8 5 882829050 1 -181 6 1 878962866 1 
-1 110 1 878542845 0 -95 95 3 879198109 0 -7 73 3 892133154 1 -178 77 4 882827947 0 -189 50 5 893263994 0 -13 22 4 882140487 1 -198 73 3 884208419 0 -153 182 5 881371198 0 -135 55 4 879857797 1 -10 197 5 877888944 1 -64 91 4 889739733 1 -28 28 4 881956853 0 -58 98 4 884304747 0 -58 199 4 891611501 1 -185 160 1 883524281 1 -130 64 5 875801549 1 -177 186 4 880130990 0 -6 185 5 883601393 0 -20 148 5 879668713 1 -189 118 1 893264735 0 -174 126 5 886433166 0 -182 50 5 885613018 0 -174 178 5 886513947 0 -1 66 4 878543030 1 -13 39 3 882397581 0 -76 93 4 882606572 1 -151 93 5 879525002 1 -85 82 3 879454633 0 -26 116 2 891352941 0 -115 176 5 881171203 0 -59 180 4 888204597 0 -197 183 5 891409839 1 -72 5 4 880037418 0 -102 68 2 888801673 0 -1 77 4 876893205 0 -200 62 5 884130146 0 -59 185 5 888205228 1 -194 88 3 879549394 1 -178 7 4 882823805 0 -85 132 5 879453965 0 -72 69 4 880036579 1 -110 88 4 886988967 0 -113 9 3 875076307 1 -102 49 2 892992129 0 -187 179 5 879465782 1 -109 95 4 880572721 0 -182 150 3 885613294 1 -85 136 4 879454349 1 -167 83 5 892738384 0 -12 4 5 879960826 0 -63 25 4 875747292 0 -184 65 4 889909516 0 -184 1 4 889907652 0 -124 173 2 890287687 1 -99 147 5 885678997 0 -41 168 5 890687304 0 -71 174 2 877319610 0 -130 29 3 878537558 1 -87 153 5 879876703 1 -90 133 5 891384147 0 -23 144 3 874785926 1 -54 100 5 880931595 1 -43 168 4 875981159 0 -121 156 4 891388145 0 -60 7 5 883326241 0 -59 198 5 888204389 1 -42 64 5 881106711 0 -44 102 2 878348499 0 -44 7 5 878341246 1 -123 23 4 879873020 1 -125 117 3 879454699 1 -109 81 2 880580030 0 -28 185 5 881957002 1 -116 181 4 876452523 1 -49 171 4 888066551 0 -189 178 5 893265191 0 -70 142 3 884150884 0 -198 69 4 884207560 1 -22 181 5 878887765 1 -102 95 4 883748488 1 -152 173 5 882474378 0 -189 165 5 893265535 1 -18 189 5 880129816 0 -189 151 5 893264378 1 -193 153 4 889125629 1 -178 157 5 882827400 0 -190 148 4 891033742 1 -181 109 1 878962955 1 -60 9 5 883326399 1 -151 199 3 879524563 0 -192 25 4 881367618 1 -16 
125 3 877726944 0 -24 79 4 875322796 0 -14 191 4 890881557 1 -11 29 3 891904805 0 -102 88 3 892991311 0 -89 49 4 879460347 0 -145 66 4 875272786 0 -10 69 4 877889131 0 -131 9 5 883681723 1 -6 79 3 883600747 1 -95 22 4 888953953 0 -152 153 4 880149924 1 -87 47 3 879876637 1 -1 199 4 875072262 1 -77 132 3 884753028 1 -181 24 1 878962866 1 -130 38 4 876252263 0 -12 174 5 879958969 0 -104 15 5 888465413 1 -74 13 4 888333542 0 -94 72 3 891723220 1 -43 8 4 875975717 1 -101 125 4 877137015 0 -1 57 5 878542459 0 -1 50 5 874965954 0 -145 181 5 875270507 1 -11 123 3 891902745 1 -65 69 3 879216479 0 -11 42 3 891905058 1 -22 17 4 878886682 0 -70 143 5 884149431 0 -176 129 3 886048391 0 -198 122 1 884206807 0 -52 107 4 882922540 1 -51 184 3 883498685 0 -145 164 4 875271948 1 -13 157 3 882140552 1 -41 58 3 890687353 0 -97 23 5 884239553 0 -82 71 4 878770169 1 -151 133 5 879524797 0 -70 88 4 884067394 1 -64 185 4 889739517 1 -194 157 4 879547184 0 -178 64 5 882826242 0 -69 182 4 882145400 1 -198 164 3 884208571 1 -60 136 4 883326057 1 -198 108 3 884206270 1 -188 144 3 875071520 1 -60 70 4 883326838 0 -38 78 5 892433062 1 -96 83 3 884403758 1 -87 72 3 879876848 0 -151 82 3 879524819 1 -76 192 5 875027442 0 -178 172 4 882826555 0 -24 12 5 875323711 0 -6 173 5 883602462 1 -148 168 5 877015900 0 -162 79 4 877636713 0 -136 127 5 882693404 0 -142 7 4 888640489 1 -81 186 5 876534783 1 -59 191 4 888204841 0 -174 70 5 886453169 1 -102 200 3 888803051 1 -110 161 5 886988631 1 -97 100 2 884238778 1 -23 194 4 874786016 0 -121 135 5 891388090 0 -49 77 1 888068289 1 -187 23 4 879465631 0 -8 177 4 879362233 0 -89 181 4 879441491 0 -56 144 5 892910796 1 -11 25 3 891903836 1 -169 134 5 891359250 0 -76 6 5 875028165 1 -23 96 4 874785551 0 -77 1 5 884732808 1 -48 56 3 879434723 0 -42 38 3 881109148 0 -109 161 3 880572756 0 -141 100 4 884584688 0 -194 29 2 879528342 1 -51 136 4 883498756 1 -125 8 4 879454419 0 -151 1 5 879524151 1 -64 31 4 889739318 0 -138 137 5 879023131 1 -1 192 4 875072547 1 -26 
148 3 891377540 1 -1 178 5 878543541 1 -90 59 5 891383173 0 -169 50 5 891359250 0 -43 69 4 875981421 1 -123 64 3 879872791 0 -92 193 4 875654222 1 -1 5 3 889751712 0 -106 28 4 881451144 0 -83 56 1 886534501 1 -95 58 3 879197834 0 -13 87 5 882398814 1 -92 9 4 875640148 0 -130 2 4 876252327 1 -144 98 4 888105587 1 -189 175 5 893265506 0 -103 181 4 880415875 1 -175 195 3 877107790 0 -10 60 3 877892110 0 -145 111 3 875270322 1 -1 87 5 878543541 0 -23 162 3 874786950 1 -24 92 5 875323241 0 -92 95 3 875653664 0 -85 65 3 879455021 1 -82 175 4 878769598 1 -198 131 3 884208952 1 -49 181 1 888067765 1 -32 111 3 883717986 1 -184 170 5 889913687 1 -1 156 4 874965556 0 -186 38 5 879023723 0 -85 87 4 879829327 1 -117 7 3 880125780 0 -62 144 3 879374785 1 -8 96 3 879362183 1 -58 181 3 884304447 1 -85 174 4 879454139 1 -43 186 3 875981335 1 -187 8 5 879465273 0 -104 124 2 888465226 1 -94 121 2 891721815 0 -7 141 5 891353444 1 -178 180 3 882826395 1 -52 100 4 882922204 0 -178 79 4 882826306 0 -95 197 4 888954243 1 -64 153 3 889739243 1 -24 98 5 875323401 0 -177 60 4 880130634 0 -99 120 2 885679472 0 -4 11 4 892004520 0 -198 64 4 884207206 0 -62 127 4 879372216 1 -64 98 4 889737654 1 -16 180 5 877726790 0 -152 121 5 880149166 1 -97 133 1 884239655 0 -102 73 3 892992297 0 -10 170 4 877889333 0 -59 179 5 888204996 1 -13 163 3 882141582 0 -22 144 5 878887680 1 -49 85 3 888068934 1 -78 93 4 879633766 0 -92 49 3 875907416 0 -159 25 5 880557112 0 -151 33 5 879543181 1 -69 197 5 882145548 1 -5 145 1 875720830 0 -110 184 1 886988631 1 -125 144 5 879454197 0 -13 164 3 882396790 1 -59 197 5 888205462 0 -94 194 4 885870284 0 -64 141 4 889739517 1 -92 1 4 875810511 1 -51 144 5 883498894 0 -72 50 2 880037119 1 -158 29 3 880134607 1 -56 89 4 892676314 0 -157 120 1 886891243 0 -92 55 3 875654245 1 -65 87 5 879217689 0 -178 168 4 882826347 1 -95 196 4 879198354 0 -13 199 5 882140001 0 -13 67 1 882141686 1 -182 178 5 876435434 0 -24 71 5 875323833 1 -22 2 2 878887925 1 -51 172 5 883498936 0 -12 98 5 
879959068 0 -175 11 5 877107339 1 -148 135 5 877016514 0 -62 13 4 879372634 0 -1 106 4 875241390 1 -153 172 1 881371140 1 -8 187 4 879362123 0 -128 66 3 879969329 1 -17 126 4 885272724 0 -13 82 2 882397503 1 -82 133 4 878769410 1 -1 167 2 878542383 0 -16 158 4 877727280 1 -200 95 5 884128979 0 -6 153 4 883603013 0 -175 50 5 877107138 0 -119 137 5 886176486 0 -1 115 5 878541637 0 -7 68 4 891351547 1 -124 195 4 890399864 1 -198 27 2 884208595 0 -165 169 5 879525832 0 -59 96 5 888205659 1 -145 62 2 885557699 0 -187 134 3 879465079 0 -6 89 4 883600842 0 -64 127 5 879366214 1 -145 31 5 875271896 0 -21 50 3 874951131 1 -13 160 4 882140070 0 -6 7 2 883599102 0 -22 161 4 878887925 1 -102 13 3 892991118 1 -1 11 2 875072262 0 -90 57 5 891385389 1 -175 186 4 877107790 0 -158 161 2 880134477 1 -71 177 2 885016961 1 -141 15 5 884584981 1 -95 133 3 888954341 1 -37 172 4 880930072 1 -6 192 4 883600914 0 -172 124 4 875537151 0 -49 182 3 888069416 1 -58 169 4 884304936 1 -73 187 5 888625934 0 -117 156 4 881011376 1 -195 60 3 888737240 1 -198 186 5 884207733 1 -97 83 1 884238817 1 -92 186 4 875653960 0 -59 187 5 888204349 1 -195 134 5 875771441 0 -176 7 5 886048188 0 -181 116 1 878962550 0 -178 1 4 882823805 0 -158 121 4 880132701 0 -1 35 1 878542420 1 -76 129 3 878101114 1 -176 150 4 886047879 1 -59 190 5 888205033 0 -152 143 5 882474378 1 -97 169 5 884238887 0 -91 22 5 891439208 1 -1 137 5 875071541 0 -178 156 2 882826395 1 -128 88 4 879969390 0 -188 153 5 875075062 0 -145 135 5 885557731 0 -92 58 4 875653836 0 -198 4 3 884209536 0 -53 151 4 879443011 0 -128 161 5 879968896 0 -94 32 5 891721851 0 -17 7 4 885272487 1 -116 50 3 876452443 1 -25 183 4 885852008 0 -62 81 4 879375323 0 -13 50 5 882140001 1 -43 121 4 883955907 1 -136 15 4 882693723 0 -56 22 5 892676376 0 -62 71 4 879374661 1 -157 118 2 886890439 1 -94 42 4 885870577 1 -53 25 4 879442538 1 -153 79 5 881371198 1 -121 197 4 891388286 0 -15 111 4 879455914 1 -7 4 5 891351772 0 -194 89 3 879521328 0 -13 37 1 882397011 0 -167 
99 4 892738385 0 -192 125 3 881367849 0 -196 153 5 881251820 0 -94 4 4 891721168 1 -18 111 3 880131631 0 -95 97 4 879198652 0 -30 161 4 875060883 0 -198 127 5 884204919 1 -184 7 3 889907738 0 -55 50 4 878176005 1 -70 101 3 884150753 1 -25 98 5 885853415 1 -93 121 3 888705053 1 -148 8 4 877020297 1 -82 197 4 878769847 0 -76 70 4 875027981 0 -152 155 5 884018390 1 -174 125 5 886514069 0 -92 47 4 875654732 1 -189 61 3 893265826 1 -101 151 3 877136628 1 -130 185 5 875217033 1 -151 178 5 879524586 0 -83 31 5 880307751 0 -106 191 5 881451453 1 -92 179 5 875653077 1 -54 117 5 880935384 0 -73 82 2 888625754 0 -158 188 4 880134332 0 -188 180 5 875073329 0 -193 161 3 889125912 0 -114 168 3 881259927 0 -6 127 5 883599134 1 -98 152 3 880498968 0 -96 144 4 884403250 0 -44 197 4 878347420 0 -85 173 3 879454045 1 -1 127 5 874965706 0 -92 8 5 875654159 0 -10 185 5 877888876 0 -119 181 4 874775406 0 -125 70 3 892838287 1 -59 32 4 888205228 0 -58 153 5 884304896 0 -157 111 3 886889876 1 -159 67 1 884026964 0 -13 70 3 882140691 1 -130 179 4 875217265 1 -89 1 5 879461219 0 -25 23 4 885852529 0 -151 151 5 879524760 1 -82 79 3 878769334 1 -174 50 4 886433166 0 -177 129 3 880130653 1 -8 181 4 879362183 1 -16 172 5 877724726 0 -145 176 5 875271838 1 -125 72 4 892838322 0 -189 4 5 893265741 0 -138 45 5 879024232 0 -124 7 4 890287645 1 -7 145 1 891354530 1 -104 50 5 888465972 1 -95 177 3 879196408 0 -144 54 2 888105473 1 -51 134 2 883498844 0 -90 8 5 891383424 0 -60 89 5 883326463 0 -197 82 5 891409893 1 -188 76 4 875073048 0 -43 131 3 883954997 0 -6 1 4 883599478 1 -50 15 2 877052438 0 -5 169 5 878844495 1 -16 71 5 877721071 1 -91 56 1 891439057 1 -1 16 5 878543541 0 -83 38 5 887665422 1 -63 1 3 875747368 0 -42 151 4 881110578 1 -122 190 4 879270424 1 -200 71 4 884129409 0 -176 93 5 886047963 0 -97 186 3 884239574 1 -158 10 4 880132513 1 -87 13 3 879876734 1 -65 1 3 879217290 1 -56 87 4 892678508 1 -194 133 3 879523575 0 -1 79 4 875072865 1 -23 62 3 874786880 1 -189 10 5 893264335 0 -128 
73 3 879969032 0 -194 143 3 879524643 0 -14 81 5 890881384 1 -70 15 3 884148728 0 -53 50 4 879442978 0 -119 9 4 890627252 0 -38 144 5 892430369 1 -44 144 4 878347532 1 -130 156 3 875801447 1 -69 124 4 882072869 0 -92 68 3 875653699 0 -197 177 5 891409935 1 -95 139 4 880572250 1 -56 62 5 892910890 0 -64 89 3 889737376 1 -124 117 3 890287181 1 -60 96 4 883326122 0 -13 58 4 882139966 1 -144 116 4 888104258 1 -183 177 5 892323452 0 -128 174 3 879966954 1 -161 69 4 891171657 1 -18 175 4 880130431 0 -2 100 5 888552084 0 -85 52 3 881705026 1 -87 158 3 879877173 0 -94 181 4 885872942 1 -109 191 4 880577844 1 -77 154 5 884733922 0 -177 7 4 880130881 0 -42 50 5 881107178 0 -1 45 5 875241687 1 -5 121 4 875635189 0 -16 15 5 877722001 0 -115 48 5 881171203 1 -199 9 5 883782853 1 -183 62 2 891479217 0 -85 135 5 879453845 0 -130 89 4 875216458 1 -117 96 5 881012530 1 -1 48 5 875072520 0 -49 93 5 888068912 0 -114 98 4 881259495 0 -87 179 4 879875649 0 -49 8 3 888067691 1 -134 1 5 891732756 0 -128 179 3 879967767 0 -162 147 4 877636147 0 -198 50 5 884204919 1 -43 58 3 883955859 1 -178 66 4 882826868 0 -59 133 3 888204349 1 -196 13 2 881251955 0 -188 151 3 875073909 1 -17 100 4 885272520 0 -109 54 3 880578286 0 -7 186 4 891350900 0 -7 25 3 891352451 0 -156 86 4 888185854 1 -130 42 4 875801422 1 -67 125 4 875379643 1 -1 25 4 875071805 1 -64 186 4 889737691 1 -60 131 4 883327441 0 -192 7 4 881367791 1 -90 65 4 891385298 0 -62 69 4 879374015 1 -161 15 2 891172284 0 -16 152 4 877728417 1 -199 14 4 883783005 0 -132 50 3 891278774 0 -125 111 3 892838322 1 -131 137 1 883681466 0 -193 122 1 889127698 1 -119 89 4 874781352 0 -90 97 5 891383987 0 -60 23 4 883326652 0 -1 195 5 876892855 1 -18 81 3 880130890 0 -62 82 4 879375414 0 -151 170 5 879524669 1 -194 72 3 879554100 1 -174 155 4 886513767 1 -29 12 5 882821989 1 -89 88 4 879459980 1 -182 48 3 876436556 0 -13 197 4 881515239 0 -119 70 3 874781829 0 -194 173 5 879521088 0 -96 183 4 884403123 1 -77 96 3 884752562 1 -53 156 4 879442561 1 -151 
194 4 879524443 1 -20 143 3 879669040 0 -109 168 3 880577734 0 -69 117 4 882072748 1 -72 191 5 880036515 0 -125 21 3 892838424 0 -55 121 3 878176084 0 -49 161 1 888069513 1 -144 127 4 888105823 1 -197 4 3 891409981 0 -144 147 3 888104402 0 -200 94 4 884130046 1 -31 153 4 881548110 0 -189 181 3 893264023 0 -7 180 5 891350782 0 -160 61 4 876861799 1 -158 175 4 880135044 0 -197 182 3 891409935 0 -1 153 3 876893230 1 -5 189 5 878844495 1 -82 169 4 878769442 1 -14 173 4 879119579 0 -85 50 5 882813248 0 -22 89 5 878887680 1 -78 25 3 879633785 1 -144 14 4 888104122 1 -45 7 3 881008080 0 -183 94 3 891466863 0 -16 87 4 877720916 0 -125 49 3 879455241 1 -17 111 3 885272674 1 -50 124 1 877052400 0 -151 168 5 879528495 0 -103 118 3 880420002 1 -1 101 2 878542845 1 -122 83 5 879270327 1 -80 86 5 887401496 0 -184 134 5 889909618 1 -70 63 3 884151168 0 -94 69 3 885870057 0 -10 59 4 877886722 1 -110 12 4 886987826 1 -87 27 4 879876037 0 -45 151 2 881013885 1 -197 184 1 891409981 1 -104 127 3 888465201 0 -2 127 5 888552084 0 -11 54 3 891905936 1 -23 153 4 874786438 0 -196 173 2 881251820 1 -24 132 3 875323274 0 -160 150 4 876767440 0 -6 132 5 883602422 0 -102 117 3 888801232 0 -148 190 2 877398586 0 -23 171 5 874785809 0 -1 168 5 874965478 0 -59 15 5 888203449 0 -99 116 2 888469419 1 -95 51 4 879198353 1 -128 131 5 879967452 0 -123 182 4 879872671 1 -117 11 5 881011824 1 -168 1 5 884287509 0 -145 55 3 875272009 1 -118 172 5 875384751 0 -16 100 5 877720437 1 -43 122 2 884029709 0 -130 188 4 876251895 0 -92 50 5 875640148 0 -7 9 5 891351432 1 -130 1 5 874953595 0 -184 56 3 889908657 0 -51 50 5 883498685 0 -151 4 5 879524922 1 -63 79 3 875748245 0 -162 7 3 877635869 1 -200 79 5 884128499 0 -92 156 4 875656086 1 -41 56 4 890687472 1 -95 99 4 888954699 1 -151 56 4 879524879 1 -119 125 5 874775262 0 -42 97 3 881107502 1 -178 164 3 882827288 0 -188 199 4 875071658 0 -13 111 5 882140588 1 -43 124 4 891294050 0 -60 134 4 883326215 1 -90 198 5 891383204 1 -158 174 5 880134332 1 -189 133 5 
893265773 1 -1 123 4 875071541 0 -193 79 4 889125755 0 -10 192 4 877891966 1 -58 111 4 884304638 0 -57 194 4 883698272 0 -68 9 4 876974073 1 -119 121 4 874775311 0 -82 100 5 876311299 0 -64 111 4 889739975 1 -92 38 3 875657640 1 -144 9 5 888104191 0 -178 31 4 882827083 0 -22 4 5 878886571 1 -194 179 4 879521329 0 -87 188 4 879875818 0 -43 169 5 875981128 0 -42 86 3 881107880 1 -77 173 5 884752689 1 -145 155 2 875272871 1 -44 109 3 878346431 0 -18 143 4 880131474 1 -151 125 4 879542939 1 -82 87 3 878769598 0 -59 151 5 888203053 1 -130 196 5 875801695 1 -130 95 5 875216867 1 -25 173 4 885852969 0 -1 191 5 875072956 1 -181 122 2 878963276 0 -8 11 3 879362233 0 -59 89 5 888204965 0 -15 125 5 879456049 1 -125 153 2 879454419 0 -59 69 5 888205087 0 -22 127 5 878887869 1 -44 133 4 878347569 0 -119 82 2 874781352 1 -151 44 4 879542413 0 -113 127 4 875935610 1 -192 100 5 881367706 1 -87 94 4 879876703 1 -69 172 5 882145548 1 -65 64 5 879216529 0 -186 159 5 879023723 0 -102 55 3 888801465 0 -177 181 4 880130931 0 -1 4 3 876893119 1 -42 71 4 881108229 0 -194 78 1 879535549 1 -77 28 5 884753061 1 -79 1 4 891271870 0 -45 111 4 881011550 1 -177 96 3 880130898 1 -187 83 5 879465274 1 -43 70 4 883955048 1 -193 121 3 889125913 0 -57 117 4 883697512 1 -95 195 5 879196231 0 -18 22 5 880130640 0 -51 181 5 883498655 0 -56 50 5 892737154 1 -165 174 4 879525961 1 -65 25 4 879217406 0 -13 69 4 884538766 0 -142 147 1 888640356 0 -18 72 3 880132252 1 -22 62 4 878887925 1 -64 132 4 889737851 1 -44 194 5 878347504 0 -188 194 3 875073329 0 -145 106 4 875270655 1 -110 54 4 886988202 1 -18 59 4 880132501 1 -6 170 4 883602574 1 -188 174 5 875072741 0 -108 7 5 879879812 1 -174 132 2 886439516 0 -82 125 3 877452380 1 -187 186 4 879465308 0 -18 52 5 880130680 1 -109 11 4 880572786 1 -85 64 5 879454046 1 -165 127 4 879525706 0 -144 174 5 888105612 0 -181 108 1 878963343 0 -178 51 4 882828021 1 -7 70 1 891352557 0 -165 187 3 879526046 1 -200 9 4 884126833 1 -82 112 1 877452357 1 -180 191 4 877372188 1 
-62 171 4 879373659 0 -56 114 4 892683248 0 -63 3 2 875748068 0 -90 171 2 891384476 1 -168 118 4 884288009 0 -177 183 4 880130972 1 -23 195 4 874786993 1 -148 191 1 877020715 0 -137 118 5 881433179 1 -175 88 4 877108146 1 -13 152 5 882141393 1 -32 7 4 883717766 1 -172 23 3 875537717 0 -144 12 4 888105419 1 -198 31 3 884207897 0 -193 24 2 889125880 1 -125 90 5 892838623 0 -13 62 5 882397833 0 -194 97 3 879524291 0 -75 79 5 884051893 1 -25 1 5 885853415 1 -13 53 1 882396955 1 -64 87 4 889737851 0 -76 156 3 882606108 0 -6 9 4 883599205 1 -56 161 4 892910890 1 -194 177 3 879523104 1 -106 162 5 881450758 0 -91 69 5 891439057 0 -158 56 5 880134296 0 -16 199 5 877719645 1 -11 176 3 891905783 1 -94 177 5 885870284 0 -183 144 3 891479783 0 -123 185 4 879873120 1 -65 111 4 879217375 1 -44 50 5 878341246 1 -7 71 5 891352692 0 -94 172 4 885870175 1 -18 48 4 880130515 0 -109 25 4 880571741 0 -87 49 5 879876564 0 -23 56 4 874785233 0 -198 181 4 884205050 0 -64 79 4 889737943 1 -13 38 3 882397974 0 -144 126 4 888104150 0 -82 50 5 876311146 1 -1 55 5 875072688 1 -11 121 3 891902745 1 -138 1 4 879023031 1 -154 200 5 879138832 0 -72 188 4 880037203 1 -59 182 5 888204877 1 -7 96 5 891351383 0 -94 67 3 891723296 0 -62 190 5 879374686 1 -123 14 5 879872540 1 -152 88 5 884035964 0 -1 42 5 876892425 1 -1 139 3 878543216 0 -194 79 3 879521088 0 -185 178 4 883524364 0 -144 56 4 888105387 1 -183 121 3 891463809 1 -41 96 4 890687019 1 -92 44 3 875906989 0 -145 118 3 875270764 1 -144 7 2 888104087 1 -58 137 5 884304430 1 -161 70 3 891171064 0 -162 179 3 877636794 1 -95 8 5 879198262 1 -120 127 4 889489772 0 -150 150 3 878746824 1 -97 69 5 884239616 1 -13 158 1 882142057 1 -84 4 3 883453713 0 -58 121 2 892242300 1 -178 194 4 882826306 0 -174 11 5 886439516 1 -119 28 5 874782022 1 -11 83 5 891904335 0 -99 172 5 885679952 0 -14 195 5 890881336 1 -90 156 4 891384147 0 -190 100 4 891033653 1 -19 8 5 885412723 0 -49 54 2 888068265 0 -119 194 5 874781257 0 -13 172 5 882140355 0 -159 130 1 880557322 
0 -11 28 5 891904241 1 -92 111 3 875641135 0 -49 4 2 888069512 0 -94 94 2 891723883 1 -7 179 5 891352303 0 -102 11 3 888801232 1 -44 157 4 878347711 1 -109 63 3 880582679 0 -80 79 4 887401407 0 -117 132 4 881012110 0 -49 62 2 888069660 0 -64 28 4 889737851 1 -82 170 4 878769703 1 -165 156 3 879525894 0 -15 121 3 879456168 0 -59 42 5 888204841 1 -184 29 3 889910326 1 -68 127 4 876973969 0 -185 47 4 883524249 1 -42 132 5 881107502 0 -190 7 4 891033653 0 -18 137 5 880132437 0 -200 172 5 884128554 0 -9 50 5 886960055 0 -59 30 5 888205787 1 -56 167 3 892911494 0 -117 184 3 881012601 0 -119 7 5 874775185 0 -94 195 3 885870231 1 -62 134 4 879373768 1 -180 53 5 877442125 1 -28 5 3 881961600 0 -178 161 5 882827645 1 -122 57 2 879270644 0 -158 107 3 880132960 1 -187 173 5 879465307 1 -89 86 5 879459859 1 -43 100 4 875975656 0 -194 168 5 879521254 1 -28 96 5 881957250 1 -84 79 4 883453520 1 -110 63 3 886989363 1 -62 172 5 879373794 0 -11 52 3 891904335 1 -1 7 4 875071561 1 -118 188 5 875384669 1 -97 172 4 884238939 1 -96 200 5 884403215 1 -116 145 2 876452980 1 -29 180 4 882821989 1 -23 59 4 874785526 1 -5 66 1 875721019 0 -194 186 5 879521088 1 -13 145 2 882397011 1 -59 60 5 888204965 0 -87 66 5 879876403 0 -115 79 4 881171273 0 -37 7 4 880915528 1 -185 50 4 883525998 0 -94 91 5 891722006 1 -41 152 4 890687326 1 -181 106 2 878963167 1 -177 127 5 880130667 1 -92 199 3 875811628 1 -178 12 5 882826162 0 -128 69 4 879966867 0 -160 11 4 876858091 1 -186 12 1 879023460 0 -13 127 5 881515411 1 -187 137 5 879464895 1 -64 174 5 889737478 1 -164 9 4 889402050 0 -41 180 5 890687019 1 -161 197 3 891171734 1 -7 39 5 891353614 1 -121 25 5 891390316 0 -7 125 4 891353192 0 -150 124 2 878746442 0 -92 39 3 875656419 1 -64 48 5 879365619 0 -43 127 4 875981304 0 -13 78 1 882399218 1 -148 174 5 877015066 1 -189 137 4 893264407 0 -28 31 4 881956082 1 -177 174 4 880130990 0 -18 88 3 880130890 0 -174 99 3 886515457 0 -7 132 5 891351287 1 -156 157 4 888185906 1 -71 52 4 877319567 0 -89 15 5 
879441307 0 -11 8 4 891904949 0 -200 176 5 884129627 0 -186 55 4 879023556 1 -77 121 2 884733261 0 -109 144 4 880572560 1 -127 50 4 884364866 1 -198 71 3 884208419 1 -73 197 5 888625934 0 -43 102 4 875981483 1 -46 93 4 883616218 0 -79 10 5 891271901 0 -13 184 1 882397011 1 -84 70 5 883452906 0 -194 125 2 879548026 0 -13 174 4 882139829 1 -189 157 4 893265865 1 -125 168 5 879454793 0 -90 192 4 891384959 0 -18 100 5 880130065 0 -189 8 5 893265710 1 -114 153 3 881309622 0 -119 124 4 874781994 0 -92 118 2 875640512 0 -7 31 4 892134959 0 -150 123 4 878746852 0 -119 64 4 874781460 0 -1 149 2 878542791 0 -70 91 3 884068138 0 -72 58 4 880036638 1 -76 182 4 882606392 1 -162 122 2 877636300 1 -121 50 5 891390014 0 -193 182 4 890860290 0 -23 154 3 874785552 0 -138 14 3 879022730 1 -25 121 4 885853030 1 -13 49 4 882399419 1 -97 173 3 884238728 0 -42 111 1 881105931 1 -116 56 5 886310197 0 -1 43 4 878542869 0 -121 12 5 891390014 1 -41 194 3 890687242 1 -43 4 4 875981421 1 -77 156 4 884733621 0 -81 116 3 876533504 1 -62 167 2 879376727 0 -154 187 5 879139096 1 -169 199 4 891359353 0 -128 186 5 879966895 0 -58 1 5 884304483 0 -130 123 4 875216112 1 -184 160 3 889911459 1 -63 14 4 875747401 1 -59 193 4 888204465 1 -44 87 5 878347742 1 -160 129 4 876768828 1 -1 165 5 874965518 1 -87 121 5 879875893 0 -23 89 5 874785582 1 -187 52 4 879465683 0 -137 96 5 881433654 0 -151 174 5 879524088 1 -109 151 5 880571661 1 -1 116 3 878542960 1 -174 65 5 886514123 1 -50 100 2 877052400 1 -13 175 4 882139717 0 -94 51 3 891721026 0 -119 31 5 874781779 1 -13 165 3 881515295 0 -85 141 3 879829042 1 -109 53 4 880583336 1 -1 198 5 878542717 1 -181 151 2 878962866 1 -152 33 5 882475924 1 -11 196 5 891904270 0 -145 98 5 875271896 0 -189 199 5 893265263 1 -83 79 5 887665423 0 -30 164 4 875060217 0 -25 133 3 885852381 1 -194 67 1 879549793 0 -62 22 4 879373820 1 -57 15 4 883697223 1 -57 50 5 883697105 1 -11 58 3 891904596 1 -87 174 5 879875736 0 -5 63 1 878844629 0 -23 116 5 874784466 1 -13 132 4 882140002 
1 -38 35 5 892433801 1 -58 174 4 884305271 0 -5 181 5 875635757 1 -18 32 2 880132129 0 -144 100 5 888104063 0 -7 69 5 891351728 0 -69 79 4 882145524 1 -22 50 5 878887765 1 -85 42 3 879453876 0 -62 72 3 879375762 0 -70 79 4 884149453 1 -77 199 5 884733988 0 -102 4 2 888801522 0 -18 8 5 880130802 0 -160 157 5 876858346 0 -42 141 3 881109059 0 -85 186 3 879454273 1 -84 100 4 883452155 0 -194 167 2 879549900 0 -1 124 5 875071484 1 -94 47 5 891720498 1 -148 133 5 877019251 0 -42 181 5 881107291 1 -1 95 4 875072303 0 -25 134 4 885852008 1 -10 180 5 877889333 1 -12 88 5 879960826 0 -59 24 4 888203579 1 -122 86 5 879270458 1 -11 88 3 891905003 1 -72 1 4 880035614 0 -154 185 5 879139002 0 -130 96 5 875216786 1 -57 195 3 883698431 0 -106 100 3 881449487 0 -58 134 5 884304766 0 -159 125 5 880557192 0 -162 55 3 877636713 0 -83 127 4 887665549 1 -144 58 3 888105548 0 -122 127 5 879270424 1 -109 175 1 880577734 1 -95 62 4 879196354 0 -45 181 4 881010742 0 -95 49 3 879198604 1 -68 181 5 876973884 0 -75 117 4 884050164 1 -72 198 5 880037881 0 -1 58 4 878542960 1 -148 189 4 877019698 0 -161 194 1 891171503 0 -95 73 4 879198161 0 -5 163 5 879197864 0 -18 172 3 880130551 1 -158 22 5 880134333 0 -59 68 2 888205228 0 -60 133 4 883326893 1 -121 172 5 891388090 0 -13 187 5 882140205 0 -1 142 2 878543238 1 -13 143 1 882140205 0 -43 144 4 883955415 0 -10 70 4 877891747 0 -188 11 5 875071520 1 -8 55 5 879362286 0 -77 192 3 884752900 1 -178 147 4 886678902 0 -108 1 4 879879720 0 -71 168 5 885016641 1 -130 77 5 880396792 1 -160 55 4 876858091 0 -178 100 4 882823758 0 -142 42 4 888640489 1 -102 153 2 892991376 1 -14 186 4 879119497 0 -85 9 4 879456308 0 -7 52 4 891353801 1 -42 174 5 881106711 0 -71 153 4 885016495 0 -60 175 5 883326919 1 -44 172 4 878348521 1 -182 1 4 885613092 1 -7 11 3 891352451 0 -181 130 1 878963241 1 -42 73 4 881108484 1 -97 193 4 884238997 0 -186 177 4 891719775 1 -7 197 4 891351082 1 -49 147 1 888069416 1 -192 9 5 881367527 0 -132 100 4 891278744 0 -18 174 4 880130613 0 
-115 185 5 881171409 0 -115 192 5 881171137 0 -158 195 5 880134398 0 -189 179 5 893265478 0 -7 144 5 891351201 1 -110 29 3 886988374 0 -145 77 3 875272348 0 -95 110 2 880572323 1 -71 98 4 885016536 1 -25 79 4 885852757 0 -21 15 4 874951188 0 -177 144 5 880131011 1 -72 197 5 880037702 0 -90 69 1 891383424 0 -123 187 4 879809943 1 -144 72 4 888105338 1 -130 88 2 875217265 0 -9 7 4 886960030 0 -73 96 2 888626523 0 -189 28 4 893266298 1 -94 188 4 885870665 1 -94 159 3 891723081 1 -1 126 2 875071713 1 -1 83 3 875072370 0 -10 23 5 877886911 1 -11 173 5 891904920 1 -96 196 4 884403057 1 -160 59 4 876858346 1 -188 50 4 875072741 0 -43 73 4 883956099 1 -92 63 3 875907504 0 -180 40 4 877127296 1 -13 176 3 882140455 0 -23 181 4 874784337 0 -161 177 2 891171848 0 -198 89 5 884208623 1 -73 183 4 888626262 1 -142 91 5 888640404 0 -184 192 4 889908843 0 -42 168 3 881107773 1 -94 86 5 891720971 1 -44 22 4 878347942 0 -109 22 4 880572950 1 -59 81 4 888205336 1 -137 79 5 881433689 1 -21 127 5 874951188 0 -124 1 3 890287733 1 -92 69 5 875653198 0 -200 22 4 884128372 0 -87 134 4 879877740 1 -119 196 5 886177162 1 -99 98 5 885679596 0 -92 147 2 875640542 0 -178 133 4 885784518 1 -181 120 1 878963204 0 -114 135 4 881260611 0 -73 129 4 888625907 1 -28 196 4 881956081 1 -123 134 4 879872275 1 -82 118 3 878768510 0 -1 3 4 878542960 1 -106 9 4 883876572 0 -87 152 4 879876564 1 -5 200 2 875720717 0 -90 60 4 891385039 1 -83 151 3 880306745 1 -167 86 4 892738212 1 -167 137 5 892738081 1 -49 99 4 888067031 1 -41 173 4 890687549 1 -178 69 5 882826437 1 -59 116 4 888203018 0 -65 66 3 879217972 1 -128 117 5 879967631 1 -7 12 5 892135346 1 -168 181 4 884287298 1 -181 107 1 878963343 1 -66 9 4 883601265 1 -64 10 5 889739733 0 -18 15 4 880131054 0 -63 137 4 875747368 0 -174 87 5 886514089 0 -94 71 4 891721642 0 -174 167 3 886514953 1 -198 137 4 884205252 1 -55 174 4 878176397 1 -62 116 3 879372480 0 -87 194 5 879876403 0 -64 172 4 889739091 0 -125 66 5 879455184 1 -30 135 5 885941156 0 -130 144 5 
875216717 0 -104 121 2 888466002 0 -175 136 4 877108051 1 -197 50 5 891409839 1 -10 153 4 877886722 0 -13 60 4 884538767 1 -58 191 5 892791893 0 -5 105 3 875635443 1 -110 31 3 886989057 0 -57 168 3 883698362 0 -42 2 5 881109271 1 -144 198 4 888105287 0 -151 143 5 879524878 1 -89 25 5 879441637 0 -135 38 3 879858003 0 -109 56 5 880577804 1 -18 50 4 880130155 0 -189 14 5 893263994 0 -32 50 4 883717521 1 -177 98 5 880131026 1 -38 185 2 892432573 1 -20 22 5 879669339 1 -128 28 5 879966785 1 -24 7 4 875323676 0 -56 193 5 892678669 0 -151 124 5 879524491 1 -194 136 5 879521167 1 -130 17 5 875217096 1 -92 149 3 886443494 0 -16 135 4 877720916 1 -20 50 3 879667937 1 -1 19 5 875071515 1 -159 118 4 880557464 0 -62 76 4 879374045 1 -95 52 4 879198800 0 -18 142 4 880131173 1 -119 172 4 874782191 0 -81 79 5 876534817 0 -158 83 5 880134913 0 -49 200 3 888067358 0 -59 90 2 888206363 0 -58 56 5 884305369 0 -177 156 5 880130931 0 -59 73 4 888206254 1 -18 187 5 880130393 1 -102 2 2 888801522 0 -102 174 4 888801360 1 -125 95 5 879454628 0 -90 137 5 891384754 1 -125 85 3 892838424 0 -145 64 4 882181785 1 -13 28 5 882398814 0 -10 85 4 877892438 1 -63 10 4 875748004 0 -91 183 5 891438909 1 -145 9 2 875270394 1 -44 175 4 878347972 0 -16 127 5 877719206 1 -92 40 3 875656164 1 -49 174 1 888067691 1 -92 155 2 875654888 0 -44 173 5 878348725 1 -174 143 5 886515457 1 -1 29 1 878542869 1 -151 135 5 879524471 0 -21 9 5 874951188 1 -62 7 4 879372277 1 -92 25 3 875640072 0 -94 127 5 885870175 1 -156 9 4 888185735 1 -73 188 5 888625553 0 -25 125 5 885852817 1 -6 111 2 883599478 1 -198 128 3 884209451 1 -99 174 5 885679705 0 -65 77 5 879217689 0 -44 151 4 878341370 0 -7 50 5 891351042 0 -85 172 4 882813285 1 -77 98 4 884752901 0 -176 181 3 886047879 1 -25 7 4 885853155 0 -116 124 3 876453733 0 -175 111 4 877108015 1 -42 136 4 881107329 0 -6 182 4 883268776 0 -10 40 4 877892438 0 -195 135 5 875771440 1 -115 83 3 881172183 0 -76 24 2 882607536 1 -62 117 4 879372563 0 -167 184 1 892738278 1 -1 18 4 
887432020 1 -196 110 1 881252305 1 -94 134 5 886008885 1 -138 147 4 879023779 1 -1 59 5 876892817 1 -193 159 4 889124191 1 -198 151 4 884206401 0 -1 15 5 875071608 1 -57 1 5 883698581 0 -1 111 5 889751711 1 -1 52 4 875072205 0 -144 137 4 888104150 0 -125 67 5 892838865 1 -106 70 3 881452355 0 -145 96 5 882181728 1 -18 28 3 880129527 1 -189 170 4 893265380 1 -32 181 4 883717628 0 -18 56 5 880129454 0 -95 194 5 879197603 1 -198 96 4 884208326 1 -10 12 5 877886911 0 -30 69 5 885941156 1 -1 88 4 878542791 1 -182 15 4 885612967 1 -119 93 4 874775262 1 -109 28 3 880572721 1 -184 197 4 889908873 1 -70 1 4 884065277 0 -41 156 4 890687304 0 -92 169 5 875653121 0 -38 162 5 892431727 0 -6 8 4 883600657 1 -160 9 3 876767023 1 -18 83 5 880129877 0 -10 179 5 877889004 1 -186 77 5 879023694 0 -156 77 2 888185906 0 -120 118 2 889490979 0 -7 86 4 891350810 1 -145 11 5 875273120 0 -178 174 5 882826719 1 -114 200 3 881260409 1 -22 174 5 878887765 0 -177 42 4 880130972 1 -1 13 5 875071805 0 -16 33 2 877722001 1 -90 135 5 891384570 1 -12 69 5 879958902 1 -72 106 4 880036185 1 -44 190 5 878348000 1 -116 127 5 876454257 1 -12 127 4 879959488 0 -183 50 2 891467546 1 -114 172 5 881259495 0 -25 177 3 885852488 0 -162 28 4 877636746 1 -144 183 4 888105140 0 -60 141 3 883327472 0 -43 143 4 883955247 0 -159 15 5 880485972 1 -7 164 5 891351813 1 -174 98 5 886452583 1 -92 108 2 886443416 0 -189 185 5 893265428 0 -115 100 5 881171982 0 -121 11 2 891387992 0 -180 181 2 877125956 1 -44 181 4 878341290 1 -48 193 2 879434751 0 -151 173 5 879524130 1 -151 28 4 879524199 0 -190 24 3 891033773 1 -194 199 4 879521329 1 -102 1 3 883748352 1 -89 173 5 879459859 0 -148 173 5 877017054 1 -13 9 3 882140205 0 -158 70 4 880135118 1 -175 98 5 877107390 1 -59 143 1 888204641 1 -95 50 5 879197329 0 -45 24 3 881014550 0 -41 50 5 890687066 0 -109 50 5 880563331 0 -91 79 5 891439018 0 -85 162 2 879454235 1 -156 100 4 888185677 1 -65 194 4 879217881 0 -75 129 3 884049939 0 -1 28 4 875072173 1 -59 53 5 888206161 0 -117 
164 5 881011727 0 -25 82 4 885852150 1 -178 173 5 882826306 0 -121 1 4 891388475 0 -125 82 5 879454386 1 -161 118 2 891172421 1 -110 67 3 886989566 1 -77 191 3 884752948 1 -195 109 3 878019342 1 -11 107 4 891903276 0 -106 82 3 881453290 1 -1 172 5 874965478 1 -13 135 5 882139541 0 -24 97 4 875323193 0 -18 133 5 880130713 0 -72 23 4 880036550 1 -23 176 3 874785843 1 -87 56 4 879876524 1 -44 31 4 878348998 1 -198 81 5 884208326 0 -13 8 4 882140001 1 -83 50 3 880327590 1 -118 100 5 875384751 1 -60 15 4 883328033 1 -118 5 2 875385256 0 -82 134 4 878769442 0 -154 152 4 879138832 1 -118 179 5 875384612 1 -200 139 3 884130540 0 -177 187 4 880131040 1 -59 28 5 888204841 0 -67 117 5 875379794 1 -62 191 5 879373613 0 -77 134 4 884752562 1 -145 49 3 875272926 1 -72 81 3 880036876 1 -158 4 4 880134477 0 -186 147 4 891719774 1 -130 7 5 874953557 0 -192 111 2 881368222 1 -87 128 3 879876037 1 -63 181 3 875747556 1 -58 200 3 884305295 0 -190 9 1 891033725 0 -58 7 5 884304656 1 -13 116 5 882140455 0 -114 171 4 881309511 0 -7 173 5 891351002 1 -49 12 4 888068057 0 -1 122 3 875241498 1 -175 187 4 877107338 0 -148 164 4 877398444 0 -77 183 5 884732606 0 -13 141 2 890705034 1 -13 182 5 882139347 1 -53 15 5 879443027 0 -24 58 3 875323745 0 -20 82 4 879669697 1 -63 121 1 875748139 1 -93 118 3 888705416 0 -42 87 4 881107576 1 -41 191 4 890687473 0 -93 14 4 888705200 0 -144 59 4 888105197 0 -58 168 5 891611548 0 -85 196 4 879454952 1 -14 25 2 876965165 1 -85 161 4 882819528 1 -62 15 2 879372634 1 -122 197 5 879270482 0 -144 170 4 888105364 0 -104 9 2 888465201 1 -94 182 5 885873089 1 -128 180 5 879967174 0 -59 129 5 888202941 1 -115 117 4 881171009 1 -135 5 3 879857868 1 -142 134 5 888640356 0 -178 118 4 882824291 0 -106 59 4 881453318 1 -71 100 4 877319197 1 -27 123 5 891543191 0 -38 195 1 892429952 0 -30 29 3 875106638 1 -18 116 5 880131358 1 -154 172 4 879138783 1 -120 50 4 889489973 1 -52 191 5 882923031 1 -189 186 2 893266027 1 -64 197 3 889737506 0 -23 173 5 874787587 0 -159 9 3 
880485766 1 -54 148 3 880937490 1 -90 23 5 891384997 1 -151 73 4 879528909 0 -76 96 5 875312034 1 -198 93 3 884205346 0 -103 56 5 880416602 1 -77 42 5 884752948 1 -130 117 5 874953895 1 -56 28 5 892678669 0 -94 151 5 891721716 1 -59 86 3 888205145 1 -25 86 4 885852248 1 -103 98 3 880420565 1 -11 11 2 891904271 1 -49 121 1 888068100 1 -44 97 2 878348000 0 -16 66 4 877719075 0 -1 152 5 878542589 1 -177 160 4 880131011 0 -41 135 4 890687473 1 -21 53 4 874951820 1 -158 7 5 880132744 0 -56 66 3 892911110 0 -184 95 4 889908801 0 -188 187 3 875072211 1 -85 181 4 882813312 1 -37 62 5 880916070 0 -44 183 4 883613372 1 -65 9 5 879217138 0 -145 53 2 875272245 0 -24 151 5 875322848 0 -23 73 3 874787016 0 -62 151 5 879372651 0 -13 188 4 882140130 1 -87 180 4 879875649 1 -59 4 4 888205188 0 -10 93 4 877892160 0 -20 15 4 879667937 0 -21 1 5 874951244 1 -44 198 4 878348947 0 -18 127 5 880129668 1 -189 59 3 893265191 0 -71 6 3 880864124 0 -7 198 3 891351685 0 -188 176 4 875072876 0 -52 7 5 882922204 1 -57 144 3 883698408 1 -55 7 3 878176047 0 -70 172 5 884064217 1 -59 91 4 888205265 1 -49 2 1 888069606 1 -60 135 5 883327087 1 -7 152 4 891351851 0 -82 22 3 878769777 0 -13 166 5 884538663 0 -49 154 5 888068715 1 -158 125 3 880132745 0 -42 103 3 881106162 1 -14 19 5 880929651 0 -92 13 4 886443292 1 -141 181 4 884584709 0 -22 68 4 878887925 0 -83 88 5 880308186 1 -178 111 4 882823905 0 -145 59 1 882181695 1 -62 64 4 879373638 0 -70 94 3 884151014 0 -80 64 5 887401475 0 -192 118 2 881367932 0 -145 97 5 875272652 1 -37 121 2 880915528 0 -153 187 2 881371198 1 -145 121 2 875270507 1 -1 94 2 875072956 1 -16 39 5 877720118 0 -189 180 5 893265741 1 -65 178 5 879217689 0 -174 168 1 886434621 0 -90 196 4 891385250 0 -26 122 1 891380200 1 -150 14 4 878746889 1 -148 194 5 877015066 0 -151 190 4 879528673 1 -102 154 3 888803708 0 -31 192 4 881548054 1 -174 88 5 886513752 1 -89 107 5 879441780 1 -122 28 4 879270084 0 -160 127 5 876770168 0 -148 127 1 877399351 1 -57 121 4 883697432 1 -92 65 4 
875653960 1 -10 9 4 877889005 0 -109 180 3 880581127 0 -64 184 4 889739243 0 -43 123 1 875975520 1 -25 176 4 885852862 0 -98 194 5 880498898 1 -10 198 3 877889005 0 -5 174 5 875636130 0 -102 7 2 888801407 1 -102 172 3 888801232 1 -130 93 5 874953665 0 +user_id:token item_id:token rating:float timestamp:float +6 86 3 883603013 +38 95 5 892430094 +97 194 3 884238860 +7 32 4 891350932 +10 16 4 877888877 +99 4 5 886519097 +25 181 5 885853415 +59 196 5 888205088 +115 20 3 881171009 +138 26 5 879024232 +194 165 4 879546723 +11 111 4 891903862 +162 25 4 877635573 +135 23 4 879857765 +160 174 5 876860807 +42 96 5 881107178 +168 151 5 884288058 +58 144 4 884304936 +62 21 3 879373460 +44 195 5 878347874 +72 195 5 880037702 +82 135 3 878769629 +59 23 5 888205300 +43 14 2 883955745 +160 135 4 876860807 +90 98 5 891383204 +68 117 4 876973939 +172 177 4 875537965 +19 4 4 885412840 +5 2 3 875636053 +43 137 4 875975656 +99 79 4 885680138 +13 98 4 881515011 +1 61 4 878542420 +72 48 4 880036718 +92 77 3 875654637 +194 181 3 879521396 +151 10 5 879524921 +6 14 5 883599249 +54 106 3 880937882 +62 65 4 879374686 +92 172 4 875653271 +14 98 3 890881335 +194 54 3 879525876 +38 153 5 892430369 +193 96 1 889124507 +158 177 4 880134407 +181 3 2 878963441 +13 198 3 881515193 +1 189 3 888732928 +16 64 5 877720297 +95 135 3 879197562 +145 15 2 875270655 +187 64 5 879465631 +184 153 3 889911285 +1 33 4 878542699 +1 160 4 875072547 +82 183 3 878769848 +13 56 5 881515011 +18 26 4 880129731 +144 89 3 888105691 +200 96 5 884129409 +16 197 5 877726146 +142 169 5 888640356 +87 40 3 879876917 +10 175 3 877888677 +197 96 5 891409839 +194 66 3 879527264 +104 117 2 888465972 +7 163 4 891353444 +13 186 4 890704999 +83 78 2 880309089 +151 197 5 879528710 +5 17 4 875636198 +125 163 5 879454956 +23 196 2 874786926 +128 15 4 879968827 +60 60 5 883327734 +99 111 1 885678886 +65 47 2 879216672 +137 144 5 881433689 +1 20 4 887431883 +96 156 4 884402860 +72 182 5 880036515 +187 135 4 879465653 +184 187 4 889909024 
+92 168 4 875653723 +72 54 3 880036854 +117 150 4 880125101 +94 184 2 891720862 +130 109 3 874953794 +151 176 2 879524293 +45 25 4 881014015 +131 126 4 883681514 +109 8 3 880572642 +198 58 3 884208173 +157 25 3 886890787 +56 121 5 892679480 +62 12 4 879373613 +10 7 4 877892210 +6 98 5 883600680 +118 200 5 875384647 +10 100 5 877891747 +189 56 5 893265263 +56 71 4 892683275 +185 23 4 883524249 +109 127 2 880563471 +18 86 4 880129731 +22 128 5 878887983 +8 22 5 879362183 +1 171 5 889751711 +181 121 4 878962623 +200 11 5 884129542 +90 25 5 891384789 +22 80 4 878887227 +15 25 3 879456204 +16 55 5 877717956 +189 20 5 893264466 +125 80 4 892838865 +43 120 4 884029430 +42 44 3 881108548 +102 70 3 888803537 +77 172 3 884752562 +62 68 1 879374969 +85 51 2 879454782 +87 82 5 879875774 +194 172 3 879521474 +94 62 3 891722933 +108 100 4 879879720 +90 22 4 891384357 +92 121 5 875640679 +194 23 4 879522819 +188 143 5 875072674 +161 48 1 891170745 +59 92 5 888204997 +21 129 4 874951382 +58 9 4 884304328 +194 152 3 879549996 +7 200 5 891353543 +113 126 5 875076827 +16 194 5 877720733 +79 50 4 891271545 +125 190 5 892836309 +150 181 5 878746685 +5 110 1 875636493 +1 155 2 878542201 +24 64 5 875322758 +82 56 3 878769410 +56 91 4 892683275 +16 8 5 877722736 +145 56 5 875271896 +17 13 3 885272654 +148 1 4 877019411 +21 164 5 874951695 +1 117 3 874965739 +60 162 4 883327734 +6 69 3 883601277 +110 38 3 886988574 +13 72 4 882141727 +194 77 3 879527421 +109 178 3 880572950 +62 182 5 879375169 +65 125 4 879217509 +90 12 5 891383241 +130 105 4 876251160 +96 87 4 884403531 +84 121 4 883452307 +198 118 2 884206513 +26 125 4 891371676 +151 13 3 879542688 +24 191 5 875323003 +13 181 5 882140354 +2 50 5 888552084 +144 125 4 888104191 +57 79 5 883698495 +121 180 3 891388286 +62 86 2 879374640 +194 187 4 879520813 +109 97 3 880578711 +8 50 5 879362124 +186 148 4 891719774 +175 127 5 877107640 +153 174 1 881371140 +62 59 4 879373821 +83 97 4 880308690 +63 100 5 875747319 +16 178 5 877719333 +85 25 
2 879452769 +42 98 4 881106711 +184 98 4 889908539 +72 196 4 880036747 +128 182 4 879967225 +7 171 3 891351287 +181 14 1 878962392 +158 128 2 880134296 +1 47 4 875072125 +95 68 4 879196231 +6 23 4 883601365 +66 181 5 883601425 +76 61 4 875028123 +13 147 3 882397502 +16 89 2 877717833 +94 155 2 891723807 +136 89 4 882848925 +82 194 4 878770027 +178 199 4 882826306 +185 114 4 883524320 +94 24 4 885873423 +83 43 4 880308690 +59 177 4 888204349 +161 168 1 891171174 +43 40 3 883956468 +49 68 1 888069513 +44 15 4 878341343 +190 117 4 891033697 +29 189 4 882821942 +94 174 4 885870231 +117 181 5 880124648 +194 191 4 879521856 +158 24 4 880134261 +188 96 5 875073128 +58 173 5 884305353 +151 12 5 879524368 +14 174 5 890881294 +66 1 3 883601324 +5 1 4 875635748 +160 160 5 876862078 +109 1 4 880563619 +152 111 5 880148782 +194 160 2 879551380 +77 91 3 884752924 +181 1 3 878962392 +18 182 4 880130640 +87 177 5 879875940 +177 69 1 880131088 +125 134 5 879454532 +59 77 4 888206254 +38 161 5 892432062 +121 14 5 891390014 +117 15 5 880125887 +85 187 5 879454235 +59 54 4 888205921 +13 195 3 881515296 +144 153 5 888105823 +1 113 5 878542738 +76 175 4 875028853 +121 117 1 891388600 +85 13 3 879452866 +184 191 4 889908716 +13 121 5 882397503 +43 5 4 875981421 +11 38 3 891905936 +37 117 4 880915674 +70 82 4 884068075 +5 98 3 875720691 +56 184 4 892679088 +45 109 5 881012356 +65 100 3 879217558 +184 86 5 889908694 +72 28 4 880036824 +115 8 5 881171982 +95 1 5 879197329 +151 58 4 879524849 +45 118 4 881014550 +145 22 5 875273021 +71 89 5 880864462 +182 69 5 876435435 +64 160 4 889739288 +28 79 4 881961003 +18 113 5 880129628 +83 82 5 887665423 +87 196 5 879877681 +150 129 4 878746946 +161 98 4 891171357 +51 182 3 883498790 +92 176 5 875652981 +92 180 5 875653016 +90 187 4 891383561 +66 7 3 883601355 +144 182 3 888105743 +85 83 4 886282959 +197 55 3 891409982 +25 25 5 885853415 +103 24 4 880415847 +87 9 4 879877931 +49 47 5 888068715 +44 95 4 878347569 +135 39 3 879857931 +13 66 3 
882141485 +184 161 2 889909640 +142 82 4 888640356 +99 50 5 885679998 +16 56 5 877719863 +62 132 5 879375022 +13 59 4 882140425 +102 161 2 888801876 +56 172 5 892737191 +65 196 5 879216637 +92 115 3 875654125 +32 151 3 883717850 +180 68 5 877127721 +184 36 3 889910195 +73 94 1 888625754 +198 7 4 884205317 +189 197 5 893265291 +73 56 4 888626041 +5 102 3 875721196 +13 150 5 882140588 +104 7 3 888465972 +42 176 3 881107178 +92 15 3 875640189 +79 100 5 891271652 +1 17 3 875073198 +7 81 5 891352626 +59 148 3 888203175 +82 14 4 876311280 +195 154 3 888737525 +92 81 3 875654929 +94 58 5 891720540 +117 151 4 880126373 +91 28 4 891439243 +64 176 4 889737567 +62 111 3 879372670 +95 172 4 879196847 +148 140 1 877019882 +185 199 4 883526268 +174 80 1 886515210 +42 195 5 881107949 +81 169 4 876534751 +62 114 4 879373568 +49 7 4 888067307 +58 100 5 884304553 +160 56 5 876770222 +103 127 4 880416331 +11 110 3 891905324 +87 2 4 879876074 +161 162 2 891171413 +23 172 4 874785889 +7 151 4 891352749 +84 12 5 883452874 +94 168 5 891721378 +144 106 3 888104684 +103 121 3 880415766 +200 24 2 884127370 +160 117 4 876767822 +158 72 3 880135118 +92 24 3 875640448 +164 117 5 889401816 +21 103 1 874951245 +1 90 4 878542300 +49 38 1 888068289 +151 89 5 879524491 +198 100 1 884207325 +194 4 4 879521397 +177 56 5 880130618 +57 28 4 883698324 +159 127 5 880989744 +16 155 3 877719157 +21 98 5 874951657 +77 195 5 884733695 +108 50 4 879879739 +184 181 4 889907426 +28 95 3 881956917 +181 16 1 878962996 +97 89 5 884238939 +109 101 1 880578186 +148 114 5 877016735 +94 9 5 885872684 +106 107 4 883876961 +67 64 5 875379211 +184 155 3 889912656 +68 7 3 876974096 +13 14 4 884538727 +71 134 3 885016614 +198 135 5 884208061 +98 47 4 880498898 +53 24 3 879442538 +7 106 4 891353892 +63 20 3 875748004 +42 185 4 881107449 +148 70 5 877021271 +184 71 4 889911552 +158 190 5 880134332 +83 118 3 880307071 +116 7 2 876453915 +52 95 4 882922927 +160 187 5 876770168 +26 25 3 891373727 +99 181 5 885680138 +56 196 2 
892678628 +43 151 4 875975613 +62 24 4 879372633 +194 82 2 879524216 +42 69 4 881107375 +125 152 1 879454892 +63 50 4 875747292 +7 127 5 891351728 +6 143 2 883601053 +5 62 4 875637575 +184 100 5 889907652 +1 64 5 875072404 +142 181 5 888640317 +69 174 5 882145548 +49 17 2 888068651 +7 196 5 891351432 +175 96 3 877108051 +44 120 4 878346977 +83 139 3 880308959 +43 52 4 883955224 +174 160 5 886514377 +94 89 3 885870284 +7 44 5 891351728 +158 85 4 880135118 +196 67 5 881252017 +99 182 4 886518810 +175 71 4 877107942 +11 190 3 891904174 +162 181 4 877635798 +59 70 3 888204758 +131 100 5 883681418 +22 79 4 878887765 +115 127 5 881171760 +178 73 5 882827985 +56 69 4 892678893 +13 144 4 882397146 +15 127 2 879455505 +37 55 3 880915942 +16 191 5 877719454 +97 98 4 884238728 +58 109 4 884304396 +189 1 5 893264174 +67 147 3 875379357 +81 3 4 876592546 +151 186 4 879524222 +53 174 5 879442561 +123 135 5 879872868 +151 15 4 879524879 +59 12 5 888204260 +59 170 4 888204430 +92 106 3 875640609 +97 50 5 884239471 +150 121 2 878747322 +23 170 4 874785348 +13 97 4 882399357 +28 98 5 881961531 +28 173 3 881956220 +38 139 2 892432786 +44 123 4 878346532 +18 154 4 880131358 +7 28 5 891352341 +115 92 4 881172049 +62 138 1 879376709 +41 28 4 890687353 +117 50 5 880126022 +178 106 2 882824983 +198 179 4 884209264 +99 168 5 885680374 +109 31 4 880577844 +43 64 5 875981247 +89 197 5 879459859 +7 153 5 891352220 +70 50 4 884064188 +43 66 4 875981506 +60 47 4 883326399 +92 79 4 875653198 +97 115 5 884239525 +123 192 5 879873119 +49 49 2 888068990 +21 184 4 874951797 +145 183 5 875272009 +76 92 4 882606108 +48 174 5 879434723 +5 24 4 879198229 +64 93 2 889739025 +96 153 4 884403624 +150 100 2 878746636 +93 15 5 888705388 +13 167 4 882141659 +18 58 4 880130613 +145 13 5 875270507 +145 1 3 882181396 +7 188 5 891352778 +109 100 4 880563080 +7 78 3 891354165 +82 73 4 878769888 +145 50 5 885557660 +85 175 4 879828912 +124 50 3 890287508 +151 162 5 879528779 +187 116 5 879464978 +69 12 5 882145567 
+85 133 4 879453876 +114 175 5 881259955 +42 121 4 881110578 +94 186 4 891722278 +85 98 4 879453716 +116 185 3 876453519 +123 13 3 879873988 +95 174 5 879196231 +178 148 4 882824325 +138 121 4 879023558 +30 82 4 875060217 +69 175 3 882145586 +16 144 5 877721142 +128 140 4 879968308 +95 128 3 879196354 +124 11 5 890287645 +7 133 5 891353192 +28 7 5 881961531 +7 93 5 891351042 +154 175 5 879138784 +44 56 2 878348601 +130 161 4 875802058 +98 163 3 880499053 +128 79 4 879967692 +195 186 3 888737240 +189 91 3 893265684 +95 143 4 880571951 +94 157 5 891725332 +7 174 5 891350757 +177 79 4 880130758 +77 168 4 884752721 +144 31 3 888105823 +94 33 3 891721919 +178 125 4 882824431 +138 151 4 879023389 +189 30 4 893266205 +198 24 2 884205385 +125 173 5 879454100 +128 143 5 879967300 +65 56 3 879217816 +144 91 2 888106106 +197 176 5 891409798 +26 15 4 891386369 +7 182 4 891350965 +109 154 2 880578121 +161 174 2 891170800 +109 89 4 880573263 +195 181 5 875771440 +7 193 5 892135346 +77 125 3 884733014 +85 58 4 879829689 +1 92 3 876892425 +90 31 4 891384673 +158 1 4 880132443 +42 143 4 881108229 +43 26 5 883954901 +130 200 5 875217392 +68 118 2 876974248 +102 118 3 888801465 +189 120 1 893264954 +20 11 2 879669401 +20 176 2 879669152 +49 148 1 888068195 +160 3 3 876770124 +152 147 3 880149045 +162 121 4 877636000 +178 121 5 882824291 +76 135 5 875028792 +75 121 4 884050450 +44 174 5 878347662 +145 172 5 882181632 +188 191 3 875073128 +37 183 4 880930042 +125 150 1 879454892 +56 194 5 892676908 +16 92 4 877721905 +60 79 4 883326620 +1 121 4 875071823 +62 4 4 879374640 +26 7 3 891350826 +121 86 5 891388286 +198 180 3 884207298 +1 114 5 875072173 +180 79 3 877442037 +67 1 3 875379445 +1 132 4 878542889 +1 74 1 889751736 +22 173 5 878886368 +1 134 4 875073067 +94 45 5 886008764 +6 180 4 883601311 +188 88 4 875075300 +137 55 5 881433689 +91 172 4 891439208 +150 13 4 878746889 +151 25 4 879528496 +181 123 2 878963276 +194 196 3 879524007 +109 5 3 880580637 +16 168 4 877721142 +74 9 4 
888333458 +144 66 4 888106078 +195 14 4 890985390 +18 199 3 880129769 +174 41 1 886515063 +109 159 4 880578121 +56 68 3 892910913 +109 195 5 880578038 +183 96 3 891463617 +178 131 4 882827947 +119 54 4 886176814 +1 98 4 875072404 +64 187 5 889737395 +82 15 3 876311365 +1 186 4 875073128 +181 20 1 878962919 +87 135 5 879875649 +87 157 3 879877799 +87 163 4 879877083 +96 91 5 884403250 +24 153 4 875323368 +43 114 5 883954950 +42 48 5 881107821 +125 97 3 879454385 +108 13 3 879879834 +144 62 2 888105902 +148 172 5 877016513 +188 159 3 875074589 +44 88 2 878348885 +190 147 4 891033863 +185 127 5 883525183 +150 1 4 878746441 +60 179 4 883326566 +75 147 3 884050134 +59 121 4 888203313 +7 22 5 891351121 +85 53 3 882995643 +95 176 3 879196298 +144 64 5 888105140 +56 29 3 892910913 +200 72 4 884129542 +130 56 5 875216283 +49 102 2 888067164 +177 89 5 880131088 +42 102 5 881108873 +180 67 1 877127591 +23 183 3 874785728 +65 97 5 879216605 +92 134 4 875656623 +152 25 3 880149045 +62 28 3 879375169 +64 77 3 889737420 +15 20 3 879455541 +14 22 3 890881521 +62 157 3 879374686 +59 13 5 888203415 +73 12 5 888624976 +6 95 2 883602133 +87 70 5 879876448 +1 84 4 875072923 +22 186 5 878886368 +72 129 4 880035588 +1 31 3 875072144 +22 96 5 878887680 +85 97 2 879829667 +181 7 4 878963037 +94 180 5 885870284 +16 70 4 877720118 +58 45 5 884305295 +151 191 3 879524326 +158 38 4 880134607 +181 124 1 878962550 +145 182 5 885622510 +44 11 3 878347915 +49 10 3 888066086 +17 151 4 885272751 +59 47 5 888205574 +14 111 3 876965165 +195 100 5 875771440 +130 172 5 875801530 +177 124 3 880130881 +1 70 3 875072895 +13 178 4 882139829 +30 181 4 875060217 +8 182 5 879362183 +7 162 5 891353444 +56 63 3 892910268 +92 175 4 875653549 +18 196 3 880131297 +158 79 4 880134332 +87 67 4 879877007 +90 11 4 891384113 +1 60 5 875072370 +119 154 5 874782022 +83 186 4 880308601 +1 177 5 876892701 +59 10 4 888203234 +10 48 4 877889058 +99 124 2 885678886 +152 132 5 882475496 +189 45 3 893265657 +91 193 3 891439057 
+14 56 5 879119579 +13 42 4 882141393 +159 111 4 880556981 +137 195 5 881433689 +152 97 5 882475618 +63 150 4 875747292 +200 103 2 891825521 +13 94 3 882142057 +14 93 3 879119311 +38 122 1 892434801 +148 177 2 877020715 +184 47 4 889909640 +145 25 2 875270655 +59 132 5 888205744 +1 27 2 876892946 +104 122 3 888465739 +60 178 5 883326399 +200 191 5 884128554 +148 185 1 877398385 +13 180 5 882141248 +25 174 5 885853415 +157 150 5 874813703 +106 69 4 881449886 +80 50 3 887401533 +56 174 5 892737191 +82 69 4 878769948 +83 95 4 880308453 +17 9 3 885272558 +82 147 3 876311473 +62 135 4 879375080 +5 167 2 875636281 +118 174 5 875385007 +13 29 2 882397833 +125 158 4 892839066 +43 15 5 875975546 +193 195 1 889124507 +117 1 4 880126083 +103 117 4 880416313 +104 100 4 888465166 +95 96 4 879196298 +49 1 2 888068651 +1 145 2 875073067 +1 174 5 875073198 +10 124 5 877888545 +81 118 2 876533764 +136 117 4 882694498 +115 11 4 881171348 +64 2 3 889737609 +28 50 4 881957090 +1 159 3 875073180 +60 172 4 883326339 +18 69 3 880129527 +184 132 5 889913687 +151 169 5 879524268 +110 79 4 886988480 +128 111 3 879969215 +1 82 5 878542589 +13 45 3 882139863 +94 185 5 885873684 +128 83 5 879967691 +142 189 4 888640317 +1 56 4 875072716 +184 14 4 889907738 +198 156 3 884207058 +194 153 3 879546723 +136 14 5 882693338 +73 127 5 888625200 +116 187 5 886310197 +28 12 4 881956853 +85 86 4 879454189 +151 7 4 879524610 +1 80 4 876893008 +44 153 4 878347234 +94 79 4 885882967 +109 62 3 880578711 +49 173 3 888067691 +121 121 2 891388501 +60 183 5 883326399 +198 51 3 884208455 +13 2 3 882397650 +44 55 4 878347455 +37 56 5 880915810 +194 162 3 879549899 +130 71 5 875801695 +130 50 5 874953665 +125 22 5 892836395 +69 56 5 882145428 +110 188 4 886988574 +106 45 3 881453290 +151 66 4 879524974 +123 22 4 879809943 +198 148 3 884206401 +56 79 4 892676303 +151 175 5 879524244 +152 125 5 880149165 +123 165 5 879872672 +169 174 4 891359418 +63 109 4 875747731 +72 89 3 880037164 +80 87 4 887401307 +85 56 4 
879453587 +194 56 5 879521936 +110 82 4 886988480 +7 195 5 891352626 +12 82 4 879959610 +109 90 3 880583192 +13 64 5 882140037 +82 64 5 878770169 +42 70 3 881109148 +10 4 4 877889130 +14 175 5 879119497 +6 134 5 883602283 +28 153 3 881961214 +62 96 4 879374835 +102 195 4 888801360 +8 79 4 879362286 +28 184 4 881961671 +51 148 3 883498623 +186 53 1 879023882 +141 125 5 884585642 +23 88 3 874787410 +72 79 4 880037119 +82 13 2 878768615 +83 77 4 880308426 +43 7 4 875975520 +23 90 2 874787370 +106 97 5 881450810 +109 147 4 880564679 +156 58 4 888185906 +16 151 5 877721905 +94 99 3 891721815 +154 137 3 879138657 +158 144 4 880134445 +11 120 2 891903935 +197 181 5 891409893 +65 70 1 879216529 +128 77 3 879968447 +167 48 1 892738277 +56 143 3 892910182 +115 69 1 881171825 +145 109 4 875270903 +59 127 5 888204430 +58 42 4 884304936 +77 23 4 884753173 +95 15 4 879195062 +184 172 4 889908497 +13 168 4 881515193 +158 8 5 880134948 +92 87 3 876175077 +20 118 4 879668442 +95 33 3 880571704 +130 125 5 875801963 +174 107 5 886434361 +97 7 5 884238939 +125 143 5 879454793 +160 126 3 876769148 +32 117 3 883717555 +1 140 1 878543133 +5 173 4 875636675 +49 117 1 888069459 +25 127 3 885853030 +92 85 3 875812364 +187 70 4 879465394 +194 62 2 879524504 +70 71 3 884066399 +49 72 2 888069246 +194 132 3 879520991 +175 31 4 877108051 +138 100 5 879022956 +63 6 3 875747439 +180 121 5 877127830 +148 98 3 877017714 +102 66 3 892992129 +158 42 3 880134913 +70 151 3 884148603 +103 144 4 880420510 +95 173 5 879198547 +102 67 1 892993706 +160 93 5 876767572 +99 118 2 885679237 +70 152 4 884149877 +41 31 3 890687473 +178 179 2 882828320 +6 19 4 883602965 +130 55 5 875216507 +136 56 4 882848783 +74 15 4 888333542 +1 120 1 875241637 +64 100 4 879365558 +6 154 3 883602730 +60 152 4 883328033 +161 14 4 891171413 +18 82 3 880131236 +22 29 1 878888228 +96 8 5 884403020 +72 176 2 880037203 +102 89 4 888801315 +60 151 5 883326995 +13 90 3 882141872 +7 92 5 891352010 +91 195 5 891439057 +62 8 5 879373820 
+197 68 2 891410082 +26 9 4 891386369 +119 193 4 874781872 +117 174 4 881011393 +189 129 3 893264378 +1 125 3 878542960 +23 83 4 874785926 +6 175 4 883601426 +184 89 4 889908572 +44 155 3 878348947 +90 199 5 891384423 +130 90 4 875801920 +20 186 3 879669040 +37 79 4 880915810 +163 56 4 891220097 +72 82 3 880037242 +117 176 5 881012028 +121 174 3 891388063 +20 172 3 879669181 +108 125 3 879879864 +49 53 4 888067405 +106 165 5 881450536 +85 71 4 879456308 +151 91 2 879542796 +116 195 4 876453626 +144 172 4 888105312 +74 126 3 888333428 +45 127 5 881007272 +109 4 2 880572756 +12 96 4 879959583 +109 42 1 880572756 +174 82 1 886515472 +180 83 5 877128388 +150 127 5 878746889 +102 83 3 888803487 +128 97 3 879968125 +11 90 2 891905298 +194 52 4 879525876 +177 87 4 880130931 +68 178 5 876974755 +90 179 5 891385389 +13 88 4 882141485 +120 25 5 889490370 +138 98 5 879024043 +160 124 4 876767360 +94 133 4 885882685 +121 122 2 891390501 +19 153 4 885412840 +90 132 5 891384673 +49 40 1 888069222 +7 90 3 891352984 +21 56 5 874951658 +184 126 3 889907971 +26 100 5 891386368 +21 106 2 874951447 +90 9 4 891385787 +31 135 4 881548030 +62 89 5 879374640 +1 6 5 887431973 +10 22 5 877888812 +90 30 5 891385843 +1 104 1 875241619 +76 100 5 875028391 +11 97 4 891904300 +83 125 5 880306811 +16 22 5 877721071 +10 155 4 877889186 +92 132 3 875812211 +18 25 3 880131591 +12 172 4 879959088 +57 56 3 883698646 +73 196 4 888626177 +7 10 4 891352864 +118 176 5 875384793 +77 153 5 884732685 +151 196 4 879542670 +102 186 4 888803487 +14 100 5 876965165 +130 148 4 876251127 +158 100 5 880132401 +59 14 5 888203234 +1 49 3 878542478 +94 109 4 891721974 +102 62 3 888801812 +118 156 5 875384946 +81 93 3 876533657 +79 124 5 891271870 +106 15 3 883876518 +73 7 4 888625956 +187 28 4 879465597 +15 137 4 879455939 +77 4 3 884752721 +184 92 3 889908657 +6 188 3 883602462 +194 51 4 879549793 +56 1 4 892683248 +177 182 5 880130684 +1 76 4 878543176 +106 64 4 881449830 +157 127 5 886890541 +56 31 4 892679259 +60 
28 5 883326155 +12 143 5 879959635 +102 121 3 888801673 +92 123 2 875640251 +22 117 4 878887869 +18 190 4 880130155 +72 64 5 880036549 +1 72 4 878542678 +48 187 5 879434954 +94 153 5 891725333 +128 64 5 879966954 +62 153 4 879374686 +53 100 5 879442537 +174 94 2 886515062 +5 154 3 875636691 +200 7 4 876042451 +65 121 4 879217458 +63 111 3 875747896 +198 11 4 884207392 +91 99 2 891439386 +42 131 2 881108548 +152 98 2 882473974 +55 144 5 878176398 +125 175 2 879455184 +82 178 4 878769629 +1 185 4 875072631 +184 15 3 889907812 +152 167 5 882477430 +144 50 5 888103929 +97 28 5 884238778 +114 195 4 881260861 +188 69 4 875072009 +106 77 4 881451716 +188 7 5 875073477 +96 64 5 884403336 +160 79 4 876859413 +18 191 4 880130193 +162 42 3 877636675 +95 26 3 880571951 +58 8 4 884304955 +110 22 4 886987826 +1 96 5 875072716 +89 127 5 879441335 +95 137 3 879192404 +17 1 4 885272579 +87 154 4 879876564 +135 54 3 879858003 +14 151 5 876964725 +148 71 5 877019251 +6 156 3 883602212 +130 58 2 876251619 +76 12 3 882606060 +95 32 1 888954726 +130 47 3 875801470 +12 97 5 879960826 +38 99 5 892430829 +198 188 5 884208200 +72 45 5 880037853 +44 82 4 878348885 +198 97 3 884207112 +189 60 3 893265773 +28 100 5 881957425 +119 86 4 874782068 +174 117 5 886434136 +14 13 4 880929778 +103 126 5 880420002 +94 101 2 891720996 +92 42 4 875653664 +45 121 4 881013563 +175 56 2 877107790 +185 196 4 883524172 +49 168 5 888068686 +72 68 3 880037242 +72 12 5 880036664 +49 56 5 888067307 +82 191 4 878769748 +151 100 3 879524514 +20 194 3 879669152 +145 185 4 875271838 +169 172 5 891359317 +65 191 4 879216797 +121 125 2 891388600 +59 7 4 888202941 +52 116 4 882922328 +59 100 5 888202899 +24 129 3 875246185 +92 48 4 875653307 +158 68 3 880134532 +145 174 5 882181728 +64 8 4 889737968 +7 168 5 891351509 +161 56 3 891171257 +96 100 5 884403758 +91 131 2 891439471 +178 135 2 882826915 +135 176 4 879857765 +102 173 3 888803602 +194 30 3 879524504 +11 47 4 891904551 +162 174 4 877636772 +5 42 5 875636360 +82 
11 4 878769992 +178 193 4 882826868 +193 117 4 889125913 +117 168 5 881012550 +162 50 5 877635662 +77 181 3 884732278 +177 1 3 880130699 +89 117 5 879441357 +28 174 5 881956334 +188 173 5 875075118 +48 50 4 879434723 +7 54 3 892132380 +200 121 5 876042268 +7 89 5 891351082 +151 193 4 879524491 +38 67 4 892434312 +156 12 3 888185853 +42 142 4 881109271 +59 126 5 888202899 +109 69 4 880572561 +28 143 4 881956564 +23 28 3 874786793 +1 81 5 875072865 +124 166 5 890287645 +198 15 3 884205185 +113 100 4 875935610 +156 64 3 888185677 +64 56 5 889737542 +6 133 4 883601459 +130 158 5 875801897 +18 14 5 880130431 +95 132 3 880570993 +10 64 4 877886598 +164 125 5 889402071 +141 50 4 884584735 +114 191 3 881309511 +82 127 2 878769777 +55 56 4 878176397 +160 21 1 876769480 +23 177 4 884550003 +32 100 3 883717662 +59 134 5 888204841 +43 117 4 883954853 +1 78 1 878543176 +6 70 3 883601427 +18 89 3 880130065 +197 187 5 891409798 +46 127 5 883616133 +62 100 4 879372276 +130 3 5 876250897 +83 22 5 880307724 +59 188 4 888205188 +145 200 4 877343121 +160 175 4 876860808 +13 25 1 882141686 +7 142 3 891354090 +72 181 1 880037203 +7 156 5 891351653 +49 129 2 888068079 +23 188 3 877817151 +59 48 5 888204502 +49 3 3 888068877 +56 98 4 892679067 +130 183 5 875801369 +18 194 3 880129816 +69 109 3 882145428 +42 25 3 881110670 +144 22 5 888105439 +102 183 4 888801360 +121 9 5 891390013 +90 6 4 891384357 +98 70 3 880499018 +189 173 5 893265160 +169 181 5 891359276 +95 24 3 879192542 +56 82 4 892676314 +23 99 4 874786098 +118 185 5 875384979 +18 71 4 880131236 +130 49 4 875802236 +14 7 5 876965061 +10 200 5 877889261 +119 144 4 887038665 +72 70 4 880036691 +94 31 4 891721286 +130 53 3 876251972 +95 88 4 880571016 +58 156 5 884304955 +13 161 5 882397741 +65 197 5 879216769 +42 99 5 881108346 +81 7 4 876533545 +119 87 5 874781829 +8 89 4 879362124 +6 151 3 883599558 +177 150 4 880130807 +117 121 4 880126038 +194 1 4 879539127 +184 88 3 889909551 +142 28 4 888640404 +99 123 3 885678997 +1 143 1 
875072631 +195 99 3 888737277 +59 25 4 888203270 +64 173 5 889737454 +59 65 4 888205265 +174 63 4 886514985 +1 151 4 875072865 +56 94 4 892910292 +59 175 4 888205300 +164 148 5 889402203 +116 180 5 886310197 +1 51 4 878543275 +130 12 4 875216340 +90 185 5 891384959 +12 132 5 879959465 +5 139 3 875721260 +192 127 4 881367456 +135 77 4 879858003 +94 39 3 891721317 +177 175 5 880130972 +162 151 3 877636191 +87 55 4 879875774 +190 118 3 891033906 +106 8 4 881452405 +188 195 3 875073179 +177 179 5 880131057 +53 181 4 879443046 +117 12 5 881011350 +162 117 4 877635869 +114 157 2 881260611 +184 52 4 889910034 +99 196 4 885680578 +123 127 5 879809943 +70 176 4 884066573 +96 170 5 884403866 +13 190 4 882397145 +94 34 1 891723558 +18 12 5 880129991 +178 58 5 882827134 +114 183 5 881260545 +13 137 5 882139804 +79 137 4 891271870 +18 181 3 880131631 +84 31 4 883453755 +76 59 4 875027981 +200 25 4 876042234 +197 195 5 891409798 +64 181 4 889737420 +132 137 4 891278996 +145 120 2 888398563 +51 132 4 883498655 +130 84 4 876252497 +8 190 4 879362183 +24 25 4 875246258 +116 199 4 876454174 +109 9 3 880564607 +200 143 5 884128499 +99 11 5 885680138 +145 159 4 875272299 +200 82 5 884129656 +85 124 5 882813248 +6 131 5 883602048 +156 192 4 888185735 +130 22 5 875217265 +12 157 5 879959138 +151 114 5 879524268 +130 63 4 876252521 +144 129 4 888104234 +16 96 5 877717833 +1 175 5 875072547 +80 45 4 887401585 +12 71 4 879959635 +59 141 4 888206605 +56 118 4 892679460 +198 23 4 884208491 +77 179 5 884752806 +89 26 3 879459909 +53 199 5 879442384 +32 118 3 883717967 +18 180 4 880130252 +55 89 5 878176398 +177 197 4 880130758 +44 168 5 878347504 +90 42 4 891384885 +137 50 5 881432937 +109 117 5 880564457 +85 199 5 879829438 +62 183 4 879374893 +95 2 2 888955909 +153 64 5 881371005 +62 173 5 879374732 +160 4 4 876861754 +12 15 5 879959670 +62 78 2 879376612 +89 151 5 879441507 +120 9 4 889489886 +73 28 3 888626468 +87 88 5 879876672 +175 176 3 877107255 +185 197 5 883524428 +130 150 5 
874953558 +109 176 5 880577868 +94 28 4 885873159 +178 70 4 882827083 +7 172 4 891350965 +44 106 2 878347076 +184 13 3 889907839 +73 156 4 888625835 +18 179 4 880129877 +200 29 4 884130540 +6 28 2 883603013 +154 182 5 879138783 +154 50 5 879138657 +94 118 3 891723295 +44 185 4 878347569 +102 176 3 888801360 +82 25 2 878768435 +14 70 1 879119692 +122 70 5 879270606 +23 32 3 874785809 +12 191 5 879960801 +6 136 5 883600842 +77 176 4 884752757 +200 33 4 884129602 +119 12 3 874781915 +90 178 5 891384611 +181 21 1 878963381 +156 137 4 888185735 +181 112 1 878962955 +14 14 3 879119311 +57 173 5 883698408 +89 83 4 879459884 +2 13 4 888551922 +131 1 4 883681384 +6 117 2 883599431 +1 107 4 875241619 +6 32 4 883601311 +72 124 4 880035636 +123 50 3 879873726 +181 148 2 878963204 +83 28 4 880308284 +92 183 4 875653960 +12 196 5 879959553 +94 64 5 885870362 +87 182 4 879875737 +58 20 1 884304538 +44 9 5 878341196 +180 111 5 877127747 +108 181 3 879879985 +153 22 2 881371140 +119 188 4 874781742 +189 21 2 893264619 +14 181 5 889666215 +91 82 5 891439386 +32 122 2 883718250 +6 15 3 883599302 +87 79 5 879875856 +195 61 3 888737277 +158 11 4 880134398 +13 48 5 882139863 +189 121 2 893264816 +94 50 5 891720996 +153 127 3 881371140 +200 45 3 884128372 +82 103 2 878768665 +64 83 3 889737654 +59 102 2 888205956 +161 127 3 891171698 +69 9 4 882126086 +95 14 5 879197329 +42 12 4 881107502 +67 121 4 875379683 +188 148 4 875074667 +119 111 5 886176779 +13 21 3 882399040 +184 77 3 889910217 +92 196 4 875654222 +95 83 5 880573288 +11 135 4 891904335 +178 178 4 882826395 +189 143 5 893266027 +188 13 4 875073408 +124 157 2 890287936 +6 135 5 883600747 +69 48 5 882145428 +57 7 4 883697105 +7 8 5 891351328 +106 1 4 881449487 +180 69 4 877355568 +144 194 5 888105287 +73 48 2 888625785 +189 100 4 893263994 +194 117 3 879535704 +42 82 4 881107449 +174 49 4 886513788 +75 108 4 884050661 +41 170 4 890687713 +174 196 5 886514108 +137 172 5 881433719 +60 176 4 883326057 +115 172 4 881171273 +13 61 4 
882140552 +108 121 3 879880190 +62 33 1 879374785 +200 151 3 876042204 +180 56 5 877127130 +60 194 4 883326425 +14 121 3 876965061 +18 136 5 880129421 +144 33 5 888105902 +200 38 3 884130348 +5 40 4 879198109 +99 7 4 885678784 +90 166 4 891383423 +184 196 4 889908985 +197 92 1 891410082 +5 90 3 875636297 +80 58 4 887401677 +178 76 3 882827288 +62 147 3 879372870 +63 13 4 875747439 +194 124 4 879539229 +71 56 5 885016930 +10 135 5 877889004 +54 121 4 880936669 +138 111 4 879022890 +67 151 4 875379619 +16 183 5 877720733 +13 40 2 886302815 +5 153 5 875636375 +168 7 1 884287559 +109 200 2 880577734 +128 173 5 879966756 +197 33 2 891409981 +16 27 2 877726390 +13 73 3 882141485 +84 151 4 883449993 +189 96 5 893265971 +66 117 3 883601787 +101 118 3 877136424 +94 63 3 891723908 +43 118 4 883955546 +42 88 5 881108425 +158 182 5 880134296 +157 3 3 886890734 +65 135 4 879216567 +62 179 4 879374969 +43 54 3 883956494 +94 144 3 891721168 +151 47 3 879528459 +184 34 2 889913568 +200 15 4 884127745 +5 94 3 878844651 +99 56 5 885679833 +42 28 5 881108187 +184 70 4 889908657 +77 50 4 884732345 +144 73 3 888105636 +56 186 3 892676933 +69 151 5 882072998 +1 108 5 875240920 +174 118 2 886434186 +145 44 5 875272132 +186 71 5 879024535 +82 109 1 884714204 +200 173 5 884128554 +177 195 4 880130699 +62 121 4 879372916 +49 122 2 888069138 +90 96 4 891384754 +56 95 4 892683274 +38 71 5 892430516 +135 33 3 879857930 +182 172 5 876435435 +130 4 2 875801778 +1 12 5 878542960 +13 118 4 882397581 +10 164 4 877889333 +109 96 5 880572614 +76 150 5 875028880 +5 109 5 875635350 +56 179 3 892678669 +59 195 5 888204757 +90 86 5 891383626 +94 156 5 891725332 +60 71 3 883327948 +198 172 4 884207206 +10 191 5 877888613 +130 134 5 875801750 +15 18 1 879455681 +43 161 4 883955467 +176 100 5 886047918 +124 79 3 890287395 +188 98 5 875071957 +96 173 3 884402791 +118 23 5 875384979 +188 38 3 875073828 +188 77 4 875072328 +184 124 5 889907652 +125 28 4 879454385 +177 196 3 880130881 +145 105 2 875271442 +58 
182 4 884304701 +16 164 5 877724438 +1 14 5 874965706 +151 65 4 879528729 +109 131 1 880579757 +125 64 5 879454139 +41 98 4 890687374 +54 147 5 880935959 +125 25 1 879454987 +92 88 3 875656349 +194 26 3 879522240 +92 181 4 876175052 +148 169 5 877020297 +56 181 5 892737154 +64 7 4 889737542 +1 97 3 875073128 +62 155 1 879376633 +90 197 5 891383319 +193 174 4 889125720 +54 127 4 880933834 +128 56 3 879966785 +49 151 5 888067727 +59 125 3 888203658 +1 44 5 878543541 +8 172 5 879362123 +56 96 5 892676429 +74 100 4 888333428 +92 32 3 875653363 +18 57 4 880130930 +43 50 4 875975211 +59 136 3 888205336 +131 14 5 883681313 +95 117 4 879193619 +85 8 4 879454952 +25 135 3 885852059 +1 53 3 876893206 +49 52 2 888066647 +97 168 4 884238693 +84 64 5 883450066 +60 186 4 883326566 +43 1 5 875975579 +178 22 5 882826187 +104 25 3 888465634 +6 125 3 883599670 +137 183 5 881433689 +194 185 4 879521254 +1 163 4 875072442 +181 149 1 878962719 +18 195 3 880131236 +163 64 4 891220161 +22 121 3 878887925 +77 174 5 884733587 +128 190 4 879967016 +158 163 4 880135044 +178 83 4 882826556 +16 69 5 877724846 +168 123 3 884287822 +90 177 5 891384516 +20 1 3 879667963 +56 73 4 892677094 +43 47 1 883955415 +7 82 3 891351471 +64 38 3 889740415 +25 151 4 885853335 +181 125 3 878962816 +97 97 5 884239525 +20 69 1 879668979 +92 189 4 875653519 +92 191 4 875653050 +152 162 5 882474898 +106 86 3 881451355 +68 50 5 876973969 +9 6 5 886960055 +194 58 4 879522917 +168 25 5 884287885 +142 89 3 888640489 +58 193 3 884305220 +77 69 3 884752997 +18 185 3 880129388 +174 29 2 886514469 +178 89 4 882826514 +10 156 4 877886846 +200 174 5 884128426 +62 118 2 879373007 +198 184 3 884209003 +6 199 4 883601203 +150 50 5 878746719 +92 190 4 876174729 +174 66 5 886513706 +56 51 3 892677186 +21 121 1 874951416 +92 129 4 886443161 +177 47 3 880131187 +49 101 3 888067164 +92 31 4 875654321 +59 169 4 888204757 +75 137 4 884050102 +92 11 4 875653363 +15 148 3 879456049 +18 186 4 880131699 +1 184 4 875072956 +87 96 5 
879875734 +178 99 4 882827574 +158 176 4 880134398 +22 176 5 878887765 +6 183 4 883601311 +1 157 4 876892918 +181 10 2 878962955 +90 100 5 891383241 +11 9 5 891902970 +43 49 4 883956387 +79 6 4 891271901 +37 24 4 880915674 +49 143 3 888067726 +38 94 5 892432030 +92 98 5 875652934 +76 64 5 875498777 +193 33 3 889125912 +178 183 4 882826347 +122 191 5 879270128 +121 126 3 891388936 +89 93 2 879441307 +125 116 4 892838322 +45 15 4 881012184 +56 56 5 892676376 +41 69 4 890687145 +172 183 5 875538864 +80 194 3 887401763 +13 124 5 884538663 +99 100 5 885678813 +89 121 5 879441657 +6 197 5 883601203 +128 151 3 879968921 +7 177 4 891352904 +87 39 3 879875995 +85 108 2 880838201 +26 117 3 891351590 +119 109 5 874775580 +168 117 5 884287318 +1 150 5 876892196 +65 173 3 879217851 +193 111 1 889126375 +94 38 2 891722482 +74 150 3 888333458 +178 195 4 882826944 +90 190 5 891383687 +56 189 4 892683248 +196 111 4 881251793 +178 8 4 882826556 +158 149 3 880132383 +94 1 4 885870323 +11 185 4 891905783 +169 133 4 891359171 +25 189 5 885852488 +95 111 4 879194012 +158 62 5 880134759 +24 178 5 875323676 +73 100 4 888626120 +74 137 3 888333458 +125 73 5 892838288 +60 98 4 883326463 +84 7 4 883452155 +165 69 3 879525799 +114 182 3 881259994 +91 181 5 891439243 +1 183 5 875072262 +136 19 4 882693529 +138 150 3 879023131 +128 48 4 879967767 +85 45 3 879455197 +14 172 5 890881521 +13 153 4 882139901 +109 91 4 880582384 +49 116 4 888066109 +152 191 5 880149963 +186 44 5 879023529 +119 147 4 886176486 +176 13 4 886047994 +121 98 5 891388210 +128 65 4 879968512 +41 100 4 890687242 +145 5 3 875272196 +167 136 4 892738418 +6 195 4 883602283 +151 83 5 879524611 +108 21 3 879880141 +8 144 5 879362286 +5 100 5 875635349 +13 154 5 882141335 +119 174 4 874781303 +135 185 4 879857797 +38 1 5 892430636 +157 137 5 886889876 +10 99 5 877889130 +44 148 4 878346946 +159 103 1 880557604 +11 100 4 891902718 +5 143 3 875636815 +10 194 4 877886661 +167 133 5 892738453 +50 9 4 877052297 +131 19 4 883681418 
+180 156 5 877127747 +60 163 4 883327566 +193 2 3 890860198 +174 28 5 886434547 +38 145 1 892433062 +118 184 5 875385057 +195 67 2 874825826 +122 175 5 879270084 +1 128 4 875072573 +188 79 5 875072393 +186 117 5 879023607 +87 7 4 879875735 +128 1 4 879966919 +64 151 3 879366214 +194 161 4 879523576 +96 1 5 884403574 +122 187 4 879270424 +151 172 5 879524325 +158 50 4 880133306 +51 64 4 883498936 +7 183 4 891351624 +178 117 4 882824467 +94 68 4 891722432 +59 131 4 888205410 +197 89 5 891409798 +198 193 4 884207833 +60 82 3 883327493 +178 98 5 882826944 +183 88 3 891466760 +199 111 3 883783042 +7 101 5 891350966 +125 136 5 879454309 +60 61 4 883326652 +160 32 5 876859413 +5 176 3 875635962 +7 136 5 891351813 +102 47 2 888803636 +64 161 3 889739779 +160 109 2 876857844 +16 160 4 877722001 +76 197 5 875028563 +52 15 5 882922204 +128 58 3 879968008 +92 159 4 875810543 +178 25 3 888514710 +13 100 5 882140166 +102 98 4 888802939 +6 193 3 883601529 +163 98 4 891220196 +167 169 1 892738419 +121 137 5 891388501 +13 71 4 882398654 +59 45 5 888204465 +182 121 3 885613117 +64 64 4 889737454 +151 49 3 879543055 +83 122 1 886534501 +139 127 5 879538578 +110 77 4 886988202 +130 94 5 875802058 +200 196 4 884126833 +16 99 5 877720733 +75 100 5 884049875 +95 151 4 879193353 +182 100 3 885613067 +150 93 4 878746889 +164 118 5 889401852 +169 127 4 891359354 +196 25 4 881251955 +151 200 3 879525002 +60 88 4 883327684 +60 143 3 883327441 +191 86 5 891562417 +99 69 4 885679833 +125 198 3 879454385 +75 125 3 884050164 +95 64 5 879197685 +1 148 2 875240799 +141 151 2 884585039 +145 7 5 875270429 +5 69 1 875721555 +130 66 5 875802173 +43 63 3 883956353 +70 128 4 884067339 +119 24 4 886177076 +50 125 2 877052502 +157 1 5 874813703 +1 112 1 878542441 +144 96 5 888105691 +165 181 5 879525738 +109 94 4 880579787 +37 161 5 880915902 +187 86 4 879465478 +145 39 4 875271838 +70 48 4 884064574 +92 161 2 875654125 +21 118 1 874951382 +7 181 3 891351287 +94 100 5 885872942 +7 7 5 891352220 +194 175 3 
879521595 +187 175 2 879465241 +43 17 3 883956417 +60 21 3 883327923 +94 82 4 891721777 +30 28 4 885941321 +160 118 3 876768828 +18 188 3 880129388 +43 98 5 875981220 +151 79 4 879524642 +85 89 4 879454075 +1 193 4 876892654 +128 118 5 879968896 +15 9 4 879455635 +135 183 4 879857723 +90 79 4 891383912 +25 50 5 885852150 +87 87 4 879877931 +195 46 3 891762441 +151 183 3 879524642 +42 183 4 881107821 +175 183 4 877107942 +18 47 3 880131262 +50 123 4 877052958 +79 7 5 891272016 +184 69 3 889908694 +188 56 4 875071658 +83 63 4 880327970 +73 180 4 888626577 +101 121 4 877137015 +180 28 3 877355568 +199 117 3 883782879 +45 100 5 881010742 +117 109 4 880126336 +60 132 4 883325944 +197 62 2 891410039 +144 193 4 888105287 +115 32 5 881171348 +130 39 4 875801496 +84 148 4 883452274 +87 25 4 879876811 +178 187 4 882826049 +90 14 5 891383987 +87 64 5 879875649 +156 124 3 888185677 +22 110 1 878887157 +152 67 5 882477689 +18 193 5 880131358 +189 15 2 893264335 +144 181 4 888104032 +125 63 3 892838558 +7 154 5 891353124 +186 31 4 879023529 +64 9 4 889738085 +94 170 5 891725362 +72 127 5 880037702 +72 177 4 880037204 +181 25 5 878962675 +124 96 4 890399864 +8 56 5 879362183 +194 44 4 879524007 +87 63 4 879876848 +64 17 3 889739733 +174 21 1 886515209 +14 9 4 879119260 +92 96 4 875656025 +167 126 3 892738141 +69 150 5 882072920 +119 199 5 874781994 +18 169 5 880130252 +148 116 5 877398648 +101 109 2 877136360 +7 166 3 891351585 +44 5 4 878347598 +73 89 5 888625685 +185 28 5 883524428 +198 175 3 884207239 +38 118 5 892431151 +25 8 4 885852150 +18 170 5 880130515 +72 121 3 880036048 +37 22 5 880915810 +69 100 5 882072892 +117 98 4 881012430 +25 169 5 885852301 +7 185 5 892135346 +92 102 2 875813376 +128 14 5 879967341 +67 7 5 875379794 +87 97 5 879877825 +58 64 5 884305295 +46 151 4 883616218 +27 121 4 891543191 +12 28 5 879958969 +60 180 4 883326028 +7 191 5 891351201 +57 151 3 883697585 +167 73 2 892738452 +156 180 5 888185777 +72 100 5 880035680 +56 195 5 892676429 +117 143 1 
881012472 +46 181 4 883616254 +164 181 5 889401906 +95 90 2 880572166 +197 127 5 891409839 +29 98 4 882821942 +7 139 3 891354729 +92 46 4 875653867 +101 24 4 877136391 +77 52 5 884753203 +200 2 4 884130046 +77 144 3 884752853 +48 170 4 879434886 +136 42 3 882848866 +10 160 4 877888944 +25 13 4 885852381 +42 79 5 881108040 +94 96 3 885872942 +109 68 3 880582469 +144 32 4 888105287 +109 196 4 880578358 +152 51 4 882476486 +92 109 3 886443351 +25 197 3 885852059 +102 167 2 892993927 +110 28 4 886987979 +64 71 3 879365670 +91 64 4 891439243 +163 97 4 891220019 +184 22 3 889908985 +109 183 5 880572528 +160 123 4 876768949 +95 142 4 880572249 +63 106 2 875748139 +6 81 4 883602283 +95 185 3 879197886 +62 176 5 879373768 +128 136 5 879967080 +141 117 4 884584929 +184 91 3 889909988 +144 93 1 888104032 +77 89 5 884733839 +10 176 4 877889130 +119 105 2 874775849 +144 191 4 888105081 +48 195 5 879434954 +70 89 4 884150202 +64 156 4 889737506 +102 50 4 888801315 +70 169 4 884149688 +59 118 5 888203234 +1 200 3 876893098 +174 14 5 886433771 +66 15 3 883601456 +175 9 4 877108146 +62 180 4 879373984 +151 160 4 879542670 +1 180 3 875072573 +151 64 5 879524536 +194 98 4 879521329 +125 120 1 892839312 +56 38 2 892683533 +178 134 3 882826983 +102 184 2 888801465 +23 13 4 874784497 +43 91 3 883956260 +41 174 4 890687264 +43 153 5 883955135 +48 132 5 879434886 +184 137 5 889907685 +38 82 5 892429903 +194 12 5 879520916 +109 172 5 880572528 +177 100 5 880130600 +59 95 2 888204758 +92 94 3 875812876 +83 106 4 887665549 +125 194 5 879454986 +194 195 3 879521657 +106 22 4 881449830 +115 82 4 881172117 +160 161 3 876861185 +8 7 3 879362287 +91 161 3 891439353 +70 121 3 884148728 +138 116 2 879022956 +94 102 3 891721462 +103 50 5 880416864 +144 19 4 888103929 +43 95 4 875975687 +18 64 5 880132501 +99 12 5 885680458 +18 99 5 880130829 +16 51 4 877726390 +17 125 1 885272538 +151 87 4 879524420 +5 79 3 875635895 +145 3 3 875271562 +115 89 5 881172049 +117 56 5 881011807 +125 1 4 879454699 +37 
195 5 880915874 +187 196 4 879465507 +85 94 3 882995966 +94 88 3 891721942 +130 33 5 876252087 +48 172 5 879434791 +23 71 3 874789299 +148 163 4 877021402 +20 95 3 879669181 +81 124 3 876534594 +85 157 3 879454400 +95 161 3 879196298 +65 48 5 879217689 +174 197 5 886434547 +23 191 3 877817113 +83 1 4 880306903 +1 85 3 875073180 +90 17 4 891384721 +59 140 1 888206445 +145 38 3 888398747 +87 183 4 879875734 +92 173 3 875656535 +58 61 5 884305271 +43 175 2 875981304 +13 196 4 882140552 +87 73 3 879877083 +194 198 3 879522021 +152 151 4 880148735 +102 164 3 888803002 +1 91 5 876892636 +198 197 4 884208200 +22 118 4 878887983 +49 111 2 888068686 +72 96 5 880037203 +92 53 3 875656392 +148 7 5 877017054 +49 95 2 888067031 +70 197 4 884149469 +160 24 5 876769689 +95 3 1 879193881 +83 117 5 880307000 +18 19 3 880130582 +97 79 5 884238817 +49 123 1 888068195 +119 182 4 874781303 +91 174 5 891439090 +158 82 5 880134398 +181 103 1 878962586 +60 197 4 883326620 +16 161 5 877726390 +70 139 3 884150656 +130 176 5 881536127 +15 7 1 879455506 +130 28 4 875217172 +92 135 4 875652981 +92 67 3 875907436 +200 183 5 884128554 +200 8 4 884128904 +85 160 3 879454075 +38 79 3 892430309 +130 174 5 875216249 +37 11 4 880915838 +87 33 3 879876488 +185 86 5 883524428 +6 59 5 883601713 +90 149 3 891384754 +197 190 3 891410082 +183 159 4 892323452 +102 101 4 883748488 +7 79 4 891352261 +83 181 4 880306786 +130 99 5 875216786 +117 195 5 881012255 +119 83 4 886176922 +28 145 3 881961904 +99 3 3 885679237 +106 88 3 881453097 +178 181 5 882823832 +16 76 5 877719863 +57 100 5 883698581 +1 10 3 875693118 +67 122 3 875379566 +178 55 4 882826394 +151 121 5 879525054 +121 57 5 891390014 +174 124 5 886514168 +198 95 3 884207612 +184 64 4 889909045 +6 124 5 883599228 +7 131 5 891352383 +85 70 4 879828328 +80 199 2 887401353 +95 48 4 879197500 +44 118 3 878341197 +1 129 5 887431908 +18 131 4 880131004 +16 182 5 877719863 +44 91 2 878348573 +115 12 5 881171982 +7 121 5 891352904 +135 79 3 879857843 +200 112 
3 884127370 +101 50 4 877135944 +121 192 4 891388250 +178 96 4 882826782 +184 116 4 889910481 +66 21 1 883601939 +137 15 4 881432965 +92 184 3 877383934 +153 56 5 881371140 +10 168 4 877888812 +70 189 4 884150202 +116 65 2 876454052 +136 100 5 882693338 +5 144 3 875636141 +16 31 5 877717956 +194 188 4 879522158 +44 191 4 878347234 +198 176 4 884207136 +49 172 1 888067691 +94 76 4 891720827 +83 110 4 880309185 +6 56 4 883601277 +23 98 5 874786016 +193 29 3 889126055 +125 174 5 879454309 +158 137 5 880132443 +137 51 1 881433605 +95 101 1 879198800 +56 70 4 892676996 +1 130 3 875072002 +152 80 5 882477572 +41 153 4 890687087 +12 200 1 879959610 +130 128 4 876251728 +49 11 3 888069458 +76 121 2 882607017 +130 184 4 875801695 +5 185 3 875720692 +43 191 5 875981247 +99 107 3 885679138 +200 148 4 876042340 +62 125 4 879372347 +144 105 2 888104767 +82 140 3 878769668 +16 156 4 877719863 +72 161 5 880037703 +94 70 4 891722511 +92 148 2 877383934 +125 98 5 879454345 +130 195 5 875801470 +7 126 3 891353254 +75 190 5 884051948 +102 99 2 883748488 +92 43 3 875813314 +178 28 5 882826806 +75 151 5 884050502 +81 151 2 876533946 +49 175 5 888068715 +59 186 5 888205660 +76 23 5 875027355 +49 185 5 888067307 +44 164 4 878348035 +18 1 5 880130802 +128 86 5 879966919 +24 56 4 875323240 +72 172 1 880037119 +77 100 3 884732716 +14 15 4 879119390 +189 79 3 893265478 +23 143 3 874786066 +49 55 4 888068057 +99 66 3 886519047 +18 97 4 880131525 +144 180 4 888105873 +14 42 4 879119579 +102 163 2 892993190 +198 79 3 884208518 +130 69 5 875216718 +118 22 5 875385136 +48 28 2 879434653 +14 176 1 890881484 +186 100 4 879023115 +23 133 4 874786220 +60 13 4 883327539 +82 185 3 878769334 +64 1 4 879366214 +102 94 2 892993545 +115 187 5 881171203 +11 194 4 891904920 +59 172 5 888204552 +60 200 4 883326710 +85 127 5 879829301 +196 94 3 881252172 +144 65 4 888106182 +184 58 4 889908984 +189 31 3 893266027 +142 55 2 888640489 +5 89 5 875636033 +70 185 4 884149753 +13 173 2 882139863 +151 164 5 879542984 
+117 117 5 880126461 +145 69 5 882181632 +8 183 5 879362233 +71 151 1 877319446 +145 79 5 875271838 +198 82 3 884209451 +119 117 5 874775535 +181 150 1 878962465 +130 147 4 876250746 +109 158 1 880579916 +42 196 5 881107718 +97 174 4 884238817 +6 187 4 883600914 +1 103 1 878542845 +85 154 4 879828777 +101 122 1 877136928 +194 83 3 879521254 +90 191 5 891384424 +125 87 5 892836464 +188 127 4 875072799 +16 28 5 877727122 +94 12 4 886008625 +87 68 3 879876074 +174 40 4 886514985 +69 129 3 882072778 +67 123 4 875379322 +178 15 5 882823858 +59 71 3 888205574 +92 124 4 886440530 +144 197 4 888106106 +79 13 3 891271676 +44 96 4 878347633 +150 147 4 878746442 +168 100 4 884287362 +1 118 3 875071927 +197 161 4 891410039 +177 22 4 880130847 +102 144 3 888801360 +158 127 5 880132356 +60 138 2 883327287 +187 191 5 879465566 +189 135 4 893265535 +145 100 5 875270458 +82 70 4 878769888 +194 144 4 879547671 +197 79 5 891409839 +58 69 1 884663351 +64 69 4 889739091 +90 182 3 891383599 +42 172 5 881107220 +83 105 2 891182288 +137 117 5 881433015 +45 1 5 881013176 +110 195 2 886988480 +49 108 2 888068957 +194 25 2 879540807 +174 162 5 886514108 +87 186 5 879876734 +45 21 3 881014193 +18 126 5 880130680 +21 100 5 874951292 +92 164 4 875656201 +94 61 5 891720761 +184 72 3 889909988 +90 150 3 891385250 +194 7 3 879538898 +1 54 3 878543308 +27 100 5 891543129 +90 131 5 891384066 +1 24 3 875071713 +172 178 3 875538027 +198 196 3 884208098 +64 72 4 889740056 +11 109 3 891903836 +56 122 2 892911494 +144 176 4 888105338 +132 124 4 891278996 +42 194 5 881107329 +24 100 5 875323637 +193 127 5 890860351 +62 181 4 879372418 +7 190 5 891351728 +16 174 5 877719504 +5 80 2 875636511 +64 95 4 889737691 +72 180 4 880036579 +145 42 5 882181785 +92 101 2 875656624 +145 51 3 875272786 +168 15 5 884287362 +94 193 5 891720498 +156 197 5 888185777 +177 172 5 880130990 +62 20 4 879372696 +10 195 4 877889130 +130 168 3 875216786 +87 192 3 879877741 +46 7 4 883616155 +43 181 4 875975211 +59 82 5 888205660 
+18 162 4 880131326 +193 155 4 889126376 +59 18 4 888203313 +92 66 3 875812279 +128 50 4 879967268 +110 68 2 886988631 +64 58 3 889739625 +1 86 5 878543541 +49 39 2 888068194 +102 181 2 888801406 +130 173 3 875216593 +198 182 4 884207946 +60 161 4 883327265 +200 50 5 884128400 +115 93 3 881170332 +158 183 3 880134332 +58 50 4 884304328 +70 109 3 884066514 +184 174 3 889908693 +18 70 4 880129668 +7 161 3 891352489 +14 116 5 876965165 +92 93 4 886444049 +83 94 4 880308831 +54 50 5 880931687 +10 13 3 877892050 +157 93 3 886890692 +177 198 4 880131161 +49 70 2 888066614 +1 196 5 874965677 +197 174 5 891409798 +92 89 5 875652981 +59 109 4 888203175 +95 7 5 879197329 +38 140 5 892430309 +16 134 4 877719158 +56 168 2 892679209 +98 116 5 880499053 +43 11 5 875981365 +95 69 5 879198210 +56 44 4 892679356 +18 13 5 880131497 +7 72 5 891353977 +64 96 4 889737748 +23 70 2 874786513 +20 121 3 879668227 +200 147 5 876042451 +1 39 4 875072173 +184 11 3 889908694 +76 200 5 882606216 +106 48 3 881453290 +10 183 5 877893020 +59 98 5 888204349 +59 200 5 888205370 +57 199 5 883698646 +104 150 5 888465225 +106 194 5 881450758 +59 39 4 888205033 +44 193 3 878348521 +108 10 5 879879834 +64 12 5 889738085 +135 12 4 879857764 +156 22 3 888186093 +1 164 3 876893171 +141 120 4 884585547 +87 8 5 879876447 +101 123 2 877136186 +194 99 3 879524643 +28 89 4 881961104 +177 168 4 880130807 +92 144 4 875810741 +58 150 4 884304570 +73 81 5 888626415 +194 127 5 879520813 +41 1 4 890692860 +91 134 4 891439353 +138 185 4 879023853 +104 147 3 888466002 +125 69 4 879454628 +189 134 5 893265239 +58 198 3 884305123 +79 150 3 891271652 +109 157 4 880577961 +181 9 4 878962675 +96 50 5 884402977 +16 9 5 877722736 +94 175 4 885870613 +194 94 3 879528000 +4 50 5 892003526 +8 127 5 879362123 +198 65 2 884208241 +130 111 5 874953825 +8 188 5 879362356 +58 123 4 884650140 +72 87 4 880036638 +189 194 5 893265428 +159 117 5 880486047 +11 22 4 891904241 +95 178 5 879197652 +200 123 4 884127568 +154 89 5 879138910 +95 
181 4 879193353 +89 14 4 879441357 +10 132 5 877893020 +74 129 3 888333458 +64 199 4 889737654 +115 181 4 881172049 +189 174 5 893265160 +1 36 2 875073180 +23 189 5 874785985 +92 154 4 875657681 +152 22 5 882828490 +13 185 3 881515011 +128 98 4 879967047 +118 164 5 875385386 +18 135 3 880130065 +184 57 5 889908539 +14 23 5 890881216 +118 32 5 875384979 +189 9 3 893263994 +1 23 4 875072895 +188 66 3 875075118 +186 118 2 879023242 +92 62 3 875660468 +14 168 4 879119497 +128 99 4 879967840 +158 116 5 880132383 +94 135 4 885870231 +52 93 4 882922357 +84 194 5 883453617 +85 192 4 879454951 +71 65 5 885016961 +103 96 4 880422009 +188 161 3 875073048 +174 67 1 886515130 +180 173 5 877128388 +13 24 1 882397741 +90 148 2 891385787 +10 186 4 877886722 +189 16 3 893264335 +125 83 4 879454345 +154 143 3 879139003 +15 1 1 879455635 +71 50 3 885016784 +10 199 4 877892050 +59 50 5 888205087 +159 121 3 880486071 +109 121 5 880571741 +118 193 5 875384793 +60 64 4 883325994 +22 172 4 878887680 +11 175 3 891904551 +56 90 2 892677147 +71 135 4 885016536 +174 13 3 891551777 +200 135 4 884128400 +109 7 4 880563080 +1 73 3 876892774 +151 153 3 879524326 +118 17 3 875385257 +42 63 4 881108873 +148 78 1 877399018 +193 100 5 889124127 +176 50 5 886047879 +185 15 3 883525255 +63 116 5 875747319 +59 142 1 888206561 +96 23 5 884403123 +181 146 1 878962955 +82 151 2 876311547 +62 164 5 879374946 +58 195 4 884305123 +194 193 4 879524790 +1 67 3 876893054 +194 71 4 879524291 +160 137 4 876767299 +54 118 4 880937813 +8 176 5 879362233 +56 25 4 892911166 +188 181 3 875072148 +72 135 4 880037054 +38 28 4 892429399 +164 121 5 889402203 +196 8 5 881251753 +14 50 5 890881557 +13 27 3 882397833 +94 52 5 891721026 +158 172 4 880134398 +23 1 5 874784615 +38 22 5 892429347 +31 124 4 881548110 +102 5 3 888803002 +70 96 4 884066910 +119 100 5 874774575 +37 176 4 880915942 +160 23 5 876859778 +24 109 3 875322848 +188 185 4 875071710 +1 65 4 875072125 +200 88 4 884128760 +72 117 4 880035588 +144 190 5 
888105714 +18 151 3 880131804 +12 50 4 879959044 +44 21 2 878346789 +130 122 3 876251090 +1 190 5 875072125 +141 1 3 884584753 +60 56 4 883326919 +6 189 3 883601365 +74 121 4 888333428 +25 114 5 885852218 +178 71 4 882826577 +48 181 5 879434954 +22 153 5 878886423 +76 98 5 875028391 +10 56 5 877886598 +64 175 5 889739415 +184 67 3 889912569 +125 94 5 892839065 +2 19 3 888550871 +97 192 1 884238778 +69 147 3 882072920 +188 164 4 875072674 +87 161 5 879875893 +110 11 4 886987922 +90 180 4 891384065 +178 16 4 882823905 +18 152 3 880130515 +151 51 4 879543055 +144 165 4 888105993 +56 169 4 892683248 +160 7 3 876767822 +64 62 2 889740654 +189 176 4 893265214 +106 196 5 881450578 +26 150 3 891350750 +90 83 5 891383687 +26 127 5 891386368 +94 55 4 885873653 +181 13 2 878962465 +42 118 4 881105505 +102 96 3 888801316 +22 154 4 878886423 +11 40 3 891905279 +62 3 3 879372325 +81 98 5 876534854 +20 144 2 879669401 +64 70 5 889739158 +123 132 3 879872672 +1 100 5 878543541 +115 9 5 881171982 +43 173 5 875981190 +92 22 3 875653121 +158 117 3 880132719 +42 72 3 881108229 +198 33 3 884209291 +157 147 5 886890342 +178 196 4 882827834 +130 143 5 876251922 +132 154 4 891278996 +70 191 3 884149340 +151 163 4 879542723 +200 56 4 884128858 +94 17 2 891721494 +42 95 5 881107220 +193 56 1 889125572 +38 133 2 892429873 +95 79 4 879196231 +21 148 1 874951482 +72 51 4 880036946 +22 194 5 878886607 +6 87 4 883602174 +103 69 3 880420585 +145 195 5 882181728 +31 79 2 881548082 +114 100 5 881259927 +193 147 2 890860290 +10 127 5 877886661 +198 154 4 884208098 +183 54 2 891467546 +161 187 3 891170998 +22 195 4 878887810 +59 101 5 888206605 +156 11 2 888185906 +65 7 1 879217290 +59 33 3 888205265 +119 40 4 886176993 +109 162 2 880578358 +82 8 4 878769292 +10 133 5 877891904 +108 14 5 879879720 +130 44 4 875801662 +63 126 3 875747556 +95 43 2 880572356 +24 9 5 875323745 +161 191 2 891171734 +165 91 4 879525756 +115 50 5 881172049 +158 186 3 880134913 +56 7 5 892679439 +117 25 4 881009470 +184 9 5 
889907685 +174 56 5 886452583 +102 79 2 888801316 +10 98 4 877889261 +200 125 5 876041895 +11 94 3 891905324 +64 154 4 889737943 +60 77 4 883327040 +109 58 4 880572950 +92 28 3 875653050 +1 154 5 878543541 +184 143 3 889908903 +74 124 3 888333542 +90 143 5 891383204 +95 191 5 879198161 +114 96 3 881259955 +116 137 2 876454308 +28 70 4 881961311 +114 186 3 881260352 +85 163 3 882813312 +158 184 3 880134407 +59 183 5 888204802 +115 178 5 881172246 +97 32 5 884239791 +198 183 5 884207654 +141 106 5 884585195 +194 192 5 879521253 +38 88 5 892430695 +122 46 5 879270567 +10 1 4 877888877 +87 118 4 879876162 +108 137 5 879879941 +7 176 3 891350782 +62 168 5 879373711 +82 199 4 878769888 +158 148 4 880132613 +134 15 5 891732726 +118 134 5 875384916 +151 189 5 879528495 +189 127 4 893263994 +174 138 1 891551778 +42 77 5 881108684 +130 41 3 875801662 +83 35 1 886534501 +20 98 3 879669547 +41 181 4 890687175 +1 161 4 875072303 +56 164 4 892910604 +45 108 4 881014620 +70 69 4 884065733 +22 168 5 878886517 +144 160 2 888106181 +16 195 5 877720298 +161 135 2 891170656 +56 77 3 892679333 +1 62 3 878542282 +198 174 5 884208326 +156 48 4 888185777 +44 147 4 878341343 +26 13 3 891373086 +195 55 4 888737417 +49 100 4 888067307 +125 88 5 879455184 +90 45 3 891385039 +195 132 5 875771441 +175 132 3 877107712 +43 56 5 875975687 +120 148 3 889490499 +174 122 1 886434421 +13 109 4 882141306 +58 13 3 884304503 +30 7 4 875140648 +64 4 3 889739138 +158 154 4 880135069 +200 140 4 884129962 +160 1 4 876768025 +64 52 3 889739625 +94 161 3 891721439 +43 77 3 883955650 +160 50 4 876767572 +48 71 3 879434850 +87 120 2 879877173 +11 51 4 891906439 +181 147 1 878963168 +87 4 5 879876524 +90 33 4 891383600 +130 68 5 875216283 +71 154 3 877319610 +68 125 1 876974096 +115 77 2 881171623 +194 180 3 879521657 +72 38 3 880037307 +194 64 5 879521936 +58 89 3 884305220 +43 155 4 883956518 +115 22 3 881171273 +11 191 4 891904270 +193 194 4 889125006 +81 147 4 876533389 +94 92 4 891721142 +85 95 4 879455114 
+23 50 4 874784440 +58 120 2 892242765 +60 199 5 883326339 +62 14 4 879372851 +91 97 5 891438947 +93 125 1 888705416 +62 162 4 879375843 +6 100 5 883599176 +96 96 4 884403531 +125 50 5 892836362 +24 117 4 875246216 +154 135 5 879139003 +64 125 2 889739678 +184 164 3 889911434 +114 179 5 881260611 +73 173 5 888625292 +123 143 5 879872406 +98 173 1 880498935 +62 55 5 879373692 +96 79 4 884403500 +10 144 4 877892110 +194 95 3 879521719 +96 198 5 884403465 +58 194 3 884304747 +182 123 4 885612994 +128 54 2 879968415 +94 23 5 885870284 +70 193 4 884149646 +144 195 5 888105081 +13 11 1 882397146 +76 89 4 875027507 +1 188 3 875073128 +70 186 4 884065703 +92 2 3 875653699 +43 71 4 883955675 +49 179 5 888066446 +44 176 5 883613372 +58 32 5 884304812 +1 102 2 889751736 +1 69 3 875072262 +89 150 5 879441452 +94 8 5 885873653 +158 124 4 880134261 +82 174 5 878769478 +64 157 4 879365491 +62 47 4 879375537 +90 155 5 891385040 +177 59 4 880130825 +121 181 5 891390014 +152 157 5 882476486 +96 176 4 884403758 +14 18 3 879119260 +102 102 3 883748488 +7 118 2 891353411 +92 73 3 875656474 +16 7 5 877724066 +7 53 5 891354689 +11 12 2 891904194 +85 179 4 879454272 +56 64 5 892678482 +194 70 3 879522324 +145 122 1 888398307 +87 90 2 879877127 +75 118 3 884050760 +43 51 1 883956562 +120 125 4 889490447 +186 95 3 879024535 +20 87 5 879669746 +178 39 2 882827645 +59 173 5 888205144 +44 161 4 878347634 +23 109 3 874784466 +1 170 5 876892856 +92 82 2 875654846 +198 198 4 884207654 +72 7 1 880036347 +128 196 5 879967550 +168 9 1 884287394 +59 64 5 888204309 +177 23 5 880130758 +7 99 5 891352557 +189 89 5 893265624 +109 67 5 880580719 +109 173 5 880572786 +90 151 2 891385190 +94 7 4 885873089 +92 56 5 875653271 +189 198 4 893265657 +95 190 4 888954513 +117 179 5 881012776 +70 175 3 884150422 +194 100 4 879539305 +1 38 3 878543075 +199 1 1 883782854 +124 98 4 890287822 +96 185 5 884403866 +137 121 5 881432881 +1 9 5 878543541 +144 173 5 888105902 +37 68 5 880915902 +73 59 5 888625980 +73 135 5 
888626371 +13 89 4 882139717 +181 137 2 878962465 +82 97 4 878769777 +119 52 3 890627339 +116 193 4 876453681 +62 9 4 879372182 +77 133 2 884752997 +10 82 4 877886912 +12 170 4 879959374 +90 52 5 891385522 +90 127 4 891383561 +17 117 3 885272724 +64 168 5 889739243 +28 11 4 881956144 +174 158 2 886514921 +83 64 5 887665422 +158 20 4 880134261 +81 1 4 876534949 +38 112 5 892432751 +195 47 5 876632643 +200 58 4 884129301 +13 23 5 882139937 +11 168 3 891904949 +37 89 4 880930072 +145 12 5 882182917 +144 68 2 888105665 +197 188 3 891409982 +43 88 5 883955702 +59 83 4 888204802 +17 150 5 885272654 +144 24 4 888104541 +22 187 5 878887680 +94 154 5 886008791 +42 1 5 881105633 +38 200 5 892432180 +38 69 5 892430486 +57 111 4 883697679 +87 132 5 879877930 +151 136 4 879524293 +5 99 3 875721216 +150 151 4 878746824 +189 131 4 893265710 +11 70 4 891904573 +200 99 5 884128858 +145 150 5 875270655 +70 181 4 884064416 +6 21 3 883600152 +18 6 5 880130764 +94 11 5 885870231 +89 13 2 879441672 +176 111 4 886048040 +85 190 4 879453845 +37 27 4 880915942 +117 33 4 881011697 +200 188 4 884129160 +110 173 1 886988909 +159 24 5 880989865 +99 28 3 885680578 +96 187 5 884402791 +26 1 3 891350625 +90 162 5 891385190 +64 81 4 889739460 +121 124 5 891388063 +92 167 3 875656557 +23 95 4 874786220 +194 31 3 879549793 +65 65 3 879216672 +85 195 3 882995132 +177 154 4 880130600 +158 173 5 880134913 +178 123 4 882824325 +137 181 5 881433015 +24 127 5 875323879 +13 51 3 882399419 +131 124 5 883681313 +175 100 2 877107712 +109 179 4 880577961 +138 13 4 879023345 +66 24 3 883601582 +194 154 3 879546305 +1 22 4 875072404 +119 50 5 874774718 +5 21 3 875635327 +1 21 1 878542772 +178 2 4 882827375 +83 2 4 881971771 +13 4 5 882141306 +42 15 4 881105633 +168 125 4 884287731 +110 96 4 886988449 +144 20 4 888104559 +193 187 4 890860351 +200 1 5 876042340 +59 51 5 888206095 +198 187 4 884207239 +151 98 4 879524088 +99 64 5 885680578 +178 197 2 882826720 +21 123 4 874951382 +130 132 5 875802006 +27 50 3 
891542897 +135 173 4 879857723 +95 127 4 879195062 +85 150 3 890255432 +160 169 4 876862077 +1 179 3 875072370 +56 151 4 892910207 +110 69 4 886987860 +128 193 3 879967249 +198 173 4 884207492 +49 91 5 888066979 +92 122 3 875907535 +37 127 4 880930071 +62 188 3 879373638 +125 56 1 879454345 +13 96 4 882140104 +92 153 4 875653605 +69 123 4 882126125 +186 79 5 879023460 +138 187 5 879024043 +22 53 3 878888107 +118 180 5 875385136 +115 7 5 881171982 +6 200 3 883602422 +101 111 2 877136686 +10 162 4 877892210 +26 129 4 891350566 +25 141 4 885852720 +10 161 4 877892050 +175 64 5 877107552 +189 44 4 893266376 +44 143 4 878347392 +37 92 4 880930072 +92 117 4 875640214 +177 161 3 880130915 +114 89 5 881260024 +81 100 3 876533545 +44 1 4 878341315 +99 92 4 885680837 +59 56 5 888204465 +196 70 3 881251842 +90 193 4 891383752 +18 65 5 880130333 +87 38 5 879875940 +1 187 4 874965678 +2 111 4 888551853 +82 111 4 876311423 +101 181 4 877137015 +18 79 4 880131450 +95 98 4 879197385 +160 182 5 876770311 +128 172 3 879967248 +72 147 5 880037702 +123 9 5 879873726 +70 150 3 884065247 +21 17 4 874951695 +151 52 5 879524586 +178 176 4 882826782 +84 98 4 883453755 +7 97 5 891351201 +23 175 5 874785526 +148 69 5 877019101 +64 32 1 889739346 +151 69 4 879524368 +7 135 5 891351547 +95 140 3 879199014 +97 189 4 884238887 +110 55 3 886988449 +22 85 5 878886989 +64 143 4 889739051 +168 121 4 884287731 +115 121 3 881170065 +87 167 4 879876703 +193 73 3 889127237 +1 135 4 875072404 +84 15 4 883449993 +60 97 3 883326215 +59 9 4 888203053 +189 196 5 893266204 +87 100 5 879876488 +41 196 3 890687593 +83 66 4 880307898 +174 1 3 886433898 +24 55 5 875323308 +6 165 5 883600747 +60 181 4 883326754 +49 145 1 888067460 +184 117 2 889907995 +102 56 3 888801360 +89 7 5 879441422 +7 192 4 891352010 +46 125 4 883616284 +128 191 4 879967080 +102 182 3 889362833 +60 121 4 883327664 +95 183 5 879197329 +54 7 4 880935294 +58 176 4 884304936 +186 106 2 879023242 +18 60 4 880132055 +5 135 4 875637536 +184 166 3 
889910684 +157 50 4 886890541 +92 29 3 875656624 +95 175 5 879197603 +196 66 3 881251911 +117 122 2 886022187 +125 79 5 879454100 +60 144 4 883325944 +194 197 4 879522021 +194 135 3 879521474 +158 120 1 880134014 +65 50 5 879217689 +185 181 4 883524475 +26 151 3 891372429 +102 185 3 888802940 +184 127 5 889907396 +85 10 4 879452898 +55 117 3 878176047 +158 168 5 880134948 +195 127 5 875771441 +7 91 3 891353860 +54 25 4 880936500 +38 84 5 892430937 +120 15 4 889490244 +95 180 3 880570852 +97 1 4 884238911 +28 164 4 881960945 +1 68 4 875072688 +96 174 5 884403020 +177 12 5 880130825 +95 91 5 880573288 +182 191 4 876435434 +106 12 4 881451234 +55 181 4 878176237 +42 173 5 881107220 +87 62 5 879875996 +115 183 5 881171488 +183 77 3 891466405 +79 19 5 891271792 +11 56 4 891904949 +72 134 5 880037793 +135 98 5 879857765 +44 98 2 878347420 +14 12 5 890881216 +1 146 4 875071561 +115 4 4 881172117 +130 54 5 876251895 +13 99 4 882398654 +58 124 5 884304483 +75 123 3 884050164 +38 70 5 892432424 +42 83 4 881108093 +10 50 5 877888545 +151 137 5 879528754 +58 11 5 884305019 +65 185 4 879218449 +84 111 4 883453108 +1 176 5 876892468 +96 42 1 884403214 +89 187 5 879461246 +18 4 3 880132150 +96 7 5 884403811 +141 121 4 884585071 +18 45 5 880130739 +122 193 4 879270605 +194 178 3 879521253 +23 14 4 874784440 +145 89 4 882181605 +195 59 3 888737346 +54 24 1 880937311 +65 168 4 879217851 +151 86 5 879524345 +60 195 4 883326086 +43 189 5 875981220 +1 166 5 874965677 +152 120 2 880149686 +189 172 5 893265683 +43 25 5 875975656 +123 197 5 879872066 +101 1 3 877136039 +1 138 1 878543006 +102 175 4 892991117 +160 13 4 876768990 +98 168 2 880498834 +64 97 3 889738085 +187 97 3 879465717 +119 96 5 874781257 +62 56 5 879373711 +92 200 3 875811717 +181 15 3 878962816 +151 118 3 879542588 +190 125 3 891033863 +60 128 3 883326566 +94 190 5 885870231 +1 89 5 875072484 +110 33 4 886988631 +92 198 5 875653016 +158 96 4 880134332 +132 56 5 891278996 +194 90 3 879552841 +1 2 3 876893171 +175 193 4 
877108098 +194 194 4 879523575 +196 108 4 881252110 +160 100 5 876767023 +43 82 4 883955498 +14 127 2 879644647 +162 11 4 877636772 +152 71 5 882900320 +6 22 3 883602048 +44 200 4 878347633 +71 64 4 885016536 +76 42 3 882606243 +13 83 2 886303585 +176 151 4 886048305 +193 38 3 889126055 +77 97 2 884753292 +128 132 3 879966785 +124 172 3 890287645 +90 117 3 891385389 +168 126 5 884287962 +95 82 3 879196408 +37 82 1 880915942 +10 157 5 877889004 +198 25 2 884205114 +90 175 3 891383912 +158 118 5 880132638 +6 50 4 883600842 +192 50 4 881367505 +56 183 5 892676314 +38 97 5 892430369 +94 25 3 891724142 +15 14 4 879455659 +23 124 5 874784440 +59 123 3 888203343 +151 152 3 879525075 +110 64 4 886987894 +104 126 4 888465513 +117 172 5 881012623 +189 105 2 893264865 +6 169 4 883600943 +80 100 5 887401453 +95 199 5 880570964 +56 158 3 892911539 +177 121 2 880131123 +165 15 5 879525799 +104 10 2 888465413 +57 125 3 883697223 +87 48 4 879875649 +144 187 4 888105312 +97 135 5 884238652 +110 94 4 886989473 +44 135 5 878347259 +44 132 4 878347315 +59 59 5 888204928 +198 168 4 884207654 +52 22 5 882922833 +64 50 5 889737914 +16 143 5 877727192 +94 77 3 891721462 +92 91 3 875660164 +64 162 3 889739262 +23 132 4 874785756 +18 168 3 880130431 +82 168 5 878769748 +178 82 5 882826242 +200 69 5 884128788 +62 70 3 879373960 +130 27 4 875802105 +7 143 3 892132627 +13 200 3 882140552 +87 199 5 879875649 +18 153 4 880130551 +95 31 4 888954513 +64 22 4 889737376 +200 169 5 884128822 +15 13 1 879455940 +59 161 3 888205855 +59 22 4 888204260 +85 57 5 879828107 +83 71 3 880328167 +16 95 5 877728417 +59 99 4 888205033 +53 121 4 879443329 +184 183 4 889908630 +165 176 4 879526007 +184 44 4 889909746 +95 170 5 880573288 +20 181 4 879667904 +125 195 5 892836465 +144 196 4 888105743 +189 99 5 893265684 +199 116 5 883782807 +60 174 4 883326497 +128 121 4 879968278 +89 111 4 879441452 +180 186 4 877127189 +43 111 4 883955745 +12 133 4 879959670 +114 56 3 881260545 +184 176 4 889908740 +192 121 2 
881368127 +85 188 2 879454782 +22 167 3 878887023 +16 79 5 877727122 +60 8 3 883326370 +11 57 2 891904552 +94 176 4 891720570 +198 101 5 884209569 +64 11 4 889737376 +151 171 5 879524921 +188 28 3 875072972 +51 83 5 883498937 +135 56 4 879857765 +77 56 4 884752900 +200 177 4 884129656 +92 71 5 875654888 +92 12 5 875652934 +1 30 3 878542515 +177 55 3 880131143 +123 100 4 879872792 +85 170 4 879453748 +5 25 3 875635318 +85 100 3 879452693 +1 63 2 878543196 +18 61 4 880130803 +151 185 4 879528801 +102 168 3 888803537 +7 98 4 891351002 +5 186 5 875636375 +85 28 4 879829301 +82 9 4 876311146 +141 7 5 884584981 +92 92 4 875654846 +59 3 4 888203814 +49 82 1 888067765 +87 22 4 879875817 +128 71 4 879967576 +110 56 1 886988449 +118 7 5 875385198 +30 2 3 875061066 +16 4 5 877726390 +128 197 4 879966729 +174 12 5 886439091 +158 89 5 880133189 +175 147 3 877108146 +7 199 5 892135346 +37 174 5 880915810 +92 54 3 875656624 +94 179 5 885870577 +152 69 5 882474000 +63 108 2 875748164 +113 7 3 875076827 +151 70 4 879524947 +59 55 5 888204553 +66 127 4 883601156 +7 23 3 891351383 +138 182 4 879023948 +58 185 2 884304896 +56 200 4 892679088 +151 181 5 879524394 +42 54 4 881108982 +177 50 5 880131216 +114 156 4 881309662 +90 70 5 891383866 +7 175 5 892133057 +52 121 4 882922382 +177 153 4 880130972 +22 105 1 878887347 +94 192 4 891721142 +44 100 5 878341196 +183 55 4 891466266 +5 194 4 878845197 +18 165 4 880129527 +80 154 3 887401307 +181 105 1 878963304 +95 168 4 879197970 +95 28 4 879197603 +1 32 5 888732909 +94 111 4 891721414 +49 159 2 888068245 +145 156 5 875271896 +90 89 5 891385039 +157 100 5 886890650 +153 50 1 881371140 +96 194 2 884403392 +70 24 4 884064743 +83 69 4 887665549 +83 15 4 880307000 +7 187 4 891350757 +62 50 5 879372216 +53 64 5 879442384 +11 79 4 891905783 +109 79 5 880572721 +177 92 4 882142295 +76 7 4 875312133 +121 165 4 891388210 +193 82 2 889125880 +94 187 4 885870362 +64 82 3 889740199 +38 127 2 892429460 +18 91 3 880130393 +91 132 3 891439503 +178 38 3 
882827574 +70 8 4 884064986 +31 32 5 881548030 +182 111 4 885613238 +162 144 3 877636746 +43 97 5 883955293 +5 183 4 875636014 +136 137 5 882693339 +20 94 2 879669954 +1 141 3 878542608 +69 42 5 882145548 +84 1 2 883452108 +178 24 3 882824221 +119 56 4 874781198 +200 28 5 884128458 +5 29 4 875637023 +73 32 4 888626220 +24 180 5 875322847 +109 181 5 880563471 +43 196 4 875981190 +42 43 2 881109325 +97 132 5 884238693 +57 11 3 883698454 +198 1 4 884205081 +90 136 5 891383241 +95 70 4 880571951 +158 39 5 880134398 +85 194 4 879454189 +23 100 5 874784557 +113 124 3 875076307 +118 79 5 875384885 +194 121 2 879539794 +167 96 5 892738307 +31 175 5 881548053 +96 195 5 884403159 +57 64 5 883698431 +122 180 5 879270327 +177 11 4 880131161 +148 50 5 877016805 +17 137 4 885272606 +91 135 4 891439302 +94 90 3 891721889 +145 23 4 875271896 +18 200 3 880131775 +59 111 4 888203095 +132 175 3 891278807 +15 50 5 879455606 +118 132 4 875384793 +13 155 2 882399615 +2 1 4 888550871 +63 15 3 875747439 +128 133 5 879967248 +52 117 4 882922629 +193 94 3 889127592 +122 69 2 879270511 +71 175 4 885016882 +109 29 3 880582783 +178 95 5 882826514 +123 98 4 879872672 +62 1 2 879372813 +193 72 2 889127301 +92 145 2 875654929 +117 144 4 881011807 +102 91 3 883748488 +91 176 5 891439130 +44 81 4 878348499 +11 69 3 891904270 +142 124 4 888640379 +95 193 3 879198482 +67 25 4 875379420 +116 116 3 876453733 +26 126 4 891371676 +148 89 5 877398587 +10 116 4 877888944 +43 140 4 883955110 +94 66 2 891721889 +72 15 5 880035708 +115 33 4 881171693 +14 96 4 890881433 +85 197 5 879455197 +94 56 5 891725331 +178 90 3 882827985 +92 100 5 875640294 +130 82 5 875802080 +18 9 5 880130550 +26 181 4 891386369 +189 132 5 893265865 +194 69 4 879521595 +44 159 3 878347633 +145 117 5 875270655 +85 30 3 882995290 +176 25 3 886048188 +92 143 3 875653960 +156 178 5 888185777 +118 53 5 875385280 +200 107 3 884128022 +73 171 5 888626199 +137 174 5 881433654 +128 159 4 879968390 +5 101 5 878844510 +144 69 5 888105140 +161 
181 2 891171848 +44 25 2 878346431 +94 93 4 891724282 +92 160 4 875654125 +87 21 3 879877173 +60 173 4 883326498 +1 40 3 876893230 +13 191 3 881515193 +178 127 5 882823978 +43 133 4 875981483 +42 58 5 881108040 +177 176 4 880130951 +161 186 4 891171530 +42 125 4 881105462 +75 114 4 884051893 +102 38 2 888801622 +18 94 3 880131676 +138 133 4 879024043 +26 24 3 891377540 +91 182 4 891439439 +6 47 3 883600943 +198 56 5 884207392 +43 86 4 883955020 +1 133 4 876892818 +90 26 4 891385842 +42 175 2 881107687 +144 144 4 888105254 +159 72 3 884026946 +64 191 4 889740740 +116 191 4 876453961 +62 91 4 879375196 +190 15 4 891033697 +97 183 5 884238911 +183 176 3 891466266 +70 83 4 884065895 +197 56 1 891409799 +96 181 5 884403687 +15 118 1 879456381 +44 24 3 878346575 +120 121 4 889490290 +58 171 5 884663379 +58 172 5 884305241 +118 56 5 875385198 +199 93 4 883782825 +102 53 2 888801577 +24 69 5 875323051 +7 140 5 891353124 +53 118 4 879443253 +16 11 5 877718755 +188 5 4 875074266 +8 195 5 879362287 +85 27 4 879827488 +60 59 5 883326155 +64 182 4 889738030 +102 29 1 888802677 +109 64 2 880572560 +124 28 3 890287068 +158 194 5 880134913 +91 98 5 891439130 +7 100 5 891351082 +23 82 3 874787449 +97 197 3 884239655 +118 135 5 875384591 +178 97 5 882827020 +25 143 3 885852529 +43 3 2 884029543 +15 15 4 879455939 +87 144 4 879875734 +130 98 5 875216507 +109 77 4 880578388 +119 22 4 874781698 +99 125 4 885678840 +177 200 4 880130951 +145 54 5 888398669 +141 118 5 884585274 +16 200 5 877722736 +70 161 3 884067638 +152 161 5 882476363 +57 24 3 883697459 +130 159 4 875802211 +18 166 4 880129595 +64 179 5 889739460 +198 121 3 884206330 +85 153 3 879453658 +38 188 2 892431953 +27 148 3 891543129 +97 96 5 884239712 +194 50 3 879521396 +13 95 5 882140104 +65 63 2 879217913 +82 99 4 878769949 +102 194 3 888803537 +109 70 4 880578038 +7 27 4 891352692 +90 170 5 891383561 +71 197 5 885016990 +38 105 3 892434217 +200 179 4 884129029 +59 52 4 888205615 +184 82 3 889909934 +83 191 4 880308038 +83 
121 4 880306951 +144 87 5 888105548 +92 64 4 875653519 +184 20 4 889907771 +141 127 2 884584735 +7 77 5 891353325 +130 31 4 875801801 +194 9 4 879535704 +200 89 5 884128788 +18 132 5 880132437 +180 153 1 877126182 +183 181 2 891463937 +49 80 1 888069117 +42 161 4 881108229 +72 118 3 880036346 +25 195 4 885852008 +127 62 5 884364950 +13 92 3 882397271 +59 194 3 888204841 +94 97 4 891721317 +11 24 3 891904016 +95 94 5 880573288 +64 183 5 889737914 +2 14 4 888551853 +152 15 5 880148843 +5 168 3 875636691 +12 195 4 879959670 +1 194 4 876892743 +90 19 3 891384020 +59 176 5 888205574 +60 95 4 883327799 +200 195 5 884128822 +82 81 3 878770059 +94 183 5 891720921 +93 1 5 888705321 +94 41 3 891723355 +64 195 5 889737914 +200 54 4 884129920 +200 98 5 884128933 +28 200 2 881961671 +95 179 3 880570909 +45 50 5 881007272 +53 96 4 879442514 +89 137 1 879441335 +125 41 2 892838510 +90 18 3 891383687 +189 24 4 893264248 +185 111 4 883524529 +130 79 5 875217392 +67 24 4 875379729 +125 109 3 892838288 +59 149 4 888203313 +195 152 3 890589490 +94 125 1 891721851 +7 56 5 891351432 +178 92 3 882827803 +158 129 5 880132383 +194 182 3 879521475 +5 50 4 875635758 +115 96 3 881172117 +24 176 5 875323595 +82 28 3 878769815 +49 13 3 888068816 +95 63 3 880572218 +60 153 3 883326733 +184 25 4 889908068 +197 39 2 891409982 +154 191 4 879138832 +119 11 5 874781198 +44 71 3 878347633 +109 71 4 880578066 +174 111 5 886433898 +41 175 5 890687526 +151 31 3 879524713 +94 83 4 885873653 +58 175 5 884663324 +62 174 4 879374916 +128 82 5 879968185 +186 121 2 879023074 +187 65 5 879465507 +13 79 3 882139746 +44 69 4 878347711 +81 150 3 876533619 +193 1 4 890859954 +187 197 4 879465597 +108 127 4 879879720 +72 9 5 880035636 +7 62 3 891354499 +59 135 5 888204758 +55 118 5 878176134 +37 147 3 880915749 +58 189 3 884304790 +73 64 5 888625042 +81 121 4 876533586 +98 88 3 880499087 +151 154 4 879524642 +104 181 5 888465972 +117 173 5 881011697 +7 29 3 891353828 +151 131 5 879525075 +26 14 3 891371505 +188 157 
3 875072674 +45 13 5 881012356 +56 117 5 892679439 +110 41 4 886989399 +184 97 2 889908539 +85 134 5 879454004 +65 28 4 879216734 +70 168 4 884065423 +132 12 4 891278867 +174 100 5 886433788 +59 11 5 888205744 +13 110 3 882141130 +84 25 3 883452462 +189 136 4 893265535 +70 183 4 884149894 +119 25 5 886177013 +56 191 4 892678526 +90 174 5 891383866 +43 172 4 883955135 +194 118 3 879539229 +109 122 2 880583493 +189 97 4 893277579 +92 7 4 876175754 +96 190 4 884402978 +10 33 4 877893020 +161 22 2 891171282 +48 183 5 879434608 +94 49 4 891722174 +87 111 4 879876611 +194 28 5 879522324 +12 168 4 879959513 +16 109 4 877719333 +85 193 3 879454189 +113 116 3 875076246 +197 22 5 891409839 +182 126 5 885613153 +85 99 5 880838306 +85 14 4 879452638 +56 88 1 892683895 +22 184 5 878887869 +138 194 5 879024184 +59 181 5 888204877 +8 174 5 879362183 +144 117 4 888103969 +24 8 5 875323002 +59 174 5 888204553 +128 26 4 879969032 +70 95 4 884065501 +132 127 4 891278937 +10 174 4 877886661 +57 126 3 883697293 +120 117 3 889490979 +69 181 5 882072778 +13 68 3 882397741 +85 182 4 893110061 +161 50 2 891170972 +184 66 4 889910013 +10 129 4 877891966 +124 154 5 890287645 +87 172 5 879875737 +7 178 4 891350932 +96 98 5 884403214 +54 1 4 880931595 +85 191 4 879455021 +130 181 5 874953621 +75 1 4 884050018 +8 82 5 879362356 +113 50 5 875076416 +115 56 5 881171409 +13 170 5 882139774 +75 196 4 884051948 +94 143 4 891722609 +181 118 2 878962955 +70 132 4 884067281 +23 7 4 874784385 +58 70 4 890321652 +92 78 3 876175191 +178 11 5 882826162 +99 121 3 885679261 +79 116 5 891271676 +60 160 4 883326525 +5 162 1 875721572 +24 11 5 875323100 +114 176 5 881260203 +5 95 4 875721168 +157 117 5 886890296 +101 117 4 877136067 +68 111 3 876974276 +114 180 3 881309718 +151 198 4 879524472 +145 17 3 875272132 +75 111 4 884050502 +25 186 4 885852569 +60 168 5 883326837 +198 6 2 884206270 +76 77 2 882607017 +10 178 5 877888677 +28 195 4 881957250 +10 11 4 877888677 +92 182 4 875653836 +95 72 2 880571389 +194 
86 3 879520991 +94 53 4 891721378 +158 123 3 880132488 +10 182 5 877888876 +87 181 5 879876194 +13 1 3 882140487 +194 22 5 879521474 +24 41 5 875323594 +58 116 5 884304409 +159 96 4 884360539 +121 127 5 891388333 +115 177 5 881172117 +109 177 4 880578358 +109 12 4 880577542 +28 56 5 881957479 +62 44 3 879374142 +110 196 4 886987978 +52 13 5 882922485 +66 50 5 883601236 +48 185 4 879434819 +152 49 5 882477402 +49 42 4 888068791 +124 144 4 890287645 +41 195 4 890687042 +18 42 3 880130713 +22 94 3 878887277 +10 134 5 877889131 +56 11 4 892676376 +138 15 4 879023389 +52 151 5 882922249 +30 172 4 875060742 +99 22 5 885679596 +10 137 4 877889186 +59 168 5 888204641 +76 137 5 875498777 +121 100 4 891388035 +195 198 3 884420000 +62 83 5 879375000 +194 8 3 879521719 +118 55 5 875385099 +144 1 4 888104063 +1 93 5 875071484 +92 120 2 875642089 +56 173 4 892737191 +84 95 4 883453642 +104 13 3 888465634 +56 111 2 892683877 +95 25 3 879192597 +7 47 5 891352692 +94 22 4 885872758 +186 98 5 891719859 +18 177 3 880131297 +119 168 5 874781351 +60 12 4 883326463 +60 166 4 883326593 +18 198 3 880130613 +125 105 3 892839021 +49 90 1 888069194 +192 108 4 881368339 +130 100 3 874953558 +1 8 1 875072484 +198 98 4 884207611 +56 78 3 892910544 +72 194 4 880037793 +43 79 4 875981335 +188 100 4 875074127 +62 195 5 879373960 +189 13 4 893264220 +44 121 4 878346946 +109 164 5 880578066 +49 96 1 888069512 +188 97 5 875071891 +22 175 4 878886682 +181 100 3 878962816 +59 61 4 888204597 +194 73 3 879527145 +164 100 5 889401998 +95 200 2 888954552 +158 92 4 880134407 +57 109 4 883697293 +13 13 5 882141617 +161 132 1 891171458 +109 125 5 880564534 +95 89 3 879196353 +156 187 5 888185778 +94 80 2 891723525 +1 105 2 875240739 +84 117 4 883450553 +1 147 3 875240993 +62 98 4 879373543 +115 23 5 881171348 +125 181 5 879454139 +95 77 4 880571746 +200 68 5 884129729 +83 25 2 883867729 +24 173 5 875323474 +137 1 3 881433048 +151 26 3 879542252 +87 127 4 879876194 +85 143 4 879456247 +83 111 3 884647519 +142 
176 5 888640455 +1 99 3 875072547 +77 127 2 884732927 +195 143 5 875771441 +104 111 1 888465675 +64 196 4 889737992 +1 1 5 874965758 +18 98 5 880129527 +92 5 4 875654432 +148 151 4 877400124 +151 132 5 879524669 +177 135 5 880130712 +20 174 4 879669087 +199 100 3 883782807 +193 23 4 889126609 +91 127 5 891439018 +64 144 3 889737771 +73 179 5 888626041 +181 117 2 878962918 +138 12 5 879024232 +200 63 4 884130415 +72 77 4 880036945 +194 76 2 879549503 +6 137 5 883599327 +198 191 4 884208682 +41 188 4 890687571 +64 121 2 889739678 +95 188 3 879196354 +7 64 5 891350756 +145 134 4 882181695 +194 13 4 879539410 +144 8 4 888105612 +57 181 5 883697352 +178 56 4 882825767 +95 153 5 879197022 +187 168 5 879465273 +49 50 1 888067691 +69 98 5 882145375 +178 9 2 882823758 +92 195 5 875652981 +26 118 3 891385691 +90 20 4 891384357 +13 138 1 882399218 +30 174 5 885941156 +71 181 3 877319414 +144 61 3 888106182 +22 24 5 878888026 +13 117 3 882398138 +131 127 4 883681418 +177 173 4 880130667 +77 15 2 884732873 +75 13 5 884050102 +13 12 5 881515011 +54 181 5 880931358 +102 187 3 888801232 +144 4 4 888105873 +49 71 3 888067096 +178 87 4 885784558 +52 111 4 882922357 +178 200 3 882826983 +186 56 3 879023460 +23 151 3 874784668 +189 7 3 893264300 +188 64 5 875071891 +15 181 5 879455710 +101 147 4 877136506 +118 171 5 875384825 +154 174 5 879138657 +25 116 4 885853335 +16 12 5 877718168 +7 157 5 891352059 +6 64 4 883600597 +97 175 5 884239616 +158 53 1 880134781 +48 191 5 879434954 +60 73 4 883326995 +194 159 3 879552401 +124 168 5 890287645 +109 156 5 880573084 +156 83 3 888185677 +158 111 4 880134261 +98 25 5 880499111 +94 200 4 891721414 +87 50 5 879876194 +95 198 5 880570823 +82 3 2 878768765 +52 19 5 882922407 +194 134 2 879521719 +60 30 5 883325944 +106 25 4 881451016 +43 9 4 875975656 +124 174 3 890287317 +184 175 3 889908985 +83 196 5 880307996 +115 174 5 881171137 +95 141 4 888954631 +181 19 1 878962392 +196 116 3 881251753 +130 11 5 875216545 +81 42 4 876534704 +174 139 3 
886515591 +181 129 2 878962279 +37 118 2 880915633 +159 126 5 880557038 +177 64 4 880130736 +97 191 5 884239472 +195 93 3 891762536 +92 171 4 875652981 +6 174 4 883600985 +130 118 4 874953895 +85 79 3 879453845 +72 174 5 880037702 +96 182 4 884402791 +95 121 4 879194114 +48 98 5 879434954 +91 50 5 891439386 +5 172 5 875636130 +175 12 4 877108146 +167 8 5 892738237 +181 18 1 878962623 +162 1 4 877635819 +189 124 5 893264048 +76 60 4 875028007 +59 79 5 888204260 +125 176 5 879454448 +152 117 4 880148782 +181 111 3 878962774 +92 80 2 875907504 +89 66 3 879459980 +62 97 2 879373795 +119 23 3 874782100 +1 197 5 875072956 +151 147 2 879524947 +161 133 2 891171023 +95 78 3 888956901 +136 116 5 882693723 +1 173 5 878541803 +13 7 2 882396790 +122 11 1 879270424 +89 100 5 879441271 +1 75 4 878543238 +68 25 4 876974176 +18 66 3 880131728 +198 117 1 884205114 +184 51 4 889909069 +198 143 3 884208951 +197 172 5 891409839 +46 50 4 883616254 +118 98 5 875384979 +102 127 2 888801316 +5 70 4 875636389 +29 79 4 882821989 +160 185 5 876861185 +6 13 2 883599400 +130 5 4 876251650 +109 55 2 880572756 +62 170 3 879373848 +194 183 3 879520916 +185 9 4 883524396 +199 7 4 883782854 +115 98 3 881171409 +128 25 3 879968185 +74 7 4 888333458 +59 147 5 888203270 +1 34 2 878542869 +62 53 2 879376270 +27 118 3 891543222 +94 132 4 891720862 +184 182 4 889908497 +158 181 3 880132383 +138 56 5 879024232 +69 50 5 882072748 +198 55 3 884207525 +18 95 4 880131297 +181 104 1 878962866 +77 175 4 884733655 +197 2 3 891409981 +62 196 4 879374015 +60 185 4 883326682 +181 93 1 878962773 +59 44 4 888206048 +7 80 4 891354381 +141 147 4 884584906 +119 132 5 874782228 +49 98 4 888067307 +7 51 2 891352984 +23 161 2 874787017 +13 193 5 882139937 +144 135 5 888105364 +18 157 3 880131849 +72 2 3 880037376 +20 151 3 879668555 +188 177 4 875073329 +184 121 2 889908026 +68 121 1 876974176 +151 50 5 879525034 +110 2 3 886988536 +193 199 5 889125535 +181 126 2 878962585 +95 71 5 880573288 +151 81 5 879524293 +198 161 3 
884208454 +23 185 4 874785756 +120 1 4 889490412 +114 197 4 881260506 +197 11 1 891409893 +94 160 4 891721942 +87 80 4 879877241 +13 177 5 882397271 +64 190 4 889737851 +125 191 5 879454385 +59 58 4 888204389 +1 144 4 875073180 +97 153 5 884239686 +116 20 3 892683858 +62 129 3 879372276 +174 147 4 886433936 +37 50 5 880915838 +23 155 3 874787059 +81 111 3 876534174 +6 186 4 883602730 +189 166 4 893265657 +58 127 4 884304503 +116 47 3 876454238 +109 174 5 880572721 +70 99 4 884067222 +188 54 4 875074589 +94 142 3 891721749 +92 157 4 875653988 +13 183 4 882397271 +75 25 5 884049875 +11 180 2 891904335 +160 15 2 876768609 +132 151 3 891278774 +6 191 4 883601088 +144 15 4 888104150 +59 137 5 888203234 +193 25 4 889127301 +26 109 3 891376987 +102 82 2 888801360 +5 151 3 875635723 +178 153 4 882826347 +52 126 5 882922589 +95 144 5 879197329 +94 98 4 891721192 +82 181 4 876311241 +72 98 5 880037417 +72 25 5 880035588 +128 168 4 879966685 +174 140 4 886515514 +130 62 4 876252175 +32 9 3 883717747 +200 43 3 884129814 +104 3 3 888465739 +188 121 4 875073647 +200 141 4 884129346 +83 4 2 880336655 +82 121 4 876311387 +56 67 2 892677114 +92 4 4 875654222 +18 125 3 880131004 +23 79 4 874785957 +12 159 4 879959306 +24 200 5 875323440 +194 87 4 879523104 +106 161 3 881452816 +184 79 3 889909551 +94 29 2 891723883 +151 9 4 879524199 +59 1 2 888203053 +57 105 3 883698009 +79 93 2 891271676 +189 162 3 893266230 +99 1 4 886518459 +6 178 4 883600785 +25 131 4 885852611 +11 86 4 891904551 +128 181 4 879966954 +64 188 4 889739586 +21 145 1 874951761 +174 9 5 886439492 +91 136 4 891438909 +109 82 5 880572680 +94 173 4 885872758 +87 89 4 879875818 +184 93 4 889907771 +28 176 5 881956445 +197 38 3 891410039 +182 181 5 885612967 +158 55 4 880134407 +65 88 4 879217942 +64 135 4 889737889 +92 125 4 876175004 +73 153 3 888626007 +109 88 4 880581942 +53 7 3 879442991 +1 119 5 876893098 +56 176 5 892676377 +152 133 5 882474845 +1 26 3 875072442 +109 118 3 880571801 +22 163 1 878886845 +115 13 5 
881171983 +44 90 2 878348784 +139 150 4 879538327 +42 66 4 881108280 +62 159 3 879375762 +64 194 5 889737710 +10 32 4 877886661 +66 121 3 883601834 +3 181 4 889237482 +23 91 4 884550049 +13 91 2 882398724 +56 154 2 892911144 +23 145 3 874786244 +72 97 4 880036638 +59 87 4 888205228 +174 15 5 886434065 +95 186 5 880573288 +71 14 5 877319375 +162 105 2 877636458 +7 134 4 892134959 +158 187 5 880134332 +2 25 4 888551648 +51 173 5 883498844 +7 194 5 891351851 +178 143 4 882827574 +198 70 3 884207691 +200 117 5 876042268 +198 132 4 884208137 +148 175 4 877016259 +194 91 3 879524892 +27 9 4 891542942 +62 62 3 879375781 +72 170 3 880037793 +23 156 3 877817091 +23 174 4 874785652 +73 154 5 888625343 +83 174 5 880307699 +85 69 4 879454582 +57 8 4 883698292 +104 130 1 888465554 +174 151 3 886434013 +102 188 2 888801812 +1 158 3 878542699 +1 37 2 878543030 +194 15 4 879539127 +23 134 4 874786098 +14 32 5 890881485 +91 31 5 891438875 +76 56 5 875027739 +67 105 4 875379683 +198 200 4 884207239 +151 111 4 879542775 +96 56 5 884403336 +44 89 5 878347315 +137 89 5 881433719 +28 117 4 881957002 +85 168 4 879454304 +22 109 4 878886710 +184 118 2 889908344 +13 194 5 882141458 +54 151 2 880936670 +151 134 4 879524131 +174 31 4 886434566 +197 29 3 891410170 +1 181 5 874965739 +21 200 5 874951695 +91 187 5 891438908 +85 180 4 879454820 +128 70 3 879967341 +189 191 5 893265402 +57 42 5 883698324 +194 155 3 879550737 +175 172 5 877107339 +83 161 4 887665549 +6 166 4 883601426 +160 192 5 876861185 +18 134 5 880129877 +130 24 5 874953866 +56 97 3 892677186 +16 1 5 877717833 +93 151 1 888705360 +99 117 5 885678784 +82 21 1 884714456 +187 69 4 879465566 +26 121 3 891377540 +109 98 4 880572755 +189 83 4 893265624 +160 168 4 876858091 +144 70 4 888105587 +94 54 4 891722432 +151 14 5 879524325 +44 196 4 878348885 +11 98 2 891905783 +198 185 3 884209264 +1 136 3 876893206 +178 50 5 882823857 +94 191 5 885870175 +188 22 5 875072459 +180 12 2 877355568 +194 174 4 879520916 +144 48 5 888105197 +26 
50 4 891386368 +97 82 4 884239552 +6 194 4 883601365 +185 25 4 883525206 +13 32 4 882140286 +160 151 4 876769097 +194 81 2 879523576 +174 69 5 886514201 +160 153 3 876860808 +102 72 3 888803602 +161 100 4 891171127 +76 172 5 882606080 +21 7 5 874951292 +178 155 4 882828021 +13 5 1 882396869 +21 5 2 874951761 +65 15 5 879217138 +178 184 5 882827947 +159 195 3 884360539 +180 98 5 877544444 +109 72 5 880577892 +48 136 4 879434689 +130 65 4 875216786 +60 50 5 883326566 +141 126 5 884585642 +145 123 4 879161848 +22 21 4 878886750 +6 177 4 883600818 +62 199 4 879373692 +200 48 2 884129029 +99 25 3 885679025 +154 61 4 879138657 +44 99 4 878348812 +2 10 2 888551853 +14 124 5 876964936 +58 12 5 884304895 +121 83 4 891388210 +83 70 4 880308256 +16 58 4 877720118 +73 152 3 888626496 +200 161 4 884128979 +90 153 5 891384754 +138 117 4 879023245 +23 19 4 874784466 +101 7 3 877135944 +151 88 5 879542645 +11 39 3 891905824 +59 199 4 888205410 +178 62 4 882827083 +1 131 1 878542552 +13 17 1 882396954 +6 12 4 883601053 +184 50 4 889907396 +95 182 2 879198210 +160 195 4 876859413 +94 81 4 885870577 +180 196 5 877355617 +141 25 5 884585105 +92 174 5 875654189 +189 150 4 893277702 +91 192 4 891439302 +109 111 4 880564570 +92 116 3 875640251 +121 118 2 891390501 +13 128 1 882397502 +29 182 4 882821989 +200 193 4 884129209 +69 7 5 882126086 +60 69 4 883326215 +56 42 4 892676933 +56 53 3 892679163 +58 25 4 884304570 +94 164 3 891721528 +188 118 3 875072972 +159 7 5 880485861 +65 179 3 879216605 +90 194 5 891383424 +30 50 3 875061066 +185 116 4 883526268 +85 152 5 879454751 +72 187 4 880036638 +1 109 5 874965739 +90 126 2 891384611 +152 66 5 886535773 +1 182 4 875072520 +108 124 4 879879757 +96 89 5 884402896 +145 88 5 875272833 +151 195 3 879524642 +1 71 3 876892425 +90 56 5 891384516 +198 195 3 884207267 +49 57 4 888066571 +23 102 3 874785957 +85 23 4 879454272 +44 64 5 878347915 +92 72 3 875658159 +43 28 4 875981452 +11 15 5 891903067 +95 67 2 879198109 +23 131 4 884550021 +142 186 4 
888640430 +90 154 5 891384516 +116 11 5 886310197 +77 31 3 884753292 +139 100 5 879538199 +125 172 5 879454448 +95 65 4 879197918 +12 161 5 879959553 +59 184 4 888206094 +72 56 5 880037702 +96 127 5 884403214 +118 175 5 875384885 +148 132 4 877020715 +175 133 4 877107390 +13 179 2 882140206 +58 135 4 884305150 +55 79 5 878176398 +70 173 4 884149452 +125 122 1 892839312 +70 28 4 884065757 +152 21 3 880149253 +109 15 4 880577868 +90 10 5 891383987 +91 143 4 891439386 +75 56 5 884051921 +82 1 4 876311241 +6 71 4 883601053 +58 151 3 884304553 +26 111 3 891371437 +37 96 4 880915810 +110 43 3 886988100 +13 86 1 881515348 +200 118 4 876042299 +193 69 5 889125287 +90 141 5 891385899 +23 55 4 874785624 +90 134 5 891383204 +55 22 5 878176397 +200 132 5 884130792 +184 40 4 889910326 +154 197 5 879139003 +136 124 5 882693489 +18 197 4 880130109 +62 128 2 879374866 +99 173 4 885680062 +42 135 4 881109148 +44 67 3 878348111 +59 97 5 888205921 +176 117 4 886048305 +1 46 4 876893230 +130 67 4 876252064 +90 64 4 891383912 +44 163 4 878348627 +109 17 4 880582132 +59 106 4 888203959 +115 124 5 881170332 +81 25 5 876533946 +65 73 4 879217998 +144 124 4 888104063 +46 100 4 883616134 +23 8 4 874785474 +99 105 2 885679353 +190 121 3 891033773 +200 91 4 884129814 +21 185 5 874951658 +106 14 4 881449486 +43 174 4 875975687 +43 12 5 883955048 +178 124 4 882823758 +184 165 4 889911178 +73 1 2 888626065 +56 153 4 892911144 +153 181 1 881371140 +109 186 3 880572786 +144 55 4 888105254 +1 169 5 878543541 +97 195 5 884238966 +125 186 3 879454448 +122 135 4 879270327 +136 9 5 882693429 +18 23 4 880130065 +148 181 5 877399135 +94 196 4 891721462 +77 25 2 884733055 +95 102 4 880572474 +115 137 5 881169776 +151 97 5 879528801 +178 144 4 882825768 +6 168 4 883602865 +70 135 4 884065387 +184 185 4 889908843 +64 101 2 889740225 +193 177 4 890860290 +16 98 5 877718107 +82 7 3 876311217 +163 28 3 891220019 +11 125 4 891903108 +52 25 5 882922562 +85 121 2 879453167 +49 25 2 888068791 +198 153 4 884207858 
+41 97 3 890687665 +18 178 3 880129628 +73 175 5 888625785 +84 87 5 883453587 +13 33 5 882397581 +145 173 5 875272604 +148 56 5 877398212 +48 194 4 879434819 +87 195 5 879875736 +92 51 4 875812305 +1 41 2 876892818 +1 162 4 878542420 +70 174 5 884065782 +31 136 5 881548030 +65 98 4 879218418 +188 162 4 875072972 +38 155 5 892432090 +89 50 5 879461219 +152 8 5 882829050 +181 6 1 878962866 +1 110 1 878542845 +95 95 3 879198109 +7 73 3 892133154 +178 77 4 882827947 +189 50 5 893263994 +13 22 4 882140487 +198 73 3 884208419 +153 182 5 881371198 +135 55 4 879857797 +10 197 5 877888944 +64 91 4 889739733 +28 28 4 881956853 +58 98 4 884304747 +58 199 4 891611501 +185 160 1 883524281 +130 64 5 875801549 +177 186 4 880130990 +6 185 5 883601393 +20 148 5 879668713 +189 118 1 893264735 +174 126 5 886433166 +182 50 5 885613018 +174 178 5 886513947 +1 66 4 878543030 +13 39 3 882397581 +76 93 4 882606572 +151 93 5 879525002 +85 82 3 879454633 +26 116 2 891352941 +115 176 5 881171203 +59 180 4 888204597 +197 183 5 891409839 +72 5 4 880037418 +102 68 2 888801673 +1 77 4 876893205 +200 62 5 884130146 +59 185 5 888205228 +194 88 3 879549394 +178 7 4 882823805 +85 132 5 879453965 +72 69 4 880036579 +110 88 4 886988967 +113 9 3 875076307 +102 49 2 892992129 +187 179 5 879465782 +109 95 4 880572721 +182 150 3 885613294 +85 136 4 879454349 +167 83 5 892738384 +12 4 5 879960826 +63 25 4 875747292 +184 65 4 889909516 +184 1 4 889907652 +124 173 2 890287687 +99 147 5 885678997 +41 168 5 890687304 +71 174 2 877319610 +130 29 3 878537558 +87 153 5 879876703 +90 133 5 891384147 +23 144 3 874785926 +54 100 5 880931595 +43 168 4 875981159 +121 156 4 891388145 +60 7 5 883326241 +59 198 5 888204389 +42 64 5 881106711 +44 102 2 878348499 +44 7 5 878341246 +123 23 4 879873020 +125 117 3 879454699 +109 81 2 880580030 +28 185 5 881957002 +116 181 4 876452523 +49 171 4 888066551 +189 178 5 893265191 +70 142 3 884150884 +198 69 4 884207560 +22 181 5 878887765 +102 95 4 883748488 +152 173 5 882474378 
+189 165 5 893265535 +18 189 5 880129816 +189 151 5 893264378 +193 153 4 889125629 +178 157 5 882827400 +190 148 4 891033742 +181 109 1 878962955 +60 9 5 883326399 +151 199 3 879524563 +192 25 4 881367618 +16 125 3 877726944 +24 79 4 875322796 +14 191 4 890881557 +11 29 3 891904805 +102 88 3 892991311 +89 49 4 879460347 +145 66 4 875272786 +10 69 4 877889131 +131 9 5 883681723 +6 79 3 883600747 +95 22 4 888953953 +152 153 4 880149924 +87 47 3 879876637 +1 199 4 875072262 +77 132 3 884753028 +181 24 1 878962866 +130 38 4 876252263 +12 174 5 879958969 +104 15 5 888465413 +74 13 4 888333542 +94 72 3 891723220 +43 8 4 875975717 +101 125 4 877137015 +1 57 5 878542459 +1 50 5 874965954 +145 181 5 875270507 +11 123 3 891902745 +65 69 3 879216479 +11 42 3 891905058 +22 17 4 878886682 +70 143 5 884149431 +176 129 3 886048391 +198 122 1 884206807 +52 107 4 882922540 +51 184 3 883498685 +145 164 4 875271948 +13 157 3 882140552 +41 58 3 890687353 +97 23 5 884239553 +82 71 4 878770169 +151 133 5 879524797 +70 88 4 884067394 +64 185 4 889739517 +194 157 4 879547184 +178 64 5 882826242 +69 182 4 882145400 +198 164 3 884208571 +60 136 4 883326057 +198 108 3 884206270 +188 144 3 875071520 +60 70 4 883326838 +38 78 5 892433062 +96 83 3 884403758 +87 72 3 879876848 +151 82 3 879524819 +76 192 5 875027442 +178 172 4 882826555 +24 12 5 875323711 +6 173 5 883602462 +148 168 5 877015900 +162 79 4 877636713 +136 127 5 882693404 +142 7 4 888640489 +81 186 5 876534783 +59 191 4 888204841 +174 70 5 886453169 +102 200 3 888803051 +110 161 5 886988631 +97 100 2 884238778 +23 194 4 874786016 +121 135 5 891388090 +49 77 1 888068289 +187 23 4 879465631 +8 177 4 879362233 +89 181 4 879441491 +56 144 5 892910796 +11 25 3 891903836 +169 134 5 891359250 +76 6 5 875028165 +23 96 4 874785551 +77 1 5 884732808 +48 56 3 879434723 +42 38 3 881109148 +109 161 3 880572756 +141 100 4 884584688 +194 29 2 879528342 +51 136 4 883498756 +125 8 4 879454419 +151 1 5 879524151 +64 31 4 889739318 +138 137 5 
879023131 +1 192 4 875072547 +26 148 3 891377540 +1 178 5 878543541 +90 59 5 891383173 +169 50 5 891359250 +43 69 4 875981421 +123 64 3 879872791 +92 193 4 875654222 +1 5 3 889751712 +106 28 4 881451144 +83 56 1 886534501 +95 58 3 879197834 +13 87 5 882398814 +92 9 4 875640148 +130 2 4 876252327 +144 98 4 888105587 +189 175 5 893265506 +103 181 4 880415875 +175 195 3 877107790 +10 60 3 877892110 +145 111 3 875270322 +1 87 5 878543541 +23 162 3 874786950 +24 92 5 875323241 +92 95 3 875653664 +85 65 3 879455021 +82 175 4 878769598 +198 131 3 884208952 +49 181 1 888067765 +32 111 3 883717986 +184 170 5 889913687 +1 156 4 874965556 +186 38 5 879023723 +85 87 4 879829327 +117 7 3 880125780 +62 144 3 879374785 +8 96 3 879362183 +58 181 3 884304447 +85 174 4 879454139 +43 186 3 875981335 +187 8 5 879465273 +104 124 2 888465226 +94 121 2 891721815 +7 141 5 891353444 +178 180 3 882826395 +52 100 4 882922204 +178 79 4 882826306 +95 197 4 888954243 +64 153 3 889739243 +24 98 5 875323401 +177 60 4 880130634 +99 120 2 885679472 +4 11 4 892004520 +198 64 4 884207206 +62 127 4 879372216 +64 98 4 889737654 +16 180 5 877726790 +152 121 5 880149166 +97 133 1 884239655 +102 73 3 892992297 +10 170 4 877889333 +59 179 5 888204996 +13 163 3 882141582 +22 144 5 878887680 +49 85 3 888068934 +78 93 4 879633766 +92 49 3 875907416 +159 25 5 880557112 +151 33 5 879543181 +69 197 5 882145548 +5 145 1 875720830 +110 184 1 886988631 +125 144 5 879454197 +13 164 3 882396790 +59 197 5 888205462 +94 194 4 885870284 +64 141 4 889739517 +92 1 4 875810511 +51 144 5 883498894 +72 50 2 880037119 +158 29 3 880134607 +56 89 4 892676314 +157 120 1 886891243 +92 55 3 875654245 +65 87 5 879217689 +178 168 4 882826347 +95 196 4 879198354 +13 199 5 882140001 +13 67 1 882141686 +182 178 5 876435434 +24 71 5 875323833 +22 2 2 878887925 +51 172 5 883498936 +12 98 5 879959068 +175 11 5 877107339 +148 135 5 877016514 +62 13 4 879372634 +1 106 4 875241390 +153 172 1 881371140 +8 187 4 879362123 +128 66 3 879969329 
+17 126 4 885272724 +13 82 2 882397503 +82 133 4 878769410 +1 167 2 878542383 +16 158 4 877727280 +200 95 5 884128979 +6 153 4 883603013 +175 50 5 877107138 +119 137 5 886176486 +1 115 5 878541637 +7 68 4 891351547 +124 195 4 890399864 +198 27 2 884208595 +165 169 5 879525832 +59 96 5 888205659 +145 62 2 885557699 +187 134 3 879465079 +6 89 4 883600842 +64 127 5 879366214 +145 31 5 875271896 +21 50 3 874951131 +13 160 4 882140070 +6 7 2 883599102 +22 161 4 878887925 +102 13 3 892991118 +1 11 2 875072262 +90 57 5 891385389 +175 186 4 877107790 +158 161 2 880134477 +71 177 2 885016961 +141 15 5 884584981 +95 133 3 888954341 +37 172 4 880930072 +6 192 4 883600914 +172 124 4 875537151 +49 182 3 888069416 +58 169 4 884304936 +73 187 5 888625934 +117 156 4 881011376 +195 60 3 888737240 +198 186 5 884207733 +97 83 1 884238817 +92 186 4 875653960 +59 187 5 888204349 +195 134 5 875771441 +176 7 5 886048188 +181 116 1 878962550 +178 1 4 882823805 +158 121 4 880132701 +1 35 1 878542420 +76 129 3 878101114 +176 150 4 886047879 +59 190 5 888205033 +152 143 5 882474378 +97 169 5 884238887 +91 22 5 891439208 +1 137 5 875071541 +178 156 2 882826395 +128 88 4 879969390 +188 153 5 875075062 +145 135 5 885557731 +92 58 4 875653836 +198 4 3 884209536 +53 151 4 879443011 +128 161 5 879968896 +94 32 5 891721851 +17 7 4 885272487 +116 50 3 876452443 +25 183 4 885852008 +62 81 4 879375323 +13 50 5 882140001 +43 121 4 883955907 +136 15 4 882693723 +56 22 5 892676376 +62 71 4 879374661 +157 118 2 886890439 +94 42 4 885870577 +53 25 4 879442538 +153 79 5 881371198 +121 197 4 891388286 +15 111 4 879455914 +7 4 5 891351772 +194 89 3 879521328 +13 37 1 882397011 +167 99 4 892738385 +192 125 3 881367849 +196 153 5 881251820 +94 4 4 891721168 +18 111 3 880131631 +95 97 4 879198652 +30 161 4 875060883 +198 127 5 884204919 +184 7 3 889907738 +55 50 4 878176005 +70 101 3 884150753 +25 98 5 885853415 +93 121 3 888705053 +148 8 4 877020297 +82 197 4 878769847 +76 70 4 875027981 +152 155 5 884018390 
+174 125 5 886514069 +92 47 4 875654732 +189 61 3 893265826 +101 151 3 877136628 +130 185 5 875217033 +151 178 5 879524586 +83 31 5 880307751 +106 191 5 881451453 +92 179 5 875653077 +54 117 5 880935384 +73 82 2 888625754 +158 188 4 880134332 +188 180 5 875073329 +193 161 3 889125912 +114 168 3 881259927 +6 127 5 883599134 +98 152 3 880498968 +96 144 4 884403250 +44 197 4 878347420 +85 173 3 879454045 +1 127 5 874965706 +92 8 5 875654159 +10 185 5 877888876 +119 181 4 874775406 +125 70 3 892838287 +59 32 4 888205228 +58 153 5 884304896 +157 111 3 886889876 +159 67 1 884026964 +13 70 3 882140691 +130 179 4 875217265 +89 1 5 879461219 +25 23 4 885852529 +151 151 5 879524760 +82 79 3 878769334 +174 50 4 886433166 +177 129 3 880130653 +8 181 4 879362183 +16 172 5 877724726 +145 176 5 875271838 +125 72 4 892838322 +189 4 5 893265741 +138 45 5 879024232 +124 7 4 890287645 +7 145 1 891354530 +104 50 5 888465972 +95 177 3 879196408 +144 54 2 888105473 +51 134 2 883498844 +90 8 5 891383424 +60 89 5 883326463 +197 82 5 891409893 +188 76 4 875073048 +43 131 3 883954997 +6 1 4 883599478 +50 15 2 877052438 +5 169 5 878844495 +16 71 5 877721071 +91 56 1 891439057 +1 16 5 878543541 +83 38 5 887665422 +63 1 3 875747368 +42 151 4 881110578 +122 190 4 879270424 +200 71 4 884129409 +176 93 5 886047963 +97 186 3 884239574 +158 10 4 880132513 +87 13 3 879876734 +65 1 3 879217290 +56 87 4 892678508 +194 133 3 879523575 +1 79 4 875072865 +23 62 3 874786880 +189 10 5 893264335 +128 73 3 879969032 +194 143 3 879524643 +14 81 5 890881384 +70 15 3 884148728 +53 50 4 879442978 +119 9 4 890627252 +38 144 5 892430369 +44 144 4 878347532 +130 156 3 875801447 +69 124 4 882072869 +92 68 3 875653699 +197 177 5 891409935 +95 139 4 880572250 +56 62 5 892910890 +64 89 3 889737376 +124 117 3 890287181 +60 96 4 883326122 +13 58 4 882139966 +144 116 4 888104258 +183 177 5 892323452 +128 174 3 879966954 +161 69 4 891171657 +18 175 4 880130431 +2 100 5 888552084 +85 52 3 881705026 +87 158 3 879877173 +94 
181 4 885872942 +109 191 4 880577844 +77 154 5 884733922 +177 7 4 880130881 +42 50 5 881107178 +1 45 5 875241687 +5 121 4 875635189 +16 15 5 877722001 +115 48 5 881171203 +199 9 5 883782853 +183 62 2 891479217 +85 135 5 879453845 +130 89 4 875216458 +117 96 5 881012530 +1 48 5 875072520 +49 93 5 888068912 +114 98 4 881259495 +87 179 4 879875649 +49 8 3 888067691 +134 1 5 891732756 +128 179 3 879967767 +162 147 4 877636147 +198 50 5 884204919 +43 58 3 883955859 +178 66 4 882826868 +59 133 3 888204349 +196 13 2 881251955 +188 151 3 875073909 +17 100 4 885272520 +109 54 3 880578286 +7 186 4 891350900 +7 25 3 891352451 +156 86 4 888185854 +130 42 4 875801422 +67 125 4 875379643 +1 25 4 875071805 +64 186 4 889737691 +60 131 4 883327441 +192 7 4 881367791 +90 65 4 891385298 +62 69 4 879374015 +161 15 2 891172284 +16 152 4 877728417 +199 14 4 883783005 +132 50 3 891278774 +125 111 3 892838322 +131 137 1 883681466 +193 122 1 889127698 +119 89 4 874781352 +90 97 5 891383987 +60 23 4 883326652 +1 195 5 876892855 +18 81 3 880130890 +62 82 4 879375414 +151 170 5 879524669 +194 72 3 879554100 +174 155 4 886513767 +29 12 5 882821989 +89 88 4 879459980 +182 48 3 876436556 +13 197 4 881515239 +119 70 3 874781829 +194 173 5 879521088 +96 183 4 884403123 +77 96 3 884752562 +53 156 4 879442561 +151 194 4 879524443 +20 143 3 879669040 +109 168 3 880577734 +69 117 4 882072748 +72 191 5 880036515 +125 21 3 892838424 +55 121 3 878176084 +49 161 1 888069513 +144 127 4 888105823 +197 4 3 891409981 +144 147 3 888104402 +200 94 4 884130046 +31 153 4 881548110 +189 181 3 893264023 +7 180 5 891350782 +160 61 4 876861799 +158 175 4 880135044 +197 182 3 891409935 +1 153 3 876893230 +5 189 5 878844495 +82 169 4 878769442 +14 173 4 879119579 +85 50 5 882813248 +22 89 5 878887680 +78 25 3 879633785 +144 14 4 888104122 +45 7 3 881008080 +183 94 3 891466863 +16 87 4 877720916 +125 49 3 879455241 +17 111 3 885272674 +50 124 1 877052400 +151 168 5 879528495 +103 118 3 880420002 +1 101 2 878542845 +122 
83 5 879270327 +80 86 5 887401496 +184 134 5 889909618 +70 63 3 884151168 +94 69 3 885870057 +10 59 4 877886722 +110 12 4 886987826 +87 27 4 879876037 +45 151 2 881013885 +197 184 1 891409981 +104 127 3 888465201 +2 127 5 888552084 +11 54 3 891905936 +23 153 4 874786438 +196 173 2 881251820 +24 132 3 875323274 +160 150 4 876767440 +6 132 5 883602422 +102 117 3 888801232 +148 190 2 877398586 +23 171 5 874785809 +1 168 5 874965478 +59 15 5 888203449 +99 116 2 888469419 +95 51 4 879198353 +128 131 5 879967452 +123 182 4 879872671 +117 11 5 881011824 +168 1 5 884287509 +145 55 3 875272009 +118 172 5 875384751 +16 100 5 877720437 +43 122 2 884029709 +130 188 4 876251895 +92 50 5 875640148 +7 9 5 891351432 +130 1 5 874953595 +184 56 3 889908657 +51 50 5 883498685 +151 4 5 879524922 +63 79 3 875748245 +162 7 3 877635869 +200 79 5 884128499 +92 156 4 875656086 +41 56 4 890687472 +95 99 4 888954699 +151 56 4 879524879 +119 125 5 874775262 +42 97 3 881107502 +178 164 3 882827288 +188 199 4 875071658 +13 111 5 882140588 +43 124 4 891294050 +60 134 4 883326215 +90 198 5 891383204 +158 174 5 880134332 +189 133 5 893265773 +1 123 4 875071541 +193 79 4 889125755 +10 192 4 877891966 +58 111 4 884304638 +57 194 4 883698272 +68 9 4 876974073 +119 121 4 874775311 +82 100 5 876311299 +64 111 4 889739975 +92 38 3 875657640 +144 9 5 888104191 +178 31 4 882827083 +22 4 5 878886571 +194 179 4 879521329 +87 188 4 879875818 +43 169 5 875981128 +42 86 3 881107880 +77 173 5 884752689 +145 155 2 875272871 +44 109 3 878346431 +18 143 4 880131474 +151 125 4 879542939 +82 87 3 878769598 +59 151 5 888203053 +130 196 5 875801695 +130 95 5 875216867 +25 173 4 885852969 +1 191 5 875072956 +181 122 2 878963276 +8 11 3 879362233 +59 89 5 888204965 +15 125 5 879456049 +125 153 2 879454419 +59 69 5 888205087 +22 127 5 878887869 +44 133 4 878347569 +119 82 2 874781352 +151 44 4 879542413 +113 127 4 875935610 +192 100 5 881367706 +87 94 4 879876703 +69 172 5 882145548 +65 64 5 879216529 +186 159 5 
879023723 +102 55 3 888801465 +177 181 4 880130931 +1 4 3 876893119 +42 71 4 881108229 +194 78 1 879535549 +77 28 5 884753061 +79 1 4 891271870 +45 111 4 881011550 +177 96 3 880130898 +187 83 5 879465274 +43 70 4 883955048 +193 121 3 889125913 +57 117 4 883697512 +95 195 5 879196231 +18 22 5 880130640 +51 181 5 883498655 +56 50 5 892737154 +165 174 4 879525961 +65 25 4 879217406 +13 69 4 884538766 +142 147 1 888640356 +18 72 3 880132252 +22 62 4 878887925 +64 132 4 889737851 +44 194 5 878347504 +188 194 3 875073329 +145 106 4 875270655 +110 54 4 886988202 +18 59 4 880132501 +6 170 4 883602574 +188 174 5 875072741 +108 7 5 879879812 +174 132 2 886439516 +82 125 3 877452380 +187 186 4 879465308 +18 52 5 880130680 +109 11 4 880572786 +85 64 5 879454046 +165 127 4 879525706 +144 174 5 888105612 +181 108 1 878963343 +178 51 4 882828021 +7 70 1 891352557 +165 187 3 879526046 +200 9 4 884126833 +82 112 1 877452357 +180 191 4 877372188 +62 171 4 879373659 +56 114 4 892683248 +63 3 2 875748068 +90 171 2 891384476 +168 118 4 884288009 +177 183 4 880130972 +23 195 4 874786993 +148 191 1 877020715 +137 118 5 881433179 +175 88 4 877108146 +13 152 5 882141393 +32 7 4 883717766 +172 23 3 875537717 +144 12 4 888105419 +198 31 3 884207897 +193 24 2 889125880 +125 90 5 892838623 +13 62 5 882397833 +194 97 3 879524291 +75 79 5 884051893 +25 1 5 885853415 +13 53 1 882396955 +64 87 4 889737851 +76 156 3 882606108 +6 9 4 883599205 +56 161 4 892910890 +194 177 3 879523104 +106 162 5 881450758 +91 69 5 891439057 +158 56 5 880134296 +16 199 5 877719645 +11 176 3 891905783 +94 177 5 885870284 +183 144 3 891479783 +123 185 4 879873120 +65 111 4 879217375 +44 50 5 878341246 +7 71 5 891352692 +94 172 4 885870175 +18 48 4 880130515 +109 25 4 880571741 +87 49 5 879876564 +23 56 4 874785233 +198 181 4 884205050 +64 79 4 889737943 +13 38 3 882397974 +144 126 4 888104150 +82 50 5 876311146 +1 55 5 875072688 +11 121 3 891902745 +138 1 4 879023031 +154 200 5 879138832 +72 188 4 880037203 +59 182 5 
888204877 +7 96 5 891351383 +94 67 3 891723296 +62 190 5 879374686 +123 14 5 879872540 +152 88 5 884035964 +1 42 5 876892425 +1 139 3 878543216 +194 79 3 879521088 +185 178 4 883524364 +144 56 4 888105387 +183 121 3 891463809 +41 96 4 890687019 +92 44 3 875906989 +145 118 3 875270764 +144 7 2 888104087 +58 137 5 884304430 +161 70 3 891171064 +162 179 3 877636794 +95 8 5 879198262 +120 127 4 889489772 +150 150 3 878746824 +97 69 5 884239616 +13 158 1 882142057 +84 4 3 883453713 +58 121 2 892242300 +178 194 4 882826306 +174 11 5 886439516 +119 28 5 874782022 +11 83 5 891904335 +99 172 5 885679952 +14 195 5 890881336 +90 156 4 891384147 +190 100 4 891033653 +19 8 5 885412723 +49 54 2 888068265 +119 194 5 874781257 +13 172 5 882140355 +159 130 1 880557322 +11 28 5 891904241 +92 111 3 875641135 +49 4 2 888069512 +94 94 2 891723883 +7 179 5 891352303 +102 11 3 888801232 +44 157 4 878347711 +109 63 3 880582679 +80 79 4 887401407 +117 132 4 881012110 +49 62 2 888069660 +64 28 4 889737851 +82 170 4 878769703 +165 156 3 879525894 +15 121 3 879456168 +59 42 5 888204841 +184 29 3 889910326 +68 127 4 876973969 +185 47 4 883524249 +42 132 5 881107502 +190 7 4 891033653 +18 137 5 880132437 +200 172 5 884128554 +9 50 5 886960055 +59 30 5 888205787 +56 167 3 892911494 +117 184 3 881012601 +119 7 5 874775185 +94 195 3 885870231 +62 134 4 879373768 +180 53 5 877442125 +28 5 3 881961600 +178 161 5 882827645 +122 57 2 879270644 +158 107 3 880132960 +187 173 5 879465307 +89 86 5 879459859 +43 100 4 875975656 +194 168 5 879521254 +28 96 5 881957250 +84 79 4 883453520 +110 63 3 886989363 +62 172 5 879373794 +11 52 3 891904335 +1 7 4 875071561 +118 188 5 875384669 +97 172 4 884238939 +96 200 5 884403215 +116 145 2 876452980 +29 180 4 882821989 +23 59 4 874785526 +5 66 1 875721019 +194 186 5 879521088 +13 145 2 882397011 +59 60 5 888204965 +87 66 5 879876403 +115 79 4 881171273 +37 7 4 880915528 +185 50 4 883525998 +94 91 5 891722006 +41 152 4 890687326 +181 106 2 878963167 +177 127 5 
880130667 +92 199 3 875811628 +178 12 5 882826162 +128 69 4 879966867 +160 11 4 876858091 +186 12 1 879023460 +13 127 5 881515411 +187 137 5 879464895 +64 174 5 889737478 +164 9 4 889402050 +41 180 5 890687019 +161 197 3 891171734 +7 39 5 891353614 +121 25 5 891390316 +7 125 4 891353192 +150 124 2 878746442 +92 39 3 875656419 +64 48 5 879365619 +43 127 4 875981304 +13 78 1 882399218 +148 174 5 877015066 +189 137 4 893264407 +28 31 4 881956082 +177 174 4 880130990 +18 88 3 880130890 +174 99 3 886515457 +7 132 5 891351287 +156 157 4 888185906 +71 52 4 877319567 +89 15 5 879441307 +11 8 4 891904949 +200 176 5 884129627 +186 55 4 879023556 +77 121 2 884733261 +109 144 4 880572560 +127 50 4 884364866 +198 71 3 884208419 +73 197 5 888625934 +43 102 4 875981483 +46 93 4 883616218 +79 10 5 891271901 +13 184 1 882397011 +84 70 5 883452906 +194 125 2 879548026 +13 174 4 882139829 +189 157 4 893265865 +125 168 5 879454793 +90 192 4 891384959 +18 100 5 880130065 +189 8 5 893265710 +114 153 3 881309622 +119 124 4 874781994 +92 118 2 875640512 +7 31 4 892134959 +150 123 4 878746852 +119 64 4 874781460 +1 149 2 878542791 +70 91 3 884068138 +72 58 4 880036638 +76 182 4 882606392 +162 122 2 877636300 +121 50 5 891390014 +193 182 4 890860290 +23 154 3 874785552 +138 14 3 879022730 +25 121 4 885853030 +13 49 4 882399419 +97 173 3 884238728 +42 111 1 881105931 +116 56 5 886310197 +1 43 4 878542869 +121 12 5 891390014 +41 194 3 890687242 +43 4 4 875981421 +77 156 4 884733621 +81 116 3 876533504 +62 167 2 879376727 +154 187 5 879139096 +169 199 4 891359353 +128 186 5 879966895 +58 1 5 884304483 +130 123 4 875216112 +184 160 3 889911459 +63 14 4 875747401 +59 193 4 888204465 +44 87 5 878347742 +160 129 4 876768828 +1 165 5 874965518 +87 121 5 879875893 +23 89 5 874785582 +187 52 4 879465683 +137 96 5 881433654 +151 174 5 879524088 +109 151 5 880571661 +1 116 3 878542960 +174 65 5 886514123 +50 100 2 877052400 +13 175 4 882139717 +94 51 3 891721026 +119 31 5 874781779 +13 165 3 881515295 
+85 141 3 879829042 +109 53 4 880583336 +1 198 5 878542717 +181 151 2 878962866 +152 33 5 882475924 +11 196 5 891904270 +145 98 5 875271896 +189 199 5 893265263 +83 79 5 887665423 +30 164 4 875060217 +25 133 3 885852381 +194 67 1 879549793 +62 22 4 879373820 +57 15 4 883697223 +57 50 5 883697105 +11 58 3 891904596 +87 174 5 879875736 +5 63 1 878844629 +23 116 5 874784466 +13 132 4 882140002 +38 35 5 892433801 +58 174 4 884305271 +5 181 5 875635757 +18 32 2 880132129 +144 100 5 888104063 +7 69 5 891351728 +69 79 4 882145524 +22 50 5 878887765 +85 42 3 879453876 +62 72 3 879375762 +70 79 4 884149453 +77 199 5 884733988 +102 4 2 888801522 +18 8 5 880130802 +160 157 5 876858346 +42 141 3 881109059 +85 186 3 879454273 +84 100 4 883452155 +194 167 2 879549900 +1 124 5 875071484 +94 47 5 891720498 +148 133 5 877019251 +42 181 5 881107291 +1 95 4 875072303 +25 134 4 885852008 +10 180 5 877889333 +12 88 5 879960826 +59 24 4 888203579 +122 86 5 879270458 +11 88 3 891905003 +72 1 4 880035614 +154 185 5 879139002 +130 96 5 875216786 +57 195 3 883698431 +106 100 3 881449487 +58 134 5 884304766 +159 125 5 880557192 +162 55 3 877636713 +83 127 4 887665549 +144 58 3 888105548 +122 127 5 879270424 +109 175 1 880577734 +95 62 4 879196354 +45 181 4 881010742 +95 49 3 879198604 +68 181 5 876973884 +75 117 4 884050164 +72 198 5 880037881 +1 58 4 878542960 +148 189 4 877019698 +161 194 1 891171503 +95 73 4 879198161 +5 163 5 879197864 +18 172 3 880130551 +158 22 5 880134333 +59 68 2 888205228 +60 133 4 883326893 +121 172 5 891388090 +13 187 5 882140205 +1 142 2 878543238 +13 143 1 882140205 +43 144 4 883955415 +10 70 4 877891747 +188 11 5 875071520 +8 55 5 879362286 +77 192 3 884752900 +178 147 4 886678902 +108 1 4 879879720 +71 168 5 885016641 +130 77 5 880396792 +160 55 4 876858091 +178 100 4 882823758 +142 42 4 888640489 +102 153 2 892991376 +14 186 4 879119497 +85 9 4 879456308 +7 52 4 891353801 +42 174 5 881106711 +71 153 4 885016495 +60 175 5 883326919 +44 172 4 878348521 +182 1 4 
885613092 +7 11 3 891352451 +181 130 1 878963241 +42 73 4 881108484 +97 193 4 884238997 +186 177 4 891719775 +7 197 4 891351082 +49 147 1 888069416 +192 9 5 881367527 +132 100 4 891278744 +18 174 4 880130613 +115 185 5 881171409 +115 192 5 881171137 +158 195 5 880134398 +189 179 5 893265478 +7 144 5 891351201 +110 29 3 886988374 +145 77 3 875272348 +95 110 2 880572323 +71 98 4 885016536 +25 79 4 885852757 +21 15 4 874951188 +177 144 5 880131011 +72 197 5 880037702 +90 69 1 891383424 +123 187 4 879809943 +144 72 4 888105338 +130 88 2 875217265 +9 7 4 886960030 +73 96 2 888626523 +189 28 4 893266298 +94 188 4 885870665 +94 159 3 891723081 +1 126 2 875071713 +1 83 3 875072370 +10 23 5 877886911 +11 173 5 891904920 +96 196 4 884403057 +160 59 4 876858346 +188 50 4 875072741 +43 73 4 883956099 +92 63 3 875907504 +180 40 4 877127296 +13 176 3 882140455 +23 181 4 874784337 +161 177 2 891171848 +198 89 5 884208623 +73 183 4 888626262 +142 91 5 888640404 +184 192 4 889908843 +42 168 3 881107773 +94 86 5 891720971 +44 22 4 878347942 +109 22 4 880572950 +59 81 4 888205336 +137 79 5 881433689 +21 127 5 874951188 +124 1 3 890287733 +92 69 5 875653198 +200 22 4 884128372 +87 134 4 879877740 +119 196 5 886177162 +99 98 5 885679596 +92 147 2 875640542 +178 133 4 885784518 +181 120 1 878963204 +114 135 4 881260611 +73 129 4 888625907 +28 196 4 881956081 +123 134 4 879872275 +82 118 3 878768510 +1 3 4 878542960 +106 9 4 883876572 +87 152 4 879876564 +5 200 2 875720717 +90 60 4 891385039 +83 151 3 880306745 +167 86 4 892738212 +167 137 5 892738081 +49 99 4 888067031 +41 173 4 890687549 +178 69 5 882826437 +59 116 4 888203018 +65 66 3 879217972 +128 117 5 879967631 +7 12 5 892135346 +168 181 4 884287298 +181 107 1 878963343 +66 9 4 883601265 +64 10 5 889739733 +18 15 4 880131054 +63 137 4 875747368 +174 87 5 886514089 +94 71 4 891721642 +174 167 3 886514953 +198 137 4 884205252 +55 174 4 878176397 +62 116 3 879372480 +87 194 5 879876403 +64 172 4 889739091 +125 66 5 879455184 +30 135 
5 885941156 +130 144 5 875216717 +104 121 2 888466002 +175 136 4 877108051 +197 50 5 891409839 +10 153 4 877886722 +13 60 4 884538767 +58 191 5 892791893 +5 105 3 875635443 +110 31 3 886989057 +57 168 3 883698362 +42 2 5 881109271 +144 198 4 888105287 +151 143 5 879524878 +89 25 5 879441637 +135 38 3 879858003 +109 56 5 880577804 +18 50 4 880130155 +189 14 5 893263994 +32 50 4 883717521 +177 98 5 880131026 +38 185 2 892432573 +20 22 5 879669339 +128 28 5 879966785 +24 7 4 875323676 +56 193 5 892678669 +151 124 5 879524491 +194 136 5 879521167 +130 17 5 875217096 +92 149 3 886443494 +16 135 4 877720916 +20 50 3 879667937 +1 19 5 875071515 +159 118 4 880557464 +62 76 4 879374045 +95 52 4 879198800 +18 142 4 880131173 +119 172 4 874782191 +81 79 5 876534817 +158 83 5 880134913 +49 200 3 888067358 +59 90 2 888206363 +58 56 5 884305369 +177 156 5 880130931 +59 73 4 888206254 +18 187 5 880130393 +102 2 2 888801522 +102 174 4 888801360 +125 95 5 879454628 +90 137 5 891384754 +125 85 3 892838424 +145 64 4 882181785 +13 28 5 882398814 +10 85 4 877892438 +63 10 4 875748004 +91 183 5 891438909 +145 9 2 875270394 +44 175 4 878347972 +16 127 5 877719206 +92 40 3 875656164 +49 174 1 888067691 +92 155 2 875654888 +44 173 5 878348725 +174 143 5 886515457 +1 29 1 878542869 +151 135 5 879524471 +21 9 5 874951188 +62 7 4 879372277 +92 25 3 875640072 +94 127 5 885870175 +156 9 4 888185735 +73 188 5 888625553 +25 125 5 885852817 +6 111 2 883599478 +198 128 3 884209451 +99 174 5 885679705 +65 77 5 879217689 +44 151 4 878341370 +7 50 5 891351042 +85 172 4 882813285 +77 98 4 884752901 +176 181 3 886047879 +25 7 4 885853155 +116 124 3 876453733 +175 111 4 877108015 +42 136 4 881107329 +6 182 4 883268776 +10 40 4 877892438 +195 135 5 875771440 +115 83 3 881172183 +76 24 2 882607536 +62 117 4 879372563 +167 184 1 892738278 +1 18 4 887432020 +196 110 1 881252305 +94 134 5 886008885 +138 147 4 879023779 +1 59 5 876892817 +193 159 4 889124191 +198 151 4 884206401 +1 15 5 875071608 +57 1 5 
883698581 +1 111 5 889751711 +1 52 4 875072205 +144 137 4 888104150 +125 67 5 892838865 +106 70 3 881452355 +145 96 5 882181728 +18 28 3 880129527 +189 170 4 893265380 +32 181 4 883717628 +18 56 5 880129454 +95 194 5 879197603 +198 96 4 884208326 +10 12 5 877886911 +30 69 5 885941156 +1 88 4 878542791 +182 15 4 885612967 +119 93 4 874775262 +109 28 3 880572721 +184 197 4 889908873 +70 1 4 884065277 +41 156 4 890687304 +92 169 5 875653121 +38 162 5 892431727 +6 8 4 883600657 +160 9 3 876767023 +18 83 5 880129877 +10 179 5 877889004 +186 77 5 879023694 +156 77 2 888185906 +120 118 2 889490979 +7 86 4 891350810 +145 11 5 875273120 +178 174 5 882826719 +114 200 3 881260409 +22 174 5 878887765 +177 42 4 880130972 +1 13 5 875071805 +16 33 2 877722001 +90 135 5 891384570 +12 69 5 879958902 +72 106 4 880036185 +44 190 5 878348000 +116 127 5 876454257 +12 127 4 879959488 +183 50 2 891467546 +114 172 5 881259495 +25 177 3 885852488 +162 28 4 877636746 +144 183 4 888105140 +60 141 3 883327472 +43 143 4 883955247 +159 15 5 880485972 +7 164 5 891351813 +174 98 5 886452583 +92 108 2 886443416 +189 185 5 893265428 +115 100 5 881171982 +121 11 2 891387992 +180 181 2 877125956 +44 181 4 878341290 +48 193 2 879434751 +151 173 5 879524130 +151 28 4 879524199 +190 24 3 891033773 +194 199 4 879521329 +102 1 3 883748352 +89 173 5 879459859 +148 173 5 877017054 +13 9 3 882140205 +158 70 4 880135118 +175 98 5 877107390 +59 143 1 888204641 +95 50 5 879197329 +45 24 3 881014550 +41 50 5 890687066 +109 50 5 880563331 +91 79 5 891439018 +85 162 2 879454235 +156 100 4 888185677 +65 194 4 879217881 +75 129 3 884049939 +1 28 4 875072173 +59 53 5 888206161 +117 164 5 881011727 +25 82 4 885852150 +178 173 5 882826306 +121 1 4 891388475 +125 82 5 879454386 +161 118 2 891172421 +110 67 3 886989566 +77 191 3 884752948 +195 109 3 878019342 +11 107 4 891903276 +106 82 3 881453290 +1 172 5 874965478 +13 135 5 882139541 +24 97 4 875323193 +18 133 5 880130713 +72 23 4 880036550 +23 176 3 874785843 +87 56 
4 879876524 +44 31 4 878348998 +198 81 5 884208326 +13 8 4 882140001 +83 50 3 880327590 +118 100 5 875384751 +60 15 4 883328033 +118 5 2 875385256 +82 134 4 878769442 +154 152 4 879138832 +118 179 5 875384612 +200 139 3 884130540 +177 187 4 880131040 +59 28 5 888204841 +67 117 5 875379794 +62 191 5 879373613 +77 134 4 884752562 +145 49 3 875272926 +72 81 3 880036876 +158 4 4 880134477 +186 147 4 891719774 +130 7 5 874953557 +192 111 2 881368222 +87 128 3 879876037 +63 181 3 875747556 +58 200 3 884305295 +190 9 1 891033725 +58 7 5 884304656 +13 116 5 882140455 +114 171 4 881309511 +7 173 5 891351002 +49 12 4 888068057 +1 122 3 875241498 +175 187 4 877107338 +148 164 4 877398444 +77 183 5 884732606 +13 141 2 890705034 +13 182 5 882139347 +53 15 5 879443027 +24 58 3 875323745 +20 82 4 879669697 +63 121 1 875748139 +93 118 3 888705416 +42 87 4 881107576 +41 191 4 890687473 +93 14 4 888705200 +144 59 4 888105197 +58 168 5 891611548 +85 196 4 879454952 +14 25 2 876965165 +85 161 4 882819528 +62 15 2 879372634 +122 197 5 879270482 +144 170 4 888105364 +104 9 2 888465201 +94 182 5 885873089 +128 180 5 879967174 +59 129 5 888202941 +115 117 4 881171009 +135 5 3 879857868 +142 134 5 888640356 +178 118 4 882824291 +106 59 4 881453318 +71 100 4 877319197 +27 123 5 891543191 +38 195 1 892429952 +30 29 3 875106638 +18 116 5 880131358 +154 172 4 879138783 +120 50 4 889489973 +52 191 5 882923031 +189 186 2 893266027 +64 197 3 889737506 +23 173 5 874787587 +159 9 3 880485766 +54 148 3 880937490 +90 23 5 891384997 +151 73 4 879528909 +76 96 5 875312034 +198 93 3 884205346 +103 56 5 880416602 +77 42 5 884752948 +130 117 5 874953895 +56 28 5 892678669 +94 151 5 891721716 +59 86 3 888205145 +25 86 4 885852248 +103 98 3 880420565 +11 11 2 891904271 +49 121 1 888068100 +44 97 2 878348000 +16 66 4 877719075 +1 152 5 878542589 +177 160 4 880131011 +41 135 4 890687473 +21 53 4 874951820 +158 7 5 880132744 +56 66 3 892911110 +184 95 4 889908801 +188 187 3 875072211 +85 181 4 882813312 +37 62 
5 880916070 +44 183 4 883613372 +65 9 5 879217138 +145 53 2 875272245 +24 151 5 875322848 +23 73 3 874787016 +62 151 5 879372651 +13 188 4 882140130 +87 180 4 879875649 +59 4 4 888205188 +10 93 4 877892160 +20 15 4 879667937 +21 1 5 874951244 +44 198 4 878348947 +18 127 5 880129668 +189 59 3 893265191 +71 6 3 880864124 +7 198 3 891351685 +188 176 4 875072876 +52 7 5 882922204 +57 144 3 883698408 +55 7 3 878176047 +70 172 5 884064217 +59 91 4 888205265 +49 2 1 888069606 +60 135 5 883327087 +7 152 4 891351851 +82 22 3 878769777 +13 166 5 884538663 +49 154 5 888068715 +158 125 3 880132745 +42 103 3 881106162 +14 19 5 880929651 +92 13 4 886443292 +141 181 4 884584709 +22 68 4 878887925 +83 88 5 880308186 +178 111 4 882823905 +145 59 1 882181695 +62 64 4 879373638 +70 94 3 884151014 +80 64 5 887401475 +192 118 2 881367932 +145 97 5 875272652 +37 121 2 880915528 +153 187 2 881371198 +145 121 2 875270507 +1 94 2 875072956 +16 39 5 877720118 +189 180 5 893265741 +65 178 5 879217689 +174 168 1 886434621 +90 196 4 891385250 +26 122 1 891380200 +150 14 4 878746889 +148 194 5 877015066 +151 190 4 879528673 +102 154 3 888803708 +31 192 4 881548054 +174 88 5 886513752 +89 107 5 879441780 +122 28 4 879270084 +160 127 5 876770168 +148 127 1 877399351 +57 121 4 883697432 +92 65 4 875653960 +10 9 4 877889005 +109 180 3 880581127 +64 184 4 889739243 +43 123 1 875975520 +25 176 4 885852862 +98 194 5 880498898 +10 198 3 877889005 +5 174 5 875636130 +102 7 2 888801407 +102 172 3 888801232 +130 93 5 874953665 130 121 5 876250746 1 \ No newline at end of file