Commit

Merge pull request #4 from RUCAIBox/0.2.x
get latest code from 0.2.x
linzihan-backforward authored Dec 12, 2020
2 parents ee37080 + 64ea090 commit 7518311
Showing 54 changed files with 2,476 additions and 555 deletions.
5 changes: 1 addition & 4 deletions .github/workflows/python-package.yml
@@ -40,7 +40,4 @@ jobs:
python -m pytest -v tests/config/test_config.py
export PYTHONPATH=.
python tests/config/test_command_line.py --use_gpu=False --valid_metric=Recall@10 --split_ratio=[0.7,0.2,0.1] --metrics=['Recall@10'] --epochs=200 --eval_setting='LO_RS' --learning_rate=0.3
- name: Test evaluation_setting
run: |
python -m pytest -v tests/evaluation_setting
34 changes: 14 additions & 20 deletions README.md
@@ -53,6 +53,12 @@ oriented to the GPU environment.
for testing and comparing recommendation algorithms.

## RecBole News
**12/06/2020**: We release RecBole [v0.1.2](https://github.com/RUCAIBox/RecBole/releases/tag/v0.1.2).

**11/29/2020**: We conducted preliminary experiments to test the time and memory cost on three
datasets of different sizes and provided the [test result](https://github.com/RUCAIBox/RecBole#time-and-memory-costs)
for reference.

**11/03/2020**: We release the first version of RecBole **v0.1.1**.


@@ -154,35 +160,23 @@ python run_recbole.py --model=[model_name]
```


## Time and memory cost of models
We test our models on three datasets of different size (small size, medium size and large size) to estimate their time and memory cost. You can
click links to check more information.<br>
(**NOTE:** Our test results only reflect the approximate time and memory cost of models. If you find any error in our result,
please let us know.)<br>
## Time and Memory Costs
We conducted preliminary experiments to test the time and memory cost on three datasets of different sizes (small, medium and large). Click the following links for detailed information.<br>

* [General recommendation models](time_test_result/General_recommendation.md)<br>
* [Sequential recommendation models]()<br>
* [Context-aware recommendation models]()<br>
* [Knowledge-based recommendation models]()<br>
* [General recommendation models](asset/time_test_result/General_recommendation.md)<br>
* [Sequential recommendation models](asset/time_test_result/Sequential_recommendation.md)<br>
* [Context-aware recommendation models](asset/time_test_result/Context-aware_recommendation.md)<br>
* [Knowledge-based recommendation models](asset/time_test_result/Knowledge-based_recommendation.md)<br>

Here is our testing device information:<br>
```
GPU: TITAN GTX
Driver Version: 430.64
CUDA Version: 10.1
Memory size: 65412748 KB
CPU: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz
The number of CPU cores: 8
Cache size: 11264KB
```
NOTE: Our test results only give the approximate time and memory cost of our implementations in the RecBole library (based on our machine/server). Any feedback or suggestions about the implementations and tests are welcome. We will keep improving our implementations and updating these test results.


## RecBole Major Releases
| Releases | Date | Features |
|-----------|--------|-------------------------|
| v0.1.2 | 12/06/2020 | Basic RecBole |
| v0.1.1 | 11/03/2020 | Basic RecBole |


## Contributing

Please let us know if you encounter a bug or have any suggestions by [filing an issue](https://github.com/RUCAIBox/RecBole/issues).
189 changes: 189 additions & 0 deletions asset/time_test_result/Context-aware_recommendation.md
@@ -0,0 +1,189 @@
## Time and memory cost of context-aware recommendation models

### Dataset information:

| Dataset | #Interaction | #Feature Field | #Feature |
| ------- | ------------: | --------------: | --------: |
| ml-1m | 1,000,209 | 5 | 134 |
| Criteo | 2,292,530 | 39 | 2,572,192 |
| Avazu | 4,218,938 | 21 | 1,326,631 |

### Device information

```
OS: Linux
Python Version: 3.8.3
PyTorch Version: 1.7.0
cudatoolkit Version: 10.1
GPU: TITAN RTX (24GB)
Machine Specs: 32 CPU machine, 64GB RAM
```

### 1) ml-1m dataset:

#### Time and memory cost on ml-1m dataset:

| Method | Training Time (sec/epoch) | Evaluation Time (sec/epoch) | GPU Memory (GB) |
| --------- | -----------------: | -----------------: | -----------: |
| LR | 18.34 | 2.18 | 0.82 |
| DIN | 20.37 | 2.26 | 1.16 |
| DSSM | 21.93 | 2.24 | 0.95 |
| FM | 19.33 | 2.34 | 0.83 |
| DeepFM | 20.42 | 2.27 | 0.91 |
| Wide&Deep | 26.13 | 2.95 | 0.89 |
| NFM | 23.36 | 2.26 | 0.89 |
| AFM | 20.08 | 2.26 | 0.92 |
| AutoInt | 22.41 | 2.34 | 0.94 |
| DCN | 28.33 | 2.97 | 0.93 |
| FNN(DNN) | 19.51 | 2.21 | 0.91 |
| PNN | 22.29 | 2.23 | 0.91 |
| FFM | 22.98 | 2.47 | 0.87 |
| FwFM | 23.38 | 2.50 | 0.85 |
| xDeepFM | 24.40 | 2.30 | 1.06 |

#### Config file of ml-1m dataset:

```
# dataset config
field_separator: "\t"
seq_separator: " "
USER_ID_FIELD: user_id
ITEM_ID_FIELD: item_id
LABEL_FIELD: label
threshold:
  rating: 4.0
drop_filter_field: True
load_col:
  inter: [user_id, item_id, rating]
  item: [item_id, release_year, genre]
  user: [user_id, age, gender, occupation]
# training and evaluation
epochs: 500
train_batch_size: 2048
eval_batch_size: 2048
eval_setting: RO_RS
group_by_user: False
valid_metric: AUC
metrics: ['AUC', 'LogLoss']
```

Other parameters (including model parameters) take their default values.
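
As a minimal usage sketch, assuming the config above is saved as `ml-1m.yaml` (a hypothetical filename) and RecBole is installed, one of the benchmarked models can be trained and evaluated through RecBole's quick-start entry point:

```python
# Minimal sketch: run one of the benchmarked models (e.g. FM) on ml-1m with
# the context-aware settings above; `ml-1m.yaml` is a hypothetical filename
# for the config listed in this section.
from recbole.quick_start import run_recbole

run_recbole(model='FM', dataset='ml-1m', config_file_list=['ml-1m.yaml'])
```

The command-line equivalent should be roughly `python run_recbole.py --model=FM --dataset=ml-1m --config_files=ml-1m.yaml`.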

### 2) Criteo dataset:

#### Time and memory cost on Criteo dataset:

| Method | Training Time (sec/epoch) | Evaluation Time (sec/epoch) | GPU Memory (GB) |
| --------- | -------------------------: | ---------------------------: | ---------------: |
| LR | 7.65 | 0.61 | 1.11 |
| DIN | - | - | - |
| DSSM | - | - | - |
| FM | 9.77 | 0.73 | 1.45 |
| DeepFM | 13.64 | 0.83 | 1.72 |
| Wide&Deep | 13.58 | 0.80 | 1.72 |
| NFM | 13.36 | 0.75 | 1.72 |
| AFM | 19.40 | 1.02 | 2.34 |
| AutoInt | 19.40 | 0.98 | 2.06 |
| DCN | 16.25 | 0.78 | 1.67 |
| FNN(DNN) | 10.03 | 0.64 | 1.63 |
| PNN | 12.92 | 0.72 | 1.85 |
| FFM | - | - | - |
| FwFM | 1175.24 | 8.90 | 2.12 |
| xDeepFM | 32.27 | 1.34 | 2.25 |

#### Config file of Criteo dataset:

```
# dataset config
field_separator: "\t"
seq_separator: " "
USER_ID_FIELD: ~
ITEM_ID_FIELD: ~
LABEL_FIELD: label
load_col:
  inter: '*'
highest_val:
  index: 2292530
fill_nan: True
normalize_all: True
min_item_inter_num: 0
min_user_inter_num: 0
drop_filter_field: True
# training and evaluation
epochs: 500
train_batch_size: 2048
eval_batch_size: 2048
eval_setting: RO_RS
group_by_user: False
valid_metric: AUC
metrics: ['AUC', 'LogLoss']
```

Other parameters (including model parameters) take their default values.
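
As a hedged sketch, assuming the config above is saved as `criteo.yaml` (a hypothetical filename) and the Criteo atomic files sit under the default data path, the filtering can be sanity-checked with RecBole's standard data-building entry points before timing a model:

```python
# Sketch: build the filtered Criteo dataset and print its statistics;
# `criteo.yaml` is a hypothetical filename for the config above.
from recbole.config import Config
from recbole.data import create_dataset

config = Config(model='FM', dataset='criteo', config_file_list=['criteo.yaml'])
dataset = create_dataset(config)
print(dataset)  # interaction count should roughly match the ~2.29M in the dataset table
```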

### 3) Avazu dataset:

#### Time and memory cost on Avazu dataset:

| Method | Training Time (sec/epoch) | Evaluation Time (sec/epoch) | GPU Memory (GB) |
| --------- | -------------------------: | ---------------------------: | ---------------: |
| LR | 9.30 | 0.76 | 1.42 |
| DIN | - | - | - |
| DSSM | - | - | - |
| FM | 25.68 | 0.94 | 2.60 |
| DeepFM | 28.41 | 1.19 | 2.66 |
| Wide&Deep | 27.58 | 0.97 | 2.66 |
| NFM | 30.46 | 1.06 | 2.66 |
| AFM | 31.03 | 1.06 | 2.69 |
| AutoInt | 38.11 | 1.41 | 2.84 |
| DCN | 30.78 | 0.96 | 2.64 |
| FNN(DNN) | 23.53 | 0.84 | 2.60 |
| PNN | 25.86 | 0.90 | 2.68 |
| FFM | - | - | - |
| FwFM | 336.75 | 7.49 | 2.63 |
| xDeepFM | 54.88 | 1.45 | 2.89 |

#### Config file of Avazu dataset:

```
# dataset config
field_separator: "\t"
seq_separator: " "
USER_ID_FIELD: ~
ITEM_ID_FIELD: ~
LABEL_FIELD: label
fill_nan: True
normalize_all: True
load_col:
  inter: '*'
lowest_val:
  timestamp: 14102931
drop_filter_field: False
# training and evaluation
epochs: 500
train_batch_size: 2048
eval_batch_size: 2048
eval_setting: RO_RS
group_by_user: False
valid_metric: AUC
metrics: ['AUC', 'LogLoss']
```

Other parameters (including model parameters) take their default values.
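
The report does not state how these per-epoch costs were collected; a rough sketch of one way to obtain comparable numbers with plain PyTorch utilities (an assumption, not necessarily the authors' procedure) is:

```python
# Assumption: per-epoch cost approximated by wall-clock time plus torch.cuda's
# peak-memory counter; this is not necessarily how the reported numbers were obtained.
import time
import torch

def measure_epoch(run_one_epoch):
    """run_one_epoch is a hypothetical callable that trains the model for one epoch."""
    torch.cuda.reset_peak_memory_stats()
    start = time.time()
    run_one_epoch()
    torch.cuda.synchronize()       # make sure all GPU work is finished before timing
    elapsed_sec = time.time() - start
    peak_gb = torch.cuda.max_memory_reserved() / 1024 ** 3
    return elapsed_sec, peak_gb
```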
