Commit

Merge pull request #4 from RUCAIBox/0.2.x
get latest code from 0.2.x
linzihan-backforward authored Dec 12, 2020
2 parents ee37080 + 64ea090 commit 7518311
Showing 54 changed files with 2,476 additions and 555 deletions.
5 changes: 1 addition & 4 deletions .github/workflows/python-package.yml
@@ -40,7 +40,4 @@ jobs:
python -m pytest -v tests/config/test_config.py
export PYTHONPATH=.
python tests/config/test_command_line.py --use_gpu=False --valid_metric=Recall@10 --split_ratio=[0.7,0.2,0.1] --metrics=['Recall@10'] --epochs=200 --eval_setting='LO_RS' --learning_rate=0.3
- name: Test evaluation_setting
run: |
python -m pytest -v tests/evaluation_setting
34 changes: 14 additions & 20 deletions README.md
@@ -53,6 +53,12 @@ oriented to the GPU environment.
for testing and comparing recommendation algorithms.

## RecBole News
**12/06/2020**: We release RecBole [v0.1.2](https://github.com/RUCAIBox/RecBole/releases/tag/v0.1.2).

**11/29/2020**: We conducted preliminary experiments to test the time and memory cost on three
datasets of different sizes and provided the [test result](https://github.com/RUCAIBox/RecBole#time-and-memory-costs)
for reference.

**11/03/2020**: We release the first version of RecBole **v0.1.1**.


@@ -154,35 +160,23 @@ python run_recbole.py --model=[model_name]
```


## Time and memory cost of models
We test our models on three datasets of different size (small size, medium size and large size) to estimate their time and memory cost. You can
click links to check more information.<br>
(**NOTE:** Our test results only reflect the approximate time and memory cost of models. If you find any error in our result,
please let us know.)<br>
## Time and Memory Costs
We conducted preliminary experiments to test the time and memory cost on three datasets of different sizes (small, medium and large). Click the following links for detailed information.<br>

* [General recommendation models](time_test_result/General_recommendation.md)<br>
* [Sequential recommendation models]()<br>
* [Context-aware recommendation models]()<br>
* [Knowledge-based recommendation models]()<br>
* [General recommendation models](asset/time_test_result/General_recommendation.md)<br>
* [Sequential recommendation models](asset/time_test_result/Sequential_recommendation.md)<br>
* [Context-aware recommendation models](asset/time_test_result/Context-aware_recommendation.md)<br>
* [Knowledge-based recommendation models](asset/time_test_result/Knowledge-based_recommendation.md)<br>

Here is our testing device information:<br>
```
GPU: TITAN GTX
Driver Version: 430.64
CUDA Version: 10.1
Memory size: 65412748 KB
CPU: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz
The number of CPU cores: 8
Cache size: 11264KB
```
NOTE: Our test results only give the approximate time and memory cost of our implementations in the RecBole library (based on our machine/server). Any feedback or suggestions about the implementations and tests are welcome. We will keep improving our implementations and updating these test results.


## RecBole Major Releases
| Releases | Date | Features |
|-----------|--------|-------------------------|
| v0.1.2 | 12/06/2020 | Basic RecBole |
| v0.1.1 | 11/03/2020 | Basic RecBole |


## Contributing

Please let us know if you encounter a bug or have any suggestions by [filing an issue](https://github.com/RUCAIBox/RecBole/issues).
189 changes: 189 additions & 0 deletions asset/time_test_result/Context-aware_recommendation.md
@@ -0,0 +1,189 @@
## Time and memory cost of context-aware recommendation models

### Dataset information:

| Dataset | #Interaction | #Feature Field | #Feature |
| ------- | ------------: | --------------: | --------: |
| ml-1m | 1,000,209 | 5 | 134 |
| Criteo | 2,292,530 | 39 | 2,572,192 |
| Avazu | 4,218,938 | 21 | 1,326,631 |

### Device information

```
OS: Linux
Python Version: 3.8.3
PyTorch Version: 1.7.0
cudatoolkit Version: 10.1
GPU: TITAN RTX (24GB)
Machine Specs: 32 CPU machine, 64GB RAM
```

### 1) ml-1m dataset:

#### Time and memory cost on ml-1m dataset:

| Method | Training Time (sec/epoch) | Evaluation Time (sec/epoch) | GPU Memory (GB) |
| --------- | -----------------: | -----------------: | -----------: |
| LR | 18.34 | 2.18 | 0.82 |
| DIN | 20.37 | 2.26 | 1.16 |
| DSSM | 21.93 | 2.24 | 0.95 |
| FM | 19.33 | 2.34 | 0.83 |
| DeepFM | 20.42 | 2.27 | 0.91 |
| Wide&Deep | 26.13 | 2.95 | 0.89 |
| NFM | 23.36 | 2.26 | 0.89 |
| AFM | 20.08 | 2.26 | 0.92 |
| AutoInt | 22.41 | 2.34 | 0.94 |
| DCN | 28.33 | 2.97 | 0.93 |
| FNN(DNN) | 19.51 | 2.21 | 0.91 |
| PNN | 22.29 | 2.23 | 0.91 |
| FFM | 22.98 | 2.47 | 0.87 |
| FwFM | 23.38 | 2.50 | 0.85 |
| xDeepFM | 24.40 | 2.30 | 1.06 |

#### Config file of ml-1m dataset:

```
# dataset config
field_separator: "\t"
seq_separator: " "
USER_ID_FIELD: user_id
ITEM_ID_FIELD: item_id
LABEL_FIELD: label
threshold:
  rating: 4.0
drop_filter_field: True
load_col:
  inter: [user_id, item_id, rating]
  item: [item_id, release_year, genre]
  user: [user_id, age, gender, occupation]
# training and evaluation
epochs: 500
train_batch_size: 2048
eval_batch_size: 2048
eval_setting: RO_RS
group_by_user: False
valid_metric: AUC
metrics: ['AUC', 'LogLoss']
```

Other parameters (including model parameters) take their default values.
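
As a minimal usage sketch, assuming the config above is saved as `ml-1m.yaml` (a hypothetical filename) and RecBole is installed, one of the benchmarked models can be trained and evaluated through RecBole's quick-start entry point:

```python
# Minimal sketch: run one of the benchmarked models (e.g. FM) on ml-1m with
# the context-aware settings above; `ml-1m.yaml` is a hypothetical filename
# for the config listed in this section.
from recbole.quick_start import run_recbole

run_recbole(model='FM', dataset='ml-1m', config_file_list=['ml-1m.yaml'])
```

The command-line equivalent should be roughly `python run_recbole.py --model=FM --dataset=ml-1m --config_files=ml-1m.yaml`.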

### 2) Criteo dataset:

#### Time and memory cost on Criteo dataset:

| Method | Training Time (sec/epoch) | Evaluation Time (sec/epoch) | GPU Memory (GB) |
| --------- | -------------------------: | ---------------------------: | ---------------: |
| LR | 7.65 | 0.61 | 1.11 |
| DIN | - | - | - |
| DSSM | - | - | - |
| FM | 9.77 | 0.73 | 1.45 |
| DeepFM | 13.64 | 0.83 | 1.72 |
| Wide&Deep | 13.58 | 0.80 | 1.72 |
| NFM | 13.36 | 0.75 | 1.72 |
| AFM | 19.40 | 1.02 | 2.34 |
| AutoInt | 19.40 | 0.98 | 2.06 |
| DCN | 16.25 | 0.78 | 1.67 |
| FNN(DNN) | 10.03 | 0.64 | 1.63 |
| PNN | 12.92 | 0.72 | 1.85 |
| FFM | - | - | - |
| FwFM | 1175.24 | 8.90 | 2.12 |
| xDeepFM | 32.27 | 1.34 | 2.25 |

#### Config file of Criteo dataset:

```
# dataset config
field_separator: "\t"
seq_separator: " "
USER_ID_FIELD: ~
ITEM_ID_FIELD: ~
LABEL_FIELD: label
load_col:
  inter: '*'
highest_val:
  index: 2292530
fill_nan: True
normalize_all: True
min_item_inter_num: 0
min_user_inter_num: 0
drop_filter_field: True
# training and evaluation
epochs: 500
train_batch_size: 2048
eval_batch_size: 2048
eval_setting: RO_RS
group_by_user: False
valid_metric: AUC
metrics: ['AUC', 'LogLoss']
```

Other parameters (including model parameters) take their default values.
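
As a hedged sketch, assuming the config above is saved as `criteo.yaml` (a hypothetical filename) and the Criteo atomic files sit under the default data path, the filtering can be sanity-checked with RecBole's standard data-building entry points before timing a model:

```python
# Sketch: build the filtered Criteo dataset and print its statistics;
# `criteo.yaml` is a hypothetical filename for the config above.
from recbole.config import Config
from recbole.data import create_dataset

config = Config(model='FM', dataset='criteo', config_file_list=['criteo.yaml'])
dataset = create_dataset(config)
print(dataset)  # interaction count should roughly match the ~2.29M in the dataset table
```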

### 3) Avazu dataset:

#### Time and memory cost on Avazu dataset:

| Method | Training Time (sec/epoch) | Evaluation Time (sec/epoch) | GPU Memory (GB) |
| --------- | -------------------------: | ---------------------------: | ---------------: |
| LR | 9.30 | 0.76 | 1.42 |
| DIN | - | - | - |
| DSSM | - | - | - |
| FM | 25.68 | 0.94 | 2.60 |
| DeepFM | 28.41 | 1.19 | 2.66 |
| Wide&Deep | 27.58 | 0.97 | 2.66 |
| NFM | 30.46 | 1.06 | 2.66 |
| AFM | 31.03 | 1.06 | 2.69 |
| AutoInt | 38.11 | 1.41 | 2.84 |
| DCN | 30.78 | 0.96 | 2.64 |
| FNN(DNN) | 23.53 | 0.84 | 2.60 |
| PNN | 25.86 | 0.90 | 2.68 |
| FFM | - | - | - |
| FwFM | 336.75 | 7.49 | 2.63 |
| xDeepFM | 54.88 | 1.45 | 2.89 |

#### Config file of Avazu dataset:

```
# dataset config
field_separator: "\t"
seq_separator: " "
USER_ID_FIELD: ~
ITEM_ID_FIELD: ~
LABEL_FIELD: label
fill_nan: True
normalize_all: True
load_col:
  inter: '*'
lowest_val:
  timestamp: 14102931
drop_filter_field: False
# training and evaluation
epochs: 500
train_batch_size: 2048
eval_batch_size: 2048
eval_setting: RO_RS
group_by_user: False
valid_metric: AUC
metrics: ['AUC', 'LogLoss']
```

Other parameters (including model parameters) take their default values.
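
The report does not state how these per-epoch costs were collected; a rough sketch of one way to obtain comparable numbers with plain PyTorch utilities (an assumption, not necessarily the authors' procedure) is:

```python
# Assumption: per-epoch cost approximated by wall-clock time plus torch.cuda's
# peak-memory counter; this is not necessarily how the reported numbers were obtained.
import time
import torch

def measure_epoch(run_one_epoch):
    """run_one_epoch is a hypothetical callable that trains the model for one epoch."""
    torch.cuda.reset_peak_memory_stats()
    start = time.time()
    run_one_epoch()
    torch.cuda.synchronize()       # make sure all GPU work is finished before timing
    elapsed_sec = time.time() - start
    peak_gb = torch.cuda.max_memory_reserved() / 1024 ** 3
    return elapsed_sec, peak_gb
```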
