
[Question] Performance Issue on General Methods #753

Closed

johnny12150 opened this issue Mar 5, 2021 · 10 comments

Labels: question (Further information is requested)

johnny12150 commented Mar 5, 2021

I have tested some general methods such as Pop and ItemKNN on the Tmall dataset.
However, their top-k metrics seem a bit odd to me.
This is what I get with Pop and ItemKNN respectively.

Fri 05 Mar 2021 15:03:20 INFO valid result: 
recall@20 : 0.0413    mrr@20 : 0.1413    ndcg@20 : 0.0688    hit@20 : 0.3017    precision@20 : 0.0536    
Fri 05 Mar 2021 15:03:20 INFO Saving current best: saved/Pop-Mar-05-2021_15-00-52.pth
Fri 05 Mar 2021 15:03:20 INFO Loading model structure and parameters from saved/Pop-Mar-05-2021_15-00-52.pth
Fri 05 Mar 2021 15:04:09 INFO best valid result: {'recall@20': 0.0413, 'mrr@20': 0.1413, 'ndcg@20': 0.0688, 'hit@20': 0.3017, 'precision@20': 0.0536}
Fri 05 Mar 2021 15:04:09 INFO test result: {'recall@20': 0.032, 'mrr@20': 0.2363, 'ndcg@20': 0.1468, 'hit@20': 0.464, 'precision@20': 0.1399}
Fri 05 Mar 2021 15:10:54 INFO valid result: 
recall@20 : 0.2272    mrr@20 : 0.5338    ndcg@20 : 0.3037    hit@20 : 0.8744    precision@20 : 0.2156    
Fri 05 Mar 2021 15:10:54 INFO Saving current best: saved/ItemKNN-Mar-05-2021_15-07-31.pth
Fri 05 Mar 2021 15:10:54 INFO Loading model structure and parameters from saved/ItemKNN-Mar-05-2021_15-07-31.pth
Fri 05 Mar 2021 15:12:43 INFO best valid result: {'recall@20': 0.2272, 'mrr@20': 0.5338, 'ndcg@20': 0.3037, 'hit@20': 0.8744, 'precision@20': 0.2156}
Fri 05 Mar 2021 15:12:43 INFO test result: {'recall@20': 0.1657, 'mrr@20': 0.7576, 'ndcg@20': 0.5654, 'hit@20': 0.9826, 'precision@20': 0.5355}

I picked a paper that also uses this dataset; its reported results are in the picture below.
[image: results table from the paper]

The MRR and NDCG shouldn't be that high (about 10x higher than what most papers report).

By the way, this is my config in YAML.

USER_ID_FIELD: user_id
load_col:
  inter: [user_id, item_id, timestamp]
epochs: 30
topk: [20]
valid_metric: MRR@20
split_ratio: [0.7,0.1,0.2]
training_neg_sample_num: 100
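
For reference, a config like this is usually handed to RecBole's quick-start entry point; a minimal sketch (the file name tmall.yaml and the dataset name 'tmall' are my assumptions):

# Minimal runner for the config above -- a sketch, assuming the YAML is saved
# as tmall.yaml and the Tmall atomic files are registered as dataset 'tmall'.
from recbole.quick_start import run_recbole

# config_file_list merges the YAML over RecBole's defaults; model and dataset
# can be given here instead of in the YAML.
run_recbole(model='Pop', dataset='tmall', config_file_list=['tmall.yaml'])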
tsotfsk (Contributor) commented Mar 5, 2021

Hi @johnny12150, can you tell me which version of RecBole you are using?

Actually, we have fixed this bug twice, and the results of the rule-based models (Pop, ItemKNN) are affected, while neural network models are almost unaffected. You can find details in issues #699 and #622.

Let me give you an example to illustrate the changes. Suppose there are 5 items in our dataset and we evaluate the model by Recall@3. One user has just one ground-truth item, and the model's output is [0, 0, 0, 0, 0].

- Before PR #658, all items tie at rank 1, so we get Recall@3 = 1.
- After PR #658 and before PR #731, all items tie at rank 5, so we get Recall@3 = 0.
- After PR #731, Recall@3 is 0 or 1, because we randomly choose three of the five items as the recommendation.

The results you listed look like they came from the earliest version.
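
To make the tie handling concrete, here is a small illustrative sketch (plain Python, not RecBole's implementation):

import random

# Illustration of the tie-handling change (plain Python, not RecBole code):
# 5 items, all scored 0.0 by the model, one ground-truth item, Recall@3.
scores = [0.0] * 5
ground_truth = 2   # index of the single relevant item
k = 3

# Before PR #658: tied items all share the best rank (rank 1), so the
# ground truth always counts as recommended -> Recall@3 = 1.

# After PR #658 and before PR #731: tied items all share the worst rank
# (rank 5), so nothing makes the top-3 -> Recall@3 = 0.

# After PR #731: ties are broken at random, so the ground truth is picked
# with probability k/5 -> Recall@3 is 0 or 1 for each user.
random_topk = random.sample(range(len(scores)), k)
recall_at_3 = 1.0 if ground_truth in random_topk else 0.0
print(recall_at_3)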

johnny12150 (Author) commented Mar 6, 2021

I have tested it with the newest version released on pip and it seems the same.
I also ran print(recbole.__version__) to check that the version is correct.
The previous results were produced with the previous version.

Sat 06 Mar 2021 18:21:15 INFO best valid result: {'recall@5': 0.1116, 'recall@10': 0.1658, 'recall@20': 0.23, 'recall@50': 0.3315, 'mrr@5': 0.5435, 'mrr@10': 0.5568, 'mrr@20': 0.5624, 'mrr@50': 0.5647, 'ndcg@5': 0.3558, 'ndcg@10': 0.3043, 'ndcg@20': 0.2921, 'ndcg@50': 0.3172, 'hit@5': 0.7152, 'hit@10': 0.8138, 'hit@20': 0.8929, 'hit@50': 0.9603, 'precision@5': 0.3305, 'precision@10': 0.257, 'precision@20': 0.1854, 'precision@50': 0.111}
Sat 06 Mar 2021 18:21:15 INFO test result: {'recall@5': 0.0891, 'recall@10': 0.1412, 'recall@20': 0.2051, 'recall@50': 0.3078, 'mrr@5': 0.6739, 'mrr@10': 0.6837, 'mrr@20': 0.6872, 'mrr@50': 0.6883, 'ndcg@5': 0.488, 'ndcg@10': 0.4229, 'ndcg@20': 0.3571, 'ndcg@50': 0.354, 'hit@5': 0.8396, 'hit@10': 0.9114, 'hit@20': 0.9601, 'hit@50': 0.9916, 'precision@5': 0.4635, 'precision@10': 0.3811, 'precision@20': 0.2889, 'precision@50': 0.182}

The MRR is even higher with ItemKNN.
I assume the parameter training_neg_sample_num doesn't affect the candidate set the recommender can recommend from, right?

tsotfsk (Contributor) commented Mar 6, 2021

Thanks for the information. The version you used is between #658 and #731, so the results should not be that high. Did you download your dataset from our library RecDatasets? If so, which type are you using (click data or buy data)? If not, could you provide me with a copy? My e-mail is [email protected]

johnny12150 (Author) commented Mar 6, 2021

Yes, I have tried both click and buy, without removing duplicates.
I am going to try diginetica now, since this problem happened in the last version too.
If it's the same, then I will give yoochoose a shot.

tsotfsk (Contributor) commented Mar 6, 2021

OK. This dataset is so big that it took me an hour to test it, and my Pop result is very different from yours.

06 Mar 22:39    INFO best valid result: {'recall@20': 0.0191, 'mrr@20': 0.0175, 'ndcg@20': 0.0155, 'hit@20': 0.0255, 'precision@20': 0.0013}
06 Mar 22:39    INFO test result: {'recall@20': 0.028, 'mrr@20': 0.0349, 'ndcg@20': 0.0256, 'hit@20': 0.0477, 'precision@20': 0.0024}

This result was produced with the latest version. Maybe your settings are inconsistent with the paper's, or you are using a sample of the dataset. Please check, and I will test ItemKNN and let you know the result as soon as possible.

johnny12150 (Author) commented Mar 6, 2021

I reinstalled the package and tested the datasets again; all three datasets now match expectations with Pop.
However, ItemKNN still seems a little higher than expected, around 0.25 for MRR@20 on Tmall.

07 Mar 01:00    INFO best valid result: {'recall@5': 0.1857, 'recall@10': 0.2688, 'recall@20': 0.3655, 'recall@50': 0.5049, 'mrr@5': 0.17, 'mrr@10': 0.1838, 'mrr@20': 0.1914, 'mrr@50': 0.196, 'ndcg@5': 0.1457, 'ndcg@10': 0.1741, 'ndcg@20': 0.2021, 'ndcg@50': 0.2346, 'hit@5': 0.2849, 'hit@10': 0.3883, 'hit@20': 0.498, 'hit@50': 0.6389, 'precision@5': 0.068, 'precision@10': 0.0494, 'precision@20': 0.0339, 'precision@50': 0.019}
07 Mar 01:00    INFO test result: {'recall@5': 0.1964, 'recall@10': 0.282, 'recall@20': 0.3789, 'recall@50': 0.5126, 'mrr@5': 0.2236, 'mrr@10': 0.2381, 'mrr@20': 0.2455, 'mrr@50': 0.2495, 'ndcg@5': 0.1736, 'ndcg@10': 0.2005, 'ndcg@20': 0.2298, 'ndcg@50': 0.264, 'hit@5': 0.3569, 'hit@10': 0.4657, 'hit@20': 0.5722, 'hit@50': 0.6945, 'precision@5': 0.0958, 'precision@10': 0.0703, 'precision@20': 0.0481, 'precision@50': 0.0268}

2017pxy added the question (Further information is requested) label Mar 7, 2021
johnny12150 (Author) commented Mar 7, 2021

I tested ItemKNN with tmall-click and it has been stuck before training for more than 10 hours.

07 Mar 02:54    INFO Build [ModelType.TRADITIONAL] DataLoader for [evaluation] with format [InputType.POINTWISE]
07 Mar 02:54    INFO Evaluation Setting:
        Group by user_id
        Ordering: {'strategy': 'shuffle'}
        Splitting: {'strategy': 'by_ratio', 'ratios': [0.8, 0.1, 0.1]}
        Negative Sampling: {'strategy': 'full', 'distribution': 'uniform'}
07 Mar 02:54    INFO batch_size = [[100, 100]], shuffle = [False]

07 Mar 02:54    WARNING Batch size is changed to 2200292.
07 Mar 02:54    WARNING Batch size is changed to 2200292.

Is there any config setting I missed in the YAML file?
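
For what it's worth, the repeated warning looks like full-ranking evaluation at work rather than a missing setting. A sketch of the arithmetic I suspect is behind it (the rounding rule and the item count are my assumptions, not RecBole's actual source):

# Hypothetical reconstruction of the batch-size warning -- an assumption about
# the dataloader's rounding, not RecBole's actual code. Under full negative
# sampling ('strategy': 'full'), every user is scored against every item, so a
# batch must hold a whole number of users' full item lists.
eval_batch_size = 100      # from the YAML config
n_items = 2_200_292        # assumed item count of tmall-click (matches the log)

batch_num = max(eval_batch_size // n_items, 1)  # at least one full user per batch
adjusted_batch_size = batch_num * n_items
print(adjusted_batch_size)  # 2200292, the value in the WARNING lines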

EliverQ (Collaborator) commented Mar 7, 2021

> I tested ItemKNN with tmall-click and it has been stuck before training for more than 10 hours. […] Is there any config setting I missed in the YAML file?

Hi @johnny12150, could you please provide your complete YAML file? I'll test it and let you know the result as soon as possible.

johnny12150 (Author) commented Mar 7, 2021

@EliverQ This is the one I currently use.

USER_ID_FIELD: user_id
load_col:
  inter: [user_id, item_id, timestamp]
epochs: 30
train_batch_size: 100
eval_batch_size: 100  # batch size for validation and test
topk: [10, 20]
valid_metric: MRR@20
stopping_step: 5  # early-stopping patience (validation steps without improvement)
split_ratio: [0.8,0.1,0.1]

johnny12150 (Author) commented Mar 7, 2021

> I reinstalled the package and tested the datasets again; all three datasets now match expectations with Pop. However, ItemKNN still seems a little higher than expected, around 0.25 for MRR@20 on Tmall. […]

I used the code provided with a survey paper and the result is much lower.

https://github.com/rn5l/session-rec/blob/master/algorithms/knn/iknn.py
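
For context, both implementations reduce to item-to-item similarity scoring. A generic sketch of that shared idea (not the exact code of RecBole's ItemKNN or session-rec's iknn.py):

import numpy as np
from scipy.sparse import csr_matrix

# Generic item-based KNN scoring: items are compared by cosine similarity over
# the user-item matrix, and unseen items are scored by the summed similarity
# to the user's history. A sketch of the common idea only.
def item_knn_topk(interactions: csr_matrix, user: int, k: int = 20) -> np.ndarray:
    # column-normalize so that A^T A yields cosine similarity between items
    norms = np.sqrt(np.asarray(interactions.power(2).sum(axis=0))).ravel()
    norms[norms == 0] = 1.0
    normalized = interactions.multiply(1.0 / norms).tocsr()
    sim = (normalized.T @ normalized).toarray()   # item x item similarities
    np.fill_diagonal(sim, 0.0)                    # drop self-similarity
    history = interactions[user].indices          # items this user interacted with
    scores = sim[history].sum(axis=0)             # aggregate over the history
    scores[history] = -np.inf                     # never re-recommend seen items
    return np.argsort(scores)[::-1][:k]           # top-k item ids

# Tiny usage example on a toy 3-user x 4-item matrix:
toy = csr_matrix(np.array([[1, 1, 0, 0],
                           [1, 0, 1, 0],
                           [0, 1, 0, 1]], dtype=float))
print(item_knn_topk(toy, user=0, k=2))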
