Evaluation metrics are 0 when running SRGNN on the douban and yoochoose datasets #82

starletbb · 2024-03-20T13:44:49Z

Hello, here is the run output for the problem described above:
General Hyper Parameters:
gpu_id = 0
use_gpu = True
seed = 2020
state = INFO
reproducibility = True
data_path = dataset/douban
checkpoint_dir = saved
show_progress = True
save_dataset = False
dataset_save_path = None
save_dataloaders = False
dataloaders_save_path = None
log_wandb = False

Training Hyper Parameters:
epochs = 500
train_batch_size = 4096
learner = adam
learning_rate = 0.001
train_neg_sample_args = {'distribution': 'none', 'sample_num': 'none', 'alpha': 'none', 'dynamic': False, 'candidate_num': 0}
eval_step = 1
stopping_step = 10
clip_grad_norm = None
weight_decay = 0.0
loss_decimal_place = 4

Evaluation Hyper Parameters:
eval_args = {'split': {'LS': 'valid_and_test'}, 'order': 'TO', 'group_by': 'user', 'mode': {'valid': 'full', 'test': 'full'}}
repeatable = True
metrics = ['Recall', 'MRR', 'NDCG', 'Hit', 'Precision']
topk = [10]
valid_metric = MRR@10
valid_metric_bigger = True
eval_batch_size = 4096
metric_decimal_place = 4

Dataset Hyper Parameters:
field_separator =
seq_separator =
USER_ID_FIELD = user_id
ITEM_ID_FIELD = item_id
RATING_FIELD = rating
TIME_FIELD = timestamp
seq_len = None
LABEL_FIELD = label
threshold = None
NEG_PREFIX = neg_
load_col = {'inter': ['user_id', 'item_id', 'rating', 'timestamp', 'likes_num']}
unload_col = None
unused_col = None
additional_feat_suffix = None
rm_dup_inter = None
val_interval = None
filter_inter_by_user_or_item = True
user_inter_num_interval = [0,inf)
item_inter_num_interval = [0,inf)
alias_of_user_id = None
alias_of_item_id = None
alias_of_entity_id = None
alias_of_relation_id = None
preload_weight = None
normalize_field = None
normalize_all = None
ITEM_LIST_LENGTH_FIELD = item_length
LIST_SUFFIX = _list
MAX_ITEM_LIST_LENGTH = 50
POSITION_FIELD = position_id
HEAD_ENTITY_ID_FIELD = head_id
TAIL_ENTITY_ID_FIELD = tail_id
RELATION_ID_FIELD = relation_id
ENTITY_ID_FIELD = entity_id
benchmark_filename = None

Other Hyper Parameters:
worker = 0
wandb_project = recbole
shuffle = True
require_pow = False
enable_amp = False
enable_scaler = False
transform = None
embedding_size = 64
step = 1
loss_type = CE
numerical_features = []
discretization = None
kg_reverse_r = False
entity_kg_num_interval = [0,inf)
relation_kg_num_interval = [0,inf)
MODEL_TYPE = ModelType.SEQUENTIAL
gnn_transform = sess_graph
training_neg_sample_num = 0
eval_setting = TO_LS,full
MODEL_INPUT_TYPE = InputType.POINTWISE
eval_type = EvaluatorType.RANKING
single_spec = True
local_rank = 0
device = cuda
valid_neg_sample_args = {'distribution': 'uniform', 'sample_num': 'none'}
test_neg_sample_args = {'distribution': 'uniform', 'sample_num': 'none'}

C:\Users\HP\anaconda3\envs\pytorch-gpu\lib\site-packages\recbole\data\dataset\dataset.py:648: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.

For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.

feat[field].fillna(value=0, inplace=True)
C:\Users\HP\anaconda3\envs\pytorch-gpu\lib\site-packages\recbole\data\dataset\dataset.py:650: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.

For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.

feat[field].fillna(value=feat[field].mean(), inplace=True)
20 Mar 16:34 INFO douban
The number of users: 738701
Average actions of users: 2.8767510491403816
The number of items: 29
Average actions of items: 75894.85714285714
The number of inters: 2125056
The sparsity of the dataset: 90.08018222481785%
Remain Fields: ['user_id', 'item_id', 'rating', 'timestamp', 'likes_num']
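The figures in the dataset summary above can be cross-checked by hand. The sketch below reproduces them, assuming (this is an assumption, although the `Embedding(29, 64, padding_idx=0)` line in the model printout is consistent with it) that the printed user and item counts each include one reserved padding id, while the sparsity is computed from the counts as printed.

```python
# Counts copied from the dataset summary in the log.
num_users = 738701    # as printed; assumed to include one padding id
num_items = 29        # as printed; assumed to include one padding id
num_inters = 2125056

# Averages over the real (non-padding) users and items.
avg_actions_per_user = num_inters / (num_users - 1)
avg_actions_per_item = num_inters / (num_items - 1)

# Sparsity over the full (users x items) grid, padding ids included.
sparsity = 1 - num_inters / (num_users * num_items)

print(avg_actions_per_user)
print(avg_actions_per_item)
print(sparsity * 100)
```

If that assumption holds, the data as loaded contains only 28 distinct `item_id` values, each with roughly 75,895 interactions on average, so it may be worth confirming that the `item_id` column of the `.inter` file holds the values that were intended.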
20 Mar 16:36 INFO Constructing session graphs.
100%|██████████| 1038965/1038965 [01:56<00:00, 8952.76it/s]
20 Mar 16:38 INFO Constructing session graphs.
100%|██████████| 145071/145071 [00:15<00:00, 9590.91it/s]
20 Mar 16:38 INFO Constructing session graphs.
100%|██████████| 202320/202320 [00:23<00:00, 8537.58it/s]
20 Mar 16:38 INFO SessionGraph Transform in DataLoader.
20 Mar 16:38 INFO SessionGraph Transform in DataLoader.
20 Mar 16:38 INFO SessionGraph Transform in DataLoader.
20 Mar 16:38 INFO [Training]: train_batch_size = [4096] negative sampling: [{'distribution': 'none', 'sample_num': 'none', 'alpha': 'none', 'dynamic': False, 'candidate_num': 0}]
20 Mar 16:38 INFO [Evaluation]: eval_batch_size = [4096] eval_args: [{'split': {'LS': 'valid_and_test'}, 'order': 'TO', 'group_by': 'user', 'mode': {'valid': 'full', 'test': 'full'}}]
20 Mar 16:38 INFO SRGNN(
(item_embedding): Embedding(29, 64, padding_idx=0)
(gnncell): SRGNNCell(
(incomming_conv): SRGNNConv(
(lin): Linear(in_features=64, out_features=64, bias=True)
)
(outcomming_conv): SRGNNConv(
(lin): Linear(in_features=64, out_features=64, bias=True)
)
(lin_ih): Linear(in_features=128, out_features=192, bias=True)
(lin_hh): Linear(in_features=64, out_features=192, bias=True)
)
(linear_one): Linear(in_features=64, out_features=64, bias=True)
(linear_two): Linear(in_features=64, out_features=64, bias=True)
(linear_three): Linear(in_features=64, out_features=1, bias=False)
(linear_transform): Linear(in_features=128, out_features=64, bias=True)
(loss_fct): CrossEntropyLoss()
)
Trainable parameters: 64064
Train 0: 100%|████████████████████████| 254/254 [03:50<00:00, 1.10it/s, GPU RAM: 0.44 G/2.00 G]
20 Mar 16:42 INFO epoch 0 training [time: 230.78s, train loss: 651.7647]
Evaluate : 100%|██████████████████████████| 36/36 [00:14<00:00, 2.54it/s, GPU RAM: 0.44 G/2.00 G]
C:\Users\HP\anaconda3\envs\pytorch-gpu\lib\site-packages\recbole\evaluator\base_metric.py:78: RuntimeWarning: Mean of empty slice.
avg_result = value.mean(axis=0)
C:\Users\HP\anaconda3\envs\pytorch-gpu\lib\site-packages\numpy\core\_methods.py:184: RuntimeWarning: invalid value encountered in divide
ret = um.true_divide(
20 Mar 16:42 INFO epoch 0 evaluating [time: 14.26s, valid_score: nan]
20 Mar 16:42 INFO valid result:
recall@10 : nan mrr@10 : nan ndcg@10 : nan hit@10 : nan precision@10 : nan
Train 1: 100%|████████████████████████| 254/254 [02:55<00:00, 1.45it/s, GPU RAM: 0.44 G/2.00 G]
20 Mar 16:45 INFO epoch 1 training [time: 175.20s, train loss: 571.7500]
Evaluate : 100%|██████████████████████████| 36/36 [00:09<00:00, 3.79it/s, GPU RAM: 0.44 G/2.00 G]
20 Mar 16:46 INFO epoch 1 evaluating [time: 9.54s, valid_score: nan]
20 Mar 16:46 INFO valid result:
recall@10 : nan mrr@10 : nan ndcg@10 : nan hit@10 : nan precision@10 : nan
Train 2: 100%|████████████████████████| 254/254 [02:54<00:00, 1.46it/s, GPU RAM: 0.44 G/2.00 G]
20 Mar 16:48 INFO epoch 2 training [time: 174.05s, train loss: 562.1718]
Evaluate : 100%|██████████████████████████| 36/36 [00:08<00:00, 4.31it/s, GPU RAM: 0.44 G/2.00 G]
20 Mar 16:49 INFO epoch 2 evaluating [time: 8.39s, valid_score: nan]
20 Mar 16:49 INFO valid result:
recall@10 : nan mrr@10 : nan ndcg@10 : nan hit@10 : nan precision@10 : nan
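The two NumPy warnings in the log ("Mean of empty slice" and "invalid value encountered in divide") are what NumPy emits when `mean(axis=0)` is called on an array with zero rows, and the result in that case is `nan`. The sketch below reproduces this in isolation. The `(0, 10)` shape is a hypothetical stand-in for a per-user metric matrix with no evaluated users and `topk = 10`; why the real array would be empty is not something the log shows.

```python
import warnings

import numpy as np

# Hypothetical stand-in: metric values for zero evaluated users, topk = 10.
value = np.zeros((0, 10))

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Same call as the one reported at base_metric.py:78 in the log.
    avg_result = value.mean(axis=0)

print(avg_result.shape)
print(np.isnan(avg_result).all())
for w in caught:
    print(w.category.__name__, w.message)
```

If this is what happens during evaluation, the `nan` scores would mean that no validation samples reached the metric computation, rather than that the model ranked every item incorrectly.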

How to reproduce
The YAML file is as follows:

model config

embedding_size: 64
step: 1
loss_type: 'CE'
gnn_transform: sess_graph

dataset config

field_separator: "\t" # separator between fields in the dataset files
seq_separator: " " # separator inside token_seq or float_seq fields
USER_ID_FIELD: user_id # user id field
ITEM_ID_FIELD: item_id # item id field
RATING_FIELD: rating # rating field
TIME_FIELD: timestamp # timestamp field
NEG_PREFIX: neg_ # prefix for negative-sampling fields
LABEL_FIELD: label # label field
ITEM_LIST_LENGTH_FIELD: item_length # sequence length field
LIST_SUFFIX: _list # suffix for sequence fields
MAX_ITEM_LIST_LENGTH: 50 # maximum sequence length
POSITION_FIELD: position_id # field for the generated sequence position ids
# Specifies which columns to read from which file; here the four columns user_id, item_id, rating and timestamp are read from ml-1m.inter, and so on for the rest
load_col:
  inter: [user_id, item_id, rating, timestamp, likes_num]

training settings

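epochs: 500 # maximum number of training epochs
train_batch_size: 4096 # training batch size
learner: adam # built-in PyTorch optimizer to use
learning_rate: 0.001 # learning rate
training_neg_sample_num: 0 # number of negative samples
eval_step: 1 # how often (in training epochs) to run evaluation
stopping_step: 10 # early-stopping patience: stop training if the chosen metric does not improve within this many evaluations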

evaluation settings

eval_setting: TO_LS,full # sort the data by time, split with leave-one-out, and use full ranking
metrics: ["Recall", "MRR", "NDCG", "Hit", "Precision"] # evaluation metrics
valid_metric: MRR@10 # metric used as the early-stopping criterion
eval_batch_size: 4096 # evaluation batch size
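Since `load_col` and the `*_FIELD` settings above decide which column is treated as the user and which as the item, one quick check is to read the `.inter` file directly and count the distinct values in each column. The sketch below is a minimal, hypothetical helper (`summarize_inter` is not part of RecBole). It assumes the atomic-file layout of a tab-separated file whose header cells look like `name:type`. The sample rows and the `likes_num:float` type are invented for illustration and are not real douban data.

```python
import csv
import os
import tempfile


def summarize_inter(path, sep="\t"):
    """Return the raw header and the number of distinct values per column."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter=sep)
        header = next(reader)                      # e.g. "user_id:token"
        names = [h.split(":")[0] for h in header]  # drop the ":type" suffix
        distinct = {name: set() for name in names}
        for row in reader:
            for name, value in zip(names, row):
                distinct[name].add(value)
    return header, {name: len(vals) for name, vals in distinct.items()}


# Made-up three-row sample for illustration only.
sample = (
    "user_id:token\titem_id:token\trating:float\ttimestamp:float\tlikes_num:float\n"
    "u1\ti1\t5\t100\t0\n"
    "u1\ti2\t4\t101\t2\n"
    "u2\ti1\t3\t102\t1\n"
)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.inter")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)
    header, counts = summarize_inter(path)

print(header)
print(counts)
```

Pointing the same helper at the real file (presumably under `dataset/douban`, given the `data_path` in the log) would show whether the `user_id` and `item_id` columns contain the number of distinct values that is expected for this dataset.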

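Environment (please complete the following information):

Operating system: Windows
RecBole version: 0.2.0
Python version: 3.9
PyTorch version: 2.1.1
I am not sure where to start in resolving this. I would be very grateful if the author could help. Many thanks!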

starletbb added the bug label Mar 20, 2024
