KeyError when running hyperparameter search #62

Open
ithok opened this issue Mar 24, 2023 · 6 comments

ithok commented Mar 24, 2023

ERROR:hyperopt.fmin:job exception: 'model'

0%| | 0/12 [1:01:04<?, ?trial/s, best loss=?]
Traceback (most recent call last):
  File "run_hyper.py", line 26, in <module>
    main()
  File "run_hyper.py", line 18, in main
    hp.run()
  File "/opt/conda/envs/siton_env/lib/python3.8/site-packages/recbole/trainer/hyper_tuning.py", line 411, in run
    fmin(
  File "/opt/conda/envs/siton_env/lib/python3.8/site-packages/hyperopt/fmin.py", line 553, in fmin
    rval.exhaust()
  File "/opt/conda/envs/siton_env/lib/python3.8/site-packages/hyperopt/fmin.py", line 356, in exhaust
    self.run(self.max_evals - n_done, block_until_done=self.asynchronous)
  File "/opt/conda/envs/siton_env/lib/python3.8/site-packages/hyperopt/fmin.py", line 292, in run
    self.serial_evaluate()
  File "/opt/conda/envs/siton_env/lib/python3.8/site-packages/hyperopt/fmin.py", line 170, in serial_evaluate
    result = self.domain.evaluate(spec, ctrl)
  File "/opt/conda/envs/siton_env/lib/python3.8/site-packages/hyperopt/base.py", line 907, in evaluate
    rval = self.fn(pyll_rval)
  File "/opt/conda/envs/siton_env/lib/python3.8/site-packages/recbole/trainer/hyper_tuning.py", line 349, in trial
    result_dict["model"],
KeyError: 'model'

As the title says; the error output is above. The model I am running is HMLET, and the command is:

python run_hyper.py --model='HMLET' --dataset='ml-1m' --config_files='ml-1m.yaml' --params_file=Hmlet.hyper

Member

hyp1231 commented Mar 25, 2023

Which version of recbole are you using? I tested with recbole 1.0.1 and 1.1.1 and could not reproduce the problem you describe.

Author

ithok commented Mar 25, 2023

Name Version Build Channel

_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
absl-py 1.4.0 pypi_0 pypi
ca-certificates 2023.01.10 h06a4308_0
cachetools 5.3.0 pypi_0 pypi
certifi 2022.12.7 py38h06a4308_0
charset-normalizer 2.1.1 pypi_0 pypi
cloudpickle 2.2.1 pypi_0 pypi
cmake 3.25.0 pypi_0 pypi
colorama 0.4.4 pypi_0 pypi
colorlog 4.7.2 pypi_0 pypi
filelock 3.9.0 pypi_0 pypi
future 0.18.3 pypi_0 pypi
google-auth 2.16.2 pypi_0 pypi
google-auth-oauthlib 0.4.6 pypi_0 pypi
grpcio 1.51.3 pypi_0 pypi
hyperopt 0.2.5 pypi_0 pypi
idna 3.4 pypi_0 pypi
importlib-metadata 6.1.0 pypi_0 pypi
jinja2 3.1.2 pypi_0 pypi
joblib 1.2.0 pypi_0 pypi
ld_impl_linux-64 2.38 h1181459_1
libffi 3.4.2 h6a678d5_6
libgcc-ng 11.2.0 h1234567_1
libgomp 11.2.0 h1234567_1
libstdcxx-ng 11.2.0 h1234567_1
lit 15.0.7 pypi_0 pypi
markdown 3.4.1 pypi_0 pypi
markupsafe 2.1.2 pypi_0 pypi
mpmath 1.2.1 pypi_0 pypi
ncurses 6.4 h6a678d5_0
networkx 3.0 pypi_0 pypi
numpy 1.23.5 pypi_0 pypi
oauthlib 3.2.2 pypi_0 pypi
openssl 1.1.1t h7f8727e_0
pandas 1.5.3 pypi_0 pypi
pillow 9.3.0 pypi_0 pypi
pip 23.0.1 py38h06a4308_0
plotly 5.13.1 pypi_0 pypi
protobuf 4.22.1 pypi_0 pypi
psutil 5.9.4 pypi_0 pypi
py4j 0.10.9.7 pypi_0 pypi
pyasn1 0.4.8 pypi_0 pypi
pyasn1-modules 0.2.8 pypi_0 pypi
pyg-lib 0.2.0+pt20cu117 pypi_0 pypi
pyparsing 3.0.9 pypi_0 pypi
python 3.8.16 h7a1cb2a_3
python-dateutil 2.8.2 pypi_0 pypi
pytz 2022.7.1 pypi_0 pypi
pyyaml 6.0 pypi_0 pypi
readline 8.2 h5eee18b_0
recbole 1.1.1 pypi_0 pypi
requests 2.28.1 pypi_0 pypi
requests-oauthlib 1.3.1 pypi_0 pypi
rsa 4.9 pypi_0 pypi
scikit-learn 1.2.2 pypi_0 pypi
scipy 1.10.1 pypi_0 pypi
setuptools 65.6.3 py38h06a4308_0
six 1.16.0 pypi_0 pypi
sqlite 3.41.1 h5eee18b_0
sympy 1.11.1 pypi_0 pypi
tabulate 0.9.0 pypi_0 pypi
tenacity 8.2.2 pypi_0 pypi
tensorboard 2.12.0 pypi_0 pypi
tensorboard-data-server 0.7.0 pypi_0 pypi
tensorboard-plugin-wit 1.8.1 pypi_0 pypi
thop 0.1.1-2209072238 pypi_0 pypi
threadpoolctl 3.1.0 pypi_0 pypi
tk 8.6.12 h1ccaba5_0
torch 2.0.0+cu117 pypi_0 pypi
torch-cluster 1.6.1+pt20cu117 pypi_0 pypi
torch-geometric 2.2.0 pypi_0 pypi
torch-scatter 2.1.1+pt20cu117 pypi_0 pypi
torch-sparse 0.6.17+pt20cu117 pypi_0 pypi
torch-spline-conv 1.2.2+pt20cu117 pypi_0 pypi
torchaudio 2.0.1+cu117 pypi_0 pypi
torchvision 0.15.1+cu117 pypi_0 pypi
tqdm 4.65.0 pypi_0 pypi
triton 2.0.0 pypi_0 pypi
typing-extensions 4.4.0 pypi_0 pypi
urllib3 1.26.13 pypi_0 pypi
werkzeug 2.2.3 pypi_0 pypi
wheel 0.38.4 py38h06a4308_0
xz 5.2.10 h5eee18b_1
zipp 3.15.0 pypi_0 pypi
zlib 1.2.13 h5eee18b_0

Hello, it is version 1.1.1. The conda list output is above.

Author

ithok commented Mar 25, 2023

This looks like a hyperopt problem.

Author

ithok commented Mar 26, 2023

Hello, could you tell me which version of hyperopt you are running?

Member

hyp1231 commented Mar 26, 2023

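Hello! Our preliminary diagnosis is that this is a bug in how we adapted to RecBole 1.1.1.

Cause of the bug:
A commit in RecBole 1.1.1 added a new return value to the objective function used for hyperparameter tuning: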

RUCAIBox/RecBole@05a223e#diff-fe46181d8ec96ec5c6e9b1edd0dd0af2b8e4f8b4cab6a7e5a2090ef8a34f20eeL334-R347

However, the hyperparameter tuning objective function in RecBole-GNN does not return the 'model' key:

return {
    'best_valid_score': best_valid_score,
    'valid_score_bigger': config['valid_metric_bigger'],
    'best_valid_result': best_valid_result,
    'test_result': test_result
}
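The mismatch can be reproduced in isolation. The following is only a minimal sketch: `objective_without_model` and `read_model_name` are hypothetical stand-ins (not RecBole or RecBole-GNN functions), and the scores are placeholder values. It shows that a result dict with only the four keys above raises the same `KeyError: 'model'` when a caller indexes `result_dict["model"]`, as the traceback shows `hyper_tuning.py` doing.

```python
# Hypothetical stand-in for the RecBole-GNN objective function:
# it returns only the four keys quoted above, with placeholder values.
def objective_without_model(config):
    return {
        'best_valid_score': 0.0,
        'valid_score_bigger': config['valid_metric_bigger'],
        'best_valid_result': {},
        'test_result': {},
    }


# Hypothetical stand-in for the caller, which (per the traceback)
# looks up result_dict["model"].
def read_model_name(result_dict):
    return result_dict['model']


config = {'model': 'HMLET', 'valid_metric_bigger': True}
result = objective_without_model(config)

try:
    read_model_name(result)
    raised = False
    missing_key = None
except KeyError as err:
    raised = True
    missing_key = err.args[0]  # the key that was not found

print(raised, missing_key)
```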

If you are in a hurry, as a temporary workaround you can add a new key to the return value near line 82 of recbole_gnn/quick_start.py:

return {
  'model': config['model'],
  # ... ...
}
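Spelled out in full, the patched return value might look like the sketch below. The helper name `build_result` and the sample values are assumptions made for illustration; in recbole_gnn/quick_start.py the dict is returned directly from the objective function rather than built by a helper.

```python
# Hypothetical helper mirroring the patched return value: the four original
# keys plus the 'model' key that RecBole 1.1.1 looks up.
def build_result(config, best_valid_score, best_valid_result, test_result):
    return {
        'model': config['model'],
        'best_valid_score': best_valid_score,
        'valid_score_bigger': config['valid_metric_bigger'],
        'best_valid_result': best_valid_result,
        'test_result': test_result,
    }


# Placeholder inputs, not real experiment results.
config = {'model': 'HMLET', 'valid_metric_bigger': True}
patched = build_result(config, 0.0, {}, {})

# The lookup that failed in the traceback now succeeds.
model_name = patched['model']
print(model_name)
```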

We will also fix and test this right away. Thank you for finding this bug!

Author

ithok commented Mar 26, 2023

OK, thank you very much!
