-
Notifications
You must be signed in to change notification settings - Fork 529
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] Model compression fails for some descriptor neuron structures #1370
Labels
Comments
Currently only |
We only support the case |
Thanks for the immediate and clear answer! |
Feature request in #1422. |
njzjz
added a commit
to njzjz/deepmd-kit
that referenced
this issue
Oct 18, 2023
Fix deepmodeling#1370. Signed-off-by: Jinzhe Zeng <[email protected]>
wanghan-iapcm
pushed a commit
that referenced
this issue
Oct 19, 2023
Fix #1370. Support compressing any neuron structure; fix compression error when `neuron[0]` is 1 (although no one will set `neuron[0]` to 1). Signed-off-by: Jinzhe Zeng <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Summary
When I compress the model with some neuron structures of the descriptor, it fails.
When I use the neuron of [25, 50, 100], it succeeds.
But when I use [50, 50, 50] or [100, 50, 25], it fails.
Deepmd-kit version, installation way, input file, running commands, error log, etc.
DP-kit version: v2.0.2
installation: source install
running commands: dp compress -i frozen_model.pb -t input.json
frozen_model.pb files will be attached.
frozen_model.zip
input.json:
`
{
"_comment": " model parameters",
"model": {
"type_map": ["Ge", "Sb", "Te"],
"descriptor" :{
"type": "se_e2_a",
"sel": [30, 30, 30],
"rcut_smth": 0.10,
"rcut": 6.00,
"neuron": [100, 50, 25],
"resnet_dt": false,
"axis_neuron": 16,
"activation_function": "tanh",
"seed": 1,
"_comment": " that's all"
},
"fitting_net" : {
"neuron": [200, 200, 200],
"activation_function": "tanh",
"resnet_dt": true,
"seed": 1,
"_comment": " that's all"
},
"_comment": " that's all"
},
}
`
Error log:
`
Traceback (most recent call last):
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 1853, in _create_c_op
c_op = pywrap_tf_session.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimensions must be equal, but are 50 and 200 for '{{node add_4/add}} = AddV2[T=DT_DOUBLE](Tanh_1, concat)' with input shapes: [1501,50], [1501,200].
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/ops/math_ops.py", line 1175, in binary_op_wrapper
out = r_op(x)
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/ops/math_ops.py", line 1195, in r_binary_op_wrapper
return func(x, y, name=name)
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
return target(*args, **kwargs)
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/ops/math_ops.py", line 1486, in _add_dispatch
return gen_math_ops.add_v2(x, y, name=name)
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/ops/gen_math_ops.py", line 481, in add_v2
_, _, _op, _outputs = _op_def_library._apply_op_helper(
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/framework/op_def_library.py", line 748, in _apply_op_helper
op = g._create_op_internal(op_type_name, inputs, dtypes=None,
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 3528, in _create_op_internal
ret = Operation(
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 2015, in init
self._c_op = _create_c_op(self._graph, node_def, inputs,
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 1856, in _create_c_op
raise ValueError(str(e))
ValueError: Dimensions must be equal, but are 50 and 200 for '{{node add_4/add}} = AddV2[T=DT_DOUBLE](Tanh_1, concat)' with input shapes: [1501,50], [1501,200].
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/grad/ssrokyz/.conda/envs/tf24/bin/dp", line 33, in
sys.exit(load_entry_point('deepmd-kit==2.0.2', 'console_scripts', 'dp')())
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/deepmd_kit-2.0.2-py3.8-linux-x86_64.egg/deepmd/entrypoints/main.py", line 447, in main
compress(**dict_args)
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/deepmd_kit-2.0.2-py3.8-linux-x86_64.egg/deepmd/entrypoints/compress.py", line 119, in compress
train(
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/deepmd_kit-2.0.2-py3.8-linux-x86_64.egg/deepmd/entrypoints/train.py", line 103, in train
_do_work(jdata, run_opt, is_compress)
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/deepmd_kit-2.0.2-py3.8-linux-x86_64.egg/deepmd/entrypoints/train.py", line 159, in _do_work
model.build(train_data, stop_batch)
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/deepmd_kit-2.0.2-py3.8-linux-x86_64.egg/deepmd/train/trainer.py", line 285, in build
self.descrpt.enable_compression(self.model_param['compress']["min_nbor_dist"], self.model_param['compress']['model_file'], self.model_param['compress']['table_config'][0], self.model_param['compress']['table_config'][1], self.model_param['compress']['table_config'][2], self.model_param['compress']['table_config'][3])
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/deepmd_kit-2.0.2-py3.8-linux-x86_64.egg/deepmd/descriptor/se_a.py", line 335, in enable_compression
= self.table.build(min_nbor_dist,
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/deepmd_kit-2.0.2-py3.8-linux-x86_64.egg/deepmd/utils/tabulate.py", line 148, in build
vv, dd, d2 = self._make_data(xx, ii)
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/deepmd_kit-2.0.2-py3.8-linux-x86_64.egg/deepmd/utils/tabulate.py", line 236, in _make_data
tt, zz = self._layer_1(
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/deepmd_kit-2.0.2-py3.8-linux-x86_64.egg/deepmd/utils/tabulate.py", line 255, in _layer_1
return t, self.activation_fn(tf.matmul(x, w) + b) + t
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/ops/math_ops.py", line 1180, in binary_op_wrapper
raise e
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/ops/math_ops.py", line 1164, in binary_op_wrapper
return func(x, y, name=name)
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
return target(*args, **kwargs)
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/ops/math_ops.py", line 1486, in _add_dispatch
return gen_math_ops.add_v2(x, y, name=name)
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/ops/gen_math_ops.py", line 481, in add_v2
_, _, _op, _outputs = _op_def_library._apply_op_helper(
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/framework/op_def_library.py", line 748, in _apply_op_helper
op = g._create_op_internal(op_type_name, inputs, dtypes=None,
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 3528, in _create_op_internal
ret = Operation(
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 2015, in init
self._c_op = _create_c_op(self._graph, node_def, inputs,
File "/home/grad/ssrokyz/.conda/envs/tf24/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 1856, in _create_c_op
raise ValueError(str(e))
ValueError: Dimensions must be equal, but are 50 and 200 for '{{node add_4}} = AddV2[T=DT_DOUBLE](Tanh_1, concat)' with input shapes: [1501,50], [1501,200].
`
The text was updated successfully, but these errors were encountered: