Conversion error: Shape must be rank 1 but is rank 2 #91

Closed
larrywal-express opened this issue Jan 24, 2022 · 31 comments
Labels
YOLOv5 Read the README.

Comments

@larrywal-express

larrywal-express commented Jan 24, 2022

Issue Type

Support, Others

OS

Windows

OS architecture

x86_64

Programming Language

Python

Framework

PyTorch

Download URL for ONNX / OpenVINO IR

model.zip

Convert Script

python openvino2tensorflow.py \
--model_path lite_openvino_model/lite.xml \
--output_saved_model \
--output_pb \
--output_weight_quant_tflite

Description

@PINTO0309 Thanks for your good work.
I got the error shown in the log output below while trying to convert from .xml to saved_model, pb, and tflite. I also tried using replace.json for layer_id 325, but to no avail. How do I solve this problem?

Relevant Log Output

ERROR: Exception encountered when calling layer "tf.reshape_6" (type TFOpLambda).

Shape must be rank 1 but is rank 2 for '{{node tf.reshape_6/Reshape}} = Reshape[T=DT_FLOAT, Tshape=DT_INT64](Placeholder, tf.reshape_6/Reshape/shape)' with input shapes: [1,1,1,256], [2,1].

Call arguments received:
• tensor=tf.Tensor(shape=(1, 1, 1, 256), dtype=float32)
• shape=['tf.Tensor(shape=(1,), dtype=int64)', 'tf.Tensor(shape=(1,), dtype=int64)']
• name=None
ERROR: model_path : lite_openvino_model/lite.xml
ERROR: weights_path: lite_openvino_model/lite.bin
ERROR: layer_id : 326
ERROR: input_layer0 layer_id=312: KerasTensor(type_spec=TensorSpec(shape=(1, 1, 1, 256), dtype=tf.float32, name=None), name='tf.math.reduce_mean/Mean:0', description="created by layer 'tf.math.reduce_mean'")
ERROR: input_layer1 layer_id=325: tf.Tensor(
[[ 1]
[16]], shape=(2, 1), dtype=int64)
ERROR: The trace log is below.
Traceback (most recent call last):
File "openvino2tensorflow.py", line 3758, in convert
tf_layers_dict[layer_id] = tf.reshape(op1, shape)
File "D:\Anaconda3\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "D:\Anaconda3\lib\site-packages\keras\layers\core\tf_op_layer.py", line 107, in handle
return TFOpLambda(op)(*args, **kwargs)
File "D:\Anaconda3\lib\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
ValueError: Exception encountered when calling layer "tf.reshape_6" (type TFOpLambda).

Shape must be rank 1 but is rank 2 for '{{node tf.reshape_6/Reshape}} = Reshape[T=DT_FLOAT, Tshape=DT_INT64](Placeholder, tf.reshape_6/Reshape/shape)' with input shapes: [1,1,1,256], [2,1].

Call arguments received:
• tensor=tf.Tensor(shape=(1, 1, 1, 256), dtype=float32)
• shape=['tf.Tensor(shape=(1,), dtype=int64)', 'tf.Tensor(shape=(1,), dtype=int64)']
• name=None

Source code for simple inference testing code

model.zip
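
(Note: this failure mode is easy to reproduce outside the converter. tf.reshape requires its shape argument to be a rank-1 tensor, while the constant feeding it here is rank 2 with shape (2, 1). A minimal sketch with illustrative values, not the converter's actual code:)

import tensorflow as tf

x = tf.ones((1, 1, 1, 256), dtype=tf.float32)
bad_shape = tf.constant([[1], [256]], dtype=tf.int64)  # rank 2, shape (2, 1)
# tf.reshape(x, bad_shape)  # ValueError: Shape must be rank 1 but is rank 2

good_shape = tf.squeeze(bad_shape, axis=1)             # rank 1, shape (2,)
y = tf.reshape(x, good_shape)                          # OK: shape (1, 256)

Squeezing the rank-2 constant down to rank 1 is the same idea as the Squeeze insertions in the replace.json suggested below.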

@PINTO0309
Owner

PINTO0309 commented Jan 24, 2022

Starting the investigation. By the way, I am asking based only on the model's structure: is this YOLOv5-Lite?
https://github.com/PINTO0309/PINTO_model_zoo/tree/main/180_YOLOv5-Lite

@larrywal-express
Author

@PINTO0309 Yes, but it is modified.

@PINTO0309
Owner

Thanks for your help. It would be better if you could include the architecture of the model if possible, so that other engineers can easily find the issue when they search for it.

Either way, I'll get to work. Please wait a moment.

@larrywal-express
Author

@PINTO0309 Okay.
Thanks

@PINTO0309
Owner

Since all Gather slice positions were being processed as NCHW, resulting in an error, I created a JSON that adjusts the axes assuming NHWC (see the axis-mapping sketch below). This requires analyzing what is displayed in the error message to see at which layer the slice is misaligned.

The other point is that the last 5D Reshape and 5D Transpose transformation shapes are broken and need to be adjusted.

  • replace.json
{
    "format_version": 2,
    "layers": [
        {
            "layer_id": "316",
            "type": "Squeeze",
            "replace_mode": "insert_after",
            "values": [
                0
            ]
        },
        {
            "layer_id": "320",
            "type": "Const",
            "replace_mode": "direct",
            "values": [
                3
            ]
        },
        {
            "layer_id": "321",
            "type": "Const",
            "replace_mode": "direct",
            "values": [
                0
            ]
        },
        {
            "layer_id": "322",
            "type": "Squeeze",
            "replace_mode": "insert_after",
            "values": [
                0
            ]
        },
        {
            "layer_id": "377",
            "type": "Squeeze",
            "replace_mode": "insert_after",
            "values": [
                0
            ]
        },
        {
            "layer_id": "383",
            "type": "Const",
            "replace_mode": "direct",
            "values": [
                1
            ]
        },
        {
            "layer_id": "384",
            "type": "Const",
            "replace_mode": "direct",
            "values": [
                0
            ]
        },
        {
            "layer_id": "385",
            "type": "Squeeze",
            "replace_mode": "insert_after",
            "values": [
                0
            ]
        },
        {
            "layer_id": "389",
            "type": "Const",
            "replace_mode": "direct",
            "values": [
                2
            ]
        },
        {
            "layer_id": "390",
            "type": "Const",
            "replace_mode": "direct",
            "values": [
                0
            ]
        },
        {
            "layer_id": "391",
            "type": "Squeeze",
            "replace_mode": "insert_after",
            "values": [
                0
            ]
        },
        {
            "layer_id": "394",
            "type": "Const",
            "replace_mode": "direct",
            "values": [
                1,
                16,
                16,
                3,
                9
            ]
        },
        {
            "layer_id": "396",
            "type": "Const",
            "replace_mode": "direct",
            "values": [
                0,
                3,
                1,
                2,
                4
            ]
        },
        {
            "layer_id": "432",
            "type": "Squeeze",
            "replace_mode": "insert_after",
            "values": [
                0
            ]
        },
        {
            "layer_id": "438",
            "type": "Const",
            "replace_mode": "direct",
            "values": [
                1
            ]
        },
        {
            "layer_id": "439",
            "type": "Const",
            "replace_mode": "direct",
            "values": [
                0
            ]
        },
        {
            "layer_id": "440",
            "type": "Squeeze",
            "replace_mode": "insert_after",
            "values": [
                0
            ]
        },
        {
            "layer_id": "444",
            "type": "Const",
            "replace_mode": "direct",
            "values": [
                2
            ]
        },
        {
            "layer_id": "445",
            "type": "Const",
            "replace_mode": "direct",
            "values": [
                0
            ]
        },
        {
            "layer_id": "446",
            "type": "Squeeze",
            "replace_mode": "insert_after",
            "values": [
                0
            ]
        },
        {
            "layer_id": "449",
            "type": "Const",
            "replace_mode": "direct",
            "values": [
                1,
                32,
                32,
                3,
                9
            ]
        },
        {
            "layer_id": "451",
            "type": "Const",
            "replace_mode": "direct",
            "values": [
                0,
                3,
                1,
                2,
                4
            ]
        }
    ]
}

I have not checked at all whether the model works correctly. Please check for yourself, and if there are any problems with the structure, you can look into them yourself.
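
For what it's worth, the Const axis overrides above follow mechanically from the NCHW-to-NHWC layout change. A minimal sketch of the reasoning (my annotation, assuming the original Gather axes pointed at C, H, and W of the NCHW graph; this is not the converter's code):

NCHW_TO_NHWC_AXIS = {0: 0, 1: 3, 2: 1, 3: 2}  # N stays first, C moves last, H/W shift forward

def remap_gather_axis(nchw_axis: int) -> int:
    # Map a Gather axis defined on the NCHW graph to its NHWC equivalent.
    return NCHW_TO_NHWC_AXIS[nchw_axis]

print(remap_gather_axis(1))  # 3 -- presumably the "values": [3] override for layer 320
print(remap_gather_axis(2))  # 1 -- presumably the "values": [1] override for layer 383
print(remap_gather_axis(3))  # 2 -- presumably the "values": [2] override for layer 389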

@PINTO0309
Owner

model_float32 tflite (attached converted model)

@larrywal-express
Author

larrywal-express commented Jan 24, 2022

@PINTO0309 Wonderful. Thanks for your support. I noted that the shape="" of layer ids 375, 376, 430, 431, 455, and 480 was not replaced. Also, how do I determine whether the values should be 0, 1, 2, etc., with reference to the JSON below? And can you recommend software for viewing the .bin file?

        {
            "layer_id": "320",
            "type": "Const",
            "replace_mode": "direct",
            "values": [
                3
            ]
        },
        {
            "layer_id": "321",
            "type": "Const",
            "replace_mode": "direct",
            "values": [
                0
            ]
        }

@PINTO0309
Owner

PINTO0309 commented Jan 24, 2022

I am not sure what you were expecting. All layers between Convolution and Transpose are garbage. In other words, ShapeOf, Gather, Concat, and Unsqueeze were all determined to be unnecessary by the optimizer. In my experience, layers that are used only for shape estimation degrade performance during inference, so the behavior of these optimizers makes sense.

  • openvino (structure screenshot)
  • tflite (structure screenshot)

"320" and "321" have also been deleted for the same reason. ShapeOf, Gather, Concat, and Unsqueeze are not needed during inference.
(screenshot of the removed layers)

These layers are only needed if the input image is undefined in terms of height and width, and are essentially unnecessary if the input geometry is statically determined.
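
As a concrete illustration (a sketch with assumed shapes, not taken from this model): with a statically known input, the dynamic ShapeOf -> Gather -> Concat chain collapses into a constant reshape target, which is exactly what the optimizer exploits.

import tensorflow as tf

x = tf.ones((1, 16, 16, 27))  # assume a statically known activation (27 = 3 anchors * 9 values)

# Dynamic variant: only needed when H/W are undefined at conversion time.
shape = tf.shape(x)  # runtime ShapeOf
y_dynamic = tf.reshape(x, tf.stack([shape[0], shape[1], shape[2], 3, 9]))

# Static equivalent: the whole shape-computation chain folds into one constant.
y_static = tf.reshape(x, [1, 16, 16, 3, 9])

assert y_dynamic.shape == y_static.shape  # (1, 16, 16, 3, 9)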

@larrywal-express
Author

larrywal-express commented Jan 26, 2022

Thanks. I noticed that the tested output of ONNX differs from that of tflite, as shown below. For this reason, the tflite target boxes were inaccurate, with wrong detections.
(tflite detection screenshot)

ONNX output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
elapsed time: 15.64478874206543ms
shape: (1, 3840, 9)
[array([[[5.2931595e+00, 9.6953697e+00, 2.4912457e+01, ...,
         1.5526780e-01, 2.2650501e-01, 4.0879369e-01],
        [2.0134949e+01, 6.2680893e+00, 3.3342621e+01, ...,
         2.1189997e-01, 2.2379622e-01, 4.7713119e-01],
        [3.8248707e+01, 1.7845745e+00, 3.6460529e+01, ...,
         1.0526398e-01, 1.7205426e-01, 8.1319189e-01],
        ...,
        [4.3373840e+02, 4.8147830e+02, 1.3926814e+02, ...,
         1.4624476e-02, 1.3875341e-01, 2.2544542e-01],
        [4.6488284e+02, 4.8298419e+02, 1.3351683e+02, ...,
         1.9375145e-02, 2.0011839e-01, 2.2959533e-01],
        [4.8577332e+02, 4.8264972e+02, 1.1711755e+02, ...,
         2.0683646e-02, 2.8468835e-01, 3.3594516e-01]]], dtype=float32)]
tflite output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
elapsed time: 141.2348747253418ms
shape: (1, 3840, 9)
array([[[ 3.55738926e+00,  3.09487915e+00,  1.86278858e+01, ...,
          7.89284706e-03,  9.33742523e-03,  9.98364687e-01],
        [ 1.97514572e+01,  2.04518318e+00,  2.98684349e+01, ...,
          1.28795207e-02,  6.61897659e-03,  9.98675346e-01],
        [ 3.79077911e+01, -1.81230640e+00,  3.11474991e+01, ...,
          1.60465279e-05,  8.62568617e-04,  9.99998212e-01],
        ...,
        [ 4.29759338e+02,  4.83508667e+02,  1.08239449e+02, ...,
          1.31325126e-02,  4.26258028e-01,  3.09236467e-01],
        [ 4.56263580e+02,  4.84719513e+02,  1.05472176e+02, ...,
          1.98948979e-02,  5.13177693e-01,  4.79941279e-01],
        [ 4.89616119e+02,  4.88561218e+02,  1.09522301e+02, ...,
          6.32282197e-02,  2.74803936e-01,  6.55043006e-01]]],
      dtype=float32)

(onnx detection screenshot)

@PINTO0309
Owner

PINTO0309 commented Jan 26, 2022

docker run --gpus all -it --rm \
-v `pwd`:/home/user/workdir \
ghcr.io/pinto0309/openvino2tensorflow:latest

# simplify the ONNX graph in place
python3 -m onnxsim lite.onnx lite.onnx

# regenerate the OpenVINO IR from the simplified ONNX
$INTEL_OPENVINO_DIR/deployment_tools/model_optimizer/mo.py \
--input_model lite.onnx \
--data_type FP32
  • replace.json
{
    "format_version": 2,
    "layers": [
        {
            "layer_id": "358",
            "type": "Const",
            "replace_mode": "direct",
            "values": [
                0,
                3,
                1,
                2,
                4
            ]
        },
        {
            "layer_id": "393",
            "type": "Const",
            "replace_mode": "direct",
            "values": [
                0,
                3,
                1,
                2,
                4
            ]
        }
    ]
}
openvino2tensorflow \
--model_path lite.xml \
--output_saved_model \
--output_pb \
--output_no_quant_float32_tflite \
--weight_replacement_config replace.json
  • onnx_tflite_test.py
import onnxruntime
import tensorflow as tf
import time
import numpy as np
from pprint import pprint

H=512
W=512
MODEL='model_float32'

############################################################

onnx_session = onnxruntime.InferenceSession('lite.onnx')
input_name = onnx_session.get_inputs()[0].name
output_name = onnx_session.get_outputs()[0].name

roop = 1
e = 0.0
result = None
inp = np.ones((1,3,H,W), dtype=np.float32)  # NCHW dummy input for ONNX
for _ in range(roop):
    s = time.time()
    result = onnx_session.run(
        [output_name],
        {input_name: inp}
    )
    e += (time.time() - s)
print('ONNX output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@')
print(f'elapsed time: {e/roop*1000}ms')
print(f'shape: {result[0].shape}')
pprint(result)

############################################################

interpreter = tf.lite.Interpreter(model_path=f'{MODEL}.tflite', num_threads=4)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

roop = 1
e = 0.0
result = None
inp = np.ones((1,H,W,3), dtype=np.float32)  # NHWC dummy input for tflite
for _ in range(roop):
    s = time.time()
    interpreter.set_tensor(input_details[0]['index'], inp)
    interpreter.invoke()
    result = interpreter.get_tensor(output_details[1]['index'])  # the (1, 3840, 9) detection output
    e += (time.time() - s)
print('tflite output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@')
print(f'elapsed time: {e/roop*1000}ms')
print(f'shape: {result.shape}')
pprint(result)
python3 onnx_tflite_test.py
ONNX output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
elapsed time: 6.359338760375977ms
shape: (1, 3840, 9)
[array([[[5.2931595e+00, 9.6953697e+00, 2.4912457e+01, ...,
         1.5526780e-01, 2.2650501e-01, 4.0879369e-01],
        [2.0134949e+01, 6.2680893e+00, 3.3342621e+01, ...,
         2.1189997e-01, 2.2379622e-01, 4.7713119e-01],
        [3.8248707e+01, 1.7845745e+00, 3.6460529e+01, ...,
         1.0526398e-01, 1.7205426e-01, 8.1319189e-01],
        ...,
        [4.3373840e+02, 4.8147830e+02, 1.3926814e+02, ...,
         1.4624476e-02, 1.3875341e-01, 2.2544542e-01],
        [4.6488284e+02, 4.8298419e+02, 1.3351683e+02, ...,
         1.9375145e-02, 2.0011839e-01, 2.2959533e-01],
        [4.8577332e+02, 4.8264972e+02, 1.1711755e+02, ...,
         2.0683646e-02, 2.8468835e-01, 3.3594516e-01]]], dtype=float32)]
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
tflite output @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
elapsed time: 23.181676864624023ms
shape: (1, 3840, 9)
array([[[5.29315758e+00, 9.69537735e+00, 2.49124527e+01, ...,
         1.55268013e-01, 2.26505280e-01, 4.08792853e-01],
        [2.01349468e+01, 6.26809692e+00, 3.33426208e+01, ...,
         2.11900204e-01, 2.23796397e-01, 4.77130651e-01],
        [3.82487106e+01, 1.78458500e+00, 3.64605331e+01, ...,
         1.05264165e-01, 1.72054380e-01, 8.13191533e-01],
        ...,
        [4.33738434e+02, 4.81478302e+02, 1.39268143e+02, ...,
         1.46242362e-02, 1.38754994e-01, 2.25445867e-01],
        [4.64882843e+02, 4.82984192e+02, 1.33516907e+02, ...,
         1.93749964e-02, 2.00120524e-01, 2.29595125e-01],
        [4.85773315e+02, 4.82649719e+02, 1.17117645e+02, ...,
         2.06834618e-02, 2.84691095e-01, 3.35944831e-01]]], dtype=float32)
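
Rather than eyeballing the printed arrays, the two outputs can also be compared numerically. A short sketch (assumes the ONNX and tflite results from the script above are kept in the hypothetical variables onnx_result and tflite_result instead of both reusing result):

import numpy as np

# onnx_result = onnx_session.run(...)[0]; tflite_result = interpreter.get_tensor(...)
print(np.allclose(onnx_result, tflite_result, atol=1e-4))  # True if they match within tolerance
print(np.abs(onnx_result - tflite_result).max())           # largest elementwise deviation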

@larrywal-express
Author

@PINTO0309 The process was successful, with equal outputs, but the detection result obtained with detect.py is still wrong, similar to my earlier image post.

@PINTO0309
Owner

Please attach the code you are using for inference testing; otherwise this exchange will be inefficient.

@larrywal-express
Author

https://github.com/zldrobit/yolov5/blob/tf-android/detect.py

The above is what I used for testing. I am also trying to run it on Android.

@PINTO0309
Owner

PINTO0309 commented Jan 28, 2022

Can you provide me with one still image for testing? I have no good way of knowing what your model is inferring. Also, is the model you used Float32, INT8, or Float16? Please describe it in detail.

@larrywal-express
Author

(test image: melon_boyang)
I used Float32.

@larrywal-express
Author

The model was trained on four classes.

@PINTO0309
Owner

Do you have any ONNX test code with which you were able to run inference successfully?

@larrywal-express
Author

1.zip
2.zip

The code is attached above. Meanwhile, I noticed that the output of my tflite model is 5D instead of 4D.

@PINTO0309
Owner

PINTO0309 commented Jan 28, 2022

The figure below shows the ONNX file you provided.
So output2 and output3 are 5D.
(screenshot: output2 and output3 shapes)

Only output1 was in 3D. Is there a problem?
(screenshot: output1 shape)

What I don't understand is whether it is a problem with my conversion tool or with the structure of ONNX.

@larrywal-express
Author

The 3D output1 is okay, but the 5D output2 and output3 don't align with the original yolov5.tflite. For example, my output2 is [1,16,16,3,9], while yolov5's is [1,256,3,9]. The split [16,16] and [32,32] spatial dimensions in my tflite appear multiplied out to [256] and [1024] in yolov5.tflite.

@larrywal-express
Author

lite.tflite (structure screenshot)
yolov5.tflite (structure screenshot)

I'm just thinking that if there were a way to convert the 5D output to 4D, the detection might work fine.
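
If a 4D head really is what the downstream code expects, merging the two spatial axes is just a reshape. A minimal sketch (illustrative only; the proper fix belongs in the PyTorch-to-ONNX export):

import tensorflow as tf

out5d = tf.ones((1, 16, 16, 3, 9))             # [N, H, W, anchors, preds]
out4d = tf.reshape(out5d, (1, 16 * 16, 3, 9))  # (1, 256, 3, 9), the yolov5.tflite layout
print(out4d.shape)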

@PINTO0309
Owner

I'll say it again. The .onnx that I extracted by unzipping the zip file you attached in your first comment is already 5D. You keep pointing to the .tflite file, but it's not the .tflite that's the problem. The problem is with your model.

Check the ONNX file first.

@larrywal-express
Author

Alright... is there a way to solve this problem? Because the tested ONNX works fine compared to the tflite...

@PINTO0309
Owner

PINTO0309 commented Jan 29, 2022

First, I have not seen the structure of the YOLOv5-Lite model. So I do not know if any final structure is correct.
However, you said this early in the issue.

Yes, but it is modified

However, nowhere in the comments on this issue do you say how you modified the model. It is obvious that the model is already 5D when you export from PyTorch to ONNX, and all I can say at this point is that you made some mistake when you modified the PyTorch model.

Your lite.onnx
(screenshots of the lite.onnx output structure)

If 4D is correct, please provide the 4D formatted ONNX first, because I don't know the correct structure after Transpose. Note that a discussion of how to correctly export YOLOv5-Lite models from PyTorch to ONNX is beyond the scope of this repository.

@larrywal-express
Author

(screenshot: yolov5 ONNX outputs)

Above is the yolov5 ONNX, which is also 5D.

I converted the .pt to ONNX using the yolov5 repo; that is the ONNX I sent to you. I followed all your instructions up to the final tflite stage.

@PINTO0309
Owner

The converted model does have a 3D output. What could be the problem?
(screenshot: converted model output)

@larrywal-express
Author

Yes, correct. The conversion was successful. I am also confused about the tested detection results.

@PINTO0309
Owner

I assure you. It's not a problem with my tools.

The ONNX that you generated from PyTorch has three outputs. Does the official model also have three outputs? If the official model has one output instead of three, then some mistake was made when you first generated the ONNX.

  • lite.onnx (structure screenshot)

Yes, correct.

I'm afraid that's probably not correct.
Shouldn't there be three layers feeding the [1,N,9] output in the first place? Your lite.onnx has only two.

First, please consult the experts in the yolov5-lite repository. Then, when you are able to generate ONNX with the correct structure, please come back to this repository.
https://github.com/ppogg/YOLOv5-Lite

@larrywal-express
Author

shufflenet.zip

Above is the model generated from a customized dataset using the original YOLOv5-Lite. I actually used two detection scales instead of the three scales used by the original yolov5.

@larrywal-express
Author

lite4D.zip
@PINTO0309 I have succeeded in converting the 5D ONNX to 4D, as attached above. Now I want to convert it to tflite, but I am still facing some errors. Can you please help me out?

@PINTO0309
Owner

I apologize for the inconvenience, but could you open a separate issue so that the various discussions don't get mixed up?
