Export Model to ONNX as FP16 Not Working #305

Open
RhinoInani opened this issue Jul 9, 2024 · 0 comments

Hello,
I have been trying to export the InternImage classification model to ONNX in fp16, but I keep running into issues because certain layers are not being converted to fp16.

Here are the steps I have taken so far:

1. I have followed issue: #245

Changed dcnv3.py and dcnv3_func.py to force dtype=torch.float16 instead of torch.float (a sketch of the pattern follows).
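For illustration, the recurring pattern of the change looks like this (a sketch with hypothetical names and sizes, not the exact diff from the repo):

import torch

# Sketch only: the recurring edit in dcnv3.py / dcnv3_func.py is that tensors
# created with an explicit float32 dtype (e.g. the sampling reference points)
# now use float16 so they match the .half() weights.
H_, W_ = 56, 56  # hypothetical feature-map size
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# before: dtype=torch.float
ref_y = torch.linspace(0.5, H_ - 0.5, H_, dtype=torch.float16, device=device)
ref_x = torch.linspace(0.5, W_ - 0.5, W_, dtype=torch.float16, device=device)
grid = torch.stack(torch.meshgrid(ref_y, ref_x, indexing='ij'), dim=-1)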

2. I have also cast the model to .half() in export.py in the classification folder.

Here is my updated torch2onnx function:

def torch2onnx(args, cfg):
    model = get_model(args, cfg).eval().cuda()

    # speed_test(model)

    onnx_name = f'{args.model_name}_half.onnx'
    torch.onnx.export(model.half(),
                      torch.rand(1, 3, args.size, args.size).cuda().half(),
                      onnx_name,
                      opset_version=16,
                      do_constant_folding=False,
                      input_names=['input'],
                      output_names=['output'])

    return model
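Would the recommended route instead be to export in fp32 and convert the saved graph afterwards? A minimal sketch of what I mean, using the onnxconverter-common package (the file names here are just from my setup):

import onnx
from onnxconverter_common import float16

# Load an fp32 export, convert initializers/ops to fp16, and save.
# keep_io_types=True leaves the graph inputs/outputs as float32.
model_fp32 = onnx.load('intern_image_b_1k_224.onnx')
model_fp16 = float16.convert_float_to_float16(model_fp32, keep_io_types=True)
onnx.save(model_fp16, 'intern_image_b_1k_224_fp16.onnx')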

The export itself completes, but the later steps for testing the ONNX model fail due to layer type issues, as described below.

3. Changed core_op in the corresponding YAML file to 'DCNv3_pytorch' (so the exported graph uses the pure-PyTorch DCNv3 implementation rather than the custom CUDA op), as shown below:

File: classification/configs/internimage_b_1k_224.yaml

DATA:
  IMG_ON_MEMORY: True
MODEL:
  TYPE: intern_image
  DROP_PATH_RATE: 0.5
  INTERN_IMAGE:
    CORE_OP: 'DCNv3_pytorch'
    DEPTHS: [4, 4, 21, 4]
    GROUPS: [7, 14, 28, 56]
    CHANNELS: 112
    LAYER_SCALE: 1e-5
    OFFSET_SCALE: 1.0
    MLP_RATIO: 4.0
    POST_NORM: True
TRAIN:
  EMA:
    ENABLE: True
    DECAY: 0.9999
  BASE_LR: 5e-4

4. Created an onnxruntime InferenceSession from the exported ONNX file:

inference_sess = ort.InferenceSession(onnx_file, providers=['CUDAExecutionProvider'], core_op='DCNv3_pytorch', sess_options=ort.SessionOptions())

This is the error I am running into when the line above is run:

Traceback (most recent call last):
  File "onnx_intern_image_test.py", line 119, in <module>
    inference_sess = ort.InferenceSession(onnx_file, providers=['CUDAExecutionProvider'], core_op='DCNv3_pytorch', sess_options=ort.SessionOptions())
  File "~*******/python3.7/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 360, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "~*****/miniconda3/envs/internimage/lib/python3.7/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 397, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from ~*******InternImage/classification/intern_image_b_1k_224_half.onnx failed:Type Error: Type parameter (T) of Optype (Div) bound to different types (tensor(float) and tensor(float16) in node (Div_209).

As the error shows, the two inputs to node Div_209 have different types (tensor(float) vs. tensor(float16)), so at least one operand of that division was not converted to fp16.
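For reference, the offending node can be located without onnxruntime by loading the graph with the onnx package (a quick diagnostic sketch):

import onnx

model = onnx.load('intern_image_b_1k_224_half.onnx')

# full_check=True runs type/shape inference and surfaces the same kind of
# mismatch that onnxruntime reports at session creation.
try:
    onnx.checker.check_model(model, full_check=True)
except Exception as e:
    print(e)

# List the inputs of every Div node to find the offending one (Div_209).
for node in model.graph.node:
    if node.op_type == 'Div':
        print(node.name, node.input)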

Please let me know if there are any fixes or known ways to convert the classification model to fp16 for ONNX.

Thanks in advance!
