[Build] Fails to parse FP16 LayerNormalization in opset>=18 #16341

galagam · 2023-06-13T16:42:05Z

Describe the issue

FP16 LayerNormalization fails for opsets 18 and 19 but works for opset 17. Session creation throws this error:

[ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. In Node, ("", ReduceMean, "", -1) : ("_0x19959b0_XU": tensor(float),) -> ("_0x19959b0_Mean2D",) , Error Unrecognized attribute: axes for operator ReduceMean

FP32 LayerNormalization is not affected.
FP16 LayerNormalization in opset 17 works as expected.

A short Python script that reproduces the failure is under "Build script" below.
It was run with the latest ONNX and ONNX Runtime releases:
pip install onnx==1.14.0 onnxruntime==1.15.0

Urgency

Required for NVIDIA project, can't share full details publicly.

Target platform

x86 Ubuntu 20.04

Build script

#!/usr/bin/python3

# Requires: pip3 install onnx==1.14.0 onnxruntime==1.15.0

import onnxruntime as ort
import onnx
from onnx import TensorProto
from onnx import helper


def gen_model(opset, precision):
    x = helper.make_tensor_value_info('x', precision, shape=(8,4))
    scale = helper.make_tensor_value_info('scale', precision, shape=(4,))
    y = helper.make_tensor_value_info('y', precision, shape=(8,4))

    node1 = helper.make_node('LayerNormalization', ['x', 'scale'], ['y'])

    graph_def = helper.make_graph(
        [node1],
        'layernorm-fp16',
        [x, scale],
        [y],
    )

    model_def = helper.make_model(graph_def, producer_name='onnx')
    model_def.opset_import[0].version = opset

    # Check and save model to file
    onnx.checker.check_model(model_def)
    print('The model  is checked!')

    filename = f'/tmp/model-layernorm-fp16-opset-{opset}.onnx'
    onnx.save(model_def, filename)
    return filename


def create_session(filename):
    try:
        s = ort.InferenceSession(filename)
        print('Created onnxruntime session!')
    except Exception as e:
        print(f'Failed to create onnxruntime session {e}')


if __name__ == '__main__':
    precisions = (TensorProto.FLOAT, TensorProto.FLOAT16)
    opsets = (17, 18, 19)

    for opset in opsets:
        for precision in precisions:
            print(f'Testing LayerNorm node for opset {opset}, precision {precision}:')
            filename = gen_model(opset=opset, precision=precision)
            create_session(filename)
            print('-'*50)

Error / output

Testing LayerNorm node for opset 17, precision 1:
The model  is checked!
Created onnxruntime session!
--------------------------------------------------
Testing LayerNorm node for opset 17, precision 10:
The model  is checked!
Created onnxruntime session!
--------------------------------------------------
Testing LayerNorm node for opset 18, precision 1:
The model  is checked!
Created onnxruntime session!
--------------------------------------------------
Testing LayerNorm node for opset 18, precision 10:
The model  is checked!
Failed to create onnxruntime session [ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. In Node, ("", ReduceMean, "", -1) : ("_0x1a7af10_XU": tensor(float),) -> ("_0x1a7af10_Mean2D",) , Error Unrecognized attribute: axes for operator ReduceMean
--------------------------------------------------
Testing LayerNorm node for opset 19, precision 1:
The model  is checked!
Created onnxruntime session!
--------------------------------------------------
Testing LayerNorm node for opset 19, precision 10:
The model  is checked!
Failed to create onnxruntime session [ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. In Node, ("", ReduceMean, "", -1) : ("_0x19959b0_XU": tensor(float),) -> ("_0x19959b0_Mean2D",) , Error Unrecognized attribute: axes for operator ReduceMean
--------------------------------------------------

Visual Studio Version

No response

GCC / Compiler Version

No response


YUNQIUGUO · 2023-06-13T20:06:15Z

Does the ONNX model you tested contain a ReduceMean operator? The error message shows the failure at a ReduceMean node.

[ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. In Node, ("", ReduceMean, "", -1) : ("_0x19959b0_XU": tensor(float),) -> ("_0x19959b0_Mean2D",) , Error Unrecognized attribute: axes for operator ReduceMean

FYI, opset 18 changed ReduceMean so that axes is an optional input. Before opset 18, axes was an attribute. The "Unrecognized attribute: axes" error may indicate a node that still passes axes as an attribute under opset 18.

-ReduceMean-18:
https://github.com/onnx/onnx/blob/main/docs/Operators.md#ReduceMean
-ReduceMean-13:
https://github.com/onnx/onnx/blob/main/docs/Changelog.md#ReduceMean-13

You may want to update your model if possible.

tianleiwu · 2023-06-13T20:27:54Z

This is the output with ONNX 1.14 or 1.13 and onnxruntime-gpu 1.15 with ort.InferenceSession(filename, providers=['CUDAExecutionProvider', 'CPUExecutionProvider']):

python test_ln.py
Testing LayerNorm node for opset 17, precision 1:
The model  is checked!
Created onnxruntime session!
--------------------------------------------------
Testing LayerNorm node for opset 17, precision 10:
The model  is checked!
Created onnxruntime session!
--------------------------------------------------
Testing LayerNorm node for opset 18, precision 1:
The model  is checked!
Created onnxruntime session!
--------------------------------------------------
Testing LayerNorm node for opset 18, precision 10:
The model  is checked!
Created onnxruntime session!
--------------------------------------------------
Testing LayerNorm node for opset 19, precision 1:
The model  is checked!
Created onnxruntime session!
--------------------------------------------------
Testing LayerNorm node for opset 19, precision 10:
The model  is checked!
Created onnxruntime session!
--------------------------------------------------

I see no issue on my machine (Ubuntu 20.04).

galagam · 2023-06-15T17:50:09Z

Thanks for the prompt replies.
@YUNQIUGUO - there's no ReduceMean in the model. The graph contains only a single LayerNormalization node, but I'm assuming ONNX Runtime composes the LayerNorm implementation from several more basic ops, such as ReduceMean.

@tianleiwu I can confirm this works when I enable the CUDAExecutionProvider, so the issue is only with the CPU implementation.

tianleiwu · 2023-06-16T00:51:05Z

Currently, the CPU EP supports the float version of LayerNormalization up to opset 17 (as in #12978). @skottmckay

For opset 18/19, when there is no implementation of LayerNormalization in the CPU EP, it falls back to the ONNX function of opset 18. That error might have the same root cause as #16438.

The solution is to extend the float version of LayerNormalization in CPU EP to opset 18.

justinchuby · 2023-06-22T23:21:19Z

> For opset 18/19, when there is no implementation of LayerNormalization in the CPU EP, it falls back to the ONNX function of opset 18. That error might have the same root cause as #16438.

Looks very likely. Is there a way for ORT to select the correct decomposition there?

justinchuby · 2023-06-22T23:47:01Z

Added an observation in #16438.

galagam added the build build issues; typically submitted using template label Jun 13, 2023
