Output of roberta models varies between NNPA compiles #2789

Closed
cjvolzka opened this issue Apr 5, 2024 · 1 comment · Fixed by #2794

Comments


cjvolzka commented Apr 5, 2024

After compiling roberta-sequence-classification-9.onnx and roberta-base-11.onnx for NNPA and running the generated .so, the model output can vary slightly between compiles.

For example, compiling and running roberta-sequence-classification-9 repeatedly may yield

1st: [[ 0.455078, -0.35498 ]]
2nd: [[ 0.457520, -0.355957 ]]
3rd: [[ 0.458984, -0.357422 ]]
...

The values are always very close, but I wouldn't expect any differences at all.
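This kind of drift is what our tolerance test flags. As a minimal sketch (a hypothetical helper, not part of the attached test client), counting the compile-and-run cycles whose output differs from the first one beyond an absolute tolerance:

```python
# Hypothetical helper, not part of the attached test client: count the
# compile-and-run cycles whose output drifts from the first one.

def count_drifting_runs(runs, a_tol=1e-6):
    """Return how many runs differ from the first run by more than a_tol."""
    ref = runs[0]
    return sum(
        1 for run in runs[1:]
        if any(abs(x - y) > a_tol for x, y in zip(run, ref))
    )

runs = [
    [0.455078, -0.355980],  # 1st compile + run
    [0.457520, -0.355957],  # 2nd
    [0.458984, -0.357422],  # 3rd
]
print(count_drifting_runs(runs))  # → 2
```

With a stable build, every recompile would reproduce the first run's output and the count would be 0.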

Notes:

  • I don't seem to see the issue when compiling for CPU.
  • Multiple runs of the same .so file yield consistent results. I only encounter the issue when I recompile and then re-run the model.
  • I use the same version of onnx-mlir and the same .onnx files for all tests (see the test-roberta.sh script).
  • I don't think this is from a recent change. I've noticed random failures in our tolerance test in the past, but I couldn't reproduce the issue when I investigated because I would re-run the same .so for extended periods and never observe it. Since I couldn't reproduce it, I assumed I just needed to loosen our tolerance test as commits accumulated over time. It seems to be happening more often lately, so I tried both recompiling and re-running the model and found I can reproduce the issue readily.
  • I'm attaching roberta-out.zip with a script and our s390x test client which will show the issue. To run:
    • Extract .zip to a folder
    • Download models from the onnx model zoo and put in the same folder
    • run ./test-roberta.sh roberta-sequence-classification-9
      • Notice that the actual column's output varies between compile-and-run cycles. (Note: the exp column holds the expected CPU values, based on the example values in the onnx model zoo.)
    • run ./test-roberta.sh roberta-base-11
      • There are many more output values for this model, so it's easier to look for lines like "Min passing r_tol:" and "Max absolute difference (Min passing a_tol for r_tol to be 0)" and notice that they keep shifting. Those are the minimum a_tol and r_tol values for the test to pass that run.
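Those "min passing" metrics can be understood with a small sketch (a hypothetical reimplementation, using the first two sample outputs from above as stand-ins for expected and actual):

```python
# Hypothetical reimplementation of the "min passing" metrics: the smallest
# a_tol / r_tol for which every actual value is within tolerance of expected.

def min_passing_tolerances(actual, expected):
    a_tol = max(abs(a - e) for a, e in zip(actual, expected))
    r_tol = max(abs(a - e) / abs(e) for a, e in zip(actual, expected))
    return a_tol, r_tol

expected = [0.455078, -0.355980]  # stand-in for the exp column
actual = [0.457520, -0.355957]    # stand-in for one compile's actual column
a_tol, r_tol = min_passing_tolerances(actual, expected)
print(f"Min passing a_tol: {a_tol:.6f}, min passing r_tol: {r_tol:.6f}")
```

Because the actual column shifts with every recompile, these minimum tolerances shift with it, which is why the test fails intermittently at any fixed threshold.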

cjvolzka commented Apr 8, 2024

I tried to track this down. I don't have the exact commit, but it becomes much worse (and easily reproducible) somewhere after ebac513.

At ebac513 I got it to happen twice in 70 tests. The next four commits had an issue that caused the model to fail to compile. At b7e981d, the next commit that compiles, the issue is readily reproducible and the values shift slightly on almost every test iteration.

I'm still tracking down the first commit that shows the issue. This is what I've tested so far; "..." means there are commits between the two shown that I haven't tested.

b7e981d - almost every iteration different
708faf5 - (Compile failure)
74b00e3 - (Compile failure)
cc4f31d - (Compile failure)
2fac346 - (Compile failure)
ebac513 - two out of 70 iterations different
...
de3ebd3 - 5 out of 159 iterations different
...
7316d28 - 19 out of 94 iterations different
...
cd7576e - 6 out of 93 iterations different
...
a70c43a - 1 out of 40 iterations different
...
ec91b39 - 1 out of 61 iterations different
d2f4797 - 1 out of 26 iterations different
38b16b0 - 1 out of 43 iterations different
dea431d - stable for 200 iterations
...
c55a536 - stable for 39 iterations
...
4e1c970 - (4.0.0) stable for over 200 iterations
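The per-commit iteration counts above come from repeated compile-and-run cycles; a minimal harness for that, sketched in Python (the command and model name are placeholders for the attached script, not real paths):

```python
# Sketch of a stability harness; the command below is a placeholder for
# whatever recompiles and reruns the model (e.g. the attached script).
import subprocess

def distinct_outputs(cmd, iterations):
    """Run `cmd` repeatedly and collect the distinct stdout values.
    A stable build should produce exactly one distinct output."""
    seen = set()
    for _ in range(iterations):
        result = subprocess.run(cmd, capture_output=True, text=True)
        seen.add(result.stdout)
    return seen

if __name__ == "__main__":
    # Placeholder invocation: each iteration recompiles and reruns the model.
    outputs = distinct_outputs(
        ["./test-roberta.sh", "roberta-sequence-classification-9"], 40)
    print(f"{len(outputs)} distinct output(s) over 40 iterations")
```

Under this framing, "stable for N iterations" above means the set of distinct outputs stayed at size 1 for all N cycles.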

cjvolzka pushed a commit to cjvolzka/onnx-mlir that referenced this issue Apr 11, 2024
…s and some rules in zhigh-to-onnx pass

The onnx-to-zhigh pass has two phases: 1) converting multiple onnx ops into a single zhigh op, and 2) converting a single onnx op into a single zhigh op, where the second phase uses DimAnalysis. (Patterns in the first phase currently do not use DimAnalysis.)

The problem is that DimAnalysis is currently run before the first phase. This is incorrect because the first phase may change the IR, so the information from DimAnalysis is stale by the time the second phase runs. The correct position for DimAnalysis is just before the second phase.

Other than that, this PR slightly changes the rules in the zhigh-to-onnx pass so that, for binary ops, one input (instead of two) coming from a stick op is enough to trigger the rule that converts a zhigh op back to an onnx op.

Resolves onnx#2789

---------

Signed-off-by: Tung D. Le <[email protected]>
(cherry picked from commit 80a63f2)
Signed-off-by: Charles Volzka <[email protected]>
cjvolzka added a commit that referenced this issue Apr 11, 2024
…s and some rules in zhigh-to-onnx pass (#2797)

The onnx-to-zhigh pass has two phases: 1) converting multiple onnx ops into a single zhigh op, and 2) converting a single onnx op into a single zhigh op, where the second phase uses DimAnalysis. (Patterns in the first phase currently do not use DimAnalysis.)

The problem is that DimAnalysis is currently run before the first phase. This is incorrect because the first phase may change the IR, so the information from DimAnalysis is stale by the time the second phase runs. The correct position for DimAnalysis is just before the second phase.

Other than that, this PR slightly changes the rules in the zhigh-to-onnx pass so that, for binary ops, one input (instead of two) coming from a stick op is enough to trigger the rule that converts a zhigh op back to an onnx op.

Resolves #2789

---------


(cherry picked from commit 80a63f2)

Signed-off-by: Tung D. Le <[email protected]>
Signed-off-by: Charles Volzka <[email protected]>
Co-authored-by: Tung D. Le <[email protected]>
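The phase-ordering problem described in the commit message can be illustrated with a toy sketch (purely illustrative Python, not onnx-mlir code): an analysis computed before an IR-mutating phase no longer describes the ops the next phase sees.

```python
# Illustrative only, not onnx-mlir code: an analysis run before an
# IR-mutating phase goes stale for the phase that follows.

def dim_analysis(ops):
    """Toy stand-in for DimAnalysis: map each op name to its output rank."""
    return {op["name"]: op["rank"] for op in ops}

def phase1_fuse(ops):
    """Toy stand-in for phase 1: fuse the first two ops into one new op."""
    fused = {"name": "zhigh.Fused", "rank": ops[0]["rank"]}
    return [fused] + ops[2:]

ops = [{"name": "onnx.MatMul", "rank": 3},
       {"name": "onnx.Add", "rank": 3},
       {"name": "onnx.Relu", "rank": 3}]

stale = dim_analysis(ops)   # the bug: analysis runs before phase 1
ops = phase1_fuse(ops)
fresh = dim_analysis(ops)   # the fix: analysis runs just before phase 2

print("zhigh.Fused" in stale, "zhigh.Fused" in fresh)  # → False True
```

Phase 2 consulting the stale map would make decisions about ops that no longer exist, which matches the nondeterministic conversion behavior seen between compiles.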