forked from microsoft/onnxruntime
Fix for mixed precision runs and their input parameters
Fixes GPU faults seen when running mixed precision inference workloads with BERT v1.1 (fp16 + int8 quantization of Conv + MatMul). We were hitting an edge case with mixed precision where the input parameters were not being populated, so the run used uninitialized values for those parameters. This would "work" silently, since no issue arises during inference for most models. For BERT, however, segment_ids is pushed through a Gather ONNX operator, which uses those values as indices. Reading uninitialized memory made the error non-obvious: it was unclear why runs were failing, and we saw the issue intermittently across machines/cards/etc.

The fixes here are as follows, all in the MIGraphX Execution Provider:
- Perform fp16 quantization after int8
- Add debug logging for the load/quantization workflow
- Set input/output parameters in a separate run prior to int8 calibration
- Set all dynamic data as input parameters so int8 static calibration can be performed with MIGraphX

Without these changes, models fail to copy input parameters on mixed precision runs when we decide to quantize, because MIGraphX assumes all inputs will be used for calibration, not just the input data read in from a calibration table.
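The failure mode can be illustrated outside MIGraphX with a small NumPy sketch (the table shape and values are made up for illustration; `segment_ids` mirrors the BERT input). On CPU an out-of-range Gather index raises an error immediately, whereas on GPU the same read is typically a silent memory fault, which is why uninitialized inputs only surfaced intermittently:

```python
import numpy as np

# Toy stand-in for BERT's segment embedding lookup: the Gather operator
# indexes an embedding table with segment_ids. Shapes are illustrative.
embedding = np.random.rand(2, 8).astype(np.float32)  # only segment ids 0 and 1 are valid

segment_ids = np.array([0, 1, 1, 0])  # properly populated input
gathered = np.take(embedding, segment_ids, axis=0)
print(gathered.shape)  # (4, 8)

# An unpopulated input buffer holds arbitrary garbage; here 7 stands in
# for such a value. On CPU this raises; on GPU it is an invalid read.
bad_ids = np.array([0, 7, 1, 0])
try:
    np.take(embedding, bad_ids, axis=0)
except IndexError as e:
    print("out-of-range gather index:", e)
```

The intermittent behavior between machines follows directly: whatever happened to be in the uninitialized buffer determined whether the index landed in bounds.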
1 parent 328bc48 · commit 0fda866
Showing 1 changed file with 65 additions and 16 deletions.