Unification of BF16 enablement process #31034
Merged
PR types
Function optimization
PR changes
Others
Describe
This PR unifies bfloat16 enablement so that the process of adding quantize and dequantize reorders is similar to the quantization process. This way, the `cpu_quantize_squash_pass.cc` pass can also be used in the bfloat16 process. Therefore, the following changes have been added:
- `cpu_bfloat16_pass.cc` - adds `quantize` and `dequantize` before and after each operator that is marked as bfloat16
- `cpu_quantize_squash_pass.cc` - for a `dequantize` and `quantize` pair, inserts `requantize` if the scales are different, or removes both operators when the scale is the same (the latter is the most common case for bfloat16)
- `force_fp32_output` attribute

Additionally, the following has been added to `cpu_quantize_squash_pass.cc`: