
Conversion from FP32 to Mixed Precision Models #14584

Closed
anirudh2290 opened this issue Apr 1, 2019 · 5 comments · Fixed by #15118

Comments

@anirudh2290 (Member) commented Apr 1, 2019

API Addition

Users want to bring an FP32 model and convert it to a mixed precision model to run inference on it. They want to use the model zoo to convert pretrained models in Python and other frontends. They can do this with Gluon models today by casting the inputs and the blocks, but the same cannot be done for symbolic models (json and params). Proposing to add an API to convert FP32 models to FP16.

Considering the recent AMP work in progress here: #14173, I think we should add an API to convert FP32 models to FP16 under the AMP namespace:

amp.convert_model(sym, arg_params, aux_params, target_dtype="float16", 
                  target_precision_ops=None, original_precision_ops=None,
                  widest_precision_ops=None, excluded_sym_names=None)

With target_precision_ops, original_precision_ops and widest_precision_ops, users should be able to override the defaults in the AMP lists.
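For illustration, a possible usage sketch (the amp import path and the checkpoint name are assumptions for illustration; the signature follows the proposal above):

import mxnet as mx
from mxnet.contrib import amp  # assumed location of the proposed AMP namespace

# Load a pretrained FP32 symbolic model (json + params checkpoint on disk).
sym, arg_params, aux_params = mx.model.load_checkpoint("resnet50_v1", 0)

# Convert to mixed precision, keeping e.g. the final softmax in FP32.
fp16_sym, fp16_args, fp16_auxs = amp.convert_model(
    sym, arg_params, aux_params,
    target_dtype="float16",
    excluded_sym_names=["softmax"])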

Backend Changes

Additionally, add an NNVM pass to the backend. By default this pass would use the AMP lists for FP16 ops, FP32 ops, and widest-type casts to decide whether each op takes FP16 or FP32 inputs. The pass traverses the graph and inserts amp_cast and amp_multicast layers around the FP16 and FP32 ops.
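To make the intent concrete, here is a minimal, purely illustrative Python sketch of the cast-insertion idea. The real pass would be an NNVM graph pass in the backend; the op names, list contents, and graph representation below are assumptions, and amp_multicast handling for widest-type ops is omitted.

# Illustrative only: a toy graph as a topologically ordered list of
# (node_name, op_name, input_names) tuples, not the real NNVM data structures.
FP16_OPS = {"Convolution", "FullyConnected"}  # ops to run in FP16
FP32_OPS = {"softmax", "BatchNorm"}           # ops to keep in FP32

def insert_amp_casts(nodes):
    """Insert amp_cast nodes wherever an op's required precision differs
    from the dtype its input was produced in."""
    dtype = {}   # node name -> dtype of its output
    out = []
    for name, op, inputs in nodes:
        want = ("float16" if op in FP16_OPS
                else "float32" if op in FP32_OPS
                else None)   # None: dtype-agnostic, keeps its input dtype
        new_inputs = []
        for inp in inputs:
            produced = dtype.get(inp, "float32")
            if want is not None and produced != want:
                cast = "%s_amp_cast_%s" % (inp, want)
                out.append((cast, "amp_cast", [inp]))
                dtype[cast] = want
                new_inputs.append(cast)
            else:
                new_inputs.append(inp)
        out.append((name, op, new_inputs))
        if want is not None:
            dtype[name] = want
        else:
            dtype[name] = dtype.get(new_inputs[0], "float32") if new_inputs else "float32"
    return out

# A conv followed by softmax gets a cast to FP16 before the conv and a cast
# back to FP32 before the softmax.
graph = [("data", "null", []),
         ("conv0", "Convolution", ["data"]),
         ("sm", "softmax", ["conv0"])]
print(insert_amp_casts(graph))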

Planning to start working on the POC unless someone is already working on this.

@ptrendx @DickJC123 @pengzhao-intel @ZhennanQin @eric-haibin-lin @Caenorst

EDIT: Proposal posted on dev list: https://cwiki.apache.org/confluence/display/MXNET/Conversion+from+FP32+to+Mixed+Precision+Models

@mxnet-label-bot (Contributor)

Hey, this is the MXNet Label Bot.
Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it.
Here are my recommended labels: Feature

@ZhennanQin (Contributor)

Hi @anirudh2290, it's really good to have AMP supported for symbolic models.

One thing I want to mention: we want AMP to support all kinds of low-precision floats, not only FP16. The API itself looks OK, as target_dtype and the op lists are all configurable. For the NNVM pass implementation, we would like to avoid hard-coding FP16 and make it easy to extend to other low-precision floats, e.g. BF16.

@anirudh2290 (Member, Author)

@ZhennanQin Thanks for the suggestion! Yes, the plan is to keep the NNVM pass target_dtype agnostic and allow for easy extension to BF16. In the future, the expectation is that the amp_cast and amp_multicast ops will support BF16 in addition to FP16, and that there will be corresponding lists for BF16 like these: https://github.com/apache/incubator-mxnet/pull/14173/files#diff-b79bfa3e02355c43ca5b195ef67172a5R21
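For concreteness, a hypothetical sketch of what such a BF16 list module could look like, mirroring the linked FP16 lists (all names and list contents below are placeholders, not existing MXNet code):

# Hypothetical BF16 op lists, mirroring the FP16 lists in #14173 (illustrative).
BF16_FUNCS = [
    "Convolution",
    "FullyConnected",
]
FP32_FUNCS = [
    "softmax",
    "log_softmax",
]
WIDEST_TYPE_CASTS = [
    "elemwise_add",
    "broadcast_mul",
]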

@AnaRhisT94

Hi,
according to this:
https://discuss.mxnet.io/t/load-params-from-symbol-models-to-gluon/804
you can load .params and .json files into a Gluon model. Then we can use the conversion from FP32 to FP16, right?

Also, I would love a quick pointer on how to cast Gluon models to FP16.

@anirudh2290 (Member, Author)

@AnaRhisT94 Yes, you can do that if you want to run your entire model in FP16 precision: cast your inputs to FP16 and cast your params to FP16. The FP16 tutorial shows how to do this: http://mxnet.incubator.apache.org/versions/master/faq/float16.html?highlight=mixed#using-the-gluon-api . This particular AMP feature helps in situations where you want to run specific layers in FP16 while keeping others, like softmax, in FP32, and where you want to be able to select which layers run in FP16 versus FP32.
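For reference, a minimal Gluon FP16 sketch along the lines of the linked tutorial (the model, context, and input shape are just examples):

import mxnet as mx
from mxnet.gluon.model_zoo import vision

ctx = mx.gpu(0)  # FP16 compute generally targets GPUs
net = vision.resnet50_v1(pretrained=True, ctx=ctx)
net.cast('float16')                                  # cast all parameters to FP16
x = mx.nd.random.uniform(shape=(1, 3, 224, 224), ctx=ctx).astype('float16')
out = net(x)                                         # forward pass runs in FP16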
