Add design doc for onnx convertor #9296
Conversation
Nice start! Let's keep the momentum going. Things I would love for this document to address before focusing on software design details (like how to organize modules and which commands to run):
Also, I didn't hear @wangkuiyi say we needed bi-directional conversion. Can we just focus on fluid to ONNX for now? ONNX to Fluid seems a little unimportant right now. Kindly correct me if I am wrong.
And perhaps a minimal set of models to support for v1.
Update: I read #9108 again and it seems like being "ONNX compatible" is a focus. I still think we should aim to finish fluid-to-ONNX first. But yes, this doc can also contain design considerations for ONNX-on-fluid.
So I am sharing some info on behalf of @sidgoyal78 and myself. We did our share of thinking about and reviewing ONNX tools and the utilities of supported frameworks. Here are a few notes we made during the process:
@varunarora Thanks for your valuable comments; we totally agree with your suggestions. @kuke is focusing on the remaining sections, so please feel free to add your opinions. @kuke Could you invite @varunarora and @sidgoyal78 to be co-contributors for this PR?
Hi @varunarora and @sidgoyal78, I have invited you as collaborators on this pull request (https://github.com/kuke/Paddle/tree/onnx). Please feel free to edit this draft directly if you have any ideas. Let's complete this design doc together ASAP. Then we will invite others to review.
@sidgoyal78 and I are on the same page! Let's make it happen.
* **fluid**: Contains wrappers for fluid-related APIs. Fluid provides some low-level APIs to parse or generate the inference model. However, directly using these low-level APIs makes the code tediously long. This module wraps the low-level APIs to provide simplified interfaces, as in the sketch below.
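For illustration, a wrapper in this module might look like the following minimal sketch. The helper name `load_model` is hypothetical; `fluid.io.load_inference_model` is an example of the low-level APIs being referred to.

```python
# A sketch of the kind of wrapper this module could provide; the
# helper name is illustrative, not the converter's actual code.
import paddle.fluid as fluid

def load_model(model_dir):
    """Load a saved Fluid inference model in a single call."""
    exe = fluid.Executor(fluid.CPUPlace())
    # The low-level API returns the inference program together with
    # the feed/fetch targets, so callers need not assemble them.
    program, feed_names, fetch_targets = fluid.io.load_inference_model(
        dirname=model_dir, executor=exe)
    return program, feed_names, fetch_targets
```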
* **onnx**: ONNX uses a proto file to save the computation flow and model weights. This module is responsible for parsing and generating ONNX binary model.
generating ONNX binary model. -> generating an ONNX binary model.
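As background for this module: the official `onnx` Python package already wraps the protobuf format, so the parsing and generating described above would likely build on calls like these. A minimal illustrative sketch, not the converter's actual code:

```python
# Generate and parse an ONNX binary model with the official `onnx`
# Python package (protobuf-based).
import onnx
from onnx import helper, TensorProto

# Generating: build a one-op graph and serialize it as a binary proto.
x = helper.make_tensor_value_info('x', TensorProto.FLOAT, [1, 3])
y = helper.make_tensor_value_info('y', TensorProto.FLOAT, [1, 3])
relu = helper.make_node('Relu', inputs=['x'], outputs=['y'])
graph = helper.make_graph([relu], 'tiny_graph', [x], [y])
model = helper.make_model(graph)
onnx.checker.check_model(model)
onnx.save(model, 'tiny.onnx')

# Parsing: load the binary proto back and inspect the computation flow.
loaded = onnx.load('tiny.onnx')
print([node.op_type for node in loaded.graph.node])
```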
* **onnx_fluid**: Concepts in fluid like program, block etc. haven't direct corresponding concepts in ONNX. Even that both contains operator concept, for many operators adaption is also necessary. This module is the most important module responsible for acutal converting. Adaption for different level concepts should be provided like fluid program/block to ONNX graph, fluid operators to ONNX operators etc.
This should be as follows:
Concepts in fluid like program, block etc. do not have any direct corresponding concepts in ONNX. Even though both contain the operator concept, for many operators adaptation is also necessary. This module is the most important module and is responsible for the actual conversion. Adaptation for different levels of concepts should be provided, like fluid program/block to ONNX graph, fluid operators to ONNX operators, etc.
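To illustrate the kind of per-operator adaptation this module would own, here is a hypothetical sketch of an operator-mapping table. All names (`convert_relu`, `OP_CONVERTERS`, `convert_block`) are illustrative, not the converter's actual code.

```python
# Hypothetical operator-mapping registry for fluid -> ONNX.
from onnx import helper

def convert_relu(fluid_op):
    # fluid's `relu` maps one-to-one onto ONNX's `Relu`; many other
    # operators need attribute renaming or decomposition into several
    # ONNX nodes, which is where the adaptation happens.
    return helper.make_node('Relu',
                            inputs=fluid_op.input('X'),
                            outputs=fluid_op.output('Out'))

# One converter function per supported fluid operator type.
OP_CONVERTERS = {'relu': convert_relu}

def convert_block(block):
    """Map every operator in a fluid block to ONNX nodes."""
    return [OP_CONVERTERS[op.type](op) for op in block.ops]
```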
Thank you for the PR. There are some grammatical issues. Please fix them before merging.
@abhinavarora: I think this was a WIP pull request, so naturally, it isn't complete as of now.
@kuke I think we should be ready for a review here
Two minor comments. Yeah, I think we can invite @wangkuiyi to review now.
Therefore, it is necessary to enable conversion between PaddlePaddle and ONNX. This design doc is aimed at implementing a convertor, mainly for converting between **Fluid** models and ONNX (it is very likely that we may support older v2 models in the future). A complete convertor should be bidirectional, with a frontend AND a backend, but considering priority, we will start with the frontend, i.e. Fluid models to ONNX models.
One thing that makes this doable in Fluid's case is its use of a static IR, the `ProgramDesc`, as opposed to the dynamic graphs created by frameworks like PyTorch.
Should we move this line to the end of "How it works"?
Done
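To make the static-IR point concrete: because the `ProgramDesc` is a fully materialized description, a converter can walk every block and operator without ever running the model. A rough sketch (the model directory name is a placeholder):

```python
# The static ProgramDesc can be traversed offline, with no forward
# pass required, unlike tracing a dynamic graph.
import paddle.fluid as fluid

exe = fluid.Executor(fluid.CPUPlace())
program, feed_names, fetch_targets = fluid.io.load_inference_model(
    dirname='fluid_model_dir', executor=exe)  # placeholder path

# Every block and every operator is already materialized in the IR.
for block in program.blocks:
    for op in block.ops:
        print(op.type)
```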
```
python convert.py --fluid_model <fluid inference model> --onnx_model <ONNX model> validate True
```
It is better to separate conversion and validation, as we did in the main repo. So lines 48~52 should be like:

- Convert Fluid inference model to ONNX binary model

  ```
  python convert.py --fluid_model <fluid inference model> --onnx_model <ONNX model>
  ```

- Validate the converted model

  ```
  python validate.py --fluid_model <fluid inference model> --onnx_model <ONNX model>
  ```
Ha yes, I caught this too but let it pass since this was a design doc. Fixed now :)
@varunarora Thanks for the update.
* Convert Fluid inference model to ONNX binary model
```
python convert.py --fluid_model <fluid inference model> --onnx_model <ONNX model> validate True
```
convert.py => fluid_to_onnx.py
OK, we will change the script name in the main repo, then here. There are already some tiny inconsistencies in the implementation, so we'll modify this design doc again later.
The conversion and model validation will be completed consecutively, finally outputting a readable description of the model structure. For the reverse conversion, users only need to exchange the input and output.
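For reference, the validation step could amount to something like the sketch below: run both models on the same random input and compare the outputs numerically. Paths are placeholders, and onnxruntime is used here only as one possible ONNX backend, not necessarily the one the converter ships with.

```python
# Illustrative validation sketch, not the converter's actual code.
import numpy as np
import onnxruntime as ort
import paddle.fluid as fluid

x = np.random.rand(1, 3, 224, 224).astype('float32')

# Run the original Fluid inference model (placeholder path).
exe = fluid.Executor(fluid.CPUPlace())
program, feed_names, fetch_targets = fluid.io.load_inference_model(
    dirname='fluid_model_dir', executor=exe)
fluid_out = exe.run(program,
                    feed={feed_names[0]: x},
                    fetch_list=fetch_targets)[0]

# Run the converted ONNX model (placeholder path).
sess = ort.InferenceSession('converted.onnx')
onnx_out = sess.run(None, {sess.get_inputs()[0].name: x})[0]

# The two outputs should agree up to small numerical tolerances.
assert np.allclose(fluid_out, onnx_out, rtol=1e-4, atol=1e-5)
```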
# Challenges and mitigation |
Great points. Let's move a step forward: after this PR is merged, let us consider listing some models (maybe from the Paddle Book) which could be trained and exported to the ONNX format, and doing application-driven work, which should allow us to track the progress easily. Also, I suppose that the progress will reveal more compatibility issues.
Yes. We have listed some models to be supported in the first stage, and things are going according to plan.
Thanks for the review. Let's merge it first and refine it later.
Resolve #9297