Design doc of compile time register gradient operators #4517
Conversation
doc/design/register_grad_op.md
> ## Problem
>
> Since we separate users program in two stages, compile time and runtime, we should record and look up the mapping relationship between an operator and its gradient operators when compile. However, we register this relationship in runtime by these `OpInfo` fields.
The execution of a neural network topology in PaddlePaddle is separated into two stages: compile-time and run-time.
At compile-time, a ProgramDesc is generated. At run-time, the ProgramDesc is executed on specific hardware. We can refer to the design of computation graphs.
The gradient operator's OpDesc is also generated at compile-time, so we have to find the mapping between an operator's OpDesc and its gradient operator's OpDesc.
However, we currently establish the mapping between an operator and its gradient operator at run-time, in the OpInfo class.
The Problem Posed

In our current operator registration mechanism, for each operator the programmer registers a gradient operator creator function, which takes a C++ operator instance and returns the corresponding gradient operator instance.
However, since we have decided to separate the compilation and execution of DL models, we need to reshape the creator so that it takes a protobuf OpDesc message and returns one or more corresponding OpDesc messages.
Moreover, the new registration mechanism needs to support the fact that an operator's gradient computation might be a composition of several operators.