-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Executor interface design and implementation #4537
Conversation
70663eb
to
714148a
Compare
714148a
to
d4be973
Compare
paddle/framework/executor.cc
Outdated
return new ExecutorImpl(GetScope(), GetDevice(place), &pdesc, is_linear); | ||
} | ||
|
||
void ExecutorImpl::Run() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think one of the most important thing for the executor is Run should be thread-safe (e.g., ok to do concurrent Runs). This is a must for inferencing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question. We must allow doing concurrent Runs in inference with only one copy of parameters in memory.
I am thinking whom and where to do parameters loading/saving. Our topology can be serialized to ProgramDesc
, and what will the parameters serialized.
paddle/framework/executor.cc
Outdated
|
||
ProgramDescView* ProgramDescView::Create(bool is_linear) { | ||
if (is_linear) { | ||
return new LinearListView(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need LinearListView
since we already have GraphView
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LinearListView
organizes the topology in a linearlist, and operators will be executed sequentially.GraphView
organizes the topology in a Graph, and further optimization can be applied based on Graph structure.
I think that we can haveLinearListView
at now, maybeGraphView
can be implemented later.
paddle/framework/executor.cc
Outdated
class LinearListView; | ||
class GraphView; | ||
|
||
// Immutable view of a ProgramDesc organized for efficient execution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been implemented by framework/op_desc.h
paddle/framework/executor.h
Outdated
virtual void Run() = 0; | ||
}; | ||
|
||
Executor* NewLocalExecutor(const platform::Place&, const ProgramDesc&, bool); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename and redesign NewLocalExecutor
into
NewExecutor(const std::vector<Place>& places);
- No need for the
bool optimize
parameter. ProgramDesc
is a parameter toExecutor::Run
.
paddle/framework/executor.h
Outdated
|
||
class Executor { | ||
public: | ||
virtual ~Executor() {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@helinwang has a suggestion -- given that the construction of an executor could be expensive -- including the creation of thread pools, it would be reasonable to reuse an executor to run multiple ProgramDesc's.
Therefore we need the constructor:
Executor(const std::vector<Place>& places);
and the Run
method:
virtual void Run(const ProgramDesc& program, Scope* scope);
paddle/operators/feed_op.cc
Outdated
public: | ||
FeedOpMaker(framework::OpProto* proto, framework::OpAttrChecker* op_checker) | ||
: OpProtoAndCheckerMaker(proto, op_checker) { | ||
AddAttr<int>("data_type", "output data type") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make sure that all Input, Output and Attribute names adhere to the naming convention. https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/name_convention.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
paddle/operators/feed_op.cc
Outdated
|
||
PADDLE_ENFORCE_GT(tensors.size(), static_cast<size_t>(col)); | ||
auto in_dim = tensors[col].dims(); | ||
ctx->SetOutputDim("Out", in_dim); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@QiJune should we use dims
attribute to infer output shape?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I fixed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you also add enforce (*tensors)[col].numel == product(dims)
. Chances are a user specifies the wrong the col
.
paddle/operators/fetch_op.cc
Outdated
|
||
auto input_dim = ctx->GetInputDim("Input"); | ||
PADDLE_ENFORCE_GT(tensors->size(), col); | ||
(*tensors)[col].Resize(input_dim); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@QiJune Same for fetch op, inferShape according to dims
attribute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then enforce (*tensors)[col].numel == product(dims)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tonyyang-svail
I have a discussion with @jacquesqiao, InferShape will be done in compile-time first. So, InferShape of FeedOp and FetchOp can not use run-time concepts, like GlobalScope.
FeedOp has an attribute dims
to set its output tensor dims. And FecthOp does not need a dims
attribute. dims of Tensors in fetch_result
can be set from FetchOp's input tensor.
paddle/framework/executor.h
Outdated
* @return | ||
* vector<bool> Same size as ops. Indicates whether an op should be run. | ||
*/ | ||
std::vector<bool> Prune(const ProgramDesc& pdesc, int block_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prune
is a high level optimize part, which should be done before executor run. The executor takes a "no redundant op groups".
paddle/framework/executor.cc
Outdated
std::vector<bool> should_run = Prune(pdesc, block_id); | ||
PADDLE_ENFORCE_EQ(should_run.size(), block.ops_size()); | ||
for (size_t i = 0; i < should_run.size(); ++i) { | ||
// if (should_run[i]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No comment out code, please.
paddle/framework/executor.cc
Outdated
PADDLE_ENFORCE_EQ(should_run.size(), block.ops_size()); | ||
for (size_t i = 0; i < should_run.size(); ++i) { | ||
// if (should_run[i]) { | ||
if (true) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to add a constant value and assigned with true. Otherwise, this would be a magic value.
paddle/framework/executor_test.cc
Outdated
USE_OP(fill_constant); | ||
USE_OP(sgd); | ||
|
||
using std::string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems useless?
}; | ||
|
||
/* @Brief | ||
* Pruning the graph |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of Prune
? I'd thought that it needs a target parameter, which could be either a variable or an operator, and it returns a new ProgramDesc that includes only dependent operators. But why doesn't the following code take a target parameter?
explicit Executor(const std::vector<platform::Place>& places); | ||
~Executor(); | ||
|
||
/* @Brief |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think C++ code is the document, and we don't really need to use Doxygen. Therefore, we can write much shorter comments. For this specific case Executor::Run
, I don't even think that it needs a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am approving this PR so it doesn't last too long time. But please consider my comments in #4537 (comment).
In my mind,
-
ProgramDesc shouldn't carry targets because a program includes all the instructions supposed to be executed.
-
The Prune function's signature should be
int/bool Prune( const ProgramDesc* input, const std::vector<std::string>& targets, ProgramDesc* output);
Fix #4523 #4557
Timeline(@tonyyang-svail):
Oct 3rd, Tuesday
vector<Tensor> Executor.Run)
. ItProgramDesc
andScope
Oct 4th, Wednesday
Oct 5th, Thursday
FeedOp
(by @QiJune )FeedOp
toProgramDesc
, so that it will be ran first.FetchOp
(by @QiJune )Prune
. Itconst ProgramDesc& input
(UseFeedOp
to findfeed
, and useFetchOp
to findtarget
.vector<bool>
indicates a op should be run or no.Oct 6th, Friday
Prune
Oct 7th, Saturday
executor_test.cc
Run()
Preprocessing()
Oct 8th, Sunday
Prune
Add Backward ProtoBuf. (Several bugs found on backward There should be a fill_one_like_op as an starting point of backward pass #4627. Fixed)AppendBackward
. Switch toProgramDescBind
.Oct 9th, Monday
Integrate InitOp into. (More discussion needed for where to put init op)ProgramDesc