Feature/use cudnn #7141
Conversation
@@ -73,8 +73,7 @@ cc_test(var_type_inference_test SRCS var_type_inference_test.cc DEPS op_registry
 cc_library(selected_rows SRCS selected_rows.cc DEPS tensor)
 cc_test(selected_rows_test SRCS selected_rows_test.cc DEPS selected_rows)

-cc_library(init SRCS init.cc DEPS gflags device_context place stringpiece)
+cc_library(init SRCS init.cc DEPS gflags device_context place stringpiece operator)
init does rely on operator
void DummyTrans(const platform::DeviceContext* ctx,
                const KernelTypePair& kernel_pair, const Variable& in,
                Variable* out) {
  PADDLE_ENFORCE(in.IsType<Tensor>(), "Only Support Tensor transform!.");
Since all Tensors are actually LoDTensors, this should be in.IsType<LoDTensor>().
OK, will fix it in the next PR.
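For reference, a minimal sketch of the suggested change (only the type check differs from the snippet above; the rest of the dummy transform is elided):

```cpp
// Sketch of the suggested fix: check for LoDTensor instead of Tensor, since
// every Tensor handled by the framework is in practice a LoDTensor.
void DummyTrans(const platform::DeviceContext* ctx,
                const KernelTypePair& kernel_pair, const Variable& in,
                Variable* out) {
  PADDLE_ENFORCE(in.IsType<LoDTensor>(), "Only Support LoDTensor transform!");
  // ... rest of the dummy transform unchanged ...
}
```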
#endif
}

void UseALL() {
Since UseCUDNN calls UseCUDA, and UseCUDA calls UseMKLDNN, and so on, UseALL is not needed. We can call UseCUDNN directly.
Actually, each UseXXX recursively calls the previous UseXXX. But having UseALL just call UseCUDNN would look odd, so this makes the intent clearer. In any case this interface should be removed in the future; we should ONLY allow users to configure ops through their attributes.
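To make the chaining described in these two comments concrete, here is an illustrative sketch; the kKernelPriority container and the (place, library) string pairs are stand-ins, not the PR's actual types:

```cpp
#include <string>
#include <utility>
#include <vector>

// Stand-in for the real priority storage: each entry is a (place, library)
// preference, ordered from least to most preferred.
static std::vector<std::pair<std::string, std::string>> kKernelPriority;

void UseCPU() { kKernelPriority.push_back({"CPU", "Plain"}); }

void UseMKLDNN() {
  UseCPU();  // each Use* first applies the previous, weaker preference
#ifdef PADDLE_WITH_MKLDNN
  kKernelPriority.push_back({"CPU", "MKLDNN"});
#endif
}

void UseCUDA() {
  UseMKLDNN();
#ifdef PADDLE_WITH_CUDA
  kKernelPriority.push_back({"CUDA", "Plain"});
#endif
}

void UseCUDNN() {
  UseCUDA();
#ifdef PADDLE_WITH_CUDA
  kKernelPriority.push_back({"CUDA", "CUDNN"});
#endif
}

// Functionally UseALL only needs to forward to UseCUDNN, which is why the
// review above calls it redundant; it is kept as an explicit entry point.
void UseALL() { UseCUDNN(); }
```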
  if ((actual_kernel_key == candidate_key) ||
      (kernels.count(candidate_key) &&
       trans_map.GetNullable(candidate_pair))) {
    expected_kernel_key = candidate_key;
The default Priority will overwrite the user's configuration. We should strictly obey the user's configuration first; only if the user does not provide a preference should we pick a kernel key guided by the default Priority.
Yes, this does not obey the user-configuration-first rule. Will fix it in the next PR.
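A hypothetical sketch of the agreed rule (the KernelKey struct and the is_registered callback are illustrative stand-ins, not the framework's real OpKernelType machinery):

```cpp
#include <functional>
#include <string>
#include <vector>

// Illustrative stand-in for a kernel key.
struct KernelKey {
  std::string place;
  std::string library;
};

// Pick a kernel key obeying the user's configuration first; only fall back
// to the default priority list when the user expressed no preference.
KernelKey ChooseKernelKey(
    const std::vector<KernelKey>& user_configured,
    const std::vector<KernelKey>& default_priority,
    const KernelKey& actual,
    const std::function<bool(const KernelKey&)>& is_registered) {
  for (const auto& key : user_configured) {
    if (is_registered(key)) return key;  // user's choice always wins
  }
  for (const auto& key : default_priority) {
    if (is_registered(key)) return key;  // default Priority as fallback
  }
  return actual;  // otherwise keep the key inferred from the inputs
}
```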
The costs of the different DataTrans are not the same; from small to large they are: DataType, Layout, CPU<->GPU. When choosing candidate_key, these costs should be taken into account.
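An illustrative sketch of the ordering being proposed here; the struct and the numeric weights are invented purely to show the relative ranking:

```cpp
#include <string>

// Invented weights that only encode the relative ordering described above:
// a data-type cast is cheapest, a layout transform (e.g. MKLDNN <-> kPlain)
// costs more, and a CPU <-> GPU copy costs the most.
struct CandidateKey {
  std::string data_type;
  std::string layout;
  std::string place;
};

int TransformCost(const CandidateKey& from, const CandidateKey& to) {
  int cost = 0;
  if (from.data_type != to.data_type) cost += 1;
  if (from.layout != to.layout) cost += 10;
  if (from.place != to.place) cost += 100;
  return cost;  // candidates with lower cost would be preferred
}
```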
I think we are over-complicating the problem. So far the only needs are CPU <-> GPU and MKLDNNLayout <-> kPlain. In the next PR it is enough to let the user choose whether to use them via the op's attribute; taking cost into account would turn this into something like TensorFlow's cost model.
Premature optimization is the root of all evil.
Since this PR blocks another one, some of the fixes will be done together in a follow-up PR. Thanks! These fixes will be done in #6660.