[XLA:CPU][oneDNN] Refactor code that fuses Add operation with oneDNN primitives #18616
Conversation
Thank you for the PR and sorry for the delay!
using ContractionVariant = std::variant<PrimitiveTrait<kOnednnConvConfig>,
                                        PrimitiveTrait<kOnednnMatmulConfig>>;
using FusionsConfigPointer = xla::cpu::OneDnnFusionConfig*;
using OptimizationsConfigPointer = xla::cpu::OneDnnOptimizationConfig*;
Per Google's C++ Style Guide, we avoid making aliases in headers when their only purpose is convenience. We only put aliases in a header when they are intended to be part of the API (i.e., for users to use).
I think we can keep ContractionVariant, but maybe rename it to OneDnnContractionVariant (or put it in a onednn namespace) to make it clear this is specific to oneDNN. FusionsConfigPointer and OptimizationsConfigPointer seem trivial and shouldn't be declared here. (Aliases are fine in .cc files.)
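For concreteness, a minimal sketch of what that suggestion could look like; the file names and the exact split between header and .cc are assumptions for illustration, not necessarily what the PR ended up with:

// onednn_contraction_rewriter.h (sketch): keep only the API-level alias,
// renamed so it is clearly oneDNN-specific.
using OneDnnContractionVariant =
    std::variant<PrimitiveTrait<kOnednnConvConfig>,
                 PrimitiveTrait<kOnednnMatmulConfig>>;

// onednn_contraction_rewriter.cc (sketch): convenience-only aliases stay
// local to the implementation file.
using FusionsConfigPointer = xla::cpu::OneDnnFusionConfig*;
using OptimizationsConfigPointer = xla::cpu::OneDnnOptimizationConfig*;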
Done
OneDnnFusionConfig_FusionKind kind = OneDnnFusionConfig::UNDEFINED;
kind =
    (addend->shape().rank() == 1)
        ? (fusions_config->ops().empty() ? OneDnnFusionConfig::BIAS
                                         : OneDnnFusionConfig::UNDEFINED)
        : OneDnnFusionConfig::BINARY_ADD;
Nit: Lines 715-718 can be put directly in the initialization in line 713.
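A sketch of the combined form, using the names from the quoted diff (the line numbers above refer to the reviewed file, not to this snippet):

// Initialize the fusion kind directly instead of assigning it afterwards.
OneDnnFusionConfig_FusionKind kind =
    (addend->shape().rank() == 1)
        ? (fusions_config->ops().empty() ? OneDnnFusionConfig::BIAS
                                         : OneDnnFusionConfig::UNDEFINED)
        : OneDnnFusionConfig::BINARY_ADD;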
Done
kind =
    (addend->shape().rank() == 1)
        ? (fusions_config->ops().empty() ? OneDnnFusionConfig::BIAS
                                         : OneDnnFusionConfig::UNDEFINED)
Does this mean the code doesn't support having a 1D addend (bias) and fused ops at the same time? Please document this as a comment in the code.
This should ensure that 1D addends are fused only when they are the first fused post-op. So the following should be acceptable:
- Bias (1D) + <0 or more non-add post-ops>
- Bias (1D) + <0 or more non-add post-ops> + Add (non-1D) + <0 or more non-add post-ops>
The following is not allowed:
- Bias (1D) + <0 or more non-add post-ops> + Add (1D) + <0 or more non-add post-ops>
This condition was added a few months ago because at the time oneDNN had optimized implementations for broadcasted add operations across certain dimensions only. As a result, some cases defaulted to the ref implementation, which significantly impacted performance.
We can re-evaluate this restriction with the latest oneDNN release and/or relax this a bit by blocking only those 1D cases where broadcasting occurs along some specific dimensions.
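For reference, a minimal sketch of how the restriction could be documented at the check itself; the comment wording is mine, not taken from the PR:

// Fuse a 1D addend as BIAS only when it is the first fused post-op.
// A 1D addend appearing after other post-ops is left unfused (UNDEFINED):
// when this check was added, oneDNN only had optimized broadcasted-add
// implementations for certain dimensions, and falling back to the
// reference implementation hurt performance.
OneDnnFusionConfig_FusionKind kind =
    (addend->shape().rank() == 1)
        ? (fusions_config->ops().empty() ? OneDnnFusionConfig::BIAS
                                         : OneDnnFusionConfig::UNDEFINED)
        : OneDnnFusionConfig::BINARY_ADD;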
Hi @penpornk, thanks for the review! I have added a commit to address your comments. Please take a look whenever possible. Thanks!
Thank you for the changes and I'm very sorry for the delay!
[XLA:CPU][oneDNN] Refactor code that fuses Add operation with oneDNN primitives

Imported from GitHub PR openxla/xla#18616

This PR refactors the code that fuses the add operation into matmul / convolution primitives. It removes the usage of macros and the separate templatized handlers for the matmul and convolution cases.

Copybara import of the project:

-- 68bcdf81a47fb0f753d837c034931094c5cd8017 by Akhil Goel <[email protected]>:
Refactor Add Handler

-- 462890bb75f2fcea3fdc5966bfa7a2b8f94b255a by Akhil Goel <[email protected]>:
Address review comments

Merging this change closes #18616

PiperOrigin-RevId: 703054496
This PR refactors the code that fuses the add operation into matmul / convolution primitives. It removes the usage of macros and the separate templatized handlers for the matmul and convolution cases.