[XLA:CPU][oneDNN] Refactor code that fuses Add operation with oneDNN primitives #18616
Conversation
Thank you for the PR and sorry for the delay!
using ContractionVariant = std::variant<PrimitiveTrait<kOnednnConvConfig>,
                                        PrimitiveTrait<kOnednnMatmulConfig>>;
using FusionsConfigPointer = xla::cpu::OneDnnFusionConfig*;
using OptimizationsConfigPointer = xla::cpu::OneDnnOptimizationConfig*;
Per Google's C++ Style Guide, we avoid making aliases in headers when their only purpose is convenience. We only put aliases in a header when they are intended to be part of the API (i.e., for users to use).
I think we can keep ContractionVariant, but maybe rename it to OneDnnContractionVariant (or put it in a onednn namespace) to make it clear this is specific to oneDNN. FusionsConfigPointer and OptimizationsConfigPointer seem trivial and shouldn't be declared here. (Aliases are fine in .cc files.)
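For concreteness, a minimal sketch of what that suggestion could look like; the file names and the exact split between header and .cc are assumptions for illustration, not necessarily what the PR ended up with:

// onednn_contraction_rewriter.h (sketch): keep only the API-level alias,
// renamed so it is clearly oneDNN-specific.
using OneDnnContractionVariant =
    std::variant<PrimitiveTrait<kOnednnConvConfig>,
                 PrimitiveTrait<kOnednnMatmulConfig>>;

// onednn_contraction_rewriter.cc (sketch): convenience-only aliases stay
// local to the implementation file.
using FusionsConfigPointer = xla::cpu::OneDnnFusionConfig*;
using OptimizationsConfigPointer = xla::cpu::OneDnnOptimizationConfig*;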
Done
OneDnnFusionConfig_FusionKind kind = OneDnnFusionConfig::UNDEFINED;
kind =
    (addend->shape().rank() == 1)
        ? (fusions_config->ops().empty() ? OneDnnFusionConfig::BIAS
                                         : OneDnnFusionConfig::UNDEFINED)
        : OneDnnFusionConfig::BINARY_ADD;
Nit: Lines 715-718 can be put directly in the initialization in line 713.
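A sketch of the combined form, using the names from the quoted diff (the line numbers above refer to the reviewed file, not to this snippet):

// Initialize the fusion kind directly instead of assigning it afterwards.
OneDnnFusionConfig_FusionKind kind =
    (addend->shape().rank() == 1)
        ? (fusions_config->ops().empty() ? OneDnnFusionConfig::BIAS
                                         : OneDnnFusionConfig::UNDEFINED)
        : OneDnnFusionConfig::BINARY_ADD;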
Done
kind =
    (addend->shape().rank() == 1)
        ? (fusions_config->ops().empty() ? OneDnnFusionConfig::BIAS
                                         : OneDnnFusionConfig::UNDEFINED)
Does this mean the code doesn't support having a 1D addend (bias) and fused ops at the same time? Please document this as a comment in the code.
This should ensure that 1D addends are fused only when they are the first fused post-op. So the following should be acceptable:
- Bias (1D) + <0 or more non-add post-ops>
- Bias (1D) + <0 or more non-add post-ops> + Add (non-1D) + <0 or more non-add post-ops>
The following is not allowed:
- Bias (1D) + <0 or more non-add post-ops> + Add (1D) + <0 or more non-add post-ops>
This condition was added a few months ago because at the time oneDNN had optimized implementations for broadcasted add operations across certain dimensions only. As a result, some cases defaulted to the ref implementation, which significantly impacted performance.
We can re-evaluate this restriction with the latest oneDNN release and/or relax this a bit by blocking only those 1D cases where broadcasting occurs along some specific dimensions.
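For reference, a minimal sketch of how the restriction could be documented at the check itself; the comment wording is mine, not taken from the PR:

// Fuse a 1D addend as BIAS only when it is the first fused post-op.
// A 1D addend appearing after other post-ops is left unfused (UNDEFINED):
// when this check was added, oneDNN only had optimized broadcasted-add
// implementations for certain dimensions, and falling back to the
// reference implementation hurt performance.
OneDnnFusionConfig_FusionKind kind =
    (addend->shape().rank() == 1)
        ? (fusions_config->ops().empty() ? OneDnnFusionConfig::BIAS
                                         : OneDnnFusionConfig::UNDEFINED)
        : OneDnnFusionConfig::BINARY_ADD;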
Hi @penpornk, thanks for the review! I have added a commit to address your comments. Please take a look whenever possible. Thanks!
Thank you for the changes and I'm very sorry for the delay!
[XLA:CPU][oneDNN] Refactor code that fuses Add operation with oneDNN primitives

Imported from GitHub PR openxla/xla#18616

This PR refactors the code that fuses the add operation into matmul / convolution primitives. It removes the usage of macros and the separate templatized handlers for the matmul and convolution cases.

Copybara import of the project:

-- 68bcdf81a47fb0f753d837c034931094c5cd8017 by Akhil Goel <[email protected]>:
Refactor Add Handler

-- 462890bb75f2fcea3fdc5966bfa7a2b8f94b255a by Akhil Goel <[email protected]>:
Address review comments

Merging this change closes #18616

PiperOrigin-RevId: 703054496
This PR refactors the code that fuses the add operation into matmul / convolution primitives. It removes the usage of macros and the separate templatized handlers for the matmul and convolution cases.