use gemm to replace matmul + add #234
Conversation
continue;
}

// Gemm only support Matrix, need to check the shape of MatMul and Add
Gemm only support Matrix, need to check the shape of MatMul and Add

If mat_mul is [K] * [K, N], we should be able to update the shape to [1, K] * [K, N] and make it work for Gemm. Will update this. #Resolved
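The shape argument above can be checked numerically: a 1-D [K] vector multiplied by a [K, N] matrix produces exactly the values of the single row of a [1, K] * [K, N] product, since the [1, K] view only changes shape metadata. A minimal standalone sketch (plain C++ with row-major buffers, not onnxruntime code):

```cpp
#include <cassert>
#include <vector>

// [K] x [K, N] -> [N]: 1-D vector times matrix, b in row-major layout.
std::vector<double> vec_matmul(const std::vector<double>& a,
                               const std::vector<double>& b, int N) {
    int K = static_cast<int>(a.size());
    std::vector<double> y(N, 0.0);
    for (int n = 0; n < N; ++n)
        for (int k = 0; k < K; ++k)
            y[n] += a[k] * b[k * N + n];
    return y;
}

// General [M, K] x [K, N] -> [M, N] row-major matmul.
std::vector<double> matmul(const std::vector<double>& a, int M, int K,
                           const std::vector<double>& b, int N) {
    std::vector<double> y(M * N, 0.0);
    for (int m = 0; m < M; ++m)
        for (int n = 0; n < N; ++n)
            for (int k = 0; k < K; ++k)
                y[m * N + n] += a[m * K + k] * b[k * N + n];
    return y;
}
```

Calling `matmul` with M = 1 on the same buffers yields the same N values as `vec_matmul`, which is why promoting [K] to [1, K] is safe for the math (the remaining question, discussed below, is whether the stored input data can be reshaped).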
Need to revert this change, as the input data is not aware of the reshape.

In reply to: 243648189
Regarding the versions: do you think it's better to use the opset version specified in the model rather than hardcoding it to 9 or 10? #Resolved

Sometimes the version impacts the rule we write in the fusion. For example, since Add-7 the broadcast is judged directly from the input shapes, but before that version there was an attribute to specify whether it's a broadcast or not. Things like this affect how we write the code, so we need to clearly specify the versions we support. If there's a new version of an op, we need to check the changes in the op to see whether the fusion still applies, or make code changes accordingly.

In reply to: 449459404
namespace onnxruntime {

class MatMulAddFusion : public onnxruntime::GraphTransformer {
Can we try to implement this as a rewrite rule? It is a pity to traverse the whole graph for these local transformations. You can check the identity elimination as well as the PR here. It is a bit outdated, but I will most probably rebase it today.
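The rewrite-rule idea can be sketched with a toy interface (the names and types below are hypothetical illustrations, not the actual onnxruntime `RewriteRule` API): a rule carries a local match condition and an apply step, and the framework invokes it per node instead of the pass scanning the whole graph itself.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy node: just an op type and pointers to its consumer nodes.
struct Node {
    std::string op_type;
    std::vector<Node*> outputs;
};

// Hypothetical local rewrite-rule interface (sketch only).
struct RewriteRule {
    virtual ~RewriteRule() = default;
    virtual bool SatisfyCondition(const Node& node) const = 0;
    virtual void Apply(Node& node) = 0;
};

// Match a MatMul whose single consumer is an Add, then rewrite in place.
struct MatMulAddRewrite : RewriteRule {
    bool SatisfyCondition(const Node& node) const override {
        return node.op_type == "MatMul" && node.outputs.size() == 1 &&
               node.outputs[0]->op_type == "Add";
    }
    void Apply(Node& node) override {
        node.op_type = "Gemm";                  // MatMul absorbs the Add
        node.outputs[0]->op_type = "Identity";  // stand-in for node removal
    }
};
```

The design win is that each rule only sees one candidate node at a time, so several local fusions can share a single graph traversal driven by the framework.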
}

auto next_node_itr = node->OutputNodesBegin();
if (next_node_itr == node->OutputNodesEnd()) {
Curious: do we have to check for this condition if we've already checked `node->GetOutputEdgesCount() != 1`? #Resolved
…at it can work for Gemm
…as shape [1, K] * [K, N], this may cause a runtime failure, as we can't change the input data shape.
…o [1, K] * [K, N]. It enables fusing MatMul + Add into Gemm, but the issue is that the data is not aware of this, so the data shape is still [K] * [K, N], which causes a runtime issue.
2. Update Gemm so that we can fuse MatMul [K] * [K, N] + Add [1, N] into Gemm with shape [1, K] * [K, N] + [1, N]
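Option 2 can be verified numerically: Gemm computes Y = alpha * A * B + beta * C, so with alpha = beta = 1, A taken as the [1, K] view of the vector, B as [K, N], and C as the [1, N] bias, it reproduces MatMul + Add exactly. A standalone sketch of both sides (plain C++, not the onnxruntime kernel):

```cpp
#include <cassert>
#include <vector>

// MatMul [K] x [K, N] followed by an elementwise Add with bias c of size N.
std::vector<double> matmul_add(const std::vector<double>& a,
                               const std::vector<double>& b,
                               const std::vector<double>& c, int N) {
    int K = static_cast<int>(a.size());
    std::vector<double> y(N, 0.0);
    for (int n = 0; n < N; ++n) {
        for (int k = 0; k < K; ++k) y[n] += a[k] * b[k * N + n];
        y[n] += c[n];
    }
    return y;
}

// Gemm: Y = alpha * A * B + beta * C, with A treated as the [1, K] view of
// the same vector; the single output row matches matmul_add when
// alpha = beta = 1.
std::vector<double> gemm(const std::vector<double>& a,
                         const std::vector<double>& b,
                         const std::vector<double>& c, int N,
                         double alpha = 1.0, double beta = 1.0) {
    int K = static_cast<int>(a.size());
    std::vector<double> y(N, 0.0);
    for (int n = 0; n < N; ++n) {
        double acc = 0.0;
        for (int k = 0; k < K; ++k) acc += a[k] * b[k * N + n];
        y[n] = alpha * acc + beta * c[n];
    }
    return y;
}
```

This is why fixing the fusion on the Gemm side (option 2) avoids the runtime failure from option 1: the stored input data never has to change shape, only the kernel's interpretation of it.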
…into hecli/gemm
# Conflicts:
#   onnxruntime/core/graph/matmul_add_fusion.cc
# Conflicts:
#   onnxruntime/contrib_ops/contrib_kernels.cc
#   onnxruntime/core/providers/cpu/nn/conv.cc
No description provided.