[CPU][ARM][x64]Snippets MatMul via brgemm emitter and executor #28304
base: master
Force-pushed from a5b829d to 6ca4f1b.
Force-pushed from 982e2c2 to 6e05cb1.
@a-sidorova, could you please review this as well, since you are reviewing #28229? The snippets MatMul test cases pass on ARM. Thank you!
src/plugins/intel_cpu/src/transformations/tpp/aarch64/pass/lowered/brgemm_tpp_blocking.cpp
void jit_brgemm_emitter::emit_impl(const std::vector<size_t>& in, const std::vector<size_t>& out) const {
    validate_arguments(in, out);
    std::unordered_set<size_t> exclude = {};
    store_context(exclude);
Please note that we will merge #27391 soon. That PR makes register spills more efficient: we will be able to spill only the needed (live) registers.
Just for your information, and to align with our other activities 😊
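To illustrate the idea of spilling only live registers, here is a minimal sketch. The helper `regs_to_spill` and both of its parameters are hypothetical names for illustration; they are not the emitter's API and do not describe how #27391 is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper, not the emitter's real API: choose the registers to
// save around a kernel call. A register needs a spill only if the call may
// clobber it AND its value is still needed (live) after the call.
std::vector<size_t> regs_to_spill(const std::set<size_t>& clobbered_by_call,
                                  const std::set<size_t>& live_after_call) {
    std::vector<size_t> spill;
    for (size_t reg : clobbered_by_call) {
        if (live_after_call.count(reg) != 0) {
            spill.push_back(reg);
        }
    }
    // The spill list can never be larger than the live set.
    assert(spill.size() <= live_after_call.size());
    // std::set iterates in ascending order, so the result is sorted.
    return spill;
}
```

With an empty live set nothing is spilled, whereas saving the whole context unconditionally costs the same regardless of how many registers are actually in use.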
Could you please create a ticket to support optimized register spills in the jit_brgemm emitter on ARM and leave a TODO comment here?
CVS-162498 has been created and a comment has been added in the code. Thanks, Alexandra!
if (ENABLE_SNIPPETS_LIBXSMM_TPP)
    ov_add_compiler_flags(-Wno-missing-declarations)
Could you elaborate why you need to add this flag? Can we avoid it?
This suppresses the "warning as error" failures when compiling libxsmm. Otherwise there are compile errors such as "intel_cpu/thirdparty/libxsmm/src/generator_common_aarch64.c:60:6: error: no previous declaration for ‘libxsmm_generator_vcvt_f32i8_aarch64_sve’ [-Werror=missing-declarations]".
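For context, a small sketch of the pattern that GCC's `-Wmissing-declarations` reports, plus two common ways to satisfy it. All function names below are hypothetical and unrelated to libxsmm; since the third-party sources cannot be edited this way, the flag is suppressed instead.

```cpp
#include <cassert>

// Reported by -Wmissing-declarations (an error under -Werror): a global
// function with external linkage is defined without a previous declaration.
int scale_without_declaration(int x) {
    return 2 * x;
}

// Option 1: declare before defining (normally the declaration is in a header).
int scale_with_declaration(int x);
int scale_with_declaration(int x) {
    return 2 * x;
}

// Option 2: give a file-local helper internal linkage.
static int scale_internal(int x) {
    return 2 * x;
}

// Declared first, so this definition does not trigger the warning either.
int sum_of_scaled(int x);
int sum_of_scaled(int x) {
    return scale_without_declaration(x) + scale_with_declaration(x) + scale_internal(x);
}
```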
Could you please leave a comment that we have to use this flag to avoid thirdparty's compilation errors?
comment is added!
src/plugins/intel_cpu/src/emitters/tpp/aarch64/kernel_executors/brgemm.cpp
src/plugins/intel_cpu/src/emitters/tpp/aarch64/kernel_executors/brgemm.hpp
Force-pushed from 6e05cb1 to 96e274c.
No more major comments from my side 👍🏼
LGTM
return BrgemmGenericKernelConfig::operator==(rhs) &&
       (get_static_params() == rhs.get_static_params() ||
        *get_static_params() == *(rhs.get_static_params()));
m_compile_flags is the info combined from m_static_params->m_compile_flags and beta
Oh, I see. Then I agree: `beta` is already handled. Thank you for the explanation!
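The comparison pattern discussed here (a cheap pointer check first, then a value comparison of the pointed-to static parameters) can be sketched as follows. `StaticParams` and `KernelConfig` are simplified, hypothetical stand-ins for the real config classes; the sketch assumes, as stated in the thread, that the compile flags already encode `beta`.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the static part of a kernel config.
// Assumption from the thread: compile_flags already incorporates beta.
struct StaticParams {
    int compile_flags;
    bool operator==(const StaticParams& rhs) const {
        return compile_flags == rhs.compile_flags;
    }
};

// Hypothetical stand-in for the full config: runtime shape plus shared static params.
struct KernelConfig {
    int M, N, K;
    std::shared_ptr<const StaticParams> static_params;

    bool operator==(const KernelConfig& rhs) const {
        assert(static_params && rhs.static_params);
        return M == rhs.M && N == rhs.N && K == rhs.K &&
               // Same object: pointer equality is enough. Otherwise compare by value.
               (static_params == rhs.static_params ||
                *static_params == *rhs.static_params);
    }
};
```

Two configs that share one `StaticParams` object compare equal without dereferencing, and configs with distinct but equal-valued parameters still compare equal.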
Force-pushed from ce991bb to 2a7d14e.
src/plugins/intel_cpu/src/emitters/tpp/aarch64/jit_brgemm_emitter.cpp
src/plugins/intel_cpu/src/emitters/tpp/common/kernel_executors/brgemm.cpp
gemm_p.a.primary = in1;
gemm_p.b.primary = in0;
`in0` and `in1` look mixed up: I'd say that the A input should be `in0`, not `in1`. However, the x64 implementation has the same situation... Do you have any idea why it is done this way?
@IvanNovoselov, or maybe you have some secret TPP knowledge about why we form the runtime args this way? :)
The data is row-major in OV, while MatMul in libxsmm assumes the data is column-major. Exchanging `in0` and `in1` avoids a data repack. The M/N, lda/ldb and in0/in1 precisions are also exchanged in `libxsmm_create_gemm_shape()`. @IvanNovoselov, could you confirm this or correct me if I misunderstand it? Thank you!
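A self-contained sketch of why the swap works, assuming only standard linear algebra: a row-major matrix occupies the same memory as its column-major transpose, and C^T = B^T * A^T. The naive `gemm_col_major` below is a hypothetical stand-in for a column-major kernel, not libxsmm's API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive column-major GEMM standing in for a column-major library kernel:
// C(m x n) = A(m x k) * B(k x n), with element (i, j) stored at [i + j * ld].
void gemm_col_major(size_t m, size_t n, size_t k,
                    const float* a, size_t lda,
                    const float* b, size_t ldb,
                    float* c, size_t ldc) {
    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < m; ++i) {
            float acc = 0.f;
            for (size_t p = 0; p < k; ++p) {
                acc += a[i + p * lda] * b[p + j * ldb];
            }
            c[i + j * ldc] = acc;
        }
    }
}

// Row-major C(M x N) = A(M x K) * B(K x N) with no repacking.
// Row-major X is the same memory as column-major X^T, and C^T = B^T * A^T,
// so B becomes the first operand, A the second, and M/N and lda/ldb swap too.
std::vector<float> matmul_row_major(size_t M, size_t N, size_t K,
                                    const std::vector<float>& A,
                                    const std::vector<float>& B) {
    assert(A.size() == M * K && B.size() == K * N);
    std::vector<float> C(M * N);
    gemm_col_major(N, M, K, B.data(), N, A.data(), K, C.data(), N);
    return C;
}
```

The output buffer needs no transposition either: the column-major C^T written by the kernel is exactly the row-major C the caller expects.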
Aha, got it, thanks for the explanation! Maybe we could leave an explanatory comment then to avoid potential questions in the future?
comment added!
src/plugins/intel_cpu/src/emitters/tpp/common/kernel_executors/brgemm.hpp
Force-pushed from ce3e097 to 5411abb.
LGTM 👍
Vladislav comments apply-2
Force-pushed from 5411abb to 3560cd9.