-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disable Avx512 #408
Disable Avx512 #408
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #408 +/- ##
=======================================
Coverage 91.55% 91.55%
=======================================
Files 74 74
Lines 12911 12911
=======================================
Hits 11821 11821
Misses 1090 1090
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
This is a neat optimization that would be a shame to remove. I can't reproduce the issue locally because I have an 11th generation intel CPU.
I don't understand here, could you give more details? |
@@ -129,6 +129,7 @@ using AlignedPaddedVec = std::vector<double, AlignedPaddedAllocator<double>>; | |||
#endif | |||
|
|||
#if defined(USE_XSIMD) | |||
/* | |||
#if __AVX512F__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If AVX-512 is disabled, shouldn't __AVX512F__
not be defined, and this section of code would be avoided?
AMD implements AVX-512, but not in its native format. It uses two 256 bit
registers to mimic 512 bit wide register expected by AVX-512 code. When I
am using intel compilers on AMD Epyc processors, `simd_t` has only 4
doubles despite Epyc having AVX-512F support. But the disabled code expects
8 doubles for simd_t. So the code is not compiling on new AMD chips when I
am using Intel compilers.
No issues with gcc though. So, this could be an Intel compiler specific
bug. However Intel compilers still outperform gcc on AMD chips. So this
loss of optimization shouldn't be a big deal.
We can try isolating this specific condition of the Intel compiler + AMD
zen v4 chips. But Intel is also deprecating AVX-512 in their line of chips.
It is going towards high efficiency E-cores in place of high width
registers which consume a lot of power. So I felt it may not be worth the
effort to have multiple branches targeting various scenarios. With most of
the new HPC machines using AMD zen4 chips, this could lead to a lot of
users complaining and more maintenance requests.
We can keep the code commented for the time being or keep the PR open and
do nothing. I am OK either way.
Bharat Medasani
…On Wed, Apr 24, 2024 at 10:36 AM Andrew Giuliani ***@***.***> wrote:
This is a neat optimization that would be a shame to remove. I can't
reproduce the issue locally because I have an 11th generation intel CPU.
The current AVX512 optimization code is not working when compiled with
Intel compilers because xsimd is generating 4 doubles for simd_t when Intel
compiler is used. However the disabled code is expecting 8 doubles.
I don't understand here, could you give more details?
—
Reply to this email directly, view it on GitHub
<#408 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AA62VEHSVWOYPBNP45VWGNTY667NFAVCNFSM6AAAAABGVBTSSOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANZVGEYDGMRVGI>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
thanks for the explanation, I'm ok with merging this PR |
Intel is deprecating AVX512 from their chips. And AMD implementation of AVX512 is a non-native implementation with two 256 bit registers concatenated. The current AVX512 optimization code is not working when compiled with Intel compilers because xsimd is generating 4 doubles for simd_t when Intel compiler is used. However the disabled code is expecting 8 doubles. We can forego this optimization for more portability.