-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add AVX512 support for CRC32c implementation (on Intel platforms) #68
Closed
Closed
Changes from all commits
Commits
Show all changes
6 commits
Select commit
Hold shift + click to select a range
c6248e0
Added CRC32C AVX512 support.
javazque cf22bca
Fixed routine name to indicate crc32c
pbadari def3a68
Merge pull request #1 from pbadari/avx512_support
pbadari 375fa35
Add sse42 avx512_intrinsics support
javazque 9e18b50
Merge pull request #2 from pbadari/sse42_avx512_intrinsics
pbadari 469ef71
Merge branch 'main' into main
JonathanHenson File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file was deleted.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the inline check correct now? this pulled the inline asm and we really want to check that the intrinsics are available. IIRC there's instances where the assembler installed in the path is newer than the gcc version, would that drift slip through here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Work as expected. We tested on CentOS 8 using gcc 13.1 as 2.36 and Ubuntu 23.04 using gcc 12.2 as 2.4
-- Performing Test AWS_ARCH_INTEL - Success
-- Performing Test AWS_ARCH_ARM64
-- Performing Test AWS_ARCH_ARM64 - Failed
-- Performing Test AWS_ARCH_ARM32
-- Performing Test AWS_ARCH_ARM32 - Failed
-- Performing Test AWS_HAVE_GCC_INLINE_ASM
-- Performing Test AWS_HAVE_GCC_INLINE_ASM - Success
Regarding you question the answer yes, but we can add some checks for intrinsics.
If this doesn't answer your question, please let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about this one, would it help?
https://github.com/awslabs/aws-c-common/blob/87a1b565caf8a0cbb560c475f8529a3632263afd/cmake/AwsSIMD.cmake#L38
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, not at all. We will need support for SSE42 and AVX512 intrinsics since AVX2 only supports 256-bit wide vector registers. Should we add sse42 and avx512 intrinsic support to that cmake file or directly to aws checksum CMakeLists.txt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to add it to aws-c-common so it can be used more broadly, and so we don't wind up with these scatter shot everywhere. Unfortunately that means we need to coordinate merges now. I'll go ahead and file the PR for aws-c-common and post it here. In the meantime we have some build-chain wiring to sort out anyways and we can get going on that.
For future readers of this, I suspect this might cause some Rust build issues as I doubt AVX512 is turned on by default in cargo and those build flags are usually global to the build process. So if your linker is complaining about it, try making sure AVX512 is turned on in cargo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awslabs/aws-c-common#1041
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awslabs/aws-c-common#1041
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awslabs/aws-c-common#1041
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awslabs/aws-c-common#1041
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re rust build questions: AVX512 should only be a codegen-time thing, I wouldn't expect it to depend on anything configured on the linker.