Support AVX-512 for SIMD unary operators, softmax #131

Merged on Apr 28, 2024 (1 commit)

robertknight (Owner)

Support AVX-512 for Exp, Sigmoid, Tanh, Erf, Softmax and other operators which use SIMD-accelerated implementations in rten-vecmath.

The performance benefit is small because these operations are limited by memory bandwidth, but the work revealed some issues with supporting different instruction sets on the same architecture. Usage of `#[inline]` and `#[target_feature]` had to be corrected to ensure that SIMD intrinsics are actually inlined reliably. Otherwise the function call overhead negates the benefits of SIMD entirely.

The `SimdInt` and `SimdFloat` docs have been updated to explain the rules around inlining and target features.
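
For readers unfamiliar with the interaction between the two attributes, here is a minimal sketch of the general Rust pattern. It is not the rten-vecmath code: `scale`, `scale_scalar`, `scale_avx2` and `mul8` are hypothetical names, and AVX2 stands in for AVX-512 so the example does not depend on which toolchain versions accept AVX-512 target features. The idea it illustrates is that a helper wrapping intrinsics can generally only be inlined into a caller that enables the same target features, so the helper carries both `#[inline]` and a matching `#[target_feature]`.

```rust
/// Multiply every element of `xs` by `factor`, using AVX2 when available.
pub fn scale(xs: &mut [f32], factor: f32) {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // Safety: AVX2 support was checked at runtime just above.
            unsafe { x86::scale_avx2(xs, factor) };
            return;
        }
    }
    scale_scalar(xs, factor);
}

/// Portable fallback, compiled with the crate's baseline features only.
fn scale_scalar(xs: &mut [f32], factor: f32) {
    for x in xs.iter_mut() {
        *x *= factor;
    }
}

#[cfg(target_arch = "x86_64")]
mod x86 {
    use std::arch::x86_64::*;

    /// Leaf helper around the intrinsics. `#[inline]` is only a hint; the
    /// matching `#[target_feature]` is what allows it to be inlined into
    /// `scale_avx2` instead of being emitted as a separate call.
    #[inline]
    #[target_feature(enable = "avx2")]
    unsafe fn mul8(ptr: *mut f32, factor: __m256) {
        // Safety: the caller guarantees `ptr` points to 8 valid f32 values.
        unsafe {
            let v = _mm256_loadu_ps(ptr);
            _mm256_storeu_ps(ptr, _mm256_mul_ps(v, factor));
        }
    }

    /// Entry point for the AVX2 path. Only call after a runtime check.
    #[target_feature(enable = "avx2")]
    pub unsafe fn scale_avx2(xs: &mut [f32], factor: f32) {
        let f = _mm256_set1_ps(factor);
        let mut chunks = xs.chunks_exact_mut(8);
        for chunk in &mut chunks {
            // Safety: each chunk holds exactly 8 elements.
            unsafe { mul8(chunk.as_mut_ptr(), f) };
        }
        // Handle the tail that does not fill a whole vector.
        for x in chunks.into_remainder() {
            *x *= factor;
        }
    }
}
```

Calling `scale(&mut values, 2.0)` picks the AVX2 path at runtime where supported and the scalar loop elsewhere. Whether the helper is actually inlined is best confirmed by inspecting the generated assembly.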

@robertknight robertknight merged commit 20c77d7 into main Apr 28, 2024
2 checks passed
@robertknight robertknight deleted the avx512-vecmath branch April 28, 2024 07:17
robertknight added a commit that referenced this pull request Apr 29, 2024
Remove the `#[target_feature]` attributes from the generic erf and tanh
functions. Previously they were compiled with AVX2 instructions even when
called from the fallback (non-AVX2) dispatch path.

This change follows the pattern established in #131, which fixed the issue for
the Exp, Sigmoid and Softmax operators.
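
The pattern described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual rten-vecmath code: `softsign_kernel`, `softsign_avx2`, `softsign_fallback` and `softsign_in_place` are hypothetical names, and a simple softsign formula stands in for erf and tanh. The generic kernel carries no `#[target_feature]`, so each dispatch wrapper it is inlined into decides which instructions it is compiled with.

```rust
/// Generic kernel with NO `#[target_feature]` attribute. Because it is
/// `#[inline(always)]`, it is compiled as part of each wrapper below and
/// uses that wrapper's instruction set. If this function were itself tagged
/// with `#[target_feature(enable = "avx2")]`, the fallback path would end
/// up running AVX2 code as well.
#[inline(always)]
fn softsign_kernel(xs: &mut [f32]) {
    for x in xs.iter_mut() {
        *x = *x / (1.0 + x.abs());
    }
}

/// AVX2 wrapper: the only place where the target feature is enabled.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn softsign_avx2(xs: &mut [f32]) {
    softsign_kernel(xs)
}

/// Fallback wrapper, compiled with the crate's baseline features only.
fn softsign_fallback(xs: &mut [f32]) {
    softsign_kernel(xs)
}

/// Runtime dispatch between the two wrappers.
pub fn softsign_in_place(xs: &mut [f32]) {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // Safety: AVX2 support was checked at runtime just above.
            unsafe { softsign_avx2(xs) };
            return;
        }
    }
    softsign_fallback(xs);
}
```

Keeping the attribute on the thin wrappers rather than on the shared kernel means a new instruction set only needs one more wrapper and one more branch in the dispatch function.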