-
Notifications
You must be signed in to change notification settings - Fork 430
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix "simd_support" feature #1056
Conversation
This is necessary, because support for `__m64` was removed from nighly Rust [1]. Fixes rust-random#1050. [1] rust-lang/stdarch#823
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me
The specialized instructions are already available with `sse2`.
#[cfg(not(target_feature = "sse4.2"))] | ||
#[cfg(not(target_feature = "sse2"))] | ||
wmul_impl! { (u16x8, u32x8),, 16 } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC this is a 256-bit op and thus requires AVX?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think packed_simd provides a fallback implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it?
Can you at least clarify on what basis you decided to make this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
without 128-bit XMM multiplications, packed_simd falls back to a scalar implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, why do we bother with these cfg
selectors at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is to provide a specialized implementation that uses less instructions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you're saying the use of a SSE4.2 target originally was a mistake?
Maybe it was, since the insrtuctions appear to be supported on SSE2: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm_mullo_epi16&expand=3967,3990
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appears to be a mistake. I guess that was my bad
Assuming @dhardy is satisfied, I'm going ahead and merge this. |
No description provided.