-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tracking issue for cfg_target_feature
#29717
Comments
The only known issue with this I'm aware of is that it doesn't doesn't use the same mechanism in LLVM to query whether a feature is activated or not. We already maintain our own list of what features are allowed here (e.g. we're not just exposing what LLVM accepts) and then we currently map that down to an LLVM feature name. When testing whether the LLVM feature is activated, however, we should robustly query LLVM via its own internal mechanisms rather than explicitly checking ourselves. |
I opened a separate issue #30462 for the missing |
cfg target_feature -- for github issue searchability as well also a new forum link -- https://internals.rust-lang.org/t/comprehensive-list-of-desired-cfg-target-feature-s/3201 |
So this is a list of the current target features supported by 16bit-mode - 16-bit mode (i8086).
32bit-mode - 32-bit mode (80386).
3dnow - Enable 3DNow! instructions.
3dnowa - Enable 3DNow! Athlon instructions.
64bit - Support 64-bit instructions.
64bit-mode - 64-bit mode (x86_64).
adx - Support ADX instructions.
aes - Enable AES instructions.
atom - Intel Atom processors.
avx - Enable AVX instructions.
avx2 - Enable AVX2 instructions.
avx512bw - Enable AVX-512 Byte and Word Instructions.
avx512cd - Enable AVX-512 Conflict Detection Instructions.
avx512dq - Enable AVX-512 Doubleword and Quadword Instructions.
avx512er - Enable AVX-512 Exponential and Reciprocal Instructions.
avx512f - Enable AVX-512 instructions.
avx512ifma - Enable AVX-512 Integer Fused Multiple-Add.
avx512pf - Enable AVX-512 PreFetch Instructions.
avx512vbmi - Enable AVX-512 Vector Bit Manipulation Instructions.
avx512vl - Enable AVX-512 Vector Length eXtensions.
bmi - Support BMI instructions.
bmi2 - Support BMI2 instructions.
call-reg-indirect - Call register indirect.
clflushopt - Flush A Cache Line Optimized.
clwb - Cache Line Write Back.
cmov - Enable conditional move instructions.
cx16 - 64-bit with cmpxchg16b.
f16c - Support 16-bit floating point conversion instructions.
fast-partial-ymm-write - Partial writes to YMM registers are fast.
fma - Enable three-operand fused multiple-add.
fma4 - Enable four-operand fused multiple-add.
fsgsbase - Support FS/GS Base instructions.
fxsr - Support fxsave/fxrestore instructions.
hle - Support HLE.
idivl-to-divb - Use 8-bit divide for positive values less than 256.
idivq-to-divw - Use 16-bit divide for positive values less than 65536.
invpcid - Invalidate Process-Context Identifier.
lea-sp - Use LEA for adjusting the stack pointer.
lea-uses-ag - LEA instruction needs inputs at AG stage.
lzcnt - Support LZCNT instruction.
mmx - Enable MMX instructions.
movbe - Support MOVBE instruction.
mpx - Support MPX instructions.
mwaitx - Enable MONITORX/MWAITX timer functionality.
pad-short-functions - Pad short functions.
pclmul - Enable packed carry-less multiplication instructions.
pcommit - Enable Persistent Commit.
pku - Enable protection keys.
popcnt - Support POPCNT instruction.
prefetchwt1 - Prefetch with Intent to Write and T1 Hint.
prfchw - Support PRFCHW instructions.
rdrnd - Support RDRAND instruction.
rdseed - Support RDSEED instruction.
rtm - Support RTM instructions.
sahf - Support LAHF and SAHF instructions.
sgx - Enable Software Guard Extensions.
sha - Enable SHA instructions.
slm - Intel Silvermont processors.
slow-bt-mem - Bit testing of memory is slow.
slow-incdec - INC and DEC instructions are slower than ADD and SUB.
slow-lea - LEA instruction with certain arguments is slow.
slow-shld - SHLD instruction is slow.
slow-unaligned-mem-16 - Slow unaligned 16-byte memory access.
slow-unaligned-mem-32 - Slow unaligned 32-byte memory access.
smap - Supervisor Mode Access Protection.
soft-float - Use software floating point features..
sse - Enable SSE instructions.
sse-unaligned-mem - Allow unaligned memory operands with SSE instructions.
sse2 - Enable SSE2 instructions.
sse3 - Enable SSE3 instructions.
sse4.1 - Enable SSE 4.1 instructions.
sse4.2 - Enable SSE 4.2 instructions.
sse4a - Support SSE 4a instructions.
ssse3 - Enable SSSE3 instructions.
tbm - Enable TBM instructions.
vmfunc - VM Functions.
x87 - Enable X87 float instructions.
xop - Enable XOP instructions.
xsave - Support xsave instructions.
xsavec - Support xsavec instructions.
xsaveopt - Support xsaveopt instructions.
xsaves - Support xsaves instructions. Would a patch that implements macros for all of these need to go through the RFC process? |
This commit adds support for detecting all of llc's target features for the x86/x86_64 architectures. Closes rust-lang#30462. Helps with rust-lang#29717 and rust-lang#34382.
cc @BurntSushi @eddyb @nikomatsakis -- I believe all of you were recently involved in discussion around SIMD stabilization. |
This would be useful to use in matrixmultiply, ndarray even before simd stabilization. I get better results if I can pick the particular stable rust-implemented loop that generates best code given the particular available vector types / available target features. |
For people interested in helping with this: There is a allowlist of features that are exposed by this mechanism, which is very limited. See https://github.com/rust-lang/rust/blob/master/src/librustc_driver/target_features.rs#L23-L28, in particular the variables I found #34412 which seems to be a good example of a PR that expands the set of features that can be used with |
Is this blocking stabilization of this feature? And, if so, is this the only thing blocking stabilization, or are there other things? |
To answer my own question: IMO, the above concern doesn't need to block this feature from being stabilized. In my case, I don't care whether LLVM has the CPU features enabled in its code generation because I'm not using LLVM-based code gen for my code that uses SIMD and stuff. The above concern does need to block the stabilization of intrinsics and other things exposed from libcore/libstd that depend on So, IMO, this seems like it this is good to go. |
One clarification might be in order: When code uses this feature, can it assume that every feature name will be globally unique, or should it assume that feature names might be reused across different architectures. For example, would a feature "+crypto" imply AAarch32 or AAarch64, or is it possible that, say, a MIPS target could have a "+crypto" feature? In some instances this would affect whether one writes this: #[cfg(target_feature="crypto")] or this: #[cfg(all(target_feature="crypto", any(target="aarch64", target="aarch32")))] |
@briansmith AFAIK this issue is not block on anything. The concern you linked to about querying LLVM directly was actually resolved back in April! Also AFAIK we haven't discussed issues of namespaces or implication of features across architectures. |
Awesome!
The RFC says "This RFC proposes a target_feature cfg, that would be set to the features of the architecture that are known to be supported by the exact target e.g." To me, that implies that some feature names may overlap between architectures. This means that one probably should use I also noticed that the documentation for each target feature is missing. In particular, there doesn't seem to be any documentation describing what each value of For example, in the case of Aarch32/Aarch64 crypto features, ARM has defined a "crypto" feature that enables AES, SHA-1, and SHA-2 instructions. However, the CPU actually reports whether each feature is enabled/disabled separately, so in theory there could be individual sub-features of "crypto" in the future. Also, in clang "+crypto" enables floating point and NEON, according to its documentation. If it is expected that the Rust target features will stay in sync with LLVM's, then we could just have documentation for target_feature that points to the external LLVM documentation. That's what I'd recommend for now. |
Note in particular the GCC documentation says "feature crypto implies simd, which implies fp. Conversely, nofp implies nosimd, which implies nocrypto." (emphasis mine). Does this mean that in Rust |
@briansmith yeah makes sense to me that we'd want to make sure we document all supported target features in one way or another, including hierarchies and implications amongst them. I do believe we support |
What's the relation to cargo's |
Cargo gained a feature recently to ferry along everything from The In other words, @briansmith's points about documentation, relationships, names, what they actually mean, etc, I believe are also all relevant for this stabilization issue. |
Ok that's important information. I don't like that stable prints something different, but it's important to know. Very late clarify (March 2017): Normally, on nightly we opt in to nightly features (such as printing more cfg's), so it's surprising when it is a silently enabled stable/nightly difference. |
I'd like to hopefully make progress on this issue as the real issue I'd like to make progress on, #37406, I have now realized is blocked on this report. As of today we have solved my original concern by querying LLVM directly about this information instead of attempting to infer it ourselves. However @briansmith's concern remains about namespacing across targets and such. I would also like to show that practically today all our features match LLVM exactly (we don't change the names). This is intentional for these features but we are not tying ourselves to LLVM's naming, we reserve the right to change the name and add features (such as As a result I would like to propose the following:
That is, I'd like to stabilize this mechanism for controlling features to targets, and then leave stabilization of the names here to the actual relevant subsystems. For example I would expect SIMD stabilization to be coupled with the SIMD attributes, and crt-static should have its own bikeshed for the name. This does force our hand I think on the "global namespace" argument where it would be mega confusing for x86_64 Linux to have a Along those lines, I will... @rfcbot fcp merge |
Team member @alexcrichton has proposed to merge this. The next step is review by the rest of the tagged teams:
No concerns currently listed. Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
Yeah as pointed out by @gnzlbg the original motivation for stabilizing this more quickly evaporated which caused this to sit for awhile. Work has started on the implementation and the next outstanding piece I believe is gating each feature name behind a separate feature gate. This is now pretty tightly coupled with simd stabilization (without many other motivational factors) so I wouldn't expect this to stabilize ahead or long after SIMD (but rather at the same time). SIMD's still farther out, though. |
Counter-intuitively, specifying Also, compiling against a native CPU with this feature available also doesn't cause Is this intentional? |
@raphaelcohn I believe you need |
If |
I'll double-check I transcribed that correctly. However, if it is supposed to |
Right, |
@raphaelcohn the RFC says
That is,
per the rfc, the analogous syntax is I think that |
#48556 has entered FCP which I believe will implicitly stabilize this attribute |
This commit stabilizes the SIMD in Rust for the x86/x86_64 platforms. Notably this commit is stabilizing: * The `std::arch::{x86, x86_64}` modules and the intrinsics contained inside. * The `is_x86_feature_detected!` macro in the standard library * The `#[target_feature(enable = "...")]` attribute * The `#[cfg(target_feature = "...")]` matcher Stabilization of the module and intrinsics were primarily done in rust-lang/stdarch#414 and the two attribute stabilizations are done in this commit. The standard library is also tweaked a bit with the new way that stdsimd is integrated. Note that other architectures like `std::arch::arm` are not stabilized as part of this commit, they will likely stabilize in the future after they've been implemented and fleshed out. Similarly the `std::simd` module is also not being stabilized in this commit, only `std::arch`. Finally, nothing related to `__m64` is stabilized in this commit either (MMX), only SSE and up types and intrinsics are stabilized. Closes rust-lang#29717 Closes rust-lang#44839 Closes rust-lang#48556
Stabilize x86/x86_64 SIMD This commit stabilizes the SIMD in Rust for the x86/x86_64 platforms. Notably this commit is stabilizing: * The `std::arch::{x86, x86_64}` modules and the intrinsics contained inside. * The `is_x86_feature_detected!` macro in the standard library * The `#[target_feature(enable = "...")]` attribute * The `#[cfg(target_feature = "...")]` matcher Stabilization of the module and intrinsics were primarily done in rust-lang/stdarch#414 and the two attribute stabilizations are done in this commit. The standard library is also tweaked a bit with the new way that stdsimd is integrated. Note that other architectures like `std::arch::arm` are not stabilized as part of this commit, they will likely stabilize in the future after they've been implemented and fleshed out. Similarly the `std::simd` module is also not being stabilized in this commit, only `std::arch`. Finally, nothing related to `__m64` is stabilized in this commit either (MMX), only SSE and up types and intrinsics are stabilized. Closes #29717 Closes #44839 Closes #48556
Added as part of the SIMD work, supports feature detection based on e.g. target processor. This issue tracks stabilization.
Relevant issues and bugs
The text was updated successfully, but these errors were encountered: