Using UMONITOR, UMWAIT, TPAUSE in CLR and exposing in Intel specific hardware intrinsics #66873
Comments
Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
|
At the very least, this proposal needs to be updated to follow the […]
These instructions are available in user mode and don't appear to have any oddities that would prevent their support in the JIT.
It might be interesting to see if @stephentoub or @jkotas has anywhere this could be used in-box. Things like working with the GC would likely not be easy to support and like […]
For reference:
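The C++ signatures for these are:
void _umonitor(void *address);
uint8_t _umwait(uint32_t control, uint64_t counter);
uint8_t _tpause(uint32_t control, uint64_t counter);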
Rust provides similarly named APIs. |
This comment was marked as off-topic.
It would be interesting to experiment with replacing the lock spin loops using these intrinsics. It should provide better overall performance, especially on machines with many cores. The common locks are implemented in C/C++ in CoreCLR today, so we would need to reimplement them in C# first before the managed intrinsics can be used for those. |
This comment was marked as off-topic.
Can somebody create an API-Shape for this Proposal? |
This comment was marked as resolved.
I've updated it loosely based on the above. Made a couple tweaks and gave an explanation of why |
Looks good as proposed.
namespace System.Runtime.Intrinsics.X86;

[Intrinsic]
[CLSCompliant(false)]
public abstract class WaitPkg : X86Base
{
    public static new bool IsSupported { get; }

    // UMONITOR: void _umonitor(void *address);
    public static unsafe void SetUpUserLevelMonitor(void* address);

    // UMWAIT: uint8_t _umwait(uint32_t control, uint64_t counter);
    public static bool WaitForUserLevelMonitor(uint control, ulong counter);

    // TPAUSE: uint8_t _tpause(uint32_t control, uint64_t counter);
    public static bool TimedPause(uint control, ulong counter);

    [Intrinsic]
    public new abstract class X64 : X86Base.X64
    {
        internal X64() { }
        public static new bool IsSupported { get; }
    }
} |
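As a rough illustration of how the approved shape might be used in a wait loop, the sketch below follows the usual arm, re-check, wait pattern. Everything named here (`IUserWait`, `SpinFallback`, `FlagWaiter`) is hypothetical and not part of the proposal. The operations sit behind an interface so the sketch compiles on any runtime, and `ref int` is used instead of `void*` purely to avoid unsafe code. A real implementation would presumably forward to `WaitPkg.SetUpUserLevelMonitor` and `WaitPkg.WaitForUserLevelMonitor` after checking `WaitPkg.IsSupported`.

```csharp
using System;
using System.Threading;

// Hypothetical abstraction over the proposed monitor/wait operations.
interface IUserWait
{
    void SetUpMonitor(ref int location);
    // Returns true when the wait ended because of the OS time limit.
    bool Wait(uint control, ulong deadline);
}

// Software fallback used when the hardware feature is unavailable:
// there is no monitor to arm, so it just spins briefly.
sealed class SpinFallback : IUserWait
{
    public void SetUpMonitor(ref int location) { }

    public bool Wait(uint control, ulong deadline)
    {
        Thread.SpinWait(20);
        return false;
    }
}

static class FlagWaiter
{
    // Blocks until 'flag' becomes non-zero and returns the number of waits issued.
    // Note: this loops forever if no other thread ever sets the flag.
    public static int WaitForNonZero(IUserWait ops, ref int flag, ulong deadline)
    {
        int waits = 0;
        while (Volatile.Read(ref flag) == 0)
        {
            ops.SetUpMonitor(ref flag);

            // Re-check after arming so a store that raced with the
            // monitor setup is not missed.
            if (Volatile.Read(ref flag) != 0)
            {
                break;
            }

            // control bit 0 == 0 requests the deeper C0.2 state.
            ops.Wait(0, deadline);
            waits++;
        }
        return waits;
    }
}
```

Keeping the fallback behind the same interface means the calling loop does not change between hardware with and without `waitpkg`; only the object passed in differs.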
Summary
x86-based hardware introduced the waitpkg ISA back in 2020, which can be used to better facilitate low-power and low-latency spin-loops.
API Suggestion
Additional Considerations
There is a model-specific register, IA32_UMWAIT_CONTROL (MSR 0xE1), which provides additional information. However, model-specific registers can only be read by ring 0 (the kernel), so this information is not available to user-mode programs unless the underlying OS exposes an explicit API. As such, this information is not surfaced to the end user.
IA32_UMWAIT_CONTROL[31:2]: Determines the maximum time in TSC-quanta that the processor can reside in either C0.1 or C0.2. A zero value indicates no maximum time. The maximum time value is a 32-bit value where the upper 30 bits come from this field and the lower two bits are zero.
IA32_UMWAIT_CONTROL[1]: Reserved.
IA32_UMWAIT_CONTROL[0]: C0.2 is not allowed by the OS. A value of "1" means all C0.2 requests revert to C0.1.
This information is not strictly pertinent to the user either and would not normally influence their use of the APIs. For example, if IA32_UMWAIT_CONTROL[0] is 1, it simply means that a user call of TimedPause where control == 0 will be treated as control == 1.
Likewise, if the user-specified counter is larger than IA32_UMWAIT_CONTROL[31:2], then TimedPause returns true, indicating that the pause ended due to expiration of the operating system time-limit rather than reaching/exceeding the specified counter (returns false). The same applies to WaitForUserLevelMonitor.
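To make the field layout concrete, here is a small model of how the register value could be decoded and how it would affect a call. This is only a sketch: user-mode code cannot read IA32_UMWAIT_CONTROL, so the raw value is a made-up input, and the `UmwaitControlModel` class and its members are hypothetical names. It also assumes that `counter` is an absolute TSC deadline, so the time limit is compared against `counter - nowTsc`. Wake-ups for other reasons (interrupts, or a store to the monitored address) are not modelled.

```csharp
using System;

// Illustrative model only; not an API that exists in .NET.
static class UmwaitControlModel
{
    // Bits [31:2] supply the upper 30 bits of the 32-bit maximum time;
    // the lower two bits of that value are zero.
    public static uint MaxTimeTscQuanta(uint msr) => msr & 0xFFFF_FFFCu;

    // Bit 0: when set, the OS does not allow C0.2.
    public static bool IsC02Disabled(uint msr) => (msr & 1u) != 0;

    // control bit 0 == 0 requests C0.2 and bit 0 == 1 requests C0.1.
    // With C0.2 disabled, a C0.2 request is treated as a C0.1 request.
    public static uint EffectiveControl(uint msr, uint control) =>
        IsC02Disabled(msr) ? (control | 1u) : control;

    // Models the boolean result: true when the OS time limit would expire
    // before the requested deadline, false otherwise.
    // Assumption: 'counter' is an absolute TSC value and 'nowTsc' is the
    // TSC value at the time of the call.
    public static bool EndsOnOsLimit(uint msr, ulong nowTsc, ulong counter)
    {
        uint maxTime = MaxTimeTscQuanta(msr);
        if (maxTime == 0)
        {
            return false; // a zero value means no maximum time
        }
        return counter > nowTsc && (counter - nowTsc) > maxTime;
    }
}
```

For instance, with a hypothetical register value of 0x186A1 (limit of 100000 TSC-quanta, C0.2 disabled), `EffectiveControl(0x186A1, 0)` yields 1, and a deadline 200000 ticks away would be reported as ending on the OS limit.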