Proposal: Compiler intrinsics #11475
Comments
@jkotas @yizhang82 @smosier @stephentoub @MadsTorgersen @gafter @dotnet/roslyn-compiler |
Looks exactly how I had imagined, @jaredpar and I were talking about this afternoon, so I'm super happy to see this! |
How do we feel about not taking a dependency on CoreFX defining the CompilerIntrinsic attribute, and instead letting the compiler decide which attribute to look for via an argument, or (and this may be heavy-handed) making a keyword out of this? I'd hate for this feature to be gated on the need to upgrade both the compiler AND the standard library. |
There is no dependency on CoreFX defining the attribute. The attribute can be defined in source as it is demonstrated in the example above (notice the In other words you can just define the attribute anywhere in your project. |
Gotcha, thanks. |
Event adders/removers seem to be missing from the example. |
How would class Program
{
[CompilerIntrinsic]
static unsafe extern void* LoadFunctionPointer(Action<object> target);
[CompilerIntrinsic]
static unsafe extern void* LoadFunctionPointer(Action<string> target);
unsafe static void Main()
{
var writeLineObject = LoadFunctionPointer((Action<object>)Console.WriteLine);
var writeLineString = LoadFunctionPointer((Action<string>)Console.WriteLine);
}
} |
@svick Yes, exactly. |
What are the reasons to prefer this over supporting inline assembly directly? |
@miloush Because this approach doesn't need to change the language syntax and semantics. |
@tmat Feels a little bit like a hack to avoid changes in the language. I agree it's much easier to ship than designing IL support would be if it had to offer comparable features. Do you expect all IL instructions to be available via such methods, or only the "impossible" ones? |
@miloush Only a few special purpose patterns as needed. The intrinsics are designed to be used in very rare cases. Most C# programmers won't ever see one. Embedding full IL sublanguage into C# would be a lot of effort and would make the language much more complex for something that should be used very rarely. That's not a good trade off to make. |
What if we could define them as generic functions,
[CompilerIntrinsic]
static unsafe extern void* LoadFunctionPointer<T>(Action<T> target);
var writeLineObject = LoadFunctionPointer<object>(Console.WriteLine);
var writeLineString = LoadFunctionPointer<string>(Console.WriteLine);
Or with a different name.
[CompilerIntrinsic("LoadFunctionPointer")]
static unsafe extern void* LoadFunctionPointerStr(Action<string> target);
[CompilerIntrinsic("LoadFunctionPointer")]
static unsafe extern void* LoadFunctionPointerObj(Action<object> target);
Or a general one,
[CompilerIntrinsic]
static unsafe extern void* LoadFunctionPointer(Delegate target);
Because you will need to cast the argument anyways. |
+1 I like the general idea, as it would almost obviate the need for IL patching in SharpDX interop codegen (similar to MCG), though, ideally, I would prefer a generic solution like this (in between the proposal and @miloush's concerns):
[CompilerIntrinsic] static extern void ilemit(string emit);
...
ilemit("ldtoken Program");
ilemit("ldtoken fld"); The underlying In addition to
|
@xoofx I don't understand how adding quotes around intrinsics adds any value. As I already mentioned, we'd only support a very small subset of IL patterns, which is expressed by the set of supported intrinsics. So the amount of IL you could express would be the same.
Have you tried to prototype such an approach? I believe the opposite is true. It'd be much more complicated. |
The benefits are:
I have almost always started by developing my IL in IL ASM, compiling an assembly from it, and linking to it. Then I had to go the Mono.Cecil route to avoid having another assembly (ILMerge has not always been safe to use on some assemblies). Exactly as you can see in PtrUtils.il by @joeduffy. It has been somewhat prototyped in "Tool to allow inline IL in C# / VB.Net", although it was of course not done at compilation time but as an ILDASM post-processing injection trick.
Not sure what you both mean by "complicated". Do you mean complicated to implement in Roslyn? (In that case, knowing a bit of Roslyn, I know for sure that it is quite easy to hook a prototype into it for this kind of thing.) Or do you mean complicated to read for someone who doesn't know IL ASM? So yes, I prefer a well-known syntax like on the right side, rather than the left side, which is a bit more cumbersome to use and author:
My proposal has type safety as well. Roslyn would parse the asm string in exactly the same way it would do it if it was part of the language. It means that a reference inside the string would be evaluated (you could navigate to it), you would have a SyntaxToken for it, and it would be part of the SyntaxTree. It is just that by escaping the IL ASM in a string, you don't introduce any breaking changes to the language(s) and it is compatible typically with any .NET language. |
Yes. Writing code in IL is an explicit non-goal of this feature. We want our users to write code in C#. |
Fair enough if it is a requirement. |
Too bad, we could've had an ASM (maybe il { }) for C# http://wiki.freepascal.org/Lazarus_Inline_Assembler |
@pebezo There are a million things we could potentially have in C#. Adding all of them won't make the language better. https://i.kinja-img.com/gawker-media/image/upload/s--BdEb-Dl5--/c_scale,fl_progressive,q_80,w_800/185xbqdcr7fgmjpg.jpg |
After a few hours of work, here we go: http://xoofx.com/blog/2016/05/25/inline-il-asm-in-csharp-with-roslyn/ 🎉 |
@xoofx Nice! Certainly better than strings ;) However a few issues you might want to think about: a) The IL validation you mention in the blog post (balanced stack, etc.), especially when combined with all other C# language features -- like using il(...) in the middle of expression, etc. Perhaps you could do the final validation once everything is on IL level, but then you won't be able to report diagnostics to the user as they are typing. b) Debugging -- il(...) looks like a statement but it is not a statement. How are you gonna place sequence points? Is EnC gonna work in a method that contains il() calls? c) How many IDE features (e.g. refactorings, rename, etc.) need to understand il(...) calls in order to not produce incorrect results? Complexity is hidden in the detail... |
@tmat The points you mention are of course things that would have to be handled/crafted carefully for a non-prototype code. That would require certainly more work than the original proposal. I never claimed the opposite. but...
first, I'm making a large difference in nature (and not in levels) between the word "complexity" and "complicated"... But I definitely agree that a clean work would require more work for sure! Though, nothing complicated here, just laborious work. 😉 |
@xoofx Cool prototype! Looking at it, though, I think I like @tmat's proposal more. In order to incorporate your changes we would basically have to create an entire IL sub-language inside C# and validate it all in the compiler. I would prefer to separate concerns here and let an IL compiler handle IL and the C# compiler handle C#. |
@xoofx I understand you haven't spent too much time on your prototype so it doesn't cover everything. The original proposal is designed to avoid these problems and thus the complete implementation of it would be a simple local change in the compiler. Your proposal adds more features but they come with a price. The features are inherently non-local, they don't only affect the compiler but all language services need to deal with them. |
Why not have an attribute instead that allows people to write inline IL as a string? Something like this:
[ILSub(@"
    sizeof !!T
    ret")]
public static extern int SizeOf<T>();
That way we don't have to make people remember the intrinsic function names as well as the corresponding IL ones, and we won't need to worry if new opcodes are added to IL. |
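For readers who only need the `sizeof !!T` case from the comment above, the existing library method `System.Runtime.CompilerServices.Unsafe.SizeOf<T>()` already returns the size of an arbitrary type argument. The sketch below assumes a target where `Unsafe` is available (it ships with modern .NET and as the System.Runtime.CompilerServices.Unsafe NuGet package for older targets). The `SizeOfDemo` class and its method names are made up for illustration and are not part of the proposal.

```csharp
using System;
using System.Runtime.CompilerServices;

// Illustrative only: SizeOfDemo and its members are hypothetical names.
public static class SizeOfDemo
{
    // Unsafe.SizeOf<T>() works for any T, including generic type parameters,
    // which the C# 'sizeof' operator historically did not allow in safe code.
    public static int SizeOfInt32() => Unsafe.SizeOf<int>();

    public static int SizeOfGuid() => Unsafe.SizeOf<Guid>();

    // Generic wrapper: the size is resolved per instantiation of T.
    public static int SizeOfAny<T>() => Unsafe.SizeOf<T>();

    public static void Main()
    {
        Console.WriteLine(SizeOfInt32());       // 4
        Console.WriteLine(SizeOfGuid());        // 16
        Console.WriteLine(SizeOfAny<long>());   // 8
    }
}
```

This covers only the one instruction; it does not address the method-token concern raised in the next comment.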
Seems like a good thing to have. On the other hand that does not allow for obtaining tokens to methods. |
I moved further along in my implementation. There are also extension methods, which are as close as I can seem to get to obtaining MethodTokens or TypeRef using Expression. I will be making more updates and refactoring the Managed Intrinsic to completely replace Intrinsic, which I previously implemented. The only thing I can think of in relation to use via Emit would be to use a combination of Expressions to allow OpCodes to be created from a byte[] or otherwise. Thus a syntax like the one you wanted above could be possible using a combination of the byte[] changed into opcodes and some type of IL parser. I have an old implementation of an IL parser if you want to play with it; I have linked to it below. |
IIRC, even if this is something that would be desirable in C#, the CLR doesn't support mixed mode methods. A method may either be IL or native, but not some combination of the two. |
It's also too much work. This is a fairly esoteric feature. It does not have to be "nice". It's an escape hatch for power users. The team can spend the time better on other things rather than building extensive support for esoteric features. |
@HaloFour, after JIT it's not IL anymore, so there's no mixing to worry about. The bigger question no one seems to want to ask is: which intrinsic are you going to invoke or represent in IL that has any meaningful benefit? The JIT is where these improvements need to be made, not the compiler. You can already dynamically emit methods or compile them manually, JIT the result, and replace another method. The bottom line is that this is an esoteric feature which is mostly desired by power users, as GSPP indicates, but it also has useful applications outside of using the DLR. Whether it is worth having native compiler support now that there is the DLR is the real question. Since it offers such possibilities, with higher performance potential, I desire it, but then again I don't use the DLR much at all. (I had already cooked up a Dictionary-based approach which would use expressions almost just before its release.) Maybe it's better suited as an extension or replacement to the DLR itself, with a special type 'il', just like we have 'dynamic', for the CLR, where it could also be used by any CLR language. |
@tmat We should update the proposal to modify the parameter passing of the calling convention to instead be name suffixed.
[CompilerIntrinsic]
static unsafe extern int CallIndirect(arg0, void* managedMethodPointer); // managed cc
[CompilerIntrinsic]
static unsafe extern int CallIndirectCDecl(arg0, void* functionPointer); // unmanaged
[CompilerIntrinsic]
static unsafe extern int CallIndirectStdCall(arg0, void* functionPointer); // unmanaged
[CompilerIntrinsic]
static unsafe extern int TailCallIndirectFastCall(arg0, void* functionPointer); // unmanaged
I have some of these intrinsics implemented here (namely CallIndirect and friends): https://github.com/mjsabby/roslyn/tree/intrinsic_backup The implementation is a prototype and the mechanics of how it generates the code are likely to change, but there is correctness. I intend to keep my fork around until we can get this proposal landed. There is also a nuget package of this that is much like the |
dotnet/corefx#13561 has another case where "infoof" functionality will come in handy for the |
@mjsabby Updated the proposal. |
The return type should match the argument type, e.g. |
@jkotas Fixed |
I assume there will be restrictions on how much data flow analysis Roslyn can do to get the mapped data array - can I do:
var myArray = new byte[] { 1, 2, 3 };
var data = AddressOfMappedData(myArray);
UseData(data, myArray.Length /* constprop optimized to 3 at compile time */);
Or do I have to do:
var data = AddressOfMappedData(new byte[] { 1, 2, 3 });
UseData(data, 3);
If it's the latter, any chance we can make the intrinsic return a |
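As a point of comparison for the mapped-data question above, the C# compiler (7.3 and later) recognizes a constant `byte` array initializer that is converted directly to `ReadOnlySpan<byte>` and can emit it as a reference to data embedded in the assembly, without allocating an array. The sketch below is not the `AddressOfMappedData` intrinsic from the proposal; `MappedDataDemo`, `Data`, and `Sum` are hypothetical names, and whether the allocation is actually elided depends on the compiler version in use.

```csharp
using System;

// Illustrative only: MappedDataDemo and its members are hypothetical names.
public static class MappedDataDemo
{
    // The initializer is a compile-time constant byte sequence converted
    // straight to ReadOnlySpan<byte>, so the compiler may point the span at
    // the assembly's embedded data instead of allocating a new array.
    private static ReadOnlySpan<byte> Data => new byte[] { 1, 2, 3 };

    // The span carries its own length, so no separate length argument is needed.
    public static int Sum()
    {
        int total = 0;
        foreach (byte b in Data)
        {
            total += b;
        }
        return total;
    }

    public static int Length() => Data.Length;

    public static void Main()
    {
        Console.WriteLine(Sum());    // 6
        Console.WriteLine(Length()); // 3
    }
}
```

Because the span exposes `Length`, the call site avoids the hand-written `3` from the second variant in the comment.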
Here's another approach for those interested in embedded IL code support: I also wanted to be able to write IL directly in my C# code, so I wrote a Fody addin which lets you do just that: InlineIL.Fody. Just add the NuGet package and now you have a compile-time For those who don't know what Fody is, it's an extensible weaver which adds a build step that modifies assemblies generated by Roslyn. |
I've implemented most of the Compiler Intrinsics according to this proposal and put them in a nuget package
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>netcoreapp2.1</TargetFramework>
    <AllowUnsafeBlocks>true</AllowUnsafeBlocks>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="CSCWithCompilerIntrinsics" Version="2.8.2" />
  </ItemGroup>
</Project>
namespace System.Runtime.CompilerServices
{
    [AttributeUsage(AttributeTargets.Method)]
    internal class CompilerIntrinsicAttribute : Attribute { }
}
namespace CompilerIntrinsicsTest
{
    using System;
    using System.Runtime.CompilerServices;
    interface IFoo
    {
        void Bar<T>(T j);
    }
    class Foo : IFoo
    {
        public void Bar<T>(T j)
        {
            Console.WriteLine($"{j}.");
        }
    }
    class Program
    {
        [CompilerIntrinsic]
        private static unsafe extern void* LoadFunctionPointer(Action<string> target);
        [CompilerIntrinsic]
        private static unsafe extern void* LoadVirtualFunctionPointer(Action<int> bar);
        [CompilerIntrinsic]
        static unsafe extern void CallIndirect(IFoo arg0, int arg1, void* functionPointer);
        [CompilerIntrinsic]
        static unsafe extern void CallIndirect(string arg0, void* functionPointer);
        static void Main(string[] args)
        {
            unsafe
            {
                CallIndirect("Hello World from Compiler Intrinsics.", LoadFunctionPointer(PrintMessage));
                var foo = new Foo();
                CallIndirect(foo, 42, LoadVirtualFunctionPointer(foo.Bar<int>));
            }
        }
        private static void PrintMessage(string message)
        {
            Console.WriteLine(message);
        }
    }
} |
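For readers on a stock compiler, the static-method half of the program above (`LoadFunctionPointer` plus `CallIndirect`) can be approximated with C# 9 function pointers, which emit `ldftn` for `&Method` and `calli` for the invocation. This is a minimal sketch, not the prototype package from the comment: it assumes C# 9 or later with `<AllowUnsafeBlocks>true</AllowUnsafeBlocks>`, and `FunctionPointerDemo`, `Shout`, and `CallThroughPointer` are hypothetical names. The `&` operator applies only to static methods, so the `LoadVirtualFunctionPointer` case is not covered here.

```csharp
using System;

// Illustrative only: FunctionPointerDemo and its members are hypothetical names.
public static class FunctionPointerDemo
{
    // A plain static method to take the address of.
    public static string Shout(string message) => message.ToUpperInvariant() + "!";

    public static unsafe string CallThroughPointer(string message)
    {
        // '&Shout' takes the method's address (ldftn); no delegate is allocated.
        delegate*<string, string> pointer = &Shout;

        // Invoking the pointer is an indirect call (calli) using the
        // managed calling convention.
        return pointer(message);
    }

    public static void Main()
    {
        Console.WriteLine(CallThroughPointer("hello")); // HELLO!
    }
}
```

The pointer type carries the full signature, so the compiler checks argument and return types at the call site, which the `void*` parameters in the prototype do not.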
@jkotas @MichalStrehovsky are you sufficiently satisfied with #24621 for Now only the CallIndirect intrinsic has metadata impact. |
Yes. |
This is an alternate proposal for the compiler intrinsics feature: dotnet#191 dotnet/roslyn#11475 This alternate design proposal comes after reviewing a prototype implementation of the original proposal by @mjsabby as well as the use throughout a significant code base. This design was done with significant input from @mjsabby, @tmat and @jkotas.
Closing because LDM has decided to not take this design and instead adopt a more scoped approach. https://github.com/dotnet/csharplang/blob/master/proposals/intrinsics.md |
Background
The C# and VB languages aren't able to express certain IL instructions and patterns (e.g. ldtoken, calli, ldftn, etc.). That's OK; it's not the job of a language to express all possible patterns of the underlying VM. However, there are scenarios where these patterns are very useful. In these cases users are currently left with options that are painful to use, break debugging, break the IDE experience (refactorings etc.), and are hard to maintain. These include various forms of IL rewriting, writing code in plain IL and compiling it to a separate assembly, runtime code generation, etc.
A few examples of such scenarios:
calli, ldftn, etc.
infoof feature could be implemented as a library method if it was possible to directly emit ldtoken instruction
If only Roslyn compilers could provide a way to emit these commonly needed IL patterns without changing the language specification ...
Proposal
A compiler intrinsic is a static extern method declared in compilation source code that is marked with CompilerIntrinsicAttribute and whose name is well known to the compiler.
Each intrinsic provides a way to emit a certain IL instruction or pattern that otherwise can’t be expressed in the C# language, using existing syntax and preserving standard evaluation stack behavior. These intrinsics can thus be used in the middle of ordinary statements and expressions with no adverse effect on debugging, EnC, IntelliSense, refactorings, etc. Intrinsics have well-known names that determine the IL instruction pattern to emit and imply a signature pattern. The specific signature declared by the user, along with the call-site arguments, provides the compiler with all the information it needs to emit the requested IL.
The declaration of an intrinsic won't be emitted to metadata. It can't be, as the type loader would not be able to load the containing type due to missing implementation for the intrinsics.
Additional constraints might need to be enforced on intrinsic call-sites and corresponding errors reported during binding phase. One of the constraints is that some arguments passed to the call-site of an intrinsic method must evaluate to a compile-time constant. This constraint is needed when the generated IL depends on the value of such arguments (e.g. mappedData in AddressOfMappedData).
Currently proposed intrinsics
Generic Intrinsics
It might be useful to allow the declarations to be generic in order to reduce the amount of declarations required in the code, for example
Local Intrinsics
Provided that C# 7 allows local functions to be declared as extern and custom attributes applied to them it would be possible to declare intrinsics "inline" (see #11731), for example:
Possible future intrinsics
The compiler can define any number of intrinsics and add more as needed. The following example demonstrates what would be possible. The exact set of intrinsics to introduce is up for debate.
Note that the following code compiles and works in the current version of C# compiler and IDE services. It just doesn't compile to the desired IL.