Compiler intrinsics #1685

Merged
merged 3 commits into from
Jul 17, 2018
183 changes: 183 additions & 0 deletions intrinsics.md
@@ -0,0 +1,183 @@
# Compiler Intrinsics

* [x] Proposed
* [ ] Prototype: Not Started
* [ ] Implementation: Not Started
* [ ] Specification: Below
Copy link
Member Author

Will move this to the issue only.


## Summary

This proposal provides language constructs that expose low-level IL opcodes that cannot currently
be accessed efficiently, or at all: `ldftn`, `ldvirtftn`, `ldtoken` and `calli`. These low-level
opcodes can be important in high-performance code and developers need an efficient way to access
them.

## Motivation

The motivations and background for this feature are described in the following issue (as is a
Copy link
Contributor

Typo isssue

potential implementation of the feature):

https://github.com/dotnet/csharplang/issues/191

This alternate design proposal comes after reviewing a prototype implementation of the original
proposal by @mjsabby as well as its use throughout a significant code base. This design was done
with significant input from @mjsabby, @tmat and @jkotas.

## Detailed Design

### Allow address-of to target methods

Method groups will now be allowed as arguments to an address-of expression. The type of such an
expression will be `void*`.
Copy link

curious: why do we prefer "void *" instead of "IntPtr"? If we use IntPtr, we don't need to use unsafe part?

Copy link
Member Author

This is not a safe operation which is part of the motivation for returning a void* here.


``` csharp
class Util {
    public static void Log() { }
}

// ldftn Util.Log
void* ptr = &Util.Log;
```
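For comparison, a function pointer for a static method can already be obtained at run time through reflection, at a much higher cost than a single `ldftn`. The sketch below uses only existing APIs (`MethodInfo.MethodHandle` and `RuntimeMethodHandle.GetFunctionPointer()`). The names `AddressOfSketch` and `GetLogPointer` are hypothetical, and the proposed `&Util.Log` syntax is deliberately not used.

```csharp
using System;
using System.Reflection;

public static class AddressOfSketch
{
    public static class Util
    {
        public static void Log() { }
    }

    // Reflection-based stand-in for the proposed `&Util.Log` (hypothetical helper).
    public static IntPtr GetLogPointer()
    {
        MethodInfo log = typeof(Util).GetMethod(nameof(Util.Log), Type.EmptyTypes)
            ?? throw new MissingMethodException(nameof(Util), nameof(Util.Log));

        // Asks the runtime for an entry point to the method.
        return log.MethodHandle.GetFunctionPointer();
    }
}
```

The returned `IntPtr` carries no signature information, which is the same lack of type safety that motivates `void*` in the proposal.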

Given there is no delegate conversion here the only mechanism for filtering members in the method
Copy link
Member

Do we think that this is going to be a significant bar for existing libraries to use this feature, if they've already got a number of method overloads that they want to distinguish between?

Copy link
Member

Ah, I see you have a section on that below, hadn't gotten there yet.

group is by static / instance access. If that cannot distinguish the members then a compile time
error will occur.

``` csharp
class Util {
    public void Log() { }
    public void Log(string p1) { }
    public static void Log(int i) { }
}

unsafe {
    // Error: Method group Log has more than one applicable candidate.
    void* ptr1 = &Log;

    // Okay: only one static member to consider here.
    void* ptr2 = &Util.Log;
}
```

The address-of expression in this context will be implemented in the following manner:

- ldftn: when the method is static
- ldvirtftn: when the method is an instance method.
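The difference between the two opcodes can be observed today with `System.Reflection.Emit`. The sketch below is an illustration only, not the proposed compiler output: `Animal`, `Dog` and `FtnSketch` are hypothetical names, and the pointer is consumed immediately with `calli` so that no pointer values need to be compared.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public class Animal
{
    public virtual string Speak() => "...";
}

public class Dog : Animal
{
    public override string Speak() => "Woof";
}

public static class FtnSketch
{
    // Emits: load `this`, load a pointer to Speak, then calli through it.
    private static Func<Animal, string> Build(bool useVirtual)
    {
        MethodInfo speak = typeof(Animal).GetMethod(nameof(Animal.Speak))
            ?? throw new MissingMethodException(nameof(Animal), nameof(Animal.Speak));

        var dm = new DynamicMethod("CallSpeak", typeof(string), new[] { typeof(Animal) },
            typeof(FtnSketch).Module, skipVisibility: true);
        ILGenerator il = dm.GetILGenerator();

        il.Emit(OpCodes.Ldarg_0);               // `this` argument for the calli
        if (useVirtual)
        {
            il.Emit(OpCodes.Ldarg_0);           // object whose run-time type is consulted
            il.Emit(OpCodes.Ldvirtftn, speak);  // resolves to the override for that object
        }
        else
        {
            il.Emit(OpCodes.Ldftn, speak);      // always the Animal.Speak implementation
        }
        il.EmitCalli(OpCodes.Calli, CallingConventions.HasThis,
            typeof(string), Type.EmptyTypes, null);
        il.Emit(OpCodes.Ret);

        return (Func<Animal, string>)dm.CreateDelegate(typeof(Func<Animal, string>));
    }

    public static string ViaLdftn(Animal a) => Build(useVirtual: false)(a);

    public static string ViaLdvirtftn(Animal a) => Build(useVirtual: true)(a);
}
```

Given a `Dog` instance, the `ldvirtftn` path dispatches to `Dog.Speak`, while the `ldftn` path calls `Animal.Speak` non-virtually.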
Copy link

Would we allow the emission of ldftn if we know that the method is non-virtual?

Copy link
Member Author

C# always uses callvirt instead of call and I was trying to mirror that here. I don't have any particular reason beyond that.

Copy link
Member

Having a way to do both ldvirtftn and ldftn would let library writers determine whether a method has been overridden by a subclass. This has been used in performance optimization in the past, but it's unavailable to code that doesn't live in the runtime.

Copy link
Member

@jkotas jkotas Jul 2, 2018

BTW: There is nothing that says that the runtime has to return the same pointer from ldvirtftn all the time. .NET Framework / Core do actually return different pointers for the same method.

If the pointer equality is used as an optimization, comparing function pointers will work but not reliably.

If the pointer equality is used for correctness, comparing function pointers is not going to work well. (Your HasOverriddenBeginEndRead example is used as both performance optimization but also to achieve correctness.)

Copy link
Member

@MichalStrehovsky MichalStrehovsky Jul 3, 2018

There is nothing that says that the runtime has to return the same pointer

We could add public API to compare function pointers for equality, couldn't we? Then this optimization would become available to library developers outside of CoreLib (provided we get a ldvirtftn/ldftn distinction in this proposal).

I'm calling this out because the original proposal in dotnet/roslyn#11475 had this, but this one doesn't. But I'll be happy if this makes it into Roslyn even in the currently proposed form.

Copy link
Member Author

@MichalStrehovsky

The difference between these two for non-virtual methods would only be about whether we get a NullReferenceException for a null this, right?

For call and callvirt this also affects binary compatibility. Imagine a method in a utility library changes from non-virtual to virtual and a consuming assembly isn't recompiled. In C# the code in the consuming assembly will automatically begin calling the method as a virtual.

I'm unsure if ldftn and ldvirtftn behave in a similar way. Been trying to track that down but haven't found any good docs on it. Assuming they do then it would be a consideration when deciding what code the address-of & operator generated.

Copy link
Member

@MichalStrehovsky MichalStrehovsky Jul 3, 2018

For call and callvirt this also affects binary compatibility

Isn't this just a side effect, as opposed to being a conscious design choice? When the compiler knows the this is going to be non-null, it will happily generate a call:

    public class Base
    {
        public void Frob()
        {
        }
    }

    class Derived : Base
    {
        public void Call()
        {
            // Emits call
            Frob();
        }

        public static void CallStatic(Derived d)
        {
            // Emits callvirt
            d.Frob();

            // Emits call
            d?.Frob();
        }
    }

Copy link
Member Author

Isn't this just a side effect, as opposed to being a conscious design choice?

It's hard to say whether binary compat was a concern in C# 2.0 (where callvirt became the default over call). At this point though it's a known quantity of the language that we consider when making changes. It's not an unbreakable change by any means but it's a consideration we make.

Copy link
Member Author

I think we're over focusing on the callvirt thing here a bit. It's not a big blocking problem so much as it is a box that needs to be checked. I don't suspect there would be any substantial push back from LDM if we make a different decision for how ldftn / ldvirtftn were chosen.

Copy link

Alright, LGTM.


Restrictions of this feature:

- Instance methods can only be specified when using an invocation expression on a value
- Local functions cannot be used in `&`. The implementation details of these methods are
deliberately not specified by the language. This includes whether they are static vs. instance or
exactly what signature they are emitted with.

Copy link
Member

We should productize the NativeCallableAttribute that will work together with this. This proposal together with NativeCallableAttribute will deliver a complete solution for low-level managed/unmanaged interop for function pointers:

  • address-of + NativeCallableAttribute can be used to get an unmanaged pointer to a method implemented in C#
  • Calli indirect can be used to call an unmanaged pointer

.NET Core supports the NativeCallableAttribute attribute internally, but the feature was not exposed in public contracts and documented.

@jeffschwMSFT @AaronRobinsonMSFT

Copy link
Member Author

What does this attribute do today in the internal support?

Copy link
Member

It adds unmanaged->managed->unmanaged transition to the method prolog/epilog.

Marking the method with NativeCallableAttribute and taking its address is a more efficient low-level equivalent of Marshal.GetFunctionPointerForDelegate. It avoids the delegate wrapping and unwrapping overhead. More details are in the description for dotnet/coreclr#1566

Copy link

Marking the method with NativeCallableAttribute and taking its address is a more efficient low-level equivalent of Marshal.GetFunctionPointerForDelegate

They are not exactly equivalent. The first one doesn't support marshalling (you need to use blittable types); the second one supports marshalling.

Copy link
Member Author

Chatted with @AaronRobinsonMSFT and I better understand how this could interact with the feature. Agree there are benefits. Seems easy to add if the runtime decides to productize this. Will add this to the "future consideration" section.

### handleof

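The `handleof` contextual keyword will translate a field, member or type into their equivalent
`RuntimeHandle` using the `ldtoken` instruction. Here `RuntimeHandle` is a placeholder for the
existing `RuntimeFieldHandle`, `RuntimeMethodHandle` and `RuntimeTypeHandle` types, not a new type.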
Copy link
Member

Just for clarity - I was not familiar with RuntimeHandle - it technically is RuntimeXXXXHandle right? Want to make sure and not assume something new will be created. It might be beneficial to enumerate all of them at least once and then use the RuntimeHandle as the placeholder going forward.

Copy link
Member Author

Correct. I will clarify that here.


The arguments to `handleof` are identical to `nameof`. It must be a simple name, qualified name,
member access, base access with a specified member, or this access with a specified member. The
argument expression identifies a code definition, but it is never evaluated.

The `handleof` expression is evaluated at runtime and has a return type of `RuntimeHandle`. This
can be executed in safe code as well as unsafe.

``` csharp
RuntimeHandle stringHandle = handleof(string);
```
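The handle types that `ldtoken` produces are already reachable through reflection, which gives a rough idea of what `handleof` would return. This is a sketch using existing APIs only; the `HandleSketch` class and its member names are hypothetical, and the reflection lookups shown here are exactly the overhead the proposed keyword would avoid.

```csharp
using System;
using System.Reflection;

public static class HandleSketch
{
    public static int Counter;

    // Roughly what `handleof(string)` would yield.
    public static RuntimeTypeHandle OfType() => typeof(string).TypeHandle;

    // Roughly what `handleof(HandleSketch.Counter)` would yield.
    public static RuntimeFieldHandle OfField()
    {
        FieldInfo field = typeof(HandleSketch).GetField(nameof(Counter))
            ?? throw new MissingFieldException(nameof(HandleSketch), nameof(Counter));
        return field.FieldHandle;
    }

    // Roughly what `handleof(HandleSketch.OfType)` would yield.
    public static RuntimeMethodHandle OfMethod()
    {
        MethodInfo method = typeof(HandleSketch).GetMethod(nameof(OfType))
            ?? throw new MissingMethodException(nameof(HandleSketch), nameof(OfType));
        return method.MethodHandle;
    }
}
```

Each handle can be turned back into its reflection object with `Type.GetTypeFromHandle`, `FieldInfo.GetFieldFromHandle` and `MethodBase.GetMethodFromHandle` respectively.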

Restrictions of this feature:

- Properties cannot be used in a `handleof` expression.
- The `handleof` expression cannot be used when there is an existing `handleof` name in scope. For
Copy link
Member

Typo: handelof

example a type, namespace, etc ...

### calli

The compiler will add support for a new type of `extern` function that efficiently translates into
Copy link
Member

Typo: efficients

a `calli` instruction. The `extern` function will be marked with an attribute of the following
Copy link
Member

Typo: exten

shape:

``` csharp
[AttributeUsage(AttributeTargets.Method)]
public sealed class CallIndirectAttribute : Attribute
{
    public CallingConvention CallingConvention { get; set; }

Should be read-only. Design guidelines say that attribute arguments should be either positional or named, not both.

    public CallIndirectAttribute(CallingConvention callingConvention)
    {
        CallingConvention = callingConvention;
    }
}
```

This allows developers to define methods in the following form:

``` csharp
[CallIndirect(CallingConvention.Cdecl)]
Copy link
Member

@jkotas jkotas Jul 2, 2018

This enum does not have a value for regular managed calling convention. (@mjsabby use case and some other use cases needs the managed calling convention).
I think we may need to have two attributes:

  • CallIndirectAttribute for regular managed calls that does not have CallingConvention.
  • UnmanagedCallIndirectAttribute for unmanaged calls that does have CallingConvention

An alternative is to add a value for the managed calling convention to the CallingConvention enum. I have rejected it because the new value would be invalid in all existing APIs that use the CallingConvention enum today. All these APIs are for unmanaged calls.

Copy link

add a value for the managed calling convention to the CallingConvention enum

Adding a new value to CallingConvention seems better than using two attributes. It is simpler and easier to understand.

Copy link
Member

@luqunl I agree in principle that two attributes are more annoying - however, I agree with @jkotas that adding it to the enum is less than ideal.

Another approach that would work from the API perspective would be to make the default constructor for CallIndirectAttribute specify a sentinel value of sorts that indicates managed convention - a default that probably makes sense - and another constructor that takes a CallingConvention enum value. In this case the default is 'managed to managed' and if a calling convention is used the intent is to leave the managed world so let us know what is being called.

The nice aspect about this is that the sentinel value can be anything and doesn't require breaking the API boundary.

Copy link
Member Author

Let me add an open issue to track this down. The language design can proceed while we figure this out.

Copy link

We should disallow a DllImport method with the CallIndirect attribute.

static extern int MapValue(string s, void *ptr);
Copy link
Member

Is the C# compiler going to emit this method into the metadata, or will it disappear at source compilation time? The runtime typically won't load a type that has an extern method without a DllImport or some other special marking.

What happens if I try to reference this method in a non-call context (try to construct a delegate to this or get a RuntimeHandle of it)?

Copy link

The current implementation generates method bodies for this reason, and at least for the managed calling convention the calli gets inlined. Or would you like roslyn to prevent the cases you're mentioning and not emit metadata?

Copy link
Member

I was just wondering if it's going to be actually emitted as extern, or extern was just a spare keyword.

Copy link
Member Author

@MichalStrehovsky the extern keyword here indicates the method will not have a body that is supplied by the developer. Instead it will be supplied by the compiler, or the runtime, depending on how it's emitted. In this case as @mjsabby indicates we'll be putting the calli implementation in the body here.

This is still being fleshed out a bit and is more compiler centric vs. language centric. Hence I didn't dive too deep into that here.

Copy link

For calli, Will we support generic method? also named arguments?
such as Int MapValue<T1, T2>(T1 a1, T2 a2, void *p)?

Copy link
Member Author

Unsure about supporting a generic method. As long as it's supported in the runtime I think we can make it work fine in the language.

As for named arguments: yes. Likely optional as well.

Copy link
Member

Generic calli is not supported by the runtime.

Copy link
Member

I think @luqunl meant support for this:

.assembly foo { }

.method static !!0 Calli<TResult>(native int)
{
    ldarg.0
    calli !!0 ()
    ret
}

.method static int32 Test()
{
    ldc.i4 42
    ret
}

.method static int32 Main()
{
    .entrypoint
    ldftn int32 Test()
    call !!0 Calli<int32>(native int)
    ret
}

Rather than generating a calli whose signature is generic (something like calli !!0<!!0>()).

This seems to work fine.

Allowing this on generic methods would save quite a bit of typing so I would vote for not blocking this.

Copy link
Member

I agree that it would be nice to allow it - if we do the matching work in CoreCLR and other runtimes to support it.

Copy link

Allowing this on generic methods would save quite a bit of typing so I would vote for not blocking this.

Yes. Interop would love to support this. Currently MCG and System.Private.Interop will define/generate a lot of calli helper methods, similar to the following

internal static void HasThisCall__9<TArg0>(
object __this,
global::System.IntPtr pfn,
TArg0 arg0)
{
// This method is implemented elsewhere in the toolchain
}

Copy link
Member

@jkotas jkotas Jul 11, 2018

would save quite a bit of typing

I do not think that the extra typing is a problem here. This is a low-level feature where saving typing is not important. Also, this would be typically auto-generated, not typed by humans.

I would look at the potential support for generic unmanaged calli as second phase of this feature. Generic unmanaged calli would make this feature significantly more complex because it would depend on a runtime feature that does not exist yet.

Copy link
Member Author

I would look at the potential support for generic unmanaged calli as second phase of this feature.

FWIW: expanding this to include generic support at a later date is basically a bug fix level change. I agree a second phase is a good landing place for this.

Generic unmanaged calli would make this feature significantly more complex because it would depend on a runtime feature that does not exist yet

It adds complexity yes but not significant complexity to the compiler. The compiler is already setup to look at RuntimeFeatures to enable / disable items we know depend on runtime changes. As long as the fix in CoreClr included an entry here we could depend on it.

That being said I think we can move forward for now with generic disabled and revisit once we make a bit more progress. It won't really need language buy off at that point, just keyboard time to get the change in.


unsafe {
    var i = MapValue("42", &int.Parse);
    Console.WriteLine(i);
}
```
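Until such an `extern` method exists, the same shape can be approximated with a `DynamicMethod` whose body is a single `calli`. This is a minimal sketch under stated assumptions: it uses the managed calling convention (because `int.Parse` is a managed method) rather than the `CallingConvention.Cdecl` shown above, it takes an `IntPtr` instead of `void*` so that no unsafe context is needed, and `CalliSketch` and `ParseViaCalli` are hypothetical names.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class CalliSketch
{
    // Hand-built stand-in for: static extern int MapValue(string s, void* ptr);
    private static readonly Func<string, IntPtr, int> s_mapValue = BuildMapValue();

    private static Func<string, IntPtr, int> BuildMapValue()
    {
        var dm = new DynamicMethod("MapValue", typeof(int),
            new[] { typeof(string), typeof(IntPtr) },
            typeof(CalliSketch).Module, skipVisibility: true);
        ILGenerator il = dm.GetILGenerator();

        il.Emit(OpCodes.Ldarg_0);   // argument forwarded to the target
        il.Emit(OpCodes.Ldarg_1);   // function pointer consumed by calli
        il.EmitCalli(OpCodes.Calli, CallingConventions.Standard,
            typeof(int), new[] { typeof(string) }, null);
        il.Emit(OpCodes.Ret);

        return (Func<string, IntPtr, int>)dm.CreateDelegate(typeof(Func<string, IntPtr, int>));
    }

    public static int MapValue(string s, IntPtr ptr) => s_mapValue(s, ptr);

    // Stand-in for: MapValue("42", &int.Parse)
    public static int ParseViaCalli(string s)
    {
        MethodInfo parse = typeof(int).GetMethod(nameof(int.Parse), new[] { typeof(string) })
            ?? throw new MissingMethodException(nameof(Int32), nameof(int.Parse));
        return MapValue(s, parse.MethodHandle.GetFunctionPointer());
    }
}
```

Nothing checks that the pointer matches the `int (string)` signature baked into the `calli`; passing any other pointer is undefined behavior, which is why the proposal restricts the feature to unsafe code.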

## Considerations

### Disambiguating method groups

There was some discussion around features that would make it easier to disambiguate method groups
passed to an address-of expression. For instance, potentially adding signature elements to the
Copy link
Member

addressof <- should be surrounded by back ticks (`)

Copy link
Member Author

It's not code though, it's a language concept.

Copy link
Member

I think some people use it to highlight keywords as well 😄

Copy link
Member

(whether language keywords or words that are key to the sentence)

Copy link
Member

If this is a language concept, then it should be hyphenated "address-of" (i.e. 'addressof' is not a word).

syntax:

``` csharp
class Util {
    public static void Log() { ... }
    public static void Log(string) { ... }
}

unsafe {
    // Error: ambiguous Log
    void *ptr1 = &Util.Log;

    // Use Util.Log();
    void *ptr2 = &Util.Log();
}
```

This was rejected because a compelling case could not be made nor could a simple syntax be
Copy link
Member

Did we consider allowing a "cast"? ie, something like &((Action<string>)Util.Log)?

envisioned here. Also there is a fairly straightforward workaround: simply define another
method, possibly a local function, that is unambiguous and uses C# code to call into the
Copy link
Member

This statement:

possible local function

contradicts statement above:

Local functions cannot be used in &. The implementation details of these methods are
deliberately not specified by the language. This includes whether they are static vs. instance or
exactly what signature they are emitted with.

desired function.

``` csharp
unsafe {
    static void LocalLog() => Util.Log();
Copy link
Member

@tmat tmat Jul 2, 2018

Isn't this an example of a static local function discussed in Future Considerations below?

Copy link
Member Author

It is. I was trying to keep the sample concise. I will make it clearer by using a plain old static.

    void* ptr = &LocalLog;
}
```
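Reflection has the same overload-ambiguity problem when a method is looked up by name alone, and the same wrapper-method fix applies there. The sketch below is only an analogy built on existing APIs, not the proposed `&` operator; `OverloadSketch` and `LogNoArgs` are hypothetical names, and the wrapper is a plain static method rather than a local function.

```csharp
using System;
using System.Reflection;

public static class OverloadSketch
{
    public static class Util
    {
        public static void Log() { }
        public static void Log(string message) { }
    }

    // Unambiguous wrapper that forwards to the overload we actually want.
    public static void LogNoArgs() => Util.Log();

    // Looking up "Log" by name alone cannot pick between the two overloads.
    public static bool LookupByNameIsAmbiguous()
    {
        try
        {
            typeof(Util).GetMethod("Log");
            return false;
        }
        catch (AmbiguousMatchException)
        {
            return true;
        }
    }

    // The wrapper has a unique name, so the lookup succeeds.
    public static IntPtr WrapperPointer()
    {
        MethodInfo wrapper = typeof(OverloadSketch).GetMethod(nameof(LogNoArgs))
            ?? throw new MissingMethodException(nameof(OverloadSketch), nameof(LogNoArgs));
        return wrapper.MethodHandle.GetFunctionPointer();
    }
}
```

The cost of the workaround is one extra call frame unless the JIT inlines the forwarded call.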

### LoadTypeTokenInt32

The original proposal allowed for metadata tokens to be loaded as `int` values at compile time.
Essentially, it added a `tokenof` expression that takes the same arguments as `handleof` but is
evaluated at compile time to an `int` constant.

This was rejected as it causes significant problems for IL rewriters (of which .NET has many). Such
rewriters often manipulate the metadata tables in a way that could invalidate these values. There
is no reasonable way for such rewriters to update these values when they are stored as simple
`int` values.
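The fragility is easier to see with the raw token that reflection already exposes. In the sketch below (existing APIs only; `TokenSketch` and `Marker` are hypothetical names), the token is a plain `int` whose top byte names a metadata table and whose low three bytes are a row number in that table. It is only meaningful relative to one specific module image, so a rewriter that renumbers rows silently changes what the constant refers to.

```csharp
using System;
using System.Reflection;

public static class TokenSketch
{
    public sealed class Marker { }

    // The raw metadata token for Marker: an int with no tie to the metadata tables.
    public static int TokenOfMarker() => typeof(Marker).MetadataToken;

    // The top byte selects the metadata table (0x02 is the TypeDef table).
    public static int TableIndex(int token) => (int)((uint)token >> 24);

    // The low three bytes are the row number within that table.
    public static int RowNumber(int token) => token & 0x00FFFFFF;

    // A token can only be resolved against the module it came from.
    public static Type ResolveInOwnModule(int token) =>
        typeof(Marker).Module.ResolveType(token);
}
```

A handle obtained at run time through `ldtoken` does not have this problem, because the rewriter sees the metadata reference in the instruction and can fix it up.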

The underlying idea of having an opaque handle for metadata entries will continue to be explored
by the runtime team.

## Future Considerations

### static local functions

This refers to [the proposal](https://github.com/dotnet/csharplang/issues/1565) to allow the
`static` modifier on local functions. Such a function would be guaranteed to be emitted as
`static` and with the exact signature specified in source code. Such a function should be a valid
argument to `&` as it contains none of the problems local functions have today.