Exploration: Shapes and Extensions #164
Replies: 385 comments 29 replies
-
A type class ( |
Beta Was this translation helpful? Give feedback.
-
The only reason I can think of that a declaration such as public extension Comparable<T> of IComparable<T> : SComparable<T> ; is necessary is that not every type that implements |
Beta Was this translation helpful? Give feedback.
-
This looks really interesting. My thoughts:
|
Beta Was this translation helpful? Give feedback.
-
Well, I have to say that this explanation of "Type classes" as shapes and how it could extend anything of a type was what I needed to wrap my head around the concept, at least, instead of needing a Ph. D in functional language jargon. This is now making me pretty excited for the possibilities this could bring once the kinks are worked out. |
Beta Was this translation helpful? Give feedback.
-
I'd rather to wait for proper clr support for "virtual extension methods" (#52) and further, apply after the fact (dotnet/roslyn#8127). This looks too magical and honestly I don't see much value added for "generic numeric operations" specially that code generators can seemlessly support that scenario without performance penalty enforced via this proposal. Meanwhile, a portion of "extension everything" can be implemented right now as it doesn't need voodoo under the hood. |
Beta Was this translation helpful? Give feedback.
-
This proposal seem like completely summarize functionality we requested so far Still there are something I disagree with
Also I still want to propose public static class IntExt : int,SGroup<int>
// must place type to extend before any shapes
// same rule as class : class,interface,interface
{
// same new implementation
// allow static class extended from type can implement non static member
} However this is very interesting transpile implementation |
Beta Was this translation helpful? Give feedback.
-
@alrz I don't know what performance penalty you're referring to. The invocations of shape methods get specialized by the JIT so they become simple non-virtual invocations of static methods. |
Beta Was this translation helpful? Give feedback.
-
@ArlZ Could you elaborate as to how you foresee code generators as seamlessly supporting writing generic numeric operations? What would that even look like? |
Beta Was this translation helpful? Give feedback.
-
Built-in operators don't emit methods, but shapes need a method call anyways. With generators one can overload a method over numeric types, [NumericOverload]
int AddAll(int[] nums) {..} I'd admit that wouldn't be an straightforward generator (and doesn't provide much flexibility) but still possible. I think other use cases like duck typing etc, would be addressed by #52 in which case it doesn't need complex code generation on the compiler side. |
Beta Was this translation helpful? Give feedback.
-
So you'd have to use I think the "just do it with generators instead" argument is kind of a rabbit hole. You could make an argument that many of the features already in C# shouldn't be there because they could be done with generators instead. Of course, doing this means the tooling support is vastly inferior and the code much harder to follow. I think generators are a great feature, but they need to be used carefully and thoughtfully. Speaking from experience using PostSharp, though its powerful it makes it much harder to reason about what's happening with your code. Also, I'm not sure why you keep making a perf argument against this feature. Mads/Gafter already said that tests showed nearly zero perf hit. If that ultimately changes I think they should take that into account but as it stands I'm satisfied to take their word on this. |
Beta Was this translation helpful? Give feedback.
-
The JIT optimizes struct generics very well. In Release build, virtual calls are replaced to non-virtual calls. And in many cases, non-virtual calls are expanded inline. As a result, the performance of the Here is the benchmark: https://gist.github.com/ufcpp/15af6d3d7606fb3771a91c81898dcfa3 |
Beta Was this translation helpful? Give feedback.
-
Regarding disambiguation, some new syntactic support could be handy:
The |
Beta Was this translation helpful? Give feedback.
-
I imagine and hope the answer is yes, but would the following be a valid extension? public extension PointlessExtension<T> of T
where T: class
{
public T PointlessReferenceToSelf => this;
} In short, would the extension be applicable directly to the generic argument? |
Beta Was this translation helpful? Give feedback.
-
@TheOtherSamP How would it work? by reading this proposal it would get implemented like this:
Disregard the fact that it is a struct but can you derive from T today? something like the following might work:
|
Beta Was this translation helpful? Give feedback.
-
@eyalsk That's a fair question, and one I'll confess I gave little thought. I honestly don't know how to best make that work internally, but it feels like a fairly essential capability to me. That's something you can do with current extension methods, and something I use often enough that I know I'd miss it if it were lacking in these new extensions. To give a slightly more useful (while still simple) example: public static T[] ToArrayContaining<T>(this T item) => new []{item}; This is an extension method using the current syntax that uses this capability. Yes that would continue to work, but the ability to do things like this would be nice in the new syntax too. Without it, the new extension features would feel incomplete to me because they're taking a step back, and I'd end up mixing and matching between the two extension method syntaxes in my code. That feels wrong. |
Beta Was this translation helpful? Give feedback.
-
That seems too much ambitious and dangerous for the CLR to ever implement. Matching a shape is not casting and it is done on a per-type basis, not per-instance. The actual type that matches Plus imagine this: shape SCountable
{
int Count { get; }
} Should This overcomplicates a proposal that is intended to bridge the gap between static members and interfaces with complex, inconsistent, and dangerous operations carried out by the CLR. Besides, there are already good dynamic duck-typing libraries that can be used for this purpose, so if a simple piece of syntax is really needed, something like
You are describing a role. Essentially it means a value type can "derive" from any concrete type if it adds no fields and overrides no methods. I like this notion of value type polymorphism, since if even deriving from a generic parameter was possible, it would be a step towards tags. However, I still think this is an extension of the original proposal, and not a completion. Being able to generate a type implementing particular interfaces and delegating all instance calls to static calls is a good feature on its own. The rest is just giving hints to the compiler. |
Beta Was this translation helpful? Give feedback.
-
How do the Extensions get into this list and when? Do Extensions for the base type suffice? Or any of the implemented interfaces? There could be any number of such Extensions in an app, each which does something different. This also doesn't seem to help with the case of attempting to match an arbitrary type to a shape at runtime where no such Extensions would exist unless some developer already established that relationship. Anyway, I don't particularly care how "shapes" (or whatever they end up being called) are implemented as long as they fit the bill of being a named constraint of required members and that it's possible to provide "glue" to make an existing type meet that constraint via extension members or similar, and with a minimum of overhead in the process. It's my opinion that "shapes" be an evolution of interfaces rather than a completely new "type" so that it can be used within the existing ecosystem. |
Beta Was this translation helpful? Give feedback.
-
There's I think a number of separate problems which the shapes/roles/extensions discussion is trying to solve:
I think they've been muddied together which is leading to a lot of confusion. Whilst solutions to these problems might be related, they aren't necessarily. Specifically my proposal above for how to put this in the runtime solves 1 and 2 but not 3. Just traits/shapes in their original compiler only incarnation would solve 1 and 3 but not 2. As an alternative to shapes for solving 3, I think Associated Types could be a reasonable idea For example in #2482 I suggest the following as a way to achieve an allocation free Select method: public interface IEnumerable<T, TEnumerator> where TEnumerator : struct, IEnumerator<T>
{
TEnumerator GetEnumerator();
}
public static SelectEnumerable<TEnumerable, TEnumerator, TSource, TResult> Select<TEnumerable, TEnumerator, TSource, TResult>(this TEnumerable source, Func<TSource, TResult> func) where TEnumerable : struct, IEnumerable<TSource, TEnumerator> where TEnumerator : struct, IEnumerator<TSource>;
public struct SelectEnumerable<TEnumerable, TEnumerator, TSource, TResult> : IEnumerable<TResult, SelectEnumerator<TEnumerator, TSource, TResult>> where TEnumerable : struct, IEnumerable<TSource, TEnumerator> where TEnumerator : struct, IEnumerator<TSource> {...}
public struct SelectEnumerator<TEnumerator, TSource, TResult> : IEnumerator<TResult> where TEnumerator : IEnumerator<TSource> {...} We could use associated types to replace that with: public interface IEnumerable
{
public type Enumerator : struct, IEnumerator;
Enumerator GetEnumerator();
}
public interface IEnumerator
{
public type T;
T Current { get; }
bool MoveNext();
}
public static SelectEnumerable<TResult, SourceEnumerable> Select<SourceEnumerable , TResult>(this SourceEnumerable enumerable, Func<SourceEnumerable.Enumerator.T, TResult> func) where SourceEnumerable : IEnumerable;
public struct SelectEnumerable<TResult, SourceEnumerable> : IEnumerable where SourceEnumerable : IEnumerable
{
type Enumerator = SelectEnumerator<TResult, SourceEnumerable.Enumerator>;
}
public struct SelectEnumerator<TResult, SourceEnumerator> : IEnumerator where SourceEnumerator : IEnumerator
{
type T = TResult;
} This allows us to hide most of the type parameters from the signature of Select. The library author needs to deal with them, but the library consumer can ignore them. This has similar performance implications to shapes, but doesn't require creating a whole new category of type. |
Beta Was this translation helpful? Give feedback.
-
I'd just like to highlight that whichever implementation approach is selected, performance should absolutely be a top priority for this feature. The type system, and method invocation are fundamental features of the language, and any implementation of a feature which invokes a performance penalty for a behavior which is almost indistinguishable from a "free" operation, should be avoided. As a developer, I don't expect a heavy performance penalty for using As a developer, I don't expect a heavy performance penalty for using One objective of this is to facilitate clean, readable code. If I'm doing a code review, I shouldn't get caught out because I didn't know that a developer had changed an Interface into a Shape at some point, and now code which previously was optimal, is now victim to a performance hit. Some areas like gaming, HFT, ML etc were mentioned above - these area care more about performance, than they do about clean code. If we're going to consider these sectors, then we need to consider their actual objectives. Wasting cycles isn't a luxury that can be afforded in these high frequency, or hyper-scale sectors. |
Beta Was this translation helpful? Give feedback.
-
@peter-dolkens, if a neat syntactic disambiguation can be devised, I'm all for it, but if not, I believe your objection shouldn't prevail and block this feature. Unlike C++ and Rust, C# and .NET in general don't hold the tenet of zero cost abstractions (besides, you'd pay in performance overhead only if you actually use this stuff). This would rather be a usage scenario for linters, code colourers and possibly other tools aiding in code review and integration. |
Beta Was this translation helpful? Give feedback.
-
As far as the original approach to this issue goes, a witness struct constrained to an interface has the potential of having zero additional costs, considering there is no boxing and all methods called on the witness are concrete and potentially even inlineable. |
Beta Was this translation helpful? Give feedback.
-
@ByteEater-pl I never said the feature should be blocked, just that it should be implemented in the appropriate place - if that's the CLR, then so be it. If it can be done efficiently outside the CLR, then fine. C# and .Net historically don't hold the tenet of zero cost abstractions - but there has been a massive shift with dotnet core, to the point where I believe even basics like string methods have been rewritten in native c# because they're performing faster than reusing the native c implementations. In regards to linters, etc - C# historically relied heavily on the power of the VS IDE, but again we're seeing a rapid shift away from the heavy reliance on visual studio, and far more emphasis on lightweight, portable code that is accessible, and doesn't require a full VS installation to be effective. dotnet core, and the emphasis on performance is a massive transformation for the language, and company as a whole, and it's seeing a surge of increased interest as a result - we can't be so willing to fall back into the "old ways" just because they're easier, or else we risk losing that interest and excitement that we've spent the last 5 years building. |
Beta Was this translation helpful? Give feedback.
-
The topic of zero-cost abstraction (wrt to this space) is absolutely part of the design discussion. It will absolutely be considered, though we may or may not choose which direction to go here based on our personal thinking on the pros/cons of the different options. |
Beta Was this translation helpful? Give feedback.
-
Why not just call them generic interfaces? If generics don't cover such scenaria they could be extended. Why introduce another separate term and a keyword that means other things in graphics-related code? |
Beta Was this translation helpful? Give feedback.
-
Ideally every interface should be able to define a shape the class already implements. Could glue it on at compile time by extending "using", say "using xx as yy". At runtime could have a special cast (a safe one that uses reflection to check and probably an unsafe one too for already marked-as-unsafe code contexts for speed) |
Beta Was this translation helpful? Give feedback.
-
Could the "efficient" way prove to be an insecure way? |
Beta Was this translation helpful? Give feedback.
-
The compiler would make such if you do the glueing at compile time. Problem is do you trust the calling library to have it? Unless class loader generates it. |
Beta Was this translation helpful? Give feedback.
-
The caller of some method should be the one telling the compiler (or even the runtime in dynamic cases) that they're passing something as an implicit interface implementation, not the method itself say it only accepts shapes or it accepts interfaces and shapes. I also think adding more terminology and have a split world is problematic. What I'd expect is compiletime and runtime generated interface implementation wrappers around objects (and structs) based on caller claims. |
Beta Was this translation helpful? Give feedback.
-
Were relations between multiple types considered as well? public shape SConvertible<TFrom, TTo>
{
TTo Convert(TFrom value);
}
public extension SConvertible<List<TFrom>, List<TTo>> when SConvertible<TFrom, TTo>
{
TTo Convert(TFrom value) => value.Select(Convert).ToList();
} |
Beta Was this translation helpful? Give feedback.
-
This functionality is amazing, but it idolizes the practice of relying on the compiler's optimization skills. Relying on optimizations is especially dangerous for large-scale projects. Just look at C++'s standard templates for example. While functional, things like |
Beta Was this translation helpful? Give feedback.
-
Shapes and Extensions
This is essentially a merger of two other proposals:
These two features have good synergy, and would benefit from being designed together. This proposal is a concrete shot at doing so, knowing full well that there are a myriad of different decisions that could be made. This is not to be particularly opinionated about those choices - it's just easier to understand and discuss a general proposal when it has a concrete shape.
Extensions
The idea behind "extension everything" in most proposals is to use a different approach to declaration syntax from today's "static methods with a
this
modifier", instead providing a type-like declaration with a name, an indication of the type to be extended, and a set of member declarations for that type. This syntactic approach generalizes more easily to other member kinds, including properties, static members and even operators.Here is an example adding and using a static property:
The name of the
extension
declaration (like that of the static class containing an extension method today) is useful primarily for disambiguation purposes. We'll get back to that later.What should extension declarations compile into? The straightforward answer is a static class with static members (taking an extra parameter for the receiver if necessary). However, as we'll see below, this proposal suggests a different approach.
Shapes
Interfaces abstract over the shape of objects and values that are instances of types. The idea behind type classes is essentially to abstract over the shapes of the types themselves instead. Furthermore, where a type needs to opt in through its declaration to implement an interface, somebody else can make it implement a type class in separate code.
In C#, let's call type classes "shapes":
This declaration says that a type can be an
SGroup<T>
if it implements a+
operator overT
, and aZero
static property.As an example, the
int
type is halfway to implementingSGroup<int>
, since it has a+
operator overint
, and above we showed how to use an extension to add a static int-valued propertyZero
. Let's make it so that an extension declaration can also declare that the extended type implements a given shape:This declaration extends
int
not only with theZero
property, but withSGroup<int>
-ness. In the scope of this extension,int
is known to be anSGroup<int>
.In general, a "shape" declaration is very much like an interface declaration, except that it:
That last restriction is important: a shape is not a type. Instead, the primary purpose of a shape is to be used as a generic constraint, limiting type arguments to have the right shape, while allowing the body of the generic declaration to make use of that shape:
So as an important special case, shapes address the long-desired goal of abstracting numeric and computational code over the specific data types being manipulated, while allowing clean use of operators.
Let's call the
AddAll
method with some ints:Clearly we need to check the constraint at the call site. If this is called within the scope of the
IntGroup
extension declaration above, the compiler does indeed know thatT = int
satisfies theSGroup<T>
constraint. However, there is more going on: how does theAddAll
method know howint
is anSgroup<int>
- at the call site? There needs to be more information passed in than just the (inferred)int
type argument and thenumbers
array.Implementation
There is an implementation trick at play here, which is stolen straight out of the type classes proposal referenced above. The trick starts as follows:
In our example, the shape and extension declarations translate into this:
(Instance declarations of operators aren't allowed in C#, so those
+
"methods" are encoded as instance methods, just as today's operator declarations are actually encoded as static methods in IL).Note that the struct encoding of
IntGroup
has a declaration for the+
operator even though the original extension declaration doesn't. It captures what it thinks+
means on ints, and thus fulfills theSGroup<int>
interface.The generic method taking shape-constrained type parameters is translated as follows:
struct
and by the underlying interfaces of those shapesstruct
constraint)Let's see that on the
AddAll
method from above:See how the extra
Impl
type parameter carries the knowledge of how to do+
andZero
into the method. The benefit of doing this with a struct type parameter, rather than, say, extra delegate parameters, is that the runtime does a really good job of optimizing it: it will specialize the generic method code for each different struct it gets called with, so that the method body can inline and optimize the specific+
andZero
implementations. Measurements on the linked type classes proposal show incredibly good performance, with near-zero cost to the abstraction.In general I show the translations instantiating the
Impl
structs once when possible, but it is essentially free to create an instance of an empty struct, so we could also consider instantiating it every single time we need to call a member on it. That's less readable though.Finally, there's a bit of extra work for the call site: It needs to infer and pass that extra type argument:
It infers
T = int
the normal way, and then looks to find exactly one declaration in scope that implementsSGroup<int>
onint
. Finding theIntGroup
extension, it passes its underlying struct type. In case of ambiguities, the original code needs to disambiguate, just as when more than one extension applies elsewhere. We'll get to that later.Implementing shapes directly
Once shapes are in the world, new types will want to implement them directly, instead of via an extension declaration:
This is easily supported by simply a) checking that the type does indeed conform to the shape, and b) generating an extension next to the type declaration, or rather its underlying struct, witnessing the implementation:
Whenever the
Z10
type is in scope, so is the fact that it is anSGroup<Z10>
.Instance members
So far we only explored shapes and extensions for static members, but they should apply equally to instance members.
In order to create the underlying interface for
SComparable<T>
we need to take a page out of the current extension methods feature and add an extra parameter to convey the receiver of theCompareTo
call. What should be the type of that receiver? Well that depends on what type the shape is ultimately implemented on. In other words, we need to give the interface an extra type parameter representing the "this type", and let implementers fill that in:Essentially, any shape that defines instance members needs to also have an extra
This
type parameter.From there on, the translation of generic methods over these shapes is unsurprising. This method:
Translates to this:
The instance method call
result.CompareTo(t)
"on" theresult
gets translated into an instance method callimpl.CompareTo(result, t)
on theimpl
struct, taking the "receiver" as a first parameter.Extending interfaces with shapes
Note that the shape
SComparable<T>
is almost identical to the existing interfaceIComparable<T>
. Obviously there's a completely trivial implementation ofSComparable<T>
on anyT
that implementsIComparable<T>
, and so we can write that implementation once and for all by extending the interface itself:We don't even need to provide a body; the compiler can just figure it out. We just have to say it to make it true (and to declare the underlying struct to "witness" the SComparableness to generic methods).
Under the hood, the compiler translates to:
Shapes in generic types
So far we've seen generic methods with type parameters constrained by shapes. We can do the same for generic classes, where a given type argument gets to come in with its own way of doing certain things.
As an example let's build a
SortedList<T>
whereT
needs to beSComparable<T>
. This will then work both forT
's that inherently implementIComparable<T>
(and henceSComparable<T>
if the previous section is applied), but for otherT
's the instantiator ofSortedList<T>
can apply an extension and imbueT
with a suitable comparison to apply inside of the list (please forgive algorithmic errors! 😀):We can implement this much like we do with generic methods, adding an extra type parameter to pass in the implementation struct. We can even store an instance of that struct in a static field if we want.
One problem here is that the
Impl
type argument becomes part of the type identity of the constructedSortedList
type. So ifSortedList<T>
is constructed with the same explicit type argument in two different places that implementSComparable<T>
with different extensions, those are different constructedSortedList<T>
types! The shape implementation becomes part of the type identity, and if it differs, those types are not interchangeable.Also, generic types can be overloaded on arity, so introducing secret extra type parameters can potentially throw a wrench into families of generic types all differing only on arity.
The type classes proposal linked above actually makes the "implicit" type parameters explicit. This comes with its own problems, but does have the advantage that the number of type parameters shown in source code corresponds to the number in IL.
Extensions on shapes
Using an approach similar to the shape-parametized types above, we can let extensions extend shapes, not just types. Let's say we want to write an extension that offers the trivial implementation of all the comparison operators on everything that implements
SComparable<T>
:Just like the generic methods and types explored above, the underlying struct for this extension needs to have an extra type parameter for the implementation of the
SComparable<T, T>
interface:If that extension is in scope at the declaration of the
Max
method above, the comparison operators can now be used directly:This gets straightforwardly implemented by passing the method's
Impl
type parameter (implementingSComparable
) to theComparison
struct above, instantiating that, and calling its operator implementations:Explicit implementation and disambiguation
This section is a potentially useful tangent, that one can choose to go down only a certain part of the way.
We can consider explicit implementation, akin to what interfaces have, where the shape's members don't show up on the extended types themselves, but only when accessed through the shape directly. For instance, integers can also be viewed as a group under multiplication, but since that would mean implementing
+
as*
andZero
as1
, we would not have those versions show up directly on theint
type:Thus, if both
IntGroup
andIntMulGroup
were in scope,int.Zero
would still yield0
, not1
.When passing an
SGroup
constrained type argument, however, we'd still want to be able to disambiguate whether we meant "int
with addition" or "int
with multiplication".Specifying which shape or extension to use
When there is more than one declaration in scope providing a given member or shape implementation, the compiler cannot automatically infer which one to use. We may be able to give sensible resolution rules that deal with a lot of cases, but there's going to be situations where you want to specify which extension declaration you meant to use.
An approach to this could be to simply allow the name of the extension declaration itself as a type name, with the rough meaning of "same type as the extended type, but give priority to this extension." It's sort of similar to
base
meaning "this type, but start member lookup in the base type":This would also work as an approach to get at explicitly implemented members:
When accessing instance members on a receiver, to get at an explicitly implemented member, or to choose an extension to "view it as", cast the instance to the shape or extension name:
Or maybe it looks better with and
as
expression:Using extensions as types
The number of places where you can use shapes as types is very limited: we've only seen them as constraints and in disambiguating uses. That is because they do not correspond to a single underlying type.
Extensions however, really do correspond to a single underlying type: the one that they extend. We could therefore imagine allowing them to be used as types of fields, parameters, etc. They would then denote, at runtime, the underlying type, but the compiler would know to "view it as" the extension.
Let's again imagine that
PointComparable
explicitly implementsSComparable<Point>
on the typePoint
. But now I want to write code that compares Points all the time, and I don't want to have to cast every single time. Instead, can I just declare that I want to view these particular ints asPointComparable
's?:This translates into:
For public interfaces we would have a way to signal the "overlay" extension type in metadata, e.g. through an attribute.
Extensions as wrapper types
One potentially useful further step to this, is to allow extensions to explicitly implement their own members, not just ones from shapes. What it would mean is, they don't actually expose the member on the underlying extended type, but only when the extension itself is used as the type.
This can be used to create compile time "wrapper types", that compile down to using the underlying type at runtime, but give it an extra face at compile time:
This is an example of giving a typed overlay to something less typed. That appears to be a common scenario, and is the whole basis for e.g. TypeScript's type system. Whether or not this is the right mechanism for it is probably debatable, but it is certainly a mechanism.
Discussion
This is a very high level proposal - it is more than a proof on concept that a design exists, and many details would need to be locked down (and changed) if we want to pursue this, e.g.:
using
ed, or is it in effect just through its presence?@this
parameter?Some issues with the proposal as it currently stands:
Other directions one might explore to achieve some of the same goals:
Looking forward to further discussion of the pros and cons!
Mads
Beta Was this translation helpful? Give feedback.
All reactions