Proposal: Language support for Tuples #347
Comments
So much 👍. A lot of useful applications for this.
int counter;
bool shouldDoThing;
try {
(counter, shouldDoThing) = MyMethod(param);
} catch (Exception ex) {
// Handle exception
}
Tuple members in scope sounds very useful. How would it handle cases like
I think I'm against having a Tuple be able to be implicitly "splat". I'd be a much bigger fan of a JavaScript-esque spread operator, so rather than
public double Avg(int sum, int count) => count==0 ? 0 : sum/count;
Console.WriteLine($"Avg: {Avg(Tally(myValues))}");
you'd do
public double Avg(int sum, int count) => count==0 ? 0 : sum/count;
Console.WriteLine($"Avg: {Avg(...Tally(myValues))}");
or something similar. In the grand scheme of things though, this may not be nearly as confusing to developers as I'm thinking. |
|
I think the proposal is generally good; however, some issues that came to mind:
|
Actually simple value types may not incur the cost associated with copying as they are good candidates for inlining. |
I think simple cases like |
Can we use "empty tuple"? If the empty tuple
() M() => (); // instead of void M() {}
Func<()> f = () => (); // instead of Action |
@ufcpp: Great idea! |
Just a thought, but is it possible to use the
could be transformed to
Of course, this would not work straight with generics, but I imagine that problem is rather close to the object/dynamic differentiation that is already implemented. This idea does mean that the return value is not a value type, but it does solve the unification between assemblies issue. |
@erik-kallen The disadvantages of the existing tuple types are that
|
@gafter I probably wasn't clear enough when I wrote what I did, but my intention was that when the compiler encounters these special attributes it could create a temporary type which has a runtime type of
I acknowledge the reference type thing to be an issue in my idea, though. |
@erik-kallen We're actively exploring the idea of "erasing" the member names using attributes as you suggest. However we're leaning toward using struct versions of tuples. |
How would this be supported in lambdas? For example, I have this method:
With type-inference, the argument looks like this:
Is the return type always inferred? If not, you get something really weird:
|
@RyanBL They would be target-typed. In the absence of a target type |
Regardless of being a value or reference type, like @erik-kallen, I envision tuples to be like System.Tuple... The difference between tuples and anonymous types is that anonymous types have properties with well defined names and types that you can pass around to libraries like ASP.NET routing and Dapper, whereas tuples have well defined properties (Item1, Item2, ...) that can be aliased at compile time. However, in order to make them useful for function return values, the compiler could apply an attribute with the compile time names, just as parameter names are part of the function signature. |
Another thought: while not directly related to tuples, the idea that the compiler can provide a strongly-named way of using tuples (avoiding Item1 and Item2) seems like it could be extended to making a more strongly named sort of dictionary (where the items have more meaningful names than .Key and .Value). I often run into situations where I need to index some collection on more than one dimension and you're forced to either build custom types or use multiple dictionaries which might share the exact same type signature. If that happens then the only differentiation is in the variable name and possibly XML comments. Example:
class Baz {
public String LongName {get; private set;}
public String ShortName {get; private set;}
...
}
void Foo(IEnumerable<Baz> enumerable) {
var lookup = enumerable.ToDictionary(e => e.ShortName, e => e);
...
Bar(lookup);
}
void Bar(IDictionary<String, Baz> lookup) {
//Now what did lookup index by? LongName or ShortName? I need to check calling code or documentation to know.
} |
@MgSam This proposal explicitly describes support for tuples with named members. |
I feel it would be important for tuples to work across assemblies. I could think of a few places in most of the projects I've had at work that would be improved by this proposal, and in almost every case it's in an interface implemented in one assembly and consumed in another. Whether or not the existing Tuple classes are used, serious consideration should be made that this proposal allows for exposure of tuple returns across assemblies. Maybe it'd be good to add struct analogues to the existing classes? |
@gafter I know, I just thought the problem was similar enough to warrant mentioning here. There really isn't much of a distinction on GitHub between proposal threads and discussion threads, and in any case I don't have a good enough solution in mind to make a separate proposal. |
Some of this has been said, so some of this is +1 to those comments :-). My thoughts (albeit colored by a decade plus of hacking Common Lisp) ...
Do not conflate immutability and tuples. Let me decide independently how they are used, because I know sometimes I want to mutate some state.
Do use structs, because a primary use is Multiple Value Return (MVR), and that should be efficient. In the same way spurious boxing affected Roslyn perf, an inner loop using MVR could become a perf impact in a large but responsive application.
I can't believe I'm asking for more syntax, but please consider "return values(...)", akin to yield return, where I make it very clear to the reader that I'm returning multiple values. Though I do admit the parentheses and no comma op is almost manifest enough, I feel I want "return values" :-).
Tuples will definitely be used beyond MVR. We often have little carrier or terd types that suck to have to name and make more first class. Consider that in one place in my program I need a list of Foo's, and it turns out I'd like to timestamp the items in the list. The timestamp is meta to the abstraction of being a Foo, and I really don't want to declare FooWithTimeStampForSinglePlaceUsage :-). Note too that in this scenario I want to mutate the timestamp with user activity.
For productivity, do support partially supplied tuple elements, and allow me to deconstruct to fewer vars than elements. Perhaps in the MVR case I can declare default values for elements if they are not supplied, or you just default to default(T). This works well too with your idea of using named-arg-like syntax for filling in SOME of the elements and defaulting others. OFTEN when using MVR you do not need all the values returned, because the extra values are only sometimes helpful. Imagine Truncate returns the floor of x/y and a remainder, but 90% of the time I just need the first value of the integer division.
It may be too much for C#, but I'd also consider that if I have a deconstructing binding site with more variables declared than the expression returns, then you just fill in my extra values with default(T) ... I'll fill them in with actual values in the next few lines of code, but now I have the tuple I want already in hand without an intermediary needed.
I didn't think too deeply, but it seems some sort of unification across assemblies is needed for a smooth MVR experience (where these may be used the most). I'd also unify with anon types, at least to the extent of implicit conversions (note, this would be for productivity coding, but yes, too much convenience in the hands of the masses can lead to too much inefficiency in code :-)).
I really think what you call "tuple members in scope" is VERY NOT C#. It smacks of an expression based language (which C# is not) where falling off functions returns the last value of the last expression of whatever branch you were in. It is also very subtle for C#, and I think the MVR feature should be a bit more explicit, like 'ref', for ready readability.
I like adding splatting, but I think it should be explicit (a la funcall vs. apply in Common Lisp, or * in Python). I get that we already do some not readily readable resolution around params arrays, but I'd strongly consider breaking from the precedent here for manifest readability. Thanks for listening! |
It would be better if:
But for the anonymous definitions, just convert to Item1, Item2, ..., ItemN:
Do we have this function now? |
@MaleDong That is how tuples would behave according to this proposal and as implemented in the previews/RC. If a function or property returns a named tuple then the compiler will allow you to reference the elements of that tuple by name. |
@gafter Any conclusions drawn from the exploration of that idea? I'm guessing it's not making it into C# 7 but was anything deferred to be revisited in the future? |
@atifaziz all tuple names get erased. They are not part of the underlying type. The names are encoded in attributes on the API declarations that use the tuples. |
@mattwar Yeah I get that.
Interesting and that's what I was hoping but I couldn't find these attributes in a test assembly with tuples that I compiled using the |
@atifaziz It is encoded in System.Runtime.CompilerServices.TupleElementNamesAttribute |
Thanks @mattwar! Exactly the info I was looking for & glad to see it made the cut. |
Being able to return multiple values without using ref/out parameters would be nice. So would be the ability to pass around things without creating single-use "model" classes. But overall I feel this proposal adds too much new syntax while solving a rather small set of problems. Too much cost/complexity for too little gain.
Most Importantly: Maintainability
Positional assignments, deconstruction, mutability and conversions seem like features that could easily lead to severe maintainability issues. Actually, I will make a stronger statement: if introduced, they certainly will lead to severe maintainability issues, especially in large corporate applications. I'm saying this as someone who maintains several medium-size legacy apps in C#. Imagine dealing with implicit casts inside deconstruction statements. Or code that is using tuples to store and manipulate things throughout a 50-line method with multiple loops. This will be done, and this will be done a lot.
Generality
This proposal really describes several distinct features, and I think each of those features could be implemented in a more generic and broadly useful way. For example, I often use current Tuples to pass composite models to MVC views. Unlike viewbags, this ensures type safety, but in the process we lose all the attribute names. Inline type declarations really should solve this, but the current proposal doesn't seem to address this use case, or other similar use cases.
... Deconstruction sounds like a way to "capture" multiple assignments in one statement. But will it solve a rather common problem of partial object copy?
Ideally, this should be possible without repeating names.
But it sounds like deconstruction like this will not work without an explicit Deconstruct implementation, and creating one would defeat the whole point. ... Tuples won't help me when I'm calling
Etc, etc. In short:
Confusing Syntax
Having syntax where method calls, tuple literals, inline type declarations and tuple deconstruction look the same will be visually confusing, and probably lead to incorrect mental models when new people learn the language. It's not like we're in Lisp where everything truly is a list. We're talking about four drastically different things here.
Discrepancies with current syntax
On the other hand, the proposed syntax is very different from what we already have in C# for very similar features.
Current type definitions:
Proposal for what constitutes inline type definition:
Current multi-valued return method:
This proposal:
Implementation details aside, anonymously typed objects are intuitively similar to tuples: strongly typed entities without a "proper" class. Here is their initialization right now:
Here are tuple "equivalents":
Note that anonymous object initializer #2 captures names, while tuple return positionally assigns values, which is dangerous if we have a tuple with several objects of the same type.
C# 6.0 added a dictionary initialization syntax, which isn't strongly typed, but also deals with names and values, just like tuples: |
@reinventor, Please create a new issue about it, preferably new issue per feature or improvement. |
@reinventor, Dictionary initializers are a C# 3.0 feature. What you're mentioning are index initializers, which were introduced in C# 6. Dictionary initializers rely on the implementation of
Objects of this type:
can be initialized with the dictionary initializer syntax:
And objects of this type:
can be initialized with indexer initializer syntax:
Yeah! Nitpicking! I know! 😄 |
I ran into two things playing with tuples today (VS2017 RC). The first is that there is no warning if you return a named tuple that does not match the name order of the return-type tuple:
public static (int sum, int count) Tally( IEnumerable<int> items )
{
var tally = (count: 0, sum: 0); // <- Whoops
foreach( var item in items )
{
++tally.count;
tally.sum += item;
}
return tally; // <- No warning that names are mismatched.
}
While I understand some might want this behavior, it seems very error prone. So the current recommendation seems to be not to use tuples as local variables, as the following will at least generate a warning:
return (count:count, sum:sum); // Warns that 'count'/'sum' mismatch return tuple.
If it was possible to declare a local variable using the return type, something like:
decltype(return) tally; // tally is declared as a (int sum, int count) and tally.sum / tally.count can be used.
....
return tally;
then this would at least eliminate some potential errors. I think #3281 already talks about a C++-like decltype keyword. It would also apply to declaring a local tuple of the same type as a parameter. Maybe I missed a better way to do this.
The other minor issue is that in VS 2017 I think that the Formatting/Spacing rules need updating to include support for tuples, as they are currently ignoring the normal ( )'s spacing rules. |
Is there any chance that this will eventually be supported? I'm running into scenarios where tuple literals end up twice as long because I'm just repeating long names.
var reallyLongNameA = 1;
var reallyLongNameB = 2;
var tuple = (reallyLongNameA, reallyLongNameB);
Console.WriteLine(tuple.reallyLongNameA); // Errors, only tuple.Item1 is valid

var reallyLongNameA = 1;
var reallyLongNameB = 2;
var tuple = (reallyLongNameA: reallyLongNameA, reallyLongNameB: reallyLongNameB); // This works, but it seems unnecessary.
Console.WriteLine(tuple.reallyLongNameA); |
Along same lines, what does a |
|
@HaloFour Okay, so just that simple, ey? So in a real sense, Tuple returns are the answer to returned anonymous types? |
Playing with Visual Studio 2017, I just discovered that, while I can do
var (a,b) = SomethingThatReturnsATuple();
I cannot do
I don't see a reason why; var and let should have the same destructuring capabilities. |
@tec-goblin There is a proposal to support that: #15074. |
Tuples have been added to C# 7.0. We are tracking it at dotnet/csharplang#59 |
There are many scenarios where you'd like to group a set of typed values temporarily, without the grouping itself warranting a "concept" or type name of its own.
Other languages use variations over the notion of tuples for this. Maybe C# should too.
This proposal follows up on #98 and addresses #102 and #307.
Background
The most common situation where values need to be temporarily grouped, a list of arguments to (e.g.) a method, has syntactic support in C#. However, the probably second-most common, a list of results, does not.
While there are many situations where tuple support could be useful, the most prevalent by far is the ability to return multiple values from an operation.
Your options today include:
Out parameters:
This approach cannot be used for async methods, and it is also rather painful to consume, requiring variables to be first declared (and var is not an option), then passed as out parameters in a separate statement, then consumed.
On the bright side, because the results are out parameters, they have names, which help indicate which is which.
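A minimal sketch of the out-parameter option, assuming a hypothetical Tally operation that sums and counts a sequence (the class and method names here are illustrative only, not taken from the proposal):

```csharp
using System.Collections.Generic;

public static class OutParameterSketch
{
    // Hypothetical Tally: reports both results through out parameters.
    public static void Tally(IEnumerable<int> values, out int sum, out int count)
    {
        sum = 0;
        count = 0;
        foreach (var value in values)
        {
            sum += value;
            count++;
        }
    }

    public static string Describe(IEnumerable<int> values)
    {
        // The variables are declared first, with explicit types...
        int sum;
        int count;
        // ...then passed as out parameters in a separate statement...
        Tally(values, out sum, out count);
        // ...and only then consumed.
        return $"Sum: {sum}, count: {count}";
    }
}
```

The three-step shape of Describe (declare, call, consume) is the consumption cost described above; the names sum and count are the one thing this option gets right.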
System.Tuple:
This works for async methods (you could return Task<Tuple<int, int>>), and you only need two statements to consume it. On the downside, the consuming code is perfectly obscure - there is nothing to indicate that you are talking about a sum and a count. Finally, there's a cost to allocating the Tuple object.
Declared transport type
This has by far the best consumption experience. It works for async methods, the resulting struct has meaningful field names, and being a struct, it doesn't require heap allocation - it is essentially passed on the stack in the same way that the argument list to a method.
The downside of course is the need to declare the transport type. The declaration is meaningless overhead in itself, and since it doesn't represent a clear concept, it is hard to give it a meaningful name. You can name it after the operation that returns it (like I did above), but then you cannot reuse it for other operations.
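The System.Tuple and declared-transport-type options can be sketched side by side. This is an illustration under assumptions: the Tally operation and the TallyResult name are hypothetical stand-ins, not the proposal's own code.

```csharp
using System;
using System.Collections.Generic;

public static class ReturnTypeSketch
{
    // System.Tuple: allocates an object, and consumers only see Item1/Item2.
    public static Tuple<int, int> TallyAsTuple(IEnumerable<int> values)
    {
        int sum = 0, count = 0;
        foreach (var value in values) { sum += value; count++; }
        return Tuple.Create(sum, count);
    }

    // Declared transport type (hypothetical name), named after the operation.
    public struct TallyResult
    {
        public int Sum;
        public int Count;
    }

    public static TallyResult TallyAsStruct(IEnumerable<int> values)
    {
        var result = new TallyResult();
        foreach (var value in values) { result.Sum += value; result.Count++; }
        return result;
    }
}
```

A caller of TallyAsTuple reads t.Item1 and t.Item2, while a caller of TallyAsStruct reads r.Sum and r.Count; the second is clearer, but it costs a type declaration that is tied to this one operation.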
Tuple syntax
If the most common use case is multiple results, it seems reasonable to strive for symmetry with parameter lists and argument lists. If you can squint and see "things going in" and "things coming out" as two sides of the same coin, then that seems to be a good sign that the feature is well integrated into the existing language, and may in fact improve the symmetry instead of (or at least in addition to) adding conceptual weight.
Tuple types
Tuple types would be introduced with syntax very similar to a parameter list:
The syntax (int sum, int count) indicates an anonymous struct type with public fields of the given names and types.
Note that this is different from some notions of tuple, where the members are not given names but only positions. This is a common complaint, though, essentially degrading the consumption scenario to that of System.Tuple above. For full usefulness, tuple members need to have names.
This is fully compatible with async:
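As a rough sketch of a tuple return type, in a synchronous and an async method, the following uses the tuple syntax that later shipped in C# 7.0 rather than any code from the proposal itself. Tally and TallyAsync are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class TupleTypeSketch
{
    // The return type reads much like a parameter list.
    public static (int sum, int count) Tally(IEnumerable<int> values)
    {
        int sum = 0, count = 0;
        foreach (var value in values) { sum += value; count++; }
        return (sum, count);
    }

    // The same tuple type works as the result of a Task.
    public static async Task<(int sum, int count)> TallyAsync(IEnumerable<int> values)
    {
        await Task.Yield(); // stands in for real asynchronous work
        return Tally(values);
    }
}
```

A consumer can then write var t = TupleTypeSketch.Tally(values); and read t.sum and t.count by name.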
Tuple literals
With no further syntax additions to C#, tuple values could be created as
Of course that's not very convenient. We should have a syntax for tuple literals, and given the principle above it should closely mirror that of argument lists.
Creating a tuple value of a known target type should enable leaving out the member names:
Using named arguments as a syntax analogy it may also be possible to give the names of the tuple fields directly in the literal:
Which syntax you use would depend on whether the context provides a target type.
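The two literal forms can be sketched as follows, again in the syntax that eventually shipped in C# 7.0, which may differ in detail from what was proposed here. The class and method names are illustrative assumptions.

```csharp
public static class TupleLiteralSketch
{
    public static (int sum, int count) TargetTyped()
    {
        // The declared type supplies the member names, so the literal omits them.
        (int sum, int count) t = (0, 0);
        t.sum += 5;
        t.count++;
        return t;
    }

    public static int NamedInLiteral()
    {
        // No target type here: the names come from the literal itself.
        var t = (sum: 10, count: 2);
        return t.sum / t.count;
    }
}
```

The first form mirrors positional arguments and the second mirrors named arguments, which is the analogy the text draws.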
Tuple deconstruction
Since the grouping represented by tuples is most often "accidental", the consumer of a tuple is likely not to want to even think of the tuple as a "thing". Instead they want to immediately get at the components of it. Just like you don't first bundle up the arguments to a method into an object and then send the bundle off, you wouldn't want to first receive a bundle of values back from a call and then pick out the pieces.
Languages with tuple features typically use a deconstruction syntax to receive and "split out" a tuple in one fell swoop:
This way there's no evidence in the code that a tuple ever existed.
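A small sketch of deconstruction at the call site, assuming the C# 7.0 form of the feature and a hypothetical Tally helper:

```csharp
using System.Collections.Generic;

public static class DeconstructionSketch
{
    // Hypothetical helper returning two results as a tuple.
    static (int sum, int count) Tally(IEnumerable<int> values)
    {
        int sum = 0, count = 0;
        foreach (var value in values) { sum += value; count++; }
        return (sum, count);
    }

    public static double Average(IEnumerable<int> values)
    {
        // Receive the result and split it into two new locals in one statement.
        var (sum, count) = Tally(values);
        return count == 0 ? 0 : (double)sum / count;
    }
}
```

Inside Average no tuple-typed variable is ever named; only sum and count appear.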
Details
That's the general gist of the proposal. Here are a ton of details to think through in the design process.
Struct or class
As mentioned, I propose to make tuple types structs rather than classes, so that no allocation penalty is associated with them. They should be as lightweight as possible.
Arguably, structs can end up being more costly, because assignment copies a bigger value. So if they are assigned a lot more than they are created, then structs would be a bad choice.
In their very motivation, though, tuples are ephemeral. You would use them when the parts are more important than the whole. So the common pattern would be to construct, return and immediately deconstruct them. In this situation structs are clearly preferable.
Structs also have a number of other benefits, which will become obvious in the following.
Mutability
Should tuples be mutable or immutable? The nice thing about them being structs is that the user can choose. If a reference to the tuple is readonly then the tuple is readonly.
Now a local variable cannot be readonly, unless we adopt #115 (which is likely), but that isn't too big of a deal, because locals are only used locally, and so it is easier to stick to an immutable discipline if you so choose.
If tuples are used as fields, then those fields can be readonly if desired.
Value semantics
Structs have built-in value semantics: Equals and GetHashCode are automatically implemented in terms of the struct's fields. This isn't always very efficiently implemented, so we should make sure that the compiler-generated struct does this efficiently where the runtime doesn't.
Tuples as fields
While multiple results may be the most common usage, you can certainly imagine tuples showing up as part of the state of objects. A particularly common case might be where generics are involved, and you want to pass a compound of values for one of the type parameters. Think dictionaries with multiple keys and/or multiple values, etc.
Care needs to be taken with mutable structs in the heap: if multiple threads can mutate, tearing can happen.
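To illustrate value semantics and the generic-argument case, here is a hedged sketch using tuples as shipped in C# 7.0 (backed by System.ValueTuple, which is an assumption about the eventual implementation rather than something this proposal specifies). The names are illustrative.

```csharp
using System.Collections.Generic;

public static class ValueSemanticsSketch
{
    public static bool SameValuesAreEqual()
    {
        var a = (sum: 6, count: 3);
        var b = (sum: 6, count: 3);
        // Equality is member-wise, and equal tuples hash alike.
        return a.Equals(b) && a.GetHashCode() == b.GetHashCode();
    }

    public static string LookupByCompoundKey()
    {
        // A tuple as a generic type argument: a dictionary with a two-part key.
        var names = new Dictionary<(int row, int column), string>
        {
            [(0, 0)] = "origin",
            [(1, 2)] = "somewhere else"
        };
        return names[(1, 2)];
    }
}
```

The dictionary lookup only works because a freshly built key (1, 2) compares equal to the key stored earlier, which is exactly what member-wise Equals and GetHashCode provide.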
Conversions
On top of the member-wise conversions implied by target typing, we can certainly allow implicit conversions between tuple types themselves.
Specifically, covariance seems straightforward, because the tuples are value types: As long as each member of the assigned tuple is assignable to the type of the corresponding member of the receiving tuple, things should be good.
You could imagine going a step further, and allowing pointwise conversions between tuples regardless of the member names, as long as the arity and types line up. If you want to "reinterpret" a tuple, why shouldn't you be allowed to? Essentially the view would be that assignment from tuple to tuple is just memberwise assignment by position.
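Both kinds of conversion can be sketched with the tuples that shipped in C# 7.0, where they are permitted; whether the proposal would end up allowing them was still open at the time of writing. The method names are hypothetical.

```csharp
public static class TupleConversionSketch
{
    public static (long sum, long count) Widen()
    {
        (int sum, int count) narrow = (6, 3);
        // Each int member converts implicitly to long, so the tuple does too.
        (long sum, long count) wide = narrow;
        return wide;
    }

    public static int Reinterpret()
    {
        (int sum, int count) tally = (6, 3);
        // Same arity and types, different names: assignment is by position.
        (int x, int y) point = tally;
        return point.x; // holds the value that was tally.sum
    }
}
```

Reinterpret shows the risk as well as the convenience: nothing stops a sum from silently becoming an x.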
Unification across assemblies
One big question is whether tuple types should unify across assemblies. Currently, compiler generated types don't. As a matter of fact, anonymous types are deliberately kept assembly-local by limitations in the language, such as the fact that there's no type syntax for them!
It might seem obvious that there should be unification of tuple types across assemblies - i.e. that (int sum, int count) is the same type when it occurs in assembly A and assembly B. However, given that structs aren't expected to be passed around much, you can certainly imagine them still being useful without that.
Even so, it would probably come as a surprise to developers if there was no interoperability between tuples across assembly boundaries. This may range from having implicit conversions between them, supported by the compiler, to having a true unification supported by the runtime, or implemented with very clever tricks. Such tricks might lead to a less straightforward layout in metadata (such as carrying the tuple member names in separate attributes instead of as actual member names on the generated struct).
This needs further investigation. What would it take to implement tuple unification? Is it worth the price? Are tuples worth doing without it?
Deconstruction and declaration
There's a design issue around whether deconstruction syntax is only for declaring new variables for tuple components, or whether it can be used with existing variables:
In other words, is the form (_, _, _) = e; a declaration statement, an assignment expression, or something in between?
This discussion intersects meaningfully with #254, declaration expressions.
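The two readings can be contrasted in a short sketch. It uses the deconstruction forms that shipped in C# 7.0, where both are accepted, and a hypothetical Divide helper.

```csharp
public static class DeconstructionTargetSketch
{
    // Hypothetical helper returning a quotient and a remainder.
    static (int quotient, int remainder) Divide(int x, int y) => (x / y, x % y);

    public static int IntoNewVariables()
    {
        // Declaration form: introduces quotient and remainder.
        var (quotient, remainder) = Divide(17, 5);
        return quotient * 5 + remainder;
    }

    public static int IntoExistingVariables()
    {
        int quotient = 0;
        int remainder = 0;
        // Assignment form: writes to variables that already exist.
        (quotient, remainder) = Divide(17, 5);
        return quotient * 5 + remainder;
    }
}
```

The right-hand side is identical in both methods; only the left-hand side decides whether new variables come into scope.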
Relationship with anonymous types
Since tuples would be compiler generated types just like anonymous types are today, it's useful to consider rationalizing the two with each other as much as possible. With tuples being structs and anonymous types being classes, they won't completely unify, but they could be very similar. Specifically, anonymous types could pick up these properties from tuples:
{ string Name, int Age}. If so, we'd need to also figure out the cross-assembly story for them.
Optional enhancements
Once in the language, there are additional conveniences that you can imagine adding for tuples.
Tuple members in scope in method body
One (the only?) nice aspect of out parameters is that no returning is needed from the method body - they are just assigned to. For the case where a tuple type occurs as a return type of a method you could imagine a similar shortcut:
Just like parameters, the names of the tuple are in scope in the method body, and just like out parameters, the only requirement is that they be definitely assigned at the end of the method.
This is taking the parameter-result analogy one step further. However, it would special-case the tuples-for-multiple-returns scenario over other tuple scenarios, and it would also preclude seeing in one place what gets returned.
Splatting
If a method expects n arguments, we could allow a suitable n-tuple to be passed to it. Just like with params arrays, we would first check if there's a method that takes the tuple directly, and otherwise we would try again with the tuple's members as individual arguments:
Here, Tally returns a tuple of type (int sum, int count) that gets splatted to the two arguments to Avg.
Conversely, if a method expects a tuple we could allow it to be called with individual arguments, having the compiler automatically assemble them to a tuple, provided that no overload was applicable to the individual arguments.
I doubt that a method would commonly be declared directly to just take a tuple. But it may be a method on a generic type that gets instantiated with a tuple type:
There are probably a lot of details to figure out with the splatting and unsplatting rules.