Proposed changes for deconstruction, declaration expressions, and discards (C# 7.0) #285
Comments
Another alternative is nothing, really:
var (x,) = e;
(int x, int) = e;
M(out int);
switch (o)
{
    case int:
    case long:
        Console.WriteLine("int or long");
        break;
}
@Thaina This proposal is already implemented in released C# 7, and the wildcard is an underscore. Not sure why this was left open, because the original proposal was closed on 14 Jan 2017.
It is open here for any discussion.
moving to dotnet/csharpstandard
I don't think any action on the specification is needed for this. This is simply a record of public discussion that led into language changes in C# 7.0.
Pre-meeting evaluation: I would expect this to be covered by the PR Mads is including; I assume Neal added the meeting-discuss label just to confirm that we're happy to close?
Everyone's happy for us to close this and leave it as resolved.
@gafter commented on Fri Oct 28 2016
This week's LDM meetings (2016-10-25 and 2016-10-26) proposed some possible changes to the handling of deconstruction, declaration expressions, and wildcards that, even if done later, would affect the shape of compiler APIs today. That suggests we would want to consider what to do today so that we are compatible with that possible future. This is a summary of the proposed changes and their API impact.
Wildcards
The LDM is proposing to change the character that we would use to represent a "wildcard" from * to _. We also considered some alternatives, but both * and ? have a logical (if not technical) ambiguity, because int? and int* are valid types. _ doesn't likely have the same kind of syntactic ambiguity, because it is already an identifier in all of the places where we want to support wildcards. But it may be a semantic ambiguity for that same reason. The reason we like _ is that users already introduce variables, for example parameters, named _ when their intention is to ignore them.
We want to support both the "short form" _ and the "long form" var _ just in case there is an ambiguity with, for example, a field in scope. We'll return to this later.
Declaration Expressions
We currently represent an out variable declaration using a "declaration expression" node in the syntax model. That was done because we were thinking that we may want to generalize declaration expressions in the future.
There is some slight discomfort with declaration expressions as they appear in the current API and proposed language spec, because while they are expressions in the syntax model, they are not expressions in the draft spec, and do not have types, and therefore there are special rules called out for them wherever they may appear in the specification. Besides the mismatch between the spec and model, we expect we may want to allow declaration expressions in more contexts in the future, in which case we will want them to be treated as expressions. In that case we will be better served by treating them as expressions (i.e. they can have a type) today.
Deconstruction
Possibly generalizing declaration expressions in the future makes us want to reconsider our syntax model for deconstruction today. For example, given the statement
We can think of this as a special deconstruction declaration statement (as it is today), or alternatively we can think of it as an expression statement containing an assignment expression with a tuple expression on the left-hand-side. In that case the tuple expression contains two declaration expressions. The latter formulation makes more sense in the possible future in which we generalize declaration expressions.
Similarly, given the statement
We can think of this as a special deconstruction declaration statement (as it is today), or alternatively we can think of it as an expression statement containing an assignment expression with a declaration expression on the left-hand-side. The latter formulation makes more sense in the possible future in which we generalize declaration expressions.
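The two statements these paragraphs refer to did not survive in this copy of the thread. Assuming they had the forms `(int x, int y) = e;` and `var (x, y) = e;` (an inference from the surrounding description, not a quotation), the sketch below shows both. `DeconstructionForms` and `Pair` are hypothetical names introduced only for illustration.

```csharp
using System;

public static class DeconstructionForms
{
    // Hypothetical stand-in for the expression `e` in the text.
    static (int, int) Pair() => (1, 2);

    public static int TupleOfDeclarations()
    {
        // A tuple expression whose two elements are declaration expressions.
        (int x, int y) = Pair();
        return x + y;
    }

    public static int SingleDeclaration()
    {
        // A single declaration: `var` applied to a parenthesized designation.
        var (x, y) = Pair();
        return x + y;
    }

    public static void Main()
    {
        Console.WriteLine(TupleOfDeclarations());
        Console.WriteLine(SingleDeclaration());
    }
}
```

Both methods declare the same two locals; only the shape of the left-hand side differs, which is the distinction the syntax model has to represent.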
This reformulation of deconstruction allows us to remove from the syntax model the new statement form for a deconstruction declaration. It also allows us to generalize what we allow in the future:
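The example that followed here was lost in extraction. As a rough sketch of the generalization being described, assuming a statement of the form `(x, int y) = e;`: the C# compiler accepts this mixed form from C# 10 onward. `MixedDeconstruction`, `Pair`, `first` and `second` are hypothetical names.

```csharp
using System;

public static class MixedDeconstruction
{
    // Hypothetical stand-in for the expression `e` in the text.
    static (int, int) Pair() => (10, 20);

    static int first, second;

    public static int Mixed()
    {
        int x;                   // already-existing variable
        (x, int y) = Pair();     // assigns x and declares y in one deconstruction
        return x + y;
    }

    // The deconstruction is an ordinary assignment expression, so it can be
    // the body of an expression-bodied method.
    public static void Assign() => (first, second) = Pair();

    public static int AssignedSum()
    {
        Assign();
        return first + second;
    }

    public static void Main()
    {
        Console.WriteLine(Mixed());
        Console.WriteLine(AssignedSum());
    }
}
```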
Here, the left-hand-side contains a mixture of already-existing variables (in this case x) and newly declared variables (int y). And it can be used in an expression context as well (e.g. as the body of an expression-bodied method).
Wildcards (revisited)
Given this new understanding of the direction of the syntax, there are four forms that wildcards can take. First, it can take the place of an identifier in a designator (i.e. in a declaration expression):
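The original example is missing from this copy. A minimal sketch of a wildcard standing in for the identifier of a designator, written against the syntax that shipped in C# 7.0, might look like this (`DiscardDesignators` and `Pair` are hypothetical names):

```csharp
using System;

public static class DiscardDesignators
{
    // Hypothetical stand-in for the expression `e` in the text.
    static (int, int) Pair() => (3, 4);

    public static int Sum()
    {
        var (x, _) = Pair();     // second element is ignored
        var (_, y) = Pair();     // a second _ in the same scope is not a name conflict
        return x + y;
    }

    public static bool IsNumber(string s)
    {
        // The "long form" in an out argument: the parsed value is ignored.
        return int.TryParse(s, out var _);
    }

    public static void Main()
    {
        Console.WriteLine(Sum());
        Console.WriteLine(IsNumber("12"));
    }
}
```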
Since _ is already an identifier, no syntax model change is required. However, semantically we want this to create an anonymous variable, and shadow any true variable (e.g. parameter or field) from an enclosing scope named _. There is no name conflict error if wildcards are declared this way more than once in a scope.
Second, it can similarly be used to declare a pattern variable:
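The original example is missing here as well. As a sketch of what a wildcard in pattern-variable position looks like in C# 7.0 syntax (`PatternDiscards` is a hypothetical name):

```csharp
using System;

public static class PatternDiscards
{
    public static string Describe(object o)
    {
        switch (o)
        {
            // Type patterns whose variable is a wildcard: only the type test matters.
            case int _:
            case long _:
                return "int or long";
            default:
                return "something else";
        }
    }

    // The same form in an is-expression.
    public static bool IsText(object o) => o is string _;

    public static void Main()
    {
        Console.WriteLine(Describe(5));
        Console.WriteLine(Describe("five"));
        Console.WriteLine(IsText("five"));
    }
}
```

Because neither case label introduces a named variable, the two labels can share one case section.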
Third, it can take the place of an identifier in a simple expression where an lvalue is expected and is used as a target in a deconstruction assignment or out parameter, but in that case its special behavior as a wildcard only occurs if looking up _ doesn't find a variable of that name
This special name lookup is similar to the way we handle var.
Finally, it can be used where a parameter can be declared today. However, we relax the single-definition rule to allow multiple conflicting declarations (same scope or nested scopes), in which case the identifier _ binds as a wildcard.
We have to be careful with these changes so that any program that uses _ as an identifier and is legal today continues to compile with the same meaning under these revised rules.
Syntax model changes
This allows us to simplify the handling of the for loop to handle deconstruction. Now the deconstruction is just one of the expressions in the expression list of the initializer part, and doesn't require its own placeholder in the syntax. That means that the syntax node for the for loop remains unchanged from the C# 6 version.
This requires a change to the way we handle the deconstruction form of the foreach loop. Because we want the left-hand-side to be capable of representing all of these forms
we now need to use expression for the syntax node before the in keyword.
We can remove the syntax node for the deconstruction declaration statement, because that is just an assignment statement in this model.
Syntax.xml changes
The following changes are proposed compared to the current implementation in master. We remove
and we remove the Deconstruction field from the ForStatementSyntax
We change the VariableComponent field of ForEachComponentStatementSyntax to be an ExpressionSyntax, and probably change the name of ForEachComponentStatementSyntax.
And we change
to
We leave unchanged
SemanticModel changes
We may want to change the behavior of GetTypeInfo on a declaration expression, depending on how the shape of the specification evolves.
We probably need to consider what the behavior of SemanticModel APIs should be on wildcards.
Summary
The changes to declaration expressions and deconstruction should be done today so that we don't have an incompatible change later.
Wildcards are an interesting problem. Even if we don't want to implement them for C# 7, we want to wall off the semantic space so that valid C# 7 programs don't change meaning or become invalid in a later language version. I suspect the simplest way to do that is to implement wildcards today.
/cc @dotnet/ldm @dotnet/roslyn-compiler
@gafter commented on Fri Oct 28 2016
There is an interesting syntactic ambiguity arising from declaration expressions. Consider the code
Here the "tuple expression" on the left has a declaration for a variable C of type A<B>. However, the same "tuple expression" elsewhere
has a first subexpression equivalent to (A < B) > C. Writing the specification (and implementing the parser) to distinguish these two contexts will be fun.
@gafter commented on Fri Oct 28 2016
Similarly, consider the code
Here the "tuple expression" on the left has a declaration for a variable C of type A<B,D>. However, the same "tuple expression" elsewhere
has a first subexpression equivalent to (A < B) and a second equivalent to (D > C).
@gafter commented on Fri Oct 28 2016
Another advantage of adding support for wildcards today rather than later is that we can warn for unused pattern and out variables.
@HaloFour commented on Fri Oct 28 2016
While I understand the ambiguity issues of * it still seems like a better choice than _. I can just imagine the massive pit of failure that will be people accidentally deconstructing into legally declared and used variables.
@CyrusNajmabadi commented on Fri Oct 28 2016
@gafter Can you show the changes you intend to make to the foreach-syntax model?
Thanks!
@CyrusNajmabadi commented on Fri Oct 28 2016
Actually, n/m. I can see you describe it as:
@alrz commented on Fri Oct 28 2016
@HaloFour You can't accidentally do anything, because basically any mistake is a compile-time error. And more to the point of this thread, I can understand how it's easier to represent in the AST because it would be merely a semantic change. In fact, just because it's an identifier, it allows you to use T _ to represent type-check patterns or _ to ignore lambda args without introducing any new syntax and further ambiguities.
@alrz commented on Fri Oct 28 2016
@gafter Regarding declaration expression ambiguities, I think it's better to require them to always be initialized. Any use case involving an uninitialized declaration expression can use a pattern instead.
@gafter commented on Wed Nov 30 2016
That was our intent. Declaration expressions can appear in an out argument, or in a tuple expression that is in a deconstruction context such as an out argument or the left-hand-side of an assignment expression or the target of a foreach loop.
@HaloFour commented on Fri Oct 28 2016
@alrz
Sure you can. It would be impossible for the compiler to tell if you intend to deconstruct into a local variable or field in scope that happens to be named _ or if you intend to use a wildcard:
Is this necessarily a common scenario? Probably not. But as a legal identifier it could never be ruled out.
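The code from this comment was lost in extraction. The scenario it describes can be sketched as follows, using the lookup rule from the proposal above (the short form _ is a wildcard only when name lookup finds no variable called _). `UnderscoreLookup` and `Pair` are hypothetical names.

```csharp
using System;

public static class UnderscoreLookup
{
    // Hypothetical source of a two-element value to deconstruct.
    static (int, int) Pair() => (1, 2);

    public static int WithVariableInScope()
    {
        int _ = 42;          // a real local that happens to be named _
        int x;
        (x, _) = Pair();     // lookup finds the local, so it is overwritten
        return _;
    }

    public static int WithoutVariableInScope()
    {
        int x;
        (x, _) = Pair();     // no variable named _ in scope, so _ is a wildcard
        return x;
    }

    public static void Main()
    {
        Console.WriteLine(WithVariableInScope());     // 2, not 42
        Console.WriteLine(WithoutVariableInScope());  // 1
    }
}
```

The deconstruction statement is identical in both methods; only the surrounding declarations decide whether a value is discarded or an existing variable is written.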
@vbcodec commented on Fri Oct 28 2016
@HaloFour
for pt as 2D point
(_, y) = pt;
change _ for 0, but for 3D point, variable _ will remain unchanged
@HaloFour commented on Fri Oct 28 2016
@vbcodec
Considering that parameters of deconstructor methods are always out, it doesn't matter what method is resolved; the _ variable would always be overwritten. I imagine that Point3D won't have a two parameter deconstructor, though.
@vbcodec commented on Fri Oct 28 2016
@HaloFour
I meant that Point3D has only one method with three outs. If it has also second method with two outs, then compiler probably will pick that method, and variable _ will be changed. But method with two outs (for Point3D), will lead to messy results.
@HaloFour commented on Fri Oct 28 2016
@vbcodec
I'm not entirely sure what that has to do with this discussion. Either the compiler will find a suitable deconstructor method and _ will be overwritten, or the compiler will not find a suitable deconstructor method and that will result in a compiler error.
@vbcodec commented on Fri Oct 28 2016
@HaloFour
Wildcards allow to use methods with higher number outs than target need (two in your example). So without support for wildcards compiler generate error, but with support for wildcards, there won't be error if last (or first) parameter will be _.
@HaloFour commented on Fri Oct 28 2016
@vbcodec
I'm quite sure that this is not the case. If you wanted to deconstruct Point3D and it only offered a 3 parameter deconstructor you'd be required to specify a combination of 3 legal identifiers and wildcards combined:
@vbcodec commented on Fri Oct 28 2016
@HaloFour
Ok, I got it, concluded my version from https://blogs.msdn.microsoft.com/dotnet/2016/08/24/whats-new-in-csharp-7-0/
If it does not allow skipping unwanted data (in terms of quantity, not names), then there is low benefit.
@alrz commented on Fri Oct 28 2016
@gafter I'm saying that in all those places, instead of a declaration expression we use a pattern so that the following would be possible,
Or something like that. Perhaps, only complete patterns would be allowed in these contexts.
@HaloFour This is also applicable to var. That's a legal identifier and it could never be ruled out, right? But you'd never use a lowercase identifier for a class name. OK, you shouldn't but you can, and when you do you should be aware of the consequences. Your example uses a new feature (deconstruction) so we'd never have to deal with such ambiguity. In all other existing code, backward compatibility is a must and as mentioned in OP the code using _ would never stop compiling. In fact, this is just another reason to implement wildcards now, otherwise we could never use _ for wildcards.
@orthoxerox commented on Sat Oct 29 2016
@HaloFour The easiest way out is to forbid (or warn against) tuple deconstruction, out var and pattern matching into variables named _ now in v7, before wildcards are implemented. Then only out _ remains syntactically valid and the vNext compiler can emit a warning if variable _ is shadowed by a wildcard of a different type.
@HaloFour commented on Sat Oct 29 2016
@alrz
The difference would be much more subtle. The declaration of var as a valid type only prevents use of var as a contextual keyword, which at worst would lead to a compiler error when trying to use them both together. That's not the case here. Accidental use of both together could easily lead to perfectly legal code that results in unexpected overwriting of existing values in variables. Types also have much simpler scopes than variables. Because _ is used as both a wildcard and as a property selector shorthand relatively commonly today there are more scenarios where this will collide with wildcards.
@orthoxerox
That's a pretty reasonable suggestion but how much would it limit where wildcards could be used within the language? For example there are proposals to allow wildcards to ignore lambda arguments as well as to declare ignored variables. In both of those cases it's already perfectly legal today to use _ as the identifier name.
@gafter @MadsTorgersen
What I don't get is where the ambiguity lies with *. Yes, there is the potential collision with pointer types, but you can't use pointer types in type switch (is int* has always been illegal syntax). And in out var the lack of a following identifier would immediately disambiguate it.
Is it because that last case would make it seem like is supported pointers?
@axel-habermaier commented on Sat Oct 29 2016
@HaloFour: Is that really that much of an issue? Just write an analyzer that warns whenever someone declares a variable with name _ in a non-wildcard position and that's it. That warning could even be implemented by the compiler itself in a new warning wave (#1580) -- speaking of which, warning waves unfortunately still do not seem to be implemented....
@vbcodec commented on Sat Oct 29 2016
@HaloFour
if (o is int *) cannot create ambiguity, because it is illegal (should be) to use wildcard in type switch, so it always will be interpreted as pointer.
@HaloFour commented on Sat Oct 29 2016
@axel-habermaier
Why should perfectly legal code now be a warning?
At least when Java decided to reserve _ they did so over multiple versions of the language. Not this wishy-washy scope shadowing exception. I have the utmost respect for the C# team in trying to evolve the language while retaining backwards compatibility but there's a point where there's too much gray area. In all previous examples of adding a contextual keyword it was always explicitly an either/or proposition. Here every single use needs to be examined by the developer, as you can't even assume which of these overlapping scopes might actually be in use later.
@vbcodec commented on Sat Oct 29 2016
@gafter
Why to use
than just
M(_);
?
@dsaf commented on Sat Oct 29 2016
Was there a chance to re-visit the previous scoping decision as well #14697? Thanks.
@dsaf commented on Sat Oct 29 2016
Why would wildcard ever be used next to type e.g. out int *? Surely if you don't care about value you don't care about type either: out var * or out *. Wouldn't this help with pointers ambiguity?
@HaloFour commented on Sat Oct 29 2016
@dsaf
Disambiguation with methods overloaded by out parameters. This is true without using wildcards too:
@dsaf commented on Sat Oct 29 2016
@HaloFour theoretically in well-designed code such method calls should be equivalent, but I understand that it can make a problem in practice. Maybe type can go postfix out * int?
@dsaf commented on Sat Oct 29 2016
On the other hand, two of the main competitors went for _ wildcards - and they were starting from scratch:
https://developer.apple.com/library/content/documentation/Swift/Conceptual/Swift_Programming_Language/Patterns.html#//apple_ref/doc/uid/TP40014097-CH36-ID420
http://docs.scala-lang.org/tutorials/FAQ/finding-symbols#keywordsreserved-symbols
@dsaf commented on Sat Oct 29 2016
I know it's unrelated, but I kind of like this in Scala:
@HaloFour commented on Sat Oct 29 2016
@dsaf
dotnet/csharplang#3561 :)
@dsaf commented on Sat Oct 29 2016
@HaloFour awesome! I wonder if using a different character helps or not though... Regardless, the industry should've stuck with proper keyboards: https://en.wikipedia.org/wiki/Space-cadet_keyboard
@HaloFour commented on Sat Oct 29 2016
@dsaf
First thing that comes to mind looking at that keyboard is that Facebook could probably make a fortune selling keyboards with a dedicated "Like" button.
I do agree that care must be made when considering special characters as they might be very difficult to use with non-US keyboard layouts. Now if you excuse me I need to order a Thunderbolt 3 Esc-key dongle.
@AdamSpeight2008 commented on Sat Oct 29 2016
How will _ work with the line continuation character in VB?
@gafter
int? and int* may be legal type names, but int? ? and int* * would still work, as we are looking for an identifier, not a type identifier.
@gafter commented on Sun Oct 30 2016
There is no proposal to introduce _ as a wildcard in VB.
What we are "looking for" depends on the context. int** is a valid type.
@HaloFour commented on Sun Oct 30 2016
@gafter
Yes, int** is a valid type, but I don't think that really matters. Type-switch isn't legal on pointer types. That leaves out declarations and deconstruction. In both cases the type is required to be followed by an identifier/target, so it's never technically ambiguous. M(out int*) could only ever legally mean an out parameter of type int that is being discarded and M(out int**) could only ever legally mean an out parameter of type int* that is being discarded. At worst it's slightly visually ambiguous, but only to the minority of C# developers who use unsafe code. And even then the result is either compilable or it's not, the likelihood of a legal but subtle bug is exceptionally low.
That's not the case with _. As demonstrated above if the compiler treats _ as a special case variable with an exception in shadowing rules in any scope the result can easily be accidentally overwriting a perfectly legal explicitly declared scope or field. It's ambiguous in the worst way; neither the developer nor the compiler could be absolutely sure that they aren't writing incorrect code.
If you guys continue to go with _ I seriously suggest making it strictly an either/or proposition. Instead of basing wildcards on the legal identifier with loosey-goosey shadowing rules make it a completely separate feature. If there is an explicitly declared field or local in scope then wildcards are simply not available. Attempting to use an out declaration, deconstruction or type-switch into a new variable named _ would be a compiler error. At least then the likelihood of accidentally overwriting legal variables or fields should be mitigated:
@ErikSchierboom commented on Mon Oct 31 2016
I think moving from * to _ is a great choice, particularly because so many other languages use the _ character for wildcard handling (F#, Scala, Haskell).
@DavidArno commented on Mon Oct 31 2016
I feel that the whole "semantically we want this to create an anonymous variable, and shadow any true variable (e.g. parameter or field) from an enclosing scope named _" approach is possibly the wrong one to take. For the code snippet,
the fact that var (x, _) = point; doesn't modify _ will cause a lot of confusion and bugs. A less error-prone approach could be to:
out/is var scope leakage), _ will result in a new anonymous variable, any attempt to read one of them gives a compiler error and a real var _ cannot then be later defined:
Update
Based on @HaloFour's excellent idea below, I think that F1 above probably should not work as written. As per his suggestion, once a real variable _ comes into scope, then wildcard usage of _ should be prohibited, to minimise the chance of confusion and bugs over whether _ is a real variable or not. So it would become:
@DiryBoy commented on Mon Oct 31 2016
How about - (minus) for wildcard?
Looks like no ambiguity, and no need to hold Shift to type it.
@vbcodec commented on Mon Oct 31 2016
@DavidArno @HaloFour
@gafter wrote "We want to support both the "short form" _ and the "long form" var _ just in case there is an ambiguity with, for example, a field in scope."
That means (for compiler with support for wildcards):
and without defined variable _
The point is that every defined _ preserves its value if a subsequent deconstruction is written like (x, var _) or var (x, _).
This is compatible with compiler without support for wildcards, because for such compiler
@HaloFour commented on Mon Oct 31 2016
@DavidArno
My code examples are for how I think it should behave, not how it behaves given the description above. I think that we're largely on the same page that wildcards and explicitly declared _ variables/fields should never mix. That's why:
With _ in scope wildcards are not possible, so the deconstruction declaration would fail just as it would if that variable were named anything else. However the following would work fine:
Even with those rules I still think that this would be a source of confusion. What if I forgot about that _ variable above? That's why I think the compiler should go forward and forbid use of _ in deconstruction if it's not a wildcard.
@ErikSchierboom
I'd agree with you if _ weren't already a legal C# identifier. That is the source of the ambiguity.
@DavidArno commented on Mon Oct 31 2016
Ah, my apologies. I misunderstood you.
I think that "I think the compiler should go forward and forbid use of _ in deconstruction if it's not a wildcard" is a very sensible suggestion.
@vbcodec,
I seriously hope that isn't @gafter's intention, as treating _ as a real var, and not a wildcard, in that example makes wildcards pretty pointless. Ignoring the second value and only creating an x variable is exactly what wildcard use in var (x, _) = would be for.
@alrz commented on Mon Oct 31 2016
👍 There is no reason to support deconstructing to a variable named _ specifically because these are new features and there is no backward compatibility issue in doing that. However, if we want _ as wildcard I think it is mandatory to be implemented along with deconstructions, i.e. now.
@vbcodec commented on Mon Oct 31 2016
@DavidArno
They treat _ both as real var and as wildcard, depending on context.
They have not decided to support wildcards, but to make room in syntax and semantics for a possible future implementation. For now _ remains just an identifier and can be declared only once in the scope, and using it as a 'temporary wildcard' may be problematic
with added support for wildcards, code is much better
The only problem is with last line, where new variable _ cannot be added after deconstructions var with _ as wildcard, but this is small loss.
Current rules are bit complex, but clear and logic. Later time, developers using _ must take care while also using _ as wildcard for deconstruction and outs.
@DavidArno commented on Mon Oct 31 2016
@vbcodec,
As per the last line of @gafter's OP: "I suspect the simplest way to do that is to implement wildcards today." I think you are therefore mistaken.
@HaloFour commented on Mon Oct 31 2016
@gafter
Just to note, it's also common to see developers using _ in lieu of a form of dotnet/csharplang#3561:
@vbcodec
Allowing for _ to sometimes be a wildcard and to sometimes be a variable and to sometimes allow deconstruction to splat all over a perfectly legal value is the very antithesis of clear and logical.
@vbcodec commented on Mon Oct 31 2016
if _ makes so much confusion, then let they switch from _ to #
@CyrusNajmabadi commented on Tue Nov 01 2016
@AdamSpeight2008 The right decision here is to probably use a new keyword for VB. I suggest "Underscore" as the name of the keyword. So "_" will be used for line continuation. And "Underscore" will be used when you don't want to name something :)
@AnthonyDGreen commented on Tue Nov 01 2016
@AdamSpeight2008
In VB the wildcard will likely just be *. That's been the working assumption for a while now.
@gafter commented on Tue Nov 01 2016
@CyrusNajmabadi The _ character used to be called SPACING UNDERSCORE in Unicode, but its name has changed to LOW LINE. So the VB keyword should probably be LowLine. 😉
@vbcodec commented on Tue Nov 01 2016
@AnthonyDGreen
Character * is best, preserve it for VB. as it do not collide with any name and look very well,
Also could be nice to write
(Dim x, *) = (1,1)
than
(Dim x, Dim *) = (1,1)
@HaloFour commented on Tue Nov 01 2016
@vbcodec
I'd expect it to be the following:
Dim (x, *) = (1, 1)
Makes me jealous that C# won't have * 😞
@AdamSpeight2008 commented on Tue Nov 01 2016
@HaloFour Come over to the dark side.
@aluanhaddad commented on Sun Nov 06 2016
_ is far more common in existing code than one might think. I don't personally use it but I see it often enough.
@ErikSchierboom The analogy with other languages; F#, Scala, etc; still holds regardless of the operative character. It is a conceptual analogy and I don't think anyone coming from languages where _ is a wildcard will be thrown off by the use of * or any other glyph.
The idea that the meaning is ambiguous, and over a non-lexical scope at that (e.g. what if a base class has or introduces a visible writable property named _), and that ambiguity involves turning a discarded value into a destructive write operation is pretty serious. @HaloFour makes a very strong argument for reconsidering this decision.
@vbcodec commented on Sun Nov 06 2016
IMO, maybe let there be two wildcards, one 'relative' _ and one absolute # (used in some patterns as wildcards https://msdn.microsoft.com/en-us/library/0c899ak8(v=vs.110).aspx)
@vbcodec commented on Sun Nov 06 2016
@aluanhaddad
Better is to make a breaking change and disallow use of _ for other purposes than local (inside function) variables. The same with statements like if (true)... and similar. Deviated patterns must be removed, as they are roadblocks for new features.
@CyrusNajmabadi commented on Sun Nov 06 2016
We only make breaking changes if the benefit is absolutely staggeringly high. In this case the breaking change would absolutely not be there.
@CyrusNajmabadi commented on Sun Nov 06 2016
What if a base class has or introduces a type called 'var'? What if it introduces an instance method that hides an extension method you were calling? What if it introduces a static field that hides another outer static field that you were referencing? What if it introduces an operator that subtly changes overload resolution?
What if, what if, what if? :)
We have to operate in a model of practicality. We don't operate solely on the 'what if' case because it allows for a system where a minuscule chance of problems outweighs any potential benefits.
As mentioned already, there are numerous facets that we judge language features against. And while we could work toward a system with 0 ambiguity, that then might degrade benefits that we're trying to get everywhere. If we were a language that decided that 0-ambiguity was a pillar that would trump everything else (similar to how we often look at back-compat), then that would probably be the end of the argument. But that's not our language. We have and will continue to accept ambiguity as long as we feel that there is enough value in total to the user.
Or, more simply: all things being equal, we would go with the less ambiguous option. But if not all things are equal, we're not simply going to focus only on a single issue like this without consideration for all the other factors we think are desirable.
This is important to understand because the debate keeps going circularly around the issue of ambiguity. I'm trying to make it clear that such arguments will likely be insufficient as they focus on one issue in exclusion of the entirety of the topic and attempt to elevate it to be the criterion by which this feature is judged. There's pretty much only one thing that ever gets to make that sort of argument, and that's back-compat. And that's because we've generally elevated that aspect of the language to such a paramount position when making decisions. Ambiguity has not been elevated to that position. It's important, but it's not the end-all (even though some commentators have made it clear it is for them).
@HaloFour commented on Sun Nov 06 2016
@CyrusNajmabadi
The argument against * was what? Ambiguity. Specifically ambiguity of the intended type. But there was ZERO parsing ambiguity and ZERO accidental (and silent) overwriting of someone's data. You can't play the ambiguity card while simultaneously claiming that the ambiguity card isn't relevant.
The * option also avoids bizarre compiler errors resulting from the fact that even when the developer intends to use wildcards the compiler must favor variable identifiers in any form of existing syntax. And it also avoids the pages of spec changes to try to describe the special declaration, scoping and visibility rules that would make an early JavaScript designer blush.
And this is because of some assumption that C# devs use _ as wildcards? I can guarantee you that most C# devs haven't used F# or other functional languages and are completely unaware of this assumption. There is nothing in any of the design documentation for C# that even remotely implies this. My anecdotal experience is that your assumption is wrong, and that all you're ending up doing is both unnecessarily complicating the use of a legal and valid identifier while at the same time crippling the feature of wildcards behind the necessary backward compatibility concerns.
@aluanhaddad commented on Sun Nov 06 2016
@CyrusNajmabadi indeed it's a slippery slope. I'm not saying that ambiguity is the key issue here; I'm actually saying that it's a breaking change that is the key issue. It has a chance to break existing code and backwards compatibility in a way that, as @HaloFour points out, is simply not an issue with *. * also, for what it's worth, has a significant precedent to mean wildcard. Since patterns can't match pointers there is no possibility of breakage, and the syntax is completely intuitive.
So while I agree with your general premise that ambiguity is not the most important factor I think using _ has no particular value and bears a fair number of downsides.
Also considering that the use of => in match expressions was rejected, due to very reasonable concerns about conflation with existing language constructs leading to non-intuitive scoping semantics, the _ being more familiar to Scala and F# programmers is really kind of irrelevant as both of those languages use arrows.
@CyrusNajmabadi commented on Sun Nov 06 2016
Sure. And we've gone down that path before numerous times. You make it out to be a major concern. I've pointed out numerous times why it is not felt to be a significant issue.
I want to make the next point extremely clear: I do not design the language with a superseding consideration of the difficulty of the job it incurs on Mads to write the spec :) I do not design the language with a superseding consideration of what sort of difficulty will be incurred by the compiler writers when they have to implement the feature. I design the language with a core goal of producing what i think will be the best actual language for customers to use for their own code.
That is my paramount concern and it drives the evaluations i make to judge if a language choice is appropriate or not. Indeed, i have fought for numerous changes in the past that led to quite involved language+compiler work because i felt it was providing the right language for customers to use at the end of the day.
Appeals to emotion are not relevant to my decision making process :)
In your opinion. As noted in other discussions, your opinion is noted and understood. I often feel like discussions go in circles with you because you never want to accept that others may feel differently than you, or may evaluate things in a different manner than you. Let me be clear: i understand and acknowledge your opinion. I just do not share it. :)
I find terms like 'crippling' extremely unnecessary in the context of this discussion. We've evaluated a huge number of different types of code constructs, and we've actually found that the vast majority of them are not hampered in any way. Importantly the cases we think are actually common end up being unimpaired by these decisions. The only cases we've seen impaired appear to be in code that is extremely pathological, and unlike any that we've actually seen in the vast amount of code we've actually examined over the decades. So, saying that anything is 'crippled' is excessive hyperbole that diminishes your case.
As i've pointed out before, we've seen arguments like this levied against every single interesting language feature we've ever delivered. It seems incredibly common in our industry for incredibly tiny concerns to be often exaggerated to be the most concerning and damning critique against a feature. People can get so bogged down over the most tiny issue that it becomes all encompassing and tends to drown out the rest of the discussion. I'm looking for a bit of perspective here as it is all too easy and seductive at times to think of things in nothing but binary forms.
The introduction of 'var' didn't bring the C# ecosystem crashing down, as some were concerned it might. The ability for code to change meaning when an extension was added to a namespace didn't end the ability for developers to code effectively, as some thought it would. And the ability for _ to be both a variable name and also a wildcard, will very likely not end up crippling anything. Yes. It introduces issues. But we need to be rational and pragmatic and not equate every issue to a sky-is-falling moment every time we discuss things.
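As a minimal illustration of `_` being both a variable name and a wildcard, the sketch below assumes the discard behaviour that eventually shipped in C# 7.0. It is not code from this thread; `UnderscoreSketch` and its method names are hypothetical.

```csharp
using System;
using System.Linq;

public static class UnderscoreSketch
{
    // `_` as an ordinary identifier: a lambda parameter that is never read.
    public static int CountWithIgnoredParameter()
    {
        return Enumerable.Range(0, 5).Select(_ => 1).Sum();
    }

    // `_` as a discard in a deconstruction: only `second` becomes a variable.
    public static int SecondOfPair()
    {
        var (_, second) = (1, 2);
        return second;
    }

    // `_` as a discard for an out argument: the parsed value is thrown away.
    public static bool LooksLikeInt(string text)
    {
        return int.TryParse(text, out _);
    }

    // When a variable named `_` is already in scope, `_ = ...` assigns to it
    // rather than discarding the value.
    public static int AssignToExistingLocal()
    {
        int _ = 0;
        _ = 42;
        return _;
    }
}
```

The last method is the kind of case the thread keeps returning to: the same token means "existing variable" or "discard" depending on what is in scope.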
@CyrusNajmabadi commented on Sun Nov 06 2016
Can you be very specific in an example that demonstrates this. Thanks!
@CyrusNajmabadi commented on Sun Nov 06 2016
That's fine. Others disagree. A final decision will be made by the LDM as to if there is enough value here. Your perspective, and others who find no value in _ will def be taken into account.
@aluanhaddad commented on Sun Nov 06 2016
@CyrusNajmabadi I apologize if I came off as abrasive. I love the C# language and have deep, earnest respect for those who design it. It is a wonderful language and I have happily watched it get better and better with each release.
I once saw a presentation by a developer at JetBrains where he said that his company feels that C# is the gold standard for language evolution.
I thought to myself, it is about time someone acknowledged that because it is true.
So when I say that
_
has no value, I do not mean to devalue the opinions of others who value it. I was just perfectly happy with
*
and don't see any issues with it.
@CyrusNajmabadi commented on Sun Nov 06 2016
Thanks :) but could you still point out the back-compat issue? I really want to know what it is as we strive to avoid back compat breaks as much as possible. I want to make sure i'm not missing something very critical here.
@vbcodec commented on Sun Nov 06 2016
@HaloFour
Not really, because
_
is also ambiguous, just differently (type vs name). And
*
is less ambiguous than
_
.
Anyway, I am still more convinced by
*
than
_
.
@aluanhaddad commented on Sun Nov 06 2016
Later, he takes this awful code and adds some syntactic sugar to make his nasty method look hip. He changes
GetPseudoRandom
to
@CyrusNajmabadi commented on Sun Nov 06 2016
That is not a breaking change. A breaking change means that code meaning changes in the absence of any actual changes directly to it. i.e. you upgrade your compiler and now your existing code changes its meaning.
It is not a 'breaking change' if you yourself introduce changes that make your code mean something different. Such changes are the norm for the language and occur all the time. For example, if someone introduces an extension called "Where" on
IEnumerable<T>
in your own project's namespace, then that will supersede the extensions we've found in usings outside of that namespace. That's not a 'breaking change'. That's simply how the language works :)
@aluanhaddad commented on Sun Nov 06 2016
You win.
@CyrusNajmabadi commented on Sun Nov 06 2016
It's not about winning or losing. :) It's about making sure we're correctly assessing the issue and ensuring that the potential issues are correctly categorized. When discussing language issues we like to be precise, and that means using the appropriate terminology. 'Breaking back compat' is a term with very specific meaning, and we need to ensure that we use it appropriately lest there be confusion over that.
In this case, i haven't seen any actual back-compat concerns. But i want to make sure that any potential cases are not missed. So if someone says there's a back-compat concern, i want it absolutely investigated to make sure we're not overlooking something. The purpose of this dialog wasn't to win or lose anything. It was to make sure your concern was heard and understood and to ensure that we weren't missing something super critical.
In this case, there was a misunderstanding of terminology. Hopefully this helps clear things up, and makes the conversation clearer in the future :)
@CyrusNajmabadi commented on Sun Nov 06 2016
Note: your specific code example is one i would describe as being a counter-intuitive code change result. Such things are not desirable, but exist in an enormous number of cases in the language. For example, for many, the following is counter-intuitive:
In nearly all code cases, it's completely safe to just convert the above code to:
This was how the language worked for many releases. Then we introduced lambdas. And now there is a very different meaning if your code happens to actually be:
Now, moving the declaration to be with the assignment changes the meaning of your code. This is very unintuitive to some users, but plays deeply into concepts of scopes and captures and how they interact.
These unintuitive issues can and do hit people occasionally. However, we accept such non-intuitive aspects of the language due to the greater benefits that we see overall.
@aluanhaddad commented on Sun Nov 06 2016
@CyrusNajmabadi I was attempting to be playful when saying
Thanks for making the distinction clear. For what it is worth, in the case of closures, the semantic changes obviously broke existing code but the benefit was overwhelming, the expressiveness of the language increased dramatically. I'm not sure the same can be said here as it is mainly a choice of which glyph to use. I'm fairly confident that none of my code will break as I don't use
_
as an identifier or even a prefix.
@CyrusNajmabadi commented on Sun Nov 06 2016
That is not the case. The addition of closures should not have broken existing code. How could they? How would existing code even use closures :)
(Note: my above example is simply about the non-intuitiveness of lambda capturing. It's not about the actual breaking change we made with foreach-scoping changing).
@CyrusNajmabadi commented on Sun Nov 06 2016
I regret bringing up the foreach case because it also overlaps a decision we made to have an actual breaking change in the language. That's added some confusion around the point i was trying to make. So please use the below example instead:
Note: your specific code example is one i would describe as being a counter-intuitive code change result. Such things are not desirable, but exist in an enormous number of cases in the language. For example, for many, the following is counter-intuitive:
In nearly all code cases, it's completely safe to just convert the above code to:
This was how the language worked for many releases. Then we introduced lambdas. And now there is a very different meaning if your code happens to actually be:
Now, moving the declaration to be with the assignment changes the meaning of your code. This is very unintuitive to some users, but plays deeply into concepts of scopes and captures and how they interact.
These unintuitive issues can and do hit people occasionally. However, we accept such non-intuitive aspects of the language due to the greater benefits that we see overall.
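The original snippets for this example are not preserved in this copy of the thread. As a hedged stand-in, here is a minimal sketch of the general effect being described: moving a declaration next to its assignment changes what a lambda captures. The class and method names are hypothetical, and this is not necessarily the code the comment originally showed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CaptureSketch
{
    // Declaration outside the loop: every lambda captures the same variable,
    // so all of them observe its final value.
    public static int[] DeclaredOutsideLoop()
    {
        var funcs = new List<Func<int>>();
        int x;
        for (int i = 0; i < 3; i++)
        {
            x = i * 10;
            funcs.Add(() => x);
        }
        return funcs.Select(f => f()).ToArray();
    }

    // Declaration moved to the assignment: each iteration gets a fresh
    // variable, so each lambda observes its own value.
    public static int[] DeclaredWithAssignment()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int x = i * 10;
            funcs.Add(() => x);
        }
        return funcs.Select(f => f()).ToArray();
    }
}
```

Without the lambdas the two shapes behave identically, which is why the refactoring used to be safe. With them, the first method yields the same value three times and the second yields three distinct values.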
@CyrusNajmabadi commented on Sun Nov 06 2016
The overall point i'm making is that unrelated to actual breaking changes we often do add language features that can lead to unintuitive consequences to later code changes. Previously, it was always safe to do something, but the addition of lambdas made such cases no longer safe.
Note: at the time, this was a thing that people were actually concerned with. But we accepted that the issue was acceptable and that the benefits of how C# variables are captured outweighed potential pitfalls like the one i pointed out above.
Similar issues exist with many C# language features. Indeed, if a C# language feature involves scope, in any way, it's nearly always the case that there's some type of code change that may have unintuitive consequences in terms of how the code now runs. We do not seek to avoid that outright. It is simply too constraining, and would lead to far too unpleasant a language, where every feature we added had to painstakingly ensure that it could not be affected by anything else**.
** Note: this has been something that people have requested when we added new features. Linq/Extensions in particular were something that some wanted completely different scoping rules around. There was abject horror by some that pulling in a using could change how a method call was resolved. But, in the end, things turned out fine :)
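A minimal sketch of the extension-resolution behaviour mentioned in this thread, where an extension declared in the caller's own namespace is chosen ahead of one brought in by a `using`. The names `MyProject`, `LocalExtensions` and `Demo` are hypothetical and only serve the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace MyProject
{
    public static class LocalExtensions
    {
        // Counts how often this Where (rather than System.Linq's) is called.
        public static int Calls;

        public static IEnumerable<T> Where<T>(
            this IEnumerable<T> source, Func<T, bool> predicate)
        {
            Calls++;
            var result = new List<T>();
            foreach (var item in source)
            {
                if (predicate(item)) result.Add(item);
            }
            return result;
        }
    }

    public static class Demo
    {
        // Returns { number of matches, calls made to the local Where }.
        public static int[] Run()
        {
            int before = LocalExtensions.Calls;
            // Where binds to MyProject.LocalExtensions.Where because the
            // enclosing namespace is searched before the using directives.
            // Count still comes from System.Linq, since nothing nearer defines it.
            int matches = new[] { 1, 2, 3 }.Where(n => n > 1).Count();
            return new[] { matches, LocalExtensions.Calls - before };
        }
    }
}
```

Deleting `LocalExtensions.Where` would leave `Demo.Run` compiling unchanged but binding to `System.Linq.Enumerable.Where` instead. That is a change in meaning caused by an edit to the project, not by a compiler upgrade.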
@CyrusNajmabadi commented on Sun Nov 06 2016
Ah, sorry :) Internet communication is hard :)
@HaloFour commented on Sun Nov 06 2016
@aluanhaddad
Actually, in your example the
_
field wouldn't be affected because you're using the "long form" of wildcards where you're declaring a new scope. In that case the compiler can unambiguously determine that
_
is supposed to be a wildcard. I do believe this to be the prototypical case of using wildcards with deconstruction. The ambiguities only arise from the "short form" where the syntax might allow deconstruction or assignment into an existing identifier called
_
if it happens to exist.
To note, I do think that
_
is a perfectly good wildcard. If it weren't for the complications that arise from it being a legal identifier I'd wholeheartedly support it. Those complications only exist in the "short form", most specifically without
or raw assignments, which is why I'm proposing to curtail them both.
@AdamSpeight2008 commented on Mon Nov 07 2016
The ack symbol
@
would be a good symbol to use as the wildcard, as it has minimal current usage in both languages. It also doesn't have the problem of being a legal identifier in either language.
C#
@identifier @"verbatim string literal"
VB.net
All of which can be distinguished fairly easily.
Possible Sequence Operators
All of which are currently invalid, and thus available for use.
Bad point: visually prominent, but this could be alleviated by a wise color choice for syntax highlighting, e.g. one close to the background color, so it blends in.
In vb.net its usage doesn't look too bad (to me)
@svick commented on Mon Nov 07 2016
@HaloFour
I thought it would be interesting to test this assumption, so I analyzed C# repos that were trending today on GitHub (my source code). Out of the 3214 single parameter lambdas that ignored their parameter, 1000 used
_
, though the vast majority of those are from just 3 projects (Rx.NET, roslyn and UniRx) out of the 25.
To get a better picture of what "most C# devs" do, I also looked at some recently modified repos that have at most 100 stars. There, out of 354 single parameter lambdas that didn't use their parameter, 54 were
_
.
My conclusion is that
_
is indeed used as a wildcard, though not that often by most C# devs.
More data:
Trending repos
Single parameter lambdas with ignored parameter
_
arg0
x
i
ex
Multi parameter lambdas with one ignored parameter
_
c
sender
y
s
Multi parameter lambdas with multiple ignored parameters
useCachedResult, encoder
arg0, arg1
_, __
x, i
s, e
Recently updated repos with <= 100 stars
Single parameter lambdas with ignored parameter
index
_
ev
e
a
Multi parameter lambdas with one ignored parameter
sender
s
context
_
eventData
Multi parameter lambdas with multiple ignored parameters
sender, e
s, e
_, __
k, o
sender, args
@HaloFour commented on Mon Nov 07 2016
@svick
That's pretty awesome.
Conversely, I'm curious as to the number of times that
_
is used as a parameter (or variable in general) where the value is never used after it is assigned (regardless of how frequently it is assigned). That might give a better indicator as to whether or not developers in general treat that identifier as a wildcard, as well as other ways that identifier has been used. It's probably self-selecting, as the people likely to reach for
_
are the same people familiar with seeing how it's used in other languages.
The only caveat I'd have to numbers mined from GitHub is that the audience is likely a bit more technically advanced. It's not a good slice of what I think of as "typical developers", those internal business developers. But we clearly can't mine that code and any data is better than nothing.
@HaloFour commented on Mon Nov 07 2016
@AdamSpeight2008
You could also argue that the
@
character looks like a black hole into which the value is lost. A visual representation of
/dev/null
. 😁
@gafter commented on Mon Nov 07 2016
In case you don't know the story of @MadsTorgersen and me with the
@
character, see http://gafter.blogspot.com/2007/01/primate-parts.html
@AdamSpeight2008 commented on Tue Nov 08 2016
@gafter @MadsTorgersen So you're saying red would be the color for the possible wildcard
@
?
@DavidArno commented on Tue Nov 08 2016
@HaloFour,
In my view, all new features should be aimed at how those "more technically advanced" developers work/want to work.
@AdamSpeight2008 commented on Wed Nov 09 2016
Another alternative is
...
(triple dot).
@gafter commented on Fri Dec 02 2016
This work is being placed into the branch https://github.com/dotnet/roslyn/tree/features/wildcard during development.
@aluanhaddad commented on Thu Nov 17 2016
@HaloFour Thank you for explaining this.
@ErikSchierboom commented on Fri Nov 25 2016
@gafter The link to the wildcard branch is incorrect. It should be: https://github.com/dotnet/roslyn/tree/features/wildcard
@gafter commented on Sat Jan 14 2017
This work has been completed; closing.
@dsaf commented on Fri Mar 10 2017
scala/scala3#2041