Skip to content

Latest commit

 

History

History
111 lines (69 loc) · 5.71 KB

LDM-2022-01-26.md

File metadata and controls

111 lines (69 loc) · 5.71 KB

C# Language Design Meeting for January 26th, 2022

Agenda

  1. Open questions in UTF-8 string literals

Quote of the Day

  • "Doesn't everyone love the native compiler?"

Discussion

Open questions in UTF-8 string literals

https://github.com/dotnet/csharplang/blob/main/proposals/utf8-string-literals.md#unresolved-questions
#184

Today, we looked to answer some initial questions on the UTF-8 prototype implementation. It's important to note that this prototype is intended to be handed off to partners to play with so we can get some feedback on what works and what doesn't: the rules we're deciding today are not intended to be the final LDM rulings on the questions. We can and should revisit some or all of these questions after initial impressions are gathered from users.

Conversions from null literals

We do have precedent in the language for conversions that depend on the constant value being converted: for example, integer constants can be converted to bytes if the constant value would fit in a byte. However, we don't think that it's necessary to special case here. We already allow (string)null to be converted to Span<char> via a user-defined conversion, so we don't think there's any new ground being tread here. It does marginally widen the breaking change surface, but we think that's ok for the prototype.

Conclusion

The conversion will apply to null string constants.

Conversion kinds

After examining the existing types of implicit conversions, we think conversion is best specified as a new kind of conversion. It doesn't fit well into any existing conversion: constant conversions, for example, have both constant input and output values. This conversion won't have a constant output value.

Conclusion

We will introduce a new conversion kind for string constant to UTF-8 bytes.

Implicit standard conversion

The next question after deciding on a separate kind is whether this conversion will be a standard conversion or not. This impacts whether it can be used as an input or output conversion for user-defined conversions, and is also where history complicates our decision a bit. The native C# compiler (used for C# 5 and earlier) implemented the spec incorrectly and considered more conversions to be standard than actually are, leading to an experience that is potentially inconsistent, no matter what we do. For now, though, our initial gut reaction is that there's too much involved in one of these conversions to be considered in the list of standard conversions. We will revisit this question after prototype feedback.

Conclusion

Not a standard conversion, for now.

Expression tree representation

In theory, this conversion could be represented in an expression tree just as the final bytes wrapping an array creation. However, the conversion itself has semantic meaning, and we think that it's important for expression trees to show this meaning. Therefore, we will block this conversion in expression trees, and future modernization efforts will consider this case well.

Conclusion

Blocked.

Natural type of u8 literals

Since this is a prototype, we think that the proposed byte[] is fine for now. We will likely want to have a deeper debate about whether string literals should have a natural type of a mutable array, but we don't think that debate is necessary for now.

Conclusion

The natural type of u8 literals will be byte[], for the prototype.

Conversion depth

There is some interplay here with the standard conversion rules: if the conversion is a standard conversion, then any place that takes a byte[] as a user defined conversion would be able to take a string literal. That leaves things like IEnumerable<byte>, which standard conversions have no impact on.

There's a slippery slope here. If we don't make the conversion a standard conversion, then there's a potentially-infinite number of places to make the conversion work. We'll err on the side of getting feedback for now: the current set of conversion targets (byte[], Span<byte>, and ReadOnlySpan<byte>), and we'll see what users think after using the prototype.

Conclusion

No new conversion targets added for now.

Breaking changes

This one is definitely the trickiest of the issues we're discussing. We just made a big breaking change in C# with lambda natural types. This was successful, but it took a lot of users using the feature and telling us what was broken for us to hammer down the rules. The better function member rule as proposed will address some of the breaking changes, but not all: new instance methods could be called instead of extension methods, for example. Additionally, the justification we used for such changes with lambdas does not apply here. The justification for lambdas was that any extension methods almost certainly delegated to the original method, just providing a target type. This reasoning almost certainly does not hold up for byte[] vs string.

We think the best way to gather feedback is again to be bold and have the prototype make no adjustments at all. This will almost certainly break some people, but based on the lambda feature we have real evidence that people use our previews and report back to us when things break. This feedback will be used to help determine how agressive we need to be around breaking changes, or whether we need to abandon the conversion form entirely and only do a u8 syntax form.

Conclusion

The prototype will not adjust any rules here, so we can hopefully see what breaks in practice.

Suffix case sensitivity

We should be consistent with other constant suffixes.

Conclusion

Consistency means that u8 or U8 will be allowed.