From 38e7ce29fabcf4c40d0aae8cea82c4dd09c57518 Mon Sep 17 00:00:00 2001 From: Fredric Silberberg Date: Thu, 18 Feb 2021 17:49:07 -0800 Subject: [PATCH 01/13] Add improved interpolated strings spec. --- proposals/improved-interpolated-strings.md | 256 +++++++++++++++++++++ 1 file changed, 256 insertions(+) create mode 100644 proposals/improved-interpolated-strings.md diff --git a/proposals/improved-interpolated-strings.md b/proposals/improved-interpolated-strings.md new file mode 100644 index 0000000000..b830882c00 --- /dev/null +++ b/proposals/improved-interpolated-strings.md @@ -0,0 +1,256 @@ +# Improved Interpolated Strings + +## Summary + +We introduce a new pattern for creating and using interpolated string expressions to allow for efficient formatting and use in both general `string` scenarios +and more specialized scenarios such as logging frameworks, without incurring unnecessary allocations from formatting the string in the framework. + +## Motivation + +Today, string interpolation mainly lowers down to a call to `string.Format`. This, while general purpose, can be inefficient for a number of reasons: + +1. It boxes any struct arguments, unless the runtime has happened to introduce an overload of `string.Format` that takes exactly the correct types of arguments +in exactly the correct order. + * This ordering is why the runtime is hesitant to introduce generic versions of the method, as it would lead to combinatoric explosion of generic instantiations + of a very common method. +2. It has to allocate an array for the arguments in most cases. +3. There is no opportunity to avoid instantiating the instance if it's not needed. Logging frameworks, for example, will recommend avoiding string interpolation +because it will cause a string to be realized that may not be needed, depending on the current log-level of the application. + +Internally, the runtime has a type called `ValueStringBuilder` to help deal with the first 2 of these scenarios.
They pass a stackalloc'd buffer to the builder, +repeatedly call `AppendFormat` with every part, and then get a final string out. If the resulting string goes past the bounds of the stack buffer, they can then +move to an array on the heap. However, this type is dangerous to expose directly, as incorrect usage could lead to a rented array to be double-disposed, which +then will cause all sorts of undefined behavior in the program as two locations think they have sole access to the rented array. This proposal creates a way to +use this type safely from native C# code by just writing an interpolated string literal, leaving written code unchanged while improving every interpolated string +that a user writes. It also extends this pattern to allow for interpolated strings passed as arguments to other methods to use a builder pattern, defined by +receiver of the method, that will allow things like logging frameworks to avoid allocating strings that will never be needed, and giving C# users familiar, +convenient interpolation syntax. + +## Detailed Design + +### The builder pattern + +We introduce a new builder pattern that can represent an interpolated string passed as an argument to a method. The simple English of the pattern is as follows: + +When a `string` is passed as an argument to a method, we look at the receiver of the method. If the receiver has an invocable member `GetInterpolatedStringBuilder` +that can invoked with 2 int parameters, `baseLength` and `formatHoleCount`, and that returns a type that is identity-convertible to the type of the corresponding +parameter, and that type has instance `TryFormat` methods can be invoked for every part of the interpolated string, then we lower the interpolation using that, +instead of into a traditional call to `string.Format(formatStr, args)`. 
A more concrete example is helpful for picturing this: + +```cs +// The builder that will actually "build" the interpolated string" +public ref struct LoggerParamsBuilder +{ + // Storage for the built-up string + + private bool _logLevelEnabled; + + public LoggerParamsBuilder(int baseLength, int formatHoleCount, bool logLevelEnabled) + { + // Initialization logic + _logLevelEnabled = logLevelEnabled; + } + + public bool TryFormat(string s) + { + if (!_logLevelEnabled) return false; + + // Store and format part as required + return true; + } + + public bool TryFormat<T>(T t) + { + if (!_logLevelEnabled) return false; + + // Store and format part as required + return true; + } +} + +// The logger class. The user has an instance of this, accesses it via static state, or some other access +// mechanism +public class Logger +{ + // Initialization code omitted + private LogLevel _currentLogLevel; + + public class LoggerImpl + { + LogLevel _myLogLevel; + Logger _parent; + internal LoggerImpl(LogLevel myLogLevel, Logger parent) + { + _myLogLevel = myLogLevel; + } + + public LoggerParamsBuilder GetInterpolatedStringBuilder(int baseLength, int formatHoleCount) + { + return new LoggerParamsBuilder(baseLength, formatHoleCount, logLevelEnabled: _parent._currentLogLevel >= _myLogLevel); + } + + public void Log(LoggerParamsBuilder builder) + { + // Impl of logging + } + } + + public LoggerImpl Trace { get; } = new LoggerImpl(LogLevel.Trace, this); // Would need to be in a constructor to use `this` in real code.
+} + +Logger logger = GetLogger(LogLevel.Info); + +// Given the above definitions, usage looks like this: +logger.Trace.Log($"{"this"} will never be printed because info is < trace!"); + +// This is converted to: +var receiverTemp = logger.Trace; +var builder = receiverTemp.GetInterpolatedStringBuilder(baseLength: 47, formatHoleCount: 1); +_ = builder.TryFormat("this") && builder.TryFormat(" will never be printed because info is < trace!"); +receiverTemp.Log(builder); +``` + +Here, because `logger.Trace` has an instance method called `GetInterpolatedStringBuilder` with the correct parameters, that returns a value of the type that `Log` was +expecting, we say that the interpolated string has an implicit builder conversion to that parameter, and it lowers to the pattern shown above. The specese needed for +this is a bit complicated, and is expanded below. + +#### Builder type applicability + +A type is said to be an _applicable\_interpolated\_string\_builder\_type_ if, given an _interpolated\_string\_literal_ `S`, the following is true: + +* Overload resolution with an identifier of `TryFormat` and a parameter type of `string` succeeds, and contains a single instance method that returns a `bool`. +* For every _regular\_balanced\_text_ component of `S` (`Si`) without an _interpolation\_format_ component, overload resolution with an identifier of `TryFormat` and parameter +of the type of `Si` succeeds, and contains a single instance method that returns a `bool`. +* For every _regular\_balanced\_text_ component of `S` (`Si`) with an _interpolation\_format_ component, overload resolution with an identifier of `TryFormat` and parameter +types of `Si` and `string` succeeds, and contains a single instance method that returns a `bool`. + +Note that these rules do not permit extension methods for the `TryFormat` calls. 
We could consider enabling that if we choose, but this is analogous to the enumerator +pattern, where we allow `GetEnumerator` to be an extension method, but not `Current` or `MoveNext()`. + +#### Interpolated string builder conversion + +We add a new implicit conversion type: The _implicit\_string\_builder\_conversion_. An _implicit\_string\_builder\_conversion_ permits an _interpolated\_string\_expression_ +to be converted to an _applicable\_interpolated\_string\_builder\_type_. There are 2 ways that this conversion can occur: + +1. A method argument is converted as part of determining applicable function members (covered below), or +2. Given an _interpolated\_string\_expression_ `S` being converted to type `T`, the following is true: + * `T` is an _applicable\_interpolated\_string\_builder\_type_, and + * Overload resolution on the type `T` with the identifier `GetInterpolatedStringBuilder` and 2 int parameters with names `baseLength` and `formatHoleCount` returns a + single static method with return type `T`. + +#### Applicable function member adjustments + +We adjust the wording of the [applicable function member algorithm](https://github.com/dotnet/csharplang/blob/master/spec/expressions.md#applicable-function-member) +as follows (a new sub-bullet is added at the front of each section, in bold): + +A function member is said to be an ***applicable function member*** with respect to an argument list `A` when all of the following are true: +* Each argument in `A` corresponds to a parameter in the function member declaration as described in [Corresponding parameters](expressions.md#corresponding-parameters), and any parameter to which no argument corresponds is an optional parameter. 
+* For each argument in `A`, the parameter passing mode of the argument (i.e., value, `ref`, or `out`) is identical to the parameter passing mode of the corresponding parameter, and + * **for an interpolated string argument to a value parameter, the type of the corresponding parameter is an _applicable\_interpolated\_string\_builder\_type_, and overload resolution on the receiver of this function with an identifier of `GetInterpolatedStringBuilder` with 2 int parameters of names `baseLength` and `formatHoleCount` succeeds with 1 invocable member, and the return type of that member is _identity\_convertible_ to the type of the corresponding parameter. An interpolated string argument applicable in this way is said to be immediately converted to the corresponding parameter type with an implicit _interpolated\_string\_builder\_conversion_. Or** + * for a value parameter or a parameter array, an implicit conversion ([Implicit conversions](conversions.md#implicit-conversions)) exists from the argument to the type of the corresponding parameter, or + * for a `ref` or `out` parameter, the type of the argument is identical to the type of the corresponding parameter. After all, a `ref` or `out` parameter is an alias for the argument passed. +For a function member that includes a parameter array, if the function member is applicable by the above rules, it is said to be applicable in its ***normal form***. If a function member that includes a parameter array is not applicable in its normal form, the function member may instead be applicable in its ***expanded form***: +* The expanded form is constructed by replacing the parameter array in the function member declaration with zero or more value parameters of the element type of the parameter array such that the number of arguments in the argument list `A` matches the total number of parameters. 
If `A` has fewer arguments than the number of fixed parameters in the function member declaration, the expanded form of the function member cannot be constructed and is thus not applicable. +* Otherwise, the expanded form is applicable if for each argument in `A` the parameter passing mode of the argument is identical to the parameter passing mode of the corresponding parameter, and + * **for an interpolated string argument to a fixed value parameter or a value parameter created by the expansion, the type of the corresponding parameter is an _applicable\_interpolated\_string\_builder\_type_, and overload resolution on the receiver of this function with an identifier of `GetInterpolatedStringBuilder` with 2 int parameters of names `baseLength` and `formatHoleCount` succeeds with 1 invocable member, and the return type of that member is _identity\_convertible_ to the type of the corresponding parameter. An interpolated string argument applicable in this way is said to be immediately converted to the corresponding parameter type with an implicit _interpolated\_string\_builder\_conversion_. Or** + * for a fixed value parameter or a value parameter created by the expansion, an implicit conversion ([Implicit conversions](conversions.md#implicit-conversions)) exists from the type of the argument to the type of the corresponding parameter, or + * for a `ref` or `out` parameter, the type of the argument is identical to the type of the corresponding parameter. + +Important note: this means that if there are 2 otherwise equivalent overloads, one with a builder type that creates an _interpolated\_string\_builder\_conversion_ without +needing the receiver, and one that creates one by calling a method on the receiver, these overloads will be considered ambiguous. 
We could potentially make changes to the +better function member algorithm to resolve this if we so choose, but it would require distinguishing "naturally occurring" conversions from conversions that only occur +because the receiver has an applicable `GetInterpolatedStringBuilder` method. + +#### Better conversion from expression adjustments + +We change the [better conversion from expression](https://github.com/dotnet/csharplang/blob/master/spec/expressions.md#better-conversion-from-expression) section to the +following: + +Given an implicit conversion `C1` that converts from an expression `E` to a type `T1`, and an implicit conversion `C2` that converts from an expression `E` to a type `T2`, `C1` is a ***better conversion*** than `C2` if: +1. `E` is an _interpolated\_string\_expression_, `C1` is an _interpolated\_string\_builder\_conversion_, `T1` is an _applicable\_interpolated\_string\_builder\_type_, and `C2` is not an _interpolated\_string\_builder\_conversion_, or +2. `E` does not exactly match `T2` and at least one of the following holds: + * `E` exactly matches `T1` ([Exactly matching Expression](expressions.md#exactly-matching-expression)) + * `T1` is a better conversion target than `T2` ([Better conversion target](expressions.md#better-conversion-target)) + +This change does mean that, given an overload of `Log(string s)` and `Log(LoggerParamsBuilder l)`, `Log($"")` will prefer the second overload, not the first, assuming +the above logging example. This is added so that existing string literals can still be used with such a logging framework, and users are silently converted to better +behavior if the type author introduces a method to enable this. + +### InterpolatedStringBuilder and Usage + +We introduce a new type in `System.Runtime.CompilerServices`: `InterpolatedStringBuilder`. This is a ref struct with many of the same semantics as `ValueStringBuilder`, +intended for direct use by the C# compiler.
This struct would look approximately like this: + +```cs +public ref struct InterpolatedStringBuilder +{ + private char[] _array; + internal int _count; + public SpanInterpolatedStringBuilder(int baseLength, int numHoles) + { + _array = ArrayPool<char>.Shared.Rent(baseLength); + _count = 0; + } + public string ToString() + { + string result = _array.AsSpan(0, _count).ToString(); + ArrayPool<char>.Shared.Return(_array); + return result; + } + public bool TryFormat(string s) => TryFormat((ReadOnlySpan<char>)s); + public bool TryFormat(ReadOnlySpan<char> s) + { + if (s.Length >= _array.Length - _count) Grow(); + s.CopyTo(_array.AsSpan(_count)); + _count += s.Length; + return true; + } + … // other TryFormat overloads for other types, a generic, etc. +} +``` + +We also provide a new `string.Format` overload, as follows: + +```cs +public class String +{ + public static string Format(InterpolatedStringBuilder builder) => builder.ToString(); +} +``` + +We make a slight change to the rules for the meaning of an [_interpolated\_string\_expression_](https://github.com/dotnet/csharplang/blob/master/spec/expressions.md#interpolated-strings): + +If the type of an interpolated string is `System.IFormattable` or `System.FormattableString`, the meaning is a call to `System.Runtime.CompilerServices.FormattableStringFactory.Create`. If the type is `string`, the meaning of the expression is a call to `string.Format`. In both cases **if there exists an overload that takes an instance of an _applicable\_interpolated\_string\_builder\_type_, that overload is used according to the builder pattern. Otherwise**, the argument list of the call consists of a format string literal with placeholders for each interpolation, and an argument for each expression corresponding to the place holders. + +### Lowering + +Both the general pattern and the specific changes for interpolated strings directly converted to `string`s follow the same lowering pattern.
The `GetInterpolatedStringBuilder` method is +invoked on the receiver (whether that's the temporary method receiver for an _interpolated\_string\_builder\_conversion_ derived from the applicable function member algorithm, or a +standard conversion derived from the target type), and stored into a temp local. `TryFormat` is then repeatedly invoked on that temp, with each part of the interpolated string, in order. +The temp is then evaluated as the result of the expression. + +## Other considerations + +### Incorporating spans for heap-less strings + +`ValueStringBuilder` as it exists today has 2 constructors: one that takes a count, and allocates on the heap eagerly, and one that takes a `Span`. That `Span` is usually +a fixed size in the runtime codebase, around 250 elements on average. To truly replace that type, we should consider an extension to this where we also recognize `GetInterpolatedString` +methods that take a `Span`, instead of just the count version. However, we see a few potential thorny cases to resolve here: + +* We don't want to stackalloc repeatedly in a hot loop. If we were to do this extension to the feature, we'd likely want to share the stackalloc'd span between loop +iterations. We know this is safe, as `Span` is a ref struct that can't be stored on the heap, and users would have to be pretty devious to manage to extract a +reference to that `Span` (such as creating a method that accepts such a builder then deliberately retrieving the `Span` from the builder and returning it to the +caller). However, allocating ahead of time produces other questions: + * Should we eagerly stackalloc? What if the loop is never entered, or exists before it needs the space? + * If we don't eagerly stackalloc, does that mean we introduce a hidden branch on every loop? Most loops likely won't care about this, but it could affect some tight loops that don't + want to pay the cost. 
+* Some strings can be quite big, and the appropriate amount to `stackalloc` is dependent on a number of factors, including runtime factors. We don't really want the C# compiler and +specification to have to determine this ahead of time, so we'd want to resolve https://github.com/dotnet/runtime/issues/25423 and add an API for the compiler to call in these cases. It +also adds more pros and cons to the points from the previous loop, where we don't want to potentially allocate large arrays on the heap many times or before one is needed. + +### Non-try version of the API + +For simplicity, this spec currently just proposes recognizing a `TryFormat` method, and things that always succeed (like `InterpolatedStringBuilder`) would always return true from the method. +This was done to support partial formatting scenarios where the user wants to stop formatting if an error occurs or if it's unnecessary, such as the logging case, but could potentially +introduce a bunch of unnecessary branches in standard interpolated string usage. We could consider an addendum where we use just `Format` methods if no `TryFormat` method is present, but +it does present questions about what we do if there's a mix of both TryFormat and Format calls. 
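The trade-off described under "Non-try version of the API" can be pictured with a hand-written stand-in for the lowering. This is only a sketch under stated assumptions: `TryStyleBuilder`, `FormatStyleBuilder`, `NonTryLoweringSketch`, and the `partsAllowed` parameter are hypothetical names invented for illustration, and the lowered calls are written out manually because no compiler support for either pattern is assumed.

```cs
using System;

// Hypothetical builder for the pattern as proposed: every part goes through TryFormat,
// which may return false to stop formatting early.
public ref struct TryStyleBuilder
{
    private string _text;
    private int _partsAllowed;   // illustrative: how many parts are accepted before giving up
    public int Calls;            // counts TryFormat invocations so the short-circuit is visible

    public TryStyleBuilder(int baseLength, int formatHoleCount, int partsAllowed)
    {
        _text = "";
        _partsAllowed = partsAllowed;
        Calls = 0;
    }

    public bool TryFormat<T>(T value)
    {
        Calls++;
        if (_partsAllowed == 0) return false;
        _partsAllowed--;
        _text += value;
        return true;
    }
}

// Hypothetical builder for the possible addendum: parts always succeed, so the methods return void.
public ref struct FormatStyleBuilder
{
    private string _text;

    public FormatStyleBuilder(int baseLength, int formatHoleCount)
    {
        _text = "";
    }

    public void Format<T>(T value) => _text += value;

    public string GetText() => _text;
}

public static class NonTryLoweringSketch
{
    // Hand-written stand-in for lowering $"{1} and {2}" with TryFormat: the && chain
    // stops at the first false, at the cost of one branch per part.
    public static int CountTryFormatCalls(int partsAllowed)
    {
        var builder = new TryStyleBuilder(baseLength: 5, formatHoleCount: 2, partsAllowed: partsAllowed);
        _ = builder.TryFormat(1) && builder.TryFormat(" and ") && builder.TryFormat(2);
        return builder.Calls;
    }

    // The same string lowered against Format: straight-line calls with no branches.
    public static string FormatAll()
    {
        var builder = new FormatStyleBuilder(baseLength: 5, formatHoleCount: 2);
        builder.Format(1);
        builder.Format(" and ");
        builder.Format(2);
        return builder.GetText();
    }
}
```

With `partsAllowed: 1` the second `TryFormat` call returns false and the third is never made, which is the partial-formatting behavior the `Try` shape exists for. The `Format` shape drops those branches but cannot stop early, so a builder type that mixes both shapes would need a rule this spec does not yet define.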
From cf1e7623f7f4bc514ee3a38b7626285ad24fd40c Mon Sep 17 00:00:00 2001 From: Fred Silberberg Date: Thu, 18 Feb 2021 18:00:48 -0800 Subject: [PATCH 02/13] Add missing assignment --- proposals/improved-interpolated-strings.md | 1 + 1 file changed, 1 insertion(+) diff --git a/proposals/improved-interpolated-strings.md b/proposals/improved-interpolated-strings.md index b830882c00..507b23c9d4 100644 --- a/proposals/improved-interpolated-strings.md +++ b/proposals/improved-interpolated-strings.md @@ -82,6 +82,7 @@ public class Logger internal LoggerImpl(LogLevel myLogLevel, Logger parent) { _myLogLevel = myLogLevel; + _parent = parent; } public LoggerParamsBuilder GetInterpolatedStringBuilder(int baseLength, int formatHoleCount) From 4987982fbeb94c0eeb5a87d4764e74a14e64481a Mon Sep 17 00:00:00 2001 From: Fred Silberberg Date: Thu, 18 Feb 2021 18:06:01 -0800 Subject: [PATCH 03/13] A word --- proposals/improved-interpolated-strings.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/improved-interpolated-strings.md b/proposals/improved-interpolated-strings.md index 507b23c9d4..e7f24a0039 100644 --- a/proposals/improved-interpolated-strings.md +++ b/proposals/improved-interpolated-strings.md @@ -242,7 +242,7 @@ methods that take a `Span`, instead of just the count version. However, we iterations. We know this is safe, as `Span` is a ref struct that can't be stored on the heap, and users would have to be pretty devious to manage to extract a reference to that `Span` (such as creating a method that accepts such a builder then deliberately retrieving the `Span` from the builder and returning it to the caller). However, allocating ahead of time produces other questions: - * Should we eagerly stackalloc? What if the loop is never entered, or exists before it needs the space? + * Should we eagerly stackalloc? What if the loop is never entered, or exits before it needs the space? 
* If we don't eagerly stackalloc, does that mean we introduce a hidden branch on every loop? Most loops likely won't care about this, but it could affect some tight loops that don't want to pay the cost. * Some strings can be quite big, and the appropriate amount to `stackalloc` is dependent on a number of factors, including runtime factors. We don't really want the C# compiler and From 46a0842afcc765b9d6041e807a1dba9862084b28 Mon Sep 17 00:00:00 2001 From: Fredric Silberberg Date: Fri, 19 Feb 2021 14:00:36 -0800 Subject: [PATCH 04/13] Review feedback: * Some grammar cleanup * Add additional motivation around Span * Look for a constructor on the type, rather than a static method, when looking for an interpolated_string_builder_conversion * Clarify that we can look either on instance receivers or containing types for GetInterpolatedString methods. * Slightly adjust the better conversion rules to ensure that constants still bind to string overloads, and add a detailed example explaining the resulting rules. * Example cleanup * Add discussion and examples of other discussed use cases. --- proposals/improved-interpolated-strings.md | 156 +++++++++++++++++++-- 1 file changed, 141 insertions(+), 15 deletions(-) diff --git a/proposals/improved-interpolated-strings.md b/proposals/improved-interpolated-strings.md index e7f24a0039..3e195b524b 100644 --- a/proposals/improved-interpolated-strings.md +++ b/proposals/improved-interpolated-strings.md @@ -16,6 +16,8 @@ in exactly the correct order. 2. It has to allocate an array for the arguments in most cases. 3. There is no opportunity to avoid instantiating the instance if it's not needed. Logging frameworks, for example, will recommend avoiding string interpolation because it will cause a string to be realized that may not be needed, depending on the current log-level of the application. +4.
It can never use `Span` or other ref struct types today, because ref structs are not allowed as generic type arguments, meaning that if a user wants to avoid +copying to intermediate locations they have to manually format strings. Internally, the runtime has a type called `ValueStringBuilder` to help deal with the first 2 of these scenarios. They pass a stackalloc'd buffer to the builder, repeatedly call `AppendFormat` with every part, and then get a final string out. If the resulting string goes past the bounds of the stack buffer, they can then @@ -32,10 +34,10 @@ convenient interpolation syntax. We introduce a new builder pattern that can represent an interpolated string passed as an argument to a method. The simple English of the pattern is as follows: -When a `string` is passed as an argument to a method, we look at the receiver of the method. If the receiver has an invocable member `GetInterpolatedStringBuilder` -that can invoked with 2 int parameters, `baseLength` and `formatHoleCount`, and that returns a type that is identity-convertible to the type of the corresponding -parameter, and that type has instance `TryFormat` methods can be invoked for every part of the interpolated string, then we lower the interpolation using that, -instead of into a traditional call to `string.Format(formatStr, args)`. A more concrete example is helpful for picturing this: +When an _interpolated\_string\_expression_ is passed as an argument to a method, we look at the receiver of the method. If the receiver has an invocable member +`GetInterpolatedStringBuilder` that can be invoked with 2 int parameters, `baseLength` and `formatHoleCount`, and that returns a type that is identity-convertible +to the type of the corresponding parameter, and that type has instance `TryFormat` methods that can be invoked for every part of the interpolated string, then we +lower the interpolation using that, instead of into a traditional call to `string.Format(formatStr, args)`.
A more concrete example is helpful for picturing this: ```cs // The builder that will actually "build" the interpolated string" @@ -136,8 +138,7 @@ to be converted to an _applicable\_interpolated\_string\_builder\_type_. There a 1. A method argument is converted as part of determining applicable function members (covered below), or 2. Given an _interpolated\_string\_expression_ `S` being converted to type `T`, the following is true: * `T` is an _applicable\_interpolated\_string\_builder\_type_, and - * Overload resolution on the type `T` with the identifier `GetInterpolatedStringBuilder` and 2 int parameters with names `baseLength` and `formatHoleCount` returns a - single static method with return type `T`. + * `T` has an accessible constructor that takes 2 int parameters with the names `baseLength` and `formatHoleCount`, in that order. #### Applicable function member adjustments @@ -147,13 +148,13 @@ as follows (a new sub-bullet is added at the front of each section, in bold): A function member is said to be an ***applicable function member*** with respect to an argument list `A` when all of the following are true: * Each argument in `A` corresponds to a parameter in the function member declaration as described in [Corresponding parameters](expressions.md#corresponding-parameters), and any parameter to which no argument corresponds is an optional parameter. 
* For each argument in `A`, the parameter passing mode of the argument (i.e., value, `ref`, or `out`) is identical to the parameter passing mode of the corresponding parameter, and - * **for an interpolated string argument to a value parameter, the type of the corresponding parameter is an _applicable\_interpolated\_string\_builder\_type_, and overload resolution on the receiver of this function with an identifier of `GetInterpolatedStringBuilder` with 2 int parameters of names `baseLength` and `formatHoleCount` succeeds with 1 invocable member, and the return type of that member is _identity\_convertible_ to the type of the corresponding parameter. An interpolated string argument applicable in this way is said to be immediately converted to the corresponding parameter type with an implicit _interpolated\_string\_builder\_conversion_. Or** + * **for an interpolated string argument to a value parameter, the type of the corresponding parameter is an _applicable\_interpolated\_string\_builder\_type_, and overload resolution on the instance receiver of this function (if it is an instance method) or on its containing type (if it is a static method) with an identifier of `GetInterpolatedStringBuilder` with 2 int parameters of names `baseLength` and `formatHoleCount` succeeds with 1 invocable member, and the return type of that member is _identity\_convertible_ to the type of the corresponding parameter. An interpolated string argument applicable in this way is said to be immediately converted to the corresponding parameter type with an implicit _interpolated\_string\_builder\_conversion_. Or** * for a value parameter or a parameter array, an implicit conversion ([Implicit conversions](conversions.md#implicit-conversions)) exists from the argument to the type of the corresponding parameter, or * for a `ref` or `out` parameter, the type of the argument is identical to the type of the corresponding parameter.
After all, a `ref` or `out` parameter is an alias for the argument passed. For a function member that includes a parameter array, if the function member is applicable by the above rules, it is said to be applicable in its ***normal form***. If a function member that includes a parameter array is not applicable in its normal form, the function member may instead be applicable in its ***expanded form***: * The expanded form is constructed by replacing the parameter array in the function member declaration with zero or more value parameters of the element type of the parameter array such that the number of arguments in the argument list `A` matches the total number of parameters. If `A` has fewer arguments than the number of fixed parameters in the function member declaration, the expanded form of the function member cannot be constructed and is thus not applicable. * Otherwise, the expanded form is applicable if for each argument in `A` the parameter passing mode of the argument is identical to the parameter passing mode of the corresponding parameter, and - * **for an interpolated string argument to a fixed value parameter or a value parameter created by the expansion, the type of the corresponding parameter is an _applicable\_interpolated\_string\_builder\_type_, and overload resolution on the receiver of this function with an identifier of `GetInterpolatedStringBuilder` with 2 int parameters of names `baseLength` and `formatHoleCount` succeeds with 1 invocable member, and the return type of that member is _identity\_convertible_ to the type of the corresponding parameter. An interpolated string argument applicable in this way is said to be immediately converted to the corresponding parameter type with an implicit _interpolated\_string\_builder\_conversion_. 
Or** + * **for an interpolated string argument to a fixed value parameter or a value parameter created by the expansion, the type of the corresponding parameter is an _applicable\_interpolated\_string\_builder\_type_, and overload resolution on the instance receiver of this function (if it is an instance method) or on its containing type (if it is a static method) with an identifier of `GetInterpolatedStringBuilder` with 2 int parameters of names `baseLength` and `formatHoleCount` succeeds with 1 invocable member, and the return type of that member is _identity\_convertible_ to the type of the corresponding parameter. An interpolated string argument applicable in this way is said to be immediately converted to the corresponding parameter type with an implicit _interpolated\_string\_builder\_conversion_. Or** * for a fixed value parameter or a value parameter created by the expansion, an implicit conversion ([Implicit conversions](conversions.md#implicit-conversions)) exists from the type of the argument to the type of the corresponding parameter, or * for a `ref` or `out` parameter, the type of the argument is identical to the type of the corresponding parameter. @@ -168,14 +169,23 @@ We change the [better conversion from expression](https://github.com/dotnet/csha following: Given an implicit conversion `C1` that converts from an expression `E` to a type `T1`, and an implicit conversion `C2` that converts from an expression `E` to a type `T2`, `C1` is a ***better conversion*** than `C2` if: -1. `E` is an _interpolated\_string\_expression_, `C1` is an _interpolated\_string\_builder\_conversion_, `T1` is an _applicable\_interpolated\_string\_builder\_type_, and `C2` is not an _interpolated\_string\_builder\_conversion_, or +1.
`E` is a non-constant _interpolated\_string\_expression_, `C1` is an _interpolated\_string\_builder\_conversion_, `T1` is an _applicable\_interpolated\_string\_builder\_type_, and `C2` is not an _interpolated\_string\_builder\_conversion_, or 2. `E` does not exactly match `T2` and at least one of the following holds: * `E` exactly matches `T1` ([Exactly matching Expression](expressions.md#exactly-matching-expression)) * `T1` is a better conversion target than `T2` ([Better conversion target](expressions.md#better-conversion-target)) -This change does mean that, given an overload of `Log(string s)` and `Log(LoggerParamsBuilder l)`, `Log($"")` will prefer the second overload, not the first, assuming -the above logging example. This is added so that existing string literals can still be used with such a logging framework, and users are silently converted to better -behavior if the type author introduces a method to enable this. +This does mean that there are some potentially non-obvious overload resolution rules, depending on whether the interpolated string in question is a constant-expression or not. For example: + +```cs +void Log(string s) { ... } +void Log(LoggerParamsBuilder p) { ... } + +Log($""); // Calls Log(string s), because $"" is a constant expression +Log($"{"test"}"); // Calls Log(string s), because $"{"test"}" is a constant expression +Log($"{1}"); // Calls Log(LoggerParamsBuilder p), because $"{1}" is not a constant expression +``` + +This is introduced so that things that can simply be emitted as constants do so, and don't incur any overhead, while things that cannot be constant use the builder pattern. 
### InterpolatedStringBuilder and Usage @@ -187,9 +197,9 @@ public ref struct InterpolatedStringBuilder { private char[] _array; internal int _count; - public SpanInterpolatedStringBuilder(int baseLength, int numHoles) + public InterpolatedStringBuilder(int baseLength, int formatHoleCount) { - _array = ArrayPool.Shared.Rent(baseLength); + _array = ArrayPool.Shared.Rent(baseLength /* Or some calculation based on what we see on average for the length of format holes */); _count = 0; } public string ToString() @@ -206,7 +216,7 @@ public ref struct InterpolatedStringBuilder _count += s.Length; return true; } - … // other TryFormat overloads for other types, a generic, etc. + … // other TryFormat overloads for other types (including ReadOnlySpan), a generic, etc. } ``` @@ -255,3 +265,119 @@ For simplicity, this spec currently just proposes recognizing a `TryFormat` meth This was done to support partial formatting scenarios where the user wants to stop formatting if an error occurs or if it's unnecessary, such as the logging case, but could potentially introduce a bunch of unnecessary branches in standard interpolated string usage. We could consider an addendum where we use just `Format` methods if no `TryFormat` method is present, but it does present questions about what we do if there's a mix of both TryFormat and Format calls. + +### Allow `string` types to be convertible to builders as well + +For type author simplicity, we could consider allowing expressions of type `string` to be implicitly-convertible to _applicable\_interpolated\_string\_builder\_types_. As proposed today, +authors will likely need to overload on both that builder type and regular `string` types, so their users don't have to understand the difference. This may be an annoying and non-obvious +overhead, as a `string` expression can be viewed as an interpolation with `expression.Length` prefilled length and 0 holes to be filled. 
+ +## Other use cases + +### `TryFormat` on `Span` receivers + +The BCL has a number of helper methods and usages of `ValueStringBuilder` that attempt to format a given string into a `Span`, and instead of moving to the heap if needed, give up if +the `Span` isn't big enough to hold the resulting text. With this proposal, it would be possible to support these cases by defining an extension method that looks like this: + +```cs +public static class MemoryExtensions +{ + public static bool TryWrite(this Span span, SpanInterpolatedStringBuilder builder, out int charsWritten) + { + charsWritten = builder._count; + return builder._success; + } + + public static SpanInterpolatedStringBuilder GetInterpolatedStringBuilder(this Span span, int baseLength, int formatHoleCount) => + new SpanInterpolatedStringBuilder(span, baseLength + formatHoleCount * AverageFormatHoleLengthConst); +} + +public ref struct SpanInterpolatedStringBuilder +{ + private Span _span; + internal bool _success; + internal int _count; + + public SpanInterpolatedStringBuilder(Span span, int baseLength) + { + _span = span; + _success = baseLength <= span.Length; + _count = 0; + } + + public bool TryFormat(string s) + { + if (!_success) + return false; + if (s.Length > _span.Length) + { + _success = false; + return false; + } + s.AsSpan().CopyTo(_span); + _span = _span.Slice(s.Length); + _count += s.Length; + return true; + } + + … // other TryFormat overloads for other types, a generic, etc.
+} + +bool success = destinationSpan.TryWrite($”{a} = {b}”, out int charsWritten); + +// Maps to + +var receiverTemp = destinationSpan; +var builder = receiverTemp.GetInterpolatedStringBuilder(baseLength: 3, formatHoleCount: 2); +_ = builder.TryFormat(a) && builder.TryFormat(“ = “) && builder.TryFormat(b); +bool success = receiverTemp.TryWrite(builder, out int charsWritten); +``` + +### Utf8Formatter.TryFormat + +We could enable utf8-encoding of interpolated strings via a pattern similar to this: + +```cs +public static partial class Utf8Formatter +{ + public static Utf8StringBuilder WithSpan(Span span) => new Utf8StringBuilder(span); +} + +public ref struct Utf8StringBuilder +{ + private Span _bytes; + public Utf8StringBuilder(Span bytes) => _bytes = bytes; + + public Utf8StringBuilder GetInterpolatedStringBuilder(int baseLength, int formatHoleCount) + { + return this; + } + + public bool TryFormat(Utf8StringBuilder builder, out int bytesWritten) + { + ... + } + + public bool TryFormat(string s) + { + ... + } + + … // other TryFormat overloads for other types, a generic, etc. +} + +Span myBytes = stackalloc byte[50]; +bool success = Utf8Formatter.WithSpan(myBytes).TryFormat($"Hello world! {myVar}"); + +// Maps to + +var receiverTemp = Utf8Formatter.WithSpan(myBytes); +var builder = receiverTemp.GetInterpolatedStringBuilder(baseLength: 13, formatHoleCount: 1); +_ = builder.TryFormat("Hello world! ") && builder.TryFormat(myVar); +bool success = receiverTemp.TryFormat(builder, out int bytesWritten); +``` + +This differs from the existing patterns in the Utf8Formatter type, which take the `Span` to write into as an argument to the `TryFormat` method itself. This proposal is somewhat incompatible +with that approach, as it uses the receiver of the method to inform the builder of context, rather than using arguments to the method.
It could theoretically be feasible to thread arguments +from the current method into the implicit call to `GetInterpolatedStringBuilder`, but that raises a host of thorny issues around figuring out what corresponds to what in the signature, and significantly +complicates the determination of _applicable\_interpolated\_string\_builder\_types_. From 372fbd7aaaca53994240b93867ca22f605819424 Mon Sep 17 00:00:00 2001 From: Fredric Silberberg Date: Fri, 19 Feb 2021 14:23:19 -0800 Subject: [PATCH 05/13] Clarify applicable function member a bit more --- proposals/improved-interpolated-strings.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/improved-interpolated-strings.md b/proposals/improved-interpolated-strings.md index 3e195b524b..f1ff4ba33e 100644 --- a/proposals/improved-interpolated-strings.md +++ b/proposals/improved-interpolated-strings.md @@ -148,13 +148,13 @@ as follows (a new sub-bullet is added at the front of each section, in bold): A function member is said to be an ***applicable function member*** with respect to an argument list `A` when all of the following are true: * Each argument in `A` corresponds to a parameter in the function member declaration as described in [Corresponding parameters](expressions.md#corresponding-parameters), and any parameter to which no argument corresponds is an optional parameter.
* For each argument in `A`, the parameter passing mode of the argument (i.e., value, `ref`, or `out`) is identical to the parameter passing mode of the corresponding parameter, and - * **for an interpolated string argument to a value parameter, the type of the corresponding parameter is an _applicable\_interpolated\_string\_builder\_type_, and overload resolution on the instance receiver (if `A` is an instance method) or the containing type of `A` (if `A` is a static method) of this function with an identifier of `GetInterpolatedStringBuilder` with 2 int parameters of names `baseLength` and `formatHoleCount` succeeds with 1 invocable member, and the return type of that member is _identity\_convertible_ to the type of the corresponding parameter. An interpolated string argument applicable in this way is said to be immediately converted to the corresponding parameter type with an implicit _interpolated\_string\_builder\_conversion_. Or** + * **for an interpolated string argument to a value parameter, the type of the corresponding parameter is an _applicable\_interpolated\_string\_builder\_type_, and overload resolution on the instance receiver of `A` (if `A` is an instance method or extension method invoked in extension form) or the containing type of `A` (if instance overload resolution failed or if `A` is a static method not called as an extension method) with an identifier of `GetInterpolatedStringBuilder` with 2 int parameters of names `baseLength` and `formatHoleCount` succeeds with 1 invocable member, and the return type of that member is _identity\_convertible_ to the type of the corresponding parameter. An interpolated string argument applicable in this way is said to be immediately converted to the corresponding parameter type with an implicit _interpolated\_string\_builder\_conversion_. 
Or** * for a value parameter or a parameter array, an implicit conversion ([Implicit conversions](conversions.md#implicit-conversions)) exists from the argument to the type of the corresponding parameter, or * for a `ref` or `out` parameter, the type of the argument is identical to the type of the corresponding parameter. After all, a `ref` or `out` parameter is an alias for the argument passed. For a function member that includes a parameter array, if the function member is applicable by the above rules, it is said to be applicable in its ***normal form***. If a function member that includes a parameter array is not applicable in its normal form, the function member may instead be applicable in its ***expanded form***: * The expanded form is constructed by replacing the parameter array in the function member declaration with zero or more value parameters of the element type of the parameter array such that the number of arguments in the argument list `A` matches the total number of parameters. If `A` has fewer arguments than the number of fixed parameters in the function member declaration, the expanded form of the function member cannot be constructed and is thus not applicable. 
* Otherwise, the expanded form is applicable if for each argument in `A` the parameter passing mode of the argument is identical to the parameter passing mode of the corresponding parameter, and - * **for an interpolated string argument to a fixed value parameter or a value parameter created by the expansion, the type of the corresponding parameter is an _applicable\_interpolated\_string\_builder\_type_, and overload resolution on the instance receiver (if `A` is an instance method) or the containing type of `A` (if `A` is a static method) of this function with an identifier of `GetInterpolatedStringBuilder` with 2 int parameters of names `baseLength` and `formatHoleCount` succeeds with 1 invocable member, and the return type of that member is _identity\_convertible_ to the type of the corresponding parameter. An interpolated string argument applicable in this way is said to be immediately converted to the corresponding parameter type with an implicit _interpolated\_string\_builder\_conversion_. Or** + * **for an interpolated string argument to a fixed value parameter or a value parameter created by the expansion, the type of the corresponding parameter is an _applicable\_interpolated\_string\_builder\_type_, and overload resolution on the instance receiver of `A` (if `A` is an instance method or extension method invoked in extension form) or the containing type of `A` (if instance overload resolution failed or if `A` is a static method not called as an extension method) with an identifier of `GetInterpolatedStringBuilder` with 2 int parameters of names `baseLength` and `formatHoleCount` succeeds with 1 invocable member, and the return type of that member is _identity\_convertible_ to the type of the corresponding parameter. An interpolated string argument applicable in this way is said to be immediately converted to the corresponding parameter type with an implicit _interpolated\_string\_builder\_conversion_. 
Or** * for a fixed value parameter or a value parameter created by the expansion, an implicit conversion ([Implicit conversions](conversions.md#implicit-conversions)) exists from the type of the argument to the type of the corresponding parameter, or * for a `ref` or `out` parameter, the type of the argument is identical to the type of the corresponding parameter. From 797a8f3f6a4fa5e059a3a67c4e7e5b3f069888ff Mon Sep 17 00:00:00 2001 From: Fredric Silberberg Date: Thu, 25 Feb 2021 13:02:08 -0800 Subject: [PATCH 06/13] Updates after discussion with Mads: * Flip the script on the receiver. Rather than calling a method on the receiver, we pass the receiver to a static method. This unifies the way of obtaining a builder. * Always obtain the builder from a static method. This allows delegation to constructors, or more advanced methods if the implementor chooses. * Move the builder to an out parameter of the static method. This allows overloading of the GetInterpolatedStringBuilder method. Updated the logging example to show how this would help. * Added open questions as appropriate. --- proposals/improved-interpolated-strings.md | 130 ++++++++++++--------- 1 file changed, 73 insertions(+), 57 deletions(-) diff --git a/proposals/improved-interpolated-strings.md b/proposals/improved-interpolated-strings.md index f1ff4ba33e..c318191ff9 100644 --- a/proposals/improved-interpolated-strings.md +++ b/proposals/improved-interpolated-strings.md @@ -34,20 +34,26 @@ convenient interpolation syntax. We introduce a new builder pattern that can represent an interpolated string passed as an argument to a method. The simple English of the pattern is as follows: -When an _interpolated\_string\_expression_ is passed as an argument to a method, we look at the receiver of the method. 
If the receiver has an invocable member -`GetInterpolatedStringBuilder` that can invoked with 2 int parameters, `baseLength` and `formatHoleCount`, and that returns a type that is identity-convertible -to the type of the corresponding parameter, and that type has instance `TryFormat` methods can be invoked for every part of the interpolated string, then we -lower the interpolation using that, instead of into a traditional call to `string.Format(formatStr, args)`. A more concrete example is helpful for picturing this: +When an _interpolated\_string\_expression_ is passed as an argument to a method, we look at the type of the parameter. If the parameter type has a static method +`GetInterpolatedStringBuilder` that can be invoked with 2 int parameters, `baseLength` and `formatHoleCount`, optionally takes a parameter the receiver is convertible to, +and has an out parameter of the type of the original method's parameter, and that type has instance `TryFormat` methods that can be invoked for every part of the interpolated +string, then we lower the interpolation using that, instead of into a traditional call to `string.Format(formatStr, args)`.
A more concrete example is helpful for +picturing this: ```cs // The builder that will actually "build" the interpolated string" -public ref struct LoggerParamsBuilder +public ref struct TraceLoggerParamsBuilder { + public static void GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, Logger logger, out TraceLoggerParamsBuilder builder) + { + builder = new TraceLoggerParamsBuilder(baseLength, formatHoleCount, logLevelEnabled: logger.EnabledLevel >= LogLevel.Trace); + } + // Storage for the built-up string private bool _logLevelEnabled; - public LoggerParamsBuilder(int baseLength, int formatHoleCount, bool logLevelEnabled) + private TraceLoggerParamsBuilder(int baseLength, int formatHoleCount, bool logLevelEnabled) { // Initialization logic _logLevelEnabled = logLevelEnabled @@ -75,47 +81,30 @@ public ref struct LoggerParamsBuilder public class Logger { // Initialization code omitted - private LogLevel _myLogLevel; + public LogLevel EnabledLevel; - public class LoggerImpl + public void LogTrace(TraceLoggerParamsBuilder builder) { - LogLevel _myLogLevel; - Logger _parent; - internal LoggerImpl(LogLevel myLogLevel, Logger parent) - { - _myLogLevel = myLogLevel; - _parent = parent; - } - public LoggerParamsBuilder GetInterpolatedStringBuilder(int baseLength, int formatHoleCount) - { - return new LoggerParamsBuilder(baseLength, formatHoleCount, logLevelEnabled: _parent._currentLogLevel >= _myLogLevel); - } - - public void Log(LoggerParamsBuilder builder) - { - // Impl of logging - } + // Impl of logging } - - public LoggerImpl Trace { get; } = new Logger(LogLevel.Trace, this); // Would need to be in a constructor to use `this` in real code.
} Logger logger = GetLogger(LogLevel.Info); // Given the above definitions, usage looks like this: -logger.Trace.Log($"{"this"} will never be printed because info is < trace!"); +logger.LogTrace($"{"this"} will never be printed because info is < trace!"); // This is converted to: var receiverTemp = logger.Trace; -var builder = receiverTemp.GetInterpolatedStringBuilder(baseLength: 47, formatHoleCount: 1); +TraceLoggerParamsBuilder.GetInterpolatedStringBuilder(baseLength: 47, formatHoleCount: 1, receiverTemp, out var builder); _ = builder.TryFormat("this") && builder.TryFormat(" will never be printed because info is < trace!"); receiverTemp.Log(builder); ``` -Here, because `logger.Trace` has an instance method called `GetInterpolatedStringBuilder` with the correct parameters, that returns a value of the type that `Log` was -expecting, we say that the interpolated string has an implicit builder conversion to that parameter, and it lowers to the pattern shown above. The specese needed for -this is a bit complicated, and is expanded below. +Here, because `TraceLoggerParamsBuilder` has a static method called `GetInterpolatedStringBuilder` with the correct parameters, including an out param of the type +the `LogTrace` call was expecting, we say that the interpolated string has an implicit builder conversion to that parameter, and it lowers to the pattern shown above. +The specese needed for this is a bit complicated, and is expanded below. #### Builder type applicability @@ -138,7 +127,14 @@ to be converted to an _applicable\_interpolated\_string\_builder\_type_. There a 1. A method argument is converted as part of determining applicable function members (covered below), or 2.
Given an _interpolated\_string\_expression_ `S` being converted to type `T`, the following is true: * `T` is an _applicable\_interpolated\_string\_builder\_type_, and - * `T` has an accessible constructor that takes 2 int parameters with the names `baseLength` and `formatHoleCount`, in that order. + * `T` has an accessible static void-returning method `GetInterpolatedStringBuilder` that takes 2 int parameters and 1 out parameter of type `T`, in that order. + +We want to make `GetInterpolatedStringBuilder` a static method with an `out` parameter for 2 reasons: + +1. By making it a `static` method instead of a constructor, we allow the implementation to pool builders if it so decides to. If we limited the pattern to constructors, +then the implementation would be required to always return new instances. +2. By making the builder an `out` parameter we allow overloading of the `GetInterpolatedStringBuilder` method by builder type, which is useful for scenarios like the logger +above, which could have `TraceLoggerParamsBuilder`/`DebugLoggerParamsBuilder`/`WarningLoggerParamsBuilder`/etc. #### Applicable function member adjustments @@ -148,20 +144,22 @@ as follows (a new sub-bullet is added at the front of each section, in bold): A function member is said to be an ***applicable function member*** with respect to an argument list `A` when all of the following are true: * Each argument in `A` corresponds to a parameter in the function member declaration as described in [Corresponding parameters](expressions.md#corresponding-parameters), and any parameter to which no argument corresponds is an optional parameter. 
* For each argument in `A`, the parameter passing mode of the argument (i.e., value, `ref`, or `out`) is identical to the parameter passing mode of the corresponding parameter, and - * **for an interpolated string argument to a value parameter, the type of the corresponding parameter is an _applicable\_interpolated\_string\_builder\_type_, and overload resolution on the instance receiver of `A` (if `A` is an instance method or extension method invoked in extension form) or the containing type of `A` (if instance overload resolution failed or if `A` is a static method not called as an extension method) with an identifier of `GetInterpolatedStringBuilder` with 2 int parameters of names `baseLength` and `formatHoleCount` succeeds with 1 invocable member, and the return type of that member is _identity\_convertible_ to the type of the corresponding parameter. An interpolated string argument applicable in this way is said to be immediately converted to the corresponding parameter type with an implicit _interpolated\_string\_builder\_conversion_. Or** + * **for an interpolated string argument to a value parameter when `A` is an instance method or static extension method invoked in reduced form, the type of the corresponding parameter is an _applicable\_interpolated\_string\_builder\_type_ `Ai`, and overload resolution on `Ai` with the identifier `GetInterpolatedStringBuilder` and a parameter list of 2 int parameters, the receiver type of `A`, and an out parameter of type `Ai` succeeds with 1 invocable member. An interpolated string argument applicable in this way is said to be immediately converted to the corresponding parameter type with an implicit _interpolated\_string\_builder\_conversion_.
Or,** * for a value parameter or a parameter array, an implicit conversion ([Implicit conversions](conversions.md#implicit-conversions)) exists from the argument to the type of the corresponding parameter, or * for a `ref` or `out` parameter, the type of the argument is identical to the type of the corresponding parameter. After all, a `ref` or `out` parameter is an alias for the argument passed. For a function member that includes a parameter array, if the function member is applicable by the above rules, it is said to be applicable in its ***normal form***. If a function member that includes a parameter array is not applicable in its normal form, the function member may instead be applicable in its ***expanded form***: * The expanded form is constructed by replacing the parameter array in the function member declaration with zero or more value parameters of the element type of the parameter array such that the number of arguments in the argument list `A` matches the total number of parameters. If `A` has fewer arguments than the number of fixed parameters in the function member declaration, the expanded form of the function member cannot be constructed and is thus not applicable. 
* Otherwise, the expanded form is applicable if for each argument in `A` the parameter passing mode of the argument is identical to the parameter passing mode of the corresponding parameter, and - * **for an interpolated string argument to a fixed value parameter or a value parameter created by the expansion, the type of the corresponding parameter is an _applicable\_interpolated\_string\_builder\_type_, and overload resolution on the instance receiver of `A` (if `A` is an instance method or extension method invoked in extension form) or the containing type of `A` (if instance overload resolution failed or if `A` is a static method not called as an extension method) with an identifier of `GetInterpolatedStringBuilder` with 2 int parameters of names `baseLength` and `formatHoleCount` succeeds with 1 invocable member, and the return type of that member is _identity\_convertible_ to the type of the corresponding parameter. An interpolated string argument applicable in this way is said to be immediately converted to the corresponding parameter type with an implicit _interpolated\_string\_builder\_conversion_. Or** + * **for an interpolated string argument to a fixed value parameter or a value parameter created by the expansion when `A` is an instance method or static extension method invoked in reduced form, the type of the corresponding parameter is an _applicable\_interpolated\_string\_builder\_type_ `Ai`, and overload resolution on `Ai` with the identifier `GetInterpolatedStringBuilder` and a parameter list of 2 int parameters, the receiver type of `A`, and an out parameter of type `Ai` succeeds with 1 invocable member. An interpolated string argument applicable in this way is said to be immediately converted to the corresponding parameter type with an implicit _interpolated\_string\_builder\_conversion_. 
Or,** * for a fixed value parameter or a value parameter created by the expansion, an implicit conversion ([Implicit conversions](conversions.md#implicit-conversions)) exists from the type of the argument to the type of the corresponding parameter, or * for a `ref` or `out` parameter, the type of the argument is identical to the type of the corresponding parameter. -Important note: this means that if there are 2 otherwise equivalent overloads, one with a builder type that creates an _interpolated\_string\_builder\_conversion_ without -needing the receiver, and one that creates one by calling a method on the receiver, these overloads will be considered ambiguous. We could potentially make changes to the -better function member algorithm to resolve this if we so choose, but it would require distinguishing "naturally-occuring" conversions from conversions that only occur -because the receiver has an applicable `GetInterpolatedStringBuilder` method. +Important note: this means that if there are 2 otherwise equivalent overloads that differ only in their _applicable\_interpolated\_string\_builder\_type_, these overloads will +be considered ambiguous. We could potentially make changes to the better function member algorithm to resolve this if we so choose, but this scenario is unlikely to occur and isn't a priority +to address. + +Another important note is that, for a single overload, priority will be given to the builder construction method that takes a receiver type over builder construction that does not. This is +because the receiver version is checked for applicability before we look for general conversions, and this ordering is desirable. #### Better conversion from expression adjustments @@ -178,11 +176,11 @@ This does mean that there are some potentially non-obvious overload resolution r ```cs void Log(string s) { ... } -void Log(LoggerParamsBuilder p) { ... } +void Log(TraceLoggerParamsBuilder p) { ...
} Log($""); // Calls Log(string s), because $"" is a constant expression Log($"{"test"}"); // Calls Log(string s), because $"{"test"}" is a constant expression -Log($"{1}"); // Calls Log(LoggerParamsBuilder p), because $"{1}" is not a constant expression +Log($"{1}"); // Calls Log(TraceLoggerParamsBuilder p), because $"{1}" is not a constant expression ``` This is introduced so that things that can simply be emitted as constants do so, and don't incur any overhead, while things that cannot be constant use the builder pattern. @@ -195,9 +193,13 @@ intended for direct use by the C# compiler. This struct would look approximately ```cs public ref struct InterpolatedStringBuilder { + public static void GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, out InterpolatedStringBuilder builder) + => builder = new InterpolatedStringBuilder(baseLength, formatHoleCount); + private char[] _array; internal int _count; - public InterpolatedStringBuilder(int baseLength, int formatHoleCount) + + private InterpolatedStringBuilder(int baseLength, int formatHoleCount) { _array = ArrayPool.Shared.Rent(baseLength /* Or some calculation based on what we see on average for the length of format holes */); _count = 0; @@ -231,17 +233,36 @@ public class String We make a slight change to the rules for the meaning of an [_interpolated\_string\_expression_](https://github.com/dotnet/csharplang/blob/master/spec/expressions.md#interpolated-strings): -If the type of an interpolated string is `System.IFormattable` or `System.FormattableString`, the meaning is a call to `System.Runtime.CompilerServices.FormattableStringFactory.Create`. If the type is `string`, the meaning of the expression is a call to `string.Format`. In both cases **if there exists an overload that takes an instance of an _applicable\_interpolated\_string\_builder\_type_, that overload is used according to the builder pattern. Otherwise**, the argument list of the call consists of a format string literal with placeholders for each interpolation, and an argument for each expression corresponding to the place holders. +If the type of an interpolated string is `System.IFormattable` or `System.FormattableString`, the meaning is a call to `System.Runtime.CompilerServices.FormattableStringFactory.Create`. If the type is `string`, the meaning of the expression is a call to `string.Format`. In both cases **if there exists an overload that takes a single argument and there exists an implicit _interpolated\_string\_builder\_conversion_ from the interpolated string to the parameter type, that overload is used according to the builder pattern.
Otherwise**, the argument list of the call consists of a format string literal with placeholders for each interpolation, and an argument for each expression corresponding to the place holders. +If the type of an interpolated string is `System.IFormattable` or `System.FormattableString`, the meaning is a call to `System.Runtime.CompilerServices.FormattableStringFactory.Create`. If the type is `string`, the meaning of the expression is a call to `string.Format`. In both cases **if there exists an overload that takes a single argument and there exists an _implicit\_string\_builder\_conversion_ from the interpolated string to the parameter type, that overload is used according to the builder pattern. Otherwise**, the argument list of the call consists of a format string literal with placeholders for each interpolation, and an argument for each expression corresponding to the place holders. ### Lowering Both the general pattern and the specific changes for interpolated strings directly converted to `string`s follow the same lowering pattern. The `GetInterpolatedStringBuilder` method is invoked on the receiver (whether that's the temporary method receiver for an _interpolated\_string\_builder\_conversion_ derived from the applicable function member algorithm, or a -standard conversion derived from the target type), and stored into a temp local. `TryFormat` is then repeatedly invoked on that temp, with each part of the interpolated string, in order. -The temp is then evaluated as the result of the expression. +standard conversion derived from the target type), and stored into a temp local. `TryFormat` is then repeatedly invoked on that temp, with each part of the interpolated string, in order, +stopping subsequent calls if a `TryFormat` call returns `false`. The temp is then evaluated as the result of the expression. + +**Open Question** + +This lowering means that subsequent parts of the interpolated string after a false-returning `TryFormat` call don't get evaluated. 
This could potentially be very confusing, particularly +if the format hole is side-effecting. We could instead evaluate all format holes first, then repeatedly call `TryFormat` with the results, stopping if it returns false. This would ensure +that all expressions get evaluated as one might expect, but we call as few methods as we need to. While the partial evaluation might be desirable for some more advanced cases, it is perhaps +non-intuitive for the general case. + +Another alternative, if we want to always evaluate all format holes, is to remove the `TryFormat` version of the API and just do repeated `Format` calls. The builder can track whether it +should just be dropping the argument and immediately returning for this version. ## Other considerations +### Allow `string` types to be convertible to builders as well + +For type author simplicity, we could consider allowing expressions of type `string` to be implicitly-convertible to _applicable\_interpolated\_string\_builder\_types_. As proposed today, +authors will likely need to overload on both that builder type and regular `string` types, so their users don't have to understand the difference. This may be an annoying and non-obvious +overhead, as a `string` expression can be viewed as an interpolation with `expression.Length` prefilled length and 0 holes to be filled. + +This would allow new APIs to only expose a builder, without also having to expose a `string`-accepting overload. However, it won't get around the need for changes to better conversion from +expression, so while it would work it may be unnecessary overhead. + ### Incorporating spans for heap-less strings `ValueStringBuilder` as it exists today has 2 constructors: one that takes a count, and allocates on the heap eagerly, and one that takes a `Span`. 
That `Span` is usually @@ -266,12 +287,6 @@ This was done to support partial formatting scenarios where the user wants to st introduce a bunch of unnecessary branches in standard interpolated string usage. We could consider an addendum where we use just `Format` methods if no `TryFormat` method is present, but it does present questions about what we do if there's a mix of both TryFormat and Format calls. -### Allow `string` types to be convertible to builders as well - -For type author simplicity, we could consider allowing expressions of type `string` to be implicitly-convertible to _applicable\_interpolated\_string\_builder\_types_. As proposed today, -authors will likely need to overload on both that builder type and regular `string` types, so their users don't have to understand the difference. This may be an annoying and non-obvious -overhead, as a `string` expression can be viewed as an interpolation with `expression.Length` prefilled length and 0 holes to be filled. - ## Other use cases ### `TryFormat` on `Span` receivers @@ -288,17 +303,18 @@ public static class MemoryExtensions return builder._success; } - public static SpanInterpolatedStringBuilder GetInterpolatedStringBuilder(this Span span, int baseLength, int formatHoleCount) => - new SpanInterpolatedStringBuilder(span, baseLength + formatHoleCount * AverageFormatHoleLengthConst); } public ref struct SpanInterpolatedStringBuilder { + public static void GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, Span span, out SpanInterpolatedStringBuilder builder) => + builder = new SpanInterpolatedStringBuilder(span, baseLength + formatHoleCount * AverageFormatHoleLengthConst); + private Span _span; internal bool _success; internal int _count; - public SpanInterpolatedStringBuilder(Span span, int baseLength) + private SpanInterpolatedStringBuilder(Span span, int baseLength) { _span = span; _success = baseLength <= span.Length; @@ -328,7 +344,7 @@ bool success = destinationSpan.TryWrite($”{a} = 
{b}”, out int charsWritten); // Maps to var receiverTemp = destinationSpan; -var builder = receiverTemp.GetInterpolatedStringBuilder(baseLength: 3, formatHoleCount: 2); +SpanInterpolatedStringBuilder.GetInterpolatedStringBuilder(baseLength: 3, formatHoleCount: 2, receiverTemp, out var builder); _ = builder.TryFormat(a) && builder.TryFormat(“ = “) && builder.TryFormat(b); bool success = receiverTemp.TryWrite(builder, out int charsWritten); ``` @@ -348,9 +364,9 @@ public ref struct Utf8StringBuilder private Span _bytes; public Utf8StringBuilder(Span bytes) => _bytes = bytes; - public Utf8StringBuilder GetInterpolatedStringBuilder(int baseLength, int formatHoleCount) + public static void GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, Utf8StringBuilder instance, out Utf8StringBuilder builder) { - return this; + builder = instance; } public bool TryFormat(Utf8StringBuilder builder, out int bytesWritten) @@ -372,7 +388,7 @@ bool success = Utf8Formatter.WithSpan(myBytes).TryFormat($"Hello world! {myVar}" // Maps to var receiverTemp = Utf8Formatter.WithSpan(myBytes); -var builder = receiverTemp.GetInterpolatedStringBuilder(baseLength: 13, formatHoleCount: 1); +Utf8StringBuilder.GetInterpolatedStringBuilder(baseLength: 13, formatHoleCount: 1, receiverTemp, out var builder); _ = builder.TryFormat("Hello world! ") && builder.TryFormat(myVar); bool success = receiverTemp.TryFormat(builder, out int bytesWritten); ``` From 40df306a07f9c6d3629993fdb2e05d66e8a2740d Mon Sep 17 00:00:00 2001 From: Fredric Silberberg Date: Fri, 26 Feb 2021 10:47:43 -0800 Subject: [PATCH 07/13] Add alignment component to applicable_interpolated_string_builder_type rules. 
--- proposals/improved-interpolated-strings.md | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/proposals/improved-interpolated-strings.md b/proposals/improved-interpolated-strings.md index c318191ff9..abcf8029c6 100644 --- a/proposals/improved-interpolated-strings.md +++ b/proposals/improved-interpolated-strings.md @@ -111,10 +111,14 @@ The specese needed for this is a bit complicated, and is expanded below. A type is said to be an _applicable\_interpolated\_string\_builder\_type_ if, given an _interpolated\_string\_literal_ `S`, the following is true: * Overload resolution with an identifier of `TryFormat` and a parameter type of `string` succeeds, and contains a single instance method that returns a `bool`. -* For every _regular\_balanced\_text_ component of `S` (`Si`) without an _interpolation\_format_ component, overload resolution with an identifier of `TryFormat` and parameter -of the type of `Si` succeeds, and contains a single instance method that returns a `bool`. -* For every _regular\_balanced\_text_ component of `S` (`Si`) with an _interpolation\_format_ component, overload resolution with an identifier of `TryFormat` and parameter -types of `Si` and `string` succeeds, and contains a single instance method that returns a `bool`. +* For every _regular\_balanced\_text_ component of `S` (`Si`) without an _interpolation\_format_ component or _constant\_expression_ (alignment) component, overload resolution +with an identifier of `TryFormat` and parameter of the type of `Si` succeeds, and contains a single instance method that returns a `bool`. +* For every _regular\_balanced\_text_ component of `S` (`Si`) with an _interpolation\_format_ component and no _constant\_expression_ (alignment) component, overload resolution +with an identifier of `TryFormat` and parameter types of `Si` and `string` (in that order) succeeds, and contains a single instance method that returns a `bool`.
+* For every _regular\_balanced\_text_ component of `S` (`Si`) with a _constant\_expression_ (alignment) component and no _interpolation\_format_ component, overload resolution +with an identifier of `TryFormat` and parameter types of `Si` and `int` (in that order) succeeds, and contains a single instance method that returns a `bool`. +* For every _regular\_balanced\_text_ component of `S` (`Si`) with an _interpolation\_format_ component and a _constant\_expression_ (alignment) component, overload resolution +with an identifier of `TryFormat` and parameter types of `Si`, `int`, and `string` (in that order) succeeds, and contains a single instance method that returns a `bool`. Note that these rules do not permit extension methods for the `TryFormat` calls. We could consider enabling that if we choose, but this is analogous to the enumerator pattern, where we allow `GetEnumerator` to be an extension method, but not `Current` or `MoveNext()`. From 2fbafd9daaa93ca61867963da800f34f9ff76d88 Mon Sep 17 00:00:00 2001 From: Fredric Silberberg Date: Mon, 1 Mar 2021 13:28:55 -0800 Subject: [PATCH 08/13] Fixup examples. 
--- proposals/improved-interpolated-strings.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/proposals/improved-interpolated-strings.md b/proposals/improved-interpolated-strings.md index abcf8029c6..0aa6e8e7ca 100644 --- a/proposals/improved-interpolated-strings.md +++ b/proposals/improved-interpolated-strings.md @@ -96,7 +96,7 @@ Logger logger = GetLogger(LogLevel.Info); logger.LogTrace($"{"this"} will never be printed because info is < trace!"); // This is converted to: -var receiverTemp = logger.Trace; +var receiverTemp = logger; TraceLoggerParamsBuilder.GetInterpolatedStringBuilder(baseLength: 47, formatHoleCount: 1, receiverTemp, out var builder); _ = builder.TryFormat("this") && builder.TryFormat(" will never be printed because info is < trace!"); receiverTemp.Log(builder); @@ -197,8 +197,8 @@ intended for direct use by the C# compiler. This struct would look approximately ```cs public ref struct InterpolatedStringBuilder { - public void GetInterpolatedStringBuilder(int baseLength, int formatHoleCount) - => new InterpolatedStringBuilder(baseLength, formatHoleCount); + public void GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, out InterpolatedStringBuilder builder) + => builder = new InterpolatedStringBuilder(baseLength, formatHoleCount); private char[] _array; internal int _count; @@ -311,8 +311,8 @@ public static class MemoryExtensions public ref struct SpanInterpolatedStringBuilder { - public static void GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, Span span, out SpanInterpolatedStringBuilder builder) => - builder = new SpanInterpolatedStringBuilder(span, baseLength + formatHoleCount * AverageFormatHoleLengthConst); + public static void GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, Span span, out SpanInterpolatedStringBuilder builder) + => builder = new SpanInterpolatedStringBuilder(span, baseLength + formatHoleCount * AverageFormatHoleLengthConst); private Span 
_span; internal bool _success; From b4f8bb235f587289a57fcfbbcc1509daed454ead Mon Sep 17 00:00:00 2001 From: Fredric Silberberg Date: Tue, 2 Mar 2021 11:26:57 -0800 Subject: [PATCH 09/13] Add additional questions, correct conversion names. --- proposals/improved-interpolated-strings.md | 39 ++++++++++++++++++++-- 1 file changed, 36 insertions(+), 3 deletions(-) diff --git a/proposals/improved-interpolated-strings.md b/proposals/improved-interpolated-strings.md index 0aa6e8e7ca..66faba664d 100644 --- a/proposals/improved-interpolated-strings.md +++ b/proposals/improved-interpolated-strings.md @@ -154,7 +154,7 @@ A function member is said to be an ***applicable function member*** with respect For a function member that includes a parameter array, if the function member is applicable by the above rules, it is said to be applicable in its ***normal form***. If a function member that includes a parameter array is not applicable in its normal form, the function member may instead be applicable in its ***expanded form***: * The expanded form is constructed by replacing the parameter array in the function member declaration with zero or more value parameters of the element type of the parameter array such that the number of arguments in the argument list `A` matches the total number of parameters. If `A` has fewer arguments than the number of fixed parameters in the function member declaration, the expanded form of the function member cannot be constructed and is thus not applicable. 
* Otherwise, the expanded form is applicable if for each argument in `A` the parameter passing mode of the argument is identical to the parameter passing mode of the corresponding parameter, and - * **for an interpolated string argument to a fixed value parameter or a value parameter created by the expansion when `A` is an instance method or static extension method invoked in reduced form, the type of the corresponding parameter is an _applicable\_interpolated\_string\_builder\_type_ `Ai`, and overload resolution on `Ai` with the identifier `GetInterpolatedStringBuilder` and a parameter list of 2 int parameters, the receiver type of `A`, and an out parameter of type `Ai` succeeds with 1 invocable member. An interpolated string argument applicable in this way is said to be immediately converted to the corresponding parameter type with an implicit _interpolated\_string\_builder\_conversion_. Or,** + * **for an interpolated string argument to a fixed value parameter or a value parameter created by the expansion when `A` is an instance method or static extension method invoked in reduced form, the type of the corresponding parameter is an _applicable\_interpolated\_string\_builder\_type_ `Ai`, and overload resolution on `Ai` with the identifier `GetInterpolatedStringBuilder` and a parameter list of 2 int parameters, the receiver type of `A`, and an out parameter of type `Ai` succeeds with 1 invocable member. An interpolated string argument applicable in this way is said to be immediately converted to the corresponding parameter type with an _implicit\_string\_builder\_conversion_. Or,** * for a fixed value parameter or a value parameter created by the expansion, an implicit conversion ([Implicit conversions](conversions.md#implicit-conversions)) exists from the type of the argument to the type of the corresponding parameter, or * for a `ref` or `out` parameter, the type of the argument is identical to the type of the corresponding parameter. 
@@ -171,7 +171,7 @@ We change the [better conversion from expression](https://github.com/dotnet/csha following: Given an implicit conversion `C1` that converts from an expression `E` to a type `T1`, and an implicit conversion `C2` that converts from an expression `E` to a type `T2`, `C1` is a ***better conversion*** than `C2` if: -1. `E` is a non-constant _interpolated\_string\_expression_, `C1` is an _interpolated\_string\_builder\_conversion_, `T1` is an _applicable\_interpolated\_string\_builder\_type_, and `C2` is not an _interpolated\_string\_builder\_conversion_, or +1. `E` is a non-constant _interpolated\_string\_expression_, `C1` is an _implicit\_string\_builder\_conversion_, `T1` is an _applicable\_interpolated\_string\_builder\_type_, and `C2` is not an _implicit\_string\_builder\_conversion_, or 2. `E` does not exactly match `T2` and at least one of the following holds: * `E` exactly matches `T1` ([Exactly matching Expression](expressions.md#exactly-matching-expression)) * `T1` is a better conversion target than `T2` ([Better conversion target](expressions.md#better-conversion-target)) @@ -242,7 +242,7 @@ If the type of an interpolated string is `System.IFormattable` or `System.Format ### Lowering Both the general pattern and the specific changes for interpolated strings directly converted to `string`s follow the same lowering pattern. The `GetInterpolatedStringBuilder` method is -invoked on the receiver (whether that's the temporary method receiver for an _interpolated\_string\_builder\_conversion_ derived from the applicable function member algorithm, or a +invoked on the receiver (whether that's the temporary method receiver for an _implicit\_string\_builder\_conversion_ derived from the applicable function member algorithm, or a standard conversion derived from the target type), and stored into a temp local. 
`TryFormat` is then repeatedly invoked on that temp, with each part of the interpolated string, in order, stopping subsequent calls if a `TryFormat` call returns `false`. The temp is then evaluated as the result of the expression. @@ -291,6 +291,39 @@ This was done to support partial formatting scenarios where the user wants to st introduce a bunch of unnecessary branches in standard interpolated string usage. We could consider an addendum where we use just `Format` methods if no `TryFormat` method is present, but it does present questions about what we do if there's a mix of both TryFormat and Format calls. +### Passing previous arguments to the builder + +There is an unfortunate lack of symmetry in the proposal as it currently exists: invoking an extension method in reduced form produces different semantics than invoking the extension method in +normal form. This is different from most other locations in the language, where reduced form is just sugar. We have a few potential options for resolving this: + +* Special-case extension methods called in normal form. This feels pretty icky: why are extensions special here? +* Allow other previous parameters to be passed to the builder. This gets complicated quickly: how do we determine what to pass to the builder? What if the builder has a `GetInterpolatedString` +method that accepts the first parameter, but not the receiver, of an instance method? +* Pass parameters to the builder marked with a specific attribute, a la `EnumeratorCancellation` support. This would need rules about whether we pass the receiver (maybe if the method is marked +we pass the receiver, and we don't in the general case?), and what we do if parameters _after_ the string parameter are annotated, but it seems like a potential option. + +Some compromise is likely needed here, but every direction has complications.
Some scenarios that would be affected by this are the `Utf8Formatter` below, or existing API patterns that have +an `IFormatProvider` as the first argument. + +### `await` usage in interpolation holes + +Because `$"{await A()}"` is a valid expression today, we need to rationalize how interpolation holes interact with `await`. We could solve this with a few rules: + +1. If an interpolated string used as a `string`, `IFormattable`, or `FormattableString` has an `await` in an interpolation hole, fall back to the old-style formatter. +2. If an interpolated string is subject to an _implicit\_string\_builder\_conversion_ and the _applicable\_interpolated\_string\_builder\_type_ is a `ref struct`, `await` is not allowed to be used +in the format holes. + +Fundamentally, this desugaring could use a ref struct in an async method as long as we guarantee that the `ref struct` will not need to be saved to the heap, which should be possible if we forbid +`await`s in the interpolation holes. + +Alternatively, we could simply make all builder types non-ref structs, including the framework builder for interpolated strings. This would, however, preclude us from someday recognizing a `Span` +version that does not need to allocate any scratch space at all. + +### Builders as ref parameters + +Some builders might want to be passed as ref parameters (either `in` or `ref`). Should we allow either? And if so, what will a `ref` builder look like? `ref $""` is confusing, as you're not actually +passing the string by ref, you're passing the builder that is created from the ref by ref, and has similar potential issues with async methods. ## Other use cases ### `TryFormat` on `Span` receivers From fcf7daf1b57b61fcc4ba279f66ad226ea3ad7f34 Mon Sep 17 00:00:00 2001 From: Fred Silberberg Date: Tue, 2 Mar 2021 12:03:42 -0800 Subject: [PATCH 10/13] Fix span interpolated string example.
Co-authored-by: Stephen Toub --- proposals/improved-interpolated-strings.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/improved-interpolated-strings.md b/proposals/improved-interpolated-strings.md index 66faba664d..c5027e8edb 100644 --- a/proposals/improved-interpolated-strings.md +++ b/proposals/improved-interpolated-strings.md @@ -345,7 +345,7 @@ public static class MemoryExtensions public ref struct SpanInterpolatedStringBuilder { public static void GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, Span span, out SpanInterpolatedStringBuilder builder) - => builder = new SpanInterpolatedStringBuilder(span, baseLength + formatHoleCount * AverageFormatHoleLengthConst); + => builder = new SpanInterpolatedStringBuilder(span, baseLength); private Span _span; internal bool _success; From 7dd7ecc4be9f97b81787ed82fb538ba129d5c1c5 Mon Sep 17 00:00:00 2001 From: Fred Silberberg Date: Tue, 2 Mar 2021 15:10:55 -0800 Subject: [PATCH 11/13] Fix log call Co-authored-by: Yaakov --- proposals/improved-interpolated-strings.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/improved-interpolated-strings.md b/proposals/improved-interpolated-strings.md index c5027e8edb..4f1fa3b9f4 100644 --- a/proposals/improved-interpolated-strings.md +++ b/proposals/improved-interpolated-strings.md @@ -99,7 +99,7 @@ logger.LogTrace($"{"this"} will never be printed because info is < trace!"); var receiverTemp = logger; TraceLoggerParamsBuilder.GetInterpolatedStringBuilder(baseLength: 47, formatHoleCount: 1, receiverTemp, out var builder); _ = builder.TryFormat("this") && builder.TryFormat(" will never be printed because info is < trace!"); -receiverTemp.Log(builder); +receiverTemp.LogTrace(builder); ``` Here, because `TraceLoggerParamsBuilder` has static method called `GetInterpolatedStringBuilder` with the correct parameters, including an out param that is the type From 6e57643eba9cf80678183dd66380efba5df8cb88 Mon Sep 17 
00:00:00 2001 From: Fredric Silberberg Date: Wed, 24 Mar 2021 17:11:28 -0700 Subject: [PATCH 12/13] Update spec based on LDM feedback. --- proposals/improved-interpolated-strings.md | 77 +++++++++++++++------- 1 file changed, 54 insertions(+), 23 deletions(-) diff --git a/proposals/improved-interpolated-strings.md b/proposals/improved-interpolated-strings.md index 4f1fa3b9f4..2b64bc9216 100644 --- a/proposals/improved-interpolated-strings.md +++ b/proposals/improved-interpolated-strings.md @@ -44,9 +44,16 @@ picturing this: // The builder that will actually "build" the interpolated string" public ref struct TraceLoggerParamsBuilder { - public static void GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, Logger logger, out TraceLoggerParamsBuilder builder) + public static bool GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, Logger logger, out TraceLoggerParamsBuilder builder) { + if (!logger._logLevelEnabled) + { + builder = default; + return false; + } + builder = TraceLoggerParamsBuilder(baseLength, formatHoleCount, logger.EnabledLevel); + return true; } // Storage for the built-up string @@ -61,16 +68,12 @@ public ref struct TraceLoggerParamsBuilder public bool TryFormat(string s) { - if (!_logLevelEnabled) return false; - // Store and format part as required return true; } public bool TryFormat(T t) { - if (!_logLevelEnabled) return false; - // Store and format part as required return true; } @@ -85,7 +88,6 @@ public class Logger public void LogTrace(TraceLoggerParamsBuilder builder) { - // Impl of logging } } @@ -93,12 +95,14 @@ public class Logger Logger logger = GetLogger(LogLevel.Info); // Given the above definitions, usage looks like this: -logger.LogTrace($"{"this"} will never be printed because info is < trace!"); +var name = "Fred Silberberg"; +logger.LogTrace($"{name} will never be printed because info is < trace!"); // This is converted to: var receiverTemp = logger; 
-TraceLoggerParamsBuilder.GetInterpolatedStringBuilder(baseLength: 47, formatHoleCount: 1, receiverTemp, out var builder); -_ = builder.TryFormat("this") && builder.TryFormat(" will never be printed because info is < trace!"); +_ = TraceLoggerParamsBuilder.GetInterpolatedStringBuilder(baseLength: 47, formatHoleCount: 1, receiverTemp, out var builder) && + builder.TryFormat("Fred Silberberg") && + builder.TryFormat(" will never be printed because info is < trace!"); receiverTemp.LogTrace(builder); ``` @@ -123,6 +127,9 @@ with an identifier of `TryFormat` and parameter types of `Si`, `int`, and `strin Note that these rules do not permit extension methods for the `TryFormat` calls. We could consider enabling that if we choose, but this is analogous to the enumerator pattern, where we allow `GetEnumerator` to be an extension method, but not `Current` or `MoveNext()`. +These rules _do_ permit default parameters for the `TryFormat` calls, which will work with things like `CallerLineNumber` or `CallerArgumentExpression` (when supported by +the language). + #### Interpolated string builder conversion We add a new implicit conversion type: The _implicit\_string\_builder\_conversion_. An _implicit\_string\_builder\_conversion_ permits an _interpolated\_string\_expression_ @@ -131,14 +138,14 @@ to be converted to an _applicable\_interpolated\_string\_builder\_type_. There a 1. A method argument is converted as part of determining applicable function members (covered below), or 2. Given an _interpolated\_string\_expression_ `S` being converted to type `T`, the following is true: * `T` is an _applicable\_interpolated\_string\_builder\_type_, and - * `T` has an accessible static void-returning method `GetInterpolatedStringBuilder` that takes 2 int parameters and 1 out parameter of type `T`, in that order. + * `T` has an accessible static bool-returning method `GetInterpolatedStringBuilder` that takes 2 int parameters and 1 out parameter of type `T`, in that order. 
We want to make `GetInterpolatedStringBuilder` a static method with an `out` parameter for 2 reasons: 1. By making it a `static` method instead of a constructor, we allow the implementation to pool builders if it so decides to. If we limited the pattern to constructors, then the implementation would be required to always return new instances. -2. By making the builder an `out` parameter we allow overloading of the `GetInterpolatedStringBuilder` method by builder type, which is useful for scenarios like the logger -above, which could have `TraceLoggerParamsBuilder`/`DebugLoggerParamsBuilder`/`WarningLoggerParamsBuilder`/etc. +2. By making the builder an `out` parameter we allow the `GetInterpolatedStringBuilder` method to return a bool indicating whether to continue formatting, which is useful +for scenarios like the logger above that may want to skip any argument evaluation at all for cases when the log level isn't enabled. #### Applicable function member adjustments @@ -197,8 +204,11 @@ intended for direct use by the C# compiler. This struct would look approximately ```cs public ref struct InterpolatedStringBuilder { - public void GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, out InterpolatedStringBuilder builder) - => builder = new InterpolatedStringBuilder(baseLength, formatHoleCount); + public bool GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, out InterpolatedStringBuilder builder) + { + builder = new InterpolatedStringBuilder(baseLength, formatHoleCount); + return true; + } private char[] _array; internal int _count; @@ -243,8 +253,8 @@ If the type of an interpolated string is `System.IFormattable` or `System.Format Both the general pattern and the specific changes for interpolated strings directly converted to `string`s follow the same lowering pattern. 
The `GetInterpolatedStringBuilder` method is invoked on the receiver (whether that's the temporary method receiver for an _implicit\_string\_builder\_conversion_ derived from the applicable function member algorithm, or a -standard conversion derived from the target type), and stored into a temp local. `TryFormat` is then repeatedly invoked on that temp, with each part of the interpolated string, in order, -stopping subsequent calls if a `TryFormat` call returns `false`. The temp is then evaluated as the result of the expression. +standard conversion derived from the target type). If the call returned `true`, `TryFormat` is repeatedly invoked on the builder out parameter, with each part of the interpolated string, +in order, stopping subsequent calls if a `TryFormat` call returns `false`. Finally, the original method is called, passing the initialized builder in place of the interpolated string expression. **Open Question** @@ -256,6 +266,8 @@ non-intuitive for the general case. Another alternative, if we want to always evaluate all format holes, is to remove the `TryFormat` version of the API and just do repeated `Format` calls. The builder can track whether it should just be dropping the argument and immediately returning for this version. +_Answer_: We will have conditional evaluation of the holes. 
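Editorial sketch (not part of the patch): the conditional evaluation chosen in the answer above can be illustrated with a hand-written version of the lowering. `DemoBuilder`, `Demo`, `Next`, and the `maxLength` argument standing in for the receiver are all hypothetical names invented for this illustration; the real lowering would be emitted by the compiler, and the proposed framework builder is a `ref struct` rather than the plain struct assumed here.

```cs
using System;
using System.Text;

// Hypothetical builder for illustration only; not the proposed framework type.
public struct DemoBuilder
{
    private StringBuilder _text;
    private int _maxLength;

    // Follows the bool-returning static pattern: returning false skips every TryFormat call.
    public static bool GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, int maxLength, out DemoBuilder builder)
    {
        if (baseLength > maxLength)
        {
            builder = default;
            return false;
        }
        builder = new DemoBuilder { _text = new StringBuilder(baseLength), _maxLength = maxLength };
        return true;
    }

    // Returns false (and appends nothing) once the length limit would be exceeded.
    public bool TryFormat(string s)
    {
        if (_text.Length + s.Length > _maxLength) return false;
        _text.Append(s);
        return true;
    }

    public bool TryFormat<T>(T value) => TryFormat(value == null ? "" : value.ToString());

    public override string ToString() => _text == null ? "" : _text.ToString();
}

public static class Demo
{
    public static int Evaluations;

    // A side-effecting hole expression, so skipped evaluations are observable.
    private static int Next() => ++Evaluations;

    // Hand-written equivalent of lowering $"{Next()} and {Next()}" under the rules above.
    public static string Format(int maxLength)
    {
        _ = DemoBuilder.GetInterpolatedStringBuilder(baseLength: 5, formatHoleCount: 2, maxLength, out var builder)
            && builder.TryFormat(Next())
            && builder.TryFormat(" and ")
            && builder.TryFormat(Next());
        return builder.ToString();
    }

    public static void Main()
    {
        Evaluations = 0;
        Console.WriteLine(Format(100) + " / " + Evaluations); // prints "1 and 2 / 2"
        Evaluations = 0;
        Console.WriteLine(Format(5) + " / " + Evaluations);   // prints "1 / 1": second hole never evaluated
        Evaluations = 0;
        Console.WriteLine(Format(0) + " / " + Evaluations);   // prints " / 0": no hole evaluated
    }
}
```

Because the calls are chained with `&&`, a hole expression is evaluated only if every earlier call returned `true`, which is exactly the behavior the question above worries could surprise users when holes have side effects.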
+ ## Other considerations ### Allow `string` types to be convertible to builders as well @@ -344,8 +356,16 @@ public static class MemoryExtensions public ref struct SpanInterpolatedStringBuilder { - public static void GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, Span span, out SpanInterpolatedStringBuilder builder) - => builder = new SpanInterpolatedStringBuilder(span, baseLength); + public static bool GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, Span span, out SpanInterpolatedStringBuilder builder) + { + if (baseLength > span.Length) + { + builder = default; + return false; + } + builder = new SpanInterpolatedStringBuilder(span, baseLength); + return true; + } private Span _span; internal bool _success; @@ -381,8 +401,11 @@ bool success = destinationSpan.TryWrite($”{a} = {b}”, out int charsWritten); // Maps to var receiverTemp = destinationSpan; -SpanInterpolatedStringBuilder.GetInterpolatedStringBuilder(baseLength: 3, formatHoleCount: 2, receiverTemp, out var builder); -_ = builder.TryFormat(a) && builder.TryFormat(“ = “) && builder.TryFormat(b); + +_ = SpanInterpolatedStringBuilder.GetInterpolatedStringBuilder(baseLength: 3, formatHoleCount: 2, receiverTemp, out var builder) && + builder.TryFormat(a) && + builder.TryFormat(“ = “) && + builder.TryFormat(b); bool success = receiverTemp.TryWrite(builder, out int charsWritten); ``` @@ -401,9 +424,16 @@ public ref struct Utf8StringBuilder private Span _bytes; public Utf8StringBuilder(Span bytes) => _bytes = bytes; - public static void GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, Utf8StringBuilder instance, out Utf8StringBuilder builder) + public static bool GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, Utf8StringBuilder instance, out Utf8StringBuilder builder) { + if (baseLength > instance._bytes.Length) + { + builder = default; + return false; + } + builder = instance; + return true; } public bool TryFormat(Utf8StringBuilder builder, 
out int bytesWritten) @@ -425,8 +455,9 @@ bool success = Utf8Formatter.WithSpan(myBytes).TryFormat($"Hello world! {myVar}" // Maps to var receiverTemp = Utf8Formatter.WithSpan(myBytes); -Utf8StringBuilder.GetInterpolatedStringBuilder(baseLength: 13, formatHoleCount: 1, receiverTemp, out var builder); -_ = builder.TryFormat("Hello world! ") && builder.TryFormat(myVar); +_ = Utf8StringBuilder.GetInterpolatedStringBuilder(baseLength: 13, formatHoleCount: 1, receiverTemp, out var builder) && + builder.TryFormat("Hello world! ") && + builder.TryFormat(myVar); bool success = receiverTemp.TryFormat(builder, out int bytesWritten); ``` From 2550a43b391e844faaa0a2023f66489328a41612 Mon Sep 17 00:00:00 2001 From: Fredric Silberberg Date: Wed, 24 Mar 2021 17:17:10 -0700 Subject: [PATCH 13/13] Add a couple of open questions. --- proposals/improved-interpolated-strings.md | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/proposals/improved-interpolated-strings.md b/proposals/improved-interpolated-strings.md index 2b64bc9216..02008b11af 100644 --- a/proposals/improved-interpolated-strings.md +++ b/proposals/improved-interpolated-strings.md @@ -204,7 +204,7 @@ intended for direct use by the C# compiler. 
This struct would look approximately ```cs public ref struct InterpolatedStringBuilder { - public bool GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, out InterpolatedStringBuilder builder) + public static bool GetInterpolatedStringBuilder(int baseLength, int formatHoleCount, out InterpolatedStringBuilder builder) { builder = new InterpolatedStringBuilder(baseLength, formatHoleCount); return true; @@ -228,7 +228,7 @@ public ref struct InterpolatedStringBuilder public bool TryFormat(ReadOnlySpan s) { if (s.Length >= _array.Length - _count) Grow(); - s.AsSpan().CopyTo(_array); + s.CopyTo(_array); _count += s.Length; return true; } @@ -249,6 +249,15 @@ We make a slight change to the rules for the meaning of an [_interpolated\_strin If the type of an interpolated string is `System.IFormattable` or `System.FormattableString`, the meaning is a call to `System.Runtime.CompilerServices.FormattableStringFactory.Create`. If the type is `string`, the meaning of the expression is a call to `string.Format`. In both cases **if there exists an overload that takes a single argument and there exists an _implicit\_string\_builder\_conversion_ from the interpolated string to the parameter type, that overload is used according to the builder pattern. Otherwise**, the argument list of the call consists of a format string literal with placeholders for each interpolation, and an argument for each expression corresponding to the place holders. +**Open Question**: + +Do we want to instead just make the compiler know about `InterpolatedStringBuilder` and skip the `string.Format` call entirely? It would allow us to hide a method that we don't necessarily +want to put in people's faces when they manually call `string.Format`. + +**Open Question**: + +Do we want to have builders for `System.IFormattable` and `System.FormattableString` as well? 
+ ### Lowering Both the general pattern and the specific changes for interpolated strings directly converted to `string`s follow the same lowering pattern. The `GetInterpolatedStringBuilder` method is @@ -256,7 +265,7 @@ invoked on the receiver (whether that's the temporary method receiver for an _im standard conversion derived from the target type). If the call returned `true`, `TryFormat` is repeatedly invoked on the builder out parameter, with each part of the interpolated string, in order, stopping subsequent calls if a `TryFormat` call returns `false`. Finally, the original method is called, passing the initialized builder in place of the interpolated string expression. -**Open Question** +**~~Open~~ Question** This lowering means that subsequent parts of the interpolated string after a false-returning `TryFormat` call don't get evaluated. This could potentially be very confusing, particularly if the format hole is side-effecting. We could instead evaluate all format holes first, then repeatedly call `TryFormat` with the results, stopping if it returns false. This would ensure