Type ascription #803

Merged: 7 commits, Mar 16, 2015
Changes from 2 commits
175 changes: 175 additions & 0 deletions text/0000-type-ascription.md
@@ -0,0 +1,175 @@
- Start Date: 2015-2-3
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary

Add type ascription to expressions and patterns.

Type ascription on expressions has already been implemented. Type ascription on
patterns can probably wait until post-1.0.

See also discussion on [#354](https://github.com/rust-lang/rfcs/issues/354) and
[rust issue 10502](https://github.com/rust-lang/rust/issues/10502).


# Motivation

Type inference is imperfect. It is often useful to help type inference by
annotating a sub-expression or sub-pattern with a type. Currently, this is only
possible by extracting the sub-expression into a variable using a `let`
statement and/or giving a type for a whole expression or pattern. This is
unergonomic, and sometimes impossible due to lifetime issues. Specifically, a
variable has the lifetime of its enclosing scope, but a sub-expression's
lifetime is typically limited to the nearest semicolon.
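The scope difference described above can be observed in stable Rust today. The following is a minimal sketch, not part of the RFC: the function names `with_temporary` and `with_named_binding` are hypothetical, and `RefCell` is used only because its borrow guard makes the lifetime of a value visible at runtime.

```rust
use std::cell::RefCell;

/// The `Ref` guard returned by `borrow()` is a temporary here, so it is
/// dropped at the end of the `let` statement (the nearest semicolon).
fn with_temporary(cell: &RefCell<Vec<u32>>) -> bool {
    let _len = cell.borrow().len();
    // No guard is alive any more, so a mutable borrow succeeds.
    let ok = cell.try_borrow_mut().is_ok();
    ok
}

/// Extracting the sub-expression into a variable keeps the guard alive
/// until the end of the enclosing scope.
fn with_named_binding(cell: &RefCell<Vec<u32>>) -> bool {
    let guard = cell.borrow();
    let _len = guard.len();
    // `guard` is still alive, so a mutable borrow fails.
    let ok = cell.try_borrow_mut().is_ok();
    ok
}

fn main() {
    let cell = RefCell::new(vec![1u32, 2, 3]);
    assert!(with_temporary(&cell));
    assert!(!with_named_binding(&cell));
}
```

Introducing a `let` purely to attach a type therefore also changes how long the value lives, which is the ergonomic cost the paragraph above refers to.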

Typical use cases are where a function's return type is generic (e.g., `collect`)
and where we want to force a coercion.

Type ascription can also be used for documentation and debugging: where it is
unclear from the code which type will be inferred, type ascription can be used
to precisely communicate expectations to the compiler or other programmers.

By allowing type ascription in more places, we remove the inconsistency that
type ascription is currently only allowed on top-level patterns.

## Examples:

Generic return type:

```
// Current.
let z = if ... {
    let x: Vec<_> = foo.enumerate().collect();
    x
} else {
    ...
};

// With type ascription.
let z = if ... {
    foo.enumerate().collect(): Vec<_>
} else {
    ...
};
```

Review comments on the line `let x: Vec<_> = foo.enumerate().collect();`:

This is also a current form? `foo.enumerate().collect::<Vec<_>>()`

Perhaps an example where multiple parameters are required could highlight the reduction in verbosity?

Is there any reason one would not just add the type annotation to `z`? Using a temporary in this example seems a bit contrived, maybe there is a better one?
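For comparison, the two forms that already work in stable Rust (an annotated `let` and the `::<>` syntax mentioned in the review comment) can be sketched as follows. This is an illustration only: the function names and the sample data are hypothetical and do not come from the RFC.

```rust
use std::collections::HashMap;

/// Current form 1: extract the sub-expression into an annotated `let`.
fn count_via_let(words: &[&str]) -> usize {
    let x: Vec<_> = words.iter().enumerate().collect();
    x.len()
}

/// Current form 2: name the type on the generic method itself.
fn count_via_turbofish(words: &[&str]) -> usize {
    words.iter().enumerate().collect::<Vec<_>>().len()
}

/// A collection with more than one type parameter; both are left to inference.
fn distinct_lengths(words: &[&str]) -> usize {
    words
        .iter()
        .map(|w| (w.len(), *w))
        .collect::<HashMap<_, _>>()
        .len()
}

fn main() {
    let words = ["a", "bb", "cc"];
    assert_eq!(count_via_let(&words), 3);
    assert_eq!(count_via_turbofish(&words), 3);
    // "bb" and "cc" share a length, so only two keys remain.
    assert_eq!(distinct_lengths(&words), 2);
}
```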

Coercion:

```
fn foo<T>(a: T, b: T) { ... }

// Current.
let x = [1u32, 2, 4];
let y = [3u32];
...
let x: &[_] = &x;
let y: &[_] = &y;
foo(x, y);

// With type ascription.
let x = [1u32, 2, 4];
let y = [3u32];
...
foo(x: &[_], y: &[_]);
```

Review comments on the line `foo(x: &[_], y: &[_]);`:

Doesn't this have to be `foo(&x: &[_], &y: &[_]);`? `&[T; N]` coerces to `&[T]`, but `[T; N]` doesn't. And at that point, don't we currently do this coercion implicitly, without requiring type ascription or let bindings? `foo(&x, &y)` works just fine.

Member

Except `foo` is generic and so the coercion wouldn't be triggered (as there is no `&[_]` expected type coming from the function itself).
And the use of "implicit coercions" could be considered a pleonasm - coercions in Rust are implicit conversions between types.
AFAIK, `let` statements with explicit types can't cause anything more than what some implicit type (coming from the arguments of a function you're calling for example) would be able to.
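The point made in the review thread, that a generic parameter supplies no `&[_]` expected type, can be checked in stable Rust. The sketch below is illustrative only: `pair` is a hypothetical stand-in for the RFC's `foo`, and the helper names are assumptions.

```rust
/// Hypothetical stand-in for the RFC's `foo<T>(a: T, b: T)`.
fn pair<T>(a: T, b: T) -> (T, T) {
    (a, b)
}

/// Current form: annotated `let` bindings force the array-to-slice coercion.
fn lens_via_let() -> (usize, usize) {
    let x = [1u32, 2, 4];
    let y = [3u32];
    // `pair(&x, &y)` is rejected: `T` is inferred as `&[u32; 3]` from the
    // first argument, and `&[u32; 1]` does not match it.
    let xs: &[_] = &x;
    let ys: &[_] = &y;
    let (a, b) = pair(xs, ys);
    (a.len(), b.len())
}

/// Naming `T` explicitly also provides an expected type, so the coercion
/// applies at each argument.
fn lens_via_explicit_param() -> (usize, usize) {
    let x = [1u32, 2, 4];
    let y = [3u32];
    let (a, b) = pair::<&[u32]>(&x, &y);
    (a.len(), b.len())
}

fn main() {
    assert_eq!(lens_via_let(), (3, 1));
    assert_eq!(lens_via_explicit_param(), (3, 1));
}
```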

In patterns:

```
struct Foo<T> { a: T, b: String }

// Current
fn foo(Foo { a, .. }: Foo<i32>) { ... }

// With type ascription.
fn foo(Foo { a: i32, .. }) { ... }
```

Member

What would we get inside `foo`:

`fn foo(Foo { a: Bar, .. }: Foo<Bar>) { ... }`

- A variable named `a`, or
- A variable named `Bar` (the current behavior)?

Member Author

Good question. I don't think there is a good answer. I think the least worst is to assume that users will prefer the common pattern of using the same name for both the name of the field and the new variable (`fn foo(Foo { a, .. }: ...` in the current syntax), then assume that a single `:` in a pattern always denotes a type, i.e., we assume `Bar` is always a type (this is backwards incompatible, as you allude to). If a user wants to rename the variable, then they'd have to use `a : b : _`, which is bad. I think the alternative is theoretically nicer (type ascription should be the more optional part) but less practical, since it is common to reuse the field name.

Contributor

I would say that it’s probably more common to want to rename the variable than to ascribe the type. After all, we don’t ascribe the type of any struct field patterns today (because we can’t), and that isn’t causing any major problems. The `Foo { a, .. }` notation is really just shorthand/syntactic sugar for `Foo { a: a, .. }`, so I feel it should have a lower priority than other more fundamental parts of the syntax. `Foo { a: b: _, .. }` (explicitly renaming) looks pretty bad and is not an obvious way of resolving the ambiguity, while `Foo { a: a: Type }` (explicitly type-ascribing) looks OK and is fairly obvious given that the shorthand is just optional sugar.

I think that `Foo { a: b, .. }` not working would be too surprising to be worth it, and the backwards-incompatibility also just makes matters worse. (Even better in my opinion would be to change struct initialisers to stop overloading `:`, but that’s already been discussed (in this RFC and elsewhere).)

Member Author

Yeah, on second thoughts, this makes more sense. Avoiding the backwards incompatibility/code churn is especially desirable. I think I was over-estimating how often type ascription would be used.

Contributor

While it is more common to want renaming than type ascription, I think `Foo {x: x: Foo, y: y: Bar}` is still a bit strange. Would `Foo {.x: Foo, .y: Bar}` look better as a sugar? (If we don't change the value binding sigil to `=>`.)
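The behaviour the thread calls "the current behavior" can be confirmed in stable Rust: inside a struct pattern, `field: name` renames the binding and does not ascribe a type. A minimal sketch follows; the function names and the binding name `value` are hypothetical, while `Foo` mirrors the struct from the RFC example.

```rust
// Mirrors the RFC example's struct; `b` is never read in this sketch.
#[allow(dead_code)]
struct Foo<T> {
    a: T,
    b: String,
}

/// Shorthand: binds a variable named `a`.
fn take_shorthand(Foo { a, .. }: Foo<i32>) -> i32 {
    a
}

/// `a: value` binds field `a` to a new variable named `value`.
/// Today the name after `:` is a binding, never a type.
fn take_renamed(Foo { a: value, .. }: Foo<i32>) -> i32 {
    value
}

fn main() {
    let first = Foo { a: 7, b: String::from("x") };
    let second = Foo { a: 9, b: String::from("y") };
    assert_eq!(take_shorthand(first), 7);
    assert_eq!(take_renamed(second), 9);
}
```

This is why treating a single `:` in a struct pattern as a type would be backwards incompatible, as discussed above.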



# Detailed design

The syntax of expressions is extended with type ascription:

```
e ::= ... | e: T
```

where `e` is an expression and `T` is a type. Type ascription has the same
precedence as explicit coercions using `as`.

When type checking `e: T`, `e` must have type `T`. The `must have type` test
includes implicit coercions and subtyping, but not explicit coercions. `T` may
be any well-formed type.

At runtime, type ascription is a no-op, unless an implicit coercion was used in
type checking, in which case the dynamic semantics of a type ascription
expression are exactly those of the implicit coercion.
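The distinction between implicit coercions (which the `must have type` test accepts) and explicit coercions (which it does not) can be illustrated with stable Rust, using an annotated `let` in place of the proposed `e: T` syntax. This is a sketch under that assumption; the function names are hypothetical.

```rust
use std::fmt::Display;

/// Implicit coercions: an annotated `let` accepts the same conversions the
/// RFC allows for `e: T`.
fn implicit_coercions() -> (usize, String) {
    let array = [1u32, 2, 3];
    let slice: &[u32] = &array; // array reference to slice
    let shown: &dyn Display = &42u8; // concrete reference to trait object
    (slice.len(), shown.to_string())
}

/// Explicit coercion: a narrowing cast needs `as`. Writing
/// `let n: u8 = wide;` would be rejected by the type checker.
fn explicit_narrowing(wide: u32) -> u8 {
    wide as u8
}

fn main() {
    assert_eq!(implicit_coercions(), (3, String::from("42")));
    // 300 does not fit in a u8; `as` truncates to 300 mod 256.
    assert_eq!(explicit_narrowing(300), 44);
}
```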

The syntax of sub-patterns is extended to include an optional type ascription.
Old syntax:

```
P ::= SP: T | SP
SP ::= var | 'box' SP | ...
```

Contributor

I think `P` vs. `SP` could be clarified here. `pat: type` is not a valid pattern (e.g., in `match`) today, and the only thing that resembles that syntax is `let`’s syntax. Is `P` supposed to represent what goes in between the `let` and `=` in `let <P> = <value>;`? If so, it should probably be clarified that `P` is not just a normal pattern.


where `P` is a pattern, `SP` is a sub-pattern, `T` is a type, and `var` is a
variable name.

New syntax:

```
P ::= SP: T | SP
SP ::= var | 'box' P | ...
```

Type ascription in patterns binds most tightly, e.g., `box x: T` means
`box (x: T)`.

In type checking, if an expression is matched against a pattern, when matching
a sub-pattern the matching sub-expression must have the ascribed type (again,
this check includes subtyping and implicit coercion). Types in patterns play no
role at runtime.

Contributor

How does type ascription in patterns interact with `let` and function declarations? Presumably, `let` would now just be followed by a pattern (rather than a pattern optionally followed by a colon then a type), and the same for function parameters. Function parameters are required to have their types fully specified, but type ascription complicates this in that `fn foo((x: i32, y: i64))` should be valid as well as the current `fn foo((x, y): (i32, i64))`. Presumably the rule simply needs to be adjusted to something like “a pattern in a set of function parameters must have its type determinable without using type inference”. That rule would allow some interesting cases that don’t even need type ascription like `fn foo(NonGenericStruct(a, b), (), SingleVariantEnum)`, which may or may not be desirable.

Member Author

I believe the last example is desirable. I'll update the RFC to clarify this and your previous point.
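For reference, the current form mentioned in the comment above (a whole-parameter type after the pattern) looks like this in stable Rust. This is a small sketch: `Pair`, `sum_tuple`, and `sum_pair` are hypothetical names chosen for illustration.

```rust
/// A non-generic tuple struct, used to show destructuring in a parameter.
struct Pair(i32, i64);

/// Current syntax: the pattern is followed by one type for the whole parameter.
fn sum_tuple((x, y): (i32, i64)) -> i64 {
    i64::from(x) + y
}

/// The type must still be written today, even though the pattern already
/// names a non-generic struct.
fn sum_pair(Pair(a, b): Pair) -> i64 {
    i64::from(a) + b
}

fn main() {
    assert_eq!(sum_tuple((1, 2)), 3);
    assert_eq!(sum_pair(Pair(3, 4)), 7);
}
```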


@eddyb has implemented the expressions part of this RFC (see the
[PR](https://github.com/rust-lang/rust/pull/21836)).


# Drawbacks

More syntax, another feature in the language.

Interacts poorly with struct initialisers (changing the syntax for struct
literals has been [discussed and rejected](https://github.com/rust-lang/rfcs/pull/65)
and again in [discuss](http://internals.rust-lang.org/t/replace-point-x-3-y-5-with-point-x-3-y-5/198)).

If we introduce named arguments in the future, then type ascription on
expressions would make it more difficult to give them the same syntax as field
initialisers.


# Alternatives

We could do nothing and force programmers to use temporary variables to specify
a type. However, this is less ergonomic and has problems with scopes/lifetimes.
Patterns can be given a type as a whole rather than annotating a part of the
pattern.

We could allow type ascription in expressions but not patterns. This is a
smaller change and addresses most of the motivation.

We could rely on explicit coercions. The current plan ([RFC 401](https://github.com/rust-lang/rfcs/blob/master/text/0401-coercions.md))
is to allow explicit coercion to any valid type and to use a customisable lint
for trivial casts (that is, those given by subtyping, including the identity
case). If we allow trivial casts, then we could always use explicit coercions
instead of type ascription. However, we would then lose the distinction between
implicit coercions, which are safe, and explicit coercions, such as narrowing,
which require more programmer attention. This also does not help with patterns.


# Unresolved questions

Is the suggested precedence correct? Especially for patterns.

Does type ascription on patterns have backwards compatibility issues?