Create RFC for "return position enum impl trait"
yoshuawuyts committed Jan 5, 2023
1 parent 873890e commit 2ddaa30
Showing 1 changed file with 356 additions and 0 deletions.
356 changes: 356 additions & 0 deletions text/0000-multi-type-return-position-impl-trait.md
@@ -0,0 +1,356 @@
- Feature Name: (fill me in with a unique ident, `multi_type_return_position_impl_trait`)
- Start Date: (fill me in with today's date, 2023-01-05)
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

This RFC enables [Return Position Impl Trait (RPIT)][RPIT] to work in functions
which return more than one type. This is achieved by desugaring the return type
into an enum with members containing each of the returned types, and
implementing traits which delegate to those members:

[RPIT]: https://doc.rust-lang.org/stable/rust-by-example/trait/impl_trait.html#as-a-return-type

```rust
// Possible already
fn single_iter() -> impl Iterator<Item = i32> {
    1..10 // `std::ops::Range<i32>`
}

// Enabled by this RFC
fn multi_iter(x: i32) -> impl Iterator<Item = i32> {
    match x {
        0 => 1..10, // `std::ops::Range<i32>`
        _ => vec![5, 10].into_iter(), // `std::vec::IntoIter<i32>`
    }
}
```

# Motivation
[motivation]: #motivation

[Return Position Impl Trait (RPIT)][RPIT] is used when you want to return a value, but
don't want to specify the type. In today's Rust (1.66.0 at the time of writing)
it's only possible to use this when you're returning a single type from the
function. The moment multiple types are returned from the function, the compiler
will error. This can be frustrating, because it means you're likely to either
resort to using `Box<dyn Trait>` or manually construct an enum to map the
branches to. It's not always desirable or possible to use `Box<dyn Trait>`, and
constructing an enum manually can be time-intensive and complicated, and can
obfuscate the intent of the code.
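
For reference, the two workarounds mentioned above can be sketched roughly as follows. Both compile on stable Rust today. The names `boxed_iter`, `Either`, and `enum_iter` are made up for this illustration and are not part of the proposal:

```rust
// Workaround 1: box each iterator and use dynamic dispatch.
fn boxed_iter(x: i32) -> Box<dyn Iterator<Item = i32>> {
    match x {
        0 => Box::new(1..10),
        _ => Box::new(vec![5, 10].into_iter()),
    }
}

// Workaround 2: hand-write an enum and forward `Iterator` to its members.
enum Either {
    Left(std::ops::Range<i32>),
    Right(std::vec::IntoIter<i32>),
}

impl Iterator for Either {
    type Item = i32;

    fn next(&mut self) -> Option<Self::Item> {
        match self {
            Either::Left(iter) => iter.next(),
            Either::Right(iter) => iter.next(),
        }
    }
}

// Both branches now return the same concrete type, so RPIT is satisfied.
fn enum_iter(x: i32) -> impl Iterator<Item = i32> {
    match x {
        0 => Either::Left(1..10),
        _ => Either::Right(vec![5, 10].into_iter()),
    }
}
```

The first version pays for a heap allocation and dynamic dispatch; the second avoids both but requires the forwarding boilerplate to be written by hand.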

What we're proposing here is not so much a new feature, as an expansion of the
cases in which `impl Trait` can be used. We've seen previous efforts for this,
in particular [RFC 1951: Expand Impl Trait][rfc1951] and more recently in [RFC
2515: Type Alias Impl Trait (TAIT)][TAIT]. This continues that expansion by
enabling more code to make use of RPIT.

[rfc1951]: https://github.com/rust-lang/rfcs/blob/master/text/1951-expand-impl-trait.md
[TAIT]: https://rust-lang.github.io/rfcs/2515-type_alias_impl_trait.html

A motivating example for this is use in error handling: it's not uncommon to
have a function return more than one error type, but you may not necessarily
care about the exact errors returned. You may choose to return a `Box<dyn
Error + 'static>`, which has the downside that [it itself does not implement
`Error`][no-error]. Or you may choose to define your own enum of errors, which
can be a lot of work and may obfuscate the actual intent of the code. It may
sometimes be preferable to return an `impl Trait` instead:

[no-error]: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=97894fc907fa2d292cbe909467d4db4b

```rust
use std::error::Error;
use std::fs;

// ❌ Multi-type RPIT does not yet compile (Rust 1.66.0)
// error[E0282]: type annotations needed
fn main() -> Result<(), impl Error> {
    let num = i8::from_str_radix("A", 16)?; // `Result<_, std::num::ParseIntError>`
    let file = fs::read_to_string("./file.csv")?; // `Result<_, std::io::Error>`
    // ... use values here
    Ok(())
}
```
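
To give a sense of the work involved in the "define your own enum" route today, here is a minimal hand-written sketch. The names `ParseOrIo` and `parse_and_read` are hypothetical and exist only for this example. Note that `?` converts errors through `From`, so each member needs its own conversion in addition to the `Error` and `Display` forwarding:

```rust
use std::error::Error;
use std::fmt;
use std::fs;
use std::io;
use std::num::ParseIntError;

// Hypothetical hand-written error enum with one member per error type.
#[derive(Debug)]
enum ParseOrIo {
    Parse(ParseIntError),
    Io(io::Error),
}

// `Error` requires `Display`, so that has to be forwarded as well.
impl fmt::Display for ParseOrIo {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseOrIo::Parse(err) => fmt::Display::fmt(err, f),
            ParseOrIo::Io(err) => fmt::Display::fmt(err, f),
        }
    }
}

impl Error for ParseOrIo {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            ParseOrIo::Parse(err) => err.source(),
            ParseOrIo::Io(err) => err.source(),
        }
    }
}

// The `?` operator relies on these conversions.
impl From<ParseIntError> for ParseOrIo {
    fn from(err: ParseIntError) -> Self {
        ParseOrIo::Parse(err)
    }
}

impl From<io::Error> for ParseOrIo {
    fn from(err: io::Error) -> Self {
        ParseOrIo::Io(err)
    }
}

// Same shape as the example above, with inputs passed in as arguments.
fn parse_and_read(digits: &str, path: &str) -> Result<(i8, String), ParseOrIo> {
    let num = i8::from_str_radix(digits, 16)?;
    let file = fs::read_to_string(path)?;
    Ok((num, file))
}
```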

# Desugaring
[reference-level-explanation]: #reference-level-explanation

## Overview

Let's take another look at the code from the summary. This function has two
branches, each of which returns a different type that implements the
[`Iterator` trait][`Iterator`]:

[`Iterator`]: https://doc.rust-lang.org/std/iter/trait.Iterator.html

```rust
fn multi_iter(x: i32) -> impl Iterator<Item = i32> {
    match x {
        0 => 1..10, // `std::ops::Range<i32>`
        _ => vec![5, 10].into_iter(), // `std::vec::IntoIter<i32>`
    }
}
```

This code should be desugared by the compiler into something resembling the following
([playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=af4c0e61df25acaada168449df9838d3)):

```rust
// anonymous enum generated by the compiler
enum Enum {
    A(std::ops::Range<i32>),
    B(std::vec::IntoIter<i32>),
}

// trait implementation generated by the compiler,
// delegates to underlying enum member's values
impl Iterator for Enum {
    type Item = i32;

    fn next(&mut self) -> Option<Self::Item> {
        match self {
            Enum::A(iter) => iter.next(),
            Enum::B(iter) => iter.next(),
        }
    }

    // ..repeat for the remaining 74 `Iterator` trait methods
}

// the desugared function now returns the generated enum
fn multi_iter(x: i32) -> Enum {
    match x {
        0 => Enum::A(1..10),
        _ => Enum::B(vec![5, 10].into_iter()),
    }
}
```
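
The "repeat for the remaining methods" comment matters in practice: a provided method that is not forwarded falls back to the trait's default, which may be less precise than the member's own implementation. A small sketch of this, using `size_hint` (whose default returns `(0, None)`) and two hypothetical enums, `OnlyNext` and `Forwarding`, invented for this comparison:

```rust
use std::ops::Range;
use std::vec::IntoIter;

// Hypothetical enum that forwards only the required `next` method.
enum OnlyNext {
    A(Range<i32>),
    B(IntoIter<i32>),
}

impl Iterator for OnlyNext {
    type Item = i32;

    fn next(&mut self) -> Option<Self::Item> {
        match self {
            OnlyNext::A(iter) => iter.next(),
            OnlyNext::B(iter) => iter.next(),
        }
    }
}

// Hypothetical enum that also forwards the provided `size_hint` method.
enum Forwarding {
    A(Range<i32>),
    B(IntoIter<i32>),
}

impl Iterator for Forwarding {
    type Item = i32;

    fn next(&mut self) -> Option<Self::Item> {
        match self {
            Forwarding::A(iter) => iter.next(),
            Forwarding::B(iter) => iter.next(),
        }
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        match self {
            Forwarding::A(iter) => iter.size_hint(),
            Forwarding::B(iter) => iter.size_hint(),
        }
    }
}
```

Both enums yield the same items, but only `Forwarding` reports the exact length of the underlying iterator, which callers such as `collect` can use to preallocate.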

## Step-by-step guide

This desugaring can be implemented using the following steps:

1. Find all return expressions in the function
2. Define a new enum with a member for each of the function's return types
3. Implement the traits declared in the `-> impl Trait` bound for the new enum,
   matching on `self` and delegating to the enum's members
4. Substitute the `-> impl Trait` signature with the concrete enum
5. Wrap each of the function's return expressions in the appropriate enum member

The hardest part of implementing this RFC will likely be the actual trait
implementation on the enum, as each of the trait methods will need to be
delegated to the underlying types.
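
The steps above are not specific to `Iterator`. As a rough illustration, here they are applied by hand to a made-up trait. `Shape`, `Square`, `Rect`, `MakeShapeReturn`, and `make_shape` are all hypothetical names, and the generated enum would of course be anonymous in a real implementation:

```rust
// Hypothetical trait and types, used only for illustration.
trait Shape {
    fn area(&self) -> f64;
    fn name(&self) -> &'static str;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
    fn name(&self) -> &'static str {
        "square"
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
    fn name(&self) -> &'static str {
        "rect"
    }
}

// What the user would write under this RFC (does not compile today):
//
// fn make_shape(wide: bool) -> impl Shape {
//     if wide { Rect(2.0, 3.0) } else { Square(2.0) }
// }

// Steps 1 and 2: one enum member per type found at a return expression.
enum MakeShapeReturn {
    A(Rect),
    B(Square),
}

// Step 3: implement the declared bound, delegating every method.
impl Shape for MakeShapeReturn {
    fn area(&self) -> f64 {
        match self {
            MakeShapeReturn::A(shape) => shape.area(),
            MakeShapeReturn::B(shape) => shape.area(),
        }
    }
    fn name(&self) -> &'static str {
        match self {
            MakeShapeReturn::A(shape) => shape.name(),
            MakeShapeReturn::B(shape) => shape.name(),
        }
    }
}

// Steps 4 and 5: return the enum and wrap each return expression.
fn make_shape(wide: bool) -> MakeShapeReturn {
    if wide {
        MakeShapeReturn::A(Rect(2.0, 3.0))
    } else {
        MakeShapeReturn::B(Square(2.0))
    }
}
```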

# Interaction with lifetimes

`dyn Trait` already supports multi-type _dynamic_ dispatch. The lifetime rules
we're proposing for multi-type _static_ dispatch using `impl Trait` should
mirror the existing rules we apply to `dyn Trait`:

```rust
fn multi_iter<'a>(x: i32, iter_a: &'a mut std::ops::Range<i32>) -> impl Iterator<Item = i32> + 'a {
    match x {
        0 => iter_a, // `&'a mut std::ops::Range<i32>`
        _ => vec![5, 10].into_iter(), // `std::vec::IntoIter<i32>`
    }
}
```

This code should be desugared by the compiler into something resembling the following
([playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=60ddacbb20c4068a0fff44a5481a7136)):

```rust
enum Enum<'a> {
    A(&'a mut std::ops::Range<i32>),
    B(std::vec::IntoIter<i32>),
}

impl<'a> Iterator for Enum<'a> {
    type Item = i32;

    fn next(&mut self) -> Option<Self::Item> {
        match self {
            Enum::A(iter) => iter.next(),
            Enum::B(iter) => iter.next(),
        }
    }

    // ..repeat for the remaining 74 `Iterator` trait methods
}

fn multi_iter<'a>(x: i32, iter_a: &'a mut std::ops::Range<i32>) -> Enum<'a> {
    match x {
        0 => Enum::A(iter_a),
        _ => Enum::B(vec![5, 10].into_iter()),
    }
}
```

It should be fine if multiple iterators use the same lifetime. But only a single
lifetime should be permitted on the return type, as is the case today when
using `dyn Trait`:

```rust
// ❌ Fails to compile (Rust 1.66.0)
// error[E0226]: only a single explicit lifetime bound is permitted
fn fails<'a, 'b>() -> Box<dyn Iterator + 'a + 'b> {
    ...
}
```
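
For comparison, the single-lifetime `dyn Trait` form that these rules are meant to mirror compiles today. A small sketch, where the function name `multi_iter_dyn` is made up for this example:

```rust
use std::ops::Range;

// One lifetime bound on the trait object covers both branches: the borrowed
// range lives for `'a`, and the owned `vec::IntoIter<i32>` outlives any `'a`.
fn multi_iter_dyn<'a>(
    x: i32,
    iter_a: &'a mut Range<i32>,
) -> Box<dyn Iterator<Item = i32> + 'a> {
    match x {
        0 => Box::new(iter_a),
        _ => Box::new(vec![5, 10].into_iter()),
    }
}
```

Because the first branch borrows the range mutably rather than taking ownership, the caller's range is advanced in place when that branch is taken and left untouched otherwise.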

# Prior art
[prior-art]: #prior-art

## auto-enums crate

The [`auto-enums` crate][auto-enums] implements a limited variation of what is
proposed in this RFC using procedural macros. It's limited to a predefined set
of traits only, whereas this RFC enables multi-type RPIT to work for _all_
traits. This limitation exists in the proc macro because it doesn't have access
to the same type information as the compiler does, so the trait delegations
have to be authored by hand. Here's an example of the crate being used to
generate an `impl Iterator`:

[auto-enums]: https://docs.rs/auto_enums/latest/auto_enums/

```rust
use auto_enums::auto_enum;

#[auto_enum(Iterator)]
fn foo(x: i32) -> impl Iterator<Item = i32> {
    match x {
        0 => 1..10,
        _ => vec![5, 10].into_iter(),
    }
}
```

# Future possibilities
[future-possibilities]: #future-possibilities

## Anonymous enums

Rust provides a way to declare anonymous structs using tuples. But we don't yet
have a way to declare anonymous enums. A different way of interpreting the
current RFC is as a way to declare anonymous type-erased enums, by expanding what
RPIT can be used for. It stands to reason that there will be cases where people
may want anonymous _non-type-erased_ enums too.

Take for example the iterator code we've been using throughout this RFC. But
instead of `Iterator` yielding `i32`, let's make it yield `i32` or `&'static
str`:

```rust
fn multi_iter(x: i32) -> impl Iterator<Item = /* which type? */> {
    match x {
        0 => 1..10, // yields `i32`
        _ => vec!["hello", "world"].into_iter(), // yields `&'static str`
    }
}
```

One solution to make it compile would be to first map it to a type which can
hold *either* `i32` or `&'static str`. The obvious answer would be to use an
enum for this:

```rust
enum Enum {
    A(i32),
    B(&'static str),
}

fn multi_iter(x: i32) -> impl Iterator<Item = Enum> {
    match x {
        // parentheses are required: `1..10.map(..)` would parse as `1..(10.map(..))`
        0 => (1..10).map(Enum::A),
        _ => vec!["hello", "world"].into_iter().map(Enum::B),
    }
}
```

This code resembles the desugaring for multi-value RPIT we're proposing in this
RFC. In fact: it may very well be that a lot of the internal compiler machinery
used for multi-RPIT could be reused for anonymous enums.
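
Note that the snippet above still relies on multi-type RPIT, since the two branches return different `Map` types. As a point of comparison, a version that compiles on stable Rust today has to box the iterator. A minimal sketch, with `IntOrStr` and `multi_iter_boxed` as hypothetical names standing in for the anonymous enum and the function:

```rust
// Hypothetical named stand-in for an anonymous item enum.
#[derive(Debug, PartialEq)]
enum IntOrStr {
    Int(i32),
    Str(&'static str),
}

// Each branch maps its items into the shared enum, then erases the
// iterator type behind a trait object.
fn multi_iter_boxed(x: i32) -> Box<dyn Iterator<Item = IntOrStr>> {
    match x {
        0 => Box::new((1..10).map(IntOrStr::Int)),
        _ => Box::new(vec!["hello", "world"].into_iter().map(IntOrStr::Str)),
    }
}
```

Unlike the type-erased iterator, the item enum stays visible to the caller, who can still `match` on `IntOrStr::Int` and `IntOrStr::Str`.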

The similarities might become even closer if we consider how "anonymous enums"
could be used for error handling. Sometimes it can be useful to know which error
was returned, so you can decide how to handle it. For this RPIT isn't enough: we
actually want to retain the underlying types so we can match on them. We might
imagine the earlier error example could instead be written like this:

```rust
use std::{fs, io, num};

// The earlier multi-value RPIT version returned `-> Result<(), impl Error>`.
// This example declares an anonymous enum instead, using made-up syntax
fn main() -> Result<(), num::ParseIntError | io::Error> {
    let num = i8::from_str_radix("A", 16)?; // `Result<_, std::num::ParseIntError>`
    let file = fs::read_to_string("./file.csv")?; // `Result<_, std::io::Error>`
    // ... use values here
    Ok(())
}
```

There are a lot of questions to be answered here. Which traits should
this implement? What should the declaration syntax be? How could we match on
values? All enough to warrant its own exploration and possible RFC in the
future.

## Language-level support for delegation/proxies

One of the trickiest parts of implementing this RFC will be to delegate from the
generated enum to the enum's individual members. If we implement this
functionality in the compiler, it may be beneficial to generalize it and create
syntax for it. We've already seen [limited support for
delegation codegen][support] in Rust-Analyzer as a source action [^disclaimer], and [various crates]
implementing delegation exist on Crates.io.

[support]: https://github.com/rust-lang/rust-analyzer/issues/5944
[various crates]: https://crates.io/search?q=delegate

[^disclaimer]: I (Yosh) filed the issue and authored the extension to Rust-Analyzer
for this, which itself was based on prior art found in the VS Code Java extension.

To provide some sense of what this might look like, say we were authoring some
[newtype] which wraps an iterator. We could imagine we'd write that in Rust
by hand today like this:

[newtype]: https://doc.rust-lang.org/rust-by-example/generics/new_types.html

```rust
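// tuple struct fields are unnamed; `std::vec::IntoIter` is used here as a
// concrete iterator type to wrap
struct NewIterator<T>(std::vec::IntoIter<T>);

impl<T> Iterator for NewIterator<T> {
    type Item = T;

    // trait methods take no visibility qualifier
    #[inline]
    fn next(&mut self) -> Option<Self::Item> {
        self.0.next()
    }

    // ..repeat for the remaining 74 `Iterator` trait methods
}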
```

Forwarding a single trait with a single method is doable. But we can imagine
that repeating this for multiple traits and methods quickly becomes a hassle,
and can obfuscate the _intent_ of the code. Instead, we could declare that
`NewIterator` should _delegate_ its `Iterator` implementation to the iterator
contained within. If we adopted a [Kotlin-like syntax], we could imagine it
looking like this:

[Kotlin-like syntax]: https://kotlinlang.org/docs/delegation.html#overriding-a-member-of-an-interface-implemented-by-delegation

```rust
struct NewIterator<T>(std::vec::IntoIter<T>);

impl<T> Iterator for NewIterator<T> by Self.0; // Use `Self.0` as the `Iterator` impl
```

There are many open questions here regarding semantics, syntax, and expanding it
to other features such as method delegation. But given the codegen for both
multi-value RPIT and delegation will share similarities, it may be worth
exploring further in the future.
