diff --git a/text/0000-multi-type-return-position-impl-trait.md b/text/0000-multi-type-return-position-impl-trait.md
new file mode 100644
index 00000000000..41f075e2c01
--- /dev/null
+++ b/text/0000-multi-type-return-position-impl-trait.md
@@ -0,0 +1,356 @@
+- Feature Name: (fill me in with a unique ident, `multi_type_return_position_impl_trait`)
+- Start Date: (fill me in with today's date, 2023-01-05)
+- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
+- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)
+
+# Summary
+[summary]: #summary
+
+This RFC enables [Return Position Impl Trait (RPIT)][RPIT] to work in functions
+which return more than one type. This is achieved by desugaring the return type
+into an enum with members containing each of the returned types, and
+implementing traits which delegate to those members:
+
+[RPIT]: https://doc.rust-lang.org/stable/rust-by-example/trait/impl_trait.html#as-a-return-type
+
+```rust
+// Possible already
+fn single_iter() -> impl Iterator<Item = i32> {
+    1..10 // `std::ops::Range<i32>`
+}
+
+// Enabled by this RFC
+fn multi_iter(x: i32) -> impl Iterator<Item = i32> {
+    match x {
+        0 => 1..10, // `std::ops::Range<i32>`
+        _ => vec![5, 10].into_iter(), // `std::vec::IntoIter<i32>`
+    }
+}
+```
+
+# Motivation
+[motivation]: #motivation
+
+[Return Position Impl Trait (RPIT)][RPIT] is used when you want to return a value, but
+don't want to specify the type. In today's Rust (1.66.0 at the time of writing)
+it's only possible to use this when you're returning a single type from the
+function. The moment multiple types are returned from the function, the compiler
+will error. This can be frustrating, because it means you're likely to either
+resort to using `Box<dyn Trait>` or manually construct an enum to map the
+branches to. It's not always desirable or possible to use `Box<dyn Trait>`. And
+constructing an enum manually can be time-intensive and complicated, and can
+obfuscate the intent of the code.
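For comparison, here is a minimal sketch of the boxing workaround mentioned above, which compiles on today's Rust. The function name `multi_iter_boxed` is hypothetical and used only for this illustration; it trades a heap allocation and dynamic dispatch for the ability to return two different iterator types.

```rust
// Workaround available today: erase both branch types behind a trait object.
// `multi_iter_boxed` is a hypothetical name used only for this sketch.
fn multi_iter_boxed(x: i32) -> Box<dyn Iterator<Item = i32>> {
    match x {
        // Each arm is boxed separately and coerced to the trait object type.
        0 => Box::new(1..10),                   // boxes a `std::ops::Range<i32>`
        _ => Box::new(vec![5, 10].into_iter()), // boxes a `std::vec::IntoIter<i32>`
    }
}
```

Every call allocates, and every `next` goes through a vtable, which is the cost this RFC aims to avoid.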
+
+What we're proposing here is not so much a new feature, as an expansion of the
+cases in which `impl Trait` can be used. We've seen previous efforts for this,
+in particular [RFC 1951: Expand Impl Trait][rfc1951] and more recently in [RFC
+2515: Type Alias Impl Trait (TAIT)][TAIT]. This continues that expansion by
+enabling more code to make use of RPIT.
+
+[rfc1951]: https://github.com/rust-lang/rfcs/blob/master/text/1951-expand-impl-trait.md
+[TAIT]: https://rust-lang.github.io/rfcs/2515-type_alias_impl_trait.html
+
+A motivating example for this is use in error handling: it's not uncommon to
+have a function return more than one error type, but you may not necessarily
+care about the exact errors returned. You may either choose to return a
+`Box<dyn Error>`, which has the downside that [it itself does not implement
+`Error`][no-error]. Or you may choose to define your own enum of errors, which
+can be a lot of work and may obfuscate the actual intent of the code. It may
+sometimes be preferable to return an `impl Trait` instead:
+
+[no-error]: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=97894fc907fa2d292cbe909467d4db4b
+
+```rust
+use std::error::Error;
+use std::fs;
+
+// ❌ Multi-type RPIT does not yet compile (Rust 1.66.0)
+// error[E0282]: type annotations needed
+fn main() -> Result<(), impl Error> {
+    let num = i8::from_str_radix("A", 16)?; // `Result<_, std::num::ParseIntError>`
+    let file = fs::read_to_string("./file.csv")?; // `Result<_, std::io::Error>`
+    // ... use values here
+    Ok(())
+}
+```
+
+# Desugaring
+[reference-level-explanation]: #reference-level-explanation
+
+## Overview
+
+Let's take a look again at the code from our summary section.
+This function has two branches which each return a different type which
+implements the [`Iterator` trait][`Iterator`]:
+
+[`Iterator`]: https://doc.rust-lang.org/std/iter/trait.Iterator.html
+
+```rust
+fn multi_iter(x: i32) -> impl Iterator<Item = i32> {
+    match x {
+        0 => 1..10, // `std::ops::Range<i32>`
+        _ => vec![5, 10].into_iter(), // `std::vec::IntoIter<i32>`
+    }
+}
+```
+
+This code should be desugared by the compiler into something resembling the following
+([playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=af4c0e61df25acaada168449df9838d3)):
+
+```rust
+// anonymous enum generated by the compiler
+enum Enum {
+    A(std::ops::Range<i32>),
+    B(std::vec::IntoIter<i32>),
+}
+
+// trait implementation generated by the compiler,
+// delegates to underlying enum member's values
+impl Iterator for Enum {
+    type Item = i32;
+
+    fn next(&mut self) -> Option<Self::Item> {
+        match self {
+            Enum::A(iter) => iter.next(),
+            Enum::B(iter) => iter.next(),
+        }
+    }
+
+    // ..repeat for the remaining 74 `Iterator` trait methods
+}
+
+// the desugared function now returns the generated enum
+fn multi_iter(x: i32) -> Enum {
+    match x {
+        0 => Enum::A(1..10),
+        _ => Enum::B(vec![5, 10].into_iter()),
+    }
+}
+```
+
+## Step-by-step guide
+
+This desugaring can be implemented using the following steps:
+
+1. Find all return expressions in the function
+2. Define a new enum with a member for each of the function's return types
+3. Implement the traits declared in the `-> impl Trait` bound for the new enum,
+   matching on `self` and delegating to the enum's members
+4. Substitute the `-> impl Trait` signature with the concrete enum
+5. Wrap each of the function's return expressions in the appropriate enum member
+
+The hardest part of implementing this RFC will likely be the actual trait
+implementation on the enum, as each of the trait methods will need to be
+delegated to the underlying types.
+
+# Interaction with lifetimes
+
+`dyn Trait` already supports multi-type _dynamic_ dispatch.
+The rules we're
+proposing for multi-type _static_ dispatch using `impl Trait` should mirror the
+existing rules we apply to `dyn Trait`. In particular, we should follow the same
+lifetime rules for multi-type `impl Trait` as we do for `dyn Trait`:
+
+```rust
+fn multi_iter<'a>(x: i32, iter_a: &'a mut std::ops::Range<i32>) -> impl Iterator<Item = i32> + 'a {
+    match x {
+        0 => iter_a, // `&'a mut std::ops::Range<i32>`
+        _ => vec![5, 10].into_iter(), // `std::vec::IntoIter<i32>`
+    }
+}
+```
+
+This code should be desugared by the compiler into something resembling the following
+([playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=60ddacbb20c4068a0fff44a5481a7136)):
+
+```rust
+enum Enum<'a> {
+    A(&'a mut std::ops::Range<i32>),
+    B(std::vec::IntoIter<i32>),
+}
+
+impl<'a> Iterator for Enum<'a> {
+    type Item = i32;
+
+    fn next(&mut self) -> Option<Self::Item> {
+        match self {
+            Enum::A(iter) => iter.next(),
+            Enum::B(iter) => iter.next(),
+        }
+    }
+
+    // ..repeat for the remaining 74 `Iterator` trait methods
+}
+
+fn multi_iter<'a>(x: i32, iter_a: &'a mut std::ops::Range<i32>) -> Enum<'a> {
+    match x {
+        0 => Enum::A(iter_a),
+        _ => Enum::B(vec![5, 10].into_iter()),
+    }
+}
+```
+
+It should be fine if multiple iterators use the same lifetime. But only a single
+lifetime should be permitted on the return type, as is the case today when
+using `dyn Trait`:
+
+```rust
+// ❌ Fails to compile (Rust 1.66.0)
+// error[E0226]: only a single explicit lifetime bound is permitted
+fn fails<'a, 'b>() -> Box<dyn Iterator<Item = i32> + 'a + 'b> {
+    ...
+}
+```
+
+# Prior art
+[prior-art]: #prior-art
+
+## auto-enums crate
+
+The [`auto-enums` crate][auto-enums] implements a limited variation of what is
+proposed in this RFC using procedural macros. It's limited to a predefined set
+of traits only, whereas this RFC enables multi-type RPIT to work for _all_
+traits. This limitation exists in the proc macro because it doesn't have access
+to the same type information as the compiler does, so the trait delegations
+have to be authored by hand.
+Here's an example of the crate being used to
+generate an `impl Iterator`:
+
+[auto-enums]: https://docs.rs/auto_enums/latest/auto_enums/
+
+```rust
+use auto_enums::auto_enum;
+
+#[auto_enum(Iterator)]
+fn foo(x: i32) -> impl Iterator<Item = i32> {
+    match x {
+        0 => 1..10,
+        _ => vec![5, 10].into_iter(),
+    }
+}
+```
+
+# Future possibilities
+[future-possibilities]: #future-possibilities
+
+## Anonymous enums
+
+Rust provides a way to declare anonymous structs using tuples. But we don't yet
+have a way to declare anonymous enums. A different way of interpreting the
+current RFC is as a way to declare anonymous type-erased enums, by expanding what
+RPIT can be used for. It stands to reason that there will be cases where people
+may want anonymous _non-type-erased_ enums too.
+
+Take for example the iterator code we've been using throughout this RFC, but
+instead of the `Iterator` yielding only `i32`, let's make it yield either `i32`
+or `&'static str`:
+
+```rust
+fn multi_iter(x: i32) -> impl Iterator {
+    match x {
+        0 => 1..10, // yields `i32`
+        _ => vec!["hello", "world"].into_iter(), // yields `&'static str`
+    }
+}
+```
+
+One solution to make it compile would be to first map it to a type which can
+hold *either* `i32` or `&'static str`. The obvious answer would be to use an
+enum for this:
+
+```rust
+enum Enum {
+    A(i32),
+    B(&'static str),
+}
+
+fn multi_iter(x: i32) -> impl Iterator<Item = Enum> {
+    match x {
+        // parentheses are required: `1..10.map(..)` would parse as `1..(10.map(..))`
+        0 => (1..10).map(Enum::A),
+        _ => vec!["hello", "world"].into_iter().map(Enum::B),
+    }
+}
+```
+
+This code resembles the desugaring for multi-value RPIT we're proposing in this
+RFC. In fact, it may very well be that a lot of the internal compiler machinery
+used for multi-RPIT could be reused for anonymous enums.
+
+The similarities might become even stronger if we consider how "anonymous enums"
+could be used for error handling. Sometimes it can be useful to know which error
+was returned, so you can decide how to handle it.
+For this, RPIT isn't enough: we
+actually want to retain the underlying types so we can match on them. We might
+imagine the earlier error example could instead be written like this:
+
+```rust
+use std::{fs, io, num};
+
+// The earlier multi-value RPIT version returned `-> Result<(), impl Error>`.
+// This example declares an anonymous enum instead, using made-up syntax
+fn main() -> Result<(), num::ParseIntError | io::Error> {
+    let num = i8::from_str_radix("A", 16)?; // `Result<_, std::num::ParseIntError>`
+    let file = fs::read_to_string("./file.csv")?; // `Result<_, std::io::Error>`
+    // ... use values here
+    Ok(())
+}
+```
+
+There are a lot of questions to be answered here. Which traits should
+this implement? What should the declaration syntax be? How could we match on
+values? All of this is enough to warrant its own exploration and possible RFC
+in the future.
+
+## Language-level support for delegation/proxies
+
+One of the trickiest parts of implementing this RFC will be to delegate from the
+generated enum to the enum's individual members. If we implement this
+functionality in the compiler, it may be beneficial to generalize this
+functionality and create syntax for it. We've already seen [limited support for
+delegation codegen][support] in Rust-Analyzer as a source action [^disclaimer], and [various crates]
+implementing delegation exist on Crates.io.
+
+[support]: https://github.com/rust-lang/rust-analyzer/issues/5944
+[various crates]: https://crates.io/search?q=delegate
+
+[^disclaimer]: I (Yosh) filed the issue and authored the extension to Rust-Analyzer
+for this, which itself was based on prior art found in the VS Code Java extension.
+
+To provide some sense of what this might look like, say we were authoring some
+[newtype] which wraps an iterator.
+We could imagine we'd write that in Rust
+by hand today like this:
+
+[newtype]: https://doc.rust-lang.org/rust-by-example/generics/new_types.html
+
+```rust
+// newtype wrapping the by-value array iterator from the standard library
+struct NewIterator<T, const N: usize>(std::array::IntoIter<T, N>);
+
+impl<T, const N: usize> Iterator for NewIterator<T, N> {
+    type Item = T;
+
+    #[inline]
+    fn next(&mut self) -> Option<Self::Item> {
+        self.0.next()
+    }
+
+    // ..repeat for the remaining 74 `Iterator` trait methods
+}
+```
+
+Forwarding a single trait with a single method is doable. But we can imagine
+that repeating this for multiple traits and methods quickly becomes a hassle,
+and can obfuscate the _intent_ of the code. Instead, we could declare that
+`NewIterator` should _delegate_ its `Iterator` implementation to the iterator
+contained within. If we adopted a [Kotlin-like syntax], we could imagine it
+could look like this:
+
+[Kotlin-like syntax]: https://kotlinlang.org/docs/delegation.html#overriding-a-member-of-an-interface-implemented-by-delegation
+
+```rust
+struct NewIterator<T, const N: usize>(std::array::IntoIter<T, N>);
+
+impl<T, const N: usize> Iterator for NewIterator<T, N> by Self.0; // Use `Self.0` as the `Iterator` impl
+```
+
+There are many open questions here regarding semantics, syntax, and expanding it
+to other features such as method delegation. But given the codegen for both
+multi-value RPIT and delegation will share similarities, it may be worth
+exploring further in the future.
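To make the shared shape of that codegen a little more concrete, the sketch below writes out by hand, in today's Rust, the kind of forwarding impl both features would need to generate. The names `Either`, `Left`, and `Right` are hypothetical and chosen only for illustration; an actual implementation would presumably use a compiler-generated anonymous type and forward every trait method rather than the two shown here.

```rust
// Hand-written stand-in for a compiler-generated enum (hypothetical names).
enum Either<A, B> {
    Left(A),
    Right(B),
}

// Forward `Iterator` to whichever variant is active.
impl<A, B> Iterator for Either<A, B>
where
    A: Iterator,
    B: Iterator<Item = A::Item>,
{
    type Item = A::Item;

    fn next(&mut self) -> Option<Self::Item> {
        match self {
            Either::Left(iter) => iter.next(),
            Either::Right(iter) => iter.next(),
        }
    }

    // Provided methods are forwarded as well, so the inner types' more
    // precise implementations are not replaced by the trait defaults.
    fn size_hint(&self) -> (usize, Option<usize>) {
        match self {
            Either::Left(iter) => iter.size_hint(),
            Either::Right(iter) => iter.size_hint(),
        }
    }
}

// The function from the summary, with the wrapping done manually.
fn multi_iter(x: i32) -> impl Iterator<Item = i32> {
    match x {
        0 => Either::Left(1..10),
        _ => Either::Right(vec![5, 10].into_iter()),
    }
}
```

Forwarding `size_hint` in addition to `next` illustrates why the generated impl should delegate provided methods too: falling back to the trait defaults would still be correct, but would discard information the underlying iterators already have.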