Variant docs (#740)
* untagged variants

* variant type spread docs

* variant coercion

* catch-all case

* rearrange
zth authored Nov 2, 2023
1 parent 959daba commit 1bf628f
Showing 1 changed file with 198 additions and 34 deletions.
232 changes: 198 additions & 34 deletions pages/docs/manual/latest/variant.mdx
@@ -152,6 +152,26 @@ var me = {

The output is slightly uglier and less performant than the former.

## Variant Type Spreads
Just like [with records](record#record-type-spread), it's possible to use type spreads to create new variants from other variants:

```rescript
type a = One | Two | Three
type b = | ...a | Four | Five
```

Type `b` is now:
```rescript
type b = One | Two | Three | Four | Five
```

Type spreads act as a 'copy-paste', meaning all constructors are copied as-is from `a` to `b`. Here are the rules for spreads to work:
- You can't overwrite constructors, so the same constructor name can exist in only one place as you spread. This is true even if the constructors are identical.
- All variants and constructors must share the same runtime configuration - `@unboxed`, `@tag`, `@as` and so on.
- You can't spread types in recursive definitions.

Note that you need a leading `|` if you want to use a spread in the first position of a variant definition.

### Pattern Matching On Variant

See the [Pattern Matching/Destructuring](pattern-matching-destructuring) section later.
@@ -160,10 +180,9 @@ See the [Pattern Matching/Destructuring](pattern-matching-destructuring) section

A variant value compiles to 3 possible JavaScript outputs depending on its type declaration:

- If the variant value is a constructor with no payload, it compiles to a string of the constructor name. Example: `Yes` compiles to `"Yes"`.
- If it's a constructor with a payload, it compiles to an object with the field `TAG` and the field `_0` for the first payload, `_1` for the second payload, etc. By default, the value of `TAG` is the constructor name as a string. Both the name of the `TAG` field and the string value used for each constructor [can be customized](#tagged-variants).
- Labeled variant payloads (the inline record trick earlier) compile to an object with the label names instead of `_0`, `_1`, etc. The object will have the `TAG` field as per the previous rule.
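
As a rough illustration of these rules, here is a hand-written JavaScript sketch of what such values look like at runtime. The `answer` type in the comment and the `describe` helper are made up for this example, and this is not actual compiler output.

```javascript
// Hypothetical ReScript type, used only for this illustration:
//   type answer = Yes | Maybe(string) | Custom({reason: string})

// No payload: just the constructor name as a string.
var yes = "Yes";

// Positional payload: an object with `TAG` plus `_0`, `_1`, etc.
var maybe = { TAG: "Maybe", _0: "not sure yet" };

// Labeled payload (inline record): label names instead of `_0`.
var custom = { TAG: "Custom", reason: "it depends" };

// Hypothetical helper: tells the shapes apart the way hand-written JS could.
function describe(value) {
  if (typeof value === "string") {
    return "payloadless constructor " + value;
  }
  return "constructor " + value.TAG + " with payload";
}

console.log(describe(yes)); // payloadless constructor Yes
console.log(describe(maybe)); // constructor Maybe with payload
console.log(describe(custom)); // constructor Custom with payload
```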

Check the output in these examples:

Expand Down Expand Up @@ -294,7 +313,7 @@ Now, this maps 100% to the TypeScript code, including letting us bring over the

### String literals

The same logic is easily applied to string literals from TypeScript, only here the benefit is even larger, because string literals have the same limitations in TypeScript that polymorphic variants have in ReScript:

```typescript
// direction.ts
@@ -303,9 +322,18 @@ type direction = "UP" | "DOWN" | "LEFT" | "RIGHT";

There's no way to attach documentation strings to string literals in TypeScript, and you only get the actual value to interact with.

### Valid `@as` payloads
Here's a list of everything you can put in the `@as` tag of a variant constructor:
- A string literal: `@as("success")`
- An int: `@as(5)`
- A float: `@as(1.5)`
- True/false: `@as(true)` and `@as(false)`
- Null: `@as(null)`
- Undefined: `@as(undefined)`

## Untagged variants

With _untagged variants_ it is possible to mix types together that normally can't be mixed in the ReScript type system, as long as there's a way to discriminate them at runtime. For example, with untagged variants you can represent a heterogeneous array:

```rescript
@unboxed type listItemValue = String(string) | Boolean(bool) | Number(float)
@@ -323,10 +351,38 @@ var myArray = ["hello", true, false, 13.37];

In the above example, reaching back into the values is as simple as pattern matching on them.

### Advanced: Unboxing rules
#### No overlap in constructors
A variant can be unboxed if no constructors have overlap in their runtime representation.

For example, you can't have `String1(string) | String2(string)` in the same unboxed variant, because ReScript has no way of knowing at runtime whether a given `string` belongs to `String1` or `String2`; it could belong to either.
The same goes for two records: even if they have completely different shapes, they're both a JavaScript `object` at runtime.

Don't worry - the compiler will guide you and ensure there's no overlap.
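
To see why overlap is a problem, consider this small JavaScript sketch (hand-written for illustration, with made-up values). Once the constructors are stripped away, nothing is left to tell the two cases apart.

```javascript
// What `String1("hello")` and `String2("hello")` would both become if
// unboxing them were allowed: the same plain string.
var fromString1 = "hello";
var fromString2 = "hello";
console.log(fromString1 === fromString2); // true

// Two records with different shapes (made-up examples) are still both
// reported as "object" by `typeof`.
var user = { name: "Ada" };
var point = { x: 1, y: 2 };
console.log(typeof user === typeof point); // true
```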

#### What you can unbox
Here's a list of all possible things you can unbox:
- `string`: `String(string)`
- `float`: `Number(float)`. Notice `int` cannot be unboxed, because JavaScript only has `number` (not actually `int` and `float` like in ReScript) so we can't disambiguate between `float` and `int` at runtime.
- `bool`: `Boolean(bool)`
- `array<'value>`: `List(array<string>)`
- `promise<'value>`: `Promise(promise<string>)`
- `Dict.t`: `Object(Dict.t<string>)`
- `Date.t`: `Date(Date.t)`. A JavaScript date.
- `Blob.t`: `Blob(Blob.t)`. A JavaScript blob.
- `File.t`: `File(File.t)`. A JavaScript file.
- `RegExp.t`: `RegExp(RegExp.t)`. A JavaScript regexp instance.

Again, notice that the constructor names can be anything. What matters is what's in the payload.

> **Under the hood**: Untagged variants use a combination of JavaScript `typeof` and `instanceof` checks to discern between unboxed constructors at runtime. This means that we could add more things to the list above detailing what can be unboxed, if there are useful enough use cases.

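
To make the idea concrete, here is a hand-written sketch of how such checks could discriminate most of the payload types listed above. It is not the compiler's actual output: the `classify` function and its return labels are hypothetical, using `Array.isArray` for arrays is an assumption of this sketch, and `Blob.t` and `File.t` are left out to keep it short.

```javascript
// Hypothetical classifier: returns the name of the ReScript payload type
// a runtime value would correspond to in an untagged variant.
function classify(value) {
  // Arrays and class instances need more than `typeof`, which reports
  // "object" for all of them.
  if (Array.isArray(value)) return "array";
  if (value instanceof Promise) return "promise";
  if (value instanceof Date) return "Date.t";
  if (value instanceof RegExp) return "RegExp.t";

  switch (typeof value) {
    case "string":
      return "string";
    case "number":
      // JavaScript has a single number type, hence `float` and not `int`.
      return "float";
    case "boolean":
      return "bool";
    case "object":
      // Any remaining non-null object is treated as a dictionary here.
      return value === null ? "unknown" : "Dict.t";
    default:
      return "unknown";
  }
}

console.log(classify("hello")); // string
console.log(classify(13.37)); // float
console.log(classify([1, 2])); // array
console.log(classify(new Date(0))); // Date.t
```
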
### Pattern matching on unboxed variants
Pattern matching works the same on unboxed variants as it does on regular variants. In fact, from the perspective of ReScript's type system there's no difference between untagged and tagged variants. You can do virtually the same things with both. That's the beauty of untagged variants - they're just variants to you as a developer.

Here's an example of pattern matching on an unboxed nullable value that illustrates the above:

```rescript
// The type definition below is inlined here as an example, but this definition will live in [Core](https://github.com/rescript-association/rescript-core) and be easily accessible
module Null = {
@unboxed type t<'a> = Present('a) | @as(null) Null
}
@@ -345,12 +401,13 @@ let getBestFriendsAge = user =>
| _ => None
}
```
This is no different from how you'd match on a regular variant. The runtime representation, however, is different from that of a regular variant.

> Notice how `@as` allows us to say that an untagged variant case should map to a specific underlying _primitive_. `Present` has a type variable, so it can hold any type. And since it's an unboxed type, only the payloads `'a` or `null` will be kept at runtime. That's where the magic comes from.
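
In JavaScript terms, matching on such a value boils down to a `null` check. The following hand-written sketch shows the idea; the `describeNullable` helper is hypothetical and is not compiler output.

```javascript
// `Present(x)` is just `x` at runtime, and `Null` is the literal `null`.
function describeNullable(value) {
  if (value === null) {
    return "Null";
  }
  return "Present(" + JSON.stringify(value) + ")";
}

console.log(describeNullable(null)); // Null
console.log(describeNullable(42)); // Present(42)
console.log(describeNullable("Ada")); // Present("Ada")
```
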
### Decoding and encoding JSON idiomatically

With untagged variants, we have everything we need to define a native JSON type:

```rescript
@unboxed
@@ -370,9 +427,8 @@ Here's an example of how you could write your own JSON decoders easily using the
```rescript
@unboxed
type rec json =
| @as(null) Null
| Boolean(bool)
| String(string)
| Number(float)
| Object(Js.Dict.t<json>)
Expand Down Expand Up @@ -432,43 +488,151 @@ let usersToJson = users => Array(users->Array.map(userToJson))

This can be extrapolated to many more cases.

### Advanced: Catch-all Constructors
With untagged variants comes a rather interesting capability - catch-all cases can now be encoded directly into a variant.

Let's look at how it works. Imagine you're using a third-party API that returns a list of available animals. You could of course model it as a regular `string`, but given that variants can be used as "typed strings", using a variant would give you much more benefit:

<CodeTab labels={["ReScript", "JS Output"]}>
```rescript
type animal = Dog | Cat | Bird

type apiResponse = {
  animal: animal
}

let greetAnimal = (animal: animal) =>
  switch animal {
  | Dog => "Wof"
  | Cat => "Meow"
  | Bird => "Kashiiin"
  }
```
```javascript
```
</CodeTab>

This is all fine and good as long as the API returns `"Dog"`, `"Cat"` or `"Bird"` for `animal`.
However, what if the API changes before you have a chance to deploy new code, and can now return `"Turtle"` as well? Your code would break down because the variant `animal` doesn't cover `"Turtle"`.

So, we'll need to go back to `string`, losing all of the goodies of using a variant, and then do manual conversion into the `animal` variant from `string`, right?
Well, this used to be the case, but not anymore! We can leverage untagged variants to bake handling of unknown values into the variant itself.

Let's update our type definition first:
```rescript
@unboxed
type animal = Dog | Cat | Bird | UnknownAnimal(string)
```

Notice we've added `@unboxed` and the constructor `UnknownAnimal(string)`. Remember how untagged variants work? You remove the constructors and just leave the payloads. This means that the variant above at runtime translates to this (made up) JavaScript type:
```
type animal = "Dog" | "Cat" | "Bird" | string
```
So, any string not mapping directly to one of the payloadless constructors will now map to the general `string` case.

As soon as we've added this, the compiler complains that we now need to handle this additional case in our pattern match as well. Let's fix that:

<CodeTab labels={["ReScript", "JS Output"]}>
```rescript
@unboxed
type animal = Dog | Cat | Bird | UnknownAnimal(string)

type apiResponse = {
  animal: animal
}

let greetAnimal = (animal: animal) =>
  switch animal {
  | Dog => "Wof"
  | Cat => "Meow"
  | Bird => "Kashiiin"
  | UnknownAnimal(otherAnimal) =>
    `I don't know how to greet animal ${otherAnimal}`
  }
```
```javascript
function greetAnimal(animal) {
  if (!(animal === "Cat" || animal === "Dog" || animal === "Bird")) {
    return "I don't know how to greet animal " + animal;
  }
  switch (animal) {
    case "Dog" :
        return "Wof";
    case "Cat" :
        return "Meow";
    case "Bird" :
        return "Kashiiin";

  }
}
```
</CodeTab>

There! Now the external API can change as much as it wants, and we'll still be forced to write all code that interfaces with `animal` in a safe way that handles all possible cases. All of this is baked into the variant definition itself, so there's no need for labor-intensive manual conversion.

This is useful in any scenario where you use something enum-style that's external and might change. It's also useful when something external has a large number of known possible values, but you only care about a subset of them. With a catch-all case you don't need to bind to all of them just because they can occur; you can safely bind to the ones you care about and let the catch-all case handle the rest.

## Coercion
In certain situations, variants can be coerced to other variants, or to and from primitives. Coercion is always zero cost.

### Coercing Variants to Other Variants
You can coerce a variant to another variant if they're identical in runtime representation, and additionally if the variant you're coercing can be represented as the variant you're coercing to.

Here's an example using [variant type spreads](#variant-type-spreads):
```rescript
type a = One | Two | Three
type b = | ...a | Four | Five
let one: a = One
let four: b = Four
// This works because type `b` can always represent type `a` since all of type `a`'s constructors are spread into type `b`
let oneAsTypeB = (one :> b)
```

### Coercing Variants to Primitives
Variants that are guaranteed to always be represented by a single primitive at runtime can be coerced to that primitive.

It works with strings, the default runtime representation of payloadless constructors:
```rescript
// Constructors without payloads are represented as `string` by default
type a = One | Two | Three
let one: a = One
// All constructors are strings at runtime, so you can safely coerce it to a string
let oneAsString = (one :> string)
```

If you were to configure all of your constructors to be represented as `int` or `float`, you could coerce to those too:
```rescript
type asInt = | @as(1) One | @as(2) Two | @as(3) Three
let oneInt: asInt = One
let toInt = (oneInt :> int)
```
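
Since coercion is zero cost, no conversion code is needed at runtime. Below is a hand-written JavaScript sketch of what the two examples above amount to. The real compiler output may differ in details such as inlining.

```javascript
// type a = One | Two | Three
// Payloadless constructors are strings by default.
var one = "One";
// `(one :> string)` reuses the value as-is.
var oneAsString = one;

// type asInt = | @as(1) One | @as(2) Two | @as(3) Three
var oneInt = 1;
// `(oneInt :> int)` likewise needs no conversion.
var toInt = oneInt;

console.log(typeof oneAsString, oneAsString); // string One
console.log(typeof toInt, toInt); // number 1
```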

### Advanced: Coercing `string` to Variant
In certain situations it's possible to coerce a `string` to a variant. This is an advanced technique that you're unlikely to need much, but when you do it's really useful.

You can coerce a `string` to a variant when:
- Your variant is `@unboxed`
- Your variant has a "catch-all" `string` case

Let's look at an example:
```rescript
@unboxed
type myEnum = One | Two | Other(string)
// Other("Other thing")
let asMyEnum = ("Other thing" :> myEnum)
// One
let asMyEnum = ("One" :> myEnum)
```


This works because the variant is unboxed **and** has a catch-all case. So, any string you throw at this variant that doesn't match one of the payloadless constructors (`"One"` or `"Two"`) will _always_ end up in `Other(string)`, since that case can represent any `string`.

## Tips & Tricks

Expand Down Expand Up @@ -620,12 +784,12 @@ switch data {
```js
console.log("Wof");

var data = "Dog";
```

</CodeTab>

The compiler sees the variant, then

1. conceptually turns them into `type animal = "Dog" | "Cat" | "Bird"`
2. compiles `switch` to a constant-time jump table (`O(1)`).
