Erased type-tagged anonymous union types #538

Open
5 of 6 tasks
alfonsogarciacaro opened this issue Feb 14, 2017 · 89 comments

@alfonsogarciacaro

alfonsogarciacaro commented Feb 14, 2017

Modified Proposal (by @dsyme)

This is a suggestion to add adhoc structural type-tagged union types where

  • type syntax (A|B) or Typed(A|B) or Typed<A,B>
  • each type of (A|B|C|...) is distinct and non-overlapping w.r.t. runtime type tests
  • such types are erased to object
  • introducing such a union value would need to be either explicit (e.g. using some new operator like Typed) or type-directed or both

See #538 for original proposal. e.g. this would allow some or all of these:

let generateValue1 () : Typed(int | string) = if Monday then 2 else "4"

let generateValue2 () = if Monday then Typed 2 else Typed "4"

let eliminateValue (x : Typed(int | string)) = ...
    match x with 
    | :? int as i ->  ...
    | :? string as s -> ...

let eliminateValue2 x = ...
    match x with 
    | Typed(i : int) ->  ...
    | Typed(s : string) -> ...

type Allowed =  Typed(int | string)

There are plenty of questions about such a design (e.g. can you eliminate "some but not all" of the cases in a match? Is column-polymorphism supported?). However putting those aside, such a construct already has utility in the context of Fable, since it corresponds pretty closely to Typescript unions and how JS treats values. It also has lots of applications in F# programming, especially if the use of Typed can be inferred in many cases.

Now, this construct automatically gives a structural union message type, e.g.

type MsgA = MsgA of int * int
let update1 () = Typed (MsgA (3,4))

type MsgB = MsgB of int * int
let update2 () = Typed (MsgB (3,4))

val update1 : unit -> Typed MsgA  // actually : unit -> Typed (MsgA | ..), i.e. column-generic on use
val update2 : unit -> Typed MsgB // actually : unit -> Typed (MsgB | ..), i.e. column-generic on use

and a combination of update1 and update2 would give

let update = combine update1 update2 

val update : unit -> Typed (MsgA | MsgB)

As noted in the comments, some notion of column-generics would likely be needed, at least introduced implicitly at use-sites.

Original Proposal (@alfonsogarciacaro)

I propose we add erased union types as an F# first citizen. The erased union types already exist in Fable to emulate Typescript (non-labeled) union types:

http://fable.io/docs/interacting.html#Erase-attribute

Note that Fable allows you to define your custom erased union types, but this is because it's painful to type a generic one like U2.Case1. If the compiler omits the need to prefix the argument, this wouldn't be necessary and using a generic type can be the easiest solution.

The F# compiler could convert the following code:

// The name ErasedUnion is tentative
// The compiler should check the generic args are different
let foo(arg: ErasedUnion<string, int>) =
    match arg with
    | ErasedUnion.Case1 s -> s.Length
    | ErasedUnion.Case2 i -> i

// No need to instantiate ErasedUnion, but the compiler checks the type
foo "hola"
foo 5
// This doesn't compile
foo 5.

Into something like:

let foo(arg: obj) =
   match arg with
   | :? string as s -> s.Length
   | :? int as i -> i
   | _ -> invalidArg "arg" "Unexpected type"

  • Pros: It will make the Fable bindings generated from Typescript declaration files much more pleasant to work with.

  • Cons: It's a feature that seems to be exclusively dedicated to interact with a dynamic language like JS.

  • Estimated cost (XS, S, M, L, XL, XXL): S

Alternatives

For Fable it's been suggested to generate overloads in the type bindings instead of using erased union types:

interface IFoo {
    foo(arg: string | number): void;
}

type IFoo =
    abstract foo: arg: string -> unit
    abstract foo: arg: float -> unit

However, this has some problems:

  • It can quickly explode when you have several erased union arguments
  • Due to type inference, the F# compiler often doesn't know which overload to use
  • Cannot be used in properties
  • Doesn't let you use erased unions yourself.

Affidavit

Please tick this by placing a cross in the box:

  • This is not a question (e.g. like one you might ask on stackoverflow) and I have searched stackoverflow for discussions of this issue
  • I have searched both open and closed suggestions on this site and believe this is not a duplicate
  • This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it.

Please tick all that apply:

  • This is not a breaking change to the F# language design
  • I would be willing to help implement and/or test this
  • I or my company would be willing to help crowdfund F# Software Foundation members to work on this
@AviAvni

AviAvni commented Feb 14, 2017

This sounds like it needs to be implemented with CompilationRepresentationAttribute,
like the way option.None is represented as null, so you can define your own erased union

@Horusiath

@alfonsogarciacaro erased unions are not only a thing for dynamic lang transpilation. They are also useful in message-based systems, i.e. when you want to describe protocols as a closed set of messages (in case of F# those could be discriminated unions). In that case a behavior that wants to satisfy more than one protocol must have some way to define a union of those, which so far is possible only as a lowest common denominator (usually an obj type).
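
As a rough illustration of the situation described above, here is a minimal sketch of a handler that serves two protocols in current F#. The message types `CounterMsg` and `LoggerMsg` and the function `handle` are hypothetical names invented for this example. The parameter has to be typed as `obj`, so nothing stops a caller from passing an unrelated value.

```fsharp
// Two independent, closed protocols (hypothetical message types).
type CounterMsg =
    | Increment
    | Decrement

type LoggerMsg =
    | Log of string

// The "union" of both protocols is typed as obj, the lowest common denominator,
// so a catch-all branch is needed for values that belong to neither protocol.
let handle (msg: obj) : string =
    match msg with
    | :? CounterMsg as c ->
        match c with
        | Increment -> "counter +1"
        | Decrement -> "counter -1"
    | :? LoggerMsg as l ->
        match l with
        | Log text -> "log: " + text
    | _ -> "unhandled message"

printfn "%s" (handle (box Increment))
printfn "%s" (handle (box (Log "started")))
printfn "%s" (handle (box 42)) // compiles, although 42 is in neither protocol
```

With the proposed feature the parameter could presumably be typed as something like `(CounterMsg | LoggerMsg)`, which would make the last call a compile-time error and the catch-all branch unnecessary.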

@dsyme
Collaborator

dsyme commented Feb 15, 2017

There are some other interesting reasons for this compiler feature. One is that we frequently hit situations in the F# compiler where a union type incurs an entire extra level in allocations, e.g.

type NameResolutionItem = 
    | Value of ValRef
    | UnionCase of UnionCaseRef
    | Entity of EntityRef
    | ...

The needs for this type are relatively "low perf" (cost of discrimination doesn't really matter - multiple type switches are ok) but the type gets many, many long-lived allocations when the F# compiler is hosted in the IDE. One could make the type a struct wrapping an obj reference manually, but simply adding an annotation to represent this as an erased union type and discriminate by type switching would be a much less intrusive code change. (Note that using a struct union would not work well, as the struct would still have a discrimination tag integer and would have one field for each union case - struct unions are by no means perfect representations for multi-case types as things stand at the moment)
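
For reference, the manual alternative mentioned above (a struct wrapping an obj reference, discriminated by type tests) might look roughly like the sketch below. The record types are hypothetical stand-ins for the compiler's real `ValRef`, `UnionCaseRef` and `EntityRef`, and the member names and the active pattern are assumptions made for illustration, not the compiler's actual code.

```fsharp
// Hypothetical stand-ins for the compiler's reference types.
type ValRef = { ValName: string }
type UnionCaseRef = { CaseName: string }
type EntityRef = { EntityName: string }

// A struct holding a single obj reference: no wrapper object is allocated
// and no tag is stored; the runtime type of the payload is the tag.
[<Struct>]
type NameResolutionItem(payload: obj) =
    member this.Payload = payload
    static member OfValue (v: ValRef) = NameResolutionItem(box v)
    static member OfUnionCase (u: UnionCaseRef) = NameResolutionItem(box u)
    static member OfEntity (e: EntityRef) = NameResolutionItem(box e)

// Discrimination by type switching, exposed as an active pattern.
let (|Value|UnionCase|Entity|) (item: NameResolutionItem) =
    match item.Payload with
    | :? ValRef as v -> Value v
    | :? UnionCaseRef as u -> UnionCase u
    | :? EntityRef as e -> Entity e
    | other -> failwithf "Unexpected payload: %A" other

let describe (item: NameResolutionItem) : string =
    match item with
    | Value v -> "value " + v.ValName
    | UnionCase u -> "union case " + u.CaseName
    | Entity e -> "entity " + e.EntityName

printfn "%s" (describe (NameResolutionItem.OfEntity { EntityName = "List" }))
```

The cost of this approach is visible in the sketch: every construction site and every match has to change, and a default-initialized struct carries a null payload that only fails at run time.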

Estimated cost (XS, S, M, L, XL, XXL): S

:) There's not really any such thing as "S" for language features :) I'd say "M" or "L".

... CompilationRepresentationAttribute ...

yes that would seem natural

@robkuz

robkuz commented Feb 15, 2017

Will there be multiple ErasedUnions under this proposal,
like ErasedUnion3, ErasedUnion4 etc.?

@AviAvni

AviAvni commented Feb 15, 2017

@robkuz if the implementation will be with CompilationRepresentationAttribute then you can create your own erased union

[<CompilationRepresentationAttribute(CompilationRepresentationFlags.ErasedUnion)>]
type DU<'a, 'b> = A of 'a | B of 'b

@alfonsogarciacaro
Author

@AviAvni @dsyme Please note that if this just enables a CompilationRepresentationFlags.ErasedUnion flag on custom-defined unions and doesn't allow implicit conversions when passing arguments (in the example above, writing foo "hola" instead of foo (ErasedUnion.Case1 "hola")), there won't be much benefit for Fable, as this is basically the same situation as we have now.

@ijsgaus

ijsgaus commented Feb 16, 2017

This is almost the same as an "or" operator on types: T1 or T2. In the future it could be realized by a special attribute on the function, fully erased from the compiled code. But how would the metadata be saved?

@dsyme
Collaborator

dsyme commented Feb 17, 2017

But how to save metadata?

I think the intent is that the types would be erased (like other F# information). The metadata would only be available at compile-time through the extra blob of F#-specific metadata that F# uses

@cartermp
Member

I think that given the reasoning above (both for Fable and the use case @Horusiath mentioned), this would be a good addition. 👍

@Richiban

Richiban commented Mar 14, 2017

Is it very important that the type is erased?

Perhaps it's a slightly separate proposal, but I would love to have ad-hoc type unions in the form:

let print (item : string | int) = 
    match item with
    |  s : string -> printfn "We have a string: %s" s
    |  i : int ->   printfn "We have an int: %i" i

Which would essentially compile down to the same IL as:

let print (item : Choice<string, int>) = 
    match item with
    | Choice1Of2 s -> printfn "We have a string: %s" s
    | Choice2Of2 i -> printfn "We have an int: %i" i

and, more importantly, at the callsite:

print "Hello world"

instead of:

print (Choice1Of2 "Hello world")

@alfonsogarciacaro
Author

alfonsogarciacaro commented Jul 23, 2017

In Fable we've finally managed to remove the erased union case name by using the so-called erased/implicit cast operator !^. Check this and this. So now it's possible to do:

let foo(arg: U2<string, int>) =
    match arg with
    | U2.Case1 s -> s.Length
    | U2.Case2 i -> i

// No need to write foo(U2.Case1 "hola")
foo !^"hola"
foo !^5
// The argument is still type checked. This doesn't compile
foo !^5.
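
For readers unfamiliar with the trick, an operator of this kind can be sketched in plain F# with a statically resolved member constraint. The definitions below are a self-contained approximation written for this example, not a copy of Fable.Core, and outside Fable the union is an ordinary (non-erased) discriminated union.

```fsharp
// A two-case union with one conversion overload per case.
type U2<'a, 'b> =
    | Case1 of 'a
    | Case2 of 'b
    static member op_ErasedCast (x: 'a) : U2<'a, 'b> = Case1 x
    static member op_ErasedCast (x: 'b) : U2<'a, 'b> = Case2 x

// The operator asks the compiler to find an op_ErasedCast overload
// from the argument type to whatever type the context expects.
let inline (!^) (x: ^t1) : ^t2 =
    ((^t1 or ^t2) : (static member op_ErasedCast : ^t1 -> ^t2) x)

let foo (arg: U2<string, int>) : int =
    match arg with
    | Case1 s -> s.Length
    | Case2 i -> i

printfn "%d" (foo !^"hola")
printfn "%d" (foo !^5)
// foo !^5.   // rejected: no op_ErasedCast overload accepts a float here
```

The target type comes from the parameter annotation on `foo`, so the case is still chosen at compile time. The operator only saves writing the case name.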

@ovatsus
Member

ovatsus commented Dec 3, 2017

TypeScript also supports string literals in these union types, i.e, in addition to type T1 = number | string, it also supports type T1 = number | "string1" | "string2". Would be nice to also support that.

Or alternatively, if string enums were supported like in TypeScript, we could achieve the same effect that way:

    enum Colors { Red = "RED", Green = "GREEN", Blue = "BLUE" }
    type T = number | Colors

@alfonsogarciacaro
Author

As a reference, Fable already supports string enums 😄

@cloudRoutine

@Richiban is this what you're looking for? - Polymorphic Variants

@dsyme
Collaborator

dsyme commented Mar 2, 2018

@Richiban @alfonsogarciacaro I hijacked this suggestion to convert this to a suggestion for erased ad-hoc type unions of the kind suggested by @Richiban

(Not sure what the callsite would be though @Richiban - perhaps what you say.)

@dsyme dsyme changed the title Erased union types (like Typescript union types) Erased union types Mar 3, 2018
@dsyme dsyme changed the title Erased union types Erased type-tagged anonymous union types Mar 3, 2018
@ijsgaus

ijsgaus commented Mar 3, 2018

Can we make these types not erased? Why not introduce a base implementation of Typed<'t1, 't2, ...> and make it a member of FSharp.Core?

@Richiban

Richiban commented Mar 3, 2018

@ijsgaus But if it's not erased then it's no different from Choice<'a, 'b>

@wallymathieu

This seems like a really sweet suggestion! I imagine it could help the performance of a lot of library code.

@voronoipotato

Would this help this problem?

type Goose = Goose of int
type Cardinal = Cardinal of int
type Mallard = Mallard of int
type Bird = Goose | Cardinal | Mallard
let x  = Goose 7

This code fails. Goose in the Bird DU shadows Goose as a type and turns it into an Atom. This shadowing happens silently and at least to me is surprising.

type Goose = Goose of int
type Cardinal = Cardinal of int
type Mallard = Mallard of int
type Bird = Goose of Goose | Cardinal of Cardinal | Mallard of Mallard
let x  = Goose 7

The type shadowing here still means I can't move forward, because there's no way to make a Goose....

type Goose = Goose of int
type Cardinal = Cardinal of int
type Mallard = Mallard of int
type Bird = Goose' of Goose | Cardinal' of Cardinal | Mallard' of Mallard
let x  = Goose' (Goose 7)

This works. This kind of situation happens where someone created a single case DU, and it gets consumed by someone who can't muck with the original DU for fear of breaking existing code.

@dsyme
Collaborator

dsyme commented Jan 12, 2021

... would be compiled down to something like this pseudo C#:

We wouldn't generate multiple C# overloads splitting out the choices - it's a technique fraught with problems.

@chkn

chkn commented Jan 13, 2021

it's a technique fraught with problems.

Can you elaborate? The only problem I can think of is a proliferation of overloads, but each overload would simply delegate to the actual erased-type overload, so hopefully wouldn't be too bloaty.

there are a lot of APIs where adding this feature reduces N overloads to one overload taking an N-way choice

Exactly! This is a great feature for API modeling, which is why converting all these to obj would be such a shame for the API exposed to .NET.

I am aware that these kind of APIs also often want to constrain to particular values, e.g. (int32|int64|"*"|"auto")

Note that this is already valid F# (albeit with an incomplete matches warning):

let foo ("*"|"auto" as str) = printfn "Got %s" str

@dsyme
Collaborator

dsyme commented Jan 14, 2021

Can you elaborate?

Imagine the method being virtual, for example. Then multiple virtual slots are generated. Or imagine multiple untagged union parameters generating an exponential number of overloads
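
To make the growth concrete: a method with k untagged-union parameters of n cases each would need n^k split-out overloads. The small sketch below enumerates them for a hypothetical method with two such parameters. The function `overloadSignatures` and the example types are made up for illustration.

```fsharp
// Each inner list holds the cases of one untagged-union parameter.
// The result is the Cartesian product: one entry per split-out overload.
let overloadSignatures (parameters: string list list) : string list list =
    List.foldBack
        (fun cases acc -> [ for c in cases do for rest in acc -> c :: rest ])
        parameters
        [ [] ]

// foo : (int | string) * (float | bool | char) -> unit
let signatures =
    overloadSignatures [ [ "int"; "string" ]; [ "float"; "bool"; "char" ] ]

for s in signatures do
    printfn "foo: %s -> unit" (String.concat " * " s)

printfn "%d overloads" (List.length signatures) // 2 * 3 = 6
```

Adding a third parameter with four cases would already bring this to 24 overloads.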

@charlesroddie

We are going to need to guide users clearly about this: giving clear use cases in online docs, to try to prevent an avalanche of terrible code that would result from users replacing DUs with this because it is shorter, and making sure no one uses this feature in assemblies which could be referenced by .NET languages other than F#.

@JaggerJo

JaggerJo commented Jan 19, 2021

We are going to need to guide users clearly about this: giving clear use cases in online docs, to try to prevent an avalanche of terrible code that would result from users replacing DUs with this because it is shorter, and making sure no one uses this feature in assemblies which could be referenced by .NET languages other than F#.

Not a big fan. Really seems like a super specific feature that should not be used 99.9% of the time. Looking at the code samples in the RFC and I strongly prefer the currently available solutions.

It also looks like adding this will make anonymous DU's more unlikely..

Some things I thought while looking at the examples in the RFC

They allow representing subset of protocols as a type without needing to resort to the lowest common denominator like obj.
Types are actually enforced, so mistakes can be caught early.
Because they are enforced, type information is less likely to become outdated or miss edge-cases.

.. all of this can be achieved with a DU, right?

Also from looking at the exhaustivity checking it looks like there is still one case that can slip through - the base type not being explicitly handled. This also means that you actually need to know the base type.

let prettyPrint (x: (int8|int16|int64|string)) =
    match x with
    | :? (int8|int16|int64) as y -> prettyPrintNumber y
    | :? string as y -> prettyPrintNumber y
    
// common base type is object, value types get boxed.
let prettyPrint (x: obj) =
    match x with
    | :? int8 | :? int16 | :? int64 as y -> prettyPrintNumber y
    | :? string as y -> prettyPrintNumber y
    
// unhandled cases - might be called from C#/VB
prettyPrint (null)
prettyPrint ([1..3] :> obj)

They serve as an alternative to function overloading.

.. but are only really usable from F#.

They allow representing more than one type

.. as DU's do in a slightly different way.

Still think what I described in #538 (comment) would enable the same use cases without adding the totally new concept of quite limited erased unions.

Really don't want to deal with code like this (taken from the RFC samples) in the future.

type Username = Username of string
type Password = Password of string
type UserOrPass = (Password | Username) // UserOrPass is a type alias

// `getUserOrPass` is inferred to `unit -> UserOrPass`
let getUserOrPass () = if (true) then name :> UserOrPass else password :> UserOrPass

// `getUserOrPass2` binding is inferred to `unit -> (UserOrPass | Error)`
let getUserOrPass2 () = if isErr() then err :> (UserOrPass | Error) else getUserOrPass() :> _

@cartermp
Member

Can discussion about the RFC move to the discussion thread here? fsharp/fslang-design#519

This feature is approved in principle and so it's no longer a concern of "would this ever be included, in some form, in a future F# version?". The question is how and in what shape, hence the RFC discussion.

Note that there are also several open questions posed in the PR here: fsharp/fslang-design#512

@dsyme
Collaborator

dsyme commented Jan 20, 2021

RFC at https://github.com/fsharp/fslang-design/blob/master/RFCs/FS-1092-anonymous-type-tagged-unions.md

@chkn

chkn commented Jan 30, 2021

Imagine the method being virtual, for example. Then multiple virtual slots are generated.

The intention is not for multiple slots to be generated. There is still only one true implementing method; the strongly-typed overloads are simply an API facade that delegate to that.

Or imagine multiple untagged union parameters generating an exponential number of overloads

This is definitely a valid concern.

@kspeakman

Where did the RFC for erased unions go? Seems like this should be the default for single-case unions. They are commonly used in the community. But they are terrible for performance as union types.

@chkn

chkn commented May 1, 2022

Seems like this should be the default for single-case unions.

We already have this for single cases:

type Foo = int

@kspeakman

A type alias doesn't provide the same functionality.

open System

type CourseId = Guid
type CouncilId = Guid

let courseOnly (courseId: CourseId) =
    ()

let courseId = Guid.Empty
let councilId = Guid.Empty

// compiles, but (helpfully) would not if CourseId were an SCU
courseOnly councilId

I don't use SCUs, but many do. Understandably so. I made the above mistake. And spent a while scratching my head at unexpected results.
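
For comparison, a sketch of the same example with single-case unions, which is the functionality a type alias lacks. Marking them `[<Struct>]` (available since F# 4.1) is one existing way to avoid the heap allocation of the wrapper, although it is not the erasure discussed in this thread.

```fsharp
open System

// Single-case unions: distinct types at compile time.
// [<Struct>] stores the Guid inline instead of allocating a wrapper object.
[<Struct>]
type CourseId = CourseId of Guid

[<Struct>]
type CouncilId = CouncilId of Guid

let courseOnly (CourseId guid) : Guid = guid

let courseId = CourseId Guid.Empty
let councilId = CouncilId Guid.Empty

courseOnly courseId |> ignore
// courseOnly councilId
// ^ rejected at compile time: a CouncilId is not a CourseId
```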

@voronoipotato

voronoipotato commented Oct 6, 2022

Where did the RFC for erased unions go? Seems like this should be the default for single-case unions. They are commonly used in the community. But they are terrible for performance as union types.

Yes, I had the same question. You could basically represent it as the underlying type for that case, say age or name, and then break out the function by case into separate functions that are called.

[<Erased>]
type  Person = Age of int | Named of string

let x = Age 7
let y = Named "steve"

let process a = 
    match a with
    | Age n -> if n > 33 then "older than voronoipotato" else "not older than voronoipotato"
    | Named s -> "hello" + s 
process x
process y

Would get unfolded at compile time into

let x = 7
let y = "steve"
let process_Age n = if n > 33 then "older than voronoipotato" else "not older than voronoipotato"
let process_Named s = "hello" + s 
process_Age x
process_Named y

This is what I had in mind with erased unions. Put a guid on the end if you're worried about name collision. I don't know if this is the correct way to do it... maybe you'd like use a struct and force the fields to overlap or some other witchcraft. The point is it ought to be possible...

@dsyme would this be a whole new suggestion, and has it been suggested?

@Xyncgas

Xyncgas commented Nov 10, 2022

To add to the conversation: allowing literals as types allows mixing data with types, forming a strong association between logic and domains, and this data can be anything, including an implementation.

Instead of :

type Operations =
| List
| Add

let Perform (operation:Operations) =
    match operation with
    | List _ -> seq { 0..1}
    | Add _ -> ()//this function can return multiple types

Now :

//whatever the 'Literals as types' syntax are
type Operations =
| List of seq {0..1}
| Add of ()

let Perform (operation:Operations) =
    match operation with
    | List x -> x
    | Add x -> x
//more complicated version would be taking data in this function and doing something with it

People can look at the code and see that there are the operations Add and List, and if they want to add another operation called Delete, it's very straightforward: just write it down as another case of Operations. Of course, writing the implementation directly in the type definition is sometimes too verbose. You can do it, but Add may turn out to be more complicated than you think (say, it adds something to somewhere on the internet), so you write the implementation separately, maybe in another module.

@smoothdeveloper
Contributor

C# feature that is similar, being discussed here: dotnet/csharplang#7544

@brianrourkeboll

@deyanp

deyanp commented Jul 24, 2024

@dsyme any chance we get this in F#?

@brianrourkeboll

@deyanp

@/dsyme any chance we get this in F#?

We probably want to wait to see what direction it goes in C# so we can ensure interoperability, etc.

@vzarytovskii

@dsyme any chance we get this in F#?

It is approved in principle, so nothing stops anyone from implementing it. The question is that if we do it now, C# (and possibly the runtime) might have a different idea about their implementation, which would mean records and tuples all over again - different (potentially incompatible) implementations.

@T-Gro

T-Gro commented Jul 26, 2024

@deyanp

@/dsyme any chance we get this in F#?

We probably want to wait to see what direction it goes in C# so we can ensure interoperability, etc.

The C# proposal as of now treats the anonymous unions as object with compiler-only tracking (as opposed to a runtime-optimized struct Choice<..> equivalent).
If this remains, the only interop surface might be in the metadata, if at all.
