[Feature request] zero-cost binding to tagged JS objects #5207

Closed
cometkim opened this issue Jun 29, 2021 · 13 comments

Comments

@cometkim
Member

cometkim commented Jun 29, 2021

I explored the various ASTs on astexplorer.net and saw many possibilities. It would be fantastic to use pattern matching when dealing with trees produced by parser libraries.

But as I understand it, that currently requires converting each JS object into a record, i.e. into ReScript's internal representation, at runtime.

This feels like unnecessary overhead when writing bindings for a parser library, because each item in the tree is already a tagged object.
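
To make the overhead concrete, here is a minimal TypeScript sketch of the conversion pass described above. The `ParsedNode` shape and the `InternalNode` layout (numeric `TAG` plus a positional `_0` payload) are assumptions for illustration only, not the exact encoding the compiler uses.

```typescript
// Shape delivered by a hypothetical parser library: already tagged via `nodeName`.
type ParsedNode =
  | { nodeName: "#text"; value: string }
  | { nodeName: "h1"; childNodes: ParsedNode[] };

// Simplified stand-in for a compiler-chosen variant encoding.
// The exact layout is an assumption made for this sketch.
type InternalNode =
  | { TAG: 0; _0: string }
  | { TAG: 1; _0: InternalNode[] };

// The conversion a binding has to run: one new object per node,
// only to replace a tag that was already present.
function toInternal(node: ParsedNode): InternalNode {
  switch (node.nodeName) {
    case "#text":
      return { TAG: 0, _0: node.value };
    case "h1":
      return { TAG: 1, _0: node.childNodes.map(toInternal) };
  }
}

const parsed: ParsedNode = {
  nodeName: "h1",
  childNodes: [{ nodeName: "#text", value: "hello" }],
};
const internal = toInternal(parsed);
```

Every node in the tree is visited and reallocated before any pattern matching can start, which is the cost a zero-cost binding would avoid.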

Imagine a function like this:

@deriving(tagged)
type rec node =
  | _Text({
      @tag("#text") nodeName: string,
      value: string,
    })
  | H1({
      @tag("h1") nodeName: string,
      childNodes: array<node>,
    })

@val external nodes: array<node> = "nodes"

let rec toText = nodes =>
  nodes->Belt.Array.reduce("", (text, node) => text ++ " " ++ switch node {
    | _Text({ value }) => value
    | H1({ childNodes }) => toText(childNodes)
  })

nodes->toText // works without additional parsing

In its output, instead of matching on TAG, the generated code would match on the tag we specify:

var match = node;

var tmp;

switch(match.nodeName) {
  case '#text':
    // ...

I think this can be a more ergonomic approach when writing bindings for parsers.

Currently, I have to rely on a 3rd party ppx like decco for this kind of work. Please let me know if there is a better way I am not aware of.

@TheSpyder
Contributor

This would be useful in other situations too. I jump through a lot of hoops to bind to Slate's operation set using string matching and %identity externals.

@bobzhang
Member

bobzhang commented Jul 1, 2021

@TheSpyder the link is broken

@TheSpyder
Contributor

I wasn’t linking to my bindings, those aren’t public (yet). The link is to the source types I’m binding to. The specific line that defines the type is here but it needs context of the rest of the file:
https://github.com/ianstormtaylor/slate/blob/4945a1a27505f59805bbbb630d8e22e47b1f29e5/packages/slate/src/interfaces/operation.ts#L138

The operations are 9 objects that use a shared type field as a tag. Some have overlapping fields but that’s the only unique one across all operations and I use it to direct operations to one of 9 identity functions. This lets me wrap the value in a variant (thus adding a second layer of tagging at runtime).
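
A rough TypeScript sketch of that double tagging, reduced to two operations. The operation shapes and the `Wrapped` encoding are simplified assumptions for illustration, not copied from Slate or from the bindings mentioned above.

```typescript
// Two simplified, hypothetical operations sharing a `type` discriminant.
type InsertText = { type: "insert_text"; text: string };
type RemoveText = { type: "remove_text"; text: string };
type Operation = InsertText | RemoveText;

// What the binding builds: a second tag layered over the existing `type` field.
type Wrapped =
  | { TAG: "InsertText"; _0: InsertText }
  | { TAG: "RemoveText"; _0: RemoveText };

// String matching on `type` routes each operation to its wrapper.
function classify(op: Operation): Wrapped {
  switch (op.type) {
    case "insert_text":
      return { TAG: "InsertText", _0: op };
    case "remove_text":
      return { TAG: "RemoveText", _0: op };
  }
}

const op: Operation = { type: "insert_text", text: "hi" };
const wrapped = classify(op);
```

Here `wrapped.TAG` and `wrapped._0.type` carry the same information twice, which is the redundancy a tagged binding would remove.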

@zth
Collaborator

zth commented Jul 2, 2021

This type of functionality would be incredibly valuable indeed. The type of structure described is used a lot in JS-land, and hassle-free bindings to those types of structures would open up interesting possibilities with ReScript (like dealing with ASTs which ReScript in theory is very very good at, but that's painful/close to impossible to do in a sane way now that every single variant needs to be manually mapped at runtime).

I wonder though if it'd be better modelled as a polyvariant? Since Flow/TS models these types of things structurally, I think it'd be valuable to model at least the tag itself structurally with a polyvariant, rather than with a normal variant.

@leostera
Collaborator

It looks like polymorphic variants right now are translated to almost what you'd expect:

let a = #hello({ "world": 1 })
var a = {
  NAME: "hello",
  VAL: { world: 1 }
};

To the point that providing a stable encoding (possibly via an @tagged annotation somewhere) where the contents are inlined in the object, and the discriminating field name is specified upfront, could be enough to start exploring this:

let a = @tagged("type") #hello({ "world": 1 })
var a = {
  type: "hello",
  world: 1
}

Or for the long form:

@tagged("type")
type ast = [
  | #hello({ "hello": string })
]

Of course this means that @tagged polymorphic variants without arguments will still have the shape { type: "name" }, but people are doing that on the other side of the type-system already.
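
As a hedged illustration of the difference between the two encodings, the following TypeScript sketch turns a `NAME`/`VAL` object into the inlined shape. The helper `inlineTag` and the optional `VAL` field are hypothetical; they only model what the proposed `@tagged` annotation might emit.

```typescript
// Assumed model of the current encoding; `VAL` is optional here only to
// cover the no-argument case discussed above.
type NameVal = { NAME: string; VAL?: Record<string, unknown> };

// Hypothetical helper: put the tag under `tagField` and inline the payload fields.
function inlineTag(tagField: string, v: NameVal): Record<string, unknown> {
  return { [tagField]: v.NAME, ...(v.VAL ?? {}) };
}

const withPayload = inlineTag("type", { NAME: "hello", VAL: { world: 1 } });
const withoutPayload = inlineTag("type", { NAME: "hello" });
```

One design question this surfaces: in this sketch a payload field named like the tag field would overwrite the tag, so a real implementation would presumably need to reject that case at compile time.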

From similar work I did on Caramel (where #hello(1) becomes {hello, 1} in Erlang/Elixir), using polymorphic variants makes this rather natural. I'm also not a TypeScript user these days, but I can see how writing bindings with this could be easier.

@kiuKisas

kiuKisas commented Aug 4, 2021

Seems like a must-have!
I also thought we should cover other JSON shapes with variants, in a more general way.
Some rough ideas about it, just writing them down for the sake of it:

  • We could also control the NAME/VAL keys with another attribute, e.g.:
let a = @keys(["id","content"]) #hello({ "world": 1 })
var a = {
  id: "hello",
  content: { "world": 1}
}

where the @keys attribute takes the pair of key names as its argument. It doesn't seem really helpful though; your approach seems better.

  • In the other case, where we have full control over the JSON shape (e.g. a request from an API we own), we could also use a non-polymorphic parametric variant for a smaller payload and, I assume, more efficient pattern matching, while still defining keys for clarity. It gets a bit more complex if we need to manage different parameter shapes. The current behavior of a regular parametric variant is:
type b = | Hello(string) | Foo(int, int) 
let b1 = Hello("world")
let b2 = Foo(1, 42)
var b1 = {
  TAG: /* Hello */0,
  _0: "world"
};

var b2 = {
  TAG: /* Foo */1,
  _0: 1,
  _1: 42
};

We could use the same attribute to change the keys when the constructors have different shapes:

type b = @tag("id") | @keys(["text"]) Hello(string) | @keys(["n1", "n2"]) Foo(int, int) 
let b1 = Hello("world")
let b2 = Foo(1, 42)
var b1 = {
 id: /* Hello */0,
 text: "world"
};

var b2 = {
 id: /* Foo */1,
 n1: 1,
 n2: 42
};

The result is similar to the polymorphic variant with a tag, but with a slight performance gain (again, an assumption on my part) in exchange for a more complex syntax.
We could add some sugar for the case where all constructors have the same shape:

type b = @keys(["id", "content"]) | Hello(string) | Foo(int) 
let b1 = Hello("Foo")
var b1 = {
 id: 0,
 content: "Foo"
}

Again, your idea feels better, even if I would prefer using regular variants.

For some extreme cases, we could also imagine the tagged/keys attribute being a polymorphic variant in order to cover every use case, for example backward compatibility after a name change in an API. But things get more complex and probably not that useful, so I don't think it's worth it.

@cometkim
Member Author

We love the syntax proposed by @Ostera, but since polymorphic variant types currently cannot contain inline record definitions, I assume it would require some semantic changes.

It feels natural to treat regular variants as always having an implicit tag = "TAG", and to let users tell the compiler to use a custom tag instead.

// Assume there is an implicit directive here:
// @tag("TAG")
type t =
   | Foo({ foo: string })
   | Bar({ bar: int })

Currently, we tell users "don't rely on the internal representation".

However, it may be better to make the internal representation of regular variants more predictable rather than treating this as a special case.

Pros:

  • More readable JS output
  • Reduce the learning curve
  • Ergonomic bindings

Rust's serde_enum provides a good summary of the predictable representation.

  • Externally tagged
    // TypeScript
    type t = (
      | { Foo: { foo: string } }
      | { Bar: { bar: number } }
    )
  • Internally tagged: It is closest to the current behavior. And it's what's known as "Brand" in the TypeScript world.
    type t = (
      | { TAG: "Foo", foo: string }
      | { TAG: "Bar", bar: number }
    )
  • Adjacently tagged: it looks the most flexible.
    type t = (
      | { TAG: "Foo", CONTENT: { foo: string } }
      | { TAG: "Bar", CONTENT: { bar: number } }
    )
  • Untagged: N/A
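
The three tagged representations can be compared side by side in a small TypeScript sketch. The `describe*` helper names are hypothetical, and `bar` is taken to be a number to match the ReScript `int`; the sketch only shows how matching differs per representation.

```typescript
type External = { Foo: { foo: string } } | { Bar: { bar: number } };
type Internal = { TAG: "Foo"; foo: string } | { TAG: "Bar"; bar: number };
type Adjacent =
  | { TAG: "Foo"; CONTENT: { foo: string } }
  | { TAG: "Bar"; CONTENT: { bar: number } };

// Externally tagged: the tag is the only key, so matching checks key presence.
function describeExternal(v: External): string {
  return "Foo" in v ? `Foo:${v.Foo.foo}` : `Bar:${v.Bar.bar}`;
}

// Internally tagged: the tag sits next to the payload fields.
function describeInternal(v: Internal): string {
  return v.TAG === "Foo" ? `Foo:${v.foo}` : `Bar:${v.bar}`;
}

// Adjacently tagged: the tag and the payload live under two fixed keys.
function describeAdjacent(v: Adjacent): string {
  return v.TAG === "Foo" ? `Foo:${v.CONTENT.foo}` : `Bar:${v.CONTENT.bar}`;
}
```

Only the internally tagged form lets the payload fields be read directly off the matched object, which is why it lines up with the tagged JS objects this issue is about.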

@cannorin
Contributor

cannorin commented Mar 18, 2022

I'm creating a tool to generate ReScript bindings from .d.ts files, and this would be nice to have for consuming TS discriminated unions.

@cometkim
Member Author

I'm thinking of a PPX syntax that can be introduced without breaking changes.

ex)

@tagged(nodeName)
type rec node =
  | Text({
      nodeName: [#"#text"],
      value: string,
    })
  | H1({
      nodeName: [#"h1"],
      childNodes: array<node>,
    })

@cometkim
Member Author

I just confirmed this works in ReScript v11

input code:

@tag("nodeName")
type rec node =
  | @as("#text") Text({value: string})
  | @as("h1") H1({childNodes: array<node>})

@val external nodes: array<node> = "nodes"

let rec toText = nodes =>
  nodes->Belt.Array.reduce("", (text, node) =>
    text ++
    " " ++
    switch node {
    | Text({value}) => value
    | H1({childNodes}) => toText(childNodes)
    }
  )

nodes->toText->Js.log

And the output:

import * as Belt_Array from "rescript/lib/es6/belt_Array.js";

function toText(nodes) {
  return Belt_Array.reduce(nodes, "", (function (text, node) {
                var tmp;
                tmp = node.nodeName === "#text" ? node.value : toText(node.childNodes);
                return text + " " + tmp;
              }));
}

console.log(toText(nodes));

export {
  toText ,
}

The result is exactly what I want! Thanks, @cristianoc

I will write some bindings for popular parser libraries and report back if I run into any problems.

@cometkim
Member Author

The only problem I've noticed so far is that when I want to reuse a tag name elsewhere in the code, I have to hardcode it.

@zth
Collaborator

zth commented Jul 16, 2023

I think this can be considered solved via these recently merged features:

  • Configurable runtime representation of variants
  • Variant coercion
  • Variant type spreads

Please feel free to open new issues for anything missing for this workflow to work well.

@zth closed this as completed Jul 16, 2023

@DZakh
Contributor

DZakh commented Jul 16, 2023

Also, if you need a parser library, it'll be well supported in rescript-struct@5
https://github.com/DZakh/rescript-struct/blob/f6dfc93/CHANGELOG_NEXT.md#opt-in-ppx-support
