[FEATURE] Automatic extraction #58

darionco · 2021-03-02T20:11:15Z

Addresses: #48

First pass at auto-generating the type definitions and adding JSDoc style (TypeDoc) comments to the declarations.

I have tested this locally but I haven't tested it with gpuweb/cts so it's not ready to merge but I thought I'd submit this PR to start getting feedback early.

What it does:

Added the gpuweb/gpuweb repo as a submodule
Added a script to generate the types from the bikeshed file in that repo
In the future a hook could be created to automatically generate the types as soon as a new version of the spec is pushed

kainino0x · 2021-03-03T02:32:27Z

Amazing work! I will take a look soon, right now a bit bogged down.
Especially cool that you got documentation importing working.

I'll also be interested to look more closely at how the upstream projects work, and whether I think the style they use will be sufficient. In particular, it looks like it makes similar decisions to https://github.com/microsoft/TypeScript-DOM-lib-generator, which I used to generate the very first version of this file but later diverged from heavily. One reason I did this was to prevent mostly-empty interfaces from being compatible (e.g. prevent a GPUTexture from being implicitly convertible to a GPUSampler), so I wonder if we can find a solution to that (either with private __brands like we have now, or some other solution).

darionco · 2021-03-03T20:47:02Z

Thanks for looking at this PR.

I had looked at TypeScript-DOM-lib-generator but decided against using it because it requires a json object and proposing changes that fix edge cases (like the __brand hack) is out of the question. Instead, I opted for using webidl2 directly and webidl2ts (which is heavily inspired by TypeScript-DOM-lib-generator) for a couple of its parsing functions.

I created bikeshed-to-ts to handle the rest:

it loads a .bs file and extracts the IDL using a technique very similar to the code in extract-idl.py
extracts all definition blocks (<dl dfn-type=... dfn-for=...>) using regexes
parses the result with webidl2
uses webidl2ts to generate TS AST per ADL declaration in the bikeshed file
consolidates the AST and its objects
enriches the type declarations in the AST with documentation comments using the extracted type definitions

I had seen #11 and was wondering about it, hopefully this gets released soon. In the mean time I can implement a hack similar to __brand, do you know of anything that works with interface rather than class? I know that doing something like

interface A {
  readonly __brand: unique symbol;
}

works but it exposes __brand as a public member that would crash if accessed at runtime.

kainino0x · 2021-03-03T22:29:35Z

Thanks for the details!

extracts all definition blocks (<dl dfn-type=... dfn-for=...>) using regexes

Aha, this makes sense. Unfortunately the spec may not ALWAYS use <dl> to define every item in the spec, but this technique should at least work for MOST items.

In the mean time I can implement a hack similar to __brand, do you know of anything that works with interface rather than class? I know that doing something like

Unfortunately I don't. Putting a non-private brand in the interface is probably fine - we can limit this only to our empty interfaces.

There are a number of alternatives out there, which we could consider. enums might work since these aren't numeric types?
https://michalzalecki.com/nominal-typing-in-typescript/
https://betterprogramming.pub/nominal-typescript-eee36e9432d2

I know TypeScript's built-in dom types don't use class, but instead define both an interface (interface HTMLImageElement) and a var (declare var HTMLImageElement). I think that's more accurate as it expresses overrideability of built-in interfaces (e.g. globalThis.HTMLImageElement = ...;).

We definitely want these types to be as high quality as possible, and that probably means at least dropping the class thing if not nominal typing altogether. It's not a huge deal if we lose the nominal typing for now, though it would be great to bring back eventually.

kainino0x · 2021-03-03T22:31:59Z

Aha, this makes sense. Unfortunately the spec may not ALWAYS use <dl> to define every item in the spec, but this technique should at least work for MOST items.

A thought: Since the bikeshed source is markdown+html-ish, regular expressions could potentially break down in certain cases (like nested <dl>s?). Wonder how possible it would be to either pull the definitions out of the generated HTML (which is full of a lot of generated stuff, but might be possible) or use Bikeshed's actual parser?

darionco · 2021-03-04T14:33:18Z

It's not a huge deal if we lose the nominal typing for now, though it would be great to bring back eventually.

I feel like we should try to keep nominal typing, the main reason to use TypeScript is type safety/checks and the definitions should provide just that.

There are a number of alternatives out there, which we could consider. enums might work since these aren't numeric types?

yeah, I'll implement the enum approach and we can revisit once TypeScript releases the unique keyword update.

A thought: Since the bikeshed source is markdown+html-ish, regular expressions could potentially break down in certain cases (like nested <dl>s?). Wonder how possible it would be to either pull the definitions out of the generated HTML (which is full of a lot of generated stuff, but might be possible) or use Bikeshed's actual parser?

right now it handles the file line-by-line, which makes dealing with regexes and XML easier so it should be sufficient for now. I am still learning the bikeshed spec though so it might be incorrect, I'll give switching the custom parsing with the bikeshed parser a try and report back.

darionco · 2021-03-04T20:05:43Z

I decided to go with adding a __brand declaration to the interfaces because it can be written in a way that produces less cryptic errors:

I used bikeshed-js to parse the .bs, since bikeshed is written in python the integration is awkward, I did toy with the idea of writing this in phyton but decided against it.

I can get an HTML file from the bikeshed parser now, but I am having trouble wrapping my head around what to do with it. It is much easier to parse HTML but it seems to me that things are not as neatly organized/printed as in the bs file, feels like some metadata is missing from the HTML (or it's harder to find, rather).

Can we define what needs to be included in the JSDoc comments?
Using that we could come up with a set of heuristics to extract the data from either the bikeshed file or HTML file and decide which one would be easier to use.

kainino0x · 2021-03-08T19:24:15Z

I can get an HTML file from the bikeshed parser now, but I am having trouble wrapping my head around what to do with it. It is much easier to parse HTML but it seems to me that things are not as neatly organized/printed as in the bs file, feels like some metadata is missing from the HTML (or it's harder to find, rather).

I wasn't sure whether this would be the case, but I am not too surprised. Indeed there is a ton of HTML generated for bikeshed code, but I wasn't sure if all the metadata would be included in the end. If not, then I guess it's not a good suggestion, sorry :/

Can we define what needs to be included in the JSDoc comments?

That's a good question. Unfortunately there's no immediately obvious answer - there's no strict delineation between "summary" and "detailed algorithm" for various items in our spec, though it did seem like you had hit a pretty good balance here. For requestAdapter for example, I think we want exactly what you got. More generally:

Include summary of the item (interface, attribute, method/constructor, argument, dictionary, dict-member, typedef/callback, etc.)
- For items defined in <dl>s, I don't know if this should be the full <dd> or just the first paragraph. First paragraph probably works well in general - in fact we could tweak the spec to make sure the first paragraph is always a good summary.
  - (But if we did want to include more, it should include Note:/<div class=note> items, but still exclude algorithm/Issue:/<div class=issue> items.)
- For anything defined at the top level, like interfaces, I don't know how to extract the right info. Unfortunately right now we don't have a consistent style in the spec - sometimes a <dfn> is inside a header (<h1-6>) and other times it's inside a paragraph.

Anyway, we don't have to solve the documentation perfectly right now. Auto-generating even with NO documentation would be a huge help!

In fact, we don't even have to fully auto-generate, and in fact right now we probably won't want to, because at any given time there are usually some deprecated items that we still include in the types because implementations (*cough*chrome) isn't quite up to spec yet. I'm imagining this workflow for a developer updating these types:

Run the autogenerator, it generates into a temporary, local, .gitignored file
Open vimdiff/similar between generated and checked-in file
"Pull" changes from generated to checked-in file manually, one-by-one, to leave intentional diffs in.

Someday we probably won't need this, but right now it would be a very nice way to reduce the maintenance burden but maintain the flexibility.

…raction

darionco · 2021-03-09T17:23:41Z

I'm imagining this workflow for a developer updating these types:

Run the autogenerator, it generates into a temporary, local, .gitignored file

Open vimdiff/similar between generated and checked-in file

"Pull" changes from generated to checked-in file manually, one-by-one, to leave intentional diffs in.

Agreed, that makes sense to me, I have pushed a commit that changes the generation script to write its result to dist/generated.d.ts (which is git ignored). The first version of this is going to be painful as the diff between index.d.ts and generated.d.ts is huge right now.

I don't feel I am equipped to decide which changes should be kept and which shouldn't though :/

kainino0x

No worries - thanks for this awesome contribution!

I think for the first version we can basically replace the checked-in version with the generated version, then look through the old version and port any necessary changes into it. I can take care of that.

I'm happy to start landing this and then we can work through the remainder afterward. Left a few minor comments and then LGTM :)

package.json

dist/index.d.ts

package.json

yarn.lock

kainino0x

thanks!

darionco added 5 commits March 2, 2021 14:03

Merge remote-tracking branch 'gpuweb/main' into main

ff72d70

Added bikeshed-to-ts package

06a3839

Added spec repo as a submodule

4766eee

Added script to generate the types file

fba7889

Updated definition file with autogenerated version + fixes

e650ec0

kainino0x self-requested a review March 3, 2021 02:28

darionco added 4 commits March 4, 2021 14:20

Updated gpuweb submodule

ba6ab9d

Updated bikeshed-to-ts

5807113

Updated generation script

adfe7c9

Updated types definition file

4137e6d

darionco mentioned this pull request Mar 8, 2021

Added HTMLCanvasElement.getContext definition gpuweb/gpuweb#1496

Merged

darionco added 3 commits March 9, 2021 12:14

Merge remote-tracking branch 'gpuweb/main' into feature/automatic-ext…

377156e

…raction

Updated gpuweb submodule

b275414

generate script creates dist/generated.d.ts

d597fb0

kainino0x reviewed Mar 10, 2021

View reviewed changes

package.json Outdated Show resolved Hide resolved

dist/index.d.ts Outdated Show resolved Hide resolved

package.json Outdated Show resolved Hide resolved

yarn.lock Outdated Show resolved Hide resolved

darionco and others added 5 commits March 10, 2021 12:06

Updated bikeshed-to-ts

ccead9b

generate script creates an index.d.ts file in the generated folder

9d686ea

Added /generated/ to gitignore

762d291

Switched to using npm

9278417

revert dist/index.d.ts change

4e78bbb

kainino0x approved these changes Mar 11, 2021

View reviewed changes

kainino0x merged commit a3b8dd8 into gpuweb:main Mar 11, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FEATURE] Automatic extraction #58

[FEATURE] Automatic extraction #58

darionco commented Mar 2, 2021 •

edited

Loading

kainino0x commented Mar 3, 2021

darionco commented Mar 3, 2021 •

edited

Loading

kainino0x commented Mar 3, 2021 •

edited

Loading

kainino0x commented Mar 3, 2021

darionco commented Mar 4, 2021

darionco commented Mar 4, 2021

kainino0x commented Mar 8, 2021

darionco commented Mar 9, 2021 •

edited

Loading

kainino0x left a comment

kainino0x left a comment

[FEATURE] Automatic extraction #58

[FEATURE] Automatic extraction #58

Conversation

darionco commented Mar 2, 2021 • edited Loading

kainino0x commented Mar 3, 2021

darionco commented Mar 3, 2021 • edited Loading

kainino0x commented Mar 3, 2021 • edited Loading

kainino0x commented Mar 3, 2021

darionco commented Mar 4, 2021

darionco commented Mar 4, 2021

kainino0x commented Mar 8, 2021

darionco commented Mar 9, 2021 • edited Loading

kainino0x left a comment

Choose a reason for hiding this comment

kainino0x left a comment

Choose a reason for hiding this comment

darionco commented Mar 2, 2021 •

edited

Loading

darionco commented Mar 3, 2021 •

edited

Loading

kainino0x commented Mar 3, 2021 •

edited

Loading

darionco commented Mar 9, 2021 •

edited

Loading