Carve less syntax out, with more generic, token-level comment syntax #80

Open
theScottyJam opened this issue Mar 10, 2022 · 92 comments

@theScottyJam

theScottyJam commented Mar 10, 2022

I know a big push for this proposal is to try and put as much TypeScript syntax into JavaScript as possible. But, it could be worthwhile to explore what this proposal would look like if we didn't focus so heavily on this objective. Considering the fact that most users will have to run a codemod anyways to change their TypeScript code to be valid JavaScript, I don't think it's that bad of an idea to stray a bit further from current TypeScript syntax.

Let me propose a much simpler form of this proposal, that tries to carve out much less syntax, while still being ergonomic to use. All I'm going to do is introduce a simple token-level comment to the language, which works as follows:

  • Simply place a "@" character before a token to cause the language to ignore the following token. i.e. in const x @number = 2, the @ will cause "number" to be ignored.
  • If the ignored token is a "(", "[", "{", or "<", then the language will ignore content until a closing bracket is found. i.e. in const x @(number | string) = 2, everything within the parentheses is ignored. These groupings can be nested.
  • If the ignored token is followed by a colon, then the colon and the token after the colon is ignored as well. i.e. in class MyClass @implements: MyInterface { ... }, the "implements" token will be ignored, and since there's a colon that follows it, the colon, and the word MyInterface will be ignored as well.

We can also add a "@@" syntax, that will cause everything to the end of the line to be ignored, plus, if any opening brackets were found in that line (via "(", "[", "{", or "<"), further content will continue to be ignored until a closing bracket is found. Examples for this will be demonstrated below.
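To make the rules above a bit more concrete, here is a minimal sketch of a stripper for them in TypeScript. It is only an illustration under some loud assumptions: the names (stripAtComments, skipToken, skipGroup, skipSpaces) are made up for this example, a "token" is approximated as an identifier-like run, a bracketed group, or a single punctuation character, a colon only counts when it directly follows the ignored token, and string literals, ordinary comments, and things like "=>" inside an ignored group are not handled at all.

```typescript
// Sketch only: all names here are hypothetical, not part of any proposal text.
const OPEN = "([{<";
const CLOSE = ")]}>";

// Skip spaces and tabs (but not newlines) starting at index i.
function skipSpaces(src: string, i: number): number {
  while (i < src.length && (src[i] === " " || src[i] === "\t")) i++;
  return i;
}

// Skip a bracketed group starting at an opening bracket; nesting is counted,
// but bracket kinds are not checked against each other (a simplification).
function skipGroup(src: string, i: number): number {
  let depth = 0;
  for (; i < src.length; i++) {
    if (OPEN.indexOf(src[i]) !== -1) depth++;
    else if (CLOSE.indexOf(src[i]) !== -1) {
      depth--;
      if (depth === 0) return i + 1;
    }
  }
  return i;
}

// Skip one "token": a group, an identifier-like run, or one punctuation character.
function skipToken(src: string, i: number): number {
  if (i >= src.length) return i;
  if (OPEN.indexOf(src[i]) !== -1) return skipGroup(src, i);
  if (/[\w$]/.test(src[i])) {
    while (i < src.length && /[\w$]/.test(src[i])) i++;
    return i;
  }
  return i + 1;
}

function stripAtComments(src: string): string {
  let out = "";
  let i = 0;
  while (i < src.length) {
    if (src[i] !== "@") {
      out += src[i++];
      continue;
    }
    if (src[i + 1] === "@") {
      // "@@": ignore to the end of the line...
      let depth = 0;
      i += 2;
      while (i < src.length && src[i] !== "\n") {
        if (OPEN.indexOf(src[i]) !== -1) depth++;
        else if (CLOSE.indexOf(src[i]) !== -1) depth--;
        i++;
      }
      // ...and, if brackets opened on that line are still open, until they close.
      while (i < src.length && depth > 0) {
        if (OPEN.indexOf(src[i]) !== -1) depth++;
        else if (CLOSE.indexOf(src[i]) !== -1) depth--;
        i++;
      }
      continue;
    }
    // "@": ignore the following token (or bracketed group).
    i = skipToken(src, skipSpaces(src, i + 1));
    // Colon rule: a colon right after the ignored token extends the ignore by one token.
    if (src[i] === ":") {
      i = skipToken(src, skipSpaces(src, i + 1));
    }
  }
  return out;
}
```

With this sketch, stripAtComments("let x @string;") gives back "let x ;", and an "@@interface ... { ... }" block disappears entirely. A real implementation would work on the engine's actual token stream rather than on raw characters.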

(The "@" character can of course be bikeshedded. I know it's currently being used by the decorator proposal, but we could have decorators use something else, like a two-character token).

Here's what it looks like in practice:

// The "@" and "string" following it are both ignored
let x @string;

function equals(x @number, y @number) @boolean {
    return x === y;
}

// Everything after @@, until the end of the line is ignored.
// Also, since "{" was found, all content within the { ... } grouping is also ignored.
@@interface Person {
    name @string; // "@" doesn't have to be used here since this is all ignored, but for consistency, it is.
    age @number;
}

@@type CoolBool = boolean;

class MyClass {
  name @number;
}

// More complex types can be wrapped in parentheses
function fn(value @(number | string)) { ... }

// optional parameters
// Notice how after the "?" token, there's a ":", causing the colon and the token after it to be ignored as well?
function fn(value @?:number) { ... }

// import/export types
@@export interface Person { ... }

@@import type { Person } from "schema";

import { @type: Person, aValue } from "...";
// After the ignored content is removed, this line will look like this: import { , aValue } from "...";

// type assertions
// Notice, again, how the ":" after "as" causes further content to be ignored.
// Also note that the "@" before "number" isn't strictly necessary as everything within the { ... } is being ignored,
// but it's being used anyways for consistency.
const point = JSON.parse(serializedPoint) @as: { x @number, y @number };

// Non-nullable assertions
document.getElementById("entry")@!.innerText = "...";

// Generics
function foo@<T>(x @T) { ... }

// "this" param
@this: SomeType function sum(x @number, y @number) { ... }

// Ambient Declarations (currently being considered to not be included to keep the proposal smaller, but now it can be trivially added)
@@declare let x @string;

@@declare class Foo {
    bar(x @number) @void;
}

// Likewise, function overloading might not get added to this proposal as it currently stands, but it's trivial to add with this idea
@@function foo(x @number) @number
@@function foo(x @string) @string;
function foo(x @(string | number)) @(string | number) {
    ...
}

// Class and Field Modifiers (These are easy to add as well, if wanted)
class MyClass {
  @protected @readonly x @number = 2;
}

// Allowing someone to "implement" an interface
class MyClass @implements: MyInterface { ... }

So, yes, perhaps some of those examples aren't as nice-looking as the TypeScript variants, but they're not bad either, and most of them seem equivalent verbosity-wise. And, remember, this is just showing what's possible if we choose to only introduce these simple rules. We could still choose to go with a mix of the current proposal and this idea, where we use the "@" and "@@" syntax for most items, but we also add, for example, a no-op "as" operator so people can write x as yz instead of x @as: yz.

There's some other benefits if we go this route:

  • This is probably the biggest pro: The proposal is much more flexible for new innovations. Right now, every syntactic feature that TypeScript wants needs to go through the TC39 proposal process and be approved, and for this to happen it needs to be considered important for all static type checkers. It's generally not easy to convince TC39 to add new syntax to the language, so this will be a giant bottleneck for innovation. We can already see how much of a bottleneck this will be, by seeing all of the features listed in the README that TypeScript currently supports, but they're considering not adding to this proposal in an effort to keep this proposal from growing too big. All of these features are trivially supported with the simple syntactic carve-out I'm proposing here.
  • It's trivial to learn what code has real runtime effects and what code does not. If there's a "@" involved, then it's a comment. Simple.
  • This is a minor point, but it's something that irks me about TypeScript, which this proposed syntax helps solve: code that looks like types is very similar to code that looks like actual JavaScript stuff. Take this for example:
    function fn({ x, y }: { x: Thing1, y: Thing2 } = { x: new Thing1(), y: new Thing2() })
    The "type" part of that line of code, { x: Thing1, y: Thing2 }, is valid JavaScript syntax by itself. If you're quickly scanning the code, the only way to realize that this is a type, not a runtime value, is to notice it's followed by a tiny ":". Now compare this with what's being proposed here:
    function fn({ x, y } @{ x @Thing1, y @Thing2 } = { x: new Thing1(), y: new Thing2() })
    Ah, much better. The "type" part of that statement now actually looks different from everything else, it's much easier to scan.
  • Continuing with the previous point, TypeScript's choice of the ":" token has caused a lot of issues for them, because it's a token that already has so many meanings. Because this idea is not using the ":" token, we can rewrite the previous example in an even simpler way that TypeScript syntax can't support:
    function fn({ x @Thing1 = new Thing1(), y @Thing2 = new Thing2() } = {})
  • If we ever want to add runtime behaviors to a type feature after-the-fact, we are now able to do so. For example, TypeScript can immediately support "x@!.y" syntax, and at a later point, JavaScript could add a "x!.y" feature that has runtime meaning. We don't have to figure out up-front which type-features should have runtime meaning with this proposal. (In a similar vein, if we ever decide to give JavaScript an actual, official type system, this proposal leaves enough syntax space to do so, the "official" type system can just use different syntax from the "@" syntax being proposed here.)

The downside here is that this looks very different from TypeScript syntax, which I know can be a bit off-putting, especially since it can really be hard to call this sort of thing "TypeScript" when it looks so different from TypeScript. Still, IMO, the up-sides outweigh this downside, though I would be interested to hear other thoughts on this matter.

Update: It has been mentioned that we can't use the @ token - we won't be allowed to take that out of the hands of the decorator proposal. I've put together another iteration of this idea which uses different tokens, and incorporates other feedback that's floated around. I still feel there's more room for improvement (e.g. I'm not a giant fan of the back-slash character, I'm still mulling over alternative ideas), but I think it takes a couple more steps in the right direction. You can see it presented in this comment.

@wparad

wparad commented Mar 10, 2022

I want to 👏 so much, but that isn't a valid reaction. We should start with the optimal solution and then figure out which compromises we can/should make. Trying to put TS and other type systems into JS raises the question: "Are TS and the others actually the right solution?" Personally, I hate them, don't use them, and it wouldn't be hard to define new ones that I think are better. Now, that alone isn't a reason to do it. But starting with the ideal and working back to reality is a much better strategy than starting with a flawed premise. TS and others can easily change to support whatever we want, so let's start with the aspirational goal.

@theScottyJam
Author

theScottyJam commented Mar 10, 2022

That's certainly an option.

I like "@", because it's a single token that's not too noisy, and it's going to be used very often (far more often than decorators). I believe it's the only ASCII character that's completely unused in the language right now, which makes it the only single-character token that can be used for this purpose. The decorator proposal could switch to using something like "%%" or whatever instead, they don't have to be as terse as this proposal.

Though, either way works.

@ljharb
Member

ljharb commented Mar 11, 2022

Decorators can't switch, for the same reason that decorators and private fields didn't swap sigils years ago - there are too many tutorials, blogs, and pieces of example code out there using @ for a switch to be feasible. Decorators, and nothing else, must use @.

@theScottyJam
Author

theScottyJam commented Mar 11, 2022

Oh, if anything I would think it's an advantage for decorators to switch, considering how many renditions of the decorator proposal there's been, and how many of those blogs and tutorials would be related to the older versions of the proposal.

Though, if this is really the case, we can find a different syntax to use here.

@matthew-dean

matthew-dean commented Mar 11, 2022

Yeah, other than the @, which we can't use because of decorators, I'd been hoping for some push in this direction. I agree that something that's like a tag or note is ideal vs. a whole smorgasbord of TypeScript syntax and features which a JS parser / interpreter is just supposed to treat "as comments". There's a lot of cognitive overhead with that path.

I saw in the docs the % character. Obviously that's used as well, and it's hard to find an unused character (which is why JS proposals keep reaching into the grab bag for @ and #), so any character would have to have clear rules. You could also change placement to keep semantics clear. There's lots of syntax ways to do the same thing, like for example (spitballing on the OP code)

let x::string;

function equals(x::number , y::number)::boolean {
    return x === y;
}

::interface Person {
    name::string;
    age::number;
}

::type CoolBool = boolean;

class MyClass {
  name::number;
}

So, that is to say, despite liking TypeScript, I'm hoping for a refinement of the proposal to a single character sequence to define these "ignore this" pieces, akin to a 3rd type of comment (annotation) with different comment-ending rules. It keeps the whole proposal simpler and can still, I think, address the verbosity of JSDoc-style annotation.

@lmcarreiro

What about the backward slash \?

let x \string;

function equals(x \number , y \number) \boolean {
    return x === y;
}

@theScottyJam
Author

theScottyJam commented Mar 11, 2022

Oh, yeah, I guess the back-slash isn't a used token yet either, so that's certainly an option. It looks a bit funky, but I think it's fine.

I am starting to like the "::" syntax more as well, after seeing it used in examples.

So, when it comes to syntax-bikeshedding this idea, there's three main things we'll need to be aware of.

  1. A way to ignore a single token (or grouping) (originally I used the @ delimiter for this)
  2. A way to ignore an entire statement/structure like an interface (originally I had @@ for this)
  3. The ability to make type-related operators. Originally I added the colon rule for this purpose. The "as" operator could be written as @as:, and because a colon followed @as, the colon along with the subsequent token after it would be ignored as well. This felt nicer to me than prefixing each piece with an @ character, but the value of having specific support for this third point can certainly be debated.

So, if we decide to use the :: syntax to fulfill the first part, and try to do a straight translation of the other two rules, we'd be left with something like this:

// 1. Normal usage
let x::number

// 2. There's a third colon here, to make it ignore this whole statement
:::interface MyInterface { ... }

// 3. Note the single colon after "implements" to make the ignoring continue on to MyThing
class MyClass ::implements: MyThing { ... }
x ::as: y

The syntax for point 2 and 3 isn't ideal here with this "straight translation" version, but perhaps there's another way to go.

// 1. Normal usage (unchanged)
let x::number

// 2. Something like ":|" can be used to delimit a statement. Dunno.
:| interface MyInterface { ... }

// 3. We don't have to do anything special for this. Just use a bit of spacing tricks to squish
// the next "::" into the "implements" pseudo-operator.
class MyClass ::implements:: MyThing { ... }
x ::as:: y

@simonbuchan

Seems like there's a lot of "I don't use typescript but I want to make things awful for people who do" people in here.

Let me argue the counter-position: no. The entire point is to carve out a "nice" space for type syntax. If you're being at all compassionate, this would ideally include the huge quantity of existing "javascript with type syntax" out there - overwhelmingly typescript by most measures; also obviously this should not be narrowly tied to typescript's specific syntax.

Granted, I'm a little dubious about the technical details of how to carve out some cases, e.g. type ... in particular, but they are technical details and deliberately not addressed in detail yet, and the goal of explicitly reserving space for types that wouldn't be used anyway because it would collide with typescript is good. Especially as it lowers the barrier to entry for new type systems! They don't need to start with a transformer and only need to parse their specific type syntax.

@theScottyJam
Author

theScottyJam commented Mar 11, 2022

Seems like there's a lot of "I don't use typescript but I want to make things awful for people who do" people in here.

Not at all. I know at least one other person here expressed a dislike towards TypeScript, but I personally use TypeScript and love it. I also don't feel like the syntax carve-out being proposed in this thread is "not nice", it's just different than the syntax TypeScript is currently using, with a different set of pros and cons. If you don't particularly like the way this proposed syntax looks, that's fine - this is where, perhaps, we can also discuss a middle-ground, where a generic, flexible syntax is provided to handle arbitrary current and future needs, but more specific syntax is also provided to help with some of the uglier parts, whatever we feel those uglier parts may be.

But, overall, I don't feel like we need to hold tight to the syntax that's already being used in TypeScript, if we can find an alternative syntax that's much simpler and more powerful. Sure, it would make the transition to the JavaScript syntax a bit more bumpy, and I am apologetic to this, but I think in the long run it could be a good thing. (Plus, large complicated proposals with lots of syntax changes typically have a really difficult time getting through).

@simonbuchan

I would consider more than 1% of typescript (by lines) having to be updated a very sad outcome. Obviously, foo<T>(bar) is a case where it must be adjusted, handling declare is a sandpit that is being punted for now, etc., but so far all the suggestions here are "re-write (automated or not) pretty much literally everything to an alien syntax"... for very dubious benefit?

This proposal is obviously most immediately useful for typescript users, but even putting on the "I'm inventing a new type system" hat I would far prefer to have more semantic, designed carve-outs than just another syntax for comments. As a note, I expect if that were useful, we would have seen much more uptake on type-systems in comments than pretty much just typescript's JSDoc checking support.

Putting on my "I'm a typed Javascript programmer" hat, I don't want syntax that, bluntly, sucks. Putting @ or :: or whatever in front of everything would suck. It's worth noting here that ActionScript, Typescript and Flow, with similar constraints, came to pretty much identical syntax, despite having quite different actual checking logic and not having all that much deliberate attempt to converge, AFAIK. The existing general syntax seems to be just the natural option for adding types to all the places you want to put types given Javascript's existing syntax.

@theScottyJam
Author

theScottyJam commented Mar 11, 2022

for very dubious benefit?

Perhaps, let me try and expound more on the benefits I see here.

  1. Carving out syntax is expensive.

Carving out as much syntax as this proposal is hoping to do isn't an easy task. There's only limited syntax space available to JavaScript, so new syntax has to pass a very high standard before it gets added in. Each proposal that creates syntactic changes tends to undergo a lot of bike-shedding in order to find a new syntax that doesn't conflict with any existing JavaScript (it's not like JavaScript can just reserve new keywords and what-not, so sometimes the new syntax ends up a little ugly or quirky just to preserve backwards compatibility). Each new bit of syntax we reserve for a feature makes that sort of syntax forever out of our reach in future proposals.

You can see this effect happening even in this thread. I tried to use the @ token, because I thought it would work best, but was told the decorator proposal was far too late in the proposal process for me to use it. Since most other ASCII characters were already being used for other purposes, this leaves us having to choose something that's arguably less-than-ideal to make this idea work.

JavaScript has already used up a fair amount of syntax space, which has created a lot of difficulties (and ugliness) when it comes to adding new syntax. Now, what happens when we bring in a proposal this big, with all of the syntax changes it wants to do? That's a lot of syntax space that'll forever be out of our reach for the future. And, what's worse, people will keep requesting new syntax carve-outs for these type comments as type-safe languages continue to come out with new innovations that need new syntax.

  2. The ECMAScript proposal process is slow

This point is also argued in the README, and it's the reason why they're avoiding spec-ing the specific details of what the type syntax would look like. They want to be able to rapidly innovate and come up with new type-related syntax, without having to go through a proposal process to implement these syntax ideas. What's being proposed here will let TypeScript rapidly innovate on a much wider range of syntax ideas without having to go through TC39.

  3. ECMAScript proposals won't implement all of the syntax that TypeScript wants/needs.

Because of point 1, and the higher learning curve associated with new syntax, ECMAScript sets a really high bar for new syntax. This means there's simply going to be a fair amount of syntax from TypeScript that will never be able to make it into the language. For example, I would be surprised if a proposal to add access-modifier type-comments into the language ever makes it through (considering the language already has "private", and decorators might be able to help implement some of the other access modifiers). Syntax features that are specific to a single type-safe language would also have a difficult time getting through the proposal process, because this syntax is meant to be usable by any type-interpreter. This means, if the proposal continues in its current direction, people will simply have to choose between using a less-powerful version of TypeScript (TypeScript-in-JavaScript), or using the whole thing and having a compile step. What's being proposed in this thread would instead give people a fully-capable TypeScript-in-JavaScript, at the cost of using syntax that's a bit uglier.


To address some of your other points:

but so far all the suggestions here are "re-write (automated or not) pretty much literally everything to an alien syntax"

Yes, and you're correct that this is certainly a downside that we need to weigh in, so I hope I don't downplay it too much. But, if in the long-term this means TypeScript is able to innovate faster and provide new syntax whenever they want, I'm ok with having to do a migration. And, it's also good to realize that migration is not forced - some people might just choose to stick with the current build tooling they have, because "why fix what ain't broken", especially if TypeScript plans to continue supporting the fully-compiled version of itself. Even still, yes, I do recognize that this is going to be a pain point, probably the biggest one when it comes to this idea.

As a note, I expect if that were useful, we would have seen much more uptake on type-systems in comments than pretty much just typescript's JSDoc checking support.

I'm not sure this is really a fair comparison. I mean, let's compare them.

// js-docs
/**
 * @param {string}  p1
 * @param {string} [p2]
 * @param {string} [p3]
 * @param {string} [p4="test"]
 * @return {string}
 */
function stringsStringStrings(p1, p2, p3, p4 = "test") {
    // TODO
}

// TypeScript
function stringsStringStrings(p1: string, p2?: string, p3?: string, p4 = "test"): string {
    // TODO
}

// This thread's original idea
function stringsStringStrings(p1 @string, p2 @?:string, p3 @?:string, p4 = "test") @string {
    // TODO
}

Sure, the TypeScript syntax looks the cleanest here, but this thread's version isn't that much worse. I'm not a huge fan of how this idea deals with the optional parameters, so perhaps that's one of the middle-ground areas that we add extra, explicit syntax to help out. If we do that, then this thread's syntax would be just as nice as TypeScript's current syntax. Either way, both of these are much, much better than the js-doc version.

It's worth noting here that ActionScript, Typescript and Flow, with similar constraints, came to pretty much identical syntax, despite having quite different actual checking logic and not having all that much deliberate attempt to converge, AFAIK.

I sort of doubt that this wasn't deliberate. Languages often copy-paste syntax from other languages to give themselves a more familiar feel. That's why JavaScript's syntax looks so much like Java's, despite having some radically different ideas under the hood. From what I understand, its syntax was originally supposed to be very different, but was changed so developers would feel more at home when using it.

So, likewise, it's certainly a downside that the syntax proposed here will be pretty different from what anyone is used to, which would in turn increase the learning curve for these features. But, it's also just a different (and slightly uglier) skin for the same feature set. The specific details of how the syntax works/looks are often considered to be one of the least important things about a feature, yet one of the most discussed items (hence, the origin of the term "bike-shedding" - everyone focuses on the bikeshed for a new building). My hope is that the benefits of a flexible syntax will outweigh the ugliness of this syntax. Perhaps they don't, and you're right, and it would be better to carve out individual features than to provide a general-purpose, flexible syntax. I don't know. What I do know is that the amount of syntax being proposed by this proposal is a bit scary, and I'm not confident that a proposal of this size will ever be able to reach the end of the proposal process unless it "loses some weight" somehow.

@Nixinova

Nixinova commented Mar 11, 2022

I agree with using something else instead of :; I think a tilde ~ would be best. Porting TypeScript's syntax verbatim would cause more issues than it solves - most programmers reading something like let foo:number; would assume you would not be able to put a number in it and it seems very easy to forget that it's actually just a suggestion. Something like let foo ~ number; would be more obvious in what's actually happening with tilde implying "this should maybe be a number". Different syntax would also allow actual enforced type checking to be added to JS in the future if that is wanted.

// e.g.
function foo(bar ~ number, baz ~ string?) ~ void {}

@simonbuchan

@theScottyJam Ok, I can't respond to all this on a phone at midnight, so please forgive my selective, abbreviated reply!

The keyword situation is not actually that bad: there are quite a few unused reserved words already. Further, you don't actually need to have a unique keyword, just syntax that wasn't valid beforehand. Conveniently, "identifier identifier" is never valid JS, so you have free rein to introduce any keyword always immediately followed by an identifier, like type Foo. There are similar cases for most everything added by typescript, and the other JS extending languages: otherwise they wouldn't be able to add it because it already meant something!
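As a quick sanity check of that claim, one can ask a JavaScript engine whether such a snippet parses today. This is a minimal sketch, assuming an environment where the Function constructor is allowed (for example Node.js without a restrictive content security policy); the helper name parsesAsScript is made up for this illustration.

```typescript
// Hypothetical helper: returns true if `src` parses as a sloppy-mode function body.
// The Function constructor only compiles the source here; the function is never called.
function parsesAsScript(src: string): boolean {
  try {
    new Function(src);
    return true;
  } catch (e) {
    return false;
  }
}

// Ordinary JavaScript parses fine.
const plainJs = parsesAsScript("let x = 1;");

// Two identifiers in a row on one line are a syntax error in current engines,
// which is what leaves room for a new contextual keyword in that position.
const typeAlias = parsesAsScript("type Foo = number;");
const interfaceDecl = parsesAsScript("interface Foo {}");
```

Here plainJs comes out true while typeAlias and interfaceDecl come out false, which is the gap a contextual keyword such as type could be specified to fill.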

Yes, technically these carve-outs would mean it couldn't be used for something else. But seriously, what is interface Foo {} ever going to mean if it's not a type declaration? It sucks when a bad early decision means you can't add a feature, but that doesn't mean never add anything! Further, ECMA would already very strongly avoid colliding with existing syntax in Typescript at minimum, even if it technically wouldn't break deployed content it would be mean to make all the Typescript users churn for no reason (to a lesser extent, Flow as well)

You have a point with what I'll call the "80% coverage" issue, your example being access modifiers. I actually think this specific case is pretty unimportant: they are actually already reserved, so it would not be a big deal to simply spec them as ignored in some context if needed, and also they are not really that critical given we have real privates now. But the general issue is still relevant, and I think it will mostly be that this proposal will just have to add the difficult to avoid cases somehow: I'm guessing some version of declare myself.

Could you give an example of the proposed syntax with some of typescript's more exciting syntax? Such as type ternary expressions using extends tests? I feel at some point you have to give up and just bracket the whole thing... which might be what this ends up proposing for weird cases like that.

You say you have an issue with the size of this feature: but really it's likely to be tiny in terms of spec impact from what I can tell; add some BNF rules, done. Plenty of tiny features to a user can easily end up with dozens of pages of spec for the abstract machine semantics!

@Nixinova "someone might not understand this" and its close friend "someone might misuse this" are both incredibly lame objections: they can be leveled at literally everything. You need to show that it is sufficiently likely to cause an actual problem, and that there is a solution that doesn't make things worse.

Someone thinking that type annotations will be checked is a one time, minimal cost problem that can't really be fixed by different syntax.

@ljharb
Member

ljharb commented Mar 11, 2022

@simonbuchan interface Foo would mean https://github.com/tc39/proposal-first-class-protocols, if it hasn’t already changed to “protocol” to avoid conceptually colliding with TS. It’s best not to underestimate the potential future uses of syntax.

@matthew-dean

matthew-dean commented Mar 11, 2022

@theScottyJam So, it occurred to me early this morning that I should probably explain why a simple character sequence like :: would actually work, universally, in these cases, and why you don't need to pair @ with @@ in your example, or :: with :| or :::.

You, in fact, nearly got there with this statement:

We can also add a "@@" syntax, that will cause everything to the end of the line to be ignored, plus, if any opening brackets were found in that line (via "(", "[", "{", or "<"), further content will continue to be ignored until a closing bracket is found.

You don't actually need @@ though to apply special rules.

Say you have this annotation format.

Say an annotation starts with ::. It ends by:

  • encountering a line ending \n
  • encountering a statement ending ;
  • encountering a list separator ,
  • encountering a JavaScript block end ) or }

The last line is very much like CSS custom properties. Those properties can end automatically with a ; or }, like regular CSS properties, but they don't necessarily end just because ; or } was encountered, because they are block-aware.

Annotations, like CSS custom properties, would have the concept of blocks, which carves out an exception to the above rules. Blocks in this case would be < >, ( ), { } (I'm not sure [ ] should apply). When an annotation encounters a top-level block start, it would continue until it has closed all matching blocks.

So, take a multi-line interface:

::interface Person {
    name::string;
    age::number;
}

You don't need anything special here. The annotation contains a top level block, starting with { and it cannot close until it encounters }.

Similarly, you can have multi-line type assignments by wrapping them in parentheses. So, if you have this in TypeScript:

type Animal = Cat
                 | Dog

You could easily manage this in an annotation format like:

::type Animal = (Cat
                 | Dog)

You would have to get clever about some cases in TypeScript, such as return types on functions, because this logic would fail here:

function addChild::<T>(a::Node)::T { /* */ }

The reason it fails is because of ::T at the end of the function, since it's followed by {, which would be considered part of the annotation by the above rules. (@theScottyJam Which is maybe why you leaned towards a different sequence? There are trade-offs to each!)

You would instead need something like:

function addChild::<T>(a::Node)::(T) { /* */ }

The point is you can get very clever with parsing rules, as other languages have demonstrated.
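The ending rules above can be sketched in a few lines of TypeScript. This is just one possible reading, with some assumptions stated up front: the names (annotationEnd, stripColonAnnotations) are made up for this example, an annotation is assumed to end right after its first top-level block closes (which is what makes ::(T) { ... } work), [ ] is not treated as a block, and strings, comments, and tokens like "=>" inside a block are ignored entirely.

```typescript
// Sketch only: hypothetical names, one possible interpretation of the rules above.
const OPENERS = "<({";
const CLOSERS = ">)}";
const ENDERS = "\n;,)}";

// Given the index of a "::", return the index just past the end of the annotation.
function annotationEnd(src: string, start: number): number {
  let depth = 0;
  for (let i = start + 2; i < src.length; i++) {
    const ch = src[i];
    if (OPENERS.indexOf(ch) !== -1) {
      depth++;
    } else if (CLOSERS.indexOf(ch) !== -1 && depth > 0) {
      depth--;
      // Assumption: closing the outermost block also closes the annotation.
      if (depth === 0) return i + 1;
    } else if (depth === 0 && ENDERS.indexOf(ch) !== -1) {
      // Terminators only count outside of blocks; the terminator itself is kept.
      return i;
    }
  }
  return src.length;
}

// Remove every "::" annotation from the source text.
function stripColonAnnotations(src: string): string {
  let out = "";
  let i = 0;
  while (i < src.length) {
    if (src[i] === ":" && src[i + 1] === ":") {
      i = annotationEnd(src, i);
    } else {
      out += src[i++];
    }
  }
  return out;
}
```

Under these assumptions, the ::(T) version of addChild strips down to function addChild(a) { /* */ }, while the ::T version loses its function body, matching the failure described above.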

Few more points here:

  1. Obviously :: is just an example. You just need some unique sequence that wouldn't be recognized as valid JavaScript. (@simonbuchan I understand identifier identifier is not valid and therefore could also be used, but then when does an IDE flag an error? It's not that developer friendly to start throwing a lot of invalid-to-JS syntax. IMO, the proposal should pick one new construct.)
  2. There are probably some edge cases I haven't thought of. It's before 6 am and this isn't an actual proposal.
  3. One downside to this is that certain parsing strategies simply won't be able to handle this. So you could have a perfectly working JavaScript parser today that could not be (easily?) adapted to be block-aware. For example, in CSS custom properties, you can have open blocks within your top-level block which never get closed, and that's considered okay. I'm not sure you'd want to be that permissive here (and it's a reason why many, many CSS parsers cannot parse all valid custom property values). Another way to say it: you can write a parser that detects JS comments via regex. But you can't regex block-aware annotations.
  4. IMO, even with #3 being true, it's still easier / more straightforward to write clever parsing rules for essentially one piece of syntax than what the "types as comments" proposal is doing, which is adding special parsing rules to many pieces of syntax to flag/parse/interpret them as annotations. To me, that's a non-starter for the JavaScript language. Parsers should not be treating special word cases like type and interface as comments (or any generic identifier identifier), along with JS-like constructs like :string. I feel this proposal is just far, far too broad in its scope, and "special cases" a lot of TypeScript syntax, without any clear rules / internal logic about why other than "because it's in TypeScript". To me, that's not a sign of a solid language proposal. It makes sense to TypeScript, sure, but it doesn't make sense in a JavaScript language spec.

Another point. Someone could look at:

function myFunction::<T>(a::?string)::(T) { /* */ }

and say, "but it's too noisy". I'd point out that it's still way less verbose than JSDoc, but at least it has clear, simple rules, and if you want noise-less TypeScript, you can use TypeScript! 😄

@matthew-dean

matthew-dean commented Mar 11, 2022

@theScottyJam

Just to spitball with your original code, using a single character sequence to see if this logic checks out:

let x::string;

function equals(x::number, y::number)::(boolean) {
    return x === y;
}

::interface Person {
    name::string;
    age::number;
}

::type CoolBool = boolean;

class MyClass {
  name::number;
}

// More complex types DON'T need parens because of annotation closing rules
function fn(value::number | string) { ... }

// optional parameters
function fn(value::?number) { ... }

// import/export types
::export type Person { ... }

::import type { Person } from "schema";

// hmm....
import { ::type: Person, aValue } from "...";

// type assertions
const point = JSON.parse(serializedPoint) ::as { x::number, y::number };

// Non-nullable assertions -- oooo this is a hard one
document.getElementById("entry")::(!).innerText = "...";

// Generics
function foo::<T>(x::T) { ... }

// "this" param -- so.... this is somewhat ambiguous... like the import example, the comma ends up ending the annotation and would then be an invalid parameter block, so that might need some special definition of rules?
function sum(::this: SomeType, x::number, y::number) { ... }

// Ambient Declarations
::declare let x::string;

::declare class Foo {
    bar(x::number)::void;
}

// Function overloading... err, also tricky? Block rules would get messy here, unless we do:
::(function foo(x::number)::number)
::(function foo(x::string)::string)
function foo(x::string | number)::(string | number) {
    // ...
}

// Class and Field Modifiers
// So... a note here, assignment requires you to wrap the type, according to the above block-level rules, so hmm.... again, trade-offs
class MyClass {
  ::(protected readonly) x::(number) = 2;
}

// Allowing someone to "implement" an interface
class MyClass ::(implements MyInterface) { ... }

@matthew-dean

matthew-dean commented Mar 11, 2022

@theScottyJam I guess another way to do this is similar to what I did in another CSS pre-processing language, which is where it looks like you were leaning with single identifiers, but apply block level rules, like the following, using a # (again, not a serious proposal, just an example):

// I don't have to wrap the type this time, because it only does single identifiers or single blocks
let x#string = 'foo';

function equals(x#number, y#number)#boolean {
    return x === y;
}

// Just wrap the thing longer than a single identifier / block, but fugly?
#(interface Person {
    name#string;
    age#number;
})

@simonbuchan

@ljharb a good example, but I think it's illustrative that I sincerely think protocol is a better name and syntax for that than interface, even in the universe where no JS+types language exists: JavaScript already has the convention of referring to "protocols" not "interfaces", it still has the ecosystem conflict with webidl interfaces (which are very close to Typescript interfaces), and the existing uses of "interface" in other languages don't behave at all like protocols (I suspect the equivalent features are called typeclass, trait etc, in part specifically to avoid intuitions about interfaces). That said, I expect you were at the TC39 presentation of that proposal - was the keyword issue raised at all?

I do think JavaScript could add a more "interface-ey" use of interface. But I can't really see that not being basically #45 - which is a completely legitimate issue!

@matthew-dean I think you misunderstood what I meant with "identifier identifier"? Or at least the implication. I was saying it's already disallowed by the grammar, so the language can safely (if not trivially) add any meaning it wants to anything that matches that. In particular: "my_hot_new_not_quite_keyword identifier". Editors would be in the same boat they always are with any syntax extension: they can't parse it until they can.

@theScottyJam
Author

@simonbuchan

By the way, thanks for taking time to engage with me on this, and to help discuss the pros and cons. I think we're both starting to understand each other's points of view here, and what the pros/cons are; we are just giving different importance to these different pieces, which is bringing us to different conclusions.

There's similar cases for most everything added by typescript, and the other JS extending languages: otherwise they wouldn't be able to add it because it already meant something!

This is a fair point. I see some discussions about how X can't be done, because it would conflict with TypeScript syntax. So I think the core features of TypeScript will always be carved out and unavailable for use. Though, I also see the occasional comment about how "TypeScript shouldn't have used the X keyword reserved by JavaScript, we're going to use that keyword and let TypeScript deal with the consequences".

Conveniently, "identifier identifier" is never valid JS, so you have free rein to introduce any keyword always immediately followed by an identifier, like type Foo.

This is true, but can also make syntax extra annoying. Just like you don't particularly like the @@ or :: in front of the interface keyword, I can also see people not being particularly keen on the idea of using two-word keywords, especially with syntax that has a strong need to be concise. But, yes, this syntax space will always be available.

Could you give an example of the proposed syntax with some of typescript's more exciting syntax? Such as type ternary expressions using extends tests? I feel at some point you have to give up and just bracket the whole thing... which might be what this ends up proposing for weird cases like that.

When it comes to the syntax of the types themselves, what's being proposed here doesn't offer any more flexibility than the current proposal. Eventually, yeah, you would have to just bracket the more complicated types (though I think @matthew-dean is onto an idea that can help avoid that). What I'm thinking about more is the non-type related syntax. You already mentioned the "declare" syntax which they're currently not planning on including in this proposal, and it might not ever get included - you may be forced to use js-docs for those.

Let me also mention some possible future expansions that a type-checker could choose to take in at any point.

// A new @frozen "keyword" that indicates this class's instances
// are meant to be frozen.
@frozen class MyClass {
  ...
}

// C++ has a concept of choosing to inherit, but making all inherited
// properties protected or private. I've never seen another language
// take up this idea, but hey, if someone ever wanted to, they can.
class MyClass @privateExtends: BaseClass { .. }

// Maybe one particular type-checker decides to make it possible to
// put a function's type on the line before where the function is
// defined. I've seen some languages do this, and I love it,
// as it really declutters the declaration.
@@function(@string, @{ x @int, y @int }) @void
function myFunction(name, opts) { ... }

These are the sorts of innovations a type-checker could choose to try out and implement at any arbitrary point in time. TC39 might not be so keen on adding syntax support for functionality like this, as it might still be too controversial for TC39 to bring it into the language as a permanent thing, even though it's considered stable enough for whatever type-checker is wanting to bring it in.

Though, I think you already understand this idea well, with how you dubbed this sort of thing "80% coverage".

@theScottyJam
Author

theScottyJam commented Mar 11, 2022

@matthew-dean, I think you're onto an interesting idea here. I like how your syntax idea cleanly helps so you don't have to use parentheses in complex types (like x | y), and how it helps with optional parameters.

It does, unfortunately, make a handful of syntax choices a bit grosser. So perhaps we ought to bring back a second token with different rules to help with those scenarios (though, at the same time, it does look much nicer when it's only a single token).

It does also suffer from these issues:

// These are pretty sweet
function fn(value::number | string) { ... }
function fn(value::?number) { ... }

// But it doesn't work as nicely here:
let x::(number | string) = fn()
let x::(?number) = fn()

// This works ok
const point = JSON.parse(serializedPoint) ::as { x::number, y::number };
// But this isn't quite as nice
const point = JSON.parse(serializedPoint) ::as(number) + getNumb();

// In this example, wouldn't everything after the closing "}" be considered not-a-comment?
::import type { Person } from "schema";
// If not, then I'm not sure I understand how the return-type syntax works:
function fn(x::string)::(string) { ... }

@simonbuchan

I can also see people not being particularly keen on the idea of using two-word keywords,

Not what I meant, see my ninja reply above (it seems I was not very clear!)

In short, only one keyword, but only if it has to be followed by an identifier.

@matthew-dean

matthew-dean commented Mar 11, 2022

@theScottyJam

In this example, wouldn't everything after the closing "}" be considered not-a-comment?

Yep! That's a mistake. Which is why I gave the caveat that I was writing it very early before being caffeinated lol. You're exactly right that there are trade-offs with trying to do block-level parsing, and even issues with terminating the annotation immediately after a block end character, such as the } in the ::import.

It's kind of a tricky problem, so it does need some thought. What I was hoping to illustrate / say is I generally agree with you that this is the right type of thinking e.g. to create a brief comment/annotation system & syntax rather than a broad set of syntax cues for turning things into comments / annotations.

@simonbuchan

I think you misunderstood what I meant with "identifier identifier"? Or at least the implication. I was saying it's already disallowed by the grammar, so the language can safely (if not trivially) add any meaning it wants to anything that matches that.

I still don't really follow. Yes, the language can safely add meaning but... (and maybe this is the part I'm misunderstanding), I wouldn't want a language that just treats any arbitrary sequence of identifiers as valid? I just feel like that becomes really hard to flag what was intentional and what's an input error. Maybe you could put in more examples to illustrate that it's not arbitrary?

@theScottyJam Maybe there's something to Microsoft combining token-level annotation syntax with meaningful comment syntax, like:

Say we adopted your rules where you can have an identifier or block (and a few other characters like ? and !?) following :: (I'm still assuming @ is a non-starter because decorators.), but it doesn't "continue". So then you get syntax sorta like:

let x::string = 'foo';

function equals(x::number, y::number)::boolean {
    return x === y;
}

// Just adopt a particular prefix in a regular comment? TypeScript could do this today
/*@
  interface Person {
    name::string;
    age::number;
  }
*/

let x::(number | string) = fn()
let x::?number = fn()

const point = JSON.parse(serializedPoint)::(as { x::number, y::number });
// How I would treat this:
const point = JSON.parse(serializedPoint)::(as number) + getNumb();

/*@ import type { Person } from "schema"; */
function fn(x::string)::string { ... }

// Non-nullable assertions (if you disallowed the period at the "root-level", it would work?)
document.getElementById("entry")::!.innerText = "...";

// Function overloading -- more straightforward
/*@
type StringOrNumber = string | number

function foo(x::number)::number
function foo(x::string)::string
*/
function foo(x::StringOrNumber)::StringOrNumber {
    // ...
}

So, this way, you're still adding a micro-annotation syntax, but it's much more conservative. And TypeScript already does this to some degree with "special comments" like // @ts-check.
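As a rough sketch of how small such a "micro-annotation" rule could be, the TypeScript below strips a `::` followed by optional `?`/`!` markers and then at most one identifier or one balanced block, with no continuation. These rules are my reading of the examples above, not a specification; `microAnnotationEnd` and `stripMicroAnnotations` are hypothetical names, and string literals and comments are not handled.

```typescript
// Hypothetical sketch: "::" + optional "?"/"!" + ONE identifier or ONE balanced block.
const PAIRS = new Map<string, string>([["(", ")"], ["{", "}"], ["<", ">"]]);

// Returns the index just past the annotation that starts at `start`
// (the position right after the "::" sigil).
function microAnnotationEnd(src: string, start: number): number {
  let i = start;
  // Optional marker characters such as "?" (optional) or "!" (non-null).
  while (i < src.length && (src.charAt(i) === "?" || src.charAt(i) === "!")) i++;

  const closer = PAIRS.get(src.charAt(i));
  if (closer !== undefined) {
    // One balanced block, which may contain nested blocks.
    const stack: string[] = [closer];
    i++;
    while (i < src.length && stack.length > 0) {
      const ch = src.charAt(i);
      const nested = PAIRS.get(ch);
      if (nested !== undefined) stack.push(nested);
      else if (ch === stack[stack.length - 1]) stack.pop();
      i++;
    }
    return i;
  }

  // Otherwise at most one identifier; anything else ends the annotation.
  const match = /^[A-Za-z_$][\w$]*/.exec(src.slice(i));
  return match ? i + match[0].length : i;
}

function stripMicroAnnotations(src: string): string {
  let out = "";
  let i = 0;
  while (i < src.length) {
    if (src.startsWith("::", i)) {
      i = microAnnotationEnd(src, i + 2);
    } else {
      out += src.charAt(i);
      i++;
    }
  }
  return out;
}

console.log(stripMicroAnnotations("let x::(number | string) = fn()"));
console.log(stripMicroAnnotations('document.getElementById("entry")::!.innerText = "...";'));
```

Because the annotation never continues past a single unit, an assignment or a function body following the type is left alone without any extra wrapping.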

So then you could put your large TypeScript-y "definition blocks" in special comment syntax that only that parser would interpret, which would be valid JS comments today. And then you are simply adding a micro-annotation syntax with some rules for auto-ending the annotation. That should cover most, if not all TS-y things in this proposal?

@simonbuchan

@matthew-dean ok, so there's no currently valid JS that contains ::, so you can add a meaning for that. Likewise, there's no valid JS that has type Foo, so you can add a meaning to that. And the same goes for any other sequence that would currently parse as two identifiers. The point is the keyword issue is really not that big a problem.

@matthew-dean

matthew-dean commented Mar 11, 2022

@simonbuchan I get that, but a sequence like :: is very different from a sequence like type Foo, both semantically and pragmatically. You're comparing two specific points in Unicode, representing a generic start of a sequence vs a sequence of tokens of indeterminate length. In parsing terms, if you said, "any two sequential identifiers makes the first character of the identifier the start of a comment", then you don't know what type is until you get to an indeterminate number of white-space characters and then later encounter Foo, at which point type could be assigned / grouped to a different expression, retro-actively.

If, alternatively, you're saying, "no, not any arbitrary identifiers, but specifically 'type'," then you're asking less of the parser, but you're asking much more of the language, to reserve another keyword and to have particular parsing semantics for it. (And this proposal asks for more than just type.) And even if you're okay with the "ask", this proposal claims it's sort of a "universal", when specifically asking for type is very much a TypeScript-specific ask (even though another language might also use type), for another language that is not TypeScript. So, if that was the ask, then as a language designer, I would reject that outright, unless it could be demonstrated that type or interface are words that are worth reserving in JavaScript as special user space keywords.

So either interpretation of type Foo is either asking a lot from parsers, users, and potentially impacting error-checking in IDEs, or impacting the spec in a strange specific way, which is much greater than the impact of ::, even though, as you say, "there's no currently valid JS" for either. Or a simpler way to say it is that type Foo has side effects that :: does not have.

@matthew-dean

matthew-dean commented Mar 11, 2022

We should note that the proposal actually says that this direction proposed in this thread may be the way to go:

There might be another direction in which this proposal expands comment syntax just to support modifiers like these, which lead with a specific sigil.

class Point {
  %public %readonly x: number

  @@public @@readonly x: number
}

I'm not particularly a fan of % or @@ for how noisy they are (let x@@number = 0?), but by the same logic, the above could be re-written as:

class Point {
  ::public ::readonly x::number
}

...which actually more accurately reflects a "specific sigil". (A single one vs both @@ and :)

I'll also note (not sure I made it clear before), that my use of :: is not arbitrary, and is also lifted from the proposal:

// Types as Comments - example syntax solution
add::<number>(4, 5)
new Point::<bigint>(4n, 5n)

So, I feel like the proposal authors had the right idea here for annotating types, but probably didn't realize the same argument could be made elsewhere or generalized to a more general and useful annotation syntax (beyond the narrow ::<> use case, which again, is a very, very specific ask, when the ask should be more generalized).

@simonbuchan

simonbuchan commented Mar 11, 2022

@matthew-dean this ... just isn't an issue. Due to exactly the above sort of example, pretty much everyone tokenizes separately from parsing (and the people who don't are using even more expressive systems like PEG), and the parser is perfectly capable of asking if the identifier type is followed by another identifier, and if so treating it as if it were a keyword. It wouldn't even be the first case of identifiers being keywords based on context: yield and await are keywords depending on their containing function declaration; async is a keyword only when followed by function or (; async, get and set are keywords in class bodies or object literals when followed by identifiers; not to mention every keyword is an identifier in a property context (object literal or property access).
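A minimal sketch of that lookahead, in TypeScript: tokenize first, then let the parser ask whether the identifier `type` is immediately followed by another identifier. The tokenizer here is deliberately tiny and hypothetical (it knows nothing about strings, numbers, comments, or line terminators), and excluding `in`, `instanceof`, and `of` from the "identifier" kind is my own assumption so that expressions like `type in registry` are not misread.

```typescript
// Hypothetical sketch of a contextual keyword check on a token stream.
type Token = { kind: "identifier" | "other"; text: string };

const IDENTIFIER = /^[A-Za-z_$][\w$]*$/;
// Words that may legally follow an identifier in today's JS (assumed list, not exhaustive).
const NOT_PLAIN_IDENTIFIERS = new Set<string>(["in", "instanceof", "of"]);

function tokenize(src: string): Token[] {
  // Either an identifier-like word or any single non-whitespace character.
  const words: string[] = src.match(/[A-Za-z_$][\w$]*|\S/g) || [];
  return words.map((text): Token => ({
    kind: IDENTIFIER.test(text) && !NOT_PLAIN_IDENTIFIERS.has(text) ? "identifier" : "other",
    text,
  }));
}

// True when tokens[i] is `type` used as a keyword, i.e. followed by an identifier.
function startsTypeDeclaration(tokens: Token[], i: number): boolean {
  const current = tokens[i];
  const next = tokens[i + 1];
  return (
    current !== undefined &&
    next !== undefined &&
    current.kind === "identifier" &&
    current.text === "type" &&
    next.kind === "identifier"
  );
}

console.log(startsTypeDeclaration(tokenize("type Foo = string"), 0)); // keyword use
console.log(startsTypeDeclaration(tokenize("type = 5"), 0));          // plain identifier
console.log(startsTypeDeclaration(tokenize("type(foo)"), 0));         // plain identifier
```

The check needs only one token of lookahead, which is the point being made: the decision happens at the parser level, after tokenization, and does not require rescanning characters.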

@jethrolarson

👏
I really like this proposal! It really separates the details of the type-system which would not be part of JS, and grants a place for an arbitrary type-system to exist on-top of JS, whether that's TS, Flow, or some future thing(like my proposal 😉 #84 ).

In particular using something like @@(whatever type stuff) to intersperse complex types in whatever way the 3rd party wants is cool.

Only thing this is missing is some kind of pragma to say what system is being used. Maybe just some convention like

@@(using typescript)

or similar is enough. Though with bundling and such, where does such designation end?

I think the specifics of what digraph to use (::, @@) may just be a matter of taste and running a poll or something could decide.

@simonbuchan

I think the specifics of what digraph to use (::, @@) may just be a matter of taste and running a poll or something could decide.

Here's my real problem with this sort of suggestion: how do you feel about /*@ and */?

@theScottyJam
Author

theScottyJam commented Mar 11, 2022

@jethrolarson

Only thing this is missing is some kind of pragma to say what system is being used. Maybe just some convention like

That sort of thing is actually being discussed in issue #36

(like my proposal 😉 #84 ).

I actually recently noticed that idea, and would love it if future type systems were to go in that direction (so, my hope is we can come up with an annotation system that's flexible enough to support that style of typing as well). I hate how cluttered syntax gets when types are intermixed with actual code. We would still need some sort of way to flexibly handle things like "implements", "as", etc though.

I do sometimes prefer putting the type annotation for a single variable or a property on the same line as the property itself, but I don't have a strong preference here.

@simonbuchan

Here's my real problem with this sort of suggestion: how do you feel about /*@ and */?

I'm fine with it. It's a little awkward to use, because:

// Those /* and */ take up a lot of extra vertical space, especially when dealing with smaller interfaces.
/*@
interface MyInterface {
  ...
}
*/

// Theoretically, you could try removing some whitespace like this, but that feels awkward as well.
/*@interface MyInterface {
  ...
}*/

So, the most difficult part of the /*@ ... */ suggestion is the fact that you need the closing */, which makes it difficult to make the syntax more concise, compared to a simple prefix of something like ::. So, my preference would be to have some sort of syntactic prefix for this sort of thing, but it's not a strong concern for me.
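For what it's worth, the tooling side of the `/*@ ... */` form is simple, since these are already ordinary block comments. The TypeScript sketch below pulls the contents of such comments out with a single regular expression; `extractTypeBlocks` is a hypothetical helper, and the sketch assumes `/*@` never appears inside a string literal or another comment.

```typescript
// Hypothetical sketch: collect the bodies of /*@ ... */ comments for a type checker.
function extractTypeBlocks(src: string): string[] {
  const blocks: string[] = [];
  // Non-greedy match so each block stops at the first closing "*/".
  const pattern = /\/\*@([\s\S]*?)\*\//g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(src)) !== null) {
    blocks.push((match[1] || "").trim());
  }
  return blocks;
}

const source = [
  "/*@",
  "interface MyInterface {",
  "  x: number;",
  "}",
  "*/",
  "const a = 1;",
  "/*@interface Other {}*/",
].join("\n");

console.log(extractTypeBlocks(source));
```

Both the spread-out and the compact layouts shown above are picked up the same way; what a checker does with the extracted text is entirely up to that checker.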

I think the most important thing is to find a way to provide type syntax and/or operators inline, in the middle of expressions, so you can use things like as, implements, etc. I'm ok with the syntax being a bit heavier for big chunks of type-only logic, even if it might mean using block comments to handle it, as long as we can find something suitable for the smaller, in-line chunks that isn't overly heavy to use.

@jethrolarson

  • where does the :: end? Near as I can tell, it's hard to reserve syntax that's nice to use (so that anyone uses this rather than a transpiler) and possible to parse, which means (again, as far as I can tell) you kind of need to define some set of possible syntax probably as big as the current proposal. At which point, why not use the simpler, unprefixed, in use syntax?

I have a proposal for that here https://github.com/jethrolarson/proposal-types-as-comments/blob/generic_annotations/README.md

@simonbuchan

This is ultimately a proposal for JavaScript.

And proximally a proposal for users of types-in-JS. If you're debating the value of a change to JS, don't you think it's appropriate to think of who's going to use it and why? A Typescript user is not the only person who might use this, but they are a primary use case. If you are going to drop them as users then you make this proposal far less valuable.

I wouldn't frame it as: what would make a TypeScript author of .ts want to move to just .js because I don't feel the authors really address that (or even propose it)? (Because, with a fast enough compiler, it's not that complicated / hard to just use .ts and transpile?)

As someone who has lost multiple weeks of their life to trying to get various flavors of ts-node configuration, multiple tsconfigs with carefully tuned includes, editor confusion about that, the current completely broken state of typescript node ESM, dodgy source mapping, the voodoo rituals to try to get various flavors of caching working? Having to write a webpack or roll-up config just to strip types, given how good their defaults are now? Uh huh. How nice for you.

This would significantly improve my experience, and I'm in a better position than many to make this less painful than it could be. Even the trivial case of writing a loose script to automate something and not needing to pick between the poisons of the garbage jsdoc syntax, installing packages, or having my editor completely confused about what's going on would on its own be enough to justify this. And I sure don't want to have to use maybe less garbage completely novel syntax just because you have, as best as I can tell, a theoretical issue with the idea of the proposed syntax, not even any concrete specific issues.

I think it's more like, what would make it easier not for the .ts user, but for the JSDoc-style type-checking .js user (or other type systems using comments)? That is, if you want to write JS that's type-checked by TS, but is instantly runnable without transpiling, for whatever reason, could there be a system that's better / more flexible than JSDoc.

You're saying that only people writing .TS files are typescript users, and the people using typescript to check their .JS are not? But even that distinction disappears with the original proposal, because it's only a syntax change between the two. The only real reason to use checkJs in Typescript is because it's that important to not have to introduce a build step. Guess what this proposal addresses?

This proposal IMO shouldn't be about: could we make a system that makes zero reason to have .ts files, because I don't think it would ever get there.

I'm not sure what point this is trying to make? Nobody's suggesting that. If it's even 5% of TS users, that's still a win. If checkJs users' lives become far nicer, that's a win. If everyone who writes a script can avoid all the headaches with ts-node, that's a win. If it's some new dev copy pasting some typescript into a browser console, that's a win. Not every proposal has to solve everyone's problem maximally.

That said, I think you're overestimating the actual missing features from typescript. The big ticket missing items from memory are:

  • declare - you can use .d.ts files
  • enum - basically just a set of constants and a type union.
  • namespace - either you should already be using modules, or this is just an object, just regular JS
  • Access modifiers/abstract - not all that useful if you have native private and modules to hide things.
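To illustrate the enum and namespace bullets, here is a small sketch, written in ordinary TypeScript, of the usual substitutes: a frozen-shape object of constants plus a union type derived from it, and a plain object standing in for a namespace. The names `Direction`, `opposite`, and `Geometry` are made up for the example.

```typescript
// "enum" as a set of constants plus a type union derived from them.
const Direction = { Up: "UP", Down: "DOWN" } as const;
type Direction = (typeof Direction)[keyof typeof Direction]; // "UP" | "DOWN"

function opposite(d: Direction): Direction {
  return d === Direction.Up ? Direction.Down : Direction.Up;
}

// "namespace" as a plain object (a module would work just as well).
const Geometry = {
  area(width: number, height: number): number {
    return width * height;
  },
};

console.log(opposite(Direction.Up));
console.log(Geometry.area(2, 3));
```

Nothing here emits runtime code beyond what is written: the constants and the object are regular JavaScript values, and the type alias only exists for the checker.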

So from what I can tell, you could absolutely just use this "80%" happily. But even if you can't ... that can only really justify an argument to add whatever missing critical syntax in some form, not to introduce a completely new syntax to replace everything?

And nothing's stopping those missing pieces getting added, once they demonstrate value.

@ljharb
Member

ljharb commented Mar 14, 2022

@simonbuchan i use checkJs with tsc on babelified code - in no way to avoid a build step (which is unavoidable and imo is not a worthy goal to pursue), but because i don't want to author in typescript.

@simonbuchan

@ljharb It's not at all unavoidable if you're not shipping to browsers. Or developing.

Also, I can't possibly imagine that preferring jsdoc syntax to Typescript is an at all common preference, it's obviously worse in every conceivable way other than it "being JS"? Is there some reason you have to believe that others would also "not want to author in typescript", assuming that this proposal landed?

I ask because every JavaScript user's complaints about Typescript I've heard boil down to either "I don't want to use types" or "the syntax is too confusing", but you're using types, and with a more confusing syntax.

@ljharb
Member

ljharb commented Mar 14, 2022

@simonbuchan not that it affects this proposal either way - but if typescript's JSDoc support was anywhere close to the capabilities of native TS syntax, I would likely use it on all of my open source libraries - that, and the JS semantics TS isn't capable of typing, are the only reasons I don't. I would not, however, switch to using "not JS" on any of those libraries.

If this proposal landed, and a type system existed that actually covered JS semantics, I would certainly prefer using the combo over similar syntax inside of comments.

@kee-oth

kee-oth commented Mar 14, 2022

@matthew-dean
I absolutely agree with you. This proposal should be focused on what's best for JS. Which isn't necessarily what's best for TS, regardless of TS's user base size.

@simonbuchan
I use JSDoc because of TypeScript's shortcomings in the way I program. I've tried TS out plenty of times but I inevitably get blocked. As @ljharb mentions, there are "JS semantics TS isn't capable of typing". Modeling this proposal after what TS needs just doesn't make sense, regardless of user base size. Our starting point is a system that doesn't work for all of JS which is a really shaky foundation. TS should be used as a reference but not as the One True Way as that doesn't currently exist.

(JSDoc is) obviously worse in every conceivable way other than it "being JS"

You've made a sweeping claim that I don't believe is true. I can conceive of an advantage for JSDoc: you can easily add descriptions to parameters and properties. In TS, you'd just have to add JSDoc anyway (or TSDoc) or some other comment system. JSDoc can also be used for at least some of those semantics that TS can't be used for.

@simonbuchan

@ljharb I'm a bit confused, does this describe your preferences accurately, best to worst?

  • Native syntax (e.g. non-generic annotations along the lines of typescript/flow syntax, but not necessarily the same), with a different type checker that models JS semantics better for you? (let's call it harbscript)
  • Generic, non-type specific annotation syntax with harbscript semantics
  • JSDoc with harbscript semantics
  • JSDoc with typescript semantics
  • Any form of non-specified syntax

If so, shouldn't the original proposal (as opposed to this issue's proposals in general) be workable for you, assuming your harbscript semantics don't need more annotation space than the current proposal? If it might, where do you feel it is likely to be missing something?

The existing syntax of the proposal is speculative, and up for discussion, from what I can tell. Hell, I would be perfectly fine with dropping parts of it, e.g. interface and !, and requiring e.g. type Foo = interface { } and as NotNull<_> or whatever, if someone had a grounded reason to object. I would also be fine with adding speculative space that seems like it would be generally useful for other types of languages and checkers (e.g. Haskell style leading types), if a convincing case were made and a workable solution found. That's essentially why the proposal process exists, after all! (Of course, it's the committee that needs to be convinced, not me; I merely expect that they have very broadly similar expectations from their previous decisions.)

To be clear, I have absolutely no issue with people not liking typescript semantics. It has plenty of things I hate (overloads are basically broken, lack of consistency about declaration merging, inability to declare intersecting environments (e.g. node and browser), and so on). It is sufficiently performant and existing for my purposes, so I use it, but I would drop it for a better option in a heartbeat. One of my hopes for this proposal is that it encourages new type checkers, after all! So don't conflate the fact that the syntax starts out fitting a reasonable subset of typescript syntax with the use of typescript as the checker. From the proposal text:

Additionally, type checkers, such as Flow, Closure, and Hegel may wish to use this proposal to enable developers to use their type checkers to check their code. Making this proposal be only about TypeScript can hamper this effort. Amicable competition in this space would be beneficial to JavaScript as it can enable experimentation and new ideas.

@kee-oth In the specific use-case of JSDoc for using Typescript's checkJs, you have exactly the typescript semantics, just with far worse syntax. If you're talking about using Closure or something, sure, but that's not what I was replying to. JSDoc can still be used for, you know, documenting.

@ljharb
Member

ljharb commented Mar 14, 2022

@simonbuchan lol at calling "the way JS works" something arbitrary :-p but that sounds right? i'm not decided yet on what's workable for me. but it's not clear that it's possible to build a type checker that works with actual JS semantics, and if that's never going to be possible, then I suspect any syntax in this space is not a good idea whatsoever.

@simonbuchan

I didn't mention arbitrary, let alone about the way JS works? Not sure what you're referring to there.

Of course, any static type checker is only going to accept a subset of valid semantics. That's kind of the point! Which it accepts is a matter of preference and programming style. As such, we currently have at least Typescript, Flow and (new to me, mentioned by the spec!) Hegel, all with reasonably different semantics but, notably, extremely similar syntax. Perhaps it's not the ideal syntax for some future super-type system, but it gets to choose between transpiling until it makes its case for the syntax to be added, no worse than today, or it is one of the many cases of something not being perfect because we can't see the future.

@theScottyJam
Author

theScottyJam commented Mar 15, 2022

just because you have, as best as I can tell, a theoretical issue with the idea of the proposed syntax, not even any concrete specific issues.

Fair point. I'll try to make the 80% issue a little more concrete with a bit of research.

Let's imagine this proposal got dropped into JavaScript a few years ago. This means, we would have great support for TypeScript syntax at that time, but poor to no support for future syntax they've added since then. What would we miss out on? I went through the release notes between now (version 4.6) and version 3.6, which was released back in August of 2019. I looked for any new syntax they added, that could potentially cause issues if a proposal like this already landed that brought type syntax to JavaScript.

Some of the features they've added in the past couple of years pertain to type-annotation-related syntax (i.e. the stuff that goes after a colon). I found out from #105 that they do plan on being pretty rigid about the syntax of the types you use outside of brackets/parentheses (which I'll call top-level type syntax, since I want a name for it). The exact details of their tentative plans can be found in this grammar document. While they leave you room to do whatever you want inside of brackets, you're basically required to follow TypeScript syntax outside of brackets. This means, for example, if you want to have a type prefix in top-level type syntax, you have to use one of the explicitly defined prefixes available to you: readonly, keyof, unique, infer, and not. There's also explicit support for TypeScript's union and intersection types (via | and &), literals, array types (via someArray[] syntax), void, conditional types (via the exact syntax of something extends somethingElse ? type1 : type2), type predicates via x is y, function types, constructor types via new (...) => ..., etc, etc.
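To make that list a bit more concrete, here's a small sketch of the kinds of TypeScript constructs that would be usable as top-level type syntax, i.e. without wrapping them in parentheses. All of the names (`Point2D`, `isString`, `Circle`, and so on) are made up for illustration; only the type syntax matters.

```typescript
// Hypothetical names throughout; the point is the syntax after each colon.
interface Point2D { x: number; y: number }

// Prefix operators: readonly, keyof
const originCoords: readonly number[] = [0, 0];
const axisName: keyof Point2D = "x";

// Union and intersection types, literal types, array types
const shapeId: string | number = 42;
const taggedPoint: Point2D & { label: "origin" } = { x: 0, y: 0, label: "origin" };

// Conditional type: `something extends somethingElse ? type1 : type2`
type ElementOf<T> = T extends (infer U)[] ? U : never;
const firstCoord: ElementOf<number[]> = 0;

// Type predicate: `x is y`
function isString(value: unknown): value is string {
  return typeof value === "string";
}

// Constructor type: `new (...) => ...`
class Circle {
  radius: number;
  constructor(radius: number) { this.radius = radius; }
}
const makeCircle: new (radius: number) => Circle = Circle;
```

Anything a checker invents that isn't on this list would have to go inside brackets instead.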

In general, this means TypeScript syntax is being blessed a lot more than I realized (the README made these type annotations sound much more flexible than what they really are. They're only flexible if you're using brackets). As an aside, this also causes me to retract my "second iteration" of what I had proposed here, since I was relying on the type-annotation syntax they presented in the README without realizing this syntax was so tied down to a bunch of specific TypeScript features.

And, while it's technically possible to add whatever syntactic features you want by requiring you to use parentheses, this technique suffers from a couple of issues:

  1. There wouldn't be any consistency on when parentheses are required and when they're optional. It basically boils down to if the specific feature you want to use was blessed by TC39 and added natively to the language or not.
  2. Using parentheses isn't that much more verbose than using regular comments. i.e. compare x: (readonly values[]) to x /*@readonly values[]*/

Anyways, the point of explaining this, is to make sure we have an understanding of how they plan on implementing type annotations, so we know how this will cause issues when newer type-related features come along.

So, without further ado, let's time travel back a couple of years, get this proposal in using an earlier version of TypeScript, then see what features TypeScript will struggle to bring into JavaScript as time goes on.

  • Template literal types were introduced in version 4.1, and look like this: x: `a${number}b`;. It's possible a past version of this proposal would have carved out syntax to make this possible as a top-level type, but it's also possible this wouldn't have happened, and a new proposal would be required to support this as top-level type syntax.

  • Version 4.2 added the ability to place abstract before a constructor signature, like this: abstract new () => HasArea. I would find it especially confusing if new () => HasArea is allowed as top-level syntax, but abstract new () => HasArea required parentheses. But, they would have to get another proposal through in order to prevent this inconsistency.

  • Version 3.7 added the ability to make asserts condition a return type, like this: function assert(condition: any, msg?: string): asserts condition { ... }
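For anyone who hasn't used these three features, here's a rough, self-contained sketch of what each one looks like in current TypeScript. The names (`Labeled`, `assertOk`, `Shape`, `UnitSquare`, `lengthOf`) are hypothetical and just there to make the snippet compile.

```typescript
// Template literal type (TypeScript 4.1): "a", then a number, then "b".
type Labeled = `a${number}b`;
const sampleLabel: Labeled = "a42b";

// `asserts condition` return type (TypeScript 3.7).
function assertOk(condition: unknown, msg?: string): asserts condition {
  if (!condition) throw new Error(msg ?? "assertion failed");
}

// Abstract constructor type (TypeScript 4.2).
abstract class Shape {
  abstract getArea(): number;
}
type AbstractShapeCtor = abstract new () => Shape;

class UnitSquare extends Shape {
  getArea(): number { return 1; }
}
// A concrete class is assignable to an abstract constructor type.
const shapeCtor: AbstractShapeCtor = UnitSquare;

function lengthOf(input: string | undefined): number {
  assertOk(input !== undefined, "input is required");
  return input.length; // narrowed to `string` by the assertion above
}
```

Each of these is type syntax sitting outside of any brackets, which is exactly the position a fixed top-level grammar would have had to anticipate.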

Now for some stuff that's not directly related to type annotations. Here, things get even more interesting.

  • Before version 4.0, the type of a caught exception was always assumed to be any. Since then, they added the ability to allow you to specify its type as unknown instead. If this proposal had already been implemented, then it's possible TypeScript would have had to go through the proposal process to allow type annotations on the binding in the catch block.

  • In version 4.3, they added an override keyword that you can place on methods of a subclass to indicate that they override methods on a superclass. This wouldn't have been possible without a new ECMAScript proposal.

  • In version 3.8, they added the ability to specify that the value you're importing is a type. Before that, it would try and automatically determine this information, which they later found had a number of issues. This is now one of the core features presented in the README, but it would have had to have been an extra follow-on proposal had the core proposal been added earlier.
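Again, as a quick illustration of what this syntax looks like in practice (all names here are hypothetical, and the type-only import is shown as a comment since it would need a real module to resolve):

```typescript
// Catch-clause annotation (TypeScript 4.0): opt into `unknown` instead of `any`.
function safeParse(text: string): string {
  try {
    return JSON.stringify(JSON.parse(text));
  } catch (err: unknown) {
    // `err` has to be narrowed before it can be used.
    return err instanceof Error ? `failed: ${err.name}` : "failed";
  }
}

// `override` modifier (TypeScript 4.3).
class Animal {
  speak(): string { return "..."; }
}
class Dog extends Animal {
  override speak(): string { return "woof"; }
}

// Type-only import (TypeScript 3.8); the module path is made up:
//   import type { Person } from "./schema";
```

None of these live after a type-annotation colon in the usual sense, so bracket-based flexibility wouldn't have helped with any of them.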

All of these features, except support for abstract constructor types, are explicitly present in the current proposal (as outlined in the grammar document I shared earlier), and yet didn't even exist in TypeScript 2½ years ago. Think of all of the follow-on proposals TypeScript would have needed in those 2½ years to bring these all in, and think of all of the follow-on proposals they would continue to need in order to add more syntax in the coming years. How many of these proposals would get accepted? How many wouldn't? Are these features all general features that are meant for any type-interpreter, or are they just stuff that TypeScript wants? The other option is to just be comfortable with basically using an older version of TypeScript's syntax in JavaScript, and be ok with not having the new shiny features, like being able to specify that you just want to import types from somewhere, or using the override keyword on class members. But, if we're resorting to this, we'd also need to acknowledge the fact that this proposal is so closely tied to TypeScript's current syntax that it can't even support future versions of TypeScript, let alone provide good syntax support for other type-parsers like Flow, thus defeating one of its own goals of being flexible enough to be type-parser agnostic, at least in terms of syntax. (Sure, Flow can have input in the design of this current proposal, but if Flow does things one way and TypeScript does things another, I can't imagine TC39 providing explicit syntax to support both ways, so only one way will win. You can't make both parties happy with syntax that's this rigid.)

@simonbuchan

@theScottyJam these are excellent points, and I agree completely regarding the current rules (given I've only glanced at them, but haven't taken the time to really look into them yet)

I was expecting and hoping for something a lot closer to Rust's macro rules, which is roughly "arbitrary token sequences up to a set of stop": https://doc.rust-lang.org/reference/macros-by-example.html#follow-set-ambiguity-restrictions

You couldn't do that too naively, after all types should be able to contain {}, but also in function return type position they should stop at the body opening brace. But surely that sort of thing is resolvable.
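A toy sketch of why the naive version falls over. This is not how Rust's macro matcher works; it's just a "consume tokens until a stop token" loop over pre-split strings, with hypothetical names, to show the return-type case.

```typescript
// Naive scan: take tokens until the first one that is in the stop set.
function takeUntilStop(tokens: string[], stops: Set<string>): string[] {
  const taken: string[] = [];
  for (const tok of tokens) {
    if (stops.has(tok)) break;
    taken.push(tok);
  }
  return taken;
}

// In return-type position the function body's "{" has to be a stop token.
const returnTypeStops = new Set(["{"]);

// `function f(): boolean[] { return [] }` -- tokens after the colon:
const simpleCase = ["boolean", "[", "]", "{", "return", "[", "]", "}"];
// `function f(): { ok: boolean } { return x }` -- object type as return type:
const objectCase = ["{", "ok", ":", "boolean", "}", "{", "return", "x", "}"];

const simpleType = takeUntilStop(simpleCase, returnTypeStops); // ["boolean", "[", "]"]
const objectType = takeUntilStop(objectCase, returnTypeStops); // [] (stops immediately)
```

The second case is the problem: the type itself starts with the stop token, so some extra rule is needed to tell the two braces apart.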

In any case, that is presumably a deliberately conservative first attempt, and can and will be improved (and specifics should probably be in a different issue)

@simonbuchan

Note that #106 is the counter-argument, and seems pretty well researched at least.

@acutmore
Collaborator

acutmore commented Mar 15, 2022

  • Template literal types were introduced in version 4.1, and look like this: x: `a${number}b`;. It's possible a past version of this proposal wouldn't have carved out syntax to make this possible as a top-level type, but it's also possible this wouldn't have happened, and a new proposal would be required to support this as top-level type syntax.

Hi @theScottyJam!

I’m fairly confident that string templates wouldn’t have been missed even if this proposal had been accepted before they were in TypeScript as strings clearly* require dedicated handling in the tokeniser as they can contain arbitrary sequences of characters.

You are 100% right that a core challenge for the designers of this proposal, like many TC39 proposals, is to try and find the balance of solving the known problems of today while keeping enough space for the future unknowns.

* clear to people who are familiar with writing compilers that is.

@theScottyJam
Author

theScottyJam commented Mar 15, 2022

@simonbuchan

But surely that sort of thing is resolvable.

Unfortunately, I'm not so sure. #106 does seem to be a good attempt at trying to generalize the grammar for type annotations, but unfortunately it has a handful of issues, including the issue with the conflicting { token (is it the start of a function body, or part of the type annotation?), and I don't see a way of resolving these sorts of issues while keeping the grammar generic. I laid out some thoughts over there. We'll have to see how that thread evolves.

In the meantime, I've noticed a couple of posts mentioning that their current plan for flexibility is to just wrap the type annotation in parentheses (as mentioned here), which worries me. I would prefer that either you don't have to use parentheses at all (i.e. they fully spec how type syntax works, even within parentheses - leaving us with no flexibility in syntax which isn't a good option), or they greatly cut back on the number of special-cases they're making to the type-annotation syntax, making it simpler, and thus easier for the end-user to know when parentheses are required (which also makes parentheses required more often). Route 1 will cause them to completely get rid of the idea of this proposal being friendly to different type-parsers, which would probably kill the proposal. If they go the second route, then we've basically got the token-level comments proposed from this thread, perhaps with a couple more bells and whistles and some extra restrictions to where that comment is allowed to be placed, but nothing too complicated.

Anyways, all of this is contingent on the fact that I don't think they'll be able to find a more flexible syntax for type annotations. If they do, that'll be great, and I'll watch the #106 thread to see if people are able to come up with something.

@dpchamps
Contributor

dpchamps commented Mar 15, 2022

@theScottyJam I'm inclined to agree with you. I'm still mulling over the impact of just wrapping everything in parens. It looks like you've been thinking about it much more thoroughly.

Like you, I would prefer to not have the parenthesis at all. The additional syntax "escape hatch" may be indicative of a design smell -- or perhaps an opportunity to refine and simplify. It's at the minimum mildly inconvenient to have to type these extra tokens in order to construct any kind of type.

OTOH the type annotation symbol : is so ubiquitous in mainstream languages as well as academia (happy to back up these claims, but I don't believe them to be controversial), that I think departing from it would be much, much worse.

I would very much not like to see JS depart from the status quo here.

If we were to assume that the grammar was something like

TypeDeclaration :
  type BindingIdentifier TypeParametersopt = ( <anything-at-all-production> )

TypeAnnotation :
  : (<anything-at-all-production>)

It does somewhat beg the question why typescript / flow-specific productions would be entertained at all in the Type production. If this is going to exist, why even bother with any of the type-specific grammars at all?

@ahejlsberg
Collaborator

ahejlsberg commented Mar 15, 2022

If we were to assume that the grammar was something like

That is pretty much what is being proposed in the grammar. If you look at the PrimaryType production you see that three of the choices are ParenthesizedType, SquareBracketedType, and CurlyBracketedType, the contents of which can be any sequence of tokens as long as all bracketed constructs are balanced (see the BracketedTokens production for how that is accomplished). The key idea with this definition is that we only need to teach ECMAScript parsers about the top level syntax of types, i.e. the type syntax that isn't bracketed, and that any new syntax a type checker adopts can be accessed just by putting it in parentheses.
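To illustrate the balancing rule, here is a minimal sketch. It is not the proposal's grammar or any engine's implementation; the helper name `skipBracketed` and the model of tokens as plain strings are assumptions made for the example.

```typescript
// Hypothetical helper: `tokens[start]` must be an opening bracket. Returns the
// index just past its matching close, or -1 if the brackets are unbalanced.
const CLOSER_FOR = new Map<string, string>([["(", ")"], ["[", "]"], ["{", "}"]]);
const CLOSERS = new Set<string>([")", "]", "}"]);

function skipBracketed(tokens: string[], start: number): number {
  if (start >= tokens.length || !CLOSER_FOR.has(tokens[start])) return -1;
  const expected: string[] = []; // stack of closers we still need to see
  for (let i = start; i < tokens.length; i++) {
    const tok = tokens[i];
    const closer = CLOSER_FOR.get(tok);
    if (closer !== undefined) {
      expected.push(closer);
    } else if (CLOSERS.has(tok)) {
      if (expected.pop() !== tok) return -1; // mismatched closer
      if (expected.length === 0) return i + 1; // outermost bracket closed
    }
    // Any other token is skipped without being interpreted.
  }
  return -1; // ran out of tokens before the bracket closed
}

// `( a | { b : [ ] } ) = 2` -> the bracketed type ends at index 10.
const balancedEnd = skipBracketed(
  ["(", "a", "|", "{", "b", ":", "[", "]", "}", ")", "=", "2"], 0);
```

The parser never needs to know what the tokens inside mean, only where the group ends.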

We could indeed go as far as saying all type annotations must be parenthesized, but I suspect most users would find that rather annoying--just as they find putting all type annotations in comments annoying.

@dpchamps
Contributor

dpchamps commented Mar 15, 2022

We could indeed go as far as saying all type annotations must be parenthesized, but I suspect most users would find that rather annoying--just as they find putting all type annotations in comments annoying.

Yes agreed that it would be inconvenient. But I think the differences between type annotations in comments and enclosing type annotations in brackets are separated by a greater degree than just minor annoyance. Do you agree?

Anyways -- just to be clear -- I'm not necessarily advocating for the reduction of the grammar to ParenthesizedType, SquareBracketedType, and CurlyBracketedType. But it does seem strange to me to support what I will call (perhaps a bit flippantly) a special case such as ConditionalType, and stop there.

I probably need to refine the problem statement of #103, but this is essentially what I wanted to discuss there:

it would be nice if we thought about accommodating a wider variety of existing types within the grammar right now. So as to support future implementations without the annoyance of needing to construct bracketed types.

Let's push this notion of "no type system specified." I'm optimistic we can arrive at the best of both worlds.

@ahejlsberg
Collaborator

ahejlsberg commented Mar 15, 2022

Do you agree?

I agree it would be less of an annoyance, but an annoyance nonetheless. function foo(s: (string), n: (number)): (boolean[]) just makes you wonder what's up with all the parentheses.

But it does seem strange to me to support what I will call (perhaps a bit flippantly) a special case such as ConditionalType, and stop there

The grammar doesn't stop there. In fact it includes the entire top-level TypeScript type grammar as it currently exists.

@jethrolarson

One thing I don't like about identifier: type is that : already has meaning in js so it's visually confusing when using TS. Particularly in conjunction with destructuring which iirc wasn't in JS when typescript was defining its syntax.

e.g.

const foo = ({bar: {baz}}: {bar: {baz: string}}) => ...
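A slightly expanded sketch of the same confusion, with made-up names, showing the colon doing two different jobs:

```typescript
const sampleConfig = { bar: { baz: "hello" } };

// In plain JavaScript destructuring, `bar: renamed` means
// "bind property `bar` to a local variable called `renamed`".
const { bar: renamed } = sampleConfig;

// In a typed parameter, everything before the top-level colon is the
// destructuring pattern and everything after it is the type.
const readBaz = ({ bar: { baz } }: { bar: { baz: string } }): string => baz;

// Easy mistake: this does not annotate `baz` as a string. It renames the
// property to a local variable that happens to be called `string`.
const { baz: string } = sampleConfig.bar;
```

All three colons inside the braces look the same at a glance, which is the readability issue.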

@theScottyJam
Author

@ahejlsberg - You don't necessarily have to require parentheses around all types to make it more type-engine agnostic. We could, for example, make it so after the : exactly one token will be consumed and ignored. If that token is an opening brace, then it'll continue consuming tokens until the closing brace is found. Thus, your example would look like this:

function foo(s: string, n: number): (boolean[])
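Here's a rough sketch of that rule, assuming the input is already split into tokens and that every ":" in it starts a type annotation (a real parser would have to know that from context). The function name `eraseAnnotations` and the token model are hypothetical.

```typescript
const OPEN_TO_CLOSE = new Map<string, string>([["(", ")"], ["[", "]"], ["{", "}"]]);

// Drops each ":" plus exactly one following token, or one whole bracketed
// group if that token is an opening bracket.
function eraseAnnotations(tokens: string[]): string[] {
  const kept: string[] = [];
  let i = 0;
  while (i < tokens.length) {
    if (tokens[i] !== ":") {
      kept.push(tokens[i]);
      i++;
      continue;
    }
    i++; // drop the ":" itself
    if (i >= tokens.length) break; // nothing follows the colon
    const closers: string[] = []; // closers still owed by the ignored group
    do {
      const tok = tokens[i++];
      const closer = OPEN_TO_CLOSE.get(tok);
      if (closer !== undefined) closers.push(closer);
      else if (tok === closers[closers.length - 1]) closers.pop();
    } while (closers.length > 0 && i < tokens.length);
  }
  return kept;
}

const erased = eraseAnnotations([
  "function", "foo", "(", "s", ":", "string", ",", "n", ":", "number", ")",
  ":", "(", "boolean", "[", "]", ")", "{", "}",
]).join(" ");
// erased is "function foo ( s , n ) { }"
```

This also shows why the return type needs parentheses under this rule: with a bare `: boolean[]`, only `boolean` would be ignored and the `[` `]` tokens would be left behind as JavaScript.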

Perhaps, this is where it might be good to also mix in a bit of the idea being proposed in #84, which is to allow people to place the type of a declaration before the declaration. i.e. allow users to avoid verbosity by writing their function types like this:

::(s: string, n: number) => boolean[]
function foo(s, n) { ... }

I think I'm going to take another stab at reformulating what I originally proposed in this thread, using these rules. I think, as this discussion has progressed, we're gradually converging on something that's more and more user-friendly while still being agnostic to a specific type-engine implementation.

@jethrolarson - I wholeheartedly agree here. Maybe at some point I might open up a ticket to discuss this point specifically - seeing if people would be willing to swap out the : token for another one. We'll see.

@dpchamps
Contributor

dpchamps commented Mar 15, 2022

@ahejlsberg

The grammar doesn't stop there. In fact it includes the entire top-level TypeScript type grammar as it currently exists.

Sorry, wasn't clear. I meant, "it seems odd to stop at the TypeScript type grammar as it currently exists." I was using ConditionalType as an example of this. I understand it goes wider in some areas, for example not stands out to me... though I believe this is already in the works with TS.

What I mean is: throughout the spec and issues, one of the goals is clear: "to be type system agnostic". There are other interesting types that other type systems offer which TypeScript does not currently provide that may require additional syntax. I think it's fine TypeScript doesn't provide them. But I'd like to help push the specification towards a grammar that is conveniently permissive of a wider set of types.

As it stands, we can either write TypeScript (or Flow, or maybe Hegel) conveniently, or something else inconveniently -- or as we've both agreed, annoyingly.

Does that make sense?

I would liken this idea to the idea here: #80 (comment), which I'd roughly summarize as "this proposal would have driven template literal types in the absence of their existence within the TypeScript type system."

@jethrolarson

@theScottyJam the problem I see with this

::(s: string, n: number) => boolean[]
function foo(s, n) { ... }

Even though I proposed it I think it's forcing the annotation consumers into a certain way of specifying types. Generic rules around start-and-end of the annotations which can appear anywhere in the source just like comments gives the consumers more flexibility which should better stand the test of time.

::then
::(you can do) const ::whatever foo ::you = ::like 1 ::(to do) 

@theScottyJam
Author

theScottyJam commented Mar 15, 2022

@jethrolarson - perhaps, let me briefly explain what I'm envisioning.

  1. We support token-level comments (the @ from my original post), via something like the backslash character.
  2. We support line/block-style comments (the @@ from my original post), via something like ::.
  3. We allow users to place a : after any binding. This acts as a token-level comment as well. This isn't really needed since we already have the \ character serving this purpose, but it does make the code look a little nicer, and less like backslash soup.

This means, if TypeScript chooses to support putting type-declarations on the line before, they can do so like this:

::(s: string, n: number) => boolean[]
function foo(s, n) { ... }

And it might be worthwhile for them to think about adding such a feature because 1. I'm selfish and I like it better 😄️, and 2. it could help with some of the verbosity of sometimes being required to use parentheses after the colon. But, no one is going to require them to do this, and they can get by just fine if they choose to continue relying on inline type-annotations, it'll just sometimes be a bit more verbose to use them.

The other rules I presented would still enable this:

\then
\(you can do) const \whatever foo \you = \like 1 \(to do) 
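A rough character-level sketch of how the backslash rule could erase that example. The function name is hypothetical, and it deliberately ignores strings, regular comments, and non-word tokens such as operators.

```typescript
// After a backslash: skip optional spaces, then skip either one bracketed
// group or one identifier-like word.
function stripBackslashComments(source: string): string {
  let out = "";
  let i = 0;
  while (i < source.length) {
    if (source[i] !== "\\") {
      out += source[i++];
      continue;
    }
    i++; // skip the backslash itself
    while (source[i] === " ") i++; // allow a space, as in `\ number`
    if (i >= source.length) break;
    if ("([{".includes(source[i])) {
      let depth = 0;
      do {
        if ("([{".includes(source[i])) depth++;
        else if (")]}".includes(source[i])) depth--;
        i++;
      } while (depth > 0 && i < source.length);
    } else {
      while (i < source.length && /[\w$]/.test(source[i])) i++;
    }
  }
  return out;
}

const backslashSample =
  "\\then\n\\(you can do) const \\whatever foo \\you = \\like 1 \\(to do)";
// Collapse leftover whitespace so the result is easy to read.
const backslashResult = stripBackslashComments(backslashSample)
  .split(/\s+/).filter(Boolean).join(" ");
// backslashResult is "const foo = 1"
```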

And you can also do this:

function fn(x: number, y: (number | string)) { ... }

::interface {
  x: number
}

return x \as\ number

// etc...
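And a similar sketch for the :: rule on its own: ignore to the end of the line, and keep ignoring across lines while a bracket opened inside the ignored text is still unclosed. Again this is only an illustration under simplifying assumptions (no handling of strings or regular comments, and `<` is not treated as a bracket); `stripDoubleColon` is a made-up name.

```typescript
function stripDoubleColon(source: string): string {
  let out = "";
  let i = 0;
  while (i < source.length) {
    if (!source.startsWith("::", i)) {
      out += source[i++];
      continue;
    }
    let depth = 0; // brackets opened inside the ignored region
    i += 2;
    while (i < source.length) {
      const ch = source[i];
      if (ch === "\n" && depth === 0) break; // ignored region ends here
      if ("([{".includes(ch)) depth++;
      else if (")]}".includes(ch)) depth = Math.max(0, depth - 1);
      i++;
    }
  }
  return out;
}

const doubleColonSample = [
  "::(s: string, n: number) => boolean[]",
  "function foo(s, n) { return []; }",
  "::interface {",
  "  x: number",
  "}",
  "const y = 1;",
].join("\n");

// Only the two plain JavaScript lines survive (plus blank lines).
const doubleColonResult = stripDoubleColon(doubleColonSample)
  .split("\n").filter((line) => line.trim() !== "");
```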

I'm just picking the double-colon and backslash to do those jobs, because that's what I had used in my earlier revision, and I think they look nice - relatively speaking. I'm also good with treating the double-colon as the token-level comment character instead, and finding something else to handle line/block-style comments. I know you mentioned earlier that this could already be done via ::(lots and \n lots of \n content) (assuming the :: is the token-level comment), which is certainly an option that I would be ok with. I just like the looks of these blocks of syntax without the extra nesting caused by the parentheses, which is why I'd prefer we additionally had a prefix to indicate this is a larger block and everything in it should all be ignored. But, either way - I think a proposal that only had :: in it as a token-level comment is already extremely powerful and may be able to get us very far.

@jethrolarson

jethrolarson commented Oct 11, 2022 via email

@matthew-dean

@theScottyJam So, I've been reading and thinking about the Stage 3 decorators proposal, and I came across the "annotation" syntax that TC39 themselves proposed for later exploration. I think with using the behavior and semantics of decorators, there's actually a path forward where annotations could be nearly anything, would not fundamentally change JavaScript syntax, and yet would / could provide any static-analysis type-checking that TypeScript / Flow / others might want.

Take a look! https://github.com/matthew-dean/proposal-annotations

@matthew-dean

matthew-dean commented Nov 29, 2022

@theScottyJam Here is your original example, re-written with Stage 1 of the above proposal:

let @'string' x;

@'boolean'
function equals(@'number' x, @'number' y) {
    return x === y;
}

let @{
    name: 'string',
    age: 'number'
} Person 

let @'boolean' CoolBool;

class MyClass {
  @'number' name;
}

function fn(@'number' @'string' value) { ... }

// optional params
function fn(@'?number' value) { ... }

// import/export types
export let @{ ... } Person

import { Person } from "schema";

import { Person, aValue } from "..";

// type assertions
const point = @{x: 'number', y: 'number' } JSON.parse(serializedPoint)

// Non-nullable assertions
(@'!' document.getElementById("entry")).innerText = "...";

// Generics
@'<T>'
function foo(@'T' x) { ... }

// "this" param
@['this', SomeType]
function sum(@'number' x, @'number' y) { ... }

// Ambient Declarations 
// omitted - up to the underlying type system

// Function overloading
// omitted but a type system could define it using some structure of annotation

// Class and Field Modifiers (These are easy to add as well, if wanted)
class MyClass {
  @'protected' @'readonly' @'number' x = 2;
}

// Allowing someone to "implement" an interface - just spitballing
@['implements', MyInterface]
class MyClass  { ... }
