Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Editorial: Remove unused steps from definitions of Contains #1519

Merged
merged 2 commits into from
May 29, 2020

Conversation

jmdyck
Copy link
Collaborator

@jmdyck jmdyck commented Apr 26, 2019

... specifically, steps of the form:

    If _symbol_ is an |Identifier| and StringValue of _symbol_
    is the same value as the StringValue of |IdentifierName|,
    return *true*.

Resolves #831.

As @allenwb points out, the intended use case for such steps does not occur in the spec.

And as I point out, these steps are semantically problematic.

If the intended use case ever does occur, I suspect we'll find a different way to handle it.

@zenparsing
Copy link
Member

If the intended use case ever does occur, I suspect we'll find a different way to handle it.

Such as introducing a new static semantic rule like "ContainsIdentifier" or somesuch?

@jmdyck
Copy link
Collaborator Author

jmdyck commented May 2, 2019

Such as introducing a new static semantic rule like "ContainsIdentifier" or somesuch?

Yup, e.g. ContainsArguments in the Public and private instance fields proposal.

Or we might try to introduce a form that abstracts over this kind of ContainsFoo operation, so that defining a new one isn't too burdensome.

@waldemarhorwat
Copy link

Is Contains ever used with identifiers?

If not, then it might be better to do the following:

  • Delete these overrides of Contains
  • Add an override of Contains on IdentifierName that always returns false so that IdentifierName doesn't pick up reserved words in any context.
  • Clarify that Contains on reserved words not in IdentifierName does match

It's currently unclear where Contains stops. Does it jump from the syntactic grammar to the lexical grammar? I hope it doesn't, but the spec is ambiguous. While doing the above it would be good to clarify how Contains deals with that boundary (and on which side of that boundary reserved words and IdentifierName lie) in the definition of Contains.

@jmdyck
Copy link
Collaborator Author

jmdyck commented Jul 23, 2019

Is Contains ever used with identifiers?

The right-hand argument of Contains is never |Identifier|, if that's what you mean.

Here are all the right-hand args:

`super`
|AwaitExpression|
|NewTarget|
|NotEscapeSequence|
|ScriptBody|
|SuperCall|
|SuperProperty|
|YieldExpression|

If not, then it might be better to do the following:

  • Delete these overrides of Contains

  • Add an override of Contains on IdentifierName that always returns false so that IdentifierName doesn't pick up reserved words in any context.

Yeah, I think that would work (assuming the syntactic/lexical boundary isn't a problem, see below).

* Clarify that Contains on reserved words not in IdentifierName does match

What do you mean by "reserved words not in IdentifierName"? It seems to me that there's no such thing. (That is, any sequence of characters that matches ReservedWord also matches IdentifierName.)

It's currently unclear where Contains stops. Does it jump from the syntactic grammar to the lexical grammar? I hope it doesn't, but the spec is ambiguous.

I believe it doesn't (currently) matter. That is, for the cases in which it is currently used, Contains will deliver the same result whether you think it jumps or not. (Mind you, I think there's a fair chance that a new use could break that without anyone noticing, so it's kind of fragile.)

While doing the above it would be good to clarify how Contains deals with that boundary (and on which side of that boundary reserved words and IdentifierName lie) in the definition of Contains.

It seems to me that the spec prefers to avoid talking about that boundary. Which might be one reason that these rules stop the recursion one step away from it.

@syg syg self-requested a review October 17, 2019 00:27
@ljharb ljharb requested review from bakkot and removed request for a team and zenparsing November 9, 2019 04:09
@ljharb ljharb self-assigned this Feb 13, 2020
@ljharb ljharb requested review from a team and removed request for ljharb May 29, 2020 06:52
Copy link
Contributor

@syg syg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing the unused definitions lgtm. Though @jmdyck, what were your conclusions re: Waldemar's suggestions, that having them in the spec won't really help readability?

@bakkot
Copy link
Contributor

bakkot commented May 29, 2020

Having just read some about this, two relevant details:

The syntactic grammar "has ECMAScript tokens defined by the lexical grammar as its terminal symbols".

Contains is defined to recurse on the nonterminal symbols on the RHS.

In particular, per the first point IdentifierName is a terminal symbol of the syntactic grammar, not a nonterminal. Therefore, per the second point, Contains would not recurse into it.

Strictly speaking this means that we could remove the four productions currently touched by this PR without adding any new ones, since all terminals effectively have a default definition of "return false". However, I think this would be more confusing than the current PR.


In other words, my reading is that the answer to @waldemarhorwat's question

It's currently unclear where Contains stops. Does it jump from the syntactic grammar to the lexical grammar? I hope it doesn't, but the spec is ambiguous.

is that it unambiguously does not jump to the lexical grammar (assuming my reading of the above two things are correct). And I don't really want to mess with that, so my preference is to stick with the current PR.

jmdyck added 2 commits May 29, 2020 11:15
... specifically, steps of the form:
  If _symbol_ is an |Identifier| and StringValue of _symbol_
  is the same value as the StringValue of |IdentifierName|,
  return *true*.

Resolves issue tc39#831.

As @allenwb points out, the intended use case for such steps
does not occur in the spec.

And as I point out, these steps are semantically problematic.

If the intended use case ever *does* occur,
I'm guessing we'll find a different way to handle it.
... after previous commit. Specifically, when code says:
```
1. If <something>, return *false*.
1. Return *false*.
```
we can drop the first step, because the result is `*false*` either way.
@ljharb ljharb merged commit 9e48030 into tc39:master May 29, 2020
@jmdyck
Copy link
Collaborator Author

jmdyck commented May 29, 2020

In particular, per the first point IdentifierName is a terminal symbol of the syntactic grammar, not a nonterminal. Therefore, per the second point, Contains would not recurse into it.

Except that it's definitely a nonterminal of the lexical grammar. So a given IdentifierName would seem to satisfy the condition in step 1.b ("child is an instance of a nonterminal"), suggesting that Contains would indeed recurse on it. However, the recursive invocation would in practice violate the 'assertion' that symbol "is a terminal or nonterminal of the grammar that includes the associated production".

So it's unclear, and I think it would be a spec bug if we ever allowed it to happen (under the current formulation).

(@syg, does that also answer your question?)

Strictly speaking this means that we could remove the four productions currently touched by this PR without adding any new ones, since all terminals effectively have a default definition of "return false".

(Other than when they actually match the symbol being searched for.)

However, I think this would be more confusing than the current PR.

Yeah. (It might be possible to make that approach clearer, but that's not currently on offer!)

@bakkot
Copy link
Contributor

bakkot commented May 29, 2020

Except that it's definitely a nonterminal of the lexical grammar. So a given IdentifierName would seem to satisfy the condition in step 1.b ("child is an instance of a nonterminal")

Ah, I suppose that's fair. I read it to mean "a nonterminal of the grammar of which this production is a part", since one does not generally talk about nonterminals except in relation to a particular grammar.

@jmdyck jmdyck deleted the resolve_831 branch May 29, 2020 21:28
jmdyck added a commit to jmdyck/ecma262 that referenced this pull request May 29, 2020
(These steps come from PR tc39#1646, which went through its whole
life-cycle while tc39#1519 was dormant. Properly, after tc39#1646 was merged,
I should have rebased tc39#1519 and included these changes there.)
@bakkot bakkot mentioned this pull request May 29, 2020
jmdyck added a commit to jmdyck/ecma262 that referenced this pull request Jun 4, 2020
(These steps come from PR tc39#1646, which went through its whole
life-cycle while tc39#1519 was dormant. Properly, after tc39#1646 was merged,
I should have rebased tc39#1519 and included these changes there.)
ljharb pushed a commit to jmdyck/ecma262 that referenced this pull request Jun 15, 2020
)

These steps come from PR tc39#1646, which went through its whole life-cycle while tc39#1519 was dormant. Properly, after tc39#1646 was merged, I should have rebased tc39#1519 and included these changes there.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

_symbol_ is an |Identifier| ?
7 participants