Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_symbol_ is an |Identifier| ? #831

Closed
jmdyck opened this issue Mar 1, 2017 · 12 comments · Fixed by #1519
Closed

_symbol_ is an |Identifier| ? #831

jmdyck opened this issue Mar 1, 2017 · 12 comments · Fixed by #1519

Comments

@jmdyck
Copy link
Collaborator

jmdyck commented Mar 1, 2017

In four of the definitions for 'Contains', there's the step:

If _symbol_ is an |Identifier| and StringValue of _symbol_
 is the same value as the StringValue of |IdentifierName|, return *true*.

I'm struggling to understand what this means. To me, it appears to make a category error (twice), treating _symbol_ (which is a grammar symbol) as if it's a Parse Node.

Consider just the phrase _symbol_ is an |Identifier|. Can someone suggest an example in which that phrase is found to be true? I.e., where some actual rule (e.g. an early error rule) causes (eventually) an invocation of one of those four definitions of Contains, and control reaches that step, and _symbol_ is an |Identifier| is true, according to some interpretation?

@ljharb
Copy link
Member

ljharb commented Mar 21, 2018

|MemberExpression| Contains _symbol_ and |CallExpression| Contains _symbol_ have "IdentifierName" - is that a subset of "Identifier" such that this path would be followed?

@jmdyck
Copy link
Collaborator Author

jmdyck commented Mar 21, 2018

Sorry, I don't understand what you're saying. In particular, could you explain what you mean by have "IdentifierName"?

It might help to say that I'm not concerned about the StringValue of |IdentifierName| portion of the quoted step, but rather the stuff involving _symbol_.

@ljharb
Copy link
Member

ljharb commented Mar 21, 2018

I'm looking at the Contains algorithm, and the first and third sections recursively call into the Contains algorithm with a MemberExpression and a CallExpression, respectively.

However, since both sections return early if symbol is a reserved word, and since Identifier is "an IdentifierName that's not a reserved word", after step 2, "symbol" can't possibly be an Identifier but not an IdentifierName - so at the least, "Identifier" should be changed to "IdentifierName", by my reading?

Beyond that, I'm not sure about why step 3 is there. @allenwb @bterlson, any idea?

@allenwb
Copy link
Member

allenwb commented Mar 21, 2018

Beyond that, I'm not sure about why step 3 is there.

if the spec text said:

If someMemberExpression Contains the Identifier foo ...

and someMemberExpression is a parse node for: x.foo

then Contains needs to return true.

But if the call to Contains was

If someMemberExpression Contains super ...

and someMemberExpression is a parse node for: x.super

it needs to return false because in this context super is not being used as a keyword, it is just an ordinary IdentifierName

The first case is not currently used in spec. The second case definitely is. The reason that case is in the spec, even though it is never used, is that all definitions or Contain try to be complete about defining what it means for terminal or non-terminal grammar symbol to be used in a specific kind of parse node.

@ljharb
Copy link
Member

ljharb commented Mar 21, 2018

@allenwb since super is a reserved word, wouldn't step 2 cause it to return false?

Regarding completeness, that makes sense.

@allenwb
Copy link
Member

allenwb commented Mar 21, 2018

Yes, step 3 is for the other case. An the StringValue call is needed to make sure that IdentiferNames contain unicode escapes get normalized before the comparison.

@ljharb
Copy link
Member

ljharb commented Mar 21, 2018

Got it, thanks!

@jmdyck, this seems answered, so I'll close it - happy to reopen if not.

@ljharb ljharb closed this as completed Mar 21, 2018
@jmdyck
Copy link
Collaborator Author

jmdyck commented Mar 21, 2018

If _someMemberExpression_ Contains the Identifier foo ...
... is not currently used in spec.

Ah, okay, that explains a lot.

The problem with that hypothetical usage is, the definition of Contains says that _symbol_ is "a terminal or nonterminal of the grammar", and the phrase the |Identifier| foo isn't. (Certainly, |Identifier| is, but the |Identifier| foo is an instance of |Identifier|.)

The reason that case is in the spec, even though it is never used, is that all definitions of Contain try to be complete about defining what it means for terminal or non-terminal grammar symbol to be used in a specific kind of parse node.

To me, it's like an operation that says parameter _x_ is a String, and then has a step saying "if x is an Object": that's not completeness, that's inconsistency.

One way to resolve the inconsistency would be to delete the occurrences of step 3. Since the spec has none of the invocations of Contains for which that step is intended, there should be no effect.

Another way to resolve the inconsistency would be to re-word the definition(s) of Contains so that
... Contains the Identifier foo
(or something like it) would be a valid invocation. Given the lack of such invocations, this seems pointless, but this PR from @littledan suggests that it might be useful. (And see this comment from @anba about why it's currently problematic.)

@jmdyck jmdyck reopened this Mar 21, 2018
@allenwb
Copy link
Member

allenwb commented Mar 21, 2018

One way to resolve the inconsistency would be to delete the occurrences of step 3. Since the spec has none of the invocations of Contains for which that step is intended, there should be no effect.

I don't think that would be a good idea as Contains is recursive and some point somebody might define a rule that does Contains using some identifier name and there is a good chance they wouldn't know that they need to update some definitions of Contains to make it work.

IdentifierName is a lexical production that is used as a terminal symbol of the syntactic grammar and StringValue is explicitly defined for it. I think the meaning of StringValue of "IdentiferName foo" has a pretty clear meaning. If they said "Identifier foo" then it is pretty clear that they mean an Identifier that contains an IdentifierName whose StringValue is "foo".

You can try to tweak the language, but it remains to be seen in the end it will be clearer to a human reader.

@jmdyck
Copy link
Collaborator Author

jmdyck commented Mar 21, 2018

If they said "Identifier foo" then it is pretty clear that they mean an Identifier that contains an IdentifierName whose StringValue is "foo".

Well, I'm not sure that the application of StringValue goes without saying, but that's beside the point -- even if someone wrote
|X| Contains an |Identifier| that contains an |IdentifierName| whose StringValue is "foo"
the problem would still be there: Contains, as currently defined, doesn't support that usage.

@ljharb
Copy link
Member

ljharb commented Apr 26, 2019

@allenwb @jmdyck is there an actionable change to make here, or should this be closed?

@jmdyck
Copy link
Collaborator Author

jmdyck commented Apr 26, 2019

Yeah, I'll submit a PR.

jmdyck added a commit to jmdyck/ecma262 that referenced this issue Apr 26, 2019
... specifically, steps of the form:
  If _symbol_ is an |Identifier| and StringValue of _symbol_
  is the same value as the StringValue of |IdentifierName|,
  return *true*.

Resolves issue tc39#831.

As @allenwb points out, the intended use case for such steps
does not occur in the spec.

And as I point out, these steps are semantically problematic.

If the intended use case ever *does* occur,
I'm guessing we'll find a different way to handle it.
ljharb pushed a commit to jmdyck/ecma262 that referenced this issue May 29, 2020
... specifically, steps of the form:
  If _symbol_ is an |Identifier| and StringValue of _symbol_
  is the same value as the StringValue of |IdentifierName|,
  return *true*.

Resolves issue tc39#831.

As @allenwb points out, the intended use case for such steps
does not occur in the spec.

And as I point out, these steps are semantically problematic.

If the intended use case ever *does* occur,
I'm guessing we'll find a different way to handle it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants