Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improvements to protoparse's handling of message literals #483

Merged
merged 4 commits into from
Feb 3, 2022

Conversation

jhump
Copy link
Owner

@jhump jhump commented Feb 3, 2022

When a custom option's type is a message and the value is defined using a message literal, protoc parses the content of the message literal as the protobuf text format.

The protobuf text format has some quirks that were not previously handled by protoparse:

  1. Unlike other places that refer to symbols, extension names in the text format may not begin with a dot to indicate they are fully qualified.
  2. Boolean values can be specified using "t", "f", "True", or "False", in addition to the normal keywords which are all lower-case ("true" and "false").

Also, protoparse was not previously handling the case where an enum is defined with enum values whose names conflict with certain keywords ("true", "false", "inf", "nan"). The parser would see these identifiers and eagerly convert to bool or float values and then complain about their use with the enum. Instead, parsing should always represent these identifiers using identifier nodes in the AST (not bool or float literal nodes), and they can then be converted to bools/floats as needed. That way, if they need to instead be interpreted as enum value names, they can be.

jhump added 4 commits February 3, 2022 00:03
1. bool option and value is just "t", "f", "True", "False" (supported by protoc, not previously supported by protoparse)
2. enum option and enum value name is "true", "false", "t", "f", "True", "False", "inf", or "nan"
…es not allow fully-qualified names starting with a dot
@jhump
Copy link
Owner Author

jhump commented Feb 3, 2022

I suspect protoparse's resolution of extension names that inside message literals may be more lenient than protoc. Protoparse is likely resolving them the same way it resolves other symbols (which means it uses the scope of where the reference is in the source to support partially qualified or unqualified references). But I think protoc always requires the fully-qualified name unless the unqualified name is unambiguous, in which case it is allowed.

TODO: test this more...

@jhump
Copy link
Owner Author

jhump commented Feb 3, 2022

TODO: test this more...

Indeed, there are some fun discrepancies. I have fixes in another branch: see #484.

@jhump jhump merged commit d4949d2 into master Feb 3, 2022
@jhump jhump deleted the jh/protoparse-text-format branch February 3, 2022 23:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant