Un-inline the enums in the CST #698
- VariableDeclarationType (Rule): # 157..166 "\t\tuint256"
  - TypeName (Rule): # 157..166 "\t\tuint256"
    - ElementaryType (Rule): # 157..166 "\t\tuint256"
      - UintKeyword (Token): "uint256" # 159..166
One idea we discussed before, when we talked about this, is flattening the tree structure of the tests a little to make it more readable/usable. So, for the example below:
- Instead of putting each node on a separate line, we can group rules that have only one child (that is also a rule) into the same line, as they will have the exact same range and preview comment.
- Remove the `(Rule)` and `(Token)` suffixes, since we already make the distinction through the YAML value/LHS on the same line.
Just putting the idea out here. Definitely not blocking for this PR, as we should probably do it in a subsequent PR to make it easier to review.
Before:
- VariableDeclarationType (Rule): # 157..166 "\t\tuint256"
  - TypeName (Rule): # 157..166 "\t\tuint256"
    - ElementaryType (Rule): # 157..166 "\t\tuint256"
      - UintKeyword (Token): "uint256" # 159..166

After:
- VariableDeclarationType > TypeName > ElementaryType: # 157..166 "\t\tuint256"
  - UintKeyword: "uint256" # 159..166
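The flattening idea above can be sketched roughly as follows. This is a minimal illustration only: the `Node` type and the `render` function are hypothetical stand-ins, not the project's actual CST types or snapshot writer, and the range/preview comments are left out for brevity. A chain of rules is joined with ` > ` for as long as each rule has exactly one child that is itself a rule.

```rust
/// Hypothetical, simplified CST node (an assumption for this sketch,
/// not the project's real node type).
#[derive(Debug, Clone)]
pub enum Node {
    Rule { kind: String, children: Vec<Node> },
    Token { kind: String, text: String },
}

/// Renders `node` into `out`, one line per entry.
/// Rules that have exactly one child, which is also a rule, are merged
/// into the same line as their parent, separated by " > ".
pub fn render(node: &Node, depth: usize, out: &mut Vec<String>) {
    let indent = "  ".repeat(depth);
    match node {
        Node::Token { kind, text } => {
            // Tokens are printed with their quoted text as the value.
            out.push(format!("{indent}- {kind}: {text:?}"));
        }
        Node::Rule { kind, children } => {
            let mut names = vec![kind.as_str()];
            let mut current = children;
            // Follow the chain while the only child is another rule.
            while let [Node::Rule { kind, children }] = current.as_slice() {
                names.push(kind.as_str());
                current = children;
            }
            out.push(format!("{indent}- {}:", names.join(" > ")));
            // Whatever remains (tokens, or several children) is nested below.
            for child in current {
                render(child, depth + 1, out);
            }
        }
    }
}
```

Calling `render` on the `VariableDeclarationType` tree from the example would produce one line for the three nested rules and one indented line for the `UintKeyword` token.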
I'll add a TODO for myself.
@@ -322,8 +322,7 @@ fn resolve_grammar_element(ident: &Identifier, ctx: &mut ResolveCtx) -> GrammarE
 let thunk = Rc::new(NamedParserThunk {
     name: ident.to_string().leak(),
     context: lex_ctx,
-    // Enums have a single reference per variant, so they should be inlined.
-    is_inline: matches!(elem.as_ref(), Item::Enum { .. }),
+    is_inline: false,
I would like to get @AntonyBlakey's eyes on this one. Some of these nodes look very useful, for example `ContractMember` and `ConstructorAttribute`, as they make it easier to find/match on these elements.
However, some of them do look extraneous, and are only there for the purpose of authoring the grammar/versioning. For example:
- `ElementaryType` and `YulLiteral`, which will (almost) always have a unique parent that can be matched against. Maybe we can refactor the grammar a bit to make this more accurate?
- `TypedTupleMember` and `UntypedTupleMember`, which only exist to make parsing/backtracking correct, but provide no additional meaning. Not sure how to avoid it.
I’m trying to avoid adding optional inlining to the DSL unless we absolutely need it. Given our new design decisions and the AST structure, it would make it less obvious how to go from grammar to CST/AST, and add another layer of complexity that users have to deal with.
Without inlining, people can easily depend on the fact that any NonTerminal node is convertible to its matching AST type, and vice versa. If we start to have some inlined enums, it won’t be obvious which CST nodes can be converted to AST types directly. Conversely, it will be confusing if some AST types start returning a root node that has a different NonTerminalKind.
So, if we are happy with these changes, I’m fine with merging the PR as-is for now, and I can manually go over the enums to see if any of them can be better structured. For example, inlining something like the `ElementaryType` variants into the types that reference it, since it is almost always used inside another Enum. It will probably be more nuanced than that though.
What do you think?
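The convertibility argument in the comment above can be illustrated with a small sketch. Everything here is an assumption made for illustration: `NonTerminalKind` is reduced to two variants, and `CstNode` and `ElementaryTypeAst` are hypothetical types, not the project's real API. The point is only that when every grammar item produces its own node kind, the CST-to-AST check is a single kind comparison.

```rust
use std::convert::TryFrom;

/// Hypothetical subset of node kinds, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NonTerminalKind {
    TypeName,
    ElementaryType,
}

/// Hypothetical CST node carrying its kind and children.
#[derive(Debug)]
pub struct CstNode {
    pub kind: NonTerminalKind,
    pub children: Vec<CstNode>,
}

/// Hypothetical AST wrapper that is valid only over an `ElementaryType` node.
#[derive(Debug)]
pub struct ElementaryTypeAst<'a>(pub &'a CstNode);

impl<'a> TryFrom<&'a CstNode> for ElementaryTypeAst<'a> {
    /// On failure, report the kind that was actually found.
    type Error = NonTerminalKind;

    fn try_from(node: &'a CstNode) -> Result<Self, Self::Error> {
        // With no inlining, the node kind alone decides convertibility.
        if node.kind == NonTerminalKind::ElementaryType {
            Ok(ElementaryTypeAst(node))
        } else {
            Err(node.kind)
        }
    }
}
```

If `ElementaryType` were inlined, no node of that kind would exist in the tree, and a conversion like this one would need extra knowledge about which parents may contain the inlined variants.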
Enums that are artefacts of the grammar machinations are not of interest to the CST or the AST. I think we should find a better way to achieve whatever they accomplish. I am concerned about the 'almost' comment, though, because it seems to imply that the constraint is not a logical necessity, and so it must be surfaced as a parent + child.
Could we make the non-interesting enum into something else in the grammar, i.e. effectively add a non-outlined characteristic by using a different name for the concept? It sure sounds like it has a different purpose.
f2f: we think the PR is good enough for now, and let's review the added kinds later to see if anything can be pruned/removed.
Force-pushed from 988c2b1 to 5329d30.
Rebased and force-pushed; this is still just c4c2f4a + re-generated files and adjusted test.
Context: #698 (comment)
- combined parents with a single child on the same line
- using the `꞉` unicode character instead of colon `:` to separate node name and kind, in order not to break YAML parsing/formatting.
- surround entire nodes with parentheses instead of just the kind, to make it easier to read.
- include whitespace in the snapshots, since they now take less visual space, and it will make it easier to spot changes to trivia during development.
Part of #638
As we discussed, inlining some node types causes us to lose information that is otherwise hard/inconvenient to reconstruct, and this change hopefully increases the usability of the CST on its own.
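A rough sketch of the information loss described above, under the assumption that an inlined item splices its children directly into the parent instead of producing a node of its own kind. The `Cst` type and both helper functions are hypothetical and exist only for this illustration; they are not the project's parser code.

```rust
/// Hypothetical, simplified CST used only for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum Cst {
    Rule { kind: &'static str, children: Vec<Cst> },
    Token { kind: &'static str },
}

/// Hypothetical helper: finishes parsing an item named `kind`.
/// When `is_inline` is set, the children are handed to the parent as-is;
/// otherwise they are wrapped in a rule node that records `kind`.
pub fn finish_rule(kind: &'static str, is_inline: bool, children: Vec<Cst>) -> Vec<Cst> {
    if is_inline {
        children
    } else {
        vec![Cst::Rule { kind, children }]
    }
}

/// Returns true if any rule node in `nodes` (at any depth) has kind `wanted`.
pub fn contains_rule(nodes: &[Cst], wanted: &str) -> bool {
    nodes.iter().any(|node| match node {
        Cst::Rule { kind, children } => *kind == wanted || contains_rule(children, wanted),
        Cst::Token { .. } => false,
    })
}
```

With `is_inline` set, a consumer searching the result for an `ElementaryType` rule finds nothing and has to infer it from the tokens; with `is_inline: false`, as in this PR, the kind is present in the tree and can be matched directly.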