-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Go Target - Visit implementation - two other minor changes runtimeimport tokenstream #1841
base: master
Are you sure you want to change the base?
Conversation
…rs to use interfaces.
…interfaces instead of ptr to structs.
…ode in base listener and visitors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not used to doing code reviews on GitHub as much; my apologies if I come across as curt in places. I'm more used to code reviews at work, where there's co-worker context. You did some very interesting things, and I'm glad to have the opportunity to look it over.
I feel like this pull request tries to do too many different things, and at the same time doesn't do enough for the code that's trying to implement a Visitor. Losing return values from visit nodes and not being able to implement a subset of the Visitor are pretty serious losses.
I think those are more important than how Go-like the generated code is, but I'd love to see a version that addresses those problems, and preserves the idiomatic Go nature of your code.
runtime/Go/antlr/tree.go
Outdated
@@ -28,7 +28,7 @@ type SyntaxTree interface { | |||
type ParseTree interface { | |||
SyntaxTree | |||
|
|||
Accept(Visitor ParseTreeVisitor) interface{} | |||
Accept(visitor ParseTreeVisitor) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that Accept needs to be able to return a value, in order to make recursive expression evaluation straightforward. Something akin to:
op := ctx.GetOp().GetText()
left := ctx.GetA().Accept(v)
right := ctx.GetB().Accept(v)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Accept should only get called by the generate code, not by visitor implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to look into accept a bit more. It gets called to kick off the visitation, but shouldn't get called inside the visitor (afaik). Doing so a receipt for confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My approach came out of creating an expression evaluator which had no mutable state of its own, which allows .Accept(v)
to work fine. If your visitor type has to carry changing state with it, then yes, you can't really call back into it as the state will get confused. But you shouldn't HAVE to do that, and I didn't think your implementation forced that.
I also do this in a standard Java ANTLR-generated vistor, which I know shouldn't mean that's relevant for Go, but I mean it more as a point that it might be expected by ANTLR users.
If you imagine something which implements a simple expression parser:
expr
: '(' expr ')' #ParenExpr
| a=expr op=(ADD|SUB) b=expr #AddSubExpr
| literal #LiteralExpr
;
literal
: lit=INT #IntLiteral
;
SUB : '-' ;
ADD : '+' ;
INT : [0-9]+ ;
An obvious implementation of VisitIntLiteral(...)
is
result, err := strconv.ParseInt(ctx.GetText(), 10, 64)
if err == nil {
return result
}
return nil
That lets your AddSubExpr
visitor do something like what I said at the top, and then (after a type check) add the numbers and return them. Now you have a recursive visitor that evaluates expressions. Essentially an interpreter, using the Visitor pattern to execute it.
(The one I'm working with in my current project is more complicated, but that's all extracted from pieces of it that are emblematic of the whole. The interpreter and surrounding code is written in Java and Go, but both use the same grammar, and the same aspects of ANTLR.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting.
I know this isn't idiomatic Antlr4 (overuse of idiomatic ;-) ), I still use returns and actions in my expression eval. See Eval using returns below. Doing an expression eval using Listeners can be quite confusing, precisely because it is difficult to maintain state in the call stack.
It would be nice if arguments and returns could be fully supported in the Visitors and potentially also in the Listeners.
I currently have grammars that look like -
start : a[false];
a[bool recursive] returns[Atom result]
: b[recursive] { $result = ... }
| ...
;
b[bool recursive] returns[Atom result]
: .. a[false] ...
| ...
;
It would be interesting to see how this might be achieved in a visitor. It is probably easier to prototype a grammar with actions, args and returns. It would be nice to be able to move to visitors or listeners once a grammars is more mature.
It would be good to have a couple of different implementations of a simple expression eval as examples. Interested in adding to some to a repository? Maybe in https://github.com/wxio/antlr4-go-examples .
Eval using returns
@members {
type Atom interface { ... }
func parseInt(s string) Atom { ... }
func binaryOp(op string, a, b Atom) Atom { ... }
}
expr returns[Atom atom]
: '(' e=expr ')' { $atom = $e.atom }
| a=expr op=(ADD|SUB) b=expr { $atom = binaryOp($op.text, $a.atom, $b.atom) }
| l=literal { $atom = $l.atom }
;
literal returns[Atom atom]
: lit=INT {$atom = parseInt($lit.text) }
;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I've put up https://github.com/cyberfox/expressions for now, and I'd be happy to mix it into a catalog of examples. I did integrate returns and actions for a bit, while using the visitor at the same time. (My QuotedLiteral sliced the quotes off in the grammar, but I realized it was easier and less prone to weird compile errors to just write it as a visitor.)
For what it's worth (as I bet you guessed), I first built a grammar for Java, because that's what I was targeting for work. I started stripping out actions as I improved my Java visitors, and eventually I was left with only a few and a bunch of returns, when...
I realized there was a Go target, and I've had a goal to improve my Go skills, and so I started making it work for both languages. Doing that from one grammar forced me to strip out all the actions and returns. It also forced me to make the Visitor ANTLR code work. :) So 'prototype a grammar with actions, args, and returns' and then moving to visitors afterwards was pretty much exactly the process I used.
<if(header)> | ||
|
||
<header> | ||
<endif> | ||
|
||
// A complete Visitor for a parse tree produced by <file.parserName>. | ||
type <file.grammarName>Visitor interface { | ||
antlr.ParseTreeVisitor | ||
antlr.ParseTreeVisitor | ||
<file.visitorNames:{lname | <lname; format="cap">ContextVisitor}; separator="\n"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this make it so that anything that wants to implement the Visitor interface has to implement ALL the endpoints, even if they're not used?
Avoiding that requirement was a high-priority goal of my change, as it allowed a Visitor to be incrementally implemented, and there will be endpoints that don't get used during processing, and so shouldn't be required in the implementation.
The place where this matters a LOT is when you have multiple visitor implementations, running at different times.
Let's say you have a language with a name and an expression.
You have one visitor implementation that just collects names and ctx values for the expression associated with that name.
At execution time, a method is looked up by name, and the ctx that was stored associated with it is then passed an expression-evaluator visitor, which is an entirely different subtype of the Visitor.
If you have to implement ALL of the visitor methods in EVERY subtype, then each different visitor has a bunch of useless and never-called code, and that unnecessary boilerplate is not confined to the generated code, it has to live in the ANTLR client's code.
Can you find a way to do this without requiring that every visitor implementation implements every visit node target?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The complete visitor interface is actually not needed at all. I only left it in as a convenience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's my bad, then. I couldn't see how it was implementing a partial interface. Sorry about that. I'll try implementing a simple labeled expression evaluator in it, to see how it goes.
@@ -746,8 +798,9 @@ InvokeRule(r, argExprsChunks) ::= << | |||
var _x = p.<r.name; format="cap">(<argExprsChunks>) | |||
<endif> | |||
|
|||
<r.labels:{l | <labelRefStuff(l,"_x")> }; separator="\n"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check that you're using tabs in this file; since this gets generated as .go code, I think it matters...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is only styling. go fmt was fix it. it only matter if there are style checkers as part of commit code to a repo and you commit generated code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does go fmt
run on code that's auto-generated by go generate
?
Anyway, right now I'm committing the generated code because I didn't know about go generate
. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but you can always do
//go:generate go fmt
|
||
RuleContextDecl(r) ::= "<r.name> I<r.ctxName> " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much of what's below this seems like formatting changes that are unrelated?
case <parser.grammarName>Visitor: | ||
return t.Visit<struct.derivedFromName; format="cap">(s) | ||
|
||
func (s *<struct.name>) Accept(delegate antlr.ParseTreeVisitor) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really feel that returning a value from Accept is important...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please provide an example of using the returned interface{}.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did above; I'll see if I can put together a more complete example to share.
@@ -113,3 +113,178 @@ The output is: | |||
"a":1 | |||
1 | |||
``` | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also change the note up above to say that the visitor example is below.
@@ -131,16 +248,39 @@ package <file.genPackage> // <file.grammarName> | |||
package parser // <file.grammarName> | |||
<endif> | |||
|
|||
import "github.com/antlr/antlr4/runtime/Go/antlr" | |||
// See base listener file for example implementations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/listener/visitor
But isn't this the base visitor file that you want to point them to?
I feel a little sketchy generating a commented-out file; it shouldn't be necessary, and the worst case is if they go in and uncomment it out, and start writing code in it. The next time code generation gets run, it'll get stomped on, which is unexpected.
Needing this file may also be a side-effect of having to implement the entirety of the interface in order to use it at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not generation a base visitor. The code generated is a template visitor that can be copied uncommented and used as a stating point.
@@ -26,10 +26,18 @@ | |||
public String exportMacro; // from -DexportMacro cmd-line | |||
public String grammarName; | |||
public String parserName; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This (and below) is unrelated changes, right?
@@ -101,10 +94,10 @@ type BaseParseTreeListener struct{} | |||
|
|||
var _ ParseTreeListener = &BaseParseTreeListener{} | |||
|
|||
func (l *BaseParseTreeListener) VisitTerminal(node TerminalNode) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the impetus to change the listener implementation, but I think that a more focused pull request (just adding support for Visitor) would be better here. Especially if the Listener change is going to be a breaking one.
The Visitor never worked, so implementing it should be straightforward, but breaking the way Listener worked (and the unrelated changes) seem like they'd be better as a separate pull request.
See my response in #1843 (comment) |
…gateResult. This is so that visitor only needs to implement subset.
…ins commented out example code. Make writing listeners and visitors easier as boiler plate code is generated.
…ntimeimport-tokenstream
I was finding implementing small visitors a bit cumbersome as Go doesn't allow nested method definitions, but does allow nested anonymous function definitions . So I added a VisitFunc method that takes a struct of functions. eg
|
I've been using my fork for a while now; I'm happy to use yours if it becomes the standard way, although I haven't tried changing my stuff over wholesale to it to see if it has holes. How's the Visit implementation going? Have you used it in anger to implement an interpreter? Is there even anybody who can reasonably code review it? It seems like a kinda specialized combination of knowledge, unfortunately. |
I'm willing to review (and even implement a parser to do that). Do we have agreement about which implementation we're going to go with? (I'm good with either -- I'm eager to have anything usable in the main branch at this point.) |
Hey @cyberfox and @davesisson, I used the https://github.com/wxio/antlr4/commits/visit-runtimeimport-tokenstream implementation pretty extensively to implement a basic spreadsheet engine. There are still a few usability issues. The biggest of which is that the visitor & listener should generated code should use pointers to structs, as this is what most people expect. Generating code that uses interfaces should be an option. I haven't used https://github.com/cyberfox/antlr4 since April and there are a number of feature is this fork that would stop me switching. If it help evaluate things, I'll push the latest version to I should also post how to go from a full antlr4 repo to a Go specific one. Basically it's just a git prune. Let me know if you would find this useful. Cheers |
I'll try https://github.com/wxio/antlr4-go out then. It's pretty much already in the form I need -- I end up transplanting the runtime into my build system. |
@davesisson please use https://github.com/wxio/antlr4 the pruned repo is out of date. |
Sounds good to me; is https://github.com/millergarym/expressions up-to-date with the current version of the Go visitor changes? I find it a really helpful example to wrap my head around how things are working, with most of the features I use. |
@cyberfox no, but I'll try make some time to bring it up to date. |
@millergarym I've updated cyberfox/expressions to match the new code. I also converted my own project, which was...complicated, but ultimately doable. I don't necessarily like the long method declarations, but I get it. It'd be good to get this merged... |
This change didn't work for me - the generated .go files didn't compile: Grammar attached - as .txt because of github. |
Reviewed 1 of 3 files at r4, 16 of 19 files at r5, 3 of 3 files at r6. Comments from Reviewable |
We tried https://github.com/wxio/antlr4/commits/visit-runtimeimport-tokenstream here as well and it meets our needs. Could we convert that into a pull request (or update this one) and go with that approach? |
@savvopoulos Not sure if you already dealt with this, but the This is how I tried it (presuming
The idea, as I understand it, is that you should check in the generated code, so that not everybody who checks it out needs to rebuild the antlr jar files, but I included the process to do that above, since it's the steps I went through to test it. @davesisson This pull request is the pull request from It'd be really good if we could get this pull request merged; because |
@parrt I think we want to go with this implementation. Who do we need to agree that this is the right way to go? |
@davesisson I think you were the primary Go guy ,right? can't remember. if so, I will merge |
Ah, the main Go guy is @pboyer. I was doing C++ but I've been helping out in Go lately. :) |
@davesisson @parrt Thanks for pinging me. I'm on it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@millergarym Thank you so much for this PR and your continued contributions. I've been a bad maintainer as of late, and should've responded to this much sooner.
It's not accurate to say that the Visitor
pattern is not supported in the Go runtime. The issue you're running into is the absence of a good implementation of VisitChildren
in *BaseParseTreeVisitor
. The current implementation provides only a dumb return nil
body. Anyways, in order to use the Visitor
pattern in Go, you really need to handle each part of your parse tree yourself. I do this myself today for non-trivial grammars. I recognize this as a deficiency, but it's not quite amended by this PR.
We could implement VisitChildren
in a way that matches the Java implementation: https://github.com/antlr/antlr4/blob/46b3aa98cc8d8b6908c2cabb64a9587b6b973e6c/runtime/Java/src/org/antlr/v4/runtime/tree/AbstractParseTreeVisitor.java
There is no reason we couldn't follow this same pattern in the Go implementation for alignment (and frankly so we don't need to reinvent it). I don't think we can accept this PR in its current state for this reason, but also because it mixes in a number of other changes that I'm not sure I understand.
If you want to change this PR, I'd recommend breaking it into a smaller PR with VisitChildren
that follows the same pattern as Java's AbstractParseTreeVisitor
. This would be a breaking change for existing code in only a rare case - where an implementor has purposely depended on the empty implementation of VisitChildren
.
Otherwise, I'm getting the signal here that I should both implement it myself and provide better docs. It is certainly possible to use the Visitor
pattern, but not nearly as convenient. CC @cyberfox
@@ -38,6 +39,18 @@ | |||
// built from it. Avoids adding args to StructDecl template | |||
public OrderedHashSet<Decl> tokenDecls = new OrderedHashSet<Decl>(); | |||
public OrderedHashSet<Decl> tokenTypeDecls = new OrderedHashSet<Decl>(); | |||
// public OrderedHashSet<Decl> gettersDecls = new OrderedHashSet<Decl>(); | |||
// public OrderedHashSet<Decl> rulesDecls = new OrderedHashSet<Decl>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete these commented out lines. History is for a revision control system, not the code.
public OrderedHashSet<Decl> tokenGetterDecl = new OrderedHashSet<Decl>(); | ||
|
||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the superfluous white space.
@@ -28,7 +30,7 @@ type SyntaxTree interface { | |||
type ParseTree interface { | |||
SyntaxTree | |||
|
|||
Accept(Visitor ParseTreeVisitor) interface{} | |||
Visit(visitor ParseTreeVisitor, args ...interface{}) interface{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is canonically called Accept
. I don't understand why you'd want to change it.
Can you explain the reason for introducing additional arguments to this function? In this case, additional state can simply be stored within the Visitor
implementation. This keeps the interface
implementation simple and prevents untyped contracts from appearing in the code.
func (prc *BaseParserRuleContext) Accept(visitor ParseTreeVisitor) interface{} { | ||
return visitor.VisitChildren(prc) | ||
func (prc *BaseParserRuleContext) Visit(delegate ParseTreeVisitor, args ...interface{}) (result interface{}) { | ||
return delegate.VisitChildren(prc, delegate) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I'm confused here. Why did you rename this to delegate
? Also, it seems odd that we'd pass the delegate
as an argument to a method of itself. This has a bad code smell.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So...IIRC, this is the complicated part. Because Go doesn't actually have what you'd typically call 'inheritance', the VisitChildren
method implemented in the ParseTreeVisitor
will actually NOT call the right method if you just use the *BaseParseTreeVisitor
pointer. So IIRC he passes in delegate
so that it can call the right methods on it. I know that was necessary for my initial version, but I did it very differently, and without a lot of the Go experience that you and he bring to the table. Using type assertions and type switch
appears to have gone a long way.
I did a more 1:1 implementation of VisitChildren
, using the Java code as the guideline, but I had to put it into the generated code in order for it to work and did a lot of ugly if
checks. He found a way to put it into the base code, and have it still call the right methods. (Although ignoring default value, and a few other niceties.)
A lot of this is from memory, refreshed by looking at the various pieces; it's been a while, and I admit, I don't really understand the reason for args
, for example. :-/
runtime/Go/antlr/file_stream.go
Outdated
@@ -19,13 +19,19 @@ type FileStream struct { | |||
filename string | |||
} | |||
|
|||
func NewFileStream(fileName string) *FileStream { | |||
func NewFileStream(fileName string) (*FileStream, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great change. Thank you. It should be in a separate PR, though.
doc/go-target.md
Outdated
@@ -88,7 +88,7 @@ func (this *TreeShapeListener) EnterEveryRule(ctx antlr.ParserRuleContext) { | |||
} | |||
|
|||
func main() { | |||
input := antlr.NewFileStream(os.Args[1]) | |||
input, _ := antlr.NewFileStream(os.Args[1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be best to demonstrate this by doing an if err != nil { fmt.Println(err); os.Exit(1); }
here.
return v.VisitErrorNode(e) | ||
func (e *ErrorNodeImpl) Visit(v ParseTreeVisitor, args ...interface{}) interface{} { | ||
v.VisitErrorNode(e) | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we want to swallow the output here?
@pboyer Implementing VisitChildren in a way suitable to be used generically was extremely hard for me, at least, mainly because of Go's lack of inheritance. There needed to be a way to pass in the thing you wanted the child calls ( There's a repo, https://github.com/cyberfox/expressions which shows how (the current version, and earlier my own version) it's all used in a slightly less trivial example, if that'd help. My initial VisitChildren stab at implementation was in the generated code, not the base code, and was very small (#1807; ignore the |
Sorry that I haven't been more responsive, this needs attention and I don't have the time at the moment. If anyone has time it would be appreciated. The following is what I would like to do when I can make some time. As noted this there are multiple things going on is this PR, some should be removed. The biggest one is the generation and used of interface instead of pointers to structures. I still think this might be a good idea for some circumstances, but imo it does belong in this PR. @pboyer The functionality of the Java visitor was not rich enough of what I was trying to do, hence the deviation. It appears that the visitor hasn't had very attention payed to it. To back up this comment I would like to provide a couple of examples. I find it very useful to pass and maintain state in the call stack. Yes, all the state can be maintained in the visitor implementation, but this get very cumbersome. Yes, Accept is the more common name the Go code should use it. I changed it while trying to understand the code, it took me a while to get my head around things and I never changed it back. I second @cyberfox comment about Go and inheritance. There is not default receiver, to achieve implement inheritance (Java, C++, C#, Smalltalk, Javascript etc.) an extra argument needs to be passed. @savvopoulos and @cyberfox, I've merged the I would also like to bring
After this one or two intermediate workspaces and repos are needs to merge the changes into the antlr4-go repo. Any git masters who could help automate this? @pboyer I haven't addressed all the issues you have raised, but I hope this helps. @cyberfox thanks for your input. Where do you think we should do from here? |
Any updates on this PR and the Go implementation of the Visitor pattern? |
Any updates on this? |
Any updates on this? I solved the problem by implementing func (v *MyGrammarVisitor) Visit(tree antlr.ParseTree) interface{} {
return tree.Accept(v)
}
func (v *MyGrammarVisitor) VisitChildren(node antlr.RuleNode) interface{} {
for _, child := range node.GetChildren() {
child.(antlr.ParseTree).Accept(v)
}
return nil
} |
@cyberfox see example on the doc/go-target.md
Some differences to note.
This is a breaking change to the generated Listeners interface.
I'll added an option to generated this version with the default being current *struct signatures.
This is an alternative implementation to
#1807