-
-
Notifications
You must be signed in to change notification settings - Fork 689
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Gherkin: Delegate to Gherkin-Go prebuilt binaries #425
Comments
FWIW there are only 348 SLOC left in gherkin-java after I deleted 10K SLOC. What @cucumber/committers want to help out with the remaining implementations? Nothing is more satisfying than deleting code :-) Also note that gherkin now supports a new |
the bullet points are editable. Is this by design? what about platforms where there are no gherkin-go libraries? E.g. as there's no viable Go compiler, is it out of scope? |
Go's toolchain can cross-compile binaries for most platforms, from any OS that has a Go compiler. We're currently doing that as part of the CI process. This is what we're currently building:
The CI process also uploads them to S3, where they are publicly accessible. Try this: curl https://s3.eu-west-2.amazonaws.com/io.cucumber/gherkin-go/master/gherkin-darwin-amd64 -o gherkin-darwin-amd64
chmod +x gherkin-darwin-amd64
./gherkin-darwin-amd64 --json features/*.feature If there are additional platforms we need to support I'm sure we can make it work for most of them. |
Yes, but only for committers. See https://blog.github.com/2013-01-09-task-lists-in-gfm-issues-pulls-comments/ |
This is not relevant for me at work yet, but we're targeting aarch64 (aka arm64) and I'm expecting to see more of that in the future, not less. So adding that as a target CPU seems desirable. |
@muggenhor PR for that would be much appreciated |
As I have mentioned earlier, having a process-execution-based parser is great for maintainability, but there are other concerns (performance, security, working offline) that make still room for native parsers. As far as I understand now the decision has been made that this repo will contain the go wrappers. Is there any advice on how/where the native parsers should be maintained (if the community wishes to do that of course)? Shall we host them under the "cucumber" organization (e.g. |
Regarding performance - I haven't done any throrough benchmarking, but here is a small one:
For Java there is no noticeable degradation in performance. The go binary is very fast:
Regarding working offline: The jar currently bundles binaries for windows/linux/osx + amd64/x86_32. It only tries to download binaries from S3 for other platforms. We should probably bundle all the binaries so we never have to go online. It would increase the size of the jar a little, but disk is cheap. @gasparnagy I hope this adresses your concerns about performance and the ability to work offline. Let's focus on security issues next and see where that brings us. What securtity concerns do you have? |
have you considered using upx for go binary compression? UPX may decrease the executable startup performance by a tiny fraction, but may reduce the binary size by more than a half, I was successfully using it for more than a year, there was one moment when it was producing a failing binary, but that was an issue on go end. |
Thanks for the upx pointer |
in a day or two I will be able to, if it can wait. I have a baby now, who changed my free time dramatically :) that is why I'm not that frequent contributor. but whenever I can make a small thing, I will be happy to. |
Congratulations on the baby @l3pp4rd!!!!! I don't think |
Are we including the Windows tests for Ruby, too ;) |
I’m referring to the shared approval tests for gherkin - comparing output with files in ‘testdata’. |
Ah, yeah. Makes sense! |
−8,020 in my PR for the Ruby stuff so far. :) |
@SabotageAndi the security issues are going to be a concern. See this thread: https://groups.google.com/d/msg/cukes/ZYDs13qoVTY/OtvCvl_fBQAJ I think we're at an impasse here.
I'd like to explore another avenue. Installing the executable separately. This is how WebDriver works. The various native language implementations are just thin shims that talk to a daemon running locally (chromedriver, geckodriver etc). This has to be installed separately. Another option is to write the shared code in a language that can be compiled to many targets, such as Haxe. I don't think it's a viable option though. Few people know Haxe, so maintenance would be hard. What do you think @charlierudolph @brasmusson @SabotageAndi @gasparnagy @mattwynne? |
This makes it hard to keep projects selfcontained. It'd require some maven/gradle plugin to fetch the executable as part of the build. E.g: https://github.com/Ardesco/selenium-standalone-server-plugin This solution would be less then perfect though. Most build systems in the java ecosystem can only download artifacts through a proxy/cache like Nexus. |
Do you have any suggestions @mpkorstanje? |
I think there are exactly four options: transpile, interpret, generate and delegate. And we're evaluating all these options against the same set of constraints: should be usable and should minimize effort. So far we've found generation and delegation to be lacking. Only interpretation and transpilation have not yet been tested. You've added an ease of maintenance requirement. But I don't Haxe is going to any harder then it already is. We're already dealing with a parser generators for a relatively nice language. If there were more people then we wouldn't be trying to reduce the duplication of effort in the first place. So depending on how well it transpiles Haxe would be an option. As for interpretation: I couldn't find a JavaScript interpreter for most languages. |
By delegation I assume you mean delegating to a binary executable, communicating over streams (what's in gherkin 6). There are some problems with our current implementation:
If we can address these problems, I don't think delegation is going to be a significant problem. I propose we stop bundling the executables inside the platform-specific packages. We'll refactor the native code so it can use either of the following strategies for talking to the executable:
Users would have to download the binary executable in addition to installing the native package. We would need to build version negotiation into the protocol - the executable and the native package may drift out of sync. This is the model WebDriver uses. It's a bit clunky, but it works. I can't think of a better strategy right now. |
Just adding for the record that transpilation is not an option. I don’t want to build this new infrastructure and exclude all the platforms that are unsupported by the transpiler. |
Me neither. But I don't think it's a winning one. |
Would it be possible to use the shared code as libraries rather than executables? Could the Go code be compiled into something that each platform could use as a library instead? |
It’s possible to compile Go code to shared libraries. We’d still need to bundle them, so bundle sizes would be similar. Not sure if it would shut up malware detectors. It might be worth a try! |
Should be easy to test this if we can find a jar with some native code in it. And we do! Perhaps people can test this from behind a firewall? https://search.maven.org/remotecontent?filepath=org/jogamp/jogl/jogl-all/2.3.2/jogl-all-2.3.2-natives-linux-amd64.jar |
I have been very busy lately but I have picked up on this as I have seen that languages that have not moved to the new structure will be removed in January. Honestly, I have little time right now to get Cucumber.ml into this structure. Mostly, and I hate to admit this, but I do not understand how to get the protobuf messages information from the monorepo to my repo. I have tried twice now and I am, at this point, just avoiding this because I do not have the time to figure it out on my own. I will need someone who has time to hand-hold me through the process because it seems out of my ability. If someone has the time and patience to help me, I would appreciate it but right now I do not have the time as I have another project that has my time right now. I am hoping to get it out of the way but I will need a long paring session to understand what exactly I need to do to get information that my project needs from the monorepo into mine to avoid deletion. |
I think we should allow for a combination of approaches. The standardized test suite (feature files and corresponding golden json files) has been very successful, enabling us to growing 8 different parser implementations fulfilling the same gherkin language specification. Whatever we do we must not loose this ability. Side note: I think json golden files are better for qualifying a parser implementation than protocol buffer golden files, since it requires less of the parser implementation to show compliance to the gherkin language specification. For the sake of argument, lets assume that we come up with an approach that works for Java, Javascript and Ruby (that is gherkin parsers to be used in Cucumber-JVM, Cucumber-JS and Cucumber-Ruby), but that approach is not suitable for the .NET environment. Then the gherkin parsers for Java, Javascript and Ruby can be developed and released together (using the mono-repo), and the gherkin for .NET can be developed and released separately still qualified for compliance to the gherkin language specification, and still hosted in the cucumber github project. The hosting in the cucucmber github project should probably require the used of the berp parser generated so that the is some consistency of implementation between the gherkin parsers hoisted here, even though not all are developed and released together. This is similar to that not all Cucumber-JVM modules are developed and released together anymore. The non-Java language modules are released separately. Similar to a issue discussed here, this change was done to reduce the maintenance burden for the Cucumber-JVM core maintainers. |
On .NET, Selenium and all of the webdrivers are distributed by packages. See: https://www.nuget.org/packages?q=selenium Additionally, you can then only have one version installed on your system. There will be breaking changes in the IPC. So having a separate binary to install is also not an option in .NET. As @mpkorstanje wrote in #425 (comment) I think it is essential, that projects are self-contained. It is really annoying for me, when I have to install a long list of additional tools/libraries to get a project to compile. @brasmusson Don't we have this golden files here: https://github.com/cucumber/cucumber/tree/master/gherkin/testdata About shared libraries: AFAIK they should work on .NET Standard. I have to do some experiments. But native stuff in a .NET program makes always problems. And the size increase of the package is still there. I am still not sold, that this would be good for .NET. |
@SabotageAndi Yes, we have the golden files, my point is just that we shall make sure not to loose them, even if we move away form that all gherkin parsers are stored in the |
But in contrast, the jar blows up from 342 kb to 45 MB(!), I understand that it is hard to support many parsers, but is it really the way to go to package such large binaries instead? Won'T it be better to write the parser in some kind of Parser-Language and the generate code for each language from that? |
Unrelated but I don't think it's worth switching to Gherkin 6 just yet for the plugin. Cucumber JVM also can't use it. |
I'm looking to implement Gherkin compatibility in a new test tool and need to be able to turn feature specs to pickles in a browser. Delegating to an external Go process clearly won't work and I don't want the overhead of having to call a back end API for the functionality.
|
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. |
Removed stale and added Slow Burner to keep things open for now. Consider this particular issue on hold at least until Cukenfest where there'll hopefully be some more discussions about how to address things. Sorry folks, but thank you for the patience! |
McAfee reported an “Artemis!ACC6568E95DA” regarding artifact .m2\repository\io\cucumber\gherkin\6.0.14\gherkin-6.0.14.jar\gherkin-go-windows-386.exe. Is there anything wrong with the file regarding viruses or is it a false positive? |
Informally we have decided against continuing this approach and the latest versions of Gherkin no longer include the go executables. |
The Gherkin parser is currently implemented in too many languages. It's too much to maintain. Each language implementation should spawn a child process (
gherkin-go
), read itsSTDOUT
and deserialise the output to language-native objects using Protocol Buffers.If you want to have a go at implementing this for one of the languages below (or a language that isn't listed here), look at #424, which implements this for Java.
gherkin-go
Makefile
,.rsync
)make
)The text was updated successfully, but these errors were encountered: