Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incremental Builds #130

Closed
jimafisk opened this issue Jan 28, 2021 · 46 comments
Closed

Incremental Builds #130

jimafisk opened this issue Jan 28, 2021 · 46 comments
Labels
enhancement New feature or request

Comments

@jimafisk
Copy link
Member

From @padraicbc

A lot of files are getting rebuilt each time outside of the one actually saved/changed so it may be worth looking at incremental changes based on actual files with new content. That will reduce build times in production especially with large sites.

@jimafisk jimafisk added the enhancement New feature or request label Jan 28, 2021
@jimafisk jimafisk mentioned this issue Jan 28, 2021
@padraicbc
Copy link
Contributor

For live editing using the server fsnotify tells you which files have changed so you could filter down to just the file itself and avoid removing/recreating files without changes. That can break where one file's changes influences another.
Ideally you would store a hash to see what has changed but that also has a cost so I suppose we will just have to benchmark a few approaches and see what works best. Adding concurrency to the build steps will greatly speed up the process anyway so maybe start there?

@padraicbc
Copy link
Contributor

This branch is a first attempt at doing the local dev all in memory using a map as the filesystem and avoiding rebuilds for files that haven't changed.
The ssr compile step per component still has to be completed but it stores the previous layout's ssr state thus avoids needing to fully rebuild individual components that haven't changed. It also stores the previous component css so bundle.css also keeps updated.

Saving is not bad vs full initial build.

Building... Total build took 4.250995191s

Serving site from your "public" directory.
Visit your site at http://localhost:3000/
Total build took 1.194265819s
Total build took 1.075866166s

No doubt a multitude of bugs and there is excess code but a starting point at least.
To Opt in to live reload and in memory just run ./plen serve -L -M

@jimafisk
Copy link
Member Author

@padraicbc I've been playing around a bit with this today and it's insane! Especially when you run it with plenti serve -ML, the local workflow is so quick. On my computer the default starter is rebuilding in ~ 100ms.

The regular plenti server does throw some errors on your mem branch:

errs.go:50: Could not get node module: Could not open source .mjs public/spa/web_modules/navaid/dist/navaid.js file for copying: open public/spa/web_modules/navaid/dist/navaid.js: no such file or directory /home/jimafisk/Downloads/plenti/cmd/build/gopack.go on line 79
 /home/jimafisk/Downloads/plenti/cmd/build/gopack.go on line 103
Could not open source .mjs public/spa/web_modules/navaid/dist/navaid.js file for copying: open public/spa/web_modules/navaid/dist/navaid.js: no such file or directory /home/jimafisk/Downloads/plenti/cmd/build/gopack.go on line 79
open public/spa/web_modules/navaid/dist/navaid.js: no such file or directoryno such file or directory

I know you were showing me some gopack optimizations, maybe it's related? I've also updated gopack recently to make it work with Svelte v3.32.3 (see details here: #138). The updates I made actually may fix this on merge since they were similar to what I was seeing, I'm just confused why they are appearing even though your branch seems to still be using the older version of Svelte.

I tested your mem branch on a bunch of different repos with the -M flag and it ran flawlessly. I did hit a couple of snags, outlined below though.

When running on https://github.com/jimafisk/plenti_multi-pager_example I got:

 errs.go:50: Could not create props: ReferenceError: layout_content_cats_svelte is not defined /home/jimafisk/Downloads/plenti/cmd/build/data_source.go on line 309
javascript stack trace: ReferenceError: layout_content_cats_svelte is not defined
    at create_ssr:1:21

And on https://github.com/jimafisk/plenti_theme_test_bigspring it threw:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xaa105a]

goroutine 1 [running]:
plenti/cmd/build.Gopack.func1(0xc000880bd0, 0x25, 0x1f42fe8, 0xc000e7ab60, 0x0, 0x0, 0x0, 0x0)
	/home/jimafisk/Downloads/plenti/cmd/build/gopack.go:72 +0x3da
path/filepath.walk(0xc000880bd0, 0x25, 0x1f42fe8, 0xc000e7ab60, 0xc0005a5708, 0x0, 0x0)
	/usr/local/go/src/path/filepath/path.go:414 +0x457
path/filepath.walk(0xc000590600, 0x1b, 0x1f42fe8, 0xc000e7a8f0, 0xc0005a5708, 0x0, 0x0)
	/usr/local/go/src/path/filepath/path.go:438 +0x31b
path/filepath.walk(0xc000d14078, 0x13, 0x1f42fe8, 0xc000e7a4e0, 0xc0005a5708, 0x0, 0x2)
	/usr/local/go/src/path/filepath/path.go:438 +0x31b
path/filepath.Walk(0xc000d14078, 0x13, 0xc0005a5708, 0xc0001ae9b8, 0x6)
	/usr/local/go/src/path/filepath/path.go:501 +0x113
plenti/cmd/build.Gopack(0xc0001aea10, 0x6, 0x0, 0x0)
	/home/jimafisk/Downloads/plenti/cmd/build/gopack.go:49 +0x31a
plenti/cmd.Build(0x0, 0x0)
	/home/jimafisk/Downloads/plenti/cmd/build.go:169 +0x685
plenti/cmd.glob..func1(0x24bab40, 0xc000801530, 0x0, 0x1)
	/home/jimafisk/Downloads/plenti/cmd/build.go:46 +0x25
plenti/cmd.glob..func4(0x24bab40, 0xc000801530, 0x0, 0x1)
	/home/jimafisk/Downloads/plenti/cmd/serve.go:73 +0x83e
github.com/spf13/cobra.(*Command).execute(0x24bab40, 0xc000801520, 0x1, 0x1, 0x24bab40, 0xc000801520)
	/home/jimafisk/go/pkg/mod/github.com/spf13/[email protected]/command.go:846 +0x2c2
github.com/spf13/cobra.(*Command).ExecuteC(0x24ba8a0, 0xc000000180, 0x200000003, 0xc000000180)
	/home/jimafisk/go/pkg/mod/github.com/spf13/[email protected]/command.go:950 +0x375
github.com/spf13/cobra.(*Command).Execute(...)
	/home/jimafisk/go/pkg/mod/github.com/spf13/[email protected]/command.go:887
plenti/cmd.Execute()
	/home/jimafisk/Downloads/plenti/cmd/root.go:46 +0x31
main.main()
	/home/jimafisk/Downloads/plenti/main.go:8 +0x25

There are a few other things I was planning on working on this weekend, but I don't want master to diverge too much from all the work you're doing here and make merging a pain. Do you think we should pull your updates in and work together to squash these bugs?

@padraicbc
Copy link
Contributor

padraicbc commented Feb 28, 2021

The regular plenti server does throw some errors on your mem branch:

That was me stupidly doing too much storing without checks. There is still some logic to be ironed out in relation to which node module files shpuld be stored in mem(if any) to avoid repeated work in gopack but it should all be good now.

When running on https://github.com/jimafisk/plenti_multi-pager_example I got

I need to add a check to verify that we had hashed/seen the layout before. checking hash was > 0 fixes it

And on https://github.com/jimafisk/plenti_theme_test_bigspring it threw

I still have to add the logic for using theme folders. Having the example to use will make it trivial to add.

All bar themes should be good now. They worked locally for me anyway....

I will add the theme logic and tidy up a few pieces but happy enough to pull in whenever you want.

The more I thought about it, the more it seemed that all we care about is storing and serving bytes, we don't care about directories or stating anything. Nothing bar maybe walk would be useful in terms of a filesystem but that can be replicated more efficiently so a map and a type that implements ServeHTTP that serves our specific needs is all we actually need.

@padraicbc
Copy link
Contributor

For the themes I am getting lstat node_modules/navaid: no such file or directory. That comes from gopack where we walk over the node_modules. If destPath := tempBuildDir + "node_modules" is used inside NpmDefaults where tempBuildDir is equal to temp_build. Should we be simply copying content/layouts from theme into the root dir and also copying over the default node_modules there too or how does temp_build fit into it all?

@jimafisk
Copy link
Member Author

So temp_build is a folder that we create only when using themes (a better name for this would have been theme_build). It's created temporarily during the build process, we first copy over all the theme files and then copy all the project files to the same folder so they replace any files that match - then run the build on the combined files from the theme and the base site. This allows template inheritance to work so you can continue to pull upstream changes from a theme while overwriting things locally if you'd like. It also works recursively, so if you have a theme that inherits from another theme, the copying works so base theme elements get overwritten at each level. You can theoretically nest themes infinitely, although build performance would be impacted. This whole process should honestly happen in memory, there's no reason to write this stuff to the filesystem like we're doing now, so I created this issue: #81. Hopefully that makes sense, just let me know if I can clarify something!

@padraicbc
Copy link
Contributor

So temp_build is a folder that we create only when using themes (a better name for this would have been theme_build). It's created temporarily during the build process, we first copy over all the theme files and then copy all the project files to the same folder so they replace any files that match - then run the build on the combined files from the theme and the base site. This allows template inheritance to work so you can continue to pull upstream changes from a theme while overwriting things locally if you'd like. It also works recursively, so if you have a theme that inherits from another theme, the copying works so base theme elements get overwritten at each level. You can theoretically nest themes infinitely, although build performance would be impacted. This whole process should honestly happen in memory, there's no reason to write this stuff to the filesystem like we're doing now, so I created this issue: #81. Hopefully that makes sense, just let me know if I can clarify something!

Where should the node_modules live?

@jimafisk
Copy link
Member Author

jimafisk commented Feb 28, 2021

It could get messy how this is currently set up, but potentially node_modules could live at each level (in the theme and in the base site). If they live in the theme and in the base site, the node_modules from the base site will overwrite the ones in the theme. If they were omitted at either layer, the remaining one would be used. If they were omitted at both levels, the node_modules packaged in the binary of the current version of Plenti that you're using would be used.

Is that making sense? Basically instead of building from the project root like we normally do, we combine all the files (node_modules included) in the temp_build folder and build from there in order to get theme files included in the build, while keeping the theme and the project files separate for editing purposes.

Edit: It's possible that I've messed something up somewhere along the line, but that's ^ how it works in theory.

@padraicbc
Copy link
Contributor

padraicbc commented Feb 28, 2021

I think the issue is filepath.Walk("node_modules/"+module in gopack is only looking at the root directory. Then there is nothing to look at so fmt.Errorf("can't stat %s: %w", modulePath, err) comes into play

@jimafisk
Copy link
Member Author

That definitely makes sense, it's possible that was never updated to be theme aware by passing the tempBuildDir string like we're doing elsewhere.

@padraicbc
Copy link
Contributor

padraicbc commented Feb 28, 2021

In themes_merge.go what is the purpose of starting at the project root?

Scratch that, I get it now.

@jimafisk
Copy link
Member Author

That should pull in all the project files to overwrite the theme files that have already been copied over. We're purposely excluding the theme itself and other files that aren't needed from the build. This video at around the 37 minute mark helps explain that inheritance process: https://youtu.be/dT69Ph2XkjQ?t=2223.

@padraicbc
Copy link
Contributor

padraicbc commented Feb 28, 2021

If we are using the theme logic do we still need to copy assets over as we walk over the assets in merge?

@jimafisk
Copy link
Member Author

jimafisk commented Feb 28, 2021

The way I envisioned this is you can inherit full sites (layout, content, assets), but you can also excluded any of those aspects as demonstrated here: https://youtu.be/dT69Ph2XkjQ?t=2161. I imagine lots of folks will just want to inherit svelte templates and leave behind all the json files and any assets like .png, .jpg, etc. You can do that by adding an exclude key to theme_config in your plenti.json file. However, it's possible some folks might want to completely copy a site from somewhere else, and they may even want to use the images included with the theme but swap out just a few with their own images. Walking over the assets should allow them to apply the same inheritance principals in that case. Does that answer your question?

@padraicbc
Copy link
Contributor

I get the general logic. I think the temp_build is f**king with my brain tonight! Try out the latest commit to see how it works when you get a chance. I will watch the vid during the week and use the theme to get a better understanding of how it all fits together. Once the rest is working well it should be easy enough to add the theme logic. Especially when i understand it!

@jimafisk
Copy link
Member Author

Awesome I'll check it out! The temp_build is a bad naming convention first of all, and it should be virtual so it's kind of weird in general. The relevant theme stuff in that video starts around the 31:42 minute mark and goes to ~ 43 minute mark, everything else is general Plenti stuff.

@padraicbc
Copy link
Contributor

@jimafisk if we are using a theme, can we just iterate over the theme dir and add any assets/layout/content
to the base directory? If we want exisiting files to take precedence then we just ignore that theme file if one already exists. You can set it up so this happens only once in local dev on initial build. If a user pulls in new changes then the next restart will catch the new files. It would be trivial to keep a record of the related files in a json/config file if you needed to track which were from the theme.
I think having numerous node_modules could get messy very fast where one takes precedence over the other and has breaking changes that screw things up. Also multiple nested themes further complicates the issue.

@padraicbc
Copy link
Contributor

padraicbc commented Mar 2, 2021

Also should we not be specifically be copying over just the content, layout, ejected, node_modules and assets folders from any theme?

@jimafisk
Copy link
Member Author

jimafisk commented Mar 2, 2021

Hey @padraicbc, you're right that node_modules could complicate things and the build would be simpler if we just copied everything over to the project root and ran it the standard way. I do think keeping the theme files separate from the working project accomplish a few important things though:

  1. It makes it really clear to the user which files they have customized because they live in their project root.
  2. It makes pulling upstream changes easier. Like you said we could keep a record of theme files somewhere in config, but often the workflow would be that a user copies a theme file and modifies it. In that case we'd not only have to keep track of original theme files, but also which ones have been modified
  3. It makes switching themes easier. It's possible that a user would write a bunch of json content that they want to put into different layouts. Cleaning up the project root from the previous theme could get complicated.

@padraicbc
Copy link
Contributor

I get the advantages of keeping it separate but it does make building a lot more complex, especially doing it in memory. The more themes you add the worse it gets.
It is also treated very differently if we do keep themes in memory vs disk. We currently read off disk for the layout/content at the root level which is probably the right approach really as you could have a whole lot of json stored. Doing that means you need very different approaches to build based on theme vs root.

Also If you have conflicting filenames like the video example then it would probably be better to stop the build and ask a user to confirm or you end up with potentially very strange bugs.
For dependencies, are we going to going to join nested package.json into one file at the root. If so and there are different versions of the same package in each, what is the correct one to keep?

@padraicbc
Copy link
Contributor

padraicbc commented Mar 2, 2021

Another option would be to put them under layout/theme/somename/.. content/theme/ssomename/... which still keeps them separate but is the whole point of copying over from themes into a build folder just to allow overwriting or why not just compile from where it is?

@jimafisk
Copy link
Member Author

jimafisk commented Mar 2, 2021

conflicting filenames like the video example

In this case the preference would go to the project root. It would be the user's responsibility to name things accordingly if they do not want to override theme assets.

If so and there are different versions of the same package in each, what is the correct one to keep

This could be a harder challenge. I wish it were as simple as preference goes to the project root, but of course different versions of node_modules don't always have the same files that line up one-to-one. So you may end up with a temp_build folder that has a library with some files from an older version and some files from a newer version. I guess as long whatever version is specified in the project root has all the needed files, it should run even if there are lingering files from another version. There's also the challenge that a theme may depend on a certain package with a specific version - it would have to be the user's responsibility to ensure compatibility. There are a few ways to do that: 1) delete any node_modules in the project root and just use whatever the theme provides 2) Make sure you reference the right version in your project root 3) If the theme used a version of a module that you want at some point, pull in the theme at that specific commit.

it does make building a lot more complex, especially doing it in memory

If it's not tenable, for now do you think it would make sense to just running theme builds on the filesystem? Basically if you're working with themes, don't pass the -M flag for now until we can work out a more permanent solution?

@padraicbc
Copy link
Contributor

padraicbc commented Mar 2, 2021

I had it working 95% in memory but it was really ugly. I think tidying up the logic as is and creating the "Builder" type we chatted about the other day to allow easier access to config plus removing node would make implementing a lot easier, so maybe wait till then?

@jimafisk
Copy link
Member Author

jimafisk commented Mar 2, 2021

Yeah that makes a lot of sense. We're carrying around a lot of cruft with the node builds and some other old stuff that's just lingering. It will be nice to clean house a bit. When you're ready let's get this in and we can work off that for the next release. Thanks for plowing ahead with this!

@padraicbc
Copy link
Contributor

No worries. Probably better get feedback with the default setup and squash any obvious bugs before adding it to themes anyway. I will add a PR later in the week. The last commit fixed the other issues?

@jimafisk
Copy link
Member Author

jimafisk commented Mar 2, 2021

I just ran through all the testing sites again with your latest commits and everything seems to be working now, even the themes example - is that expected?

@padraicbc
Copy link
Contributor

The themes don't actually work properly though no?

@jimafisk
Copy link
Member Author

jimafisk commented Mar 2, 2021

It seems to work ok using https://github.com/jimafisk/plenti_theme_test_bigspring.

Screenshot

theme

I didn't dig too deep but the build isn't throwing errors and the site navigation is functional.

@padraicbc
Copy link
Contributor

padraicbc commented Mar 2, 2021

It wasn't working for me. I thought I had to still manipulate the path prefix for some parts but maybe it is the version of svelte I am using as I installed with npm after cloning. It can/will work fine once we keep the temp_build folder. What I was working on was trying to also keep that in memory which wasn't as straightforward.

Are all the routes working for you including when visitied directly?

@jimafisk
Copy link
Member Author

jimafisk commented Mar 2, 2021

The routes that exist, yes. That might be part of the problem, that theme isn't complete so the pricing, contact, and faq pages won't work because they don't exist.

@padraicbc
Copy link
Contributor

Client side routing is broken for me but it is definitley a svelte issue .

There a couple more issues and things I am not sure on.

  1. destPath := tempBuildDir + "node_modules" in NpmDefaults so if you happen to not have a node_modules folder at the root level then GoPack will break here

  2. If you have 1-n themes and each has it's own dependencies, should a user first go to each level and npm install or should all deps be concatenated and moved to the root package.json then run the install programmatically or how do you see this working?

  3. I thought initially that files did get copied over as I remembered seeing something relating to "too many open files" when a theme got added so I presumed we were watching those files too. I think that is where my confusion came from in relation to the temp folder. If a user adds a theme and wants to overwrite a layout/content, they copy/paste the files into the root level or just add files with the same names and never edit the theme files directly?

The fact we don't watch the themes folder, if a user wants to edit a lot of the files it's best just to clone the theme and use that as the root so rebuilding etc.. works on file changes?

@jimafisk
Copy link
Member Author

destPath := tempBuildDir + "node_modules" in NpmDefaults so if you happen to not have a node_modules folder at the root level then GoPack will break here

Yeah good call. I just changed some of this code to remove go:generate to use embed instead (#137), but Gopack would still need to be updated if we wanted to look at theme's node_modules like you said.

If you have 1-n themes and each has it's own dependencies, should a user first go to each level and npm install or should all deps be concatenated and moved to the root package.json then run the install programmatically or how do you see this working

This is a great point. I'm thinking for deps we should probably just manage this at the root level like you've been saying. If you're using a theme that relies on custom npm configuration, it should be your responsibility to copy that config to your project root and do any necessary installing there. Would you agree?

If a user adds a theme and wants to overwrite a layout/content, they copy/paste the files into the root level or just add files with the same names and never edit the theme files directly?

Yup that's exactly right. Theme files should not be edited directly so you can continue to pull upstream changes without overriding your work.

The fact we don't watch the themes folder, if a user wants to edit a lot of the files it's best just to clone the theme and use that as the root so rebuilding etc.. works on file changes?

Yeah that's definitely an option. If a user plans to override every file in a theme, it might make sense to not use it as a "theme" at all - just clone it and work off it like a normal project.

@padraicbc
Copy link
Contributor

embed does simplify things nicely!

This is a great point. I'm thinking for deps we should probably just manage this at the root level like you've been saying. If you're using a theme that relies on custom npm configuration, it should be your responsibility to copy that config to your project root and do any necessary installing there. Would you agree?

That works fine for me anyway. You could automate it somewhat but conflicting packages adds to the complexity so I think it is fair to ask a user to update the deps themselves. So essentially one package.json and one node_modules folder at root?. If that is the case then gopack need never look anywhere bar root for node_modules and we could check for the existence at the very start and bail if it doesn't exist? I got caught testing the in memory implementation on the theme example as there was no node_modules at root. My mistake but something easy enough to do I think so probably best to scream that fact should it arise.

All good on the rest.

@jimafisk
Copy link
Member Author

Yeah that all sounds good! If the node_modules folder is missing at the project root, the build should try to pull it from the embedded defaults:

if _, err := os.Stat(destPath); os.IsNotExist(err) {

Of course if you're using a theme that requires something custom, it's on you to make sure you have it in your node_modules.

@padraicbc
Copy link
Contributor

Yeah that all sounds good! If the node_modules folder is missing at the project root, the build should try to pull it from the embedded defaults:

if _, err := os.Stat(destPath); os.IsNotExist(err) {

Of course if you're using a theme that requires something custom, it's on you to make sure you have it in your node_modules.

Sorry, yes it was the modified build path for the theme that caused the error not at root.

@padraicbc
Copy link
Contributor

padraicbc commented Mar 22, 2021

@jimafisk A few more hours will have this ready but I will wait until v0.4 is out to comlete it. Should have first go at incremental disk builds added also by then.

@jimafisk
Copy link
Member Author

That's awesome, it's going to make a huge difference when we pull that in. I should have the v0.4 stuff done tonight.

@jimafisk
Copy link
Member Author

jimafisk commented Apr 7, 2021

@padraicbc amazing work on this! Your PR is in v0.4.13 and the in-memory stuff is crazy fast on my computer. Are you using this issue for incremental disk build stuff you're working on, or should I close this out?

@padraicbc
Copy link
Contributor

No prob, you can close it out. The disk inc should be a whole a lot simpler as it's just paths/hashes stored with a little more info in a JSON object. I should have that added in next few days.
On another note, ALTLR is definitely useful for the import replacement logic as is and could easily be used to create a svelte compiler. I can replace the regex import with it once I know the x,y, z of the logic behind replacements and see how that goes.

@jimafisk
Copy link
Member Author

jimafisk commented Apr 7, 2021

That sounds amazing, I'm very interested in the ANTLR stuff you're working on. I'm still trying to wrap my head around how the golang target for ANTLR even works. My understanding is antlr4 is written in java, does this come with some of the same challenges as cgo or is it fundamentally different? Just let me know how I can support your efforts!

@padraicbc
Copy link
Contributor

The ANTLR Java binary takes in the lexer/grammar files and spits out java visitors/listeners depedning on what you pass. It also supports a few other languages. To get visitors for go you woukd antlr4 -Dlanguage=Go -visitor -no-listener *.g4 where *.g4 are your gramar/lexer files. There is a Go runtime which is maybe not the greatest example of go code but I have had no issues running the examples here. I use github.com/padraicbc/antlr4) but it is identical. I will more than likely be changing the runtime so I just cloned it locally and changed the paths.

Java is only used to parse the grammar/lexer files and output the Go files. After that it is all Go so no calling into Java. It would completely remove the dependency on CGo/v8 so while there is room for improvement with the ANTLR Go runtime, it would still be light years faster, and greatly simplify the release builds.

For the import parsing.I am not 100% on the logic from here down. The ANTLR parser will pull any imports/exports which can easily be replaced/manipulated/removed as you can get access to the start/end offsets for each statement. The Gopack import search could also be done in a similar fashion which would be more robust than the regex.
This parser using the standard lib html parser can extract any of the root script/style/html sections which can be passed to ANTLR and any other processing that is integrated like css prefixing etc..

@jimafisk
Copy link
Member Author

jimafisk commented Apr 7, 2021

Don't let me distract you too much, I'm mainly just thinking out loud. I'll spend a little time learning the basics of ANTLR and try to get up to speed on the principals of compiler design.

To get visitors for go you would antlr4 -Dlanguage=Go -visitor -no-listener *.g4 where *.g4 are your gramar/lexer files.

I assumed we'd use an existing .g4 grammar file like https://github.com/antlr/grammars-v4/blob/master/javascript/javascript/JavaScriptParser.g4 to produce this. Looks like you might have something similar here: https://github.com/padraicbc/gojsp/blob/master/JavaScriptParser.g4

Java is only used to parse the grammar/lexer files and output the Go files. After that it is all Go so no calling into Java.

So it sounds like we would run this on the current ecmascript spec, and potentially have to run in it the future if the spec changed, but this would not be run often (i.e. every time we build plenti).

I will more than likely be changing the runtime so I just cloned it locally and changed the paths.

Do you plan on contributing back to https://github.com/antlr/antlr4/tree/master/runtime/Go/antlr or keeping https://github.com/padraicbc/antlr4 as a more permanent fork? If it's helpful for me look at anything in particular, or even try to find people who are better versed in this than myself to contribute just let me know. I know that sometimes too many cooks can slow things down though.

it would still be light years faster, and greatly simplify the release builds.

🤤 You have my buy-in. Cgo is proving challenging esp as new architectures like arm64 for m1 are added, windows support would be nice too.

I am not 100% on the logic from here down.

The "removing" of static import/export statements is really just wrapping these lines in comments because v8 would blow up when it tried to read them. So instead of reading across files by following their imports, we dump all components into the same ctx so they can reference each other. To avoid naming conflicts, we just make sure that each has a unique component signature (e.g. layouts/components/grid.svelte becomes layouts_components_grid_svelte) - all that means is we rename the variable name from "Component" to whatever the unique component signature is. Not only are we changing where the component is defined, we also needed to find all the places those components are used and put the component signature in there as well. Also had to handle edge cases like multiple named imports coming from the same file by appending the import name to the end of the component signature. It's a little wild, but I didn't have a better way to do it at the time.

The Gopack import search could also be done in a similar fashion which would be more robust than the regex.

I think a lot of this nonsense could go away if we start parsing components properly vs massaging them to work in v8. If we compiled the components with Go directly, I'm not sure if we'd need Gopack in the same capacity we have it currently. Navaid doesn't change all that often so we could potentially build that into our ejectable core and if folks need to include node_modules for something specific to their project, we could leverage the esbuild api for a more robust experience.

the standard lib html parser can extract any of the root script/style/html sections

We'd need a way to comprehend htmlx control structures, or we could even reinvent these a bit. Also scoping css to the markup provided in the component + shaking out unused styles from the bundle. I'm honestly not sure I get the full picture of how things like svelte's reactive declarations would work, but maybe I'm getting ahead of ourselves a bit.

@padraicbc
Copy link
Contributor

I assumed we'd use an existing .g4 grammar file like https://github.com/antlr/grammars-v4/blob/master/javascript/javascript/JavaScriptParser.g4 to produce this. Looks like you might have something similar here: https://github.com/padraicbc/gojsp/blob/master/JavaScriptParser.g4

Yes, and for now both .g4 files are identical to the ANTLR repo but there are some errors that need fixing so they will be different at some stage!

So it sounds like we would run this on the current ecmascript spec, and potentially have to run in it the future if the spec changed, but this would not be run often (i.e. every time we build plenti).

Yep, a few fixes to be added like those above and some manual work of actually using the new Visitor/Listener .

A 10000ft view of the genral flow when using a visitor:

A few types of interest:

  • antlr.ParsetreeVisitor

  • BaseParseTreeVisitor which implements antlr.ParsetreeVisitor

  • BaseJavascriptVisitor which embeds BaseParseTreeVisitor above

  • JavaScriptParser

  • Embedding BaseJavaScriptParserVisitor like I have here means you have also satisfied then you have implemented all methods to visit the tree i.e ParseTreeVisitor.

  • As you "visit" each node you can call v.VisitChildren like implemented there (the implementation varies based on your needs) and you can decide what to do next. Filter out children, pull data, continue down the tree etc.. Or you can just parse the children in the visit like the 3 VisitModuleItems variations here. There is no one right way, all depends on your goals.

    • In VisitChildren to continue down the part of the tree, you call child.(antlrParseTree).Accept(v).

      • If the particular VisitX method is implemented on your own visitor type then v.visitX is called like in that example .

      • If not implmented the embedded BaseJavaScriptParserVisitor.VisitChildren is called which falls back to it's embeded BaseTreeVisitor.VisitChildren. That just returns nil so a no-op essentially.

TLDR; Just embed BaseJavaScriptParserVisitor in your own visitor and overwrite any methods of interest. Or just copy/paste every method and change (v *BaseJavaScriptParserVisitor) to (v *YourVisitor) to have all on your visitor and edit as needed.
Hopefully that all made sense.

It is a slightly complicated world of satisfying interfaces that can likely be improved upon but once setup then re-running to generate for any new ecma changes is just a matter of implementing the new methods for the new types/changes. All the rest should generally remain unchanged bar maybe a new select case or two .

We'd need a way to comprehend htmlx control structures, or we could even reinvent these a bit. Also scoping css to the markup provided in the component + shaking out unused styles from the bundle. I'm honestly not sure I get the full picture of how things like svelte's reactive declarations would work, but maybe I'm getting ahead of ourselves a bit.

Famous last words ... but I think the htmlx will be the easiest part It has a very llimited grammar outside of standard html. We could likely adapt the builtin html parser or just use ANTLR again. One of the svelte devs has a short video that gives an overview of how things work.
I don't think it is overly complex.

This is also a really nice lecture series on ANTL4. The implementaion varies vs how it works in Go but it gives you a really solid undestanding of how to work with all parts of ANTLR.

I am going to walk through the svelte compiler code and implement the ts types in Go. I will add some pseudocode to how the types get used and try to buld up the flow of how it all works. Then it is a matter of moulding the parser/compiler to output what is needed. I am happy for anyone to help...

@jimafisk
Copy link
Member Author

jimafisk commented Apr 8, 2021

@padraicbc you're next level 😳💨

I am happy for anyone to help...

I'm digging into the materials you sent so I can get myself up to speed at a basic level. I wonder if my time would be best spent trying to find someone with the skill set to hit the ground running with you. Let me know if it's helpful to connect about specific tasks you need done or strategy for getting eyes on the project.

@padraicbc
Copy link
Contributor

I started porting this over to Go to try out integrating the Parser. You can see an import ImportDeclaration tyope is used, the definition is here.
That basically maps to ImportFrom in the ANTLR parser.
To access the data needed we could use VisitImportStatement or VisitImportBlock.

Taking VisitImportFromBlock() as an example you can see the direct relationship between the grammar and all possible fields on the context object :


// grammar snippet
// importFromBlock
//     : importDefault? (importNamespace | moduleItems) importFrom eos
//     | StringLiteral eos
//     ;
func (v *visitor) VisitImportFromBlock(ctx *gojsp.ImportFromBlockContext) interface{} {
      //  you can check of which these != nil and pull out what you need accordingly
       ctx.ModuleItems() // look at moduleItems in grammar to see what this has...
        ctx.ImportDefault()
        ctx.ImportFrom() 
	ctx.ImportNamespace()
	ctx.StringLiteral()
	
      // optionally visit children or do all the work here using above....
	return v.VisitChildren(ctx)
}

It is actually a lot easier than it seems once you have the maze of interface satisfied.The grammar shows you what is available on each node/context. importDefault? means may be there, (importNamespace | moduleItems) means one or the other but not both.

A large part of what is needed is creating equivalent objects in Go that you see in the typescript code like ImportDeclaration . Then it is just a matter of implementing the code to transform which is pretty straightforward. I am happy to hop on a call and go through the ANTLR logic a bit more and discuss a plan.

@jimafisk
Copy link
Member Author

jimafisk commented Apr 9, 2021

Ok nice, that makes sense to me at a high level. Let's hop on a quick call next week, I'll send you a calendar invite. Just let me know if another day/time works better and we can coordinate in email.

I'll close this issue for now, but feel free to brainstorm or share ideas in this queue!

@jimafisk jimafisk closed this as completed Apr 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants