Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: revamping code generation #1024

Closed
6 of 7 tasks
evancz opened this issue Aug 24, 2015 · 16 comments
Closed
6 of 7 tasks

WIP: revamping code generation #1024

evancz opened this issue Aug 24, 2015 · 16 comments

Comments

@evancz
Copy link
Member

evancz commented Aug 24, 2015

There are a number of issues that it makes sense to tackle all at once:

  1. The potential for name collision as described in Module name collision in generated code #826
  2. Decent amount of code spent in the intro and outro of modules.
  3. We add all modules to Elm such that they are public to the world.
  4. Make it work with Closure Compiler better
  5. We want to have nice support for node
  6. The bad error messages when folks say something like Elm.fullscreen(Erm.Main) in JS
  7. We rebuild a decent amount of stuff for each "instance" of Elm as described in Memoize module make functions in the embedded scenario #888

With @JoeyEremondi working on dead code elimination, it seems like a reasonable time to start addressing these things in one pass. The point of this issue is to describe the plan for this in a coherent way.

Format for Generated Elm Code

When elm-make spits out a hunk of JS, the general format I have in mind is like this:

var Elm = function() {
    var elm-lang$core$Basics$fst = ...
    var elm-lang$core$Basics$snd = ...
    var elm-lang$core$Basics$degrees = ...
    var evancz$elm-html$Html$div = ...
    var evancz$elm-html$Html$text = ...
    ...

    return {
        fullscreen: ...
        embed: ...
        ...
    };
}();

All top level declarations in the whole project are turned into fully qualified names. There are no more module boundaries in generated code. So when I want to refer to Html.div I refer to evancz$elm-html$Html$div in the generated JS. And when we do dead code elimination, it will just mean leaving out some subset of these definitions.

This will address points 1, 2, and 3 entirely (name collision, intro/outro, exposing too much). I suspect it will also go a very long way with point 4 about Closure Compiler. One of the big problems we had there was that if you expose an object and use one field from it, you gotta keep all the fields. This is no longer an issue with module exports, so I am hoping things will work a lot nicer. We could at least pass things through there to do variable renaming and get nicer names.

Initializing an Elm instance

Right now initializing Elm code has certain issues (points 5 and 6) that it'd be nice to improve. We can conditionally generate the headers needed to get things working with node, that's not too crazy. The big change here would be in how we initialize Elm widgets.

// generated JS from the Elm code

var app = Elm.fullscreen('Main');

Notice that we give a string version of the module name we want. This means if there is some misspelling, we can give an error like "Cannot find module Main, did you mean one of these? ..." and then point to some other resources on what might be going wrong.

We also only have a very small API coming off of the Elm object now.

Reducing overhead of many "instances"

In the vast majority of cases, it is safe to share values across every instance that is out there. This is true of all non-signal code. When it comes to Mouse.position we need to be more careful because it matters where you are embedded.

Right now I am purposely leaving the "native format" unspecified, but it would make sense to me to have certain kinds of native code. The categories might just need to be pure/impure and we need to duplicate the impure parts for each instance but nothing else. If things depend on impure things, they also need to get duplicated. It should not be too hard to have an impure thing infect its dependencies.

Plan

To get this chunk of stuff done, it makes sense to break it into chunks in this order:

  • Add package information to the AST.Module.Interface record. We need to know the user and package name for every module to make canonicalization truly canonical and solve Module name collision in generated code #826. This should be a PR of its own.
  • Make some tweaks to the canonicalization code to make uses of top-level values point to the fully qualified version, so Canonical Local "x" becomes Canonical (Module "Here") "x" and we generate the correct code.
  • Modify elm-compiler and elm-make to spit out code in the new format described above, with no module boundaries. The existing core and native stuff will not work with this. That is fine, we will rebuild them.
  • See how this interacts with Closure Compiler. Are there any bad problems?
  • Plan out how native code needs to fit into this.
  • Get core using the new format
  • Start doing dead code elimination on the whole thing
@mgold
Copy link
Contributor

mgold commented Aug 24, 2015

Nitpick: You'll need to turn dashes into underscores, e.g. elm-lang into elm_lang.

Question: How does this handle unexported values in modules? Are they hidden somehow? Or is the hiding at name-resolution-time?

@vilterp
Copy link

vilterp commented Aug 25, 2015

What happens if there are multiple modules running in a page and they are using different versions of some library? Not sure how this interacts with the discussion of multiple instances, or if version numbers need to be added to the fully-qualified identifiers of values.

@JoeyEremondi
Copy link
Contributor

I think that would be an error right now, since elm-make needs a consistent
set of dependencies? But it's definitely good to keep in mind.

On Tue, Aug 25, 2015 at 8:40 AM, Pete Vilter [email protected]
wrote:

What happens if there are multiple modules running in a page and they are
using different versions of some library? Not sure how this interacts with
the discussion of multiple instances, or if version numbers need to be
added to the fully-qualified identifiers of values.


Reply to this email directly or view it on GitHub
#1024 (comment)
.

@laszlopandy
Copy link
Contributor

@vilterp That is not a problem because there is a closure around the whole app. This only removes the closure between modules, not between Elm and the outside world.

@evancz
Copy link
Member Author

evancz commented Aug 25, 2015

@vilterp, @laszlopandy is correct that it's all in a closure, so they won't interfere with anything else in the page. @JoeyEremondi is also correct that it'd be impossible to build a single closure that had multiple versions of the same thing.

@mgold, since everything will have a canonical name, it actually does not matter if "unexported values" are available. We know from the checks in the compiler that the module boundaries are respected. Those boundaries do not need to be replicated in the generated code because no one can write a new function that refers to a "hidden" value.

@evancz
Copy link
Member Author

evancz commented Aug 25, 2015

I opened #1025 about what to do with hyphens. I am not sure how to do that yet.

@kmarekspartz
Copy link

It looks like this does nothing about #873 (and may make it worse!). If codegen is changing this much, it may be possible to fix that bug at the same time.

@rtfeldman
Copy link
Member

I opened #1029 about JS module compatibility.

@JoeyEremondi
Copy link
Contributor

It doesn't solve #873, but I don't think it will make it worse. I've been
working on reference analysis for DeadCodeElimination, so extending that to
check for self-reference not across lambdas should be not to hard.

On Wed, Aug 26, 2015 at 9:33 AM, Richard Feldman [email protected]
wrote:

I opened #1029 #1029
about JS module compatibility.


Reply to this email directly or view it on GitHub
#1024 (comment)
.

@rtfeldman
Copy link
Member

I believe @laszlopandy's suggestion of "specify main in elm-package.json" would be a better fix for "The bad error messages when folks say something like Elm.fullscreen(Erm.Main) in JS" because it eliminates that class of errors.

In that world you'd just call Elm.fullscreen() and it would work, because the notion of what main to run would have been already incorporated into the compiled output, and verified at compile time.

@laszlopandy
Copy link
Contributor

Richard, not if you have multiple mains. Which apparently circuit hub needs.
That's why I promised elm-package.json has a list of main modules.

On Wednesday, August 26, 2015, Richard Feldman [email protected]
wrote:

I believe @laszlopandy https://github.com/laszlopandy's suggestion of
"specify main in elm-package.json" would be a better fix for "The bad
error messages when folks say something like Elm.fullscreen(Erm.Main) in
JS" because it eliminates that class of errors.

In that world you'd just call Elm.fullscreen() and it would work, because
the notion of what main to run would have been already incorporated into
the compiled output, and verified at compile time.


Reply to this email directly or view it on GitHub
#1024 (comment)
.

@rtfeldman
Copy link
Member

I'm curious if multiple mains are a strict "must have" there, or just the easiest way to implement given current tools?

@evancz
Copy link
Member Author

evancz commented Aug 26, 2015

@rehno-lindeque, I know you were building multiple mains in one call to elm-make. Folks are discussing some changes to elm-package.json that might make this easier to specify. Everyone, please continue this discussion on elm-lang/elm-make#49.

@rtfeldman and others, please try to self police the curation stuff described here when possible.

@rtfeldman
Copy link
Member

Argh, sorry about that. Should have made a separate issue.

@evancz
Copy link
Member Author

evancz commented Aug 26, 2015

No worries, it's not an immediately intuitive way to do things (took me 3 years to think of it) so I expect I'll be adding reminders like that for a while ;)

@evancz
Copy link
Member Author

evancz commented May 12, 2016

Done enough with 0.17. Makes sense to manage DCE as a separate thing as it is more complicated than "just doing it" as discussed with @justinmanley yesterday!

@evancz evancz closed this as completed May 12, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants