Write initial spec, and add some examples #21

Merged
merged 22 commits into from
Nov 16, 2022

nicolo-ribaudo (Member) commented Oct 28, 2022

This PR writes the initial spec text for module declarations. There are some TODOs left for some edge cases, but everything else is here. It's a diff on top of tc39/ecma262#2905.

I also added an examples folder with some examples: I used them to help myself specifying the correct semantics, and throwing them away would have felt wasteful :P

PREVIEW

This commit introduces the definition of a ModuleDeclaration:
    `module` [no LineTerminator here] Identifier `{` ModuleBody? `}`
and it extends the ModuleSpecifier production to allow importing
module declarations by id.

ModuleDeclarations are currently only allowed at the top-level of scripts
and modules (including both top-level modules, and nested ones). They
cannot be imported or exported yet.

Importing an undeclared module fragment, or a module fragment declared
in a nested module scope, is an early error:
    import foo; // ok
    import bar; // error, it's only visible inside `foo`
    import baz; // error, not declared
    module foo {
      import foo; // ok
      import bar; // ok
      module bar {};
    }

This commit does not define runtime semantics yet.

This commit adds an early error to prevent using
    import foo;
where foo has not been declared as a module fragment.

Module fragments must be declared either in the current module, or in
an outer module:

    module foo {
        module bar {}
        import bar; // ok
        import foo; // ok
    }
    module baz {
        import bar; // error
    }
    const block = module {};
    import block; // error

Module declarations are hoisted, so this is valid:
    import foo;
    module foo {}
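
A minimal sketch of the visibility rule described in these commit messages, written as plain JavaScript rather than proposal syntax. The `makeScope` and `canImport` helpers are hypothetical names used only for illustration and are not part of the spec text. The model assumes that a name is importable when it is declared in the current module or in an enclosing one, and that all declarations are registered before any import is checked (which is what makes hoisting work).

```javascript
// Hypothetical model: a scope records the module declarations it contains
// and a link to the enclosing module scope (null at the top level).
function makeScope(parent, declaredModules) {
  return { parent, declared: new Set(declaredModules) };
}

// Walk outward from the importing scope. Order within a scope is irrelevant
// because all declarations are registered before any import is checked.
function canImport(scope, name) {
  for (let current = scope; current !== null; current = current.parent) {
    if (current.declared.has(name)) return true;
  }
  return false;
}

// Mirrors the example above: `foo` contains `bar`; `baz` is a sibling of `foo`.
// `block` is a `const` holding a module expression, so it is never registered.
const topLevel = makeScope(null, ["foo", "baz"]);
const insideFoo = makeScope(topLevel, ["bar"]);
const insideBaz = makeScope(topLevel, []);

const visibility = {
  barFromFoo: canImport(insideFoo, "bar"),    // declared in the current module
  fooFromFoo: canImport(insideFoo, "foo"),    // declared in an outer module
  barFromBaz: canImport(insideBaz, "bar"),    // declared only inside a sibling
  blockFromTop: canImport(topLevel, "block"), // not a module declaration
};
console.log(visibility);
```

In this model the first two lookups succeed and the last two fail, matching the `// ok` and `// error` annotations in the commit message.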

This commit makes this code work:

    module foo {};
    typeof foo; // object
    foo instanceof ModuleBlock; // true
    await import(foo);

This code should now work:

    module foo {
      export { x } from "./external";
      export const y = 2;
    }
    module bar {
      export { x, y } from foo;
      export const z = 3;
    }
    import { x, y, z } from bar;

This commit allows the following code to work:

    // a.js
    import { mod } from "./b.js";
    import { value } from mod;

    // b.js
    export module mod { export const value = 1; }

It also supports re-exporting module declarations, except for
declarations re-exported using `export * from ...`.
Additionally, it may not work correctly with import cycles.
}
```

> **TODO**: Should the _binding_ introduced by the module declaration be initially in TDZ, as it is for class declarations?

Member

No. It is not like a class, which can have a static init block that contains user code.

Member Author

TODO removed - eb53890 (#21)

Classes initially didn't have static blocks either, but they still had exports ... that contained runtime code!

Member

Yeah, because they can extend a runtime expression.

Member

Yes, hoisting seems a lot more coherent than const-style bindings, given that these are read "early".

spec.emu Outdated
<emu-alg>
1. Let _name_ be the StringValue of |Identifier|.
1. Let _currentScriptOrModule_ be GetActiveScriptOrModule().
1. Assert: _currentScriptOrModule_ is not *null*.

Member

button.setAttribute("onclick", `
  Promise.resolve('module block {}; import(block).catch(e => print("caught", e))')
       .then(eval);
`);

see whatwg/html#3295

Member Author

Oh that's annoying, thank you.

Member Author

I'm not sure about how to fix this, I'll just add a TODO for now.

Member Author

Solved by d21d66d (#21).

spec.emu Outdated
</emu-alg>

<emu-grammar>
ScriptBody : ScriptItemList

Member

I have a question about this ScriptBody.

<script>
module A {}
</script>
<script>
import(A) // does it become a global reference?
</script>

Member Author

Yes, A becomes a global reference because that code behaves like this:

<script>
const A = module {}
</script>
<script>
import(A);
</script>

@@ -0,0 +1,47 @@
# Hoisting

Module declarations are hoisted, and can be imported before their declaration:

Member

If we allow them to be hoisted, then we will need to special-case `export default module Identifier {}` and `export default module {}` (just like we did for functions).

Member Author

Done in 240191d (#21)

spec.emu Outdated

ModuleItemList :
ModuleItem
ModuleItemList ModuleItem

Member

so it can only appear at the top of a Script or a Module?

{
    module x {}; // syntax error?
    import(x)
}

Member Author (nicolo-ribaudo, Oct 31, 2022)

Yes. If there is a good reason to allow them in nested blocks then we can do it (however, it considerably complicates the static semantics), but since they can always be manually moved to the top-level scope it doesn't seem like a bad restriction. There are already other constructs that are limited to the top-level (export and import), + you can always use a module expression.

Member

Huh, I definitely imagined that they'd be permitted nested anywhere; it's just that they'd only have the ability to be statically imported if they were at the top level. Nesting might be useful because all sorts of code transformations end up making certain top-level code nested; we should insert restrictions only where needed semantically IMO.

Nested module declarations might be less useful than top-level ones, but a natural place where you might expect them to "just work" (even though it's not really a significant use of the feature) is to declare a module and then use it as a variable (as a module block), in some nested code. For this case, one could even imagine a bundler placing a number of module fragments in the same inner nested scope, so that the module block passed to the worker closes over them.


$$ {\text{module declarations} \over \text{module expressions}} = {\text{function declarations} \over \text{function expressions}} $$

Except for the "they are static declaration parts", module declarations behave exactly like module expressions assigned to a `const` variable.

Member

so, like const/let - the declaration hoists to the top of the block, and the initialization does not hoist, creating a TDZ between the start of the block and the module declaration?

Member Author

I pushed eb53890 (#21) to clarify it: yes you can use them before their definition, but it's more like function than like var (because their value is already present).

Member

and it has function scope or block scope?

Member Author (nicolo-ribaudo, Oct 31, 2022)

Block scope, but since they can only appear at the top level of scripts/modules the only noticeable difference is that <script>module a {}</script> doesn't create a property on window. However, I consider this detail a minor semantic difference that we can iterate on if needed.

Member

why would module declarations - or expressions - only be able to appear at the top level? They can be used in expression position with dynamic import.

Member Author (nicolo-ribaudo, Oct 31, 2022)

Module expressions can appear anywhere. Module declarations can only appear at the top-level because the only benefit that module declarations give compared to module expressions is that you can statically import them, and static imports can only appear at the top level. Also, since module declarations don't capture variables you can easily manually move them to the top level.

I don't have a good reason to disallow them in any position other than "it's unnecessary, and it makes the static analysis of visible module declarations harder": I put the simpler version in this first version of the spec text to be able to focus on the major semantics, but please consider this as still up for debate. If there is a use case for module declarations in arbitrary positions then I'm happy to support them.

(ref #21 (comment))

Comment on lines +23 to +24
import { x } from modX;
import { y } from modY;

Member

this is interesting. you can import from a first-class reified value?

what happens if i have:

const { modX } = await import('./file-a.js');
import { x } from modX;

?

Member

this is a link error. `import { x } from modX` runs in the link stage.

Member

Right, but that seems very confusing - how do users know when a first-class value can be statically imported from or not?

Member

they must be statically declared as a ModuleDeclaration. maybe the type checker will help

Member

There is no type checker - not everyone uses TS and proposals can't assume there will be a linter, let alone a type checker.

Member

this is like how you know the imported path is valid, or if the dependency graph contains a Top-Level-Await.

Member Author

Especially around modules, there are already several things that users need to know, where linters and type checkers help, and the language assumes that they do "the right thing" even if they don't use a linter (otherwise, it throws an error):

  • As @Jack-Works mentioned, they have to write a correct import path otherwise the module will throw when loaded
  • Users have to know the name of exports of a module (you cannot import { x } from a module that only exports y)
  • Users have to know the type of exports of a module: in const { f } = await import("mod"); f() they need to know that f is a function, even if they don't use a type checker

> how do users know when a first-class value can be statically imported from or not?

In general you cannot "import from values": linking happens before execution, so before that those values even exist. Users who use import declarations will have to replace their "I can import from files" knowledge with "I can import from files or module declarations".

Additionally, it's already known that imports are hoisted to the top of the file[1], or at least that imports are "executed" before any code in the file. If you think about JS modules as two sections:

// all the imports get moved to this section
import { x } from modX;

// all the runtime code gets moved to this section
const { modX } = await import('./file-a.js');

then it becomes more evident that you cannot import from runtime values, because they are not initialized yet when the imports run.

Footnotes

  1. The reality is not that they are hoisted, but that they run in a completely different phase (linking vs evaluation). However, I have only ever seen it taught as "they are hoisted" - and it's a good enough approximation of what happens.
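
The two-phase picture above can be sketched with a toy loader. This is a simplified model and not the spec's linking algorithm: `linkThenEvaluate`, `staticImports`, and `moduleDeclarations` are hypothetical names, and real linking does far more than check names. The only point it illustrates is that static imports are resolved against what is declared in the source before any runtime code runs.

```javascript
// Toy model: linking only sees what is statically declared in the source;
// evaluation (the runtime code) starts only after linking has succeeded.
function linkThenEvaluate(source) {
  const declared = new Set(source.moduleDeclarations);
  for (const specifier of source.staticImports) {
    if (!declared.has(specifier)) {
      throw new SyntaxError(`link error: '${specifier}' is not a module declaration`);
    }
  }
  return source.evaluate();
}

// Case 1: `modX` would only exist as a runtime value, so linking fails
// and the runtime section never gets a chance to create it.
let runtimeCodeRan = false;
let linkError = null;
try {
  linkThenEvaluate({
    moduleDeclarations: [],
    staticImports: ["modX"],
    evaluate() { runtimeCodeRan = true; return "evaluated"; },
  });
} catch (error) {
  linkError = error;
}

// Case 2: `modX` is a module declaration, so linking succeeds.
const okResult = linkThenEvaluate({
  moduleDeclarations: ["modX"],
  staticImports: ["modX"],
  evaluate() { return "evaluated"; },
});
```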

@@ -0,0 +1,47 @@
# Hoisting

Module declarations are hoisted, and can be imported before their declaration:

Member

this may answer/contradict my earlier question/comment - if they're hoisted to the top of the block, can you reference them in an expression before they're defined also, like var?

Member Author (nicolo-ribaudo, Oct 31, 2022)

Yes you can, clarified in eb53890 (#21)!

console.log(x); // 1
```

Module declarations can also be shadowed by other variables: in that case they cannot be imported anymore.

Member

how, if you can't reassign module declarations, and module declarations are hoisted?

Member

shadowed like this:

module x {
    let x = 1
    import x // error! shadowed by local declaration x
}

{
    let x;
    import(x) // error! undefined is not a valid specifier
}
import x // ok

Member (ljharb, Oct 30, 2022)

Makes sense, but I’m not sure if this example matches that.

Member Author

The example below is identical to this one, except that:

  • I used different variable/module names
  • I used two modules (foo and nested) instead of one (x), to avoid re-using the same one for two purposes (declaring the module, and trying to statically import a shadowed module).

Member Author

Oh maybe the confusion comes from different meanings of hoisting.

Not hoisted as in "available in an outer scope":

{
  var x = 1;
}
x === 1;

but as in "available before their definition":

f();

function f() {}

Member

Why would we want to add another construct to the language that’s available before its definition? That’s why ES6 has the TDZ for let/const imo - to avoid the mistake of var and function declarations.

Member Author

  • var hoisting is problematic because the declaration is hoisted, but its value is not and thus you can read the variable but it's not really usable.
  • function hoisting is not considered problematic in general: some coding styles prefer to keep functions at the bottom, in a "the more you scroll the more internal details you see" pattern. It causes problems when the function captures a variable from the outer scope, and the function is called before that variable is initialized. Module declarations don't have this problem, because they don't capture variables.
  • class didn't follow the function pattern because, as noted by @Jack-Works in #21 (comment), class declarations execute some code when evaluated and thus they cannot be used before evaluating the declaration.
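
The three behaviors in this list can be observed in today's JavaScript. The sketch below is only an illustration of existing `var`, `function`, and `class` semantics (the `hoistingDemo` name and the shape of the returned object are made up for this example); it does not use module declarations.

```javascript
function hoistingDemo() {
  const observed = {};

  // var: the binding is hoisted, but its value is not.
  observed.varBeforeInit = typeof earlyVar; // reads `undefined`, no error
  var earlyVar = 1;

  // function: both the binding and the value are hoisted.
  observed.functionBeforeDeclaration = greet();
  function greet() { return "called"; }

  // class: the binding is in the TDZ until the declaration is evaluated.
  try {
    new Late();
    observed.classBeforeDeclaration = "ok";
  } catch (error) {
    observed.classBeforeDeclaration = error.constructor.name;
  }
  class Late {}

  // The function-hoisting pitfall: the hoisted function captures a binding
  // that has not been initialized yet at the time of the call.
  try {
    readCaptured();
    observed.capturedBeforeInit = "ok";
  } catch (error) {
    observed.capturedBeforeInit = error.constructor.name;
  }
  let captured = 1;
  function readCaptured() { return captured; }

  return observed;
}

const hoistingLog = hoistingDemo();
console.log(hoistingLog);
```

The last case is the one that module declarations would avoid, since they do not close over surrounding variables.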

Both TDZ and no TDZ would be fine for me: I'll open an issue to discuss it as soon as this is merged, since with just two reviews I got two different opinions 😛

Member

I don't agree that relying on function hoisting isn't considered problematic in general - there are coding styles for everything, but that doesn't mean everything's fine.

@Jack-Works
Member

Some new thoughts after seeing @ljharb's review.

  1. It should be like a function declaration. Hoisted, usable before definition, and no TDZ.
  2. It should not be limited to the top level: this limitation does not make sense and breaks the symmetry between declarations and expressions. IMO a module declaration should be hoistable and block-scoped.
{
    import(a) // ok
    module a {}

    // means:
    // const a = module {} // hoisted
    // import(a)
}
a // reference error

But this requires more spec work and also creates a separate link-stage variable namespace.

module a {}
{
    module a { "hit this one!" }
    import(module {
        import a
    })
}
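
The "hoistable, block-scoped" behavior in point 2 resembles how function declarations inside blocks already behave in strict mode. Here is a runnable analogy in today's JavaScript, with a function declaration standing in for the proposed `module a {}`; the analogy and the `blockScopedHoisting` name are assumptions of this sketch, not spec text.

```javascript
function blockScopedHoisting() {
  "use strict"; // in strict mode, functions declared in a block are block-scoped

  let calledBeforeDeclaration;
  {
    // Usable before its declaration, but only inside this block.
    calledBeforeDeclaration = a();
    function a() { return "inner a"; }
  }

  return {
    calledBeforeDeclaration,
    // `typeof` is used so that the missing binding does not throw.
    visibleOutsideBlock: typeof a !== "undefined",
  };
}

const blockResult = blockScopedHoisting();
console.log(blockResult);
```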

@mhofman
Member

mhofman commented Oct 31, 2022

If a module declaration is not at the top level, would you expect it to be statically importable? I suppose not since the binding would not exist in the top level scope. In that case, would they be strictly equivalent to the corresponding module expression assigned to a variable, or would there be any semantic differences?

Edit: I see that this PR clarifies that module fragments would not have a public identifier like suggested in #15 and so even top level module declarations are already strictly equivalent to a module expression.

@@ -2,17 +2,17 @@

$$ {\text{module declarations} \over \text{module expressions}} = {\text{function declarations} \over \text{function expressions}} $$

- Except for the "they are static declaration parts", module declarations behave exactly like module expressions assigned to a `const` variable.
+ Except for the "they are static declaration parts", module declarations behave exactly like module expressions assigned to a `const` variable, except that they are hoisted similarly to function declarations.

Member

We should probably say, "strict mode function declarations", so that this isn't taken to include Annex B.3.3 :)

@nicolo-ribaudo
Member Author

nicolo-ribaudo commented Oct 31, 2022

@mhofman The difference is that module declarations are statically analyzable, so you can statically import them:

module mod { export const x = 2 }
import { x } from mod; // works

const mod = module { export const x = 2 }
import { x } from mod; // error! At linking time (before evaluation), there is no way to know that `mod` is a module

@Jack-Works I'm starting to write spec text for module declarations in blocks/functions :)

@nicolo-ribaudo
Member Author

nicolo-ribaudo commented Nov 1, 2022

Ok, I pushed a commit to support module declarations nested in blocks and functions.

The main complexity comes from dividing the logic to validate/instantiate module declarations in two places:

  • for function f() { module mod {} } it must necessarily be tied to the current lexical environment record, because we don't want all module declarations to be generically attached to the top-level of the module;
  • for export module mod {} it must be tied to the module itself, because it needs to be available during module loading, which happens before creating execution contexts and environment records.

There are a bunch of TODOs left in the spec text, but they are all self-contained.

@nicolo-ribaudo
Member Author

Re-reading the discussions, the only unresolved one is about the strict-function-like hoisting behavior, and there are points in favor of both alternatives. I'll keep it as it is for now (so, without TDZ), but I'll make sure to mention it in plenary during the presentation.

There are some TODOs left in the spec text, but it should be enough for the "Initial spec text" criterion :)

nicolo-ribaudo merged commit 1c10ccb into tc39:main on Nov 16, 2022
nicolo-ribaudo deleted the initial-spec branch on November 16, 2022 21:42