Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disallow ESM-CJS-ESM-ESM cycles when using require(esm) #52145

Closed
nicolo-ribaudo opened this issue Mar 18, 2024 · 14 comments · Fixed by #52264
Closed

Disallow ESM-CJS-ESM-ESM cycles when using require(esm) #52145

nicolo-ribaudo opened this issue Mar 18, 2024 · 14 comments · Fixed by #52264
Labels
esm Issues and PRs related to the ECMAScript Modules implementation.

Comments

@nicolo-ribaudo
Copy link
Contributor

Version

Unreleased (5f7fad2)

Platform

n/a

Subsystem

esm

What steps will reproduce the bug?

// c.mjs
import { createRequire } from "module";
console.log("Start c");
createRequire(import.meta.url)("./d.mjs");
throw new Error("Error from c");
// d.mjs
import "./c.mjs";
console.log("Execute d");
// b.mjs
import "./c.mjs";
console.log("Execute b");
// a.mjs
try {
  await import("./b.mjs");
  console.log("Dynamic 1 didn't fail")
} catch (err) {
  console.log(err);
}

try {
  await import("./d.mjs");
  console.log("Dynamic 2 didn't fail")
} catch (err) {
  console.log(err);
}

How often does it reproduce? Is there a required condition?

Whenever there is an ES modules that triggers a require() of a separate ES module that imports the original ES module.

What is the expected behavior? Why is that the expected behavior?

  • d.mjs contains some code and then a throw statement at the top level, so import("./d.mjs") must always reject (either because of the "some code", of because of the throw)
  • c.mjs imports d.mjs, so import("./c.mjs") must also always throw
  • b.mjs imports c.mjs, so import("./b.mjs") must also always reject

Indeed, out/Release/node --experimental-require-module ./d.mjs throws

Start c
node:internal/modules/esm/loader:274
      return job.module.getNamespaceSync();
                        ^

Error: require() cannot be used on an ESM graph with top-level await. Use import() instead. To see where the top-level await comes from, use --experimental-print-required-tla.

(which is a wrong error, but at least it is an error 😛) and out/Release/node --experimental-require-module ./c.mjs and out/Release/node --experimental-require-module ./c.mjs throw

Start c
Execute d
file:///.../c.mjs:5
throw new Error("Error from c");
      ^

Error: Error from c

What do you see instead?

However, if you import() first b.mjs and then d.mjs, the second import() does not reject! out/Release/node --experimental-require-module ./d.mjs outputs

(node:928766) ExperimentalWarning: Support for loading ES Module in require() is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)
Start c
Execute d
Error: Error from c
    at file://.../c.mjs:5:7
    at ModuleJob.run (node:internal/modules/esm/module_job:235:25)
    at async ModuleLoader.import (node:internal/modules/esm/loader:436:24)
    at async file://.../a.mjs:3:3
Dynamic 2 didn't fail

Additional information

I think the reason the second import() does not properly reject is because running two module evaluation algorithms at the same time messes up with the way V8 tracks cycles. Specifically, it violates the assumption made at step 11.c.ii of https://tc39.es/ecma262/#sec-innermoduleevaluation, causing them problems at step 16.

A solution to this would be, when doing require(esm), to check if the module or any of its transitive dependencies are currently being evaluated. Right now Node.js just checks if the module itself is being evaluated (and throws that top-level await error).

cc @joyeecheung (who is already aware of this, I'm just opening the issue so that it doesn't get forgotten)

@nicolo-ribaudo
Copy link
Contributor Author

nicolo-ribaudo commented Mar 18, 2024

Oh, I just noticed that, when building Node.js in debug mode, V8 crashes when running c.mjs:

out/Debug/node --experimental-require-module ./c.mjs
(node:930777) ExperimentalWarning: Support for loading ES Module in require() is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)
Start c


#
# Fatal error in ../../deps/v8/src/objects/module.cc, line 205
# Debug check failed: module->status() != kEvaluating (4 vs. 4).
#
#
#
#FailureMessage Object: 0x7ffef98cbe30
----- Native stack trace -----

 1: 0x55cfdded5ad8 node::DumpNativeBacktrace(_IO_FILE*) [out/Debug/node]
 2: 0x55cfde1252f9  [out/Debug/node]
 3: 0x55cfde12531d  [out/Debug/node]
 4: 0x55cfe0660f4b V8_Fatal(char const*, int, char const*, ...) [out/Debug/node]
 5: 0x55cfe0660f7b  [out/Debug/node]
 6: 0x55cfded97342 v8::internal::Module::PrepareInstantiate(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Module>, v8::Local<v8::Context>, v8::MaybeLocal<v8::Module> (*)(v8::Local<v8::Context>, v8::Local<v8::String>, v8::Local<v8::FixedArray>, v8::Local<v8::Module>), v8::MaybeLocal<v8::Module> (*)(v8::Local<v8::Context>, v8::Local<v8::String>, v8::Local<v8::Module>)) [out/Debug/node]
 7: 0x55cfdee31a78 v8::internal::SourceTextModule::PrepareInstantiate(v8::internal::Isolate*, v8::internal::Handle<v8::internal::SourceTextModule>, v8::Local<v8::Context>, v8::MaybeLocal<v8::Module> (*)(v8::Local<v8::Context>, v8::Local<v8::String>, v8::Local<v8::FixedArray>, v8::Local<v8::Module>), v8::MaybeLocal<v8::Module> (*)(v8::Local<v8::Context>, v8::Local<v8::String>, v8::Local<v8::Module>)) [out/Debug/node]
 8: 0x55cfded9824b v8::internal::Module::Instantiate(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Module>, v8::Local<v8::Context>, v8::MaybeLocal<v8::Module> (*)(v8::Local<v8::Context>, v8::Local<v8::String>, v8::Local<v8::FixedArray>, v8::Local<v8::Module>), v8::MaybeLocal<v8::Module> (*)(v8::Local<v8::Context>, v8::Local<v8::String>, v8::Local<v8::Module>)) [out/Debug/node]
 9: 0x55cfde446021 v8::Module::InstantiateModule(v8::Local<v8::Context>, v8::MaybeLocal<v8::Module> (*)(v8::Local<v8::Context>, v8::Local<v8::String>, v8::Local<v8::FixedArray>, v8::Local<v8::Module>)) [out/Debug/node]
10: 0x55cfddf77132 node::loader::ModuleWrap::InstantiateSync(v8::FunctionCallbackInfo<v8::Value> const&) [out/Debug/node]
11: 0x55cf7f8d199d 
[1]    930777 trace trap (core dumped)  out/Debug/node --experimental-require-module

@nicolo-ribaudo nicolo-ribaudo changed the title Disallow ESM-CJS-ESM-ESM cycles Disallow ESM-CJS-ESM-ESM cycles when using require(esm) Mar 18, 2024
@joyeecheung
Copy link
Member

I think we have two options:

  1. Only return the module if it's already evaluated. But ESM-CJS-ESM-ESM can still be allowed if, by the time CJS -> ESM happens, the cycle is already fully evaluated (unless that graph involves TLA)
  2. Distinguish between "modules (directly or indirectly) required() by CJS" and "modules (directly or indirectly) imported by ESM". And disallow cycles when the last ESM-ESM edge is the latter.

I think the first is probably easier to implement - just throw when the status is not evaluated. In practice, that's probably suits the "best effort" approach of CJS/ESM interop better - we give users whatever we can when it's safe, otherwise we throw. The latter would be a bit trickier to implement - we need to keep track of where the modules come from and search for cycles. Not impossible but I wonder if that's really worth the effort. We could just implement 1 first, and, if it turns out to be problematic, we do 2.

@nicolo-ribaudo
Copy link
Contributor Author

nicolo-ribaudo commented Mar 19, 2024

Only return the module if it's already evaluated. But ESM-CJS-ESM-ESM can still be allowed if, by the time CJS -> ESM happens, the cycle is already fully evaluated (unless that graph involves TLA)

I'm not sure I understand what you are proposing here. The cycle never fully evaluated when require() returns, because the module that contains require() is part of the cycle and thus it will not be finished evaluating yet.

@legendecas legendecas added the esm Issues and PRs related to the ECMAScript Modules implementation. label Mar 19, 2024
@joyeecheung
Copy link
Member

joyeecheung commented Mar 19, 2024

Hmm, I guess I was too jet-legged when I posted that, and was thinking about the other cycle we talked about (import cjs -> require(esm) cycle).

re. a.mjs, it seems in this case, d.mjs gets cached in the first case, so you wouldn't see an error later. I am not sure what really should be the expected behavior. Funny enough if you switch createRequire(import.meta.url)("./d.mjs"); to await import("./d.mjs") in c.mjs it gets even more bizzare (only logs Start c. On main with the recent changes I added in #51999 it will show you that await import("./b.mjs") in a.mjs was never resolved). I suspect we need to fix the caching strategy for this, not just for require(esm).

re. the DCHECK, we should add a status check before instantiation. I'll dig into it.

@joyeecheung
Copy link
Member

joyeecheung commented Mar 22, 2024

Working on a branch to just ban ESM -> CJS and CJS -> ESM edges when they participate in a cycle. For the example in the OP, b-d.mjs fails, the two cases in a.mjs fail individually but not together, still need to properly fix up the cache when the evaluation fails.

@joyeecheung
Copy link
Member

Opened #52264

@joyeecheung
Copy link
Member

joyeecheung commented Mar 31, 2024

On a side note, #52145 (comment) is still there (i.e. if this is await import(), the code below TLA never runs), though I am not sure what the spec says about that. Looks like a footgun to have silently never-executed code below await import() that trigger a cycle...

@Jamesernator
Copy link

Jamesernator commented Apr 4, 2024

Wouldn't it be better here to just check if the module is evaluating and simply return the namespace if it's already evaluating? This would more easily allow people to migrate file-at-a-time between CJS and ESM (and it mirrors how CJS already works so wouldn't be a big divergence).

i.e. The problem here basically reduces down to this:

globalThis.mod = new vm.SourceTextModule(`
    function requireImpl(specifier) {
        // specifier gets resolved to the currently to the currently evaluating module
        const mod = globalThis.mod;
        mod.evaluate();
        return mod.namespace;
    }
    
    console.log("Before require");
    const getMod = requireImpl("specifier-that-results-in-evaluate-of-the-current-module");
    console.log("After require");
`);

my suggested fix is just to check if evaluation is already happening:

globalThis.mod = new vm.SourceTextModule(`
    function requireImpl(specifier) {
        // specifier gets resolved to the currently to the currently evaluating module
        const mod = globalThis.mod;

        // In a cycle so just return the namespace
        if (mod.status === "evaluating") {
            return mod.namespace;
        }
        mod.evaluate();
        return mod.namespace;
    }
    
    console.log("Before require");
    const getMod = requireImpl("specifier-that-results-in-evaluate-of-the-current-module");
    console.log("After require");
`);

@nicolo-ribaudo
Copy link
Contributor Author

What if the require()d module itself is not evaluating, but one of its dependencies is?

@Jamesernator
Copy link

Jamesernator commented Apr 4, 2024

What if the require()d module itself is not evaluating, but one of its dependencies is?

True, it might be worth seeing if we can get a spec change here as this behaviour is very similar to the change proposed for deferred module evaluation.

In fact I wonder if, with that proposal and the fix, implementing require("esm") like this would just work:

const mod = new SourceTextModule(...);
mod.linkSync(); // Whatever the internal method here is for linking sync
// Accessing any member of the namespace triggers EvaluateSync
mod.namespace.ANY_PROP_TO_TRIGGER_EVAL;

@joyeecheung
Copy link
Member

joyeecheung commented Apr 5, 2024

This would more easily allow people to migrate file-at-a-time between CJS and ESM (and it mirrors how CJS already works so wouldn't be a big divergence).

Technically with CJS, it's not simply returning circular exports that are being loaded, but a proxy (pretty slow) that emits a warning if you are accessing a property before it's added to the exports object (which is still more of a guess - one can still create a race condition by modifying the exports later, so in the cycle then you can't be certain what version of the export you are getting from a consumer point of view).

nodejs-github-bot pushed a commit that referenced this issue Apr 8, 2024
This patch disallows CJS <-> ESM edges when they come from
require(esm) requested in ESM evalaution.

Drive-by: don't reuse the cache for imported CJS modules to stash
source code of required ESM because the former is also used for
cycle detection.

PR-URL: #52264
Fixes: #52145
Reviewed-By: Geoffrey Booth <[email protected]>
Reviewed-By: Guy Bedford <[email protected]>
Reviewed-By: Antoine du Hamel <[email protected]>
@joyeecheung
Copy link
Member

joyeecheung commented Apr 8, 2024

Should we keep it open in case we want to discuss further about the behavior? I guess for now at least we can just throw, as this is probably more of an edge case that most wouldn't care about. If somehow that proves to be inconvenient enough we can revisit to return the not-yet-populated namespaces for the cycle (probably requires a V8 & sepc change to get rid of the assertions in subgraphs though).

@joyeecheung
Copy link
Member

It might also be nice to print a require + import stack when the cycle happens. Though I guess that kind of traces have been long missing on the ESM front (for require() there is a stack that we maintain to enhance the errors, for ESM I guess so far one has to increase --stack-trace-limit and fish out relevant lines from the error stack trace when the ESM loader throws...).

@joyeecheung
Copy link
Member

joyeecheung commented Apr 10, 2024

It seems tc39/proposal-defer-import-eval#39 is what we'll need to not throw for cyclic edges. So until that ends up in the ECMA262 proper, we can just throw for now.

joyeecheung added a commit to joyeecheung/node that referenced this issue Jun 17, 2024
This patch disallows CJS <-> ESM edges when they come from
require(esm) requested in ESM evalaution.

Drive-by: don't reuse the cache for imported CJS modules to stash
source code of required ESM because the former is also used for
cycle detection.

PR-URL: nodejs#52264
Fixes: nodejs#52145
Reviewed-By: Geoffrey Booth <[email protected]>
Reviewed-By: Guy Bedford <[email protected]>
Reviewed-By: Antoine du Hamel <[email protected]>
joyeecheung added a commit to joyeecheung/node that referenced this issue Jul 30, 2024
This patch disallows CJS <-> ESM edges when they come from
require(esm) requested in ESM evalaution.

Drive-by: don't reuse the cache for imported CJS modules to stash
source code of required ESM because the former is also used for
cycle detection.

PR-URL: nodejs#52264
Fixes: nodejs#52145
Reviewed-By: Geoffrey Booth <[email protected]>
Reviewed-By: Guy Bedford <[email protected]>
Reviewed-By: Antoine du Hamel <[email protected]>
marco-ippolito pushed a commit that referenced this issue Aug 8, 2024
This patch disallows CJS <-> ESM edges when they come from
require(esm) requested in ESM evalaution.

Drive-by: don't reuse the cache for imported CJS modules to stash
source code of required ESM because the former is also used for
cycle detection.

PR-URL: #52264
Backport-PR-URL: #53500
Fixes: #52145
Reviewed-By: Geoffrey Booth <[email protected]>
Reviewed-By: Guy Bedford <[email protected]>
Reviewed-By: Antoine du Hamel <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
esm Issues and PRs related to the ECMAScript Modules implementation.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants