
[TC39] Potentially blocking "Module Attributes" proposal #91

Closed
devsnek opened this issue May 26, 2020 · 40 comments
Labels
standards-agenda (To be discussed in bi-weekly meeting)

Comments

@devsnek
Contributor

devsnek commented May 26, 2020

"Module Attributes" (https://github.com/tc39/proposal-module-attributes) is requesting advancement to stage 2. Currently my concerns for the proposal are as follows:

Cache semantics

  • Per-module cache break from (resolver, specifier) pairs
    • Possible fix: enforce that all metadata matches within module
    • Possible fix: something like for import 'x' use type: 'json' at top of file
  • Global cache break in real world implementations
    • Possible fix: validate that all attributes applied to live module record are exactly the same, early error otherwise
  • Modules cannot portably access metadata
    • If above issues are fixed, modules can be given access to metadata
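
To make the cache concern concrete, here's a minimal sketch (plain JS, all names hypothetical, not the proposal's spec text) of a host module cache keyed only by specifier, with the "early error on mismatched attributes" fix applied:

```javascript
// Sketch: a host module cache keyed only by specifier. When the same
// specifier arrives with different attributes, the host must pick a policy;
// here we model the hypothetical "early error" fix: attributes must match.
const cache = new Map();

function hostLoad(specifier, attributes = {}) {
  const cached = cache.get(specifier);
  if (cached) {
    if (JSON.stringify(cached.attributes) !== JSON.stringify(attributes)) {
      throw new TypeError(
        `${specifier} already loaded with different attributes`);
    }
    return cached; // same attributes: reuse the cached module record
  }
  const record = { specifier, attributes, namespace: {} };
  cache.set(specifier, record);
  return record;
}

const a = hostLoad('./data.json', { type: 'json' }); // populates the cache
const b = hostLoad('./data.json', { type: 'json' }); // ok, attributes match
// hostLoad('./data.json', {});  // would throw: attribute mismatch
```

The alternative behaviors (racing so the first importer wins, or cloning the module per attribute set) are the host choices debated later in the thread.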

Standardized handling of parameters

  • Host-specific metadata is not portable
  • Security invariants may be broken if a host ignores an attribute or interprets it incorrectly
    • I don't have a possible fix for this

Stage 2 requests that all major semantics are covered, and these seem fairly major to me, so I'm not sure this proposal should continue to stage 2 quite yet.

Delegate expectations are still in the works (#84) so I'm not really sure what the process here should be.

@MylesBorins
Contributor

I am one of the champions of this proposal, so I obviously have a bias here. Would it make sense for me to recuse myself from this conversation or should I actively participate?

@devsnek
Contributor Author

devsnek commented May 26, 2020

@MylesBorins i think discussion is fine (not allowing discussion would probably be bad for the goal of allowing forward progress), though if we eventually landed on some sort of voting procedure it might make sense to recuse yourself from that.

@MylesBorins added the standards-agenda label May 26, 2020
@MylesBorins
Contributor

It might be necessary for us to have an out of band meeting to discuss process issues here, or potentially work it out in this thread. We have not yet reached any sort of consensus on what is needed for an OpenJS delegate to be able to block at a committee, or if they should even be able to.

We also have to consider the fact that multiple members of this working group are champions of the proposal that is potentially being blocked.

@devsnek we have talked about requiring a degree of consensus to bring forward a block. Have you done any work building consensus across OpenJS members about this?

@ljharb
Member

ljharb commented May 26, 2020

I have similar concerns - in addition, it will create a refactoring hazard where i won't always be able to change my module freely between types without it being a breaking change; I also continue to believe that the only place the type should be defined is the export site, not the import site.

I'm in support of OpenJS blocking on stage 2 at this time.

@devsnek
Contributor Author

devsnek commented May 26, 2020

@MylesBorins I agree this is definitely not the best situation. I opened this issue to discuss it and come to some conclusion, possibly (hopefully) involving consensus.

@MylesBorins
Contributor

So, and this is where my input might need to be recused... I disagree with the block.

@ljharb the hazard argument does not make sense in a world where folks are including the file extension. The security concerns raised by the W3C folks were specifically about a privilege-escalation attack if a JSON module were changed into a JS module. The proposal does not preclude other means of identifying module type, such as out-of-band metadata... so if you wanted to write code in an environment that supported file-extension resolution + out-of-band metadata, it would work.

Based on your belief that "the only place the type should be defined is the export site, not the import site", there is no path forward to getting officially supported JSON modules in the JS platform that would be supported by both browsers and server-side runtimes.

I think we might want to be careful about this issue turning into a back-and-forth pile-on of feedback... although perhaps if there isn't clear consensus in the issue, that should make it clear that we should not be blocking as a foundation. Alternatively, if there is a strong enough voice + evidence that a block should happen, perhaps that would be enough. I'll bring the meta conversation to #84

I'll speak to Gus's more specific blocks with more research if we decide this is the right venue for the discussion.

@ljharb
Member

ljharb commented May 26, 2020

Specifically, I think that if there's not consensus here, we should not be joining consensus to advance the proposal. Despite how a proposal champion may choose to word the question, consensus at TC39 is for advancement, not blocking.

@devsnek
Contributor Author

devsnek commented May 26, 2020

I'm hoping we can just constructively find consensus, in either direction.

@dtex
Contributor

dtex commented May 26, 2020

Which Foundation projects have an explicit interest in this proposal, and how would this block/advancement affect them?

@MylesBorins
Contributor

@dtex the module attributes proposal is the current best possible path to natively supporting json modules in JavaScript (allowing JSON to be imported).

One of the biggest groups this affects is the node.js modules team... which does not have consensus on the proposal (Myself, Jordan, and Gus are all members)

@ljharb
Member

ljharb commented May 26, 2020

It will also affect eslint and mocha and electron, all of which may want to import json config files; probably also amp.

@dtex
Contributor

dtex commented May 26, 2020

What about those other projects, have they weighed in on the proposal?

@ljharb
Member

ljharb commented May 26, 2020

cc @boneskull @nzakas @codebytere @tobie for thoughts? (please pass the buck if there's someone else on your projects who'd rather discuss it)

@littledan

In addition to hearing from other projects in the OpenJSF foundation, I wonder if we could hear more from Node.js collaborators. If this proposal is bad for Node.js, I think that'd be a serious enough issue, even if other parts of the OpenJSF don't have particular concerns. However, it's championed by a Node.js collaborator (@MylesBorins), while another Node.js collaborator (@devsnek) expressed serious concerns, so it's unclear how to read that. Can we reach out for broader feedback among Node.js collaborators to learn more?

@devsnek
Contributor Author

devsnek commented May 26, 2020

fwiw my concerns aren't really specific to node.

@GeoffreyBooth

GeoffreyBooth commented May 26, 2020

Can we reach out for broader feedback among Node.js collaborators to learn more?

Hi, I'm a Node collaborator and on the modules team with @MylesBorins and @devsnek and @ljharb.

I agree that @devsnek’s concerns over cache invalidation are valid and should be addressed, but I don’t see any philosophical debate there. Those strike me as details that need to be worked out, but there’s nothing stopping them from being addressed to everyone’s satisfaction.

The real potential blocker is @ljharb’s:

it will create a refactoring hazard where i won't always be able to change my module freely between types without it being a breaking change; I also continue to believe that the only place the type should be defined is the export site, not the import site.

So when adding support for ESM in Node, we were faced with a design decision related to how to interoperate ESM with Node's original module system, CommonJS. In very broad strokes, the options were along these lines:

  • No interoperability: CommonJS could require other CommonJS, ESM could import other ESM, but never the twain shall meet. (This was never really considered.)

  • “Consumer-driven” interoperability: If require was used, it would load the target as CommonJS; import would load ESM. The require function could be imported into ESM, and import() could be used in CommonJS. The NPM team created a detailed proposal and implementation around this design.

  • “Author-driven” interoperability: In ESM, import would load either CommonJS or ESM based on an algorithm inspecting the target to be imported. The algorithm initially more or less just looked at the filename for an .mjs extension; currently it does that plus looks for a package.json "type" field, and possibly an "exports" field that defines different files to be loaded for CommonJS versus ESM. This is what ended up shipping.

The reason the group went with the last option is because of what @ljharb alludes to, in that we wanted to create a seamless upgrade path for package authors to migrate their packages to ESM. So if lodash wants to start shipping an ESM version, it can do so without all of the folks who use lodash needing to update their references to it from require to import (the consumer-driven approach). This is more convenient for end users, at the expense of some slight package.json complexity on the part of package authors.

The Module Attributes proposal brings us back closer to the “consumer-driven” design, but instead of require versus import it would’ve been syntax like import _ from 'lodash' with type: "commonjs" or whatever. And so in this scenario, if lodash wants to start shipping an ESM version, it not only needs to do so but it also needs to get users to change that with type: "commonjs" to with type: "module".

I can see why that is undesirable for package authors, but only if the module attribute was mandatory. Presumably it has to be optional, in order for this not to be a breaking change, and so it doesn't strike me as a concern. If a package ships multiple versions of itself (CommonJS, ESM, WASM, whatever) and the user wants to explicitly define which one they’re importing, let them; it’s no different than if they used a full path like import _ from './node_modules/lodash/dist-commonjs/index.js', which they can do today.

Node will retain the author-driven approach it has now, in that the exported types will be defined by package.json files. The Module Attributes proposal provides a way for users to opt into both author- and consumer-driven interop, which is like a compatibility check where the import would fail if the desired type is unavailable from the package. If a user wants to opt into such a check, similar to adding a shasum, that seems like an enhancement to me.
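
For reference, the author-driven approach looks roughly like this in a package's manifest (a simplified sketch of Node's "type" field and conditional "exports", not lodash's actual package.json):

```json
{
  "name": "lodash",
  "type": "module",
  "exports": {
    ".": {
      "import": "./dist/index.mjs",
      "require": "./dist/index.cjs"
    }
  }
}
```

With something like this, require('lodash') and import 'lodash' resolve to different files without consumers having to change their code.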

P.S. I do agree with the motivation of this specific proposal, in that for security reasons in browsers we need both an author- and consumer-defined type, at least for JSON. If it were always author-defined, a JSON module could switch to JavaScript and suddenly be injecting code into an app that was expecting data. So there are good reasons to not want an always author-only design.

@littledan

To clarify about mandatory vs optional:

  • For existing JS modules, types definitely aren't mandatory; that would break everything
  • For JSON modules, it's mandatory that environments support with type: "json"; environments can then decide whether or not they want to support implicit JSON modules (the Web won't, but Node.js could)
  • If with type: were used for other things, like ESM/CJS? I don't understand those issues well enough to understand the implications. Certainly I don't see how the module attributes proposal could/would make the user-side type declaration mandatory, if the Node Module team already has a solution worked out without this proposal.

@littledan

To respond to the concerns (if this goes on for much longer we should probably move to issues in the module attributes repo):

  • Per-module cache break from (resolver, specifier) pairs
    • Possible fix: enforce that all metadata matches within module
    • Possible fix: something like for import 'x' use type: 'json' at top of file

In general, there may be two types of module attributes: those which just "check" what the module is (and type is required to be that), and those which change the interpretation of the module (several were suggested in the repo; for a simple example, imagine one that freezes the JSON module, or one that gives certain capabilities to a WASI module). I think the latter type should be part of the cache key, and it should be fine to load the same (specifier, referrer) pair multiple times with different values here. If you reuse the same "change" settings multiple times, you'd get the same module.
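
A rough sketch of that check/change split, assuming a hypothetical "freeze" change-attribute (plain JS, not spec text): check attributes are validated at load time, while change attributes join the cache key:

```javascript
// "change" attributes alter interpretation, so they join the cache key;
// "check" attributes only assert the module's type, so they do not.
const CHANGE_ATTRS = new Set(['freeze']); // hypothetical change attribute

const cache = new Map();

function cacheKey(specifier, attributes) {
  const changes = Object.entries(attributes)
    .filter(([k]) => CHANGE_ATTRS.has(k))
    .sort();
  return specifier + '|' + JSON.stringify(changes);
}

function load(specifier, attributes = {}, actualType = 'json') {
  if ('type' in attributes && attributes.type !== actualType) {
    // "check" attribute fails: error at load time
    throw new TypeError(`expected ${attributes.type}, got ${actualType}`);
  }
  const key = cacheKey(specifier, attributes);
  if (!cache.has(key)) cache.set(key, { specifier, attributes });
  return cache.get(key);
}

// Same "change" settings share a module; differing ones get distinct copies.
const a = load('./x.json', { type: 'json' });
const b = load('./x.json', { type: 'json' });
const c = load('./x.json', { type: 'json', freeze: true });
// a === b, a !== c
```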

  • Global cache break in real world implementations
    • Possible fix: validate that all attributes applied to live module record are exactly the same, early error otherwise

If you use "check" attributes wrong, then you'll get an error at module load time. Isn't this early enough? Which other kinds of mismatch/"cache break" cases are you worried about?

  • Modules cannot portably access metadata

Giving metadata to modules would only make sense if those pieces of metadata were part of the cache key. Otherwise, there would be a race to see who loads it first, and that would determine the metadata. An environment could do this with all unknown attributes, maybe. The only reasonable place to expose it is on import.meta, which is currently entirely environment-defined. I'm still not sure what the use case for accessing this metadata is, though; I think module attributes, just like the specifier, are for asking for things from the host.

If we had out-of-band module metadata, we'd face analogous issues: if each package had a file declaring what the metadata for its imports is, then we'd have to somehow merge these files all into one some time during startup/build. Then, we'd have to figure out how to deal with conflicts, just like if the unknown module attributes weren't part of the cache key.

Overall, I think it'd be a bit weird to non-compositionally throw due to mismatches from different import statements in the module graph, whether we use in-band or out-of-band metadata. Instead, each import does its own individual checking, and would individually fail if it doesn't pass the type check. This way, combining packages doesn't blow up randomly, just from differing metadata--you could run the tests in the packages individually and get the error isolated.

  • Standardized handling of parameters
    • Host-specific metadata is not portable

The idea of the proposal is to define as much host-independent stuff as we can (e.g., type: "json"), but also allow hosts to do more (e.g., on the Web, maybe HTML and CSS modules; maybe CJS/ESM assertions as described in #91 (comment)). I hope that, when there's environment-independent things in the future, we can define them in TC39. Isn't this all par for the course for JS, where some things are portable and others aren't? I think, with out-of-band metadata, we'd also want some non-portable, host-specific metadata.

  • Security invariants may be broken if a host ignores an attribute or interprets it incorrectly

Yeah, hosts have the ability to subvert security in all sorts of ways.... I don't really have a fix for that (documentation for host authors??), or understand how module attributes face this in a different way from anything else in JS.

@jkrems

jkrems commented May 26, 2020

I think part of the confusion is that the proposal currently tries very hard to stay impartial when it comes to its impact on module resolution and module identity. My impression is that it may be too impartial. It seems like for type: "json" there's now a strong "no impact on identity and resolution" stance but that still leaves a big question mark about forward- and cross-runtime compatibility: While a runtime without support for type: "json" specifically can safely ignore that attribute, any other attribute cannot be ignored because it may or may not lead to completely different semantics.

If I'm understanding @devsnek correctly, I share his concerns that having one clear semantic for all attributes would make it easier to evaluate the proposal. If it's not possible to safely/gracefully ignore future attributes (or attributes that only matter for other kinds of JS runtimes), that introduces a big divergence risk between the web and node. I would argue that it also introduces a future risk for attributes in browsers.

@littledan

littledan commented May 26, 2020

In the module attributes proposal, all JS engines are required to support type: "json", but this is still a potential risk for other attributes. Our past discussion: tc39/proposal-import-attributes#21 : I've argued, after being convinced by @jkrems for the reasons he states above, that we should throw on unknown attributes. Fellow OpenJSF TC39 delegate @gibson042 argued, we should ignore them, and that it's safe to do so. The current draft leaves this decision up to hosts. I'd be open to tightening the requirements one way or another, or making an informal recommendation to hosts. Ultimately, I think the same compatibility/evolution issues would apply to an out-of-band format as well, about how we treat unrecognized metadata (in terms of ignore vs cause an error).

@jkrems

jkrems commented May 27, 2020

Fellow OpenJSF TC39 delegate @gibson042 argued, we should ignore them, and that it's safe to do so.

I think I agree with @gibson042 that we should ignore them. Maybe there was a misunderstanding on this end in past discussions (or I changed my mind? It's been a while.). :) As long as module attributes may not change identity or resolution, ignoring should be safe. And I believe that would be the most future-safe arrangement. Specifying that all attributes are always cache- and resolution-neutral would completely remove my concerns.

@littledan

Sorry, right, I believe I was convinced by you in a more oblique way. If we specified that none of the module attributes are part of the cache key, it'd rule out a very large fraction of the module attributes use cases that have been posited (e.g., capabilities for Wasm modules, reinterpreting things with different options, etc). That's why I felt like, the unknown attributes should be an error--to enable those in the future.

@littledan

Note, we couldn't simultaneously (1) Omit module attributes from the cache key (2) Expose them (e.g., as import.meta.attributes) (3) ignore the unknown attributes, because it would lead to the "race" that whoever imports the module first sets the attributes. Or, if you did an early check that it was always loaded with the same attributes across the whole module graph, then it could blow up if you combined two packages which work independently. (Such a blow-up/race choice would also be faced by tooling which combines out-of-band attributes for two packages which are used together.)
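
The race can be sketched in a few lines (a toy cache, with the attributes object standing in for import.meta.attributes, which is a hypothetical exposure point here):

```javascript
// Attributes are exposed but NOT part of the cache key: whoever imports
// first silently determines the metadata every later importer observes.
const cache = new Map();

function load(specifier, attributes = {}) {
  if (!cache.has(specifier)) {
    cache.set(specifier, { specifier, attributes }); // first importer wins
  }
  return cache.get(specifier); // later attributes are silently ignored
}

const first = load('./x.json', { foo: 'bar' });
const second = load('./x.json', { foo: 'baz' });
// second.attributes.foo is 'bar': the second importer's value was lost
```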

@jkrems

jkrems commented May 27, 2020

capabilities for Wasm modules

We're definitely entering wild speculation territory here but wouldn't that be possible with cache-neutral attributes as well? E.g. if a module is loaded with certain capabilities first and then requested with incompatible capabilities, it could fail. Assuming that the assertion is defined at the import point (which it would have to be if it's not part of the cache key).

import './x.wasm' with accessTo: ['files'];
import './x.wasm' with accessTo: ['net']; // import assertion error: capabilities mismatch

It does make the behavior-difference aspect fuzzier, but I think it's still safely cache-neutral (though it introduces a race condition where the 2nd checked import in the graph fails). Bloating the module graph with duplicates of differently interpreted modules doesn't sound like something that should be easy. Explicit duplication is still possible via query args etc. to break the cache.
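
A sketch of that cache-neutral capability check, with accessTo as the hypothetical attribute from the example above:

```javascript
// The first import fixes the capability grant; a later import with an
// incompatible set fails instead of creating a second copy of the module.
const granted = new Map();

function importWasm(specifier, accessTo = []) {
  const want = [...accessTo].sort().join(',');
  if (!granted.has(specifier)) {
    granted.set(specifier, want); // first import wins the grant
    return { specifier, accessTo };
  }
  if (granted.get(specifier) !== want) {
    throw new TypeError(`capabilities mismatch for ${specifier}`);
  }
  return { specifier, accessTo };
}

importWasm('./x.wasm', ['files']);  // ok: establishes the grant
// importWasm('./x.wasm', ['net']); // would throw: capabilities mismatch
```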

@nzakas

nzakas commented May 27, 2020

I don’t believe ESLint has any project-specific concerns related to this proposal. I’ve reached out to our larger team asking for feedback just in case.

@boneskull

Currently Mocha will load config files in various formats, but not in a browser. If that is ever implemented on Mocha’s side, I’d expect to be able to “guess” at the format of whatever config file it loads, regardless of the config file’s extension. For example, in Node.js, we do something like this:

  1. Mocha is given a file, .mocharc
  2. It attempts to require() this file
  3. If that fails, try to await import() it
  4. If that fails, fs.readFile() and parse as YAML

In a browser, it should work as close to this as possible (I haven't looked into it yet). If this proposal means something like the above could not work, then that would negatively impact its consumers.
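
The waterfall above can be sketched with parsers standing in for require()/import()/YAML (the parsers here are toys, not Mocha's actual loaders):

```javascript
// Try each loading strategy in order; the first one that succeeds wins.
function loadConfig(text, parsers) {
  const errors = [];
  for (const parse of parsers) {
    try {
      return parse(text);
    } catch (err) {
      errors.push(err); // fall through to the next format
    }
  }
  throw new Error(`no parser could load the config (${errors.length} attempts)`);
}

// JSON first, then a toy "key: value" fallback in place of real YAML.
const parsers = [
  (t) => JSON.parse(t),
  (t) => Object.fromEntries(
    t.split('\n').filter(Boolean)
      .map((line) => line.split(':').map((s) => s.trim()))
  ),
];

loadConfig('{"timeout": 2000}', parsers); // parsed as JSON
loadConfig('timeout: 2000', parsers);     // falls back to the toy parser
```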

@eemeli
Member

eemeli commented May 27, 2020

For messageformat this is interesting for its possible interactions with custom loaders. In particular, how should an import with type: "messageformat" work? At the moment, we're providing a couple of Webpack loaders (1) (2) and soon a Rollup plugin that allow for resource files to be compiled into message functions on import.

Furthermore, how should this work with resources that are stored using one format (e.g. JSON/YAML/.properties), but expect to be further transformed during the load? If these are expected to be piled on as additional attributes for the import statement, it starts to seem like a return to the way Webpack used to only support query parameters for loaders and loader options, just with different syntax.

This is what the Webpack docs say on this now, based on a few years of experience:

Use module.rules whenever possible, as this will reduce boilerplate in your source code and allow you to debug or locate a loader faster if something goes south.

I fear that the key:value syntax being currently proposed is going to lead to a worse developer experience down the road whereas a single string key (e.g. as "messageformat") would more clearly communicate to the reader that something special is happening during the load, while moving the implementation details of that load to some central location.

@littledan

@jkrems There are basically three options that the proposal lets hosts choose between: 1) Race and use the first value, 2) Error if a second value occurs, 3) Make two copies of the module when values disagree. All are permitted by the spec. I personally see 1 and 2 as bad for composition, and would aim to avoid them if I were designing a host.

@boneskull This proposal doesn't change how to import normal JS modules, so you are safe.

@eemeli I agree that we should be conservative in what module attributes and type values we add. However, we got strong feedback from various potential users (including @bmeck) that we should not make this proposal specific to the type, and permit other attributes on an equal syntactic footing. For now, IIUC the Babel parser only permits type to discourage fragmentation, but of course that can be revisited.

@eemeli
Member

eemeli commented May 27, 2020

Just as an observation, it's pretty clear that a single-keyword type has a different level of utility compared to other attributes. I mean, it is the only attribute actually being currently considered; all the other ones appear to still be theoretical.

Furthermore, using as for it doesn't prevent other attributes from using with, without any significant cost in terms of expressive complexity:

import foo from './foo.json' as 'json' with foo: 'bar'
import foo from './foo.json' with type: 'json', foo: 'bar'

import('./foo.json', { type: 'json', with: { foo: 'bar' } })
import('./foo.json', { with: { type: 'json', foo: 'bar' } })

@littledan

We previously proposed starting with as in TC39 and got very strong negative feedback, including from @bmeck.

@jkrems

jkrems commented May 27, 2020

There are basically three options that the proposal lets host choose between

I think allowing all of them is the issue for me, because one host can't have perfect knowledge of the attributes other hosts may allow (including future versions of the same host). That means using an attribute immediately couples the code tightly to an exact set of hosts at specific versions. It's not a choice in practice for one host to just ignore attributes (or pick how to interpret them) if there's any expectation that JS code is portable between runtimes.

Making copies of modules isn't generally safe, so I don't think (3) is a good outcome for hosts. Modules are stateful and effectively nominal types exist in JS. It may break applications in very subtle ways when there's suddenly two variants of the same class/Symbol or two copies of some mutable module-scope value.

@devsnek
Contributor Author

devsnek commented May 27, 2020

My concerns in the context of stage 2 advancement have been alleviated, but it seems like people might still have some feedback to give here?

@littledan

Yeah, there are definitely some tricky tradeoffs here. This thread has helped me understand the space better. For module attributes to move forward, we need to think more about this caching issue. I have proposed a sort of starting point for Stage 2 (a cop-out that leaves it up to the host, permitting cloning), but we're nowhere near ready to call this the final answer.

I want to figure out more about this caching question, bringing more concrete details about hosts and use cases, before Stage 3. My hesitation remains around not understanding how non-type attributes and composition of packages fit into this; these topics are not my area of expertise and will require more research. Based on this, we can develop more specific requirements or recommendations for hosts as a necessary part of what goes to Stage 3, possibly including separating module attributes from cache keys.

As part of the Stage 2 presentation, we'll make it clear that the plan is to continue this discussion on caching, not conclude in a closed way right now.

@ljharb
Member

ljharb commented May 27, 2020

How is caching not a major semantic?

@littledan

We've spec'd out full proposed semantics, and are open to changes during Stage 2. We'd need to have all this locked down by Stage 3.

@jkrems

jkrems commented May 27, 2020

To clarify: I don't think this has to block advancement to stage 2 from my side.

This proposal doesn't change how to import normal JS modules, so you are safe.

I do believe that the mocha case is a new item worth tracking, independent of this thread. What @boneskull was pointing out was the use case of loading a config of unknown format, ES module or data. I do think the proposal does affect that capability. E.g. import('/mocha-config') won't work if it's supposed to support either a JSON config or an ES module exporting config. It would require a pre-processing step to convert any JSON config into an ES module or an ugly try/catch waterfall with different attributes.
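
A sketch of that try/catch waterfall, with tryImport as a hypothetical stand-in for dynamic import() with attributes:

```javascript
// Loading a config of unknown format under strict attribute checking means
// retrying the same specifier under each candidate type.
async function loadUnknownConfig(specifier, tryImport) {
  try {
    return await tryImport(specifier, { type: 'json' });
  } catch {
    // Not JSON (or the host rejected the assertion): retry as a JS module.
    return await tryImport(specifier, {});
  }
}

// Usage with a fake importer that only knows a JSON config:
const fakeImport = async (spec, attrs) => {
  if (attrs.type !== 'json') throw new TypeError('not a JS module');
  return { default: { timeout: 2000 } };
};
// loadUnknownConfig('/mocha-config', fakeImport) resolves to the JSON config
```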

@boneskull Can you open an issue on the proposal repo with that concern / use case if there isn't one already?

@MylesBorins
Contributor

MylesBorins commented May 28, 2020 via email

@jkrems

jkrems commented May 28, 2020

As dynamic import in node requires the full file path [...]

I think the concern was about support for loading configuration in the browser from a well-known path. So I don't think the node implementation fundamentally changes the story there. But yes, nested try/catch with all possible attributes is a workaround/solution that could be used. Still worth tracking as a trade-off imo.

@devsnek
Contributor Author

devsnek commented Jun 4, 2020

There are issues for the relevant topics in the proposal repo so I'm going to close this out.

@devsnek closed this as completed Jun 4, 2020
@littledan reopened this Jun 5, 2020
@littledan

Oops, reopened by mistake. I just wanted to thank everyone here for talking over the issues and ultimately moving to the proposal repo. I hope the changes made (especially tc39/proposal-import-attributes#66) address the concerns raised here.


10 participants