Behaviour of existing "#specifier"s and "./relative/path.js#specifier"? Use a scheme instead? #6

Closed
Jamesernator opened this issue Mar 7, 2021 · 9 comments · Fixed by #21

Comments

@Jamesernator

Jamesernator commented Mar 7, 2021

Currently #fragment is already used in Node for its package imports feature, and on the web, now that import maps are enabled in Chrome, one can map #specifier arbitrarily to URLs as well.
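To make the overlap concrete, here is a minimal sketch of how the same "#dep" specifier can already mean different things depending on the host. The object shapes mirror the "imports" field of a Node package.json and the "imports" key of an import map; the specifier "#dep", the target paths, and the lookup helper are hypothetical and only illustrate the collision, not either host's actual resolution algorithm.

```javascript
// Hypothetical package.json contents for a Node package using package imports.
const packageJson = {
  name: "my-package",
  imports: { "#dep": "./src/dep.js" },
};

// Hypothetical import map that a web page might ship.
const importMap = {
  imports: { "#dep": "https://cdn.example/dep/index.js" },
};

// Simplified exact-match lookup: returns the mapped target, or null when
// the table has no entry for the specifier.
function lookup(table, specifier) {
  const hasEntry = Object.prototype.hasOwnProperty.call(table.imports, specifier);
  return hasEntry ? table.imports[specifier] : null;
}

console.log(lookup(packageJson, "#dep"));
console.log(lookup(importMap, "#dep"));
```

A module fragment declared as "#dep" would add a third possible meaning on top of these two.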

Relative specifiers aren't safe either as ./relative/path.js#fragment can already be used for things such as cache busting (e.g. ./path.js#adshd32ds), or simply for forcing multiple copies of a module such as in a SharedWorker pool (e.g. new SharedWorker('./worker.js#1'), etc for each copy).

In my opinion both of these are awkward, and potentially confusing to deal with, for both authors and tools. For example, some questions I would ask:

  • As an author...
    • Are ./path/to/mod.js#fragment urls still supported at all?
    • What happens if I add a fragment module to the current module, will it break existing imports?
      • Do only fragments of the same name cause it to break? Or will any fragment mean all #specifier become fragment lookups?
    • If I add module "#specifier" { ... }, in which environments will behaviour change, and how will it change?
  • As a bundler how should I remap #fragment and ./path/to/file.js#fragment?
    • Can I simply generate an import map?
    • What about in other environments? Can I reliably have fallbacks?
    • Can url segments be transformed at all? (e.g. const worker1 = new SharedWorker("./worker.js#1");)
      • Can I preserve both the hash segment and refer to a module fragment? How?
  • As a content blocker ...
    • How can I disambiguate "module fragments" from fragments simply being used as part of the url?
    • Can I easily disambiguate module fragment imports from those that go to the import map?

These are just some of the questions I can imagine being asked. As such, I think it would be simpler to have module fragments use a new scheme instead; this would make behaviour fairly obvious in all contexts.

For example, one could imagine trivially inlining modules into a bundle:

// mod.js
module "./node_modules/lodash/map" {
  // ...
}

module "./node_modules/lodash/filter" {
  // ...
}

module "./main.js" {
  import filter from "lodash/filter";
  import map from "lodash/map";
}

This could have an associated import map:

// import-map.json
{
  "scopes": {
    "./mod.js": {
      // Needs bikeshed, separating character @@ should not be valid within URL
      "lodash/map": "js-fragment:./node_modules/lodash/map@@./mod.js",
      "lodash/filter": "js-fragment:./node_modules/lodash/filter@@./mod.js",
      // ...
      
      // With sugary defaults for import maps https://github.com/WICG/import-maps/issues/7
      // this would just become (assuming all imported modules are inlined)
      "*": "js-fragment:./node_modules/*@@./mod.js", // bare specifiers
      "./*": "js-fragment:./*@@./mod.js" // relative urls
    },
    "js-fragment:./node_modules/lodash/map@@./mod.js": {
      // Assuming sugary defaults
      "*": "js-fragment:./node_modules/lodash/node_modules/*@@./mod.js",
      "./*": "js-fragment:./node_modules/lodash/*@@./mod.js"
    }
  }
}

For use cases such as inline tests, the js-fragment scheme with a separator would make these URLs easy to construct:

const testURL = `js-fragment:test@@${ file }`;

const module = await import(testURL);

runTests(module);

For cases where specifier renaming is possible, one could also have a short form, js-fragment:name, which could remove the need for import maps in certain cases.
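As a rough sketch of how a tool might build and take apart such specifiers, assuming the "js-fragment:" scheme and the "@@" separator from the examples above (both still open to bikeshedding). The helper names are hypothetical, and treating the short form as "a fragment of the current file" is an assumption, not something settled in this thread.

```javascript
// Hypothetical scheme and separator, taken from the examples above.
const SCHEME = "js-fragment:";
const SEPARATOR = "@@";

// Build "js-fragment:<name>@@<file>".
function formatFragmentURL(name, file) {
  return `${SCHEME}${name}${SEPARATOR}${file}`;
}

// Split a js-fragment specifier into its fragment name and containing file.
// Returns null for specifiers that do not use the scheme.
function parseFragmentURL(specifier) {
  if (!specifier.startsWith(SCHEME)) {
    return null;
  }
  const rest = specifier.slice(SCHEME.length);
  const index = rest.lastIndexOf(SEPARATOR);
  if (index === -1) {
    // Short form "js-fragment:name": no file part (assumed to mean the current file).
    return { name: rest, file: null };
  }
  return {
    name: rest.slice(0, index),
    file: rest.slice(index + SEPARATOR.length),
  };
}

console.log(parseFragmentURL(formatFragmentURL("test", "./mod.js")));
```

Because the scheme prefix is unambiguous, a bundler or content blocker could tell a fragment import apart from an ordinary URL hash with a single string check.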

@littledan
Member

I was picturing that, if the module fragment was not declared in source, then it would fall back to the surrounding/existing semantics. So it would be like a layer on top of the existing semantics that you describe.

Creating new schemes seems like it has really high cost. I am not sure what the right solution is here, but I am not especially attached to using URL fragments here.

@ExE-Boss

ExE-Boss commented Mar 7, 2021

> Relative specifiers aren't safe either as ./relative/path.js#fragment can already be used for things such as cache busting (e.g. ./path.js#adshd32ds)

Actually, URL #fragments cannot be used for cache busting, as the #fragment part of the URL doesn’t get sent over HTTP, as it’s entirely client‑side [RFC3986].

@nayeemrmn

> Actually, URL #fragments cannot be used for cache busting, as the #fragment part of the URL doesn’t get sent over HTTP, as it’s entirely client‑side [RFC3986].

It can be used to load another instance of the module, without reloading the resource. More like module-map busting.
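A small sketch of the distinction, using the standard URL class. The example.com URLs and the withoutFragment helper are made up for illustration, and the Map only stands in for a host's module map, which is keyed on the full URL including the fragment.

```javascript
const base = "https://example.com/app/main.js";
const first = new URL("./worker.js#1", base);
const second = new URL("./worker.js#2", base);

// The URL used for the network request carries no fragment.
function withoutFragment(url) {
  const copy = new URL(url);
  copy.hash = "";
  return copy.href;
}

// Stand-in for a module map: one entry per distinct full URL.
const moduleMap = new Map();
for (const url of [first, second]) {
  moduleMap.set(url.href, { fetchedFrom: withoutFragment(url) });
}

console.log(moduleMap.size); // 2 entries ...
console.log(withoutFragment(first) === withoutFragment(second)); // ... backed by one resource: true
```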

@Jamesernator
Author

Jamesernator commented Mar 8, 2021

> It can be used to load another instance of the module, without reloading the resource. More like module-map busting.

> Actually, URL #fragments cannot be used for cache busting, as the #fragment part of the URL doesn’t get sent over HTTP, as it’s entirely client‑side [RFC3986].

Ah right, I tested it with a shared worker and assumed that, because it spawned two workers, it must have made two requests. But no: as you say, it shares the cache but loads two copies.

> I was picturing that, if the module fragment was not declared in source, then it would fall back to the surrounding/existing semantics. So it would be like a layer on top of the existing semantics that you describe.

> Creating new schemes seems like it has really high cost. I am not sure what the right solution is here, but I am not especially attached to using URL fragments here.

I understand why schemes might have a high cost. Something I thought about after writing my example is: why do we need to restrict module fragments to #fragment at all? Couldn't we just allow any specifier, and suggest that implementations load it from the current module if a module fragment with that name is found?

For example:

module "dependency" {
  export default function foo() {
  
  }
}

module "./main.js" {
  // Host tries checking this module for the specifier "dependency" first
  // then it tries successive modules
  import foo from "dependency";
}

With this we could even imagine more powerful logic, for example we could inline a directory and paths could be resolved correctly, e.g.:

// If this file is hosted at https://domain.tld/path/to/main.js then the
// host can canonicalize this URL to https://domain.tld/path/to/lib/assert.js
// so that if main.js imports that path, it can be remapped to
// https://domain.tld/path/to/main.js#./lib/assert.js as the automatic behaviour if there's
// not a corresponding "scopes" entry for it.
module "./lib/assert.js" {
  // Similarly because this import occurs within a module fragment with a canonicalizable url
  // we can remap this to "./lib/helper.js"
  import "./helper.js";
  
  export default function assert() {
    
  }
}

// Again this is canonicalized
module "./lib/helper.js" {
  // The canonicalized URL https://domain.tld/path/to/lib/helper.js
  console.log(import.meta.url); 

  export default function helper() {
  
  }
}

// Not canonicalized directly, unless we have an import map entry
module "dependency" {
  export default function dep() {
  
  }
}

// main.js
import dependency from "dependency";
import assert from "./lib/assert.js";
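The canonicalization described in the comments above can be checked directly with the standard URL class. This is only a sketch of the URL arithmetic, using the hypothetical https://domain.tld/path/to/main.js location from the example; it says nothing about how a host would actually wire this into module loading.

```javascript
const fileURL = "https://domain.tld/path/to/main.js";

// A fragment named "./lib/assert.js" canonicalizes relative to the containing file.
const assertURL = new URL("./lib/assert.js", fileURL);
console.log(assertURL.href); // https://domain.tld/path/to/lib/assert.js

// An import of "./helper.js" from inside that fragment resolves relative to
// the fragment's canonical URL, not relative to main.js.
const helperURL = new URL("./helper.js", assertURL);
console.log(helperURL.href); // https://domain.tld/path/to/lib/helper.js

// The remapped form keeps the fragment name in the hash of the containing file.
const remappedURL = new URL("#./lib/assert.js", fileURL);
console.log(remappedURL.href); // https://domain.tld/path/to/main.js#./lib/assert.js
```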

The resolve algorithm could look something like this (in Hosts that use URLs):

async function resolveUrlModule(url: URL | string): Promise<Module> {
  url = new URL(url);
  // Consult import map for overrides here for remappings
  url = resolveInImportMapSomehow(url);

  const module = await fetchParseAndCreateModule(url);
  if (url.hash) {
    const fragmentIdentifierString = url.hash.slice(1); // remove # sign
    // If there is such a fragment use it,
    if (module.fragmentModules.has(fragmentIdentifierString)) {
      return module.fragmentModules.get(fragmentIdentifierString);
    }
  }
  // If no module fragment, just return the module itself
  return module;
}

async function resolveModule(module: Module, specifier: string): Promise<Module> {
  if (!isRelativeURL(specifier)) {
    // Bare specifiers might be remapped, so check immediately, this
    // would behave the same as done today in hosts with bare specifier
    // mapping such as Node or browsers
    if (importMap.hasEntry(module, specifier)) {
      // If the import map has an entry then use that instead
      return await resolveImportMapModule(module, specifier);
    // If bare import is a fragment module then we take it from
    // the file
    } else if (module.fragmentModules.has(specifier)) {
      return module.fragmentModules.get(specifier);
    // Throw an error, no resolution, this is what is done today in Node/browsers
    } else {
      throw new Error("Couldn't resolve module");
    }
  }

  const moduleURL = new URL(module.url);
  // Depending on whether it comes from a fragment or not can determine whether
  // or not we should canonicalize it twice or not (once with fragment, once with specifier)
  const canonicalizedURL = moduleURL.hash === ''
    // If the originator of this request is the module itself, then we simply canonicalize
    // the url directly to its response URL
    ? new URL(specifier, moduleURL)
    // Otherwise if the hash is a canonicalized section, e.g. #./lib/assert.js
    // then we need to resolve relative to that, so resolve both
    : isRelativeURL(moduleURL.hash.slice(1))
      ? new URL(specifier, new URL(moduleURL.hash.slice(1), moduleURL))
      // If it's a bare specifier, then we need to rely on import map machinery so
      // we don't do anything here, we might need to but I'm not certain
      : await resolveImportMapModule(module, specifier);
  for (const [fragmentIdentifierString, fragmentModule] of module.fragmentModules) {
    if (!isRelativeURL(fragmentIdentifierString)) {
      // This was handled above
      continue;
    }
    const fragmentCanonicalizedURL = new URL(fragmentIdentifierString, module.url);
    // If there is a fragment for this path then we can simply use it, although
    // we will still consult import map for any overrides
    if (fragmentCanonicalizedURL.href === canonicalizedURL.href) {
      // We simply take the fragment 
      return await resolveUrlModule(new URL(`#${ fragmentIdentifierString }`, moduleURL));
    }
  }
  return await resolveUrlModule(canonicalizedURL);
}
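The algorithm above leaves several helpers undefined, so here is a smaller synchronous sketch of just the fragment-matching step, leaving out fetching and import maps entirely. The isRelativeURL definition and the findFragment name and signature are assumptions made for this illustration only.

```javascript
// Assumed definition: relative specifiers start with "./", "../" or "/".
function isRelativeURL(specifier) {
  return /^(\.\.?\/|\/)/.test(specifier);
}

// Given the containing file's URL, the fragment names it declares, and an
// import made from that file (optionally from inside one of its fragments),
// return the name of the matching fragment, or null if there is none.
function findFragment(fileURL, fragmentNames, specifier, importerFragment = null) {
  if (!isRelativeURL(specifier)) {
    // Bare specifiers match fragment names literally.
    return fragmentNames.includes(specifier) ? specifier : null;
  }
  // Imports from a path-like fragment resolve relative to that fragment's
  // canonical URL; otherwise they resolve relative to the file itself.
  const base = importerFragment !== null && isRelativeURL(importerFragment)
    ? new URL(importerFragment, fileURL)
    : new URL(fileURL);
  const target = new URL(specifier, base).href;
  for (const name of fragmentNames) {
    if (isRelativeURL(name) && new URL(name, fileURL).href === target) {
      return name;
    }
  }
  return null;
}

const names = ["./lib/assert.js", "./lib/helper.js", "dependency"];
const file = "https://domain.tld/path/to/main.js";
console.log(findFragment(file, names, "./helper.js", "./lib/assert.js"));
```

Comparing canonical URLs rather than raw strings is what lets equivalent spellings of the same path land on the same fragment.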

@littledan
Member

I'm a bit concerned about allowing any specifier because it makes the resolution algorithm somehow "stateful": If those specifiers are available from other JavaScript files, and not just locally within a file, then the order that things are imported in starts to matter. What happens if two different files declare the same specifier? Or, what happens if a later dynamic import declares a module which had already been imported directly from the backing URL? With a fragment, you always know where to fetch a module from, even if it's in a different file and you haven't heard of it yet.
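A toy illustration of that order dependence, assuming a hypothetical global registry in which the first file to declare a specifier wins. The registry, the file objects, and the URLs are invented for the example and are not part of any proposal.

```javascript
// Hypothetical global registry: the first file to declare a specifier wins.
function buildRegistry(filesInLoadOrder) {
  const registry = new Map();
  for (const file of filesInLoadOrder) {
    for (const specifier of file.declares) {
      if (!registry.has(specifier)) {
        registry.set(specifier, `${file.url}#${specifier}`);
      }
    }
  }
  return registry;
}

const fileA = { url: "https://example.com/a.js", declares: ["dependency"] };
const fileB = { url: "https://example.com/b.js", declares: ["dependency"] };

// The same specifier resolves differently depending on load order.
console.log(buildRegistry([fileA, fileB]).get("dependency"));
console.log(buildRegistry([fileB, fileA]).get("dependency"));
```

With a fragment on an explicit URL, the containing file is named in the specifier itself, so no such registry is needed.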

@Jamesernator
Author

Jamesernator commented Mar 8, 2021

> I'm a bit concerned about allowing any specifier because it makes the resolution algorithm somehow "stateful": If those specifiers are available from other JavaScript files, and not just locally within a file, then the order that things are imported in starts to matter. What happens if two different files declare the same specifier? Or, what happens if a later dynamic import declares a module which had already been imported directly from the backing URL? With a fragment, you always know where to fetch a module from, even if it's in a different file and you haven't heard of it yet.

Just to clarify: with both this and the same proposal in the other issue, the URL is canonical only within the context of the same file. For example, this:

module "./lib/assert.js" {

}

module "lodash" {

}

import assert from "./lib/assert.js";
import lodash from "lodash";

Would essentially be equivalent to having an import map like:

{
  "scopes": {
    // Because it's scoped to the file, other files know nothing about fragments
    // within ./path/to/main.js (unless they were added to the import map explicitly)
    "./path/to/main.js": {
      "lodash": "./path/to/main.js#lodash",
      // Note this gets resolved to a full URL so any equivalent forms
      // would also import the same module, not just literal './lib/assert.js', for example
      // within a fragment named ./lib/foo.js 
      // importing ./assert.js would resolve this correctly
      "./lib/assert.js": "./path/to/main.js#./lib/assert.js"
    } 
  }
}

If another module imported lodash it would know nothing about ./path/to/main.js#lodash and would instead fall back to the main import map as usual (if desired one could make the import map point to ./path/to/main.js#lodash, but this would not be the default behaviour).

@Jamesernator
Author

Jamesernator commented Mar 8, 2021

Or to put it more simply, the module fragments would act exactly as an inline import map, scoped only to that file. External import maps would still take precedence however.
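As a sketch of that equivalence, the following derives the implied scope entries from a file's fragment names and lets an external import map override them. The function names are hypothetical, and keys are matched literally here, whereas a real host would compare resolved URLs.

```javascript
// Build the scope entries implied by a file's module fragments.
function fragmentsToScope(filePath, fragmentNames) {
  const scope = {};
  for (const name of fragmentNames) {
    scope[name] = `${filePath}#${name}`;
  }
  return scope;
}

// Entries from an external import map take precedence over the inline ones.
function effectiveScope(filePath, fragmentNames, externalImportMap) {
  const scopes = externalImportMap.scopes || {};
  const external = scopes[filePath] || {};
  return { ...fragmentsToScope(filePath, fragmentNames), ...external };
}

const external = {
  scopes: {
    "./path/to/main.js": { "lodash": "./vendor/lodash.js" },
  },
};

console.log(effectiveScope("./path/to/main.js", ["lodash", "./lib/assert.js"], external));
```

Here "lodash" ends up pointing at the externally mapped file, while "./lib/assert.js" still resolves to the fragment inside main.js.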

@getify

getify commented Mar 11, 2021

> the #fragment part of the URL doesn’t get sent over HTTP, as it’s entirely client‑side

I see this as a downside because I can imagine wanting to put logic into a service worker that would intercept calls for a module and its fragment, and potentially pull it already from a cache, or even dynamically generate a response... but IIUC the SW won't receive the # part from the module-specifier request, so it won't know what fragment was requested.

@littledan
Member

I'd like to go with #5 to address this issue.


5 participants