Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include imported scripts to byte-check #1023

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
41 changes: 36 additions & 5 deletions docs/index.bs
Original file line number Diff line number Diff line change
Expand Up @@ -156,7 +156,7 @@ spec: webappsec-referrer-policy; urlPrefix: https://w3c.github.io/webappsec-refe

A <a>script resource</a> has an associated <dfn export for="script resource" id="dfn-referrer-policy">referrer policy</dfn> (a [=/referrer policy=]). It is initially the empty string.

A [=/service worker=] has an associated <dfn export id="dfn-script-resource-map">script resource map</dfn> which is an <a>ordered map</a> where the keys are [=/URLs=] and the values are [=/responses=].
A [=script resource=] has an associated <dfn export for="script resource" id="dfn-script-resource-map">imported scripts map</dfn> which is an <a>ordered map</a> where the keys are [=/URLs=] and the values are [=/responses=].

A [=/service worker=] has an associated <dfn export id="dfn-skip-waiting-flag">skip waiting flag</dfn>. Unless stated otherwise it is unset.

Expand Down Expand Up @@ -2159,10 +2159,10 @@ spec: webappsec-referrer-policy; urlPrefix: https://w3c.github.io/webappsec-refe
1. Let |response| be the result of <a lt="fetch">fetching</a> |request|.
1. If |response|’s <a for="response" href="https://github.com/whatwg/fetch/issues/376">cache state</a> is not "<code>local</code>", set |registration|’s [=service worker registration/last update check time=] to the current time.
1. If |response|'s <a>unsafe response</a>'s [=response/type=] is not "<code>error</code>", and |response|'s [=response/status=] is an <a>ok status</a>, then:
1. [=map/Set=] <a>script resource map</a>[|request|'s [=request/url=]] to |response|
1. [=map/Set=] |serviceWorker|'s [=script resource=]'s [=script resource/imported scripts map=][|request|'s [=request/url=]] to |response|.
1. Return |response|.
1. Else:
1. If <a>script resource map</a>[|url|] [=map/exists=], return <a>script resource map</a>[|url|].
1. If |serviceWorker|'s [=script resource=]'s [=script resource/imported scripts map=][|url|] [=map/exists=], return |serviceWorker|'s [=script resource=]'s [=script resource/imported scripts map=][|url|].
1. Else, return a <a>network error</a>.
</section>
</section>
Expand Down Expand Up @@ -2190,7 +2190,7 @@ spec: webappsec-referrer-policy; urlPrefix: https://w3c.github.io/webappsec-refe
<section>
<h3 id="privacy">Privacy</h3>

[=/Service workers=] introduce new persistent storage features including <a>scope to registration map</a> (for [=/service worker registrations=] and their [=/service workers=]), <a>request to response map</a> and <a>name to cache map</a> (for caches), and <a>script resource map</a> (for script resources). In order to protect users from any potential <a biblio data-biblio-type="informative" lt="unsanctioned-tracking">unsanctioned tracking</a> threat, these persistent storages *should* be cleared when users intend to clear them and *should* maintain and interoperate with existing user controls e.g. purging all existing persistent storages.
[=/Service workers=] introduce new persistent storage features including <a>scope to registration map</a> (for [=/service worker registrations=] and their [=/service workers=]), <a>request to response map</a> and <a>name to cache map</a> (for caches), and [=script resource/imported scripts map=] (for script resources). In order to protect users from any potential <a biblio data-biblio-type="informative" lt="unsanctioned-tracking">unsanctioned tracking</a> threat, these persistent storages *should* be cleared when users intend to clear them and *should* maintain and interoperate with existing user controls e.g. purging all existing persistent storages.
</section>
</section>

Expand Down Expand Up @@ -2529,7 +2529,7 @@ spec: webappsec-referrer-policy; urlPrefix: https://w3c.github.io/webappsec-refe

Else, continue the rest of these steps after the algorithm's asynchronous completion, with |script| being the asynchronous completion value.

1. If |newestWorker| is not null, |newestWorker|'s [=service worker/script url=] [=url/equals=] |job|'s [=job/script url=] with the *exclude fragments flag* set, and |script|'s [=source text=] is a byte-for-byte match with |newestWorker|'s [=script resource=]'s [=source text=], if |script| is a [=classic script=], and |script|'s [=module script/module record=]'s \[[ECMAScriptCode]] is a byte-for-byte match with |newestWorker|'s [=script resource=]'s [=module script/module record=]'s \[[ECMAScriptCode]] otherwise, then:
1. If |newestWorker| is not null, |newestWorker|'s [=service worker/script url=] [=url/equals=] |job|'s [=job/script url=] with the *exclude fragments flag* set, and the result of invoking [=Check If Service Worker Resources Are Identical=] with |newestWorker|'s [=script resource=], |script|, |registration|, and |job| is true, then:
1. Invoke <a>Resolve Job Promise</a> with |job| and the {{ServiceWorkerRegistration}} object which represents |registration|.
1. Invoke <a>Finish Job</a> with |job| and abort these steps.
1. Else:
Expand Down Expand Up @@ -3224,6 +3224,37 @@ spec: webappsec-referrer-policy; urlPrefix: https://w3c.github.io/webappsec-refe
1. Return |newestWorker|.
</section>

<section algorithm>
<h3 id="check-if-service-worker-resources-are-identical-algorithm"><dfn>Check If Service Worker Resources Are Identical</dfn></h3>

: Input
:: |sourceScript|, a [=script=]
:: |targetScript|, a [=script=]
:: |registration|, a [=/service worker registration=]
:: |job|, a [=job=]
: Output
:: True or false, a boolean

1. If |sourceScript| and |targetScript| are not both [=classic scripts=] or not both [=module scripts=], return false.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow good catch

1. If |sourceScript| is a [=classic script=], then:
1. If |sourceScript|'s [=source text=] is not a byte-for-byte match with |targetScript|'s [=source text=], return false.
1. Let |sourceMap| be |sourceScript|'s [=script resource/imported scripts map=].
1. [=map/For each=] |url| → |response| of |sourceMap|:
1. Let |request| be a new [=/request=] whose [=request/url=] is |url|, [=request/client=] is |job|'s [=job/client=], [=request/type=] is "<code>script</code>", [=request/destination=] is "<code>script</code>", [=request/parser metadata=] is "<code>not parser-inserted</code>", [=request/synchronous flag=] is set, and whose [=request/use-URL-credentials flag=] is set.
1. Set |request|'s [=request/cache mode=] to "<code>no-cache</code>" if any of the following are true:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to cause a double-download. I'll explain this in another comment…

* |registration|'s [=service worker registration/use cache=] is false.
* |job|'s [=force bypass cache flag=] is set.
* |registration|'s [=last update check time=] is not null and the time difference in seconds calculated by the current time minus |registration|’s [=last update check time=] is greater than 86400.
1. Let |targetResponse| be the result of [=fetch|fetching=] |request|.
1. If |targetResponse|'s <a for="response" href="https://github.com/whatwg/fetch/issues/376">cache state</a> is not "<code>local</code>", set |registration|’s [=last update check time=] to the current time.
1. Let |targetResponse| be |targetResponse|'s [=unsafe response=].
1. If |targetResponse|'s [=response/type=] is "<code>error</code>", or |targetResponse|'s [=response/status=] is not an [=ok status=], return false.
1. If the result of [=UTF-8 decoding=] |response|'s [=response/body=] is not a byte-for-byte match with the result of [=UTF-8 decoding=] |targetResponse|'s [=response/body=], return false.
1. If |sourceScript| is a [=module script=], then:
1. If |sourceScript|’s [=module script/module record=]'s \[[ECMAScriptCode]] is not a byte-for-byte match with |targetScript|’s [=module script/module record=]'s \[[ECMAScriptCode]], return false.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we check imported scripts too?

Copy link
Collaborator Author

@jungkees jungkees Dec 14, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

importScripts() is not supported in module scripts: https://html.spec.whatwg.org/#import-scripts-into-worker-global-scope step 1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I meant imported as in import whatever from "./foo.js"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought those urls are fetched, parsed and stored as part of the module record, but maybe I'm wrong. I noticed an environment settings object has a module map which is referenced throughout traversing the module script graph. I'll check the algorithms again.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh does it flatten it somehow? I couldn't figure it out 😞

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jakearchibald, I think you're right. I can't find any step that flattens the entire imported assets. It seems each module script has module record that has the parsed source text and info to the required modules (record's [[RequestedModules]] in ECMAScript language.) Getting to ModuleEvaluation() when running a module script, it evaluates all the related module scripts recursively finding the dependencies.

And I think we should compare all the module scripts stored in an environment settings object's module map. I'd want to confirm this with @domenic before updating this part.

@domenic, SW intends to compare all the script assets including the main script and the imported scripts for both classic and module scripts. For classic scripts, SW has it's own imported scripts map that's used for this byte-by-byte check. But for module scripts, I think I misunderstood that the main script's module record has a flattened asset that include all the imported scripts in the graph. I'd like to check with you whether this is wrong, and if so, iterating an environment setting's object's module map and compare the source texts of all the module scripts in the map would be a right way for our purpose?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, [[ECMAScriptCode]] appears to be a "parse result" according to the spec, but the spec doesn't define what that is 😢 . But since it's the result of parsing it doesn't feel appropriate for a byte-for-byte check.

The way ModuleDeclarationInstantiation performs "For each ImportEntry" makes me think that there isn't any flattening going on. Maybe HTML's module map is an easier way in, since it's at least a flattened view of the imported scripts - although this may contain dynamically imported items that we're not interested in.

If I'm right that [[ECMAScriptCode]] isn't the source text, and source text is only available on classic scripts, we need to store this ourselves.

"Perform the fetch" is called for the top level script, but also each module fetch, so we could use this hook to store our own collection of source texts for byte-by-byte comparison.

@domenic does the above sound right at all? Our intent is to refetch the top-level script and all its imports, and see if any are byte-different to the original.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jungkees whoa, I replied before seeing your message. Weird how similar they are!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, there is no flattened representation of the graph. I think checking everything in the module map is more correct.

I guess [[ECMAScriptCode]] is indeed not correct :-/. However, what we can do is update HTML to move "source text" from classic scripts only to the base script type. We'd probably want to add a note saying something like "source text is only used for classic scripts in this specification, but the Service Workers specification needs to store it for reasons". I don't think you all should create a separate parallel registry :).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good. Moving the source text to the base script type will help us use the source text of scripts for both classic and module script cases. I'll spec it to checking everything in the module map with that change.

Another thing that needs a solution is the lifetime of the scripts and the map for SW. With other web workers, the lifetime of the scripts is tied to the WorkerGlobalScope object, so it doesn't store the main script and only stores the module scripts in the module map on the global. But with SW, the lifetime of the scripts outlives those globals, so it stores the main script and the imported scripts on a service worker concept rather than on the SW global. Because of this difference, I think we need to discuss around how to make the fetch a classic worker script and the fetch a module worker script graph work more correctly with SW. There's an opened issue for this: #1013.

1. Return true.
</section>

<section algorithm>
<h3 id="create-client-algorithm"><dfn>Create Client</dfn></h3>

Expand Down
Loading