Host each crate on its own subdomain and allow user JS #1853
Comments
These two at least are solvable. Instead of using a nonce, we would use precalculated hashes of the essential files; and instead of having JS files, we can have rustdoc emit JSON files for things like |
Oh yeah, this is technically possible currently, e.g. https://docs.rs/tide/latest/tide/ (though it is blocked by uBlock for me, because third-party favicons are a privacy issue). I would like it if we could block third-party URLs for these (and for the sidebar logo) and instead have rustdoc embed a provided file into the output bundle that is served from docs.rs.
This is true, although it requires a bunch of little things:
Additionally on the docs.rs side it would require keeping track of the mapping from each release to its list of essential files. And it would break the search and sidebar functionality for all historic docs. None of it is insurmountable, but it winds up being a bunch of work chasing fixes for problems introduced by fixes for problems, etc. :-) And meanwhile there's a nice elegant solution that is already proven out by other hosting sites and aligns with the web security model.
Oh right, I forgot about this! Tide is using:
Presumably we could modify rustdoc so it would fetch these URLs and generate HTML that points to a local image file. We could also try |
I don't think rustdoc fetching URLs is a good model, since it means you need to give unbounded network access to the crate build, which brings a lot of pain: for example, you need to be much more careful about things like the EC2 metadata service and instance roles, and about the network environment of the builder in general.
Good point. I agree. Let's file a separate issue on rustdoc for how to tweak the favicon and logo attributes to encourage/make it easier to use fully-local icons.
By the way, the potential for crate A installing ServiceWorkers to mess with crate B's docs is, I think, not present. According to https://w3c.github.io/ServiceWorker/#service-worker-script-response, a script at |
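As a rough illustration of the scope restriction being referred to: under the ServiceWorker specification's default rule (no Service-Worker-Allowed header), a worker's maximum scope is the directory of its script URL. The sketch below models only that default rule; the crate paths are made-up examples, and the function names are hypothetical.

```python
from urllib.parse import urlsplit


def default_max_scope(script_url):
    """Return the default maximum scope: the script URL up to its last '/'."""
    parts = urlsplit(script_url)
    directory = parts.path[: parts.path.rfind("/") + 1]
    return f"{parts.scheme}://{parts.netloc}{directory}"


def scope_allowed(script_url, scope_url):
    """True if scope_url falls under the script's default maximum scope."""
    return scope_url.startswith(default_max_scope(script_url))


# Hypothetical example: a worker script shipped inside crate-a's docs.
script = "https://docs.rs/crate-a/1.0.0/crate_a/sw.js"
print(default_max_scope(script))
print(scope_allowed(script, "https://docs.rs/crate-a/1.0.0/crate_a/"))
print(scope_allowed(script, "https://docs.rs/crate-b/"))
```

Under these assumptions the worker can only claim paths below its own directory, so another crate's path prefix is out of reach.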
In #167 there is some discussion of what to do about crates providing their own JS. There are two risks suggested: cryptocurrency mining via JS and installing a ServiceWorker that could serve incorrect documentation for other crates. As far as I know the current plan of record is to prevent this using the Content-Security-Policy header to allowlist certain JS, and initial steps were taken in #1333, which implements CSP for crate pages only, with rustdoc pages left as a future exercise.
I propose that we abandon as impractical the plan to implement CSP for rustdoc pages. Instead, we should explicitly allow bring-your-own-JS, and we should plan on a separate subdomain per crate. This aligns docs.rs security boundaries (crates' documentation should not be able to affect each other) with the web's natural security boundaries. Specifically, the Same-origin Policy is the foundation of web security and states that scripts on foo.example.com cannot affect bar.example.com (without specific opt-in from bar.example.com, other caveats apply, etc etc).
Allowing KaTeX and other useful libraries
With each crate on its own subdomain, we can unreservedly allow crates to include whatever JS they want. This resolves a long-standing uncertainty about what is/will be allowed on docs.rs, particularly as regards the popular KaTeX library used to render LaTeX inline on web pages.
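What makes per-crate subdomains safe is that browsers compare origins (scheme, host, port), not paths. Here is a minimal sketch of that comparison. The tide.docs.rs and ureq.docs.rs URLs are a hypothetical layout for illustration, not an existing one, and the helper names are my own.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) tuple that defines a URL's origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(a, b):
    """True if both URLs share scheme, host, and port."""
    return origin(a) == origin(b)


# Path-based layout: both crates live on the same origin.
print(same_origin("https://docs.rs/tide/latest/tide/",
                  "https://docs.rs/ureq/latest/ureq/"))

# Hypothetical subdomain layout: each crate gets its own origin.
print(same_origin("https://tide.docs.rs/latest/tide/",
                  "https://ureq.docs.rs/latest/ureq/"))
```

With the path-based layout the two crates are indistinguishable to the browser's origin check; with subdomains they are separated by default.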
Aligning docs.rs with rustdoc
Allowing crates to bring their own JS (and styles, and even fonts) aligns docs.rs with rustdoc's philosophy: rustdoc has a variety of flags that allow adding arbitrary HTML (including script tags). Also, rustdoc implements Markdown, which is defined to allow arbitrary HTML. Since docs.rs relies so heavily on rustdoc, it would be challenging to enforce a security boundary that rustdoc does not participate in enforcing. To further underscore that: rustdoc has no systematic XSS defense in its HTML generation.
Also, since docs.rs hosts all historic versions of a crate as they were documented at the time, docs.rs needs to deal with output from many historical versions of rustdoc. So even to the extent rustdoc is updated to participate in enforcing this security boundary, we would face the problem of what to do with old versions, and what to do with modern versions that were emitted by a buggy version of rustdoc.
Crates control their own execution environment
With build.rs, crates can do a lot to modify their environment at build time. For instance the xss-probe build.rs takes the simple expedient of writing a .html and a .js file into the docs/ directory before rustdoc runs. Even if we blocked that behavior (for instance, by clearing the doc directory at some strategic moment), there are potential tricks: overwriting the rustdoc binary, setting PATH or LD_LIBRARY_PATH, or other unknown shenanigans. To make a defensible security boundary of "thou shalt not write unauthorized files during doc builds," we would have to invent and enforce a lot of other security boundaries that are not even currently considered boundaries in the Rust ecosystem.
Script-nonce won't work for rustdoc output
For templated output from the docs.rs web server, we can use script-nonce, and inject the nonce at the known places where we are generating an inline script or a <script src=...> tag. But we can't inject nonces into rustdoc HTML because we don't know the known-good places. We could parse the HTML and inject the nonce on all script tags, but of course that would defeat the purpose since we would also inject the nonce on malicious script tags.
Allowlisting scripts also won't work for rustdoc output
We could allowlist the shared files (mainXXX.js, storageXXX.js), but crate-specific JS is a problem. As one example, each crate has a source-filesXXX.js that lists all the files for the source view sidebar (e.g. https://docs.rs/ureq/latest/source-files-20220709-1.64.0-nightly-6dba4ed21.js). That file is under control of the crate author (see "Crates control their own execution environment" above). So allowlisting it would pierce the security boundary we are trying to defend.
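To illustrate what allowlisting by hash would involve (the "precalculated hashes" idea from the comments), here is a small sketch that computes CSP hash sources from file contents. The file contents are placeholders and the function names are hypothetical. It only shows how a hash source is derived; how browsers apply hashes to external script files is a separate question.

```python
import base64
import hashlib


def csp_hash_source(script_bytes):
    """Return a CSP hash source such as 'sha256-<base64 digest>' for the given bytes."""
    digest = hashlib.sha256(script_bytes).digest()
    return "'sha256-" + base64.b64encode(digest).decode("ascii") + "'"


def script_src_directive(scripts):
    """Build a script-src directive that allowlists exactly the given contents."""
    return "script-src " + " ".join(csp_hash_source(s) for s in scripts)


# Placeholder contents standing in for a shared file and a per-crate file.
shared_main_js = b"/* shared rustdoc script */"
crate_source_files_js = b"/* whatever the crate's build produced */"

print(script_src_directive([shared_main_js]))
print(csp_hash_source(crate_source_files_js))
```

The hash for a shared file is the same for every crate built with the same rustdoc, so it can be computed once. The hash for a per-crate file has to be taken from whatever the build emitted, which is exactly the content the crate author can influence.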
DNS and TLS wildcards
Having a separate subdomain for each crate does not require that we configure separate DNS and certificates for 75k+ crates. Instead, we should set up a wildcard DNS entry (*.docs.rs) that points all subdomains to the same set of IP addresses. And we can get a wildcard certificate to match. Then routing requests in docs.rs would just require looking at the hostname as well as the path.
We could also continue doing nothing for a while
In general it's always a good idea to compartmentalize different users' content from each other. For instance, GitHub Pages uses *.github.io, readthedocs uses *.readthedocs.io. However, since there is no authentication on docs.rs and no cookies, the issues we're facing are not particularly serious and we can continue to postpone a systematic fix.
We can disable ServiceWorkers via the CSP worker-src directive, without blocking scripts in general.
Cryptocurrency mining via JS is annoying, but has such tiny yields you need a massive amount of visitor traffic to be worthwhile. I don't know what the current state of the problem is, but I suspect you would need to either distribute your JS via an ad network or via a large number of compromised websites to make it worthwhile. And it's pretty noisy. If someone starts using the documentation of a popular crate to mine cryptocurrency, it would be spotted quickly and the docs.rs team could take it down and take any necessary followup actions. This seems like a purely hypothetical problem at this point.
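For the worker-src idea mentioned above, the header could be as small as a single directive. The helper below is a hypothetical sketch of how such a policy string might be assembled, not the docs.rs implementation.

```python
def build_csp(directives):
    """Join a {directive: [sources]} mapping into a Content-Security-Policy value."""
    return "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in directives.items()
    )


# Only worker-src is set, so script loading is left unrestricted by this policy.
policy = build_csp({"worker-src": ["'none'"]})
headers = [("Content-Security-Policy", policy)]
print(headers)
```

Because no script-src or default-src directive is present, this policy restricts workers (including ServiceWorkers) without touching ordinary scripts.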
Even if we don't decide to move forward with per-crate subdomains, I think it's very worthwhile to make the decision now that crates are allowed to embed JavaScript, and they will continue to be allowed to do so. The status quo creates unnecessary uncertainty for crate authors, and stumbling blocks for docs.rs developers.
Why the existing approach causes problems
Some issue threads where CSP came up as causing trouble (presumably the combination of default-src 'none'; and script-src 'nonce-XYZabc123'):
#1387
#302
#1552
#1255
#568
One last cute thing
If each crate has its own subdomain, each crate can have its own favicon logo, so you can better identify different crates' docs in your tabs! ❤️
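Per-crate favicons would fall out of the same hostname-based routing described under "DNS and TLS wildcards". A minimal sketch of that routing step follows, assuming a hypothetical <crate>.docs.rs layout. The function names and return shapes are made up for illustration, and how crate names would map to DNS labels is left open.

```python
BASE_DOMAIN = "docs.rs"


def crate_from_host(host):
    """Extract the crate label from a Host header value, or None for the main site."""
    host = host.split(":", 1)[0].lower().rstrip(".")
    suffix = "." + BASE_DOMAIN
    if not host.endswith(suffix):
        return None
    label = host[: -len(suffix)]
    # Accept exactly one label in front of the base domain.
    if not label or "." in label:
        return None
    return label


def route(host, path):
    """Decide whether a request targets the main site or one crate's docs."""
    crate = crate_from_host(host)
    if crate is None:
        return ("site", path)
    return ("crate", crate, path)


print(route("tide.docs.rs", "/favicon.ico"))
print(route("docs.rs", "/"))
```

With routing like this, a request for /favicon.ico on a crate's subdomain can be answered from that crate's own files, while the bare domain keeps serving the main site.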