Accept-CH-Lifetime privacy concerns #372
IIUC, this is not correct. If the origin includes
The Chromium implementation (currently underway) processes
cc'ing @igrigorik to comment on if any spec changes are needed here to make this more explicit. |
Hmm, I'm getting a different impression from the text in the spec:
If hints are supposed to be sent only on same-origin subresource requests it would address my main concern. In general, it would be great if the current non-lifetime Accept-CH behavior was also limited to same-origin subresources -- I don't think requiring developers who want to use client hints to also leak this information to owners of third-party subresources loaded on their pages is particularly great.
What about HTTPS pages which have HTTP subresources? The behavior I currently see for Accept-CH is that the HTTP subresource requests have client hints (but no Referer); I believe you're saying this behavior would change with Accept-CH-Lifetime? If so, this sounds good, though see above re: the broader concern about non-same-origin subresources. |
@arturjanc thanks for the great feedback. Let me try to unpack... Prior to introduction of Now that we have ACL, it may be possible to revisit this setup. In particular, if we scope ACL to per-origin, then both the page origin and resource origin can advertise support separately and these preferences will be persisted across requests. In fact, I suspect that this is what will happen anyway, because for cases like image serving CDNs will advertise support for hints on their own origins, such that any site relying on their service can benefit by default. @yoavweiss wdyt?
I think this would naturally fall out of the origin-scoped model? As in, if opt-in is scoped to HTTPS then we wouldn't send hints to HTTP, unless the HTTP origin also explicitly opted-in. Re, Referer: "I have third party resources on my site, which may have hints enabled, and I want to strip those alongside referrers", is that the use case? If so, that seems legitimate, but using Referer does feel a little action-at-a-distance to me. As in, I wouldn't expect Referrer to control hints? That said, I don't have a better suggestion either. @mikewest wdyt?
Good points, we should flag this in the privacy & security section. |
This seems like a better model than the status quo because it limits the sending of hints to origins which opt into receiving them. However, it still has the issue of revealing new information to providers of third-party subresources which they can't obtain now. This seems fine to do with opt-in from the first party (which can already get the Viewport-Width, etc via client-side scripts and then provide it to the CDN in the URL), but seems less great if the CDN can decide to always get hints in all requests, because of the passive fingerprinting potential. What do you think of double-keying the hints on both the first- and third-party? That is, if you set
I realize this reduces the "on by default" benefit you mentioned, but it seems like a relatively simple opt-in for the first party (which is already a necessary condition in |
Would double-keying have material effect on this though? Yes, implementing it this way reduces passive fingerprinting, but only for the very first visit. Nothing prevents every 3P origin from advertising a blanket opt-in policy (in fact, I expect that's exactly how image CDN's will implement it), which will result in same exposure for any repeat visit. Also, as a side effect, double-keying would expose first vs repeat visit bit? I'm not ruling it out, but it's not clear to me that it would be a big win in this context? |
@arturjanc started a branch @ #373 - ptal. |
Re: double-keying, the point is that if you're visiting a first party without Accept-CH then the browser wouldn't send client hints to a CDN even if it sets ACL. Essentially, if you have a first party which wants to use client hints they can opt into them and then their non-same-origin subresources can get hints if their providers also set ACL (which is fine because the first party could manually pass the hints in the URLs anyway). This prevents CDNs and other origins from which subresources are loaded from getting hints during the user's visit to non-cooperating sites, but should still offer the benefits you're looking for in cases where both parties opt in. I commented on #373 -- it looks good to me in general, modulo the broader double-keying issue (which I believe is important). |
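A minimal sketch of the double-keying idea described above, assuming a hypothetical browser-side store. The class name HintStore, its methods, and the example origins are illustrative only and do not come from the spec or from any implementation. Limiting a third party to the intersection of what it asked for and what the first party opted into is also an assumption of this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class HintStore:
    """Hypothetical store of client-hint opt-ins (illustrative only)."""

    # First-party origin -> hints it opted into via Accept-CH.
    first_party: dict[str, set[str]] = field(default_factory=dict)
    # (first-party origin, third-party origin) -> hints the third party requested.
    double_keyed: dict[tuple[str, str], set[str]] = field(default_factory=dict)

    def record_first_party(self, origin: str, hints: set[str]) -> None:
        self.first_party[origin] = set(hints)

    def record_third_party(self, page_origin: str, origin: str, hints: set[str]) -> None:
        # The preference is keyed on BOTH the embedding page and the subresource origin.
        self.double_keyed[(page_origin, origin)] = set(hints)

    def hints_for(self, page_origin: str, request_origin: str) -> set[str]:
        page_hints = self.first_party.get(page_origin, set())
        if request_origin == page_origin:
            return set(page_hints)
        # No first-party opt-in means nothing is sent to third parties,
        # even if the third party advertised a preference.
        requested = self.double_keyed.get((page_origin, request_origin), set())
        return requested & page_hints
```

With this shape, a third party that advertises a preference while embedded on a non-cooperating site receives nothing there, which is the property the double-keying proposal is after.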
Artur: makes sense, thanks. @yoavweiss @tarunban curious to hear your thoughts on double-keying. Any implementation gotchas here that we should think through? |
No implementation gotchas wrt double-keying. |
This introduces two new requirements - Accept-CH and Accept-CH-Lifetime should be processed for responses originating from potentially trustworthy origins (i.e. HTTPS-only) - Accept-CH-Lifetime preference should be double-keyed, per discussion in #372.
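As a rough illustration of the first requirement, the sketch below ignores Accept-CH and Accept-CH-Lifetime on responses that did not arrive over HTTPS. The function name parse_opt_in is hypothetical. Treating "potentially trustworthy" as "scheme is https" is a simplification (the real definition is broader), and reading the lifetime as a plain number of seconds is an assumption of this sketch.

```python
from __future__ import annotations

from urllib.parse import urlsplit


def parse_opt_in(response_url: str, headers: dict[str, str]) -> tuple[set[str], int] | None:
    """Return (hints, lifetime_seconds), or None if the opt-in must be ignored."""
    # Simplification: only https counts as a potentially trustworthy origin here.
    if urlsplit(response_url).scheme != "https":
        return None

    # Header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw_hints = lowered.get("accept-ch")
    if raw_hints is None:
        return None

    hints = {token.strip() for token in raw_hints.split(",") if token.strip()}

    # A missing or malformed lifetime is treated as 0 (no persistence).
    raw_lifetime = lowered.get("accept-ch-lifetime", "0").strip()
    lifetime = int(raw_lifetime) if raw_lifetime.isdigit() else 0
    return hints, lifetime
```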
Apologies for the late reply (was exploring the Canadian wilderness...) I don't see a particular problem with double keying but I also don't see the privacy advantages of it. As @igrigorik said earlier, all third parties will add an automatic ACHL and will get the hints as soon as the main document opts-in, meaning the situation will not be different from today. I think the concerns that @arturjanc raises are real and ACHL will expose new information regarding origins that are not capable of running scripts in the context of the page, new info regarding viewport, DPR and network conditions that they don't currently have. Maybe we need |
not chair hat Could we please stop calling this use case "CDN"? Most CDNs take all traffic for an origin, not just images, etc. (that's very 1999). It's true that some sites direct images.* to a CDN (for example), but that's not great practice. We should really be talking about this as 3rd party content -- e.g., widgets and ads. Given the history there, it's entirely reasonable to be concerned about increasing fingerprinting exposure. Requiring the origin to opt-in to any 3rd party CHs is an improvement, but not a huge one; it'll just give an incentive to 3rd parties to instruct the origin to set whatever policy we require. I think that at a minimum, we should warn (Security Considerations) specifically about the new vector for 3rd party content fingerprinting here, and allow implementations to decide whether they send it to 3rd parties (at all, or when they're in private browsing mode, or...). I'd like to also see at least consideration of making this 1st party only. Given that the use case for CH is mostly to allow intermediaries to do content optimisation -- and remembering that origins already have other techniques available to them -- I'm wondering if allowing third party origins to do intermediary-imposed optimisations is worth the potential privacy tradeoff here. After all, widgets and ads are usually served by |
While we can argue that in an H2 world this is not the best practice, having a separate domain for images is not uncommon. e.g. looking at https://www.nytimes.com/ its images are served from https://static01.nyt.com. All served from the same CDN (and covered by a single cert AFAICT), but still considered a third party from a SOP perspective. It's also not uncommon to have separate certs to such "static domains" as an infosec requirement. So I don't think we can consider sending CH to third parties as something that will only benefit 3rd party content.
They can similarly add requirements for the origin to include their JS that will beacon up that data to them. There's very little we can do about that other than trust the first party's judgement (and make them aware of what they're enabling).
That would exclude many legitimate use cases, as discussed above.
If they run a script in the context of the main page, they can easily exfiltrate that data, which is available through JS APIs. |
Hey Yoav. Absolutely, but the Web security model doesn't have any concept of CDN; images.* is a third-party host as far as same-origin is concerned. |
Indeed. All I'm saying is that exposing CH to such "first party owned third parties" is a legitimate and very common use-case. I'd prefer we find ways to address that use-case (with appropriate means to maintain user-privacy), rather than block it. |
I agree. If subresources can use CH then it is bound to be misused. The trend is to further restrict access to third-party cookies e.g. Safari’s ITP, so there will be more incentive to find other ways to track. CH headers allow that to be done much more efficiently as there is no need for an extra roundtrip.
|
Thanks everyone for the feedback. Nomenclature discussions on "CDN" vs. "widget" vs "3P" aside, I agree with Yoav that subresource optimization is a critical use case for CH and not something we can omit. It may be the case that some particular hints make sense to be scoped to first-party only, but that's a discussion we should have in the context of individual hints; as a general mechanism, CH must support delivery of hints to all origins, possibly with knobs for 1P origin to control which 3P origins are allowed to receive hints. Current proposal for
FWIW, I think (1) is the right behavior, from a user perspective. However, what it doesn't address is the 1P -> 3P delegation case where 1P origin may want to control if (and which) 3P is allowed to request CH. @arturjanc's proposal (b) is an attempt at this, but a fairly blunt one: 3P is allowed to receive hints if 1P opts in, but this is a blanket policy for all 3P origins and it similarly doesn't provide any fine-grained control for which hints are allowed to be requested. If we all agree that the "1P -> 3P hint delegation" use case is an important one, I think we need to explore mechanisms beyond (b). To that end, squinting at this space, I see close parallels to Feature Policy: we have features (hints) that we want to selectively enable/disable, and we may want to scope them to a list of origins — e.g. enable them for "self" / 1P only, selectively enable them for some set of origins, or disable them outright. And so, here's my crazy proposal of the day:
The other benefit here is that this opens a well-defined way to think about a "default FP policy" for each hint. For example, some hints may be restricted by the UA to be 1P only, other may be on-by-default for everyone (e.g. save-data), and others may be off by default. WDYT? Crazy talk? :) |
It's not the craziest thing I've read today :) It's a bit mind-stretching to think of these as features, but it definitely is a kind of power or trust that is being delegated to other origins by the embedding page, so it's not too far from the goals of feature policy. One question I would bring up would be regarding the inheritance of ACL in deeply-nested frame trees -- Feature Policy's model is that once disabled in a frame, a feature can never be reenabled by any subcontent; I wouldn't want to break that invariant without a real compelling case. With the FP model, if |
I think FP's current inheritance behavior makes perfect sense in this context as well. If I disable use of a particular hint on my site, I expect this policy to propagate to all nested frames. |
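The inheritance invariant discussed above (a hint disabled in a frame cannot be re-enabled by nested content) can be sketched as a running intersection down the frame chain. This is a simplified model, not Feature Policy's actual algorithm; the function name and the representation of a policy as a set of allowed hint names are assumptions made for illustration.

```python
from __future__ import annotations


def effective_hints(frame_chain: list[set[str]]) -> set[str]:
    """Hints still allowed in the innermost frame.

    frame_chain[0] is the set of hints the top-level document allows;
    each later entry is what that nested frame's own policy would allow.
    """
    if not frame_chain:
        return set()
    allowed = set(frame_chain[0])
    for frame_policy in frame_chain[1:]:
        # Intersection can only shrink the set, so a hint dropped by an
        # ancestor can never come back in a descendant.
        allowed &= frame_policy
    return allowed
```

For example, if a middle frame allows only DPR, an inner frame that lists both DPR and Viewport-Width still ends up with DPR alone.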
Naive question: IIUC, any information that can be obtained by an origin via client hints can also be obtained by that origin using a combination of Javascript and cookies. Is it currently possible for 1P origins to control Javascript/cookies behavior of 3P origins? |
At a high level, I am not sure why the client hints permission for foo.com is not the same as (permission for foo.com to run Javascript AND permission for foo.com to store cookies)? |
@tarunban the key difference here is passive content (e.g. images, audio, video, etc): today such resources/origins cannot obtain data that we expose with CH, unless there is active content executing in top level page and scheduling their fetches. I think it's reasonable for origins to have control over whether such passive content providers should be able to request data exposed by CH. |
Really interesting, Ilya. It is a big change :) On the face of it, it seems workable. Feature Policy doesn't have a concept similar to I'm not crazy about the verbosity of Still digesting... |
Interesting indeed!
|
Not essential to the proposal's semantics, but seems like FP is moving away from JSON towards a CSP-like structure. |
LGTM2! :) |
Per discussion in #372, add (optional) suggestion to enforce explicit delegation for 3P origins. For example, a UA may use Feature Policy to implement this — see w3c/webappsec-permissions-policy#129.
Excellent, looks like we've landed in a good place! I've updated the CH spec and resolving this issue. Let's continue the FP discussion in w3c/webappsec-permissions-policy#129. Thanks all for your patience and help here! |
I'm confused by this text, given my understanding of the resolution of this issue:
Why SHOULD and not MUST? Why the qualification of "some or all"? If explicit delegation of permission is required (through feature policy or some other means), shouldn't we just say so? Also, would it help to cite Feature Policy explicitly? |
I don't think we can or should assume that the only "client" for CH is a browser. I may be building a (native) app that's not subject to this and hence SHOULD. Does that sound reasonable? |
I'm not sure I understand. In your non-browser app example, will the opt-in origin not delegate permission to another origin but the app will choose to send Client Hints to that other origin anyway? Is there a reason that behavior needs to be considered conformant? Or just that delegated permission will be communicated in a different way than Feature Policy for non-browser clients? Re-reading this thread, it seems like maybe "some or all" is here because |
The latter. Let's say I'm building a native game app, with assets living on one or multiple CDNs which I know and trust upfront. I may encode my "delegation decision" directly into my code (by appending appropriate CH hints on outbound requests) that's responsible for initiating the fetches; all of this work lives outside of a webview/browser. Which is to say, CH is not browser specific and we should allow for non-FP delegation mechanisms. |
Gotcha. So it sounds like there is agreement that non-browser clients should restrict delivery unless there is explicitly delegated permission, and just that in the non-Web case that permission might be delegated through means other than Feature Policy, e.g. internal policy implemented through server/client code. Can we remove "some or all" or explicitly note the hints that aren't restricted to particular origins? Do we want this to be SHOULD even if it applies to all kinds of servers/clients? |
As outlined in w3c/webappsec-permissions-policy#129, in some cases the default "delegation policy" can be "*" — i.e. deliver to all. Concrete example today is Save-Data, where Chrome decided to deliver this hint on all requests when Data Saver feature is enabled. As such, I think we need to keep the "some or all".
"MUST ... some or all" doesn't seem right? In fact, I'm wondering if I ought to convert all the capital terms in that section to non-normative. |
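To make the "some or all" point concrete, here is a sketch of a per-hint default delegation policy with an optional first-party override. Only the Save-Data default of "*" reflects what is described above for Chrome; the other defaults, the "self" / "none" keywords as used here, and the function name may_send are hypothetical choices for illustration.

```python
from __future__ import annotations

# Hypothetical per-hint defaults: "*" = deliver to all origins,
# "self" = first party only. Only the Save-Data entry is taken from the thread.
DEFAULT_POLICY: dict[str, str] = {
    "Save-Data": "*",
    "DPR": "self",
    "Viewport-Width": "self",
}


def may_send(
    hint: str,
    page_origin: str,
    request_origin: str,
    delegated: dict[str, str | set[str]] | None = None,
) -> bool:
    """Decide whether `hint` may be attached to a request to `request_origin`."""
    delegated = delegated or {}
    # An explicit first-party delegation overrides the default; unknown hints get "none".
    policy = delegated.get(hint, DEFAULT_POLICY.get(hint, "none"))
    if policy == "*":
        return True
    if policy == "self":
        return request_origin == page_origin
    if policy == "none":
        return False
    # Otherwise the policy is an explicit set of origins (assumed to include self implicitly).
    return request_origin == page_origin or request_origin in policy
```

Under these assumed defaults, restriction applies to some hints (DPR, Viewport-Width) but not all (Save-Data), which is the situation the "some or all" wording tries to cover.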
Generally, 2119 requirements in Security Considerations are a bit weird / frowned upon. Requirements are to improve interop; general advice to specifications and implementations aren't what they're for. Also, the W3C TAG has expressed some sadness about these requirements. So personally I think that turning this text into prose is appropriate, FWIW. |
Can the TAG have written comments somewhere so we can understand that better? I agree that 2119 language is intended for interop. Sometimes knowing which things require permission and which don't is both a security consideration and important for interoperability of implementations. |
TAG discussion is here: w3ctag/design-reviews#206 |
Now that Save-Data has been dropped from the spec, would it make sense to revisit the "some or all" language? |
The way I read the spec, secure transport is only required for CH opt-in, but CH could be delegated to non-secure origins via Feature Policy. Do you agree with that reading of the spec? Is it intentional to allow CH for (some) non-secure connections? -- That would provide passive network attackers with additional information to track users which I don't think is good. |
@thiemonagel Save-Data is alive and well, we just moved its definition to NetInfo API. Re, opt-in: you cannot opt-in a non-secure origin. The opt-in only extends to the origin for which it was advertised, and CH is now scoped to only support opt-in on secure origins. |
Thanks! Where can the complete list of client hints be found? Would it make sense to insert a reference?
Thanks! I missed that for 3rd parties both delegation and opt-in are required. (Would it make sense to spell out explicitly that secure transport is a prerequisite, e.g. in 2.1. or in 5.?) |
@dret assuming upstream specs provide the IANA registrations with a CH note, is there a page where we could see all the CH hints in one place? @thiemonagel in the meantime, at least for Chrome: https://client-hints-demo.appspot.com/
We already do in: |
I think we should make that explicit in whatwg/fetch#773, and update the processing so that hints are not added on non-secure connections, regardless of the opt-in mechanism. |
@dret assuming upstream specs provide the IANA registrations with a CH note, is there a page where we could see all the CH hints in one place?
yes, but only if the spec creates a registry of the hints and requires/recommends hints to be registered.
i may understand the spec wrong, but currently there is no clear definition of which header fields can be used in Accept-CH, right? if that's true, then almost by definition *all* registered header fields can be CH hints, right?
for this to be different, a registry might be the way to go.
|
I don't understand the value a registry would provide here. The primary reason for registries -- avoiding collisions -- is already served by the HTTP header registry. Is there any technical use case for being able to recognise that a particular header is a client hint without actually implementing it? |
On Oct 1, 2018, at 08:46, Mark Nottingham ***@***.***> wrote:
I don't understand the value a registry would provide here.
the question was how to have a place where to look up possible client hints. a registry would be one way of achieving that, right?
Is there any technical use case for being able to recognise that a particular header is a client hint without actually implementing it?
that i don't know. i just wanted to point out one possibility how to provide a list of possible client hints.
|
Say I'm a web developer and I'd like to know which values I can set for Accept-CH. Shouldn't the spec answer that question, if only by referring to some other doc? |
Client Hints is a framework describing a pattern of use for other headers. It's not developer documentation, as such; I think we'd expect that need to be filled elsewhere (e.g., books, blog entries, etc.). |
Based on https://groups.google.com/a/chromium.org/forum/#!msg/blink-dev/QHI3sio6--Q/v_zWX1O6AAAJ I have a couple of questions about the impact of Accept-CH-Lifetime as currently defined in https://tools.ietf.org/html/draft-ietf-httpbis-client-hints-04#section-2.2.2
The main risk, touched upon in the Security Considerations section, is the fact that providers of cross-origin subresources (e.g. images) loaded from pages with Accept-CH headers will start receiving fingerprinting-prone information about the configuration of a connecting client. Note that it will expose more information than what was available to these parties in the past because subresources such as images cannot use client-side logic to access information sent in Client Hints. A related issue is that based on the sets of headers sent in such subresource requests, the subresource owner might be able to identify the referring site even if it sets a Referrer Policy to prevent disclosing its URL or origin. For example, requests from a large site which sets Referrer-Policy: no-referrer; Accept-CH: DPR will be distinguishable from requests from sites with Accept-CH: DPR, Viewport-Width and from those without client hints. Depending on the chosen set of hints this can in practice uniquely identify the origin visited by the user.
The introduction of Accept-CH-Lifetime will extend this problem to all resources on a given origin -- if one page sets the header, then subresource requests from all pages in that origin will start carrying hint information. This can be undesirable for the user because it can disclose information about the visited origin and broadcasts fingerprintable information to all parties from which the given origin loads subresources.
It seems like this should be mitigated, possibly by one of the following:
Referer header. Similarly, hints should probably also be stripped on HTTPS -> HTTP transitions.
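The Referrer-Policy concern in the report above can be illustrated with a short sketch of what a subresource provider could do with nothing but the set of hint headers it receives. The site names, the lookup table, and the function name are all hypothetical; the two hint sets mirror the DPR versus DPR, Viewport-Width example in the report.

```python
from __future__ import annotations

from typing import Optional

# Hint header names the (hypothetical) provider looks for, lowercased.
HINT_HEADERS = {"dpr", "viewport-width", "width"}

# Hypothetical mapping the provider has built up: which embedding site
# is known to opt into which exact set of hints.
SITE_BY_HINT_SET: dict[frozenset[str], str] = {
    frozenset({"dpr"}): "large-site.example",
    frozenset({"dpr", "viewport-width"}): "other-site.example",
}


def guess_referring_site(request_headers: dict[str, str]) -> Optional[str]:
    """Guess the embedding site from the set of hint headers alone."""
    seen = frozenset(
        name.lower() for name in request_headers if name.lower() in HINT_HEADERS
    )
    # No Referer header is consulted: the combination of hints is the signal.
    return SITE_BY_HINT_SET.get(seen)
```

A request carrying only DPR maps to one site and a request carrying DPR plus Viewport-Width maps to another, even though neither request includes a Referer header.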