Proposed Work Item: First-Party Sets #17
Comments
Apple supports adopting this proposal as a Privacy CG Work Item. We have proposed similar mechanisms in the past and continue to be interested in this area. In honesty, we would probably not implement the spec as-is because it leaves too many of the hard problems with such a mechanism unsolved or up to each individual browser, but we believe they are eminently solvable, and Privacy CG would be a great place to work through them. |
Echoing Microsoft Edge sentiment from the WICG Discourse thread: we believe that First-Party Sets could be useful in helping unblock valid intra-organizational use cases while maintaining the right privacy promises. We’re supportive of exploring this idea further. Agreed that as a community we’ll need to continue workshopping mitigations against abuse while striking the right balance between organizational cohesion vs. sets that can be reasoned about by most users. We’re hopeful that we can collectively come up with solutions to these considerations, and are interested in continued discussion on First-Party Sets. Privacy CG would be a great home for this. |
Echoing what I wrote on the Discourse thread, I think this proposal is better discussed in WICG. Privacy is a major consideration here, but it is not the overriding or exclusive consideration. The Privacy Group would seem to relegate all other considerations to second-class, which is not appropriate for a standard that has so many implications that go beyond privacy. |
First Party Sets aim to relax privacy (and potentially security) protections on the web. Such protections are an overriding concern but not an exclusive concern. If we don't figure out how to uphold existing protections, browser vendors who prioritize user privacy are unlikely to implement First Party Sets and the end result would be a bifurcated web in terms of how domain names are handled. That's why I think First Party Sets should be discussed in the Privacy CG. This is a place where we have a reasonable chance of figuring out a version of this proposal that's acceptable to most browser vendors. |
If adopted by browsers other than Chrome (like Safari/WebKit), then yes, FPS does have the side effect (not aim) of reducing privacy, and perhaps security, protections. However, within Chrome, it is part of a set of proposals that aim to increase privacy and security, while limiting economic damage to publishers. It is possible that a more desirable outcome for "the web" is:
It seems less likely that an honest conversation across all stakeholders can be had if privacy is the overriding concern. |
I don't understand what "within Chrome" means. Do you mean this is a one-browser feature? If the aim is not to get browser interoperability, I don't see why it should be discussed anywhere within W3C. This is a place where we work together to enhance and develop a web platform that works regardless of which (modern) browser is being used. Given that the goal is interoperability, I think the Privacy CG is the right place to work on First Party Sets. I'll let Google and @krgovind speak to whether they share your views since they are the ones proposing First Party Sets. |
I mean that if Webkit/Safari and other browsers could consider other perspectives (around economic benefits, increased competition, support for diverse voices, etc.) beyond privacy, perhaps an interoperable standard could be created that does decrease privacy in return for other end-user benefits. Or, the choice could be made that an interoperable standard is not possible. However, without an honest conversation around all considerations, it seems that the only possible outcomes of an interoperable FPS standard are:
I'm also very interested in the Chrome team's point of view. |
The Privacy CG has very explicit goals around multi-implementer support and evaluating web compatibility impacts, so a characterization that it doesn't take a holistic view is, in my opinion, unfair. Current Work Items, including Storage Access API and Private Click Measurement, are designed to provide capabilities to help address some of the concerns outlined. Privacy considerations will be an important part of the conversation no matter where this is incubated. The Privacy CG has a more regular cadence for discussion than WICG (which is designed to be lightweight), including twice-a-month teleconferences, breakout sessions, and face-to-faces. It's likely to get more focused time and attention from a diverse set of interests, including both the ads industry and browser developers. As a result, I believe it's likely to move forward more quickly in the Privacy CG. |
I'll disagree with this framing. First Party Sets are aiming to establish a well defined notion of first parties that can safely maintain existing capabilities granted to third parties in order to enable browsers to put greater restrictions on true third parties. It's also important to note that both Firefox and Edge have seen the need to use entities.json for a similar purpose. So this is hopefully standardizing that existing behavior. |
Sorry, I should have been more precise. Today, third party means a registrable domain that differs from the top frame's. With FPS, the intention is to, for at least some engine decisions, treat some such differing registrable domains as first party. That to me is a relaxation. But all of this should be discussed in issues, not the proposal. 🙂 I’ve been wanting to solve this for years, as shown by my two pitches of the idea to WebAppSec in 2017, and I really hope we can get to a definition that holds over time as new business decisions are made based on the existence of FPS and that meets user expectations. I even have some ideas for how to resolve some things. I’ll share once we have a repo. |
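To make the distinction in the comment above concrete, here is a minimal sketch in Python. It assumes registrable domains have already been extracted (no Public Suffix List lookup is performed), and the function names, set contents, and `.example` domains are hypothetical illustrations rather than anything defined by the proposal or by any browser.

```python
from typing import Dict, FrozenSet, Iterable


def build_set_index(sets: Iterable[Iterable[str]]) -> Dict[str, FrozenSet[str]]:
    """Map each member domain to the (hypothetical) set it belongs to."""
    index: Dict[str, FrozenSet[str]] = {}
    for members in sets:
        group = frozenset(m.lower() for m in members)
        for domain in group:
            index[domain] = group
    return index


def is_third_party_today(top_frame_domain: str, resource_domain: str) -> bool:
    """Current notion: any differing registrable domain is third party."""
    return top_frame_domain.lower() != resource_domain.lower()


def is_third_party_with_sets(
    top_frame_domain: str,
    resource_domain: str,
    index: Dict[str, FrozenSet[str]],
) -> bool:
    """Assumed FPS-style notion: domains in the top frame's set are not third party."""
    top = top_frame_domain.lower()
    resource = resource_domain.lower()
    if top == resource:
        return False
    group = index.get(top)
    # A domain that belongs to no set falls back to today's behaviour.
    return group is None or resource not in group


# Hypothetical set asserted by a single organization.
index = build_set_index([["brand-a.example", "brand-b.example"]])
print(is_third_party_today("brand-a.example", "brand-b.example"))
print(is_third_party_with_sets("brand-a.example", "brand-b.example", index))
print(is_third_party_with_sets("brand-a.example", "tracker.example", index))
```

The sketch only shows that the second check is strictly more permissive than the first. Which engine decisions would actually consult such a check is exactly what the thread leaves open.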
How would that work without it being centrally managed? |
@annevk I think what you're advocating for is a centralized/unified UA policy as defined in the current proposal, in order to enable standardization? Please feel free to open an issue on the repo with that suggestion. :) |
I'd like to make a quick argument for the proposal #11, if I may :) Instead of defining a relationship between domains, I believe a better solution is to define the relationship between a domain and the business that owns it. A business may own multiple domains, and therefore relationships between domains can be inferred, potentially serving the same goals as first party sets. In just this regard I believe it has the following advantages:
|
I just want to be really clear about this point: while FPS establishes a set of domains that are owned/controlled/run by the same party, it is not suggesting to treat them as first party to each other such that they would be equivalent to subdomains of the same registrable domain. Perhaps this was a mistake in naming (perhaps "Entity Sets" would be better, to put it in the context of entities.json, and I'm happy to revisit that choice).
And that is why we are all looking to reduce the capabilities of third parties, which this helps to enable. Or are you suggesting that those capabilities are too powerful to allow for a set of domains that are owned/controlled/run by the same party?
As @krgovind pointed out, central management is certainly a possibility, but defining what that means is important. I am very much of the opinion that the current central management isn't working very well for a number of reasons (no clear policy, sets that are clearly wrong, lack of awareness or opt-in from affected entities, etc.). |
@krgovind, would you like to talk about this during this week's telecon? If so, please add the 'agenda+' label to this issue. Thanks! |
This is very similar to the discussion back in spring of 2017 when I called this proposal Same-Origin Policy v2. People thought I proposed relaxing parts of the existing same-origin policy, similar to what you describe above with subdomains. That was never the case and it is not the case here where I say relaxation. There are many more "engine decisions" made on first versus third party than same-origin policy ones. I went through some of them back in 2017 and would like to explore them anew as part of this work item. Some examples:
|
It would be helpful to understand precisely the problems we’d like to solve with First Party Sets, and why those problems can’t be solved through other web platform features or proposals (e.g., the Storage Access API). The definition of “first party” should be clear and understandable to users, web developers, and publishers. The simplest, most natural approach is to enforce a strict one-to-one mapping between first party and registrable domain (i.e., eTLD+1) or a narrower selector (e.g., origin). Using information from the top-level URL is the ideal way to indicate first party because this is already familiar to most users, it is based on a unique identifier for the website owner, it is consistent across web browsers, it is visible in the address bar, and is even visible in a URL to a page that has not yet been visited. Unfortunately, a definition of first party based on top-level URL isn’t compatible with all sites on the web today. Some cross-site applications expect unrestricted access to third-party cookies. For this reason, Mozilla has deployed Disconnect’s entity list. This is a web compatibility intervention that we hope to deprecate as fewer browsers support third-party cookies and fewer sites rely on them. Standardizing such an intervention through First Party Sets solidifies new means of cross-site communication that are unintuitive, and that reduce the accountability a site has to a user. This is the opposite of the direction in which we'd like to move the web. Shared membership in a First Party Set is not easily discoverable. Why should a user expect that a visit to siteA-flowers.example would automatically be correlated to their siteB-roses.example account? We should not have to rely on their shared ownership being implicit knowledge. We don’t see an additional “UI treatment” that will fix the unwanted surprise.
Requiring the user agent to enforce a policy puts too much onus on the user agent in constructing a policy and rules for determining which First Party Sets are permitted. Inconsistent application of those rules, especially between different browsers, creates considerable uncertainty for sites. This creates compatibility problems for all browsers that are most felt by smaller actors, and may force browsers to adopt the most permissive of the policies (as pointed out by Maciej). This might be alleviated by agreeing to a common set of rules, but we don’t expect to reach agreement on those rules, leaving uncertainty where there is no agreement. These issues seem fundamental to the design of the proposal, and hence Mozilla is not supportive of First Party Sets. |
To respond to this specifically: we found |
While the use cases of larger actors are clearer, and these actors have the resources to be more vocal and better represented, we should be cautious about prioritizing the larger actors' use cases above those of smaller actors, particularly if we aim to promote a dynamic and open web. |
I think the definition of an FPS should expand to include domains acting in a cooperative fashion; otherwise FPS heavily favors big companies (google.com and youtube.com, for example). |
I agree with Mozilla's concerns about this proposal. However, I think it's at least possible, if uncertain, that the user-understandability, bad-faith, and interop problems can be solved, and I think it's worth a try. |
A method of addressing the competing concerns the proposal highlights is needed. Two options available are:
Overall the proposal is based on a number of assumptions which do not sit comfortably with both TAG and W3C positions.
|
@krgovind thank you for addressing our support for expanding FPS. With regard to FLoC, our concern is that the only entity with the ability to create FLoCs or cohorts is the browser; we feel that is anti-competitive. What if the FLoCs generated don't perform any better than contextual? What if the FLoCs are unpredictable? It will be very challenging to "preserve a vibrant and competitive open web" if we are made to design bidding strategies against a FLoC created by what is essentially a black box to us. Our desire for expanding FPS tracks directly to our desire to have another trusted entity that can create FLoCs or cohorts. That trusted entity will need cross-domain identity signals to build viable cohorts, similar to what the browser will use, except different in one important way. Whereas the browser could have access to all browsing habits, we are only asking for browsing habits within the FPS. |
Hello @jdwieland8282, I do believe that some of the ideas on how to build TURTLEDOVE-style interest groups should support your desire here: a bunch of sites that band together and jointly create ad targeting audiences based on activity on any of those sites. For prior discussion of more powerful ways to build audiences, check out TURTLEDOVE issue #26, Criteo's SPARROW version, and Facebook's approach. But there's room for a lot of flexibility here. It sounds like you also want to limit these audiences so that they can only be targeted while someone is visiting that same collection of sites? That hasn't come up before, but it would be an easy feature to add. Anyway, if your goal is building cohorts to target ads at, please work with us in making the TURTLEDOVE/SPARROW idea space support your needs. |
Hi @michaelkleber,
Not entirely. TD interest groups do an OK job at retargeting, but there is no mechanism for finding the "next 1000" customers interested in my product or service. Modeling, the idea that given a seed one can predict which other users will be interested, is essential to ad tech and is more or less what (based on my understanding) FLoC does. Criteo's SPARROW version is promising, but FB's proposal won't work for publishers with limited first-party data. FB is unique in that it has many, many users who generate lots of first-party data, which can be used for a seed and modeled audiences.
This is not a conclusion I would draw based on my previous comments. I think we can set it aside for now. The core point I'm making is that we need cohort creation to be possible by more than just the browsers, and the only way for small publishers to generate enough data/signal for this cohort creation is for them to be able to share data horizontally among themselves (not necessarily with advertisers), e.g., within an FPS. Thanks for your comments. I plan to attend the Sparrow Tech workshop next week. |
We definitely do want interest groups to support the "next 1000 customers" use case. The SPARROW Lookalike Targeting section is explicitly about this, and I'm happy to work on how something like the FB proposal can be made available to someone who is a third party on many consenting sites, rather than one large first party. But we (Chrome) are not interested in an approach that involves joining up individual users' browsing histories across many different sites. Our focus is on ways to build audiences that don't require giving out browsing history. First Party Sets is the wrong tool for this problem. |
Given the comments on separate proposals above, I think it would be useful to have a separate discussion on them to see if there is any multi vendor interest. Chrome folks, do you intend to set something like that up for e.g. Turtledove? |
Yes! TURTLEDOVE & SPARROW have just moved into WICG (discourse thread), very much because we want to have multi-vendor conversations about it. |
@krgovind thanks a lot for the reply. You're right, an authority would have to follow a policy in order to sign off on the information given. In my proposal, this would mean verifying that the correct business is being registered for the domain. The correct business should be the one that is named on the published privacy policy on the site. A published privacy policy is already commonplace, and is required by law in certain jurisdictions (e.g. https://gdpr.eu/privacy-notice/). The proposal's main aim is to have some of this information readable by the user agent programmatically, in an effort to reduce the over-prevalence of consent overlays, and to foster better transparency / control of the user's data. My proposal does not go as far as defining UA behaviour/policy, how it should treat two domains owned by a common business, or requirements for their privacy policies to be the same. In that respect, the goals for these two proposals are quite different. However, my argument is that the publication of the domain-to-business relationship suits the goals for this proposal nicely, and may be more useful than the publication of a domains-to-domains relationship - especially if the policy for this proposal ends up being that the domains must have matching business ownership according to their privacy policies. |
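The inference argued for in the comment above, from published domain-to-business records to relationships between domains, might look roughly like the following sketch. The registry contents, business names, and function names are all hypothetical, and the sketch assumes the verification step (an authority confirming the business named in the privacy policy) has already happened.

```python
from typing import Dict, Optional, Set

# Hypothetical, already-verified records: domain -> business named in its privacy policy.
DOMAIN_TO_BUSINESS: Dict[str, str] = {
    "sitea-flowers.example": "Example Florists Ltd",
    "siteb-roses.example": "Example Florists Ltd",
    "unrelated.example": "Other Co",
}


def business_for(domain: str, registry: Dict[str, str]) -> Optional[str]:
    """Look up the business registered for a domain, if any."""
    return registry.get(domain.lower())


def same_business(domain_a: str, domain_b: str, registry: Dict[str, str]) -> bool:
    """Two domains are related only if both are registered to the same business."""
    owner_a = business_for(domain_a, registry)
    owner_b = business_for(domain_b, registry)
    return owner_a is not None and owner_a == owner_b


def inferred_sets(registry: Dict[str, str]) -> Dict[str, Set[str]]:
    """Derive domain groupings by inverting the domain-to-business mapping."""
    groups: Dict[str, Set[str]] = {}
    for domain, business in registry.items():
        groups.setdefault(business, set()).add(domain)
    return groups


print(same_business("sitea-flowers.example", "siteb-roses.example", DOMAIN_TO_BUSINESS))
print(inferred_sets(DOMAIN_TO_BUSINESS))
```

Under these assumptions the domain groupings fall out of the per-domain records, so no separate domains-to-domains list needs to be published. How a user agent should treat the resulting groups is left undefined here, as it is in the comment.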
We have consensus among the @privacycg/chairs and @krgovind (as required by our charter) to adopt this as a Work Item, with @krgovind and @davidben as Editors. I'll work with the @WICG/chairs to transfer the repository over soon. |
@annevk I tried to explain this, but perhaps didn't do a good job. :) Essentially, forcing all sites to move to subdomains of their parent/owner domains would have, in my example, manifested as I will also mention a couple of other use-cases that we learned about:
@pbannist - The example that you mentioned, Geico and Dairy Queen, would actually not be a valid set given our current thinking around the FPS policy. Berkshire Hathaway is a holding company, with Geico and DQ being subsidiaries. Regardless, you do bring up challenges around defining the policy in a way that stays true to first principles, but I'm confident that we can work together towards that goal. Regarding the question of whether ownership/organization is the right principle to design FPS around, there is user research around users' expectations/comfort with being tracked within a first-party. For example, see this paper. Of particular interest are Section 4.2.3, and "Trust" under Section 4.3.2
I agree that it's important to surface FPS affiliation information to users, and we are proposing that it be surfaced in the browser UX. Are you suggesting that this is not sufficient?
Got it. Would this be similar to the P3P project? If so, it may be instructive to study the criticisms, and address how we can overcome those issues with your proposal.
Our specification of FPS as a domain-to-domains relationship is mostly an artifact of needing to find a domain to host the central/unified manifest file on. :) As I mentioned in my previous response, having a single source of truth makes verification and deployment easier. Do you envision a way that we can maintain a central manifest file using a domain-to-business relationship? |
@krgovind that very much depends on the browser UI, no? If sites all moved in that direction, browsers could respond by highlighting the registrable domain even more prominently (or only showing that). |
@annevk : I'm not seeing how browsers highlighting the registrable domain would help this situation, because in the Flickr case, the URL bar would have changed from |
@krgovind The deprecation of 3rd party cookies would be the forcing function which would push sites to consolidate domains to retain some functional benefits they see in a 3rd party cookie world and set up the situation you are solving for above. This drive to consolidate to as few eTLD+1s as possible for functional benefit would not be limited to only 1st parties that are owned by the same organization. You could imagine sites forming a co-op or joining together in a publisher network where you might see two not-co-owned sites Is this something you have considered? |
@krgovind I think that's a good illustration as to why they might not want to do that (those domain transitions would also not be cheap I suspect). |
@brodrigu - I think you are arguing for a solution for publisher consortiums/networks. As discussed earlier on this thread, we think that those should be better served by other APIs such as TURTLEDOVE, and are ideally not compelled to join under a single domain. Note that moving registrable domains like you described also has the cost of losing access to your previous state/cookies, so that would need to be weighed against other incentives. @annevk It sounds like you're taking the position that if a multi-domain site wanted to share data across its domains, the only way it should be allowed to do that is by taking the significant step of consolidating on a single domain? Would that recommendation stand for ccTLD domain variants, as well as for content separated for security reasons (e.g. |
@krgovind It's important to note that the publisher consortium use case is more incentivized than co-owned domains to migrate to a shared eTLD+1 and that if the problem first party sets is trying to avoid is user domain apathy, FPS will likely not be successful if the use case isn't addressed.
Certainly there are tradeoffs, but the upside of sharing an eTLD+1 amongst a trusted consortium is higher than that of currently available alternatives. First Party Sets is a great proposal, but the rigidity of co-ownership as a requirement for set membership hinders its potential to meet a developing security concern. Update: moved to issue: WICG/first-party-sets#17 |
@krgovind on a set of domains that have a common registrable domain as defined by the URL Standard, yes. It's hard enough to get users to grasp that; conveying through UI that two unrelated domains are in a set would go far beyond that and frankly does not really seem feasible. |
Since FPS is now a work item, can we continue the conversation in separate issues? Maybe the editors can find the cycles to migrate the subdomains vs registrable domain set discussion into an issue. 🙏🏼 |
Thanks for the advice, John. I've created WICG/first-party-sets/issues/19 to capture this discussion. |
Closing, as this is now a Work Item. |
First-Party Sets is a web platform mechanism that allows a set of registrable domains (or origins) to be defined as "first-party" to each other. Our primary motivation for this proposal is to define a privacy boundary that allows browsers to eliminate cross-site tracking that currently relies on mechanisms such as third-party cookies and fingerprinting. Tracking policies and privacy models from various browser vendors - Chromium, Edge, Mozilla, WebKit - scope access to user identity to some notion of first-party, which we refer to as a privacy boundary.
Although the top-level document’s registrable domain can act as a natural privacy boundary, it is clear that multi-domain sites are a reality, which compels us to define a better alternative. For example, Firefox ships an entity list to group together domains belonging to the same organization.
Organizations generally prefer maintaining distinct domain names to manage branding, or to allow for future business sales/acquisitions. Additionally, choosing the registrable domain as the privacy boundary may compel organizations to move all their web properties to a single parent domain. The parent domain that a property is hosted on may then change with business ownership, which would train users to make security decisions based on the subdomain component of URLs. This could make them more susceptible to phishing attacks.
First-Party Sets allows site operators to assert a list of domains as being associated with the same entity. This then allows us to define a top-level document’s First-Party Set as the privacy boundary. Browsers may choose to not impose cross-domain communication restrictions across members of a given First-Party Set (such as is done in practice with disconnect.me’s extension, Firefox ETP’s use of the entity list, and Edge Tracking Protection’s similar exception for same-party domains). However, it is important to apply a set of countervailing pressures:
First-Party Sets has recently been the subject of discussion on various forums, including at PrivacyCG F2F and WebAdvBG.
We have been working to incubate First-Party Sets in WICG, and it was recently transferred there: https://github.com/WICG/first-party-sets
We'd like to propose that the Privacy CG discuss it and see if the group would like to take it on as a Work Item.