-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding Privacy Sandbox APIs #129
Adding Privacy Sandbox APIs #129
Conversation
@Yash-Vekaria @max-ostapenko what's the latest on this PR as need to merge it today if we want it included in the Web Almanac 2024. |
@tunetheweb Topics API is ready to merge. Protected Audience API and Attribution APIs have been correctly implemented but I don't have a good example that simulates all the cases, so partial response was recorded on the current test website. I'll let @yohhaan comment if he plans to incorporate any other APIs. @max-ostapenko informed me that he shall be taking a look at the code later today to finalize these APIs and perhaps remove stale code to make the PR ready to merge. |
@tunetheweb let's merge it. |
Hi, |
- previous implementation of Topics Detection was returning skipObservation = true if nothing detected and so unwanted urls were added to array - refactored Topics detection: - result = array with object {origin, skipObservation - escaping special char { and } in regex - adding directly to array if anything detected
Can we add the following test sites to this PR for unit tests (I am still checking if header detection is working as expected)?
Also, I am not sure the async check for the attestation is working/returning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a lot of confusion of origins and domains in this. Suggested some clean ups.
Co-authored-by: Barry Pollard <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
@max-ostapenko any further comments or you happy to merge?
This sort of thing is very verbose and needlessly increases the size of custom metrics for all 15 million sites (most of which likely don't use shared storage):
Can we exclude empty values in all the data in this custom metric? I made a similar comment on #125 (review) because if EVERY custom metric adds this sort of thing then the custom metric (already 8.32TB or $50 for a single query) will become too large for anyone to query it without taking out a mortgage. |
Sounds good, I am on it
|
Per @tunetheweb 's comment: I changed the format of the result object to make it smaller if APIs are not called. @Yash-Vekaria: can you see if you can port the ProtectedAudience and Attribution Reporting APIs to a similar format? |
@yohhaan I couldn't find any way to verify cross-origin files existence. Otherwise I would say we remove the attestation checks from custom metrics completely. |
Same comment, do we need these all to be set to empty values or can we skip them when nothing of interest in them to save storage space/query costs?
|
I am done with the edits I wanted to do on this custom metric, the tests have started to return results as expected since I fixed the double escape issue for the regex. If people have more cycles to commit, they could try to find websites to test on for each API and see that the calls are actually correctly detected. I have done a few tests + the ones from the automated GitHub actions and things look good to me for now. Thanks to @Yash-Vekaria for all the help along the implementation of this custom metric and to @max-ostapenko as well! |
Thanks @yohhaan and @max-ostapenko for bringing this to completion. @tunetheweb All the changes are locked from my side as well. We are ready to merge this PR. @max-ostapenko Extraction of attestation has been removed as it was working in WPT without Privacy Sandbox API capturing in WPT but not with it on live checks (some async/promise issue thats hard to debug due to different behavior). We have decided to analyse it in analysis phase and remove it in custom metrics altogether. |
I still think that is a lot of extra data to store per URL when in a lot of cases there won’t be any different. Is there no way to only report if the defaults are different? I tried adding https://www.example.com to see what it would produce and reran the test but it didn’t pick them up. But I’m presuming it would produce the same as the blank ones? |
@tunetheweb One way to optimize further is that we remove all
Next, for other APIs, we can have a domain-level mapping where key is domain and the value is a list of APIs used by that domain on the current website, rather than having separate keys for each API and same domain repeating in each of them. So, reduced output will look like the following for operafootball.com:
|
That looks a lot better if you can do that. And happy to include permission policy data if it;'s different from the default (i.e. explicitly blocked) but seems silly to log that same default of all true millions of times. |
Custom metrics for https://almanac.httparchive.org/en/2022/WPT test run results: http://webpagetest.httparchive.org/results.php?test=240611_9X_2N Custom metrics for https://www.operafootball.com/WPT test run results: http://webpagetest.httparchive.org/results.php?test=240611_2W_2P {
"_privacy-sandbox": {
"top-level-storage-access": false,
"privacySandBoxAPIUsage": {
"securepubads.g.doubleclick.net": [
"runAdAuction",
"navigator.userAgentData.getHighEntropyValues",
"fencedFrameJs",
"sec-browsing-topics|false",
"attribution-reporting-eligible"
],
"pagead2.googlesyndication.com": [
"runAdAuction",
"navigator.userAgentData.getHighEntropyValues",
"joinAdInterestGroup"
],
"googleads.g.doubleclick.net": [
"sec-browsing-topics|true",
"sec-browsing-topics|false",
"navigator.userAgentData.getHighEntropyValues"
],
"www.googletagmanager.com": [
"joinAdInterestGroup",
"navigator.userAgentData.getHighEntropyValues"
],
"fundingchoicesmessages.google.com": [
"accept-ch|Sec-CH-UA-Arch, Sec-CH-UA-Bitness, Sec-CH-UA-Full-Version, Sec-CH-UA-Full-Version-List, Sec-CH-UA-Model, Sec-CH-UA-WoW64, Sec-CH-UA-Form-Factor, Sec-CH-UA-Platform, Sec-CH-UA-Platform-Version"
],
"cdn.ampproject.org": [
"reportWin"
],
"tpc.googlesyndication.com": [
"navigator.userAgentData.getHighEntropyValues"
],
"www.gstatic.com": [
"navigator.userAgentData.getHighEntropyValues"
],
"www.googleadservices.com": [
"attribution-reporting-eligible",
"attribution-reporting-register-source|destination=https://homeserve.com|epsilon=undefined",
"attribution-reporting-register-source|destination=https://fubo.tv|epsilon=undefined",
"privateAggregation.enableDebugMode",
"attribution-reporting-register-source|destination=https://t-mobile.com|epsilon=undefined"
],
"pixel.tapad.com": [
"accept-ch|Sec-CH-UA\nSec-CH-UA-Arch\nSec-CH-UA-Bitness\nSec-CH-UA-Full-Version-List\nSec-CH-UA-Mobile\nSec-CH-UA-Model\nSec-CH-UA-Platform\nSec-CH-UA-Platform-Version\nSec-CH-UA-WoW64",
"accept-ch|Sec-CH-UA\u0000Sec-CH-UA-Arch\u0000Sec-CH-UA-Bitness\u0000Sec-CH-UA-Full-Version-List\u0000Sec-CH-UA-Mobile\u0000Sec-CH-UA-Model\u0000Sec-CH-UA-Platform\u0000Sec-CH-UA-Platform-Version\u0000Sec-CH-UA-WoW64"
]
}
}
} Custom metrics for https://floc.glitch.me/WPT test run results: http://webpagetest.httparchive.org/results.php?test=240611_KW_2Q {
"_privacy-sandbox": {
"top-level-storage-access": false,
"privacySandBoxAPIUsage": {
"floc.glitch.me": [
"document.interestCohort"
]
}
}
} Custom metrics for https://pets-animals-pets-cats.glitch.me/WPT test run results: http://webpagetest.httparchive.org/results.php?test=240611_G3_2R {
"_privacy-sandbox": {
"top-level-storage-access": false,
"privacySandBoxAPIUsage": {
"pets-animals-pets-cats.glitch.me": [
"document.browsingTopics|false"
],
"topics-demo.glitch.me": [
"document.browsingTopics|false"
]
}
}
} Custom metrics for https://tennis-tennis.glitch.me/WPT test run results: http://webpagetest.httparchive.org/results.php?test=240611_V5_2S {
"_privacy-sandbox": {
"top-level-storage-access": false,
"privacySandBoxAPIUsage": {
"tennis-tennis.glitch.me": [
"document.browsingTopics|true"
],
"topics-server.glitch.me": [
"sec-browsing-topics|false"
]
}
}
} Custom metrics for https://protected-audience-demo.web.app/WPT test run results: http://webpagetest.httparchive.org/results.php?test=240611_60_2T {
"_privacy-sandbox": {
"top-level-storage-access": false,
"privacySandBoxAPIUsage": []
}
} Custom metrics for https://protected-audience-demo-advertiser.web.app/WPT test run results: http://webpagetest.httparchive.org/results.php?test=240611_DD_2V {
"_privacy-sandbox": {
"top-level-storage-access": false,
"privacySandBoxAPIUsage": {
"protected-audience-demo-dsp.web.app": [
"joinAdInterestGroup"
]
}
}
} Custom metrics for https://protected-audience-demo-publisher.web.app/WPT test run results: http://webpagetest.httparchive.org/results.php?test=240611_3E_2W {
"_privacy-sandbox": {
"top-level-storage-access": false,
"privacySandBoxAPIUsage": {
"protected-audience-demo-ssp.web.app": [
"runAdAuction"
]
}
}
} Custom metrics for https://fedcm-rp-demo.glitch.me/WPT test run results: http://webpagetest.httparchive.org/results.php?test=240611_AV_2X {
"_privacy-sandbox": {
"top-level-storage-access": false,
"privacySandBoxAPIUsage": {
"fedcm-idp-demo.glitch.me": [
"navigator.credentials.get"
]
}
}
} Custom metrics for https://fedcm-idp-demo.glitch.me/WPT test run results: http://webpagetest.httparchive.org/results.php?test=240611_8T_2Y {
"_privacy-sandbox": {
"top-level-storage-access": false,
"privacySandBoxAPIUsage": {
"fedcm-idp-demo.glitch.me": [
"navigator.credentials.get",
"IdentityProvider.getUserInfo"
]
}
}
} Custom metrics for https://private-state-token-redeemer.glitch.me/WPT test run results: http://webpagetest.httparchive.org/results.php?test=240611_6V_2Z {
"_privacy-sandbox": {
"top-level-storage-access": false,
"privacySandBoxAPIUsage": []
}
} Custom metrics for https://private-state-token-issuer.glitch.me/WPT test run results: http://webpagetest.httparchive.org/results.php?test=240611_PT_30 {
"_privacy-sandbox": {
"top-level-storage-access": false,
"privacySandBoxAPIUsage": []
}
} Custom metrics for https://user-agent-client-hints.glitch.me/WPT test run results: http://webpagetest.httparchive.org/results.php?test=240611_X6_31 {
"_privacy-sandbox": {
"top-level-storage-access": false,
"privacySandBoxAPIUsage": []
}
} Custom metrics for https://shared-storage-demo.web.app/WPT test run results: http://webpagetest.httparchive.org/results.php?test=240611_2G_32 {
"_privacy-sandbox": {
"top-level-storage-access": false,
"privacySandBoxAPIUsage": []
}
} Custom metrics for https://www.example.com/WPT test run results: http://webpagetest.httparchive.org/results.php?test=240611_Q8_33 {
"_privacy-sandbox": {
"top-level-storage-access": false,
"privacySandBoxAPIUsage": []
}
} |
@tunetheweb I have refactored the code as shared in the previous comment. If it looks good, we can merge. |
Thanks for that. Merging! |
Implemented capturing the following Privacy Sandbox APIs:
Test websites: