Fingerprinting via machine-specific artifacts #85

Closed
kpu opened this issue Aug 27, 2020 · 5 comments · Fixed by #271
Labels
privacy-tracker Group bringing to attention of Privacy, or tracked by the Privacy Group but not needing response.

Comments

@kpu

kpu commented Aug 27, 2020

Apropos of #3 and webmachinelearning/webmachinelearning-ethics#22, an efficient matmul implementation can be fingerprinted to determine hardware capabilities.

On pre-VNNI Intel, the only efficient way to implement 8-bit multiplication is via pmaddubsw, which multiplies unsigned by signed bytes and sums adjacent pairs of 16-bit products horizontally with saturation. I can construct matrices that test for this saturation, which indicates a pre-VNNI Intel CPU. ARM and NVIDIA, by contrast, implement signed * signed multiplication accumulated to 32 bits.
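
As an illustrative sketch of that probe (plain TypeScript; the `Int8Matmul` entry point is hypothetical and stands in for whatever backend a page can exercise): on the pmaddubsw path each pair of unsigned × signed byte products is summed into a saturating 16-bit intermediate, so inputs of 255 and 127 overflow it, while a 32-bit accumulation path returns the exact dot product.

```typescript
// Hypothetical int8 matmul entry point: a is unsigned, b is signed, k is the
// shared dimension; returns a single output element. Any real backend
// (a WebNN graph, a Wasm kernel, ...) could be substituted here.
type Int8Matmul = (a: Uint8Array, b: Int8Array, k: number) => number;

function probeInt8Accumulator(matmul: Int8Matmul): "16-bit saturating" | "32-bit" {
  const a = new Uint8Array([255, 255]); // unsigned operand
  const b = new Int8Array([127, 127]);  // signed operand
  // Exact dot product: 255*127 + 255*127 = 64770.
  // pmaddubsw saturates the pairwise 16-bit sum to 32767 (pre-VNNI Intel);
  // a 32-bit accumulator (VNNI, ARM, NVIDIA) returns 64770.
  return matmul(a, b, 2) === 32767 ? "16-bit saturating" : "32-bit";
}
```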

Saturating addition, which should be used for accuracy lest you generate large sign errors, can be used to infer the order of operations. So vpdpbusds saturation tells me what order the matmul ran in.
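
To make the order-of-operations point concrete, a minimal sketch (emulated in TypeScript, not any particular backend) of how a 32-bit saturating accumulator, as vpdpbusds uses, produces results that depend on accumulation order:

```typescript
// Emulated 32-bit saturating accumulation. The same partial products summed
// in different orders give different results once saturation kicks in, so
// the output exposes the order in which the matmul accumulated.
const INT32_MAX = 2147483647;
const INT32_MIN = -2147483648;

function addSat32(a: number, b: number): number {
  return Math.min(INT32_MAX, Math.max(INT32_MIN, a + b));
}

// Partial dot-product contributions crafted to straddle the saturation point.
const parts = [2_000_000_000, 2_000_000_000, -2_000_000_000];

const leftToRight = parts.reduce(addSat32);                          // 147483647 (saturated early)
const reordered = addSat32(addSat32(parts[0], parts[2]), parts[1]);  // 2000000000 (never saturated)
console.log(leftToRight, reordered);
```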

The slowdown from using AVX512 instructions is likely detectable with timing.

In floating point one can also infer the order of operations from rounding. This would reveal the SIMD length and possibly variations in the compiler used to build the user agent. A cache-efficient matmul implementation reveals cache sizes via the floating-point order of operations.
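
For the floating-point case, a minimal sketch (float32 emulated with Math.fround) showing how the same dot-product terms accumulated sequentially versus in SIMD-width lanes round to different values, which is the signal a crafted input exploits:

```typescript
// The same terms summed sequentially vs. in SIMD-style lanes round differently
// in float32, so the result reveals the effective accumulation width.
function sumSequential(x: Float32Array): number {
  let acc = 0;
  for (const v of x) acc = Math.fround(acc + v);
  return acc;
}

function sumLanes(x: Float32Array, lanes: number): number {
  // Accumulate into `lanes` partial sums, then reduce: the order a SIMD
  // implementation of that width would effectively use.
  const partial = new Float32Array(lanes);
  for (let i = 0; i < x.length; i++) {
    partial[i % lanes] = Math.fround(partial[i % lanes] + x[i]);
  }
  let acc = 0;
  for (const p of partial) acc = Math.fround(acc + p);
  return acc;
}

// Values chosen so rounding differs by accumulation order.
const x = new Float32Array([1e8, 1, 1, 1, 1, 1, 1, 1, -1e8, 1, 1, 1, 1, 1, 1, 1]);
console.log(sumSequential(x), sumLanes(x, 4), sumLanes(x, 8)); // 7 13 14
```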

@dontcallmedom dontcallmedom added the privacy-tracker Group bringing to attention of Privacy, or tracked by the Privacy Group but not needing response. label Feb 12, 2021
@anssiko
Member

anssiko commented May 13, 2021

Per https://www.w3.org/2021/05/13-webmachinelearning-minutes.html#t08 @huningxin will solicit input from Wasm people and will report back.

@huningxin
Contributor

Here is some input from @jonathanding and @jing-bao, thanks very much!

  1. Saturation and rounding (round-to-nearest, ties-to-even) are standardized in Wasm SIMD, so JS developers should see the same saturation behavior on different architectures (see the sketch after this list).
  2. There is an early proposal in Wasm SIMD called Relaxed SIMD, which aims to relax some of the strict determinism requirements on instructions to unlock near-native performance across platforms. Fingerprinting would also be considered there.
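
A minimal sketch of what item 1 means in practice: Wasm SIMD's saturating operations (e.g. i16x8.add_sat_s) are specified to clamp to the lane's integer range identically on every architecture, so the saturation probe from the original comment cannot distinguish CPUs when the kernel is compiled to (non-relaxed) Wasm SIMD. Emulated per lane in TypeScript:

```typescript
// The per-lane behavior Wasm SIMD specifies for i16x8.add_sat_s: saturating
// addition clamped to [-32768, 32767], identical on every architecture
// (inputs assumed to already be in int16 range).
function i16AddSat(a: number, b: number): number {
  return Math.min(32767, Math.max(-32768, a + b));
}

console.log(i16AddSat(32000, 1000)); // 32767 everywhere
```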

@anssiko
Member

anssiko commented Jun 3, 2021

In PR #170 we incorporated the following statement to inform implementers about this possible fingerprinting vector, and added a pointer to this issue:

[...] An execution time analysis may reveal indirectly the performance of the underlying platform's neural network hardware acceleration capabilities relative to another underlying platform.

Note: The group is soliciting further input on the proposed execution time analysis fingerprinting vector and will augment this section with more information and mitigations to inform the implementers of this API.

See https://webmachinelearning.github.io/webnn/#privacy

This issue was discussed on our 2021-05-27 call and we decided to keep this issue open to solicit further feedback.
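
To illustrate the execution time analysis vector quoted above, a minimal sketch of a timing probe; `runInference` is a hypothetical stand-in for whatever compute path a page can exercise (a WebNN graph, a Wasm matmul, etc.):

```typescript
// Minimal timing probe: relative timings across workload shapes can hint at
// the class of acceleration hardware on the underlying platform.
async function timeWorkload(runInference: () => Promise<void>, iterations = 10): Promise<number> {
  await runInference();                 // warm-up (compilation, caches)
  const start = performance.now();
  for (let i = 0; i < iterations; i++) {
    await runInference();
  }
  return (performance.now() - start) / iterations;  // mean ms per run
}
```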

@krgovind

krgovind commented Feb 22, 2022

[Leaving feedback on behalf of Chrome privacy reviewers, since we also would like to understand the fingerprinting abilities of this API.]

See https://webmachinelearning.github.io/webnn/#privacy

Thank you for capturing the fingerprinting considerations in this section. I have a couple of additional questions:

  • The section does a great job covering machine-specific performance considerations. Would it be possible to also address "machine-specific artifacts", as covered in the WebGPU document?
  • I see that the section points to the WebGPU Privacy Considerations section. Are there any additional considerations or orthogonal fingerprinting vectors for WebNN, such as when dedicated ML hardware accelerators are in use?

It would also be great if you could surface recommended mitigations for implementers to minimize the fingerprinting risk in this section.

anssiko added a commit that referenced this issue Jun 7, 2022
Add the following to Privacy Considerations:

- Machine-specific artifacts
- Device selection
- Future device types

Fix #85
Fix #175
Related #169
@anssiko anssiko changed the title from "Fingerprinting via matmul" to "Fingerprinting via machine-specific artifacts" Jun 9, 2022
@anssiko
Member

anssiko commented Jun 9, 2022

(Renamed the issue to better reflect the broader scope of this consideration.)
