Add differential privacy as a threat mitigation #91

Merged: cjpatton merged 5 commits into ietf-wg-ppm:main on Aug 2, 2021

Conversation

csharrison
Contributor

@csharrison csharrison commented Jul 28, 2021

Differential privacy can protect against attacks where a portion of the input data is known a priori, or when the size of a batch is small. This PR adds DP as an optional mitigation for PDA deployments and lists where it helps mitigate various threats.

Additionally:

  • Removes a false statement that revealing user input does not compromise other users' privacy. This is not necessarily the case (e.g. imagine if the batch size is 2 or if many users reveal their input).
  • Removes a statement that clients revealing their inputs is outside the threat model (is this still relevant?)
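A minimal sketch of the batch-size-2 case mentioned above. Every name and parameter value here (`aggregate`, `noisy_aggregate`, `epsilon=0.5`, 0/1 inputs, sensitivity 1) is an illustrative assumption, not something taken from the draft. It shows that an exact sum lets a party who knows one input recover the other, while a Laplace-noised sum only yields a noisy estimate.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def aggregate(inputs):
    # Exact aggregate: a plain sum over the batch.
    return sum(inputs)


def noisy_aggregate(inputs, epsilon, sensitivity, rng):
    # Laplace mechanism (assumed here for illustration): noise scale is
    # sensitivity / epsilon.
    return aggregate(inputs) + laplace(sensitivity / epsilon, rng)


# Batch of size 2; the attacker already knows `known_input`.
victim_input, known_input = 1, 0

# Without noise, subtracting the known input reveals the victim's input.
recovered = aggregate([victim_input, known_input]) - known_input

# With noise, the same subtraction gives only victim_input + Laplace noise.
rng = random.Random(0)
estimate = noisy_aggregate(
    [victim_input, known_input], epsilon=0.5, sensitivity=1.0, rng=rng
) - known_input
```

The same subtraction works for any batch in which all but one input is known or revealed, which is why the removed statement did not hold in general.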

@cjpatton cjpatton requested review from cjpatton and tgeoghegan July 29, 2021 17:20
Collaborator

@cjpatton cjpatton left a comment

Excellent! Requested changes are editorial.

draft-pda-protocol.md
## Differential privacy {#dp}

Optionally, PDA deployments can choose to ensure their output F achieves
[differential privacy](https://en.wikipedia.org/wiki/Differential_privacy).
Collaborator

@cjpatton cjpatton Jul 29, 2021

EDITED: Given that the likely trajectory of this document is an IETF WG draft, it would be better to replace this markdown with a reference to a paper about DP. The most useful reference would be something that describes the distinction between client-side and server-side noise addition.

Collaborator

Edited this comment.

Contributor Author

Done. I added a general DP reference in place of the Wikipedia link; it discusses central, local, and multi-party DP.
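For readers unfamiliar with the client-side versus server-side distinction raised in this thread, here is a rough sketch. The function names, the choice of Laplace noise, and the bounded-input assumption (each value lies in `[0, sensitivity]`) are all assumptions made for illustration; the draft does not prescribe any of them.

```python
import random


def laplace(scale, rng):
    # Difference of two i.i.d. exponentials with mean `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def central_sum(values, epsilon, sensitivity, rng):
    # Server-side (central) noise: one draw is added to the exact sum.
    return sum(values) + laplace(sensitivity / epsilon, rng)


def local_sum(values, epsilon, sensitivity, rng):
    # Client-side (local) noise: each client perturbs its own value
    # before upload, so the server only ever sees noisy inputs.
    return sum(v + laplace(sensitivity / epsilon, rng) for v in values)


def noise_variance(mechanism, n_clients, epsilon, sensitivity=1.0):
    # Var[Laplace(0, b)] = 2 * b**2. Central adds one draw to the sum;
    # local adds one independent draw per client.
    per_draw = 2.0 * (sensitivity / epsilon) ** 2
    return per_draw if mechanism == "central" else n_clients * per_draw
```

Under these assumptions, for the same per-draw epsilon the local variant adds `n_clients` times as much variance to the sum, which is one reason the distinction matters when choosing a reference.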

Collaborator

@tgeoghegan tgeoghegan left a comment

I have one question about whether we need more explicit protocol support for DP but if so, that can land in a subsequent PR.

@@ -1253,6 +1257,18 @@ but server implementations may also opt out of participating in a PDA task if
the minimum batch size is too small. This document does not specify how to
choose minimum batch sizes.

## Differential privacy {#dp}
Collaborator

Besides this informal recommendation, do we need explicit protocol support for differential privacy so that collectors can de-noise outputs? We can leave it up to aggregators to decide how they're going to implement DP, but I wonder whether PDAOutputShare should have a field for the epsilon value that the aggregator used. Forgive me if I'm talking nonsense about DP; I am speaking in the terms that we used in Prio v2.

Contributor Author

Hm. I am nervous about being too prescriptive here. In the simplest protocol design nothing is needed, since the epsilon is hardcoded into the specific protocol instantiation and won't change.

In practice, some specific instantiations may want to reveal even more information about how noise was applied, e.g.

  • The distribution the noise is sampled from (Laplace, Gaussian, etc.)
  • Parameters of the noise distribution
  • Any kind of threshold used (for example, if you are using approximate DP)

I think we should make this as opaque to the protocol as possible vs. prescribing some single "epsilon" field which might be too constraining. What do you think?
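To make the "opaque to the protocol" idea concrete, here is one hypothetical shape such metadata could take. `NoiseDescriptor`, its fields, and the JSON encoding are invented for this sketch and are not part of the PDA draft or of `PDAOutputShare`. The protocol would carry only the encoded bytes, and a collector that understands the instantiation could decode them to learn how much noise to expect.

```python
import json
import math
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class NoiseDescriptor:
    # Hypothetical container; field names are assumptions for illustration.
    distribution: str                  # e.g. "laplace" or "gaussian"
    parameters: dict                   # e.g. {"scale": 2.0} or {"sigma": 1.5}
    threshold: Optional[float] = None  # e.g. a cutoff used with approximate DP

    def encode(self) -> bytes:
        # The protocol layer would treat these bytes as opaque.
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")

    @staticmethod
    def decode(blob: bytes) -> "NoiseDescriptor":
        return NoiseDescriptor(**json.loads(blob.decode("utf-8")))

    def stddev(self) -> float:
        # Standard deviation of the added noise, for error estimates.
        if self.distribution == "laplace":
            return math.sqrt(2.0) * self.parameters["scale"]
        if self.distribution == "gaussian":
            return self.parameters["sigma"]
        raise ValueError(f"unknown distribution: {self.distribution}")
```

A collector cannot subtract the noise from an individual output, but knowing the distribution and its parameters lets it put error bars on the result, which a single "epsilon" field would not fully capture.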

Contributor Author

However, I still think this current PR is land-able given that a basic instantiation can hardcode everything without requiring any communication.

Collaborator

The way to go here, I think, is to document the open question by adding an [OPEN ISSUE: blah blah blah].

Contributor Author

Done

Collaborator

I put a note about this in #19, which I think is an appropriate issue to track this discussion.

csharrison and others added 2 commits July 29, 2021 15:46
Support more than one helper

Co-authored-by: Christopher Patton <[email protected]>

@cjpatton cjpatton merged commit 2590598 into ietf-wg-ppm:main Aug 2, 2021