Add functionality to compute membership inference risk for each individual point #146
Conversation
Looks good overall. I will probably need to make some formatting changes for consistency (like the printing format). Hopefully this will be merged by the end of the week.
Hi @lwsong, I took another look and left some minor comments. Also, can you add the required types in the function signatures (and change regular lists to np.ndarrays)?
Hi @CdavM, I just fixed all the issues you mentioned. Please take a look, thanks!
Hi @lwsong! Thanks for making these changes. This looks great! I can't merge the PR: the button is grayed out and there's a warning saying "This branch has conflicts that must be resolved". Can you resolve the conflicts? Also, a meeting would be great. I worry that finding a good time during the holiday season will be hard, but I'll set something up in early January. The meeting isn't blocking this PR anyway.
Hi @CdavM , just resolved the conflicts. Let me know if there is an issue. |
Thanks for contributing, it looks great!
I left comments. Sorry if there are quite a few of them; it's very common in software development to have many comments during code review. Of course, I might have missed something or written something incorrect in my comments, in which case please respond (for context, code review comments are usually meant as discussions).
@@ -172,6 +174,85 @@ def run_attacks(attack_input: AttackInputData,
      privacy_report_metadata=privacy_report_metadata)


def _compute_privacy_risk_score(attack_input: AttackInputData,
                                num_bins: int = 15) -> SingleRiskScoreResult:
  """compute each individual point's likelihood of being a member (https://arxiv.org/abs/2003.10595)
Could you please elaborate more in the comments on what this score means?
Added more explanation in the comments
    summary = []
    for single_result in self.risk_score_results:
      single_summary = single_result.collect_results()
      for line in single_summary:
Nit: use summary.extend(single_summary); there's no need for the inner for loop.
already updated the code!
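A minimal sketch of the suggested change, using hypothetical stand-ins for SingleRiskScoreResult objects whose collect_results() returns a list of report lines:

```python
class _FakeResult:
  """Hypothetical stand-in for SingleRiskScoreResult in this sketch."""

  def __init__(self, lines):
    self._lines = lines

  def collect_results(self):
    return self._lines


risk_score_results = [_FakeResult(['line a', 'line b']), _FakeResult(['line c'])]

summary = []
for single_result in risk_score_results:
  # extend() appends every element of the returned list, so no inner loop is needed.
  summary.extend(single_result.collect_results())

print(summary)  # ['line a', 'line b', 'line c']
```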
  min_log_value = np.amin(np.concatenate((train_log_values, test_log_values)))
  max_log_value = np.amax(np.concatenate((train_log_values, test_log_values)))
  bins_hist = np.linspace(min_log_value, max_log_value, num_bins+1)
It looks like replacing np.linspace with np.logspace would let you build the histograms directly on train_values/test_values (i.e. without taking logs first). Is that replacement correct?
If yes, could you please update the code? (It's always better to have simpler code for understanding and maintenance.)
Changed the code to np.logspace!
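A minimal sketch of that binning with hypothetical loss values standing in for the actual attack inputs; np.logspace takes the base-10 exponents of the endpoints and spaces the bin edges evenly in log10 space, so it matches np.linspace applied to the base-10 log values:

```python
import numpy as np

# Hypothetical loss values standing in for the train/test attack inputs.
train_values = np.array([0.01, 0.1, 0.5, 1.0, 2.0])
test_values = np.array([0.05, 0.3, 1.5, 3.0])
num_bins = 15

all_values = np.concatenate((train_values, test_values))
# Bin edges built directly from the raw values, evenly spaced in log10 space.
bins_hist = np.logspace(np.log10(all_values.min()),
                        np.log10(all_values.max()),
                        num_bins + 1)
train_hist, _ = np.histogram(train_values, bins=bins_hist)
train_hist = train_hist / len(train_values)
```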
  train_hist, _ = np.histogram(train_log_values, bins=bins_hist)
  train_hist = train_hist/(len(train_log_values)+0.0)
  train_hist_indices = np.fmin(np.digitize(train_log_values, bins=bins_hist),num_bins)-1
Why -1? Doesn't np.digitize return 0-based indices?
bins_hist has num_bins+1 elements, and np.digitize returns values from 1 to num_bins+1 (https://stackoverflow.com/questions/40880624/binning-in-numpy)
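A small demo of that behaviour with made-up values: in-range values map to 1..num_bins, the rightmost edge maps to num_bins+1, and np.fmin(..., num_bins) - 1 converts everything to 0-based histogram indices.

```python
import numpy as np

num_bins = 4
values = np.array([0.0, 0.3, 0.99, 1.0])         # 1.0 sits exactly on the last edge
bins_hist = np.linspace(0.0, 1.0, num_bins + 1)  # num_bins + 1 edges

raw = np.digitize(values, bins=bins_hist)   # -> [1, 2, 4, 5]
idx = np.fmin(raw, num_bins) - 1            # -> [0, 1, 3, 3], 0-based bin indices
print(raw, idx)
```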
      recall_list.append(true_positive_normalized)
    return np.array(meaningful_threshold_list), np.array(precision_list), np.array(recall_list)


  def collect_results(self, threshold_list=np.array([1,0.9,0.8,0.7,0.6,0.5])):
Using mutable types as default argument values is not allowed by the Google Python Style Guide:
https://google.github.io/styleguide/pyguide.html#212-default-argument-values
Please remove the default argument here. The reason is that it's error-prone (more details in the style guide).
removed the default arguments
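A sketch of the usual way to keep an optional argument without a mutable default (the PR may simply have made the argument required instead); the function body here is a placeholder:

```python
from typing import Optional

import numpy as np


def collect_results(threshold_list: Optional[np.ndarray] = None):
  """Hypothetical sketch of the None-default pattern from the style guide."""
  # Build the default inside the function so no mutable object is shared
  # across calls (the error-prone behaviour the style guide warns about).
  if threshold_list is None:
    threshold_list = np.array([1, 0.9, 0.8, 0.7, 0.6, 0.5])
  return [f'threshold {t}' for t in threshold_list]
```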
"source": [ | ||
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n", | ||
" <td>\n", | ||
" <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/privacy/blob/master/tensorflow_privacy/privacy/membership_inference_attack/codelabs/codelab.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n", |
Minor: This link and the link below seem to point to an incorrect colab.
"\n", | ||
"This part shows how to use the privacy risk score.\n", | ||
"\n", | ||
"For each data slice, we compute privacy risk scores for both training and test data. We then set a threshold on risk scores (an input is inferred as a member if and only if its risk score is higher than the threshold) and compute the attack precision and recall values" |
Would it be possible to add some interpretation of the obtained results? Some questions we could highlight:
- How do the precision/recall figures compare to the membership inference attack? Since the MIA results are right above, it might be helpful to compare the two methods.
- Are there any samples that have high risk scores? Is there anything special about those samples?
- What is the distribution of privacy risk scores? We could probably plot a simple histogram.
For the precision-recall figures, the results will be pretty similar to the threshold attacks based on prediction loss or entropy, since the privacy risk score is computed from the distributions of prediction loss or entropy over training and test data.
The importance of privacy risk score analysis is that we actually compute a risk value for each sample, so that we know which samples have high risk. The precision-recall metric is just one way to present the results.
Indeed, some samples have high risk scores. In the codelab running example, class 3 and class 4 have certain training samples with a risk score of 1. It is a very interesting future direction to explore why those samples have high risk.
I agree that we can plot a histogram of the privacy risk scores.
I suggest a meeting after the holiday season to thoroughly discuss the best way to present the privacy risk score results.
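A minimal sketch of the thresholding described in the codelab text above, with made-up risk scores (the variable names here are illustrative, not the PR's API): an input is predicted to be a member iff its risk score exceeds the threshold, and precision/recall are computed over the combined train/test pool.

```python
import numpy as np

# Made-up privacy risk scores for training (members) and test (non-members).
train_scores = np.array([0.95, 0.8, 0.6, 0.4])
test_scores = np.array([0.7, 0.3, 0.2, 0.1])
threshold = 0.5

# Predicted members: every point whose risk score is above the threshold.
true_positives = (train_scores > threshold).sum()
false_positives = (test_scores > threshold).sum()

precision = true_positives / (true_positives + false_positives)
recall = true_positives / len(train_scores)
print(f'precision={precision:.2f}, recall={recall:.2f}')
```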
Setting up a meeting sounds good, thank you. Some comments (not blocking for this PR; we can address them after the meeting):
> the results will be pretty similar to the threshold attacks based on prediction loss or entropy, since the privacy risk score is computed from the distributions of prediction loss or entropy over training and test data.
This sounds reasonable. Can we compare these numbers in the codelab, as a sanity check?
> It is a very interesting future direction to explore why those samples have high risk.
As a low-hanging fruit, could we plot a few pictures with low and high risk? That might already be very insightful. We're looking for some way to show how this metric works in practice.
> The importance of privacy risk score analysis is that we actually compute a risk value for each sample, so that we know which samples have high risk.
This makes me realize that some of the trained attacks, and the threshold attack, also give some kind of score, which we threshold to classify examples into training / non-training. For the threshold attack, this score is simply the loss. For the neural network classifiers, it is the output of the softmax layer.
Conceptually your score is very similar, except that you provide a different empirical measure.
We discussed this internally, and we think it might make sense to consolidate this with the other membership attacks instead of having it as a separate codepath. This might make the implementation cleaner and simpler.
We're happy to do this refactoring ourselves; we just wanted to let you know about it.
I added the AUC and advantage values to the new codelab file. As expected, the results are similar to the threshold attacks.
I also plotted several figures with high and low risk for each class label in the codelab file; please take a look.
I agree that neural network attack classifiers or threshold attacks can also give some kind of score.
The advantage of our proposed privacy risk score metric is that it closely captures the real likelihood of being in the training set. For example, if we take all training and test samples with a privacy risk score around (let's say) 0.8, then indeed around 80% of those samples are from the training set, whereas the neural network classifier's output usually does not capture the real likelihood of being a member well. You can find more details in our USENIX paper (Figures 3 and 11 in https://arxiv.org/pdf/2003.10595.pdf).
Sure, I can see the benefits of consolidating the privacy risk score part with the other attacks.
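A small sketch of that calibration check with made-up scores (not the PR's data or API): among samples whose risk score is near 0.8, the fraction that are actually training members should be close to 0.8 for a well-calibrated score.

```python
import numpy as np

# Made-up risk scores; 1 marks a training (member) sample, 0 a test sample.
scores = np.array([0.82, 0.79, 0.81, 0.78, 0.80, 0.83, 0.77, 0.81, 0.79, 0.80])
is_member = np.array([1, 1, 1, 1, 0, 1, 0, 1, 1, 1])

# Select samples whose score falls in a small window around 0.8 and check
# which fraction of them are actually members.
near_08 = np.abs(scores - 0.8) <= 0.05
member_fraction = is_member[near_08].mean()
print(member_fraction)  # close to 0.8 for a well-calibrated score
```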
Thanks a lot! Added a small question on the colab. It would be helpful to add an interpretation/discussion of the results.
Hi @dvadym @sushkoy, I updated the code following your comments. For the codelab file, I feel it is better to have a meeting to thoroughly discuss the best way to present the privacy risk score results before I change the code further, so I have left the file as it is for now. Let me know if you have more comments or questions, thanks!
Thanks for addressing the comments!
We've discussed the metric name internally. "Risk" is a very generic term, and for library users (who might be unfamiliar with the details) it might create a wrong impression; e.g. when the training data are public, there is no risk in exposing any information about them. It's better for the name to represent more precisely what the metric means.
Would it be possible to use a name that reflects that, e.g. train_probability (just as an example)? WDYT?
It's fine to keep references to the privacy risk score in comments, since that's how it's named in the paper.
Thanks!
Following our recent paper https://arxiv.org/abs/2003.10595, I implemented code to compute the privacy risk score for each individual sample, which represents its likelihood of being a member. The main function is defined as "_compute_privacy_risk_score" in "membership_inference_attack.py". It computes risk scores for all training and test points, which are passed to the "SingleRiskScoreResult" class in "data_structures.py". I also added "codelab_privacy_risk_score.ipynb" to demonstrate how to run the code, along with test cases for the privacy risk scores.