"Herd-privacy" canonicalization #8

Closed
yamdan opened this issue Oct 12, 2022 · 4 comments

Comments

@yamdan
Contributor

yamdan commented Oct 12, 2022

I completely agree with the importance of the "herd-privacy" canonicalization proposed in #4 (comment) by @dlongley when we use c14n with selective disclosure. However, if I understand it correctly, that algorithm still needs improvement: the following normalized datasets CX1 and CX2 appear to pass through the transformation unchanged, i.e., CX1==CY1 and CX2==CY2.

CX1 (obtained from JSON-LD Playground) (==CY1)

_:c14n0 <http://schema.org/name> "Alice" .
_:c14n0 <http://schema.org/spouse> _:c14n1 .
_:c14n1 <http://schema.org/name> "Bob" .

CX2 (obtained from JSON-LD Playground) (==CY2)

_:c14n0 <http://schema.org/name> "Carl" .
_:c14n1 <http://schema.org/name> "Alice" .
_:c14n1 <http://schema.org/spouse> _:c14n0 .

Therefore, even if Alice selectively hides the statement about her spouse, anyone can easily guess whether Bob or Carl is Alice's spouse from the canonicalized identifiers or from the position of the unrevealed statement:

CY1 with selective disclosure

_:c14n0 <http://schema.org/name> "Alice" .
_:c14n0 <http://schema.org/spouse> _:c14n1 .
####### 3rd statement is unrevealed ########

CY2 with selective disclosure

####### 1st statement is unrevealed ########
_:c14n1 <http://schema.org/name> "Alice" .
_:c14n1 <http://schema.org/spouse> _:c14n0 .
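
The leak can be reproduced with a few lines of Python. This is only an illustrative sketch: `disclose` and `guess_spouse` are hypothetical helpers, not part of any canonicalization library, and the observer is assumed to already know that the spouse is either Bob or Carl.

```python
# Canonical statements copied from CX1 and CX2 above.
CX1 = [
    '_:c14n0 <http://schema.org/name> "Alice" .',
    '_:c14n0 <http://schema.org/spouse> _:c14n1 .',
    '_:c14n1 <http://schema.org/name> "Bob" .',
]
CX2 = [
    '_:c14n0 <http://schema.org/name> "Carl" .',
    '_:c14n1 <http://schema.org/name> "Alice" .',
    '_:c14n1 <http://schema.org/spouse> _:c14n0 .',
]


def disclose(statements, hidden_indexes):
    """Hypothetical helper: hide some statements but keep their positions."""
    return [None if i in hidden_indexes else s for i, s in enumerate(statements)]


def guess_spouse(disclosed):
    """Observer's guess, using only the position of the unrevealed statement."""
    hidden_position = disclosed.index(None)
    # Assumption: the observer knows the two candidate datasets in advance.
    return {2: "Bob", 0: "Carl"}[hidden_position]


view1 = disclose(CX1, {2})  # CY1 with the 3rd statement hidden
view2 = disclose(CX2, {0})  # CY2 with the 1st statement hidden

print(guess_spouse(view1))
print(guess_spouse(view2))
```

The same distinction can be made from the labels alone: Alice is `_:c14n0` in the first view and `_:c14n1` in the second.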

What we actually want seems to be something like the following result:

CY1'

_:c14n0 <http://schema.org/name> "Alice" .
_:c14n1 <http://schema.org/name> "Bob" .
_:c14n0 <http://schema.org/spouse> _:c14n1 .

CY2'

_:c14n0 <http://schema.org/name> "Alice" .
_:c14n1 <http://schema.org/name> "Carl" .
_:c14n0 <http://schema.org/spouse> _:c14n1 .
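
For these two datasets only, the target output above can be obtained with a naive relabeling: sort the statements with every blank node label masked out, then assign new labels in order of first appearance. The sketch below is a toy illustration under that assumption, not the algorithm proposed in #4 and not a general solution; `relabel` is a hypothetical name, the regex-based blank node matching is a simplification, and ties between masked statements are not handled.

```python
import re

# Simplified blank node matcher; assumes "_:" never occurs inside IRIs or literals.
BNODE = re.compile(r"_:[A-Za-z0-9]+")

CX1 = [
    '_:c14n0 <http://schema.org/name> "Alice" .',
    '_:c14n0 <http://schema.org/spouse> _:c14n1 .',
    '_:c14n1 <http://schema.org/name> "Bob" .',
]
CX2 = [
    '_:c14n0 <http://schema.org/name> "Carl" .',
    '_:c14n1 <http://schema.org/name> "Alice" .',
    '_:c14n1 <http://schema.org/spouse> _:c14n0 .',
]


def relabel(statements):
    """Toy relabeling: order by label-free content, then renumber blank nodes."""
    # Sort key ignores blank node labels, so it depends only on
    # predicates and literal values.
    ordered = sorted(statements, key=lambda s: BNODE.sub("_:", s))
    mapping = {}
    for statement in ordered:
        for label in BNODE.findall(statement):
            # New labels are issued in order of first appearance.
            mapping.setdefault(label, f"_:c14n{len(mapping)}")
    return [BNODE.sub(lambda m: mapping[m.group(0)], s) for s in ordered]


for line in relabel(CX1) + relabel(CX2):
    print(line)
```

With this ordering, the spouse's name statement is the second statement in both outputs, so hiding it leaves the same gap and the same labels in each case. The approach still depends on the hidden literal's sort position (a name sorting before "Alice" would move the gap), which is one reason it should not be read as a solution.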

I am trying to figure out a solution but haven't found one yet, so I am just submitting this issue for now.

@dlongley
Contributor

The way I'd characterize our work here (in this WG) is that it's important that we enable the output from the canonize algorithm to be easily used by selective disclosure software to accomplish its goals (as opposed to us needing to necessarily fully solve selective disclosure problems ourselves). Of course, if we have a simple and clean example "herd-privacy" post-processing algorithm demonstrating what can be done, it's all the better.

@iherman
Member

iherman commented Oct 13, 2022

> The way I'd characterize our work here (in this WG) is that it's important that we enable the output from the canonize algorithm to be easily used by selective disclosure software to accomplish its goals (as opposed to us needing to necessarily fully solve selective disclosure problems ourselves). Of course, if we have a simple and clean example "herd-privacy" post-processing algorithm demonstrating what can be done, it's all the better.

Hear, hear. Much as this is an exciting subject and technical challenge, this is not part of this WG charter...

@yamdan
Contributor Author

yamdan commented May 10, 2023

It seems to me that we can close this issue since the topic can be discussed as a part of privacy considerations (#84 and related PR).

@yamdan
Contributor Author

yamdan commented Jun 7, 2023

Discussed on 2023-06-07; decided to close because this topic has already been covered in #84 and the related PR.

@yamdan yamdan closed this as completed Jun 7, 2023