"Herd-privacy" canonicalization #8

yamdan · 2022-10-12T09:13:40Z

I completely agree with the importance of the "herd-privacy" canonicalization proposed in #4 (comment) by @dlongley when we use c14n with selective disclosure. However, if I understand it correctly, the above algorithm still needs improvement: it seems to me that the following normalized datasets CX1 and CX2 are not modified by the above transformation, i.e., CX1 == CY1 and CX2 == CY2.

CX1 (obtained from JSON-LD Playground) (==CY1)

_:c14n0 <http://schema.org/name> "Alice" .
_:c14n0 <http://schema.org/spouse> _:c14n1 .
_:c14n1 <http://schema.org/name> "Bob" .

CX2 (obtained from JSON-LD Playground) (==CY2)

_:c14n0 <http://schema.org/name> "Carl" .
_:c14n1 <http://schema.org/name> "Alice" .
_:c14n1 <http://schema.org/spouse> _:c14n0 .

Therefore, even if Alice selectively hides the statement about her spouse's name, anyone can easily guess whether Bob or Carl is Alice's spouse from the canonicalized identifiers or from the position of the unrevealed statement:

CY1 with selective disclosure

_:c14n0 <http://schema.org/name> "Alice" .
_:c14n0 <http://schema.org/spouse> _:c14n1 .
####### 3rd statement is unrevealed ########

CY2 with selective disclosure

####### 1st statement is unrevealed ########
_:c14n1 <http://schema.org/name> "Alice" .
_:c14n1 <http://schema.org/spouse> _:c14n0 .
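
The leak can be made concrete with a small sketch. It is only an illustration under assumptions: the observer is assumed to know that the hidden dataset is one of the two above, unrevealed statements are assumed to appear as empty slots in canonical order, and the names `HIDDEN`, `alice_label`, `hidden_positions`, and `guess_spouse` are hypothetical helpers, not part of any specified algorithm.

```python
import re

HIDDEN = None  # hypothetical placeholder for an unrevealed statement

# Disclosed views of CY1 and CY2, kept in canonical order.
cy1 = [
    '_:c14n0 <http://schema.org/name> "Alice" .',
    '_:c14n0 <http://schema.org/spouse> _:c14n1 .',
    HIDDEN,
]
cy2 = [
    HIDDEN,
    '_:c14n1 <http://schema.org/name> "Alice" .',
    '_:c14n1 <http://schema.org/spouse> _:c14n0 .',
]

def alice_label(view):
    """Return the blank node label of the subject named "Alice", if revealed."""
    pattern = r'(_:\S+) <http://schema\.org/name> "Alice" \.$'
    for stmt in view:
        if stmt is HIDDEN:
            continue
        match = re.match(pattern, stmt)
        if match:
            return match.group(1)
    return None

def hidden_positions(view):
    """Return the indices of the unrevealed statements."""
    return [i for i, stmt in enumerate(view) if stmt is HIDDEN]

def guess_spouse(view):
    """Guess the hidden spouse, assuming only the two datasets above are possible."""
    return "Bob" if alice_label(view) == "_:c14n0" else "Carl"

for name, view in (("CY1", cy1), ("CY2", cy2)):
    print(name, alice_label(view), hidden_positions(view), guess_spouse(view))
```

Either signal alone (Alice's label or the index of the hidden slot) is enough to tell the two cases apart in this example.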

What we actually want seems to be the following result:

CY1'

_:c14n0 <http://schema.org/name> "Alice" .
_:c14n1 <http://schema.org/name> "Bob" .
_:c14n0 <http://schema.org/spouse> _:c14n1 .

CY2'

_:c14n0 <http://schema.org/name> "Alice" .
_:c14n1 <http://schema.org/name> "Carl" .
_:c14n0 <http://schema.org/spouse> _:c14n1 .
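
One way to state the desired property is that the two disclosed views coincide once the spouse's name is hidden. The sketch below only checks that property for CY1' and CY2'; it does not show how to produce them. The `disclose` helper and the use of `None` for an unrevealed statement are assumptions made for illustration.

```python
def disclose(statements, hidden):
    """Return the statements with the given indices replaced by None."""
    return [None if i in hidden else stmt for i, stmt in enumerate(statements)]

cy1_prime = [
    '_:c14n0 <http://schema.org/name> "Alice" .',
    '_:c14n1 <http://schema.org/name> "Bob" .',
    '_:c14n0 <http://schema.org/spouse> _:c14n1 .',
]
cy2_prime = [
    '_:c14n0 <http://schema.org/name> "Alice" .',
    '_:c14n1 <http://schema.org/name> "Carl" .',
    '_:c14n0 <http://schema.org/spouse> _:c14n1 .',
]

# Hide the statement carrying the spouse's name (index 1 in both datasets).
view1 = disclose(cy1_prime, {1})
view2 = disclose(cy2_prime, {1})

# The views are identical, so labels and positions reveal nothing here.
print(view1 == view2)  # True
```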

I am trying to figure out a solution but haven't found one yet, so I am just submitting this issue for now.

dlongley · 2022-10-12T14:58:03Z

The way I'd characterize our work here (in this WG) is that it's important that we enable the output from the canonize algorithm to be easily used by selective disclosure software to accomplish its goals (as opposed to us needing to necessarily fully solve selective disclosure problems ourselves). Of course, if we have a simple and clean example "herd-privacy" post-processing algorithm demonstrating what can be done, it's all the better.

iherman · 2022-10-13T04:13:35Z

> The way I'd characterize our work here (in this WG) is that it's important that we enable the output from the canonize algorithm to be easily used by selective disclosure software to accomplish its goals (as opposed to us needing to necessarily fully solve selective disclosure problems ourselves). Of course, if we have a simple and clean example "herd-privacy" post-processing algorithm demonstrating what can be done, it's all the better.

Hear, hear. Much as this is an exciting subject and technical challenge, it is not part of this WG's charter...

yamdan · 2023-05-10T15:06:39Z

It seems to me that we can close this issue since the topic can be discussed as a part of privacy considerations (#84 and related PR).

yamdan · 2023-06-07T14:43:43Z

Discussed on 2023-06-07; decided to close because this topic has already been covered in #84 and the related PR.

yamdan mentioned this issue Mar 13, 2023

Some privacy considerations #84

Closed

yamdan added the propose closing label May 10, 2023

yamdan closed this as completed Jun 7, 2023
