From f0d19739491031db75da61ddbefa92a7856f488a Mon Sep 17 00:00:00 2001
From: Dan Yamamoto
Add text that warns implementers using this specification in selective disclosure schemes.
+ Privacy considerations here are primarily worth discussing when the canonicalization scheme is used for
+ privacy-respecting signed RDF datasets, and are likely acceptable for other use cases. One example of the former is a
+ verifiable credential with selective disclosure.
@@ -2686,8 +2690,249 @@ Serialization
Privacy Considerations
- Selective Disclosure Schemes
+ Data Leakage in Selective Disclosure Schemes
Selective Disclosure Schemes
which might be enough to disclose information beyond what the discloser intended to
disclose.
+ Selective disclosure is the ability for someone to share only some of the statements from a signed dataset, without
+ harming the ability of the recipient to verify the authenticity of those selected statements.
+ The output of the canonicalization algorithm described in this specification may leak partial
+ information about undisclosed statements and help an adversary correlate the original and disclosed datasets.
+ If a dataset contains at least two blank nodes, the canonical labeling can be exploited to guess an undisclosed
+ quad in the dataset.
+ For example, let us assume we have the following dataset to be signed. (Note: this person is fictitious, prepared
+ only to make this example work.)
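+ A minimal sketch of such a dataset is shown below in N-Quads; the particular predicates, values, and the two
+ blank nodes are assumptions made only for illustration:
+
+ # Hypothetical input dataset (all terms are assumed for this example)
+ _:e0 <http://schema.org/familyName> "Smith" .
+ _:e0 <http://schema.org/gender> "Female" .
+ _:e0 <http://schema.org/givenName> "Jane" .
+ _:e0 <http://schema.org/jobTitle> "Professor" .
+ _:e0 <http://schema.org/worksFor> _:e1 .
+ _:e1 <http://schema.org/name> "Example University" .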
+ Using , we can obtain the serialized canonical form of the
+ normalized dataset, where all the blank nodes are serialized using the canonical labels.
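+ Continuing the sketch above, and assuming the labeling algorithm happens to assign `_:c14n0` to the employer
+ node and `_:c14n1` to the person, the canonical form would look like this:
+
+ # Hypothetical canonical form, sorted in code point order (label assignment is assumed)
+ _:c14n0 <http://schema.org/name> "Example University" .
+ _:c14n1 <http://schema.org/familyName> "Smith" .
+ _:c14n1 <http://schema.org/gender> "Female" .
+ _:c14n1 <http://schema.org/givenName> "Jane" .
+ _:c14n1 <http://schema.org/jobTitle> "Professor" .
+ _:c14n1 <http://schema.org/worksFor> _:c14n0 .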
+ The signer can generate a signature for the dataset by first hashing each statement and then signing them
+ using a multi-message digital signature scheme like BBS+. The resulting dataset with signature is passed to the
+ holder, who can control whether or not to disclose each statement while maintaining their verifiability.
+ Let us say that the holder wants to show her attributes except for `gender` to a verifier. Then the holder should
+ disclose the following partial dataset. (Note: proofs omitted here for brevity)
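+ In the running sketch, the disclosed dataset would be the canonical form with the `gender` statement removed:
+
+ # Hypothetical disclosed dataset (gender withheld)
+ _:c14n0 <http://schema.org/name> "Example University" .
+ _:c14n1 <http://schema.org/familyName> "Smith" .
+ _:c14n1 <http://schema.org/givenName> "Jane" .
+ _:c14n1 <http://schema.org/jobTitle> "Professor" .
+ _:c14n1 <http://schema.org/worksFor> _:c14n0 .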
+ However, in this example, anyone can guess the unrevealed statement by exploiting the canonical labels and order.
+ Since the dataset was sorted in the canonical order, we can tell that the hidden statement must start with
+ `_:c14n1 <http://schema.org/[f-g]`, which helps us guess that the hidden predicate is
+ `<http://schema.org/gender>` with high probability. Alternatively, we can assume that the guesser already has
+ such knowledge via the public credential schema.
+ Then, if the canonical labeling produces different results depending on the gender value, we can use it to deduce the
+ gender value. In fact, this example produces different results depending on whether the gender is `Female` or `Male`.
+ (Note: other gender values are ignored here just for brevity.)
+ The following example shows that `gender` = `Male` yields a different canonical labeling.
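+ Continuing the sketch, suppose the hash-derived labels happen to flip when the literal changes to `Male` (an
+ assumption for illustration):
+
+ # Hypothetical canonical form when gender is "Male" (flipped labels are assumed)
+ _:c14n0 <http://schema.org/familyName> "Smith" .
+ _:c14n0 <http://schema.org/gender> "Male" .
+ _:c14n0 <http://schema.org/givenName> "Jane" .
+ _:c14n0 <http://schema.org/jobTitle> "Professor" .
+ _:c14n0 <http://schema.org/worksFor> _:c14n1 .
+ _:c14n1 <http://schema.org/name> "Example University" .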
+ So the verifier would have obtained the following dataset if `gender` had the value `Male`, which differs from the
+ revealed dataset. Therefore, the verifier can conclude that the `gender` is `Female`.
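+ Under the assumed `Male` labeling, the disclosed dataset would instead be:
+
+ # Hypothetical disclosed dataset under the "Male" labeling (gender withheld)
+ _:c14n0 <http://schema.org/familyName> "Smith" .
+ _:c14n0 <http://schema.org/givenName> "Jane" .
+ _:c14n0 <http://schema.org/jobTitle> "Professor" .
+ _:c14n0 <http://schema.org/worksFor> _:c14n1 .
+ _:c14n1 <http://schema.org/name> "Example University" .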
+ Note that we can use the same approach to guess non-boolean values if the range of possible values is still a
+ reasonable (small) size for a guesser to try all the possibilities.
+ By making the canonicalization process private, we can prevent a brute-forcing attacker from trying to see the
+ labeling change by trying multiple possible attribute values.
+ For example, we can use an HMAC instead of a hash function in the canonicalization algorithm. Alternatively, we can
+ add a secret random nonce (always undisclosed) into the dataset.
+ Note that these workarounds force dataset issuers and holders to manage shared secrets securely.
+ We also note that these workarounds adversely affect the unlinkability described below because canonical labeling now
+ varies depending on the secret shared by the issuer and the holder, which will help correlate them.
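+ As a sketch of the nonce workaround, the issuer could add one quad like the following to the input dataset
+ (the predicate and value here are assumptions) and require that the holder never disclose it; because the
+ random value feeds into the hashes, the resulting canonical labels are no longer predictable from the visible
+ attribute values alone:
+
+ # Hypothetical always-undisclosed nonce quad (predicate and value are assumed)
+ _:e0 <http://example.org/vocab#nonce> "d3c1f0a2-8f41-4a6e" .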
+ The canonical order can leak unrevealed information even without canonical labeling.
+ Let us assume that the holder has the following signed dataset, sorted in the canonical (code-point) order.
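+ A sketch of such a signed dataset (the subject and all names are assumptions chosen for illustration, with
+ `:a` abbreviating an IRI as in the prose above):
+
+ # Hypothetical signed dataset, sorted in code point order
+ :a <http://schema.org/children> "Alan" .
+ :a <http://schema.org/children> "Alex" .
+ :a <http://schema.org/children> "Alyssa" .
+ :a <http://schema.org/familyName> "Smith" .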
+ If the holder wants to hide the statement for their second child for any reason, the disclosed dataset now looks like
+ this:
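+ In the sketch, hiding the second child leaves:
+
+ # Hypothetical disclosed dataset ("Alex" withheld)
+ :a <http://schema.org/children> "Alan" .
+ :a <http://schema.org/children> "Alyssa" .
+ :a <http://schema.org/familyName> "Smith" .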
+ Knowing that these statements are sorted in the canonical order, we can guess that the hidden statement must start
+ with `:a <http://schema.org/children> "Al`, which leaks the subject (`:a`), the predicate
+ (`<http://schema.org/children>`), and the first two letters of the object (`"Al"`) of the hidden statement.
+ To avoid this leakage, the dataset issuer can randomly shuffle the normalized statements before signing and issuing
+ them to the holder, preventing others from guessing undisclosed information from the canonical order.
+ However, similar to the workarounds mentioned above, this workaround also adversely affects unlinkability. This is
+ because there are $n!$ different permutations for shuffling $n$ statements, and whichever one is used will help
+ correlate the dataset.
+ Unlinkability ensures that no correlatable data are used in a signed dataset while still providing some level of
+ trust, the sufficiency of which must be determined by each verifier.
+ While canonical sorting works better for unlinkability, canonical labeling can be exploited to break it.
+ The total number of canonical labelings for a dataset with $n$ blank nodes is $n!$, which is not controllable by the
+ issuer.
+ It means that the herd constructed as a result of selective disclosure will be split into $n!$ pieces due to the
+ canonical labeling, which reduces unlinkability.
+ For example, let us assume that an employee of the small company "example.com" shows their employee ID dataset
+ without their name like this:
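+ A sketch of such a disclosure, assuming three blank nodes (the credential, the company, and the employee) and
+ an assumed label assignment:
+
+ # Hypothetical disclosed employee ID dataset (name withheld)
+ _:c14n0 <http://schema.org/about> _:c14n2 .
+ _:c14n0 <http://schema.org/provider> _:c14n1 .
+ _:c14n1 <http://schema.org/name> "example.com" .
+ _:c14n2 <http://schema.org/jobTitle> "Engineer" .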
+ The verifier can always trace this person without knowing their name if this company has only three employees with
+ the following employee ID datasets.
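+ A sketch of the three full employee ID datasets; the label assignments below are assumptions, standing in for
+ the hash-derived labels that differ because the employees' names differ:
+
+ # Employee 1 (assumed labels: credential=c14n0, company=c14n1, person=c14n2)
+ _:c14n0 <http://schema.org/about> _:c14n2 .
+ _:c14n0 <http://schema.org/provider> _:c14n1 .
+ _:c14n1 <http://schema.org/name> "example.com" .
+ _:c14n2 <http://schema.org/jobTitle> "Engineer" .
+ _:c14n2 <http://schema.org/name> "Alice" .
+
+ # Employee 2 (assumed labels: person=c14n0, credential=c14n1, company=c14n2)
+ _:c14n0 <http://schema.org/jobTitle> "Engineer" .
+ _:c14n0 <http://schema.org/name> "Bob" .
+ _:c14n1 <http://schema.org/about> _:c14n0 .
+ _:c14n1 <http://schema.org/provider> _:c14n2 .
+ _:c14n2 <http://schema.org/name> "example.com" .
+
+ # Employee 3 (assumed labels: company=c14n0, person=c14n1, credential=c14n2)
+ _:c14n0 <http://schema.org/name> "example.com" .
+ _:c14n1 <http://schema.org/jobTitle> "Engineer" .
+ _:c14n1 <http://schema.org/name> "Carol" .
+ _:c14n2 <http://schema.org/about> _:c14n1 .
+ _:c14n2 <http://schema.org/provider> _:c14n0 .
+
+ Dropping the name statement from each leaves three structurally identical datasets whose labels nevertheless
+ differ, so the labels alone identify the employee.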
+ The canonicalization in this example produces different labelings for these three employees, which helps anyone to
+ correlate their activities even if they do not reveal their names in the dataset.
+ By determining some "template" for each anonymous set (or herd) and fixing the canonical labeling and canonical order
+ used in the anonymous set, we can achieve a certain unlinkability.
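+ As a sketch, the issuer of the employee IDs above could derive every credential from one fixed template, so
+ that the person, company, and credential nodes always receive the same labels and the statements always appear
+ in the same order; the name-withheld disclosures of all employees then become identical (this template layout
+ is an assumption for illustration):
+
+ # Hypothetical fixed template shared by the whole herd
+ _:c14n0 <http://schema.org/about> _:c14n2 .
+ _:c14n0 <http://schema.org/provider> _:c14n1 .
+ _:c14n1 <http://schema.org/name> "example.com" .
+ _:c14n2 <http://schema.org/jobTitle> "Engineer" .
+ _:c14n2 <http://schema.org/name> "(employee name, withheld on disclosure)" .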
From 5fcf4c2444eb4d5aa53805251771e95b972f23d0 Mon Sep 17 00:00:00 2001
From: Dan Yamamoto
For example, let us assume we have the following dataset to be signed. (Note: this person is fictitious, prepared
only to make this example work.)
- Using , we can obtain the serialized canonical form of the
+ Using , we can obtain the serialized canonical form of the
normalized dataset, where all the blank nodes are serialized using the canonical labels.
From d68877abfc5e927a61ec1c6e95fa068014f1896e Mon Sep 17 00:00:00 2001
From: Dan Yamamoto
- Note that we can use the same approach to guess non-boolean values if the range of possible values is still a
- reasonable (small) size for a guesser to try all the possibilities.
+ Note that we can use the same approach to guess non-boolean values if the range of possible values is still of a
+ reasonably small size for to try all the possibilities.
By making the canonicalization process private, we can prevent a brute-forcing attacker from trying to see the
labeling change by trying multiple possible attribute values.
@@ -2804,7 +2804,7 @@ Possible Leakage via Canonical Labeling
From 28b95e4a35d7f797dfe93ad6fa9e18260bb99cd2 Mon Sep 17 00:00:00 2001
From: Dan Yamamoto
Possible Leakage via Canonical Labeling
---
spec/index.html | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/spec/index.html b/spec/index.html
index 02025dc..e538ceb 100644
--- a/spec/index.html
+++ b/spec/index.html
@@ -2876,7 +2876,7 @@
Unlinkability
- Possible Leakage via Canonical Labeling
a secret random nonce (always undisclosed) into the dataset.
Note that these workarounds force dataset issuers and holders to manage shared secrets securely.
We also note that these workarounds adversely affect the unlinkability described below because canonical labeling now
- varies depending on the secret shared by the issuer and the holder, which will help correlate them.
To avoid this leakage, the dataset issuer can randomly shuffle the normalized statements before signing and issuing
them to the holder, preventing others from guessing undisclosed information from the canonical order.
However, similar to the workarounds mentioned above, this workaround also adversely affects unlinkability. This is
- because there are $n!$ different permutations for shuffling $n$ statements, and whichever one is used will help
+ because there are `n!` different permutations for shuffling `n` statements, and whichever one is used will help
correlate the dataset.
While canonical sorting works better for unlinkability, canonical labeling can be exploited to break it.
- The total number of canonical labelings for a dataset with $n$ blank nodes is $n!$, which is not controllable by the
+ The total number of canonical labelings for a dataset with `n` blank nodes is `n!`, which is not controllable by the
issuer.
- It means that the herd constructed as a result of selective disclosure will be split into $n!$ pieces due to the
+ It means that the herd constructed as a result of selective disclosure will be split into `n!` pieces due to the
canonical labeling, which reduces unlinkability.
For example, let us assume that an employee of the small company "example.com" shows its employee ID dataset without
@@ -2931,7 +2931,7 @@
By determining some "template" for each anonymous set (or herd) and fixing the canonical labeling and canonical order
- used in the anonymous set, we can achieve a certain unlinkability.
+ used in the anonymous set, we can achieve a certain level of unlinkability.

From 6f8cf64a416ab2c928f86e5d2f13f40ef24ce44e Mon Sep 17 00:00:00 2001
From: Dan Yamamoto
However, in this example, anyone can guess the unrevealed statement by exploiting the canonical identifiers and order.
Since the dataset was sorted in the canonical order, we can tell that the hidden statement must start with
- `_:c14n1 <http://schema.org/[f-g]`, which helps us guess that the hidden predicate is, with high probability,
+ `_:c14n1 <http://schema.org/[f-g]`, which helps us guess with high probability that the hidden predicate is
`<http://schema.org/gender>` . Alternatively, we can assume that the guesser already has
such knowledge via the public credential schema.
@@ -2796,7 +2796,7 @@
Note that we can use the same approach to guess non-boolean values if the range of possible values is still of a
- reasonably small size for to try all the possibilities.
+ reasonably small size, allowing us to try all possibilities.
By making the canonicalization process private, we can prevent a brute-forcing attacker from trying to see the
labeling change by trying multiple possible attribute values.
From ae75903cccb3b7b0b1609dc39c9647eeafd319ee Mon Sep 17 00:00:00 2001
From: Dan Yamamoto
Since the dataset was sorted in the canonical order, we can tell that the hidden statement must start with
`_:c14n1 <http://schema.org/[f-g]`, which helps us guess with high probability that the hidden predicate is
- `<http://schema.org/gender>` . Alternatively, we can assume that the guesser already has
+ `<http://schema.org/gender>`. Alternatively, we can assume that the guesser already has
such knowledge via the public credential schema.
Then, if the canonical labeling produces different results depending on the gender value, we can use it to deduce the
Possible Leakage via Canonical Labeling