credentialSchema and Selective Disclosure #890
Note: in ISO 18013-5, the meaning of "mandatory" in the data structure is that the mDL app must have that data element in its storage area (i.e., issuers are required to include that data element when issuing an mDL). It does not apply to the data element request/response with a verifier.
Your question is probably sound, but the example is not.
On Tue, Jul 5, 2022 at 8:05 AM David Chadwick wrote:
There are many different ways of implementing selective disclosure. Some
send the whole credential with blinded property names and values, others
send atomic credentials, others send assertions and proofs that the
assertions are correct etc.
If the verifier receives a selectively disclosed credential which has a
credentialSchema property in it, in which some properties are said to be
mandatory and some are optional (e.g. the ISO mDL specifies 11 mandatory
attributes and 22 optional ones) but the verifier only requests a subset of
these properties, and not all the mandatory ones (e.g. asking for date of
birth from a driving license), then how should the credentialSchema
property be utilised by the verifier, given that the received credential
clearly does not match the credentialSchema as it is missing some mandatory
attributes?
I think we need to add some clarifying text to the data model to address
this issue, because currently the DM states "data schemas that provide
verifiers <https://www.w3.org/TR/vc-data-model/#dfn-verifier> with enough
information to determine if the provided data conforms to the provided
schema."
As I understand mDL, it comprises a set of random numbers for both property names and values (from the perspective of the verifier), so it is not possible to determine what it contains. The question therefore remains: if the verifier only has a subset of properties revealed to it, what are the rules (and the description we should insert into the DM) for how the verifier determines whether the presented VC matches the credentialSchema?
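For readers unfamiliar with this style of scheme: in mDL-style selective disclosure, the "random numbers" the verifier sees are salted digests of the claims, so nothing undisclosed can be recovered from them. A rough Python sketch (invented names and structure, not the actual ISO 18013-5 encoding) of why this is so:

```python
import hashlib
import json
import secrets

def blind(name, value):
    """Commit to one claim: digest over (random salt, claim name, claim value)."""
    salt = secrets.token_hex(16)
    digest = hashlib.sha256(json.dumps([salt, name, value]).encode()).hexdigest()
    # The digest goes into the signed credential; the (salt, name, value)
    # triple is revealed only if the holder chooses to disclose this claim.
    return digest, (salt, name, value)

# Issuer blinds every claim; the signed credential carries only digests,
# which look like random numbers to the verifier.
claims = {"family_name": "Doe", "birth_date": "1990-01-01", "points": 3}
blinded = {name: blind(name, value) for name, value in claims.items()}
signed_digests = {digest for digest, _ in blinded.values()}

# Holder discloses only birth_date; the verifier recomputes that one digest
# and checks it is among the signed ones. The other claims stay opaque.
salt, name, value = blinded["birth_date"][1]
recomputed = hashlib.sha256(json.dumps([salt, name, value]).encode()).hexdigest()
assert recomputed in signed_digests
```

Because each digest carries a fresh salt, the verifier cannot even brute-force small value spaces for the undisclosed claims, which is the crux of the question above: the credentialSchema cannot be checked against what is cryptographically invisible.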
At the meeting on 31 Aug, the discussion indicated that another property, "presentationSchema", would be more valuable for telling the verifier which properties must and may be in the selectively disclosed credential. The verification process would then tell the verifier whether the presented credential conformed to the presentationSchema, and the verifier could determine whether this was valid for its business use case; e.g., a verifier can decide to accept a presented credential that failed verification, or to reject a credential that passed verification.
The issue was discussed in a meeting on 2022-08-31
View the transcript: 4.4. credentialSchema and Selective Disclosure (issue vc-data-model#890). See github issue vc-data-model#890. Brent Zundel: David, can you walk us through this? David Chadwick: thanks for reminding me... there are various ways in which selective disclosure can be done...
Logan Porter: you want to validate as much as you can... I don't think there should be any requirement to check the schema. David Chadwick: example, I want to see bank account details from a university degree credential... assuming I got an answer... what should the RP do? Logan Porter: seems the credential would be contradicting the schema. Manu Sporny: several layers... the first one is when the issuer issues the credential: are they stating mandatory and optional? Ted Thibodeau Jr.: I think that those mandatory fields are mandatory for issuance, not presentations.
Samuel Smith: we solve this in ACDC with composable schema... the issuer can create a schema in such a way that the holder can present valid combinations... using anyOf and oneOf. David Chadwick: what I understand is that maybe we need a separate field for presentationSchema... instead of credentialSchema.
Manu Sporny: I am concerned that we know of a few selective disclosure schemes that define that schema at the cryptographic layer... and that's where it belongs, because you want to enforce it there.
Manu Sporny: it looks like folks are putting these in the crypto layer, and that we don't need presentationSchema then. David Chadwick: if we have advanced crypto, maybe that works, but if we are using vanilla crypto, it might be more valuable.
Brent Zundel: this seems like the verifier's presentation definition... which is the schema the verifier is requiring. David Chadwick: this would be set by the issuer, not the verifier.
Logan Porter: I think there is a danger of having the issuer control the presentation.
Logan Porter: I think it's dangerous to have the issuer mandate presented fields.
See github issue vc-data-model#839. Oliver Terbu: question, is this issue related? David Chadwick: I don't think they are exactly the same... The verifier is in control of verification... and then applying business rules.
David Chadwick: the verifier can ignore the verification if it wants to. Ted Thibodeau Jr.: I don't understand.
Note that presentationSchema for VPs is different from presentationSchema for VCs. The former (issue 839) indicates what the VP does contain, whereas the latter (this issue) says what the selectively disclosed credential must or may contain.
Here is a suggestion as a resolution of this issue. Note, I am only addressing credential verification and not credential validation, as the verifier can determine its own rules as to whether a (un)verified credential is valid or not.
However, I am not convinced that a discloseSchema is required, provided that presented selectively disclosed credentials must always contain metadata properties such as ToU, Evidence, ExpirationDate etc., because when requesting someone's age from a driving license, the number of points is irrelevant.
First thought, I would suggest […]. Then, it seems that […]. Next, I don't think I understand what you mean by "a list of credential properties that must always be present in presented credentials (whether selectively disclosed or not)". Does this mean that those credential properties will be included/revealed/disclosed whether or not the Holder selects them for disclosure? That would seem to fly in the face of selective disclosure, unless the Holder is at least alerted to the fact before they disclose things they've not selected. Further, you say "the discloseSchema is used to ensure that all the properties that the issuer says must be presented have been presented (e.g. points on a driving license, or TermsOfUse)", and I disagree strongly with the idea that when I (selectively) present my driving license as proof of age, I must also present the violation points I've been assigned thereon, as those points are entirely irrelevant to this presentation of the license, as you yourself accede in your final paragraph. Bottom line, I think this suggestion/idea needs significant refinement before it can be considered viable.
After more thought, I do not think a discloseSchema is needed. Rather, I think the following clarification of credentialSchema is needed for verifiers: whilst the credentialSchema property may be used to ensure that an issued credential is well formed, a verifier may only use it to determine that all the presented subject properties in a selectively disclosed credential are allowed to be there (e.g. a university degree credential does not contain bank account details). Any MUST-be-present schema directives are irrelevant to a selectively disclosed verifiable credential and MUST be ignored by the verifier. I think this should be added as a second NOTE under the current one in clause 5.4.
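The proposed verifier rule could be reduced to a very small check on the verifier's side. This is a toy sketch, not a real JSON Schema processor, and the helper name is invented:

```python
# Verifier-side rule sketched above: presented subject properties must be
# *allowed* by the credentialSchema; any "required"/MUST-be-present
# directives are deliberately ignored for selectively disclosed credentials.
def allowed_by_schema(disclosed: dict, schema: dict) -> bool:
    allowed = set(schema.get("properties", {}))
    return set(disclosed) <= allowed

degree_schema = {
    "properties": {"name": {"type": "string"}, "degree": {"type": "string"}},
    "required": ["name", "degree"],  # irrelevant to a selective disclosure
}

assert allowed_by_schema({"degree": "BSc"}, degree_schema)         # subset: fine
assert not allowed_by_schema({"iban": "GB00XXXX"}, degree_schema)  # bank details: rejected
```

The point of the sketch is only that "allowed to be there" is a subset check, whereas "must be there" is the part that cannot be evaluated against a partial disclosure.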
+1. Different use-cases require different sets of claims, which is why selective disclosure is important in the Issuer-Holder-Verifier model. The Issuer cannot predict all those use-cases, and I do not understand why the Issuer would instruct the Holder to always release certain claims.
The Verifier uses a schema that includes the subset of the claims that it can (legally, trust-framework-wise, etc.) request and receive from the Holder. I do not see the need for a separate schema nor a new property either.
Yes, claim values are hashed, but the "mandatory claims" will 100% be included in those hashes, just like Andrew said. 18013-5's "mandatory" does not mean that those claims always have to be presented to the Verifier; that would be pretty privacy-invasive.
+1 (if we ever add this property)
You want verifiers to test whether they're allowed to receive the information that has been presented to them, and then ignore the stuff that they're not allowed to know? This is a broken process. The cat is already out of the bag. If anything, the presenter must be prevented from including bank details in an academic credential, but I'm not sure even this is viably or generally implementable.
@TallTed What is the purpose of the credentialSchema property? |
A fine question. I didn't introduce the credentialSchema property. If its purpose is not clear in current documents, then some research would seem to be in order, to see what purpose it was intended to serve. You've suggested that restricted information (e.g., banking information) from a VC may be included in a selective disclosure VP, and that verifiers should check to make sure they have not received any such restricted info. THIS IS NOT VIABLE. I think it's really no different from presenting a non-selective-disclosure VP and telling verifiers they must discard some fields from it, which I hope you'll agree is equally nonsensical.
In my opinion, credentialSchema is there to check that the credential is well formed. JSON schemas say which properties must or may be present in the credential and what their syntaxes are, so a parser can differentiate between integers, strings, URLs, images etc., and know that a credential is wrong if a mandatory property is missing (such as the type).
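For instance, a minimal JSON Schema along these lines (illustrative only; the property names are invented) lets a parser check both the syntaxes and the presence of mandatory properties:

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "type": { "type": "array", "items": { "type": "string" } },
    "name": { "type": "string" },
    "portrait": { "type": "string", "format": "uri" },
    "points": { "type": "integer" }
  },
  "required": ["type", "name"]
}
```

Here "type" and "name" must be present and everything else may be, which is exactly the mandatory/optional split being discussed in this issue.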
I've been thinking about #895 and about what specifically would prevent ACDC from complying with VCDM. As @SmithSamuelM mentions in the meeting summarised here, ACDC uses composable JSON Schema, so implementing ACDC to conform with the VCDM would use […]. I also thought […]
It's difficult to decide where to start in voicing the security concerns of a json-ld document with an […]
This is a non-normative mechanism for protecting […]
Composition and local (non-network-location) schema identifiers are two normative properties of JSON Schema that are vital to ACDC and that schema.org does not share. Local immutable JSON Schema are essential to schema integrity. There is no way to lock down @context in a normative way. In contrast, local JSON Schema is a normative use case: it's widely used that way, and the JSON Schema spec is very clear that a JSON Schema identifier is nominally not a network location, even when expressed as a URL. A schema packager may optionally use network locations, but that is up to the packager. Local JSON Schema can be locked down easily by including a hash in the local schema identifier. This is how ACDC uses JSON Schema. But then there is no reason to ever use @context in a normative way. To elaborate, JSON-LD does not normatively recognize any other schema besides schema.org. Of course we can make an exception and make it normative for W3C VCs, but we are doing pretty invasive surgery on json-ld when we do that.
It is entirely nonsensical to talk about authenticity in the context of data transmitted over the internet in any terms other than cryptographically verifiable attribution to some digital identifier. To my knowledge, the only practical cryptographic mechanisms for securely attributing data to a digital identifier require serialization of that data, to which a verifiable cryptographic commitment is made. Any tampering with that data will break the verifiability of that commitment, so authenticity assumes integrity as a hard constraint. This means that any in-place dynamism, which is indistinguishable from tampering, cannot be allowed within the scope of a verifiable commitment. Extensibility can be had, but only by chaining, appending, or otherwise adding on to previous commitments, not by in-place extension.

Now it gets more complicated. The commitments to VCs must allow for multiple sources with different loci of control and different cryptographic artifacts of verifiability. So even though one can expand multiple VCs expressed as JSON-LD into a single RDF graph and create an "integrity proof" on all or part of that expansion, recreating that proof assumes one source: the entity that made the expansion. Unless you segment your RDF graph into a set of graphs, one for each source, the artifacts of the original source commitments must be kept around in order to verify authenticity, not merely integrity, which defeats the gains of having the complex RDF integrity proofs in the first place.

There is a subtle sleight of hand involved here. Verifiable authenticity of data in motion is not the same as verifiable authenticity of data at rest. One can have an authentic communications channel where the data in motion has been verified as authentic prior to storing it in a local database. But the holder of that database cannot prove to a downstream user that the data in the database is authentic to the source, unless the authenticity mechanism applies to the data at rest.

This means that merely proving data integrity of the data at rest is not tantamount to proving authenticity to the original source of the data now at rest. What that all means is that we should start with immutable data objects, including immutable schema, to which we can attach proof of authenticity at rest, and build from there.

The easiest interoperability path I see is to use ACDCs as an authenticity layer that conveys an opaque payload (opaque to the authenticity layer). That payload may very well be JSON-LD, but only an immutable expression of a JSON-LD document. Any dynamic in-place expansion breaks strict authenticity-at-rest. A common approach to protocol layering is to add an authorization sublayer to the authentication layer. This authorization sublayer would satisfy the majority of VC use cases where the VC is truly a "credential", i.e. evidence of an entitlement. Authorization is nonsensical without authentication, hence why it's a sublayer. In the authorization case, the ACDC must expose the type of authorization. Forensic (enforcement) information could be opaque to the verifiability of the type of authorization and could therefore be relegated to the payload.

The authentication layer and authorization sublayers do not benefit from an open world model, or do not benefit enough to justify the complexity of an open world model. The artifacts of the auth layer can be kept around by the application layer, which can add them to an open world model. But the open world is necessarily opaque to the auth layer. The dynamic open world data model should not be pushed down the stack, because it then makes security very, very difficult. And now we have come full circle. TLDR
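The "authenticity assumes integrity" point can be seen in a few lines of stdlib Python, using an HMAC over a fixed serialization as a stand-in for a real digital signature (a sketch under that substitution, not a production mechanism):

```python
import hashlib
import hmac
import json

KEY = b"issuer-signing-key"  # stand-in for the issuer's private signing key

def commit(obj) -> bytes:
    # The commitment binds one exact serialization of the data.
    serialized = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(KEY, serialized, hashlib.sha256).digest()

def verify(obj, tag) -> bool:
    return hmac.compare_digest(commit(obj), tag)

credential = {"issuer": "did:example:123", "degree": "BSc"}
tag = commit(credential)
assert verify(credential, tag)

# Any in-place "dynamism" is indistinguishable from tampering:
credential["degree"] = "PhD"
assert not verify(credential, tag)
```

Extending committed data therefore has to happen by appending a new commitment alongside the old one, not by editing the committed bytes in place, which is the append-to-extend approach described above.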
I want to add one other comment which I think is relevant.
Compared to number 1, tooling that supports number 2 is relatively experimental, not well proven, and much more difficult, complicated, harder to adopt, and risky. Technologically, we are infants when it comes to verifiable algorithms for data transformations. We have had 30 years to figure out how to make brute-force breaking of ECC digital signatures and hashes computationally infeasible; these two are all the crypto we need for number 1. It's easy to hand-wave number 2 merely because it sounds cool, but it's not cool if it's risky and hard to adopt. RDF integrity proofs are more like number 2 than number 1: they are relatively new and therefore risky on a cryptographic time scale, and, as I explained above, they don't buy us much, because it's authenticity at rest we care about, not merely integrity at rest.

Most of the VC use cases are more compatible with number 1: an authorization sublayer to an authentication layer that merely depends on digital signatures for verifiable authenticity. And instead of verifiable algorithms to prove the provenance of transformations, we just build a verifiable data structure made up of the results of the transformations, appended in a chain or tree. This works today: no muss, no fuss, and no fancy mechanisms. This seems like the practical path forward, or at the very least the only reasonable starting point.

We have been calling this approach of using append-to-extend verifiable data structures the Authentic Web, because, in our opinion, the primary reason the internet is broken is not that we can't interoperate around the semantics of the data, but that we can't trust the provenance of that data in the first place. So let's decide on an authenticity mechanism, i.e. a trust spanning layer for the internet by which we can establish the authenticity of data. Given that authenticity layer as conveyor, we can convey whatever other facts we want that are opaque to that layer. This makes the authenticity layer relatively simple, and we solve the provenance problem without complicating it with all the other things one wants to do with the conveyed data once its authenticity has been established.
Does this mean one cannot simply "convert" (and by convert I mean map to a different data model) a "Verifiable Credential" to an "AnonCred" and vice versa, but must perform some one-way function to do so?
@SmithSamuelM Please edit your #890 (comment) and put code fences (single backticks) around […]
@David-Chadwick to review PR on issue #934 in the light of this requirement and propose a concrete text in schemas advanced concepts section if any. once those are done, will mark pending-close |
@SmithSamuelM -- Please return once more to #890 (comment) and edit it to wrap each and every instance of […]
The issue was discussed in a meeting on 2022-11-09
View the transcript: 2.4. credentialSchema and Selective Disclosure (issue vc-data-model#890). See github issue vc-data-model#890. Kristina Yasuda: #890, credentialSchema and selective disclosure. We discussed if this can be incorporated in the VP. Is that a fair statement? David Chadwick: my thoughts were, if the credentialSchema property is in the credential that is in the VC, then it will say what the mandatory/optional properties are. Depending on the disclosure model, for example merging multiple atomic credentials, or like SD-JWT, or the credential only contains properties the verifier wants, my feeling is that if the schema is copied into each of these, the verifier would not be able to match the schema, as the schema say[CUT]. Kristina Yasuda: we should tackle selective disclosure first; my suggestion is adding a sentence to the spec saying "depending on the selective disclosure mechanism used, a schema may or may not be valid". Gabe Cohen: there is also utility in knowing what a set of credentials may or may not contain, including authorship information, like state or DMV etc.; will make sure that there is language in schema to avoid the issues you described. Logan Porter: I think putting it here is the wrong layer to talk about selective disclosure; in the case of selective disclosure, what isn't issued shouldn't matter. Manu Sporny: +1, I think this is at the wrong layer; we have multiple different schemes that are being disclosed; typically they have "must"/"can" disclose terms, which is usually at the cryptographic level. David: The credentialSchema gives the full set of all the properties that should be present, so all the verifier can do, regardless of the disclosure mechanism, is tell if there are additional properties that are not in the schema. Kristina Yasuda: we have Gabe defining schema in the Data Model, and you are doing it for the presentation; what actionable item are you requesting? David: Between us we have enough to move forward.
Logan Porter: Disagree with the complexity credentialSchema brings; this feels like a business logic decision. Whether you think it is well formed is a function of trusting the issuer; having a second schema talking about how it is disclosed/used is extra complexity.
Ted Thibodeau Jr.: I'm finding it difficult to frame an argument against it because it makes no sense to me. The suggestion is that I'll take a VC issued to me, and selectively disclose attributes, and schema attributes, to a verifier; if I don't want to disclose a field, I don't disclose it. Mahmoud Alkhraishi: What David might be saying is that you can't add a field when you're doing a selective disclosure: the original credential had A, B and C; if the verifier receives A, B and D, that was not in the original schema.
Kristina Yasuda: anyone opposed to David giving concrete proposals, while noting there may be confusion, and that if concerns are not incorporated the PR will not be merged? Manu Sporny: I'm with Ted; the logic doesn't line up. If you have a credential that has A, B and C, and you can selectively disclose the A, B and C combination, how can you disclose D? Mahmoud Alkhraishi: +1, I have a similar issue. David: Let's say we have a schema that has A, B and C, and the issuer issues A, B, C and D; then if verified, the signature will pass, but the schema itself will fail the verification logic. Logan Porter: I think this shouldn't be at the schema level; I find it a strange place to draw a line.
David: this is only to do with the VC credentialSchema; we have to address what the point of the property is: is it to check if the credential is well formed or not? It should be a conformance property, not an advisory property. Oliver Terbu: If the issuer issues A, B, C and D, then the schema can say the SD has to have A, B, C or A, B, D to be a valid credential. Shawn Butterfield: trying to think through this from a real-world point of view, what you talked about is composition constraints. If you just have a schema with no required properties, once processed with a JSON Schema processor it will ignore the things that are there that are not in your schema. If we have regex patterns or composition operators, then the JSON Schema validator will spit out useful info about properties that shouldn't be there, there is mo[CUT] making it valuable. Kristina Yasuda: nothing in the data model mandates usage of the schema. When you suggest schema, make sure your feedback is incorporated; please make a concrete suggestion on the advanced schema part.
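The composition idea raised here (Oliver Terbu's A/B/C-or-A/B/D example, and the anyOf/oneOf composition Samuel Smith describes earlier in the thread) can be written directly in JSON Schema. A hedged illustration, not text from the discussion:

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "A": { "type": "string" },
    "B": { "type": "string" },
    "C": { "type": "string" },
    "D": { "type": "string" }
  },
  "oneOf": [
    { "required": ["A", "B", "C"] },
    { "required": ["A", "B", "D"] }
  ]
}
```

Note that a disclosure containing all of A, B, C and D would satisfy both branches and therefore fail oneOf; anyOf would be the right operator if overlapping combinations should also be accepted.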
Seems blocked by the working group having not accepted a solution that supports selective disclosure. I suggest closing until such an item exists, or leaving open until this item can be addressed. It's like "potentially compatible with https://github.com/w3c-ccg/ldp-bbs2020/"... but you'll never know until we have a formal work item. I could write text that would answer this regarding BBS LDPs... but it's not clear where we would put that text.
Related to #999 and other issues |
Along with @OR13, I don't see an answer that can be real here until we have a formal work item to consider.
One way to use AnonCreds with ACDC's append-to-extend is to treat an AnonCred as a ZKP corroboration of the claims in an ACDC created at presentation time by the presenter. The presenter composes an ACDC with the claims the presenter wants to disclose to the verifier. These are issued by the presenter. (In ACDC, every disclosure is via an ACDC, not some custom presentation exchange data format.) The presenter can then attach an AnonCred disclosure as corroboration that some issuer also made the same claims to some link secret under the control of the presenter. In ACDC, proofs (signatures) are attached. This makes it easy to do multi-sig proofs, endorsements, and ZKP corroborations without transforming or converting. An attached corroboration is the correct way to use AnonCreds IMHO: not as a VC itself, but as a ZKP in support of the claims in the VC. I think this is truer to the spirit of how @dhh1128 (Daniel Hardman) describes how AnonCreds ZKPs are meant to be used.
Agreed at F2F that PR will create a second property, the verifier's schema, which tells the verifier what the schema of the disclosed VC will be |
The issue was discussed in a meeting on 2022-09-15
View the transcript: 5.9. credentialSchema and Selective Disclosure (issue vc-data-model#890). See github issue vc-data-model#890. Orie Steele: A problem occurs if, as an issuer, I commit to a credential schema then present to you a selectively disclosed credential that doesn't match the schema that I committed to. David Chadwick: This can be solved by making every property optional, because then everything conforms to the schema. Brent Zundel: This sounds like potentially a better fit for the Implementer's Guide. Oliver Terbu: There's another possible solution using JSON schemas. David Waite: There may be different requirements for the full-blown credential and the credential you present to people. Joe Andrieu: What you just said hurt my brain in a good way. Orie Steele: The issuer can express its intention for mandatory-to-disclose fields and protect them. Samuel Smith: In ACDC we use the combination operations of JSON Schema, including nesting. David Chadwick: in the evidence work we've done, the evidence contains the name & address of the subject.
Brent Zundel: This is happening with a different mental model of what the issuer needs to do.
Samuel Smith: You can do both together. That's how we do it. Brent Zundel: What's actually happening is the intersection.
Kristina Yasuda: I think both are needed.
Kristina Yasuda: What is in scope for the data model document is the schema for the issuer.
Kristina Yasuda: We should try to describe the aspects that people need to consider when using selective disclosure. Steve McCown: They may need to see that I live in the state. They don't need to know my organ donor status. Brent Zundel: Sometimes the verifier won't use the schema because it doesn't care. Joe Andrieu: I think we're talking about four different schemas at least. Manu Sporny: I'm concerned that we're waving our hands around this.
Manu Sporny: Just like we added things to VC 1.0 that we're now taking out because we haven't figured out how to use them.
Brent Zundel: I disagree that this isn't in protocols and implementations.
Orie Steele: I disagree that we shouldn't provide guidance.
Orie Steele: Even if the guidance is just a heads-up that these things are happening in the wild, it's useful to say that to people. Michael Jones: Brent, you talked about people using presentation exchange in protocols; that's certainly true. I will put on my charter hat and say we intentionally put specification of protocols out of scope. I'm fine saying "this is how things are done in some protocols" without picking winners. Brent Zundel: Yes, I was speaking to more of the how in the data model, not from a protocol perspective. Phillip Long: Are we saying that the holder's response to the verifier can choose to add or omit claims? Orie Steele: yes.
Manu Sporny: I wanted to underscore the points Mike Jones made.
Manu Sporny: It's fine to talk about data elements used in protocols. David Chadwick: The current credential schema addresses Joe's #1.
Manu Sporny: Is the "disclosable" property being used today? David Chadwick: No - it doesn't exist today. Orie Steele: This would make it unambiguous. Joe Andrieu: If we don't have implementations, we'll have to cut the work anyway. David Chadwick: We have to specify things for people to implement them.
Brent Zundel: Assigned to Orie.
I have been looking at the editorial changes that are needed in order to enact the decision made at the F2F meeting, and it is hard. This is because the current credentialSchema is used for two different purposes: […]
I don't like any of those options : ( It feels like this should be put on hold until we make progress on crypto suites. I suspect a concrete example with BBS or SD-JWT will make this obvious in retrospec
Just wanted to say that I read Orie's post and had an unexpected smile. I am the most prolific source of typos that I know, but mine are never as clever and fun as this. I am intrigued by the idea of "retrospec." Since hindsight is 20/20, let's write one of those, perhaps harking back to the '80s or even the '70s. No flowers and bell bottoms, though. ;-)
I take it for granite that RDF has a retro vibe. I can do this all day.