Add normative requirements regarding media type and proof #1014
@msporny there is some odd lint fencing around these references... For example:
... not sure how best to leverage respec, but I figured it would be good to start simple, and try to rely on the existing definitions. |
Yes, we can remove the lint fencing now... I think I had to do it when we were doing FPWD and moving VC-JWT out of the spec (we ended up losing all references to the "external proof" definition). I expected that would be temporary, so left the dfn in, but quieted the respec errors. Now that you're referencing
+1 to that approach. |
The issue was discussed in a meeting on 2023-01-25
4.5. Add normative requirements regarding media type and proof (pr vc-data-model#1014)
See github pull request vc-data-model#1014.
Manu Sporny: normative requirements for media type and proof, follow on to PR from last week.
Orie Steele: believe that intention has been made clear in comments, good exchange on intention of PR.
Manu Sporny: moving on to data integrity PRs.
I'm trying to figure out if we agree on this PR or not. There seems to be some kind of inversion of "what is a subset of what?" going on here... and I can see both mental models. We'd have to align on those mental models to make sure we're not talking past each other in order to merge this PR. At present, I think what I'm reading in the commentary is:
|
Consider that some folks will protect the media type by itself... And ignore any potential embedded proof when signed by JOSE or COSE.... I don't think we can assume that folks will care about data integrity proofs when using the media type Perhaps folks are required to care about data integrity proofs for The goal is to communicate requirements clearly and unambiguously. If the proof is present, ignoring it is either safe, or unsafe, for each media type. If proof can be present in |
I think that's fine but the "it must be safe to ignore" part should be defined in specific applications or in appropriate specs that use the media type, not in the media type itself. |
I think the only way you can say something is "safe" regarding media types is if you trust the party that is sending you the payload that is tagged with that media type. There's no amount of "you can just trust this because someone tagged it with |
The issue was discussed in a meeting on 2023-02-01
3.3. Add normative requirements regarding media type and proof (pr vc-data-model#1014)
See github pull request vc-data-model#1014.
Manu Sporny: PR 1014 about media types. We're trying to figure out details regarding the media types, how many we're going to have, etc. This PR is about a very specific media type, and about whether you can include a proof in it, if it can be ignored, etc. Please jump in if you have opinions on media types.
This seems like the right approach to me, or something like We want to prevent confusion, not introduce more |
approved, with a slight preference to adjust language to the following:
This media type MUST NOT be used to describe a verifiable credential with an <dfn>embedded proof</dfn>.
@mprorock I would be happy to take that suggestion, if it is in a mergeable format.
"In first-order logic with equality, the axiom of extensionality states that two sets which contain the same elements are the same set." Is |
I'll note that I've been presuming that this has been in everyone's mental model so far, but just speaking it out loud in case that it wasn't. Just because you're using an external proof doesn't mean it's always going to be a JWT... and just because you're using an embedded proof doesn't mean it's always going to be a Data Integrity Proof. Given the current v2.0 work and this PR, I believe the answer to @OR13's question boils down to what's below:
Option P1: If the VCWG decides that they DO want an
Option P2: If the VCWG decides that they DO NOT want an
Option P3: If the VCWG decides that they DO want an
... and in all cases, we should expect a subset of the ecosystem to get the media type wrong and have backup processing rules that ensure that proof checking, when necessary, does the right thing (and is resilient when the media type is wrong).
I don't think there is a desire to register a media type for data integrity at this point in time. It's an embedded proof format and is expected to ride along w/ existing media types without needing entirely new additions to the IANA media type tree. We /might/ decide to register a media type suffix for it after a few years of implementation experience to see if people would find the media type suffix useful... but again, we expect that sort of thing to cause more problems (media type proliferation) than actually result in more secure systems. IOW, DI doesn't really need anything beyond |
I agree with this. Thanks for the options breakdown. P1 & P2 seem less desirable after considering @dlongley's points. P3 directly contradicts what he has been saying. Your proposals really boil down to the question "Does Based on the current spec and @dlongley's argument, which I am coming round to... I don't think it does. I think I am going to restate your P3 so it's easier to compare to my P4:
Compared to P4:
This reveals a contradiction with
^ this seems not desirable. If we wanted to communicate that an It is true that Another consideration is the layering of media types.
To me, this implies the following:
Perhaps Stated as P5:
Note this line from the spec: https://w3c.github.io/vc-data-model/#proofs-signatures P5 captures this requirement by explicitly acknowledging that a document with an Stated in terms of JSON Web Token Protected Header: // one or more proof use case, external proof mandatory, embedded proof optional.
{
"typ": "verifiable-credential+ld+jwt",
"cty": "verifiable-credential+data-integrity", // `proof` required.
// OR
"cty": "credential+ld+json" // `proof` allowed but not required.
} |
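The two alternatives in the header above can be written out as separate, well-formed JSON objects. The following is a minimal sketch only: the media type strings are the unregistered values proposed in this thread, the helper names `protected_header` and `decode_header` are hypothetical, and a real JWS protected header would also carry an `alg` member, which is omitted here.

```python
import base64
import json


def protected_header(typ: str, cty: str) -> str:
    """Serialize a header with typ/cty and base64url-encode it without padding."""
    header = {"typ": typ, "cty": cty}
    raw = json.dumps(header, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_header(segment: str) -> dict:
    """Reverse protected_header: restore padding, decode, parse JSON."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


# Alternative 1 (from the comment above): payload carries an embedded `proof`.
with_proof = protected_header(
    "verifiable-credential+ld+jwt", "verifiable-credential+data-integrity"
)

# Alternative 2: `proof` allowed in the payload but not required.
proof_optional = protected_header(
    "verifiable-credential+ld+jwt", "credential+ld+json"
)

print(decode_header(with_proof))
print(decode_header(proof_optional))
```

Keeping the alternatives as two distinct headers avoids the duplicate `cty` key, which a JSON parser would otherwise resolve silently.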
@OR13 --
Incorrect.
If it were in play,
Not according to the rules of media types. Also, |
credential+ld+json also does not exist and multiple suffixes is not yet an RFC. Your argument that Since neither is defined formally yet, this would seem to be up to the working group. |
As a media type, This is not a question of how the WG defines anything. It's a question of the way that IANA-registered media types work. We've been discussing I am not aware of any discussion of |
strongly agree with this comment.
In addition... The usual rule for JWT processing is to ignore claims you do not understand. I do not believe it is safe for the same rule to apply to a `proof` JWT claim. For example, if a JWT is secured using a `proof` property, but the system does not understand it, ignores it, and decides to accept the credential anyway, that would not be a good idea.
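The concern above can be illustrated with a small sketch. It assumes a hypothetical verifier policy in which unknown claims are dropped as usual, but an unrecognized `proof` claim causes rejection instead of being silently ignored. The names `SECURITY_RELEVANT` and `filter_claims` are invented for this example and do not come from any specification.

```python
# Assumption for this sketch: claims a verifier must never silently skip.
SECURITY_RELEVANT = {"proof"}


def filter_claims(claims: dict, understood: set) -> dict:
    """Keep the claims the verifier understands.

    Unknown claims are ignored, except security-relevant ones,
    which cause the whole token to be rejected.
    """
    for name in claims:
        if name in SECURITY_RELEVANT and name not in understood:
            raise ValueError(f"unsupported security-relevant claim: {name}")
    return {k: v for k, v in claims.items() if k in understood}


# An unknown, harmless claim is dropped.
print(filter_claims({"iss": "did:example:123", "foo": 1}, {"iss"}))

# An unknown `proof` claim is rejected rather than ignored.
try:
    filter_claims({"iss": "did:example:123", "proof": {}}, {"iss"})
except ValueError as err:
    print("rejected:", err)
```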
From the F2F today, it appeared to me that we are fundamentally talking about a single underlying data object with several different integrity mechanisms. That feels like the same media-type with parameters.
All of these things represent that same underlying type of object: a credential as defined to be verifiable per VCDM. This is in contrast to media types that vary
Seeing these different mediatypes, no one would have any reason to know they are representing the same underlying object type. Parameters seem like the right option. |
Embedded proofs and external proofs are syntactically different. One possibility is for us to add guidance to the use of objects identified with the |
I would slightly tweak what @jandrieu suggested in #1014 (comment) Given that
Then I propose to only register the following
Additionally, the VCDM focuses on VCs throughout the spec (in contrast to only |
Example of friends discussing if media type parameters are a good idea. https://github.com/ietf-scitt/draft-birkholz-scitt-architecture/pull/44/files#r1105975998 |
From a newcomer's perspective, I want to see a single media type that conveys a document that meets this specification (and no more or no less). That is something like "w3c-vc+..." or "w3c-vc-2.0+..." and make it clear that it describes a document conformant with this specification (as I read it does not need to contain a proof but may contain one). I've no strong preference on adding other media types that further restrict the data model to contain or not contain the proof but I realize that this can make things easier or harder. |
The issue was discussed in a meeting on 2023-02-14
2. Content Types.
See github pull request vc-data-model#1014. See github pull request vc-jwt#51.
Orie Steele: We're going to be talking about content types and media types..
Orie Steele: They typically register because they want to distinguish content... there is a registry, it has useful entries, large number of existing registry entries..
Orie Steele: media types / content types -- looking at a few specifications, you should read them... in JSON Web Signatures, content type header parameter, in JWS, header and payload, one of the header parameters can be content type... secure CSV or JSON... declare cty using media type.
Michael Jones: I'm excited about application/json because it does a lot of heavy lifting every day.
Orie Steele: one of the other places you see media types show up is in APIs... developers and consumers tend to think more about APIs... APIs are a great place to see real value wrt. media types and content types.
Mahmoud Alkhraishi: What's the timeline on the proposal?
Orie Steele: This is in VC Data Model 2.0 -- has to go to CR first, that's timeline. It's up to us when it goes somewhere...
Ivan Herman: Formally speaking, W3C sends the request for media type when document is in CR, not before, right now nobody knows about this media type and won't know until we get to CR... if I'm still staff contact, this is the official process, I send it.
Michael Jones: One thing to add, it's good to send this when we're at CR because we can still make changes if IANA says we did something wrong. What we would be asking for at CR is a provisional recommendation, it does appear and it appears as "temporary"... once we have a REC, at that point we send another request to IANA to update registration from provisional to permanent..
Ted Thibodeau Jr.: One more wrinkle, because it has two plus signs at the right hand side, there is an RFC that is going through IETF, but indeterminate future, if it gets accepted then this media type is accepted, otherwise it gets rejected.
Manu Sporny: rfc for multiple pluses is in good shape and getting ready for last call.
Orie Steele: Here we have our request, one of the questions that we've been asking is whether proof is a valid/expected member of this content type..
Orie Steele: on to application/credential-claims-set-1.1+json ... maybe we want .v1+json -- those are questions, this one is requested to be registered in a different technical recommendation in our group..
Orie Steele: I'm referring to the credential-claims-set-1.1, the interesting piece is "claims-set" == in example you can see members of a payload, there's sub, jti, iat, exp, nonce -- all of those are registered claim names.... the one important to us is 'vc'.
Orie Steele: For different registry than media types, for JOSE -- JSON Web Tokens Claims registry. VC, the member of this payload, has a structure that looks like a credential, ignoring proof for a second, it looks like JSON-LD... the claim set in vc, has the same sort of thing.. Michael Jones: Some background, this term claims set is defined by the JWT specification, RFC7519 and it's just a name for the JSON that is the body of a JSON Web Token, it's a JSON object with a bunch of claim names as the field names, so iss, typ, jti, those are the claims set claims..
Michael Prorock: There are some benefits for how claims sets are registered in -- JSON tends to get verbose, standardized way to say "these are things we say all the times"...
Orie Steele: Thanks for the point about the shortness, there is text that says "We like short names for payload/header" ... but why, it could be that being more verbose would be more semantically unambiguous.
Michael Jones: The reason why, JWTs can be used in browser query strings, for various reasons, there are still browser URL length restrictions that are small... 2k, 4k, 8k... It was fixed IE at one point, it's bigger than it used to be, you have systems truncating content.
Orie Steele: To make the token format, you can make a string encoding on top of another string encoding.
Michael Jones: By a factor of 33% larger.
Orie Steele: If these names get longer, those other names get longer, that's part of the design here. Part of the content type for the token themselves. After the break, we can see full token, token itself can be response from server, token can be encoded response.
Samuel Smith: For JSON Web Token, this would be the payload, and then tunnelled within payload could have, content type, vc property could be JSON-LD formatted VC with proof included if proof is part of VC spec.
Orie Steele: This particular registration request is also in VC-JWT today, to describe what we did in v1.1. We are working on v2.0, but we want to be able to refer to that object in v1.1 that concretely matches, shorer arguments about what we're doing in the future.
Samuel Smith: This one is saying proof is externally attached.
Orie Steele: Yes...
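The "33% larger" remark in the exchange above refers to base64url, which maps every 3 input bytes to 4 output characters. The sketch below is a rough illustration, not taken from the minutes: it measures that overhead and shows how it compounds when one encoded string is encoded again, as happens when a token is nested inside another token. The helper name `b64url` is hypothetical.

```python
import base64


def b64url(data: bytes) -> bytes:
    """Base64url-encode without padding, as JWS does for its segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=")


raw = b"x" * 300       # stand-in for a 300-byte claims set
once = b64url(raw)     # one layer of encoding
twice = b64url(once)   # an encoded string encoded again

print(len(raw), len(once), len(twice))
print(round(len(once) / len(raw), 2), round(len(twice) / len(raw), 2))
```

Each layer multiplies the size by roughly 4/3, which is why longer claim names cost more than their raw length suggests once the token sits in a URL.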
Kristina Yasuda: found the statement, if JWS is present, digital signature applies to issuer... or VP ... is a holder..:
Orie Steele: It says "can be omitted", doesn't say "MUST" be omitted. What we would interpret that as is proof is optional..
Kristina Yasuda: This has caused a lot of confusion, to clarify vc claim does not contain entire VC, it only contains properties defined in VC Data Model that didn't have mapping into original JWT claims, but VC only should contain stuff about credential subject.. Orie Steele: At this point, we should read definition of credential and verifiable credential.. Joe Andrieu: I don't know if this is substantive, is type after
Oliver Terbu: More background information, proof, why it can be omitted, to use it to express proofs other than what you can express w/ JWTs... you'd have VC JWT with proof with DI proof... those are things that are not great. Discussion over last few years, JWT claims repeated... instead of vs. in addition to -- intention was to focus on small size footprint ... use JWTs in query strings, that's why we decided to do that stuff..
Orie Steele: This is problematic language in v1.1, definition of credential and verifiable credential uncomfortable and confusing to readers.
Brent Zundel: A credential is a set of one or more claims made by an issuer. A verifiable credential is (reads from spec)...
Orie Steele: What I heard Kristina to say is: This isn't verifiable, but we use verifiable name for it... there is no cryptographic authorship, this is secured with an external proof, maybe this should be called credential because it has no proof, or call it credential cause it has an external proof, but confusing ...
Orie Steele: how many media types do we have defined right now? ... 0 in v1.1, 1 currently in the proposal. Joe Andrieu: at least a 3rd media type here we should understand as a group. media type we are securing, and two that are the secured version of them. the language here mushes them all together. Orie Steele: the media type we're securing has the most consensus in the group. vc ld json. pull request 1014 is attempting to describe normative requirements for credential+ld+json. if we can gain consensus we can proceed to securing it. can be easy if the normative requirements are clear on how we do it.
Orie Steele: new things! talking about v1.1 up until now. and core data model objects (credential). now, switching to talk about other concepts - proposals. pull request recently merged for vc+ld+jwt, but no FPWD for vc-jwt. there is plenty of time to object.. Michael Jones: typ is what you would put in a browser if you were to transmit. Orie Steele: cty is about the payload alone, not header or signature.
See github pull request vc-jwt#51. Michael Prorock: a note on cty. referring to payloads is important for business logic processing. seeing this at IETF. #1014 - not whether a proof is allowed in the payload. whether it's allowed with credential+ld+json. or, should we say: if there is a proof embedded in the payload should we use a different cty to describe it, along with a different media type for the browser, etc. not saying whether you can have an embedded proof. just the rules around it.. Orie Steele: that's right. spice from the first slide!. Ted Thibodeau Jr.: editing others slides. adding the / to the cty, for both cty and typ attrs. these values are shortened because people like to shorten things. specifically to delete "application/" from the beginning. it should be interpreted as if this were present.. Orie Steele: can someone read the section that describes the removing of the application prefix? TallTed is right..
David Waite: from RFC 7515 - section 4.1.10. - to keep messages compact, recommend you omit prefix when no other slash appears in the media type value. must treat it as if application/ were prepended...
Manu Sporny: part that's concerning. can't remember having to think this hard about other media types. general concern: all media types we're considering, how will they work combinatorially? lots to understand and learn. developers will get this wrong.
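The RFC 7515 section 4.1.10 rule read out above is small enough to sketch directly. The function name `expand_cty` is hypothetical, and the media type strings are the unregistered values discussed in this meeting.

```python
def expand_cty(value: str) -> str:
    """Apply the RFC 7515 section 4.1.10 recipient rule.

    A value that contains no '/' is treated as if "application/"
    were prepended; any other value is used as given.
    """
    return value if "/" in value else "application/" + value


print(expand_cty("credential+ld+json"))      # shortened form, gets the prefix
print(expand_cty("application/vc+ld+jwt"))   # already complete, unchanged
print(expand_cty("text/plain"))              # other top-level type, unchanged
```

A recipient that compares media types would normally apply this expansion first, so that the shortened and full forms are treated as the same value.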
Manu Sporny: "we will make important decisions around media types" -- slightly misguided. we will do our best for media types. devs will get it wrong, because it's difficult. what do we do then?. Orie Steele: the point about "going into the thing to determine whether it's secured or not" is important. one thing dlongley has been saying...can have an intermediary processing a cty that has no ability to verify - just relay. all that it's able to do is to send along a cty. don't make any intermediary responsible for parsing..
Orie Steele: what manu_ is saying: be careful writing normative requirements that mandate parsing. envs that cannot dig in won't be able to handle normative requirements. an important part of considering this..
Orie Steele: second part: as a developer. don't like being told I'm going to make mistakes, even if I know that I will make them. here for simplicity as much as we can. remember the warning from browser vendors about handling ctys. browser vendors know that mistakes will be made. they try to warn, we should too. Dmitri Zagidulin: clarifying question about what mprorock said about #1014: what is the usefulness of embedding an embedded proof json-ld proof in a JWT? what's the benefit of a proofs section inside a jwt.
Michael Prorock: there is a case I can foresee. not advocating for it. with a JWS you are not signing the same thing that you're signing with a data integrity proof. inherently signing two different things. with a JWS what you're signing is what you see (what the system sees at first glance). with a data integrity proof, signing over the semantics of the data in the credential -- signing a transformation, the nquads. different thing.
Michael Prorock: at a top level you can ask "is this data tampered with?" that's the use of JWS - the external signature. what's coming with the proof...let me run URDNA2015 and verify the signature. what that tells you is what is the intention of the semantics tied back to the vocab. was that itself modified? different than just signing the bytes.
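The difference between protecting bytes and protecting a transformation of the data can be shown with a deliberately simplified sketch. It is not URDNA2015 and involves no signatures: sorted-key JSON serialization stands in for canonicalization, and a SHA-256 digest stands in for the value a signature would cover. Both substitutions are assumptions made only for illustration, as are the function names and the sample documents.

```python
import hashlib
import json


def digest_bytes(serialized: str) -> str:
    """Digest of the exact bytes, analogous to what an external signature covers."""
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


def digest_canonical(serialized: str) -> str:
    """Digest of a canonical form (stand-in for an RDF canonicalization step)."""
    doc = json.loads(serialized)
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same data, different key order and whitespace.
a = '{"issuer": "did:example:123", "type": ["VerifiableCredential"]}'
b = '{"type":["VerifiableCredential"],"issuer":"did:example:123"}'

print(digest_bytes(a) == digest_bytes(b))          # byte-level digests differ
print(digest_canonical(a) == digest_canonical(b))  # canonical digests match
```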
Kristina Yasuda: 2 things - to manu_: agree we have a job to make it clear to readers which cty to use depending on which direction we want to go. to that extent, the current spec gives us those options already. heard a lot of feedback people want to do different things. codification is useful. maybe could be different than cty and typ..
Kristina Yasuda: 2nd point - reacting to mprorock -- made a comment on the PR, explanation makes me think if we want to sign JWT with an embedded proof it should be a separate media type. could be dangerous security wise.
Michael Prorock: yes.
Michael Jones: agree with kristina. appreciate what Orie, mprorock and others have advocated. starting to separate and cleanly advocate for things that are separated but distinct. in vc 1.0 spec we had the vcdm representation of content types. now we have a media type for that. also had 1 or more jwt claim sets for vc-jwts. depending on 'in addition to' or 'depends on' option - 2.0 now codifies that. delineation is important. th.
Michael Jones: appreciate how differentiating things that are actually different has enabled us to make progress. Joe Andrieu: something to be learned/looked at for how gzip is handled on the web. not clean either. media type could be gzip, could be content encoding as gzip. in a future universe would like an integrity type. do not have a way to do that yet. stuck with multiplicity. we also have verifiable presentations. have the same multiplicity there..
Joe Andrieu: however things get secured, will need to add it for both VCs and VPs..
Samuel Smith: to Gabe's comment, there can be a way of communicating other steps. not only does the signature need to validate, there needs to be validation against some schema. additional level of validation. can be useful to constrain semantics. need some discussion of how we can convey that.
Michael Prorock: I think manu_'s on to something important. here be dragons areas. let's be careful of what we're defining and how. what do we actually mean by typ and cty? what typ is saying -- we are expecting the overall body of the JWT to be a verifiable credential and have LD (as indicated) and expecting a JWT format. what the CTY is saying about the payload -- expecting it to be a credential with LD.
Michael Prorock: this has been one of the reasons I've been stubborn about 'when we start adding additional modifiers ...' many CVEs around this. openSSL had a type confusion thing because x509 has badly handled this stuff. we should learn. even if it prevents us from doing some stuff. let's be explicit when there's a divergence.
Orie Steele: dmitriz asked the use case for proof being in claim set. mprorock answered, but want to repeat processing comment. can think of it as tunneling. I'm tunneling my embedded proof through the cty header.
Orie Steele: e.g. constrained environment. it's a good thing, can forward content with different values for typ and cty. the concept of being allowed to tunnel one security format to another is a thing we see in the wild. should be careful of how that will be interpreted by browsers. allowing for tunneling could be a thing they like or don't - let's make a case for tunneling.
Orie Steele: can close door to some use cases being affected if we do this. Dave Longley: tunneling is one of the cases, yes. have left many comments. let's only have as many media types as we need and not any more. always another place to draw a typ boundary. let's make sure the boundaries we draw solve concrete problems. e.g. places that use binary data. does not mean the same concerns will apply to a json format, parsing, or browser parsing.
Dave Longley: if we have too many media types we can have more problems for ourselves. can lead to vulns. let's have concrete examples of threats/problems we can analyze to see if we can add more media types. then we should add them. let's not jump ahead and add all the types today. can cause problems.
Manu Sporny: +1 to dave. trying to see what type of problems we're solving. yes, specific media types we want in this group. nobody's saying we shouldn't have them if they're paired with a good use case, paired with good security practices. that's good - no objections heard. objection -- let's create media type patterns. we could have 20-30 media types based on these patterns, that's where I get shaky.
Manu Sporny: ... typ, yes. cty, yes. let's figure out if we want to add the word 'verifiable' in front of it. what do we want to do with the proof thing? is there anyone objecting to having the typ field (#51)?.
Manu Sporny: I expect that to happen (typ field). for a content type, that's a separate discussion. is it a subtype of credential? if it isn't there's a can of worms. see PR #1014.. Orie Steele: intention not to make any decisions during the f2f, just to inform. Brent Zundel: let's make decisions!.
Andres Uribe: considerations for using the typ parameter for the media type defined that specifies how the credential is secured? anything that has 'verifiable' should have a typ param that specifies how it's secured. does this avoid the need for cty? as someone who's coming in relatively new. very different when you talk about a verifiable cred vs cred. verifiable = must have a proof whether embedded or external. would clarify a lot for devs. Orie Steele: parameterization of media types..beginning said 'careful!' yes it's an option, can propose it on any of the open PRs. if we do that we need to describe these parameters in registration requests. less opaque. have to ask questions-is param present? what does it have? think if we can avoid parameterization we should. Mahmoud Alkhraishi: to understand...the cty saying credential+ld+json indicates to me that the object vc will or will not have a proof (determined on #1014), if not a proof, will know whether the object is signed or not based on the typ?. Orie Steele: current spec has a section called 'proofs' which says proofs can be embedded or external. one of the most confusing parts - with media types, can have a media type with an external proof. content wouldn't have a proof..
Orie Steele: jwt uses external proofs. the type vc+ld+jwt indicates the presence of an external proof. the cty param credential+ld+json (let's pretend it's verifiable+credential+ld+json), proof would be in the payload. we won't know until we constrain the payload.
Mahmoud Alkhraishi: when I'm processing vc+ld+jwt am I expected to process two signatures?. Orie Steele: that's the warning!. Joe Andrieu: advocate for parameter use. have an underlying data model which is being delivered. if we had an integrity type in the header I'd be arguing for using that. the integrity mechanism feels like a parameter to me. if we have two media types they don't have to be related at all. part of the confusion is about tunneling. the idea that you can tunnel is something we should not encourage - have to check different proofs. Brent Zundel: what decisions do we want to make here? PR #1014 proposes to say proof is not included, right?. Orie Steele: yes, dlongley and I had a discussion. he suggested proof is allowed in credential+ld+json. my original intention was to forbid that from being possible..
Orie Steele: merged that since it was within my original intention. merging intentions into PR does not mean consensus! there is currently no consensus on #1014. most accurately captured by whether the cty can have a proof in it. can make that decision today. Brent Zundel: also a decision to be made for #51?. Orie Steele: yes, for #51 we are requesting the registration for typ parameter in vc-jwt. it is JWT specific. #1014 is core data model specific.
Brent Zundel: in addition to those two decisions, what other proposals could/should be on the table for the group for the next 30m?.
Orie Steele: other proposal: application/vc+ld+cwt.. Dave Longley: we get to decide if it makes sense if squares are rectangles in this case :).
Michael Prorock: assumption from devs coming in that you're just living in a JOSE/COSE world. but have two different ways of exchanging info beyond that. if I as an issuer send this, if I believe there's a high degree of value in the embedded proof then I should signal it.. Manu Sporny: the other thing at play -- we have two different philosophies. the JWT philosophy, and embedded proof, and they don't make the same decisions. for media types let me propose we only have two media types forever: credential+ld+ and verifiable+credential+ld+.
Manu Sporny: specifically not saying it has "proof" because that's just how we do it today. I'm concerned about us fixating on whether proof is there or not, we should focus on whether there's an embedded proof or not. from the DI side there will not be a +di for a while. we won't make that decision any time soon. will wait for impl feedback.. David Waite: it is a little bit weird. the way things are structured at the JSON level are not how they're processed at the LD level, since being an RDF graph. a lot closer than you think. in my opinion, we've defined multiple proofs only in the context of the proof parameter can we have multiple values.
David Waite: the semantics of a data integrity proof and a proof inside a JWT are quite different. e.g. protecting someone else's proof with my own (today it is chains). the only interpretation I can think of is that the proofs are independent. if there are multiple, choose between them. any intermediary could remove proofs, and have no way to know that happened. Joe Andrieu: the parameter still resonates with me. not talking yet about the API used to send a proto-vc (unsigned) to be signed by another component. to me that is credential+ld+json, parameters used to specify the type of proof embedded in the VC, instead of the VC embedded in the proof.
Kristina Yasuda: reviewing document posted above (https://hackmd.io/Q8EOfbzYTZK_jHH-BJcfKA?view). Orie Steele: do not have any merges for typ in the core data model just yet. Kristina Yasuda: Manu's proposal in typ not cty.
Michael Jones: we can also provide guidance in the spec on the use of credential+ld data type if used with an external proof--must not also contain an internal proof. whereas; if used within a context with internal proofs it may contain one. that would be OK with me..
Orie Steele: [new slide - 25] Pull #51 in vc-jwt has both vc+ld+cwt and vc+ld+jwt. Andres Uribe: concrete proposal to make sure there are parameters. not just add credential+ld+json but specify how it is proved, e.g. ?proof=[jwt, cwt, etc.]. similar to what Joe was saying / and adding to Manu was saying.
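The parameter proposal above can be made concrete with a short sketch. It assumes the standard `; name=value` media type parameter syntax and a purely hypothetical `proof` parameter; no such parameter has been registered. The function name `parse_media_type` is invented for this example. Parsing uses the Python standard library's `email.message.Message`, which understands Content-Type style headers.

```python
from email.message import Message


def parse_media_type(value: str):
    """Split a media type string into (type/subtype, parameters)."""
    msg = Message()
    msg["Content-Type"] = value
    # get_params() returns the media type itself as the first entry.
    params = dict(msg.get_params()[1:])
    return msg.get_content_type(), params


# Hypothetical parameterized form of the media type under discussion.
print(parse_media_type("application/credential+ld+json; proof=jwt"))

# The same media type with no parameter at all.
print(parse_media_type("application/credential+ld+json"))
```

As noted earlier in the discussion, a receiver then has to ask whether the parameter is present and what it contains, which is the extra processing step that parameterization introduces.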
Dave Longley: every VC is a C. every square is a rectangle. already have a model where you look at JSON properties - you look at the ones you understand, ignore the ones you don't. also understand whether they care about these proofs/properties. if you don't care - fine - does not mean a sender should explicitly remove it, should not need to explicitly remove it because it's there. providing guidance around processing is fine. sh.
Orie Steele: responding to dlongley. let's surface legitimate use cases. let's make sure use cases are driving our spec development. without these normative statements, we have trouble satisfying. tunneling is a real use case. "type confusion attack" mprorock mentioned. is that a concrete use case for forbidding proof being in the payload?. Michael Prorock: want to call out. parameterization - have experience around media stream handling. there is a parameterization to be had, maybe not in 2.0 but in future versions of LD. how do we indicate at the header layer what signature am I expecting? same notion of parameterization in cryptography.. Samuel Smith: one use case not seen discussed - prevalent in business/legal world: endorsements. have a party create something. other parties lend credibility via endorsements. can have threshold structures for issuance. can also have threshold structures for receiving. should support things that are really prevalent in the business world. Manu Sporny: clarification there is no +ld media type. does not exist. it is only ld+json. would need to figure out something.... Brent Zundel: Dave mentioned squares and rectangles. Let's talk about the accurate labeling of squares and rectangles.. Mahmoud Alkhraishi: had a similar question. if we separate credential+ld+json to state there will be no proof and have vc+ld+json and say there will be a proof, will allow us, when we use the cty field, if the typ field says ld+jwt, can say process the internal signature as well or do not.
Mahmoud Alkhraishi: can make it very clear when I receive a vc+ld+json I should process a proof. normative requirement would be straightforward. allows us on the JWT side to say I care or don't.. Dave Longley: if we're deciding to add a prohibitive statement let's be clear about what problem it's solving with a concrete use case..
Orie Steele: unbounded number of content types that are relevant to software development. that's why cty is liked. can treat as opaque bytes with a cty that informs the strucutre.
Orie Steele: dlongley asking for concrete use cases. proof being present in a credential. mprorock has said openSSL type confusion as an example. another place it could be a problem: if the @context returns different content than what it was when the canonicalized proof was created, the outer external proof will verify, the inner won't until the URL in the context change(s/d). Michael Prorock: two concrete problems: one - what Orie mentioned. second -- crawling and parsing massive amounts of data...what went into models...could be cases where I want to embed different types of proofs. don't want that knowledge to be confused for any of my users. e.g. for web archive, need to maintain the state of the internet at different times. how we begin to do this?. Joe Andrieu: there is an unstated presumption that's conflating the signature on a JWT with a signature on a proof. no reason it needs to be on the same issuer.. Manu Sporny: going back to what selfissued said. maybe we can add language to not outright forbid in the base class. start out with softer language in the beginning - not outright "don't do x" - then we can either go to "now it's forbidden" or we realize the guidance goes against real use cases..
Manu Sporny: where the JWT stuff can be very explicit: do not do it. the base spec can say "this is what people are expecting". see VC-JWT for more information on this.
Samuel Smith: echo Orie's statement. if I sign a receipt. if someone purchases from me, it's signed, it has a proof. I give a receipt on the proof that I sign, referencing what they sign. liability on issuing receipt is on me.
Gabe Cohen: The use case for tunnelling, at TBD, transport credentials over DWNs. Use cases on status list, different signature methods, blob might be signed as a VC itself, might have many different types -- tunnelling is real.
Michael Prorock: difference between verifiable+credential and verifiable/credential.
Dave Longley: need to know what you care about a priori. need to decide what you're verifying and make decisions there. let's make sure we know where the knowledge is coming from. verifier should know ahead of time what it's looking for and willing to verify.
Orie Steele: queued to comment on verifiable+credential or verifiable-credential - have been comments on #1014. if the dash were changed to a plus it would say there's a direct relation to credential...only a content type when ld+json, different than verifiable-credential+otherstuff. what Ted said: verifiable-credential and credential are at the same level.
Michael Prorock: just do this in 'vc-jwts': if this is a W3C....we eat this up in the browser. we need to think about what I, as a browser, see application/??? - do I know how to verify it?
Manu Sporny: all vendors support application/ld+json. tell vendors do things as you've always been doing.
|
<p>
This media type MUST NOT be used to describe a verifiable credential with an <dfn>embedded proof</dfn>.
</p>
<p>
This media type MUST NOT be used to describe a verifiable credential with an <dfn>embedded proof</dfn>.
</p>
<p>
If this media type is used any <dfn>embedded proof</dfn> MUST be ignored for the purposes of verification.
</p>
-1 to this. VCs and the three party model generally do not create a "control structure". Neither the issuer (nor this WG) get to tell a verifier what they can or cannot accept. I think media types are not the right way to accomplish the goals here.
If an API does not want to accept a particular field, it should say so / have a schema that rejects it, etc. Those are the right tools for that job.
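A minimal sketch of the API-level approach suggested here, assuming a hypothetical endpoint that chooses not to accept a top-level `proof` member. The function name `validate_api_input` and the set of rejected members are illustrative assumptions and are not defined by the spec or by this PR:

```python
import json

# Hypothetical API rule: this particular endpoint refuses documents that
# carry a top-level "proof" member. The restriction lives in the API's own
# input validation, not in the media type.
REJECTED_TOP_LEVEL_MEMBERS = {"proof"}


def validate_api_input(body: str) -> dict:
    """Parse a JSON body and reject it if it contains members this API refuses."""
    document = json.loads(body)
    if not isinstance(document, dict):
        raise ValueError("credential must be a JSON object")
    present = document.keys() & REJECTED_TOP_LEVEL_MEMBERS
    if present:
        raise ValueError(f"members not accepted by this API: {sorted(present)}")
    return document
```

A different API could accept the same bytes unchanged; the media type that labels the bytes is the same in both cases.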
@dlongley are you ok with "should not" language for inclusion of an embedded proof?
I'm wondering if we can flip the language around and say something like this:
If application/credential+ld+json is specified, verifiers MAY ignore top-level embedded proofs. Implementers are strongly urged to use a more specific media type, such as application/verifiable-credential+ld+json, if the credential is secured using an embedded proof.
I'm offering the text above not because I think it should be in the spec, but to do a temperature check on compromise language that might work.
IOW, the statement above says: "Use this 'unsecured' media type and embedded proofs together at your peril... DO use a more specific media type for expressing that the content is secured."
That said, we will probably also need language that effectively states: "DO NOT just trust media types... that's an easily exploitable attack vector". If it says the media type is secured, you MUST run algorithm XYZ to ensure that it is.
Another way to look at this is when MUST an implementation throw an error. I can think of two instances where it definitely MUST throw an error:
- application/verifiable-credential+ld+json is specified, but the VC contains no embedded proofs.
- application/verifiable-credential+ld+json is specified, and it contains embedded chained proofs, where one of the proofs in the chain is missing/invalid.
... I'm sure there are other cases where we can all agree that an error MUST be thrown (at some layer).
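The compromise language and the first error case could be sketched roughly as follows. This is only an illustration under stated assumptions: the two media type strings are the ones floated in this thread (not registered types), `check_media_type` is a hypothetical helper, no cryptographic verification is performed, and the chained-proof case is not covered:

```python
# Media type names as proposed in this thread; assumptions, not registrations.
UNSECURED = "application/credential+ld+json"
SECURED = "application/verifiable-credential+ld+json"


def top_level_proofs(document: dict) -> list:
    """Return the top-level embedded proofs as a list (possibly empty)."""
    proof = document.get("proof")
    if proof is None:
        return []
    return proof if isinstance(proof, list) else [proof]


def check_media_type(media_type: str, document: dict) -> list:
    """Return the proofs the verifier still has to verify cryptographically."""
    proofs = top_level_proofs(document)
    if media_type == SECURED:
        if not proofs:
            # Error case 1: labeled as secured, but nothing secures it.
            raise ValueError("secured media type but no embedded proof")
        # The label alone proves nothing; each proof must still be verified.
        return proofs
    if media_type == UNSECURED:
        # "verifiers MAY ignore top-level embedded proofs"
        return []
    raise ValueError(f"unsupported media type: {media_type}")
```

Returning the proofs rather than a boolean reflects the point above that the media type is only a claim by the sender; the verification algorithm still has to run.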
To see if other media type registrations contained language like this, I started with application/jwt -- https://www.rfc-editor.org/rfc/rfc7519#section-10.3.1
It does not contain any language about what properties MUST/SHOULD or MUST NOT/SHOULD NOT be included in the media type registration... so, that got me wondering... are there other media types that have this sort of language?
Update: I just looked through the entire structured syntax suffix registry and found no language in each IANA registration section that prohibited certain content based on media type.
What inspired the language in this PR? Was it from another media type registration? If so, which one?
> are you ok with "should not" language for inclusion of an embedded proof?
No, I'm afraid not. I don't think it's the right use of media types. IMO, the language is unnecessarily restrictive, ineffectual, and likely to create more confusion and unnecessary complexity as a result.
However, I am not against more specific media types (other than that I think we should have a strong reason for every new media type we create). Nor am I against recommending that more specific media types be used if there's a desire to communicate that some data meets some subclass constraints. But that just seems to be general media type advice that shouldn't need to be said in a particular media type registration.
> IOW, the statement above says: "Use this 'unsecured' media type and embedded proofs together at your peril... DO use a more specific media type for expressing that the content is secured."
I think language like that is unnecessarily fearful. There's no reason to bring concepts like "use this as your own peril" into this. We don't need to create situations where people are afraid to consume certain data because of the media type that accompanies it. That further proliferates the wrong idea about media types, how to decide whether data is authentic (and what to do / not do about it), and how to work with data in the ecosystem generally, IMO.
In the three party model, there is no single party that tells other parties what to accept and then they just ... accept it. That's a core difference to some common two party models. There is no "Authorization Server" here. Instead, there is self-describing data that conforms to certain specs. Independent parties implement to those specs and when they see data that they understand -- they know how to consume it and what it means. If they don't understand it, they ignore it or reject it -- their choice. What they understand and are willing to accept is up to their business rules. They don't need to get together with other parties or have some special, elevated authorization party tell them what to do; that's just not how this technology works at all.
@msporny This seems closer...
> If application/credential+ld+json is specified, verifiers MAY ignore top-level embedded proofs. Implementers are strongly urged to use a more specific media type, such as application/verifiable-credential+ld+json, if the credential is secured using an embedded proof.
How about this variation:
If application/credential+ld+json is specified, verifiers MAY ignore top-level embedded proofs. Implementers SHOULD use a more specific media type, such as application/verifiable-credential+ld+json if the credential is secured using an embedded proof.
To your point re the registration side - I think you are correct there, in that I don't think the registration block is the right place for this. I would like clear normative guidance about how to use the media types, and believe that the correct location for language to this effect does not need to be noted right inline below the table as it is done here, and probably could be located elsewhere in the spec.
edit: to add a note on a few things that get into multiple representations that then point out to external specs for normative and interop purposes, such as csvm+json
If we are going to have any guidance in this section, it should likely be in one of (or aspects of the guidance in both of) the Additional information and Interoperability considerations - there is some normative 'like' language in many docs, usually in the interop section, e.g. in H224, but normative language as to usage in this case is probably better located in a section detailing use of media types with VCs - this then could provide guidance for WG and other specs that utilize media types with VCs (like 'don't register everything, only what is required')
> If an API does not want to accept a particular field, it should say so / have a schema that rejects it, etc. Those are the right tools for that job.
The v1 context says proof can be a member of the RDF type VerifiableCredential.
Perhaps the mistake we have made was to define credential+ld+json instead of vc+ld+json... since the concept of what a "credential" is seems to not have strong agreement, I thought it was obvious that a credential is not a verifiable credential and would not have an embedded or external proof, after reading the v1 spec... but it seems that there is no agreement on the interpretation of the v1 spec text:
https://w3c.github.io/vc-data-model/#proofs-signatures
> What inspired the language in this PR? Was it from another media type registration? If so, which one?
@msporny confusion over what a "credential" is... and wanting to define a media type so that the bytes for a representation of a "credential" can be secured and referred to consistently... As I noted above, perhaps there is no concept of credential, there is only VerifiableCredential... Certainly this is the case regarding the RDF interpretation of the core data model.
In either case, this dialog is emphasizing a critical flaw in the spec... it is clear the current language is being interpreted in drastically different ways by readers.
A few paths forward.
1. Remove proof from the core spec (rely on other specs, like data integrity, to define its presence in VCs, Capabilities, or other LD Objects)
2. Register vc+ld+json instead of credential+ld+json, since proof can be present in vc+ld+json
3. Remove the language from core spec about credential and just define verifiable credential
> Perhaps the mistake we have made was to define credential+ld+json instead of vc+ld+json
Strong +1 to this. It's the point I was trying to convey in #1044 (comment)
AFAIK there is no normative definition of what a Credential is. We only have a normative definition for a VerifiableCredential.
I am against 1, and in favor of doing 2 and 3.
I'm not necessarily against what @andresuribe87 proposes (and the spirit of his comment) ... provided that we can make everything hang together.
I should also note that @andresuribe87's comment, I think, somewhat captures what we tried to do to simplify things in the 1.1 work.
The issue was discussed in a meeting on 2023-02-28
1. Add normative requirements regarding media type and proof (pr vc-data-model#1014)
See github pull request vc-data-model#1014.
See github pull request vc-data-model#1034.
See github issue vc-data-model#1044.
Brent Zundel: There is a long conversation in the PR, goal of this meeting is to hash out what different sides of debate are, and can then try to come to proposals, recommendations on how to move forward.
Manu Sporny: There are a couple of threads for this discussion. One of the threads is "hey, we should have a media type for unsecured credentials and we should have one for secured ones. Those are the two classes we should have."
Michael Prorock: I think that's a reasonably good summary, in my mind, it's a more fundamental question. What is a credential? What makes it a verifiable credential?
Dave Longley: I think it's good to avoid confusion where we can, we have to be careful about avoiding confusion when we're talking about secured vs. unsecured or trusted vs. untrusted data... the recipient of the data needs to have an expectation about what they're receiving. They can't just accept whatever the sender sends, that opens the door to potential vulnerabilities.
David Waite: Some of this may be defining the behaviour when you have multiple proofs by default, I can remove one of the proofs and it's still verifiable, just with less mechanisms.
Michael Prorock: The PR here is explicit around one line related to
Manu Sporny: So I want to agree with what David Waite was saying.
Dave Longley: media types are not necessarily the right tool to make sure that something is an input to an API. The media type is a way to identify the type of content and how it would be parsed and used, regardless of what API you're using it in. It's a more encompassing thing, not for just one particular use case, one particular API, it's not the right tool for accomplishing that.
Dave Longley: That said, APIs have their own shape, can provide their own restrictions, you can do that, that's not a problem and that might be the right way to define that, you can do that via API schemas to accomplish that goal.
Kristina Yasuda: Manu's comment confused me. I do not understand the concept of top level, secondary level proofs, one of the things we did in data model v2 is we separated data model in how we sign/secure it. The notion that something that conceptually is not supposed to be signed can have this proof is very confusing to me.
Michael Prorock: Big +1 to what Kristina said, building on that, we have a good model if we accept this PR, core data model defines what a credential is, and how different specifications might process that... what's expected, in VC-JWT there is a clean way of specifying media types, what to expect when, JWS, or something COSE related, since we have a nice data integrity spec, perhaps we should put media types w/ proofs in a credential, people know how to deal with that information.
Manu Sporny: To address Kristina's concern. We have a concept here -- and JWTs deal with it differently. With Data Integrity proofs, you can have documents that nest other documents and each can nest proofs at every layer.
Manu Sporny: Saying
Dave Longley: big +1 to what manu said... if we moved proof into data integrity, it would further undercut case for restricting
Kristina Yasuda: To clarify -- not questioning legitimacy of use case where there are nested data integrity proofs -- what I'm not understanding is credential that is conceptually not supposed to be signed, including a proof property? That's what clearly separates credential from verifiable credential. Let's not completely remove proof from data model, remove it from section 4 to section 6.
Kristina Yasuda: Imagine a JSON structure that includes proof property... and that is signed by JWS... if that enters JWS processor, that doesn't understand proof claim, it'll ignore it. In which case, signature is entirely being ignored, which I'm concerned about. Is there a way to resolve that? Happy to hear solutions.
Michael Prorock: No one is saying these are not valid use cases, we want a clear and valid media type.
David Waite: In the spec we talk about embedded proofs and external proofs, way to state the same thing... embedded proof is a way to extract proof and recanonicalize to create an external proof over the document. It comes down to giving recommendations --- if one proof is chaining, pointing into another, you cannot express outer proof w/o inner one, but you can strip off outer proof, so when people embed a data integrity protected VC inside a JWT, they need to understand that they're requiring twice as much security processing logic... that may affect full stack decisions, or they could just discard the JWT and send credential and data integrity around and they need to decide if it's appropriate or not. If we do have it, we should have a SHOULD NOT with guidance.
Dave Longley: There are going to be many applications that can accept a credential whether or not it has a proof attached to it, important that base media type supports that... if there are other applications that want to have a media type that specifically says "this property is not allowed in this media type" - I think we can do that, but I recommend against it, I don't think it's a good idea. I don't think we should call that out on the base media type. Maybe have base media type to be superclass of other things, someone could create a media type for their other applications, can't have specific property.
Dave Longley: "What if this verifier receives something where a signature has been stripped?" -- verifiers need to know what to expect from a holder. They will know what proofs they need to look for, they need to know which issuers to trust, how many signatures to expect, etc... we should remember that we're defining a base data model here, there are things that people don't necessarily process in a credential... base media type that's a super class allows that to happen, I don't think there's a problem there.
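As a rough illustration of the "verifiers need to know what to expect" point, the sketch below models a verifier's expectations as data it holds ahead of time. Everything here is an assumption made for illustration: the `VerifierPolicy` class, its fields, and the example issuer URL are hypothetical, and the check only looks at the shape of the credential; it does not perform any cryptographic verification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerifierPolicy:
    """Hypothetical a-priori expectations held by one verifier."""
    trusted_issuers: frozenset
    required_proof_types: frozenset
    min_proofs: int = 1


def meets_expectations(policy: VerifierPolicy, credential: dict) -> bool:
    """Shape check only: is this worth attempting to verify at all?"""
    issuer = credential.get("issuer")
    if isinstance(issuer, dict):
        issuer = issuer.get("id")
    if issuer not in policy.trusted_issuers:
        return False
    proof = credential.get("proof", [])
    proofs = proof if isinstance(proof, list) else [proof]
    if len(proofs) < policy.min_proofs:
        # A stripped proof is caught here because the verifier expected one.
        return False
    found = {p.get("type") for p in proofs if isinstance(p, dict)}
    return policy.required_proof_types <= found
```

Under this sketch a credential whose proof has been stripped fails the verifier's own policy regardless of which media type accompanied it.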
Kristina Yasuda: yes, but what they accept is up to them.
Manu Sporny: I think we need to look at the PR. The PR is specifically talking about the
Manu Sporny: So I think the thing you mentioned is a different conversation and that I agree with you more over there, than on this PR. This PR says, we're going to single out this one property and makes it illegal.
Manu Sporny: I think we're trying to use media types to signal things that we shouldn't rely on, it gives power to the attacker.
Michael Prorock: I don't think anyone is disagreeing on the multiple layers on security checks, not trusting inputs, I would hope all of us in this group are familiar with those concepts.
Michael Prorock: What I'm concerned about is saying that something should not have a proof in any way shape or form, perhaps from a consensus standpoint we end up in SHOULD NOT. If we want to think about a variety of things, no one here wants a proliferation of media types for every options, No one is asking for a proliferation of business rules defined in the spec. If it is implicitly confusing that receives a proof in it, or embedded in a VC-JWT, what proof type comes first. If we don't have a clear way of saying: "Don't do this, you're going to confuse the user"... it's highly concerning to me. If we leave it up to verifier, certain issuers are going to assume one type of security model might be required w/o clean mechanism to know that they're doing that. Let's try to get explicitly clear about what a credential is before it becomes a VC and then be very clear about that as well.
Tobias Looker: I'm still catching up on this issue, so might have perspective that's invalid -- my interpretation of text in PR doesn't insinuate that it makes embedded proofs illegal, media type doesn't establish any semantics around inclusion of property of what it means. If I sent application/json, and data is customer record, that is at a different layer semantically. Embedded proof might have something secured, media type isn't going to communicate presence or processing logic to do anything w/ that.
Ted Thibodeau Jr.: Media types don't do these things... media types don't say "in this structure of file you can only have X field name". Media types give you the structure of the file so you can deal w/ its contents. If you feed a .zip archive to Excel, it's going to choke because it doesn't know how to deal w/ that content.
Ted Thibodeau Jr.: Vice versa, unzip cannot do anything for you with the contents of an excel file... it's not the structure of the document it works on, the structure is the thing that matters here. Subtypes in the mediatype universe are limited not in what fields they can have and the contents of what they could be... bigger type might allow larger number of rearrangements, smaller more focused might allow less.
Ted Thibodeau Jr.: Media types don't care about proofs, or business logic, or anything like that. Trying to make them do that job, we're going to break all sorts of security models. It don't work that way.
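The view of media types described here can be sketched as a parser dispatch table: the media type selects how the bytes are parsed and says nothing about which members the parsed document may contain. This is an illustrative sketch only; the `PARSERS` table and `parse` function are hypothetical, and only Python standard-library parsers are used.

```python
import io
import json
import zipfile


def _parse_json(data: bytes):
    return json.loads(data.decode("utf-8"))


def _parse_zip(data: bytes):
    # Returns the names of the archive members.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()


# The media type picks a parser for the structure of the bytes.
# Nothing in this table inspects or restricts individual members.
PARSERS = {
    "application/json": _parse_json,
    "application/ld+json": _parse_json,
    "application/zip": _parse_zip,
}


def parse(media_type: str, data: bytes):
    try:
        parser = PARSERS[media_type]
    except KeyError:
        raise ValueError(f"no parser for media type: {media_type}") from None
    return parser(data)
```

Feeding JSON bytes to the zip parser fails because the structure is wrong, which mirrors the zip and Excel example; a `proof` member in a JSON document passes through untouched.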
Ted Thibodeau Jr.: I've said in a number of times, a number of places, people don't seem to understand how media types work, if you don't understand how this stuff works, and we try to spec with it, we're going to break something.
Tobias Looker: Further the point that I'm making, don't think media type should rule out in an extensible data model technology whether or not the member of an element should /shouldn't make something illegal. I don't think media type should reach into the data representation technology and says "this should never exist". I think that messes w/ data extensibility model in JSON.
Tobias Looker: The media type doesn't convey that, media type doesn't tell you anything about what the media type tells you that... it could just be a string w/ some other meaning attached to it. That's all I take the language to mean in this PR. It doesn't communicate anything about the presence of an embedded proof or not.
Manu Sporny: +1 to tplooker, Tallted.
David Waite: One of the ongoing concerns, we're trying to define an extensible data model to use it in ways that we haven't imagined... even extension points we have, like
Michael Prorock: Is there an approach that might work? If media type is used, embedded proof must be ignored by processor by verifier? This is a reasonably complex thing we're trying to say.
Kristina Yasuda: I think we double clicked on statements meant by tobias and tallted -- JWT registered
Brent Zundel: Jump on PR and try to refine some language.
Brent Zundel: See everyone on the call tomorrow.
|
Hi guys,
I am getting notifications about your project, and it's better if I
don't, I might have been added by mistake.
I am "andres" at github
best regards,
|
@iherman -- #1014 (comment) needs a quick edit, to change |
The issue was discussed in a meeting on 2023-03-14. List of resolutions:
1. Media types
|
Closing this PR, we will attempt to define a media type for I will wait to open the new PR until #1062 is merged. |
This PR adds normative requirements that connect the concept of the application/credential+ld+json to the concept of embedded and external proof.