-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add certhash #130
add certhash #130
Conversation
Co-authored-by: Rod Vagg <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so .. I went googling because I hadn't heard of "certhash" and go-multihash is the top result for webtransport "certhash"
. Is "certificatehash" a more appropriate/common name?
your call
That's certainly the case, but it's also longer. Not sure how this should factor into the decision. @aschmahmann @Stebalien any opinions? |
WebTransport addrs will already be pretty long, and iiuc there will be more than one certificate hash to facilitate switching to a new hash over time. So.. maybe we could go.. shorter? Since we have:
we could go with shorter (and generic)
|
Eh, I don't think the length matters that much. In terms of name, |
Seems like a good thing! Search results will point to this rather than give a generic description. |
Thanks for all the suggestions! I like |
Following up on a discussion in multiformats/rust-multiaddr#59 In regards to the discussion on whether a multihash makes sense here, I would argue yes it does due to:
See also recent discussion on the specs pull request libp2p/specs#412 (comment) In regards to the discussion on whether a multibase makes sense here, I would argue yes it does, though I would argue less strongly:
I think this is a sane mindset and a good reasoning for going with multihash and multibase. Though I think we should be consistent on the base across implementations for now. |
I don't see any advantage to that (and maybe even a slight disadvantage, exercising the flexibility we have does have value). On the wire, it's all bytes anyway. And where we share strings, they'll get passed through a multibase decoder. |
I think there really isn't much choice since (for the near to mid future) certhashes will be mostly used for browser comms and will probably just be base64url. Spec-wise it should still be defined multibase + multihash since we really dont want to code/spec ourselves into a corner. |
I'm going to detail my point of view in this comment. Multiaddresses aren't actually composable(this is just a preamble, and I think everybody will agree to the content of this section) As we all know, a multiaddr is a list of protocols, such as It is easy to think of these protocols are being wrapped inside of each other, for example an IP header followed with a TCP header followed with a WebSocket frame. However, this doesn't represent the truth. A few examples:
Additionally, the protocol stacks that multiaddresses describe are specific to libp2p. This is particularly obvious for To me, and probably to everyone, a multiaddr can be seen as having the same problem as the OSI model: it doesn't represent the truth, but it is close enough to be a useful mental model. How to choose multiaddr representationsIf a multiaddress doesn't represent the truth, then which criteria should we use to decide on the format of multiaddresses? I believe, and I think most people will agree, that a good multiaddress is a multiaddress that is reasonably intuitive for a human being to understand when reading it. I personally don't see any other rational criteria for choosing multiaddresses formats than its readability/writability by human beings. Is /certhash intuitive?The How intuitive something is is obviously subjective, but I don't see how If multiaddresses are supposed to convey an idea of stack of protocols, a bit like the OSI model does, then I fail to see what Is using a multibase+multihash intuitive?Let's put ourselves in the shoes of a devops that is configuring some kind of software that uses libp2p. On the other hand, if the hash was the SHA256 hash, the devops could use the built-in UNIX utility Additionally, using a multibase has the consequence that comparing multiaddresses is no longer as simple as comparing their text output. You now need to actually parse the multibase, and check whether their payloads compare equal. This is quite a big complication. You might say that preferring a format over another because how well a devops is able to read/write it or how one can compare them is a pretty weak argument, but this is IMO literally the only reasonable argument. One could also argue that multiaddress formats should be coherent in order to be intuitive, and I agree with that. However there is no precedent (apart from the very new Counter-argument: extensibilityThe argument in favor of using a multibase and multihash is "extensibility". This PR proposes that we add But there is a pretty obvious other way to extend multiaddresses: adding new protocols. Rather than add I understand that using a multihash as opposed to multiple protocols has some small consequences on the code paths in the implementation. However, it would be a mistake to try implement Furthermore, I'd like to claim that leaving the list of hash algorithms extensible is a weakness in the specification. It is incomplete for an implementation to claim to support However, importantly, what I want to highlight in this section is not that I prefer What I would proposeWhat I would propose for WebRTC is: Something like It is reasonably simple to read and construct, and conveys the idea. It doesn't have any of the drawbacks expressed so far, or at least minimizes them. Why I understand that there's a lot of arbitrary in choosing a protocol name, but the important points I want to highlight in this comment is that I think that A side note about the visionI think that what I describe in this comment is a relatively coherent way to design multiaddresses, as opposed to the current way of doing things. Correct me if I'm wrong, but there really isn't any guiding document regarding multiaddresses. It seems that everybody just adds protocols randomly (choosing an arbitrary protocol number), and there's no argument in favor of specifying protocols in a certain way and not another. When asked why use a multihash and a multibase, the answer is "well, you never know, for the future". |
Agreed with the two points above. It is a good mental model though not consistent across all protocol combinations.
Or even
Agreed. multiaddresses should be easy to read and write.
For me there is also:
As an aside, I am not familiar with any alternatives. Pointers appreciated.
If I am not mistaken it is not about being intuitive, but about being able to specify multiple certificate hashes in a single multiaddress, allowing to renew an endpoints certificate without downtime. This is needed in WebTransport. Quote from the WebTransport draft specification:
Does that make sense @tomaka? Happy for alternative suggestions on how to support the renewal process in WebTransport.
Agreed that this is tedious. I would expect the implementation to print the listening address including the
One would need to write a small tool for this. Not ideal in my eyes, though I would argue worth the benefit of being able to easily migrate away from a decision we make today.
For the sake of completeness, as listed above, I do see more benefits for the multiaddress format beyond being intuitive to read and write.
If I am not mistaken the new peer ID format using a CID does involve both a multihash and a multibase. https://github.com/libp2p/specs/blob/master/RFC/0001-text-peerid-cid.md I have not been involved in this design, thus I might be missing something here.
Correct. The same applies e.g. to the support of peer IDs with the optional support for ECDSA. https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md Now this in itself is not an argument. (Two wrongs don’t make a right.) I would argue that this is not a big issue. In case we do decide to support an additional hash function, one would first need to roll out the reading and later on the writing side on a live network.
With the above mention of WebTransport needing multiple certificates in the same multiaddress, would you stick with your proposal @tomaka?
I don't have a strong opinion here. I moved the discussion to the WebRTC specification pull request. I hope you don’t mind. https://github.com/libp2p/specs/pull/412/files#r926205076
I don't think the two are mutually exclusive. I.e. we can have both a vision and prepare for the fact that this vision will most likely be (partially) wrong in the future. Happy to give feedback on a vision document, though I can not commit to actually writing one. |
The alternative suggestion that immediately comes to mind is: why not advertise two (or more) multiaddresses, one for each certificate?
Asking the operator to run an implementation is also not great. It means that the operator cannot write a script that generates a certificate and obtains the corresponding multiaddress, which is a problem if you want to automate the deployment of nodes.
I have forgotten to include this, because to me a peer ID isn't part of a multiaddress, but this is an off-topic question.
None of the benefits that you list above are relevant to the discussion about whether All your arguments against what I'm saying all boil down to "we use a multibase+multihash because we want to make multiaddresses extensible", but at the same time you don't answer the points in my "Counter-argument: extensibility" section that explain that
I don't think we have the same definition of "vision". To me, the entire point of a vision is to lay down the known knows and the known unknowns, in order to be able to know what is set is stone and what isn't. "Wrong" is neither a "known" nor an "unknown", it is a third category. A vision can be wrong, but a wrong vision should be seen as a systemic failure, and the potential of a wrong vision is not a reason to program defensively. |
I will let @marten-seemann comment on this as the WebTransport expert.
Correct. Extensibility is the benefit I see in favor of multibase+multihash.
Would you mind enumerating the points I did not address? In regards to In regards to eventually rolling out support for a hash algorithm beyond sha256, I agree that we have to touch the code either way, still I would argue that using multibase+multihash requires us to touch less code.
Unfortunately I agree that we diverge here. That said, I am optimistic that we can find common ground. |
The If you had multiple WebTransport multiaddrs, you'd either
That's actually a feature, not a bug. How many certhashes you include depends on how long you expect an address you once advertised to be cached by potential clients. With 2 certhashes, your address can be cached for (at least) 14 days. With 3 certhashes, you're at (at least) 28 days, etc. For most use cases, 14 days is probably plenty, but it's nice to have the option to extend that period. |
I don't suggest to encode the expiry date in the multiaddr. I mention that this gives the option to other protocols (similar to Kademlia) to provide an expiry date alongside with a multiaddr, as some kind of metadata.
Option 3: examine all the multiaddresses known for a peer before trying to connect to it, and pass to the
Sorry but I really don't understand this attitude. The entire point of a specification is to guarantee interoperability. You can't have an implementation accept multiaddresses with a maximum of 3 certificates and other implementations accept multiaddresses with a maximum of 5 certificates, as otherwise a multiaddress with 4 certificates will work on some implementations and not others. |
Attaching a TTL to particular addresses would require a major overhaul of how we advertise addresses, at all places where we advertise addresses (Kademlia, Gossipsub, Rendezvous, mDNS, just to name a few).
I don't see any reason to impose any limit at all. There's no security risk here, there's no way an attacker can exhaustively list all possible hashes. There's also no (additional) DoS risk here, an attacker can already send arbitrarily long multiaddrs. I'd be happy to add this as a requirement to the spec, something along those lines: "Multiaddress parsers MUST NOT limit the number of certhashes that they accept in a multiaddr". A PR would be appreciated. |
I'm not suggesting that we overhaul everything. There is no expiration date attached anywhere at the moment, therefore there is no overhaul needed. The context is comparing the solution of one multiaddress with multiple certificates (without any expiration date attached) with the solution of multiple multiaddresses with one certificate each (without any expiration date attached). My point is that in the future if we were to attach expiration dates, it would make more sense to attach them to individual multiaddresses, because it's not just certificates that can expire but multiaddresses as a whole.
Because computers don't have unlimited memory.
That seems like something that needs to be fixed. |
Adding my perspective on the above discussion whether to impose a limit on the number of I don't think we should impose a limit on the number of |
what about https://datatracker.ietf.org/doc/html/rfc8122#section-5?
|
It then mentions that "md2" and "md5" must not be supported, and that "sha-256" is preferred. I personally don't think we should support "sha-1" as it's deprecated. |
Outsider's opinion: @tomaka have raised some very good points 👍 with regards to multiaddresses, certhash and multibase/multihash. The fact that certhash has to support N hashes and many different formats seem to clash with webrtc requirements where we only need 1 hash and it can be "sha-256" (an option to upgrade is nice imho, although it may be safe to assume? sha-256 will be a default choice for years to come). I'd also go for |
fyi: webrtc-rs lib only supports |
I think the recent comments above miss the option of #130 (comment) suggests the option of introducing I understand that using
Instead of specifying the supported hash algorithms in the multiaddress specification, how about explicitly limiting the supported hash algorithms in the webrtc specification (libp2p/specs#412)? E.g. something along the lines of:
In case we ever want to support sha512 we would update the WebRTC specification. Such update would need to consider the many implementations and would need to detail an upgrade path for existing networks.
@melekes can you expand on these attack vectors. I am blanking on this, sorry. |
Just because the option exists doesn't mean that we should support it.
I'm trying to give precise reasons in these comments, rather than just "in my eyes".
As I mention in the same comment, I'm not advocating for What I'm advocating for is something like
This is a good thing to do, but what I'm advocating for is making multiaddresses less confusing.
I don't know whether you're presenting this as a negative point, but to me this is a positive point. |
see libp2p/specs#412 (comment)
good analogy 👍 |
As a positive point. Rephrased: I think it is a good thing to force such change, e.g. sha-512 support, to go through the libp2p specification process, thus keeping the many implementors from the many implementations in the loop and making us think about a good upgrade path. |
libp2p/specs@7009f94 adds the above suggestion, namely to specify
I don't understand how the hash algorithm and base encoding relates to these attacks. Can you expand on that @melekes? |
This seems like overspecifying things. Is there any precedent in our stack where we say "you need to support decoding this multihash and that base encoding? |
I am not aware of a case for multibase and multihash. Though related is the peer ID specification requiring ed25519 for all implementations.
https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md |
What is the point of a specification if there isn't even any intersection between what every implementation must support? Why even have a specification at all and not just let every implementation define their own custom multiaddr format for WebTransport and WebRTC, then? |
Update for multiformats/multiaddr#130 and multiformats/multiaddr#131 Even though I disagree with these protocols, I guess the way forward is to conform.
needed for WebTransport (and potentially WebRTC)