-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Protobuf Transaction Signing #6078
Comments
I do want to say that after considering the above, my personal preference is to go with approach (3) (signing the raw binary) as default and to allow (4) Amino JSON for a limited period of time while clients transition. (3) seems to offer the simplest UX for clients as well as the least number of malleability issues. Given that Supporting (4) temporarily through a flag doesn't seem to introduce any new issues that aren't already here, and has the big benefit of not disrupting wallets, exchanges, etc. that don't have time to transition overnight. |
Thank you for writing this up @aaronc ! Here are the problems I see with 3. This basically means that you can't decode the transactions inside your Trusted Computing Base and you need to extend your trust boundaries around signing to enclude another system that will deserialize your proto bufs for you. This is going to be a disaster for practical security. Strongly prefer 4 and 1. Because AMINO Json serialization in good enough form is pretty easy to implement and widely available. proto3 cannonical json actually looks okay but not really sure how wide spread support for this is. I think prost doesn't support this? |
@zmanian why is protobuf deserialization in embedded environments so hard? Have you looked at stuff like https://github.com/nanopb/nanopb? Even if that didn't work, I don't think a hand decoder should be too hard. The JSON stuff in the ledger app appears to be mostly hand coded. I'm fine with enabling Amino JSON for compatibility, but we should have an alternative going forward that just requires .proto files. |
Regarding this:
...it's reminiscent of the An alternative is to make This gives you a strong binding between an algorithm and a public key, known a priori to the verifier, and with that, a signature can name the "SPKI" (or SPKI hash) |
Some discussion here My guess would be that nanopb would be too big for the ledger. Adding a C deserialization library seems very much against our goals of minimizing C in the TCB. |
Also pretty strongly support the idea of binding the serialization system to the public key to support forward migration! |
The Protobuf Schema language lacks descriptions for the (maximum) size of variable-length fields. This means it isn't possible to codegen a struct with a fixed-sized APDU-like structure which is typically used in heapless (e.g. microcontroller) environments. Libraries which do support Protobufs in embedded environments therefore tend to work a level of abstraction below a typical Protobuf library which generates message types directly from the schema. This is also bad from a code size perspective, because it punts all of the work of size-checking the underlying fields to the end user. |
Okay, so that's definitely a consideration for supporting multiple algorithms. I do want to note that we're talking about supporting maybe 2-3 algorithms and
Would there be an option to change the binding to a newer system?
In that thread it seems there are at least 2 users who hacked together a way to do it. So definitely possibly, just not standardized in a library yet.
How is this any different from JSON? Strings/byte arrays have no max length in JSON either. If you had fixed sized arrays, you would need to truncate strings with either json or pb. But I'm not even sure why you'd need to do that sort of copying into a struct. From my quick glance at the ledger app source code, it seems like one of the key things it's trying to do is display info in fields to users. You could do the same thing in protobuf with zero allocation. Just iteratively navigate through the message like you would with JSON and keep track of what level of nesting you're at (which should be possible with a fixed depth). Strings should be easier to extract from protobuf because you don't need to copy to remove escape chars. And addresses need a small array to convert from bytes to bech32. Anyway, I definitely can understand the concern of not wanting to rewrite a bunch of firmware, thus my support for keeping an Amino JSON option. But these problems with protobuf to me seem solvable... |
JSON and Protobufs are both problematic in this regard. There are some optimizations you can do for JSON as a proper context-free grammar, but ultimately on platforms with low memory the main one is ensuring either JSON or Protobufs have a well-known fixed-sized structure.
I've been working on this sort of pushdown automaton in Veriform, as it were. For some extreme low-end embedded environments, even that sort of thing is too much (our target is a 500+MHz Cortex-A environment, whereas the problematic environments are much lower clocked Cortex-M ones with much smaller stacks) |
I want to jump in with an argument against Amino JSON. Basically, if we keep using that, we will need to support that tooling on all platforms. Much less work than binary Amino, but over 2 years there has been amazing little work by the core team to port any of Amino, and assuming all this magically happens "from the community" is wishful thinking. If we need to stay with Amino JSON, then I would say that porting client side libraries in major languages (not just JS and Rust, but say Java/Kotlin and ObjC/Swift at the minimum, ideally python and some more) comes into scope as part of this migration. If we can use tooling that already works out of the box in all these languages, than that is much easier. The strong valid criticism I see above is that it is hard to parse Protobuf in an HSM.
I encourage you to look at the Ledger app that Juan (same dev who wrote Cosmos Ledger app) wrote for IOV: https://github.com/iov-one/ledger-iov There isn't too much code there for the parsing. Actually grabbing a few fields from a predefined protobuf format is quite easy, and doesn't require parsing arbitrary structs into memory. It parses the Protobuf signing format we use for transactions and displays it to the users. It didn't take him that long to do it (I believe less than the original Cosmos app). I just want to say that using protobuf is not some new and untested idea, but was used for over 2 years now and on a mainnet. That IOV has no clue about business and marketing doesn't mean the code there has no technical merit. I advise you to borrow liberally. |
Okay, well sounds like this problem would only be solved by something like ASN.1 which isn't really an option. But either way sounds like we can actually deal with the memory problem with an event driven parser as opposed to decoding into a struct. So then the issue becomes code size, right? Here I can see one potential argument in favor of JSON. JSON is self-describing so if you don't have the schema, you could still iteratively display each raw field on the JSON on a device like the ledger. To do that in protobuf, you would need to include the schema for every type or limit support to just a few types to limit code size. Is that one of the tradeoffs you're seeing @tarcieri ? One point I will grant to Amino JSON is that it bech32 encodes addresses, pub keys, etc. Say you wanted to iteratively display every element of a JSON object in the ledger without the schema, with protobuf you would get base64 whenever you don't know the schema. Maybe that's not a hard problem to solve, but I do see it as a valid good thing about Amino JSON. To do this with protobuf we would need a custom JSON serialization format which maybe indicated the bech32 type of Anyway, I will include this as a new option (5) and update the pros and cons of the other options to reflect this discussion. I do still think there is an elegance to approach (3) and if it did work for embedded devices, maybe following the example in https://github.com/iov-one/ledger-iov, that might make everyone's lives easier. |
This self describing nature of Javascript allows the Cosmos app to work any number of chains out of the box. |
That and the fact that they all use a superset of the schema the ledger app understands. It checked for eg. Note that with option (3), protobuf is also self-describing to a degree. At least you can check the type they claim and will not mix up |
I want to caution against weighting this case too highly in the context of a generic user interface. A user interface that is made up of simply printing out the json key/value pairs without understanding the underlying message format will yield a strictly poor user experience for most "end" users. I feel like there is an important distinction between the audience of developers using the system and the users the developers are intending to support. The developers should be familiar with Proto and the tooling required to deal with it ... so the Json step is strictly speaking an extra effort that may or may not support the needs of the users the developers are working for. |
To further emphasize this, I believe Ledger has required moving from displaying raw JSON to a UI which extracts, displays, and confirms values in the message for exactly these reasons.
For what it's worth, ASN.1 DER solves this problem poorly as well. Most embedded implementations of DER are actually BER parsers that don't even verify the BER is canonical (and therefore DER). The most embedded friendly formats follow an "APDU"-like structure (i.e. fixed-sized fields everywhere). You can get similar properties out of either Protobufs or JSON if you ensure all of the fields in either a message are constant-length (by using e.g. |
@tarcieri Would you consider Cap'n Proto embedded friendly? (Not that it's an option anytime soon...) |
Not particularly. Cap'n Proto is significantly more complicated than Protobufs (see for example message segments and inter-segment pointers) |
Great stuff here, everyone. A few additional 👍 / 👎 for the main list that you are free to merge somehow: (1) Protobuf canonical JSON Pros:
Cons:
(2) Pros:
Cons:
(4) Cons:
(5) Pros:
(multiple signing algorithms?) Cons:
@zmanian if there was a JSON document that is signed, would you expect a JSON->proto conversion to be possible, i.e. you only operate on JSON and assume this can be translated back to the transaction format understood by Tendermint (proto binary). Or would it be sufficient to create a JSON document from a proto document (one way function proto->JSON), which is then sent to chain? Amino allows two-way mappings, but this is significantly harder to get right than one way mappings. While IOV's Ledger app is great for IOV's use cases, I think it is important to note that as of now it only supports a single message type (with ongoing work to add a handful more). Chain ID and address prefix are compile time constants with a just a boolean testnets/mainnet flag. |
Not true. Maybe re-read how I framed it and look at how |
Sorry @aaronc, you're completely right. What would probably help (at least for me) is to repeat in the top what we are talking about: the ecoding of the |
Just
My take here is that having schema aware signers should be an enhancement of the baseline signing experience. Prior art in Ethereum, Bitcoin etc is that if your signer isn't aware of the schema you are signing then you are signing pretty much opaque bytes. By using json as the signing target, you get an enhanced experience if the signer is aware of the schema and fall back to something somewhat human readable. I can imagine a format that is easier to implement in Rust and other languages than Amino JSON. But also aprprox 10 LOC to implement. I'm generally in favor of 1 or 4.
It's totally fine if you need a protobuf schema to turn the json into bytes on the wire. The general pattern is that singers are resource constrained and difficult to update settings and serializers are much more flexible. |
@zmanian okay, that's level 1 independence. Level 2: what if it was not possible at all to map back JSON->proto?. Instead you must use the proto doc from the composing environment plus the signature:
|
So I'd like to take 4 (Amino JSON) off the table as a long term solution. Short-term, sure. But we you want something like Amino JSON, let's consider 1 or 5 where all of the information is in the .proto files. |
@webmaster128 added your pros/cons to the main list (except the Amino JSON con which was already there worded differently) |
I love this diagram The composing environment is expected to know how to take an unsigned prototx and a signature and turn it into signed bytes.. The verifier is expected to know how take signed bytes, generate a SignDoc and verify a signature.
|
Glad to hear that. Where I am heading to is: proto->JSON serialization is going to be non-trivial, but I am sure it can be done as described in (1), (4) or (5). The reverse operation (deserializing JSON->proto) however is specified very openly, allowing all kind of JSON variants that lead to the same proto document. This starts with allowing both numbers and strings to deserialize to When the JSON representation is only used for signing, we lose the current flow of broadcasting signed JSON to the REST server, which is a good thing in my opinion. I believe a client (tx compose and breadcasting environment; no privkey here) should be able to operate on proto, given a Cosmos specific wrapper around a general purpose proto lib. But I want to make sure there is consensus on this. |
I do want to re-iterate that there is something pretty elegant about (3) - just signing the raw binary. All of the JSON solutions including the standards-based approach (1) require both a) a fair amount of additional client library support and b) substantial auditing to check for edge cases and malleability issues. It seems that the biggest benefit of the JSON solutions is that we could just show raw JSON to users of the ledger if the ledger app doesn't have the full proto schema. But this convenience does come at a cost elsewhere. With approach (3), you have both the least surface area for transaction malleability issues and the easiest implementation for composing and verification environments. For hardware signing environments, there is going to be complexity whichever approach is used. Is the benefit of being able to show raw JSON as a fallback worth all the additional complexity elsewhere? |
yes the whole strategy of signing the raw binary produces a system that is far too cumbersome to extend. an isolated signing environment brings little benefit if you don't have access to secure display to confirm what you are signing. The weak link in blockchain protocols are the humans that interact with them. I can't emphasize enough that I am overwhelmingly opposed to signing non-self describing dataformats. |
I'd suggest taking a look at Adam Langley's blog post about extensibility. I think it winds up being pretty important: https://www.imperialviolet.org/2016/05/16/agility.html
That sounds great! Although to really make it work, you need a Protobuf implementation that allows you to extract the unknown fields so you can check if any are critical. |
Thanks for sharing. So I think the big takeaways are make the extension system simple - in protobuf this is basically adding new fields - and allow newer clients to work with older servers - thus the desire to not just reject all messages which clients send with new fields but only ones that are flagged as critical.
Yeah, we might need to do something hand-generated that takes the message descriptors and flags unknown critical fields. |
I'm not terribly familiar with the Go Protobuf ecosystem, but I believe you can use |
Yeah unfortunately it would break other stuff to enable that so it would likely need to be a separate parser pass. |
We'll be discussing the topics related to this, and more specifically any outstanding blocking issues with ADR020 tomorrow on our bi-weekly architecture review call. https://hackmd.io/UcQfA-JRQjG1zfKKWxH1gw Zoom link in the HackMD if anyone from this thread would like to join. |
Looking at the protobuf document, this is no issue: every message is stored in an
You need to parsing steps anyways (either explicitely or hidden by gogoproto): (1) the outer document including the So the extra fields we are talking about is when a client has a different message schema for the same type URL with extra fields? Well, then I guess the chain is the source of trueth and the client can add whatever data they want to a given message. But this is a matter of message type upgradability, not signature verification. |
Yes, this is about message upgradeability, not signature verification per se. Although the canonicalization approach to signature verification prevents upgradeability (in the direction of newer clients interacting with older servers). |
So summary from our architecture review call today (https://hackmd.io/UcQfA-JRQjG1zfKKWxH1gw?view), which focused on ADR020 (#6111) We had consensus on the approach laid out in this PR, with a few minor additions:
Oustanding question:
EDIT: It is expected that the resolution of this @aaronc Can you lay out more details for this outstanding point here? |
There is a pretty big discussion that has been unfolding related to #6111 and #6174 in those PRs and now on discord. I think it has gotten to the point where it deserves its own list of alternatives with pros and cons. I will make my own attempt below. The general context of this discussion is the UX of a multi-signer transaction. I will frame the various alternatives under three broad umbrellas - (1) Closed signers, (2) Open signers, (3) Open then closed signers. (1) Closed signers (the current Cosmos SDK model)Definition: the list of signers is fixed by the content of the messages. Extra signers are not allowed. Every signer must sign over the list of other signers Multi-Signer UX:
Pros:
Cons:
Variation (a): Don't ask for pubkey and signing mode in step 1This is effectively the current SDK model which doesn't include signing modes Compared to variation (b):
Variation (b): Ask for pubkey and signing mode in step 2This is model proposed in #6111 Compared to variation (a):
(2) Open signers (the Weave model)Definition: extra signers not specified by messages are allowed This is Weave's model. (@webmaster128 if I get something wrong here please correct me) Multi-Signer UX:
Pros:
Variation (a): Original Weave model - no signature limitAs @webmaster128 shared, this had the following cons:
Variation (b): Fixed Weave modelAccording to @webmaster128:
I believe this is the actual issue: iov-one/weave#1091 While this solves some issues, it still appears to be exploitable (??), i.e. if the message was intercepted an extra signer could still add themselves and screw things up for the fee payer, is that true?? (3) Open then closed signersThis is the approach proposed in #6174 Multi-Signer UX:
Pros:
Apologies for this being such a long post. I really want to get to the bottom of this and agree on an approach and move forward. In my personal opinion, approach (3) seems to have the most benefits and least tradeoffs for multi-signer transactions. It allows the benefits of (2) without the downsides. Any variation of (1) would be the status quo and acceptable. I don't see a huge difference between (1)(a) and (1)(b) - it's still the same number of phases and requires fixing signers before signing. I would prefer the guarantees of (b) but at this point whatever. Neither seems to actually give a very good multi-signer UX which I thought was the point of this. That's why I recommend approach (3) #6174 - good security guarantees + good multi-signer UX. |
I feel like 3 is the correct direction as well. |
Thank you @aaronc for the great summary. Here are a few additions/remarks and then separated my person vote. (1)
Regarding Con:
The point does not apply to a multisig-pubkey, which is a single signature that can be satisfied by any signatures of the group that combined make up a valid multi-pubkey signature. For the required signers of the individual messages, it the required signers are deterministic already. (1a)
The tx composer does not even need to ask for the addresses. They exist automatically as part of the messages. (2) There is a significat difference between (1a) and (1b): The list of required signer addresses is available for free locally in the messages. Needing to know the public key for each required signer address can only be automated if all pubkeys are stored on-chain, which is not given for new accounts. The communication overhead of needing to know the signing modes of each signer is relevant as it increases the roundtrips of human communication. Additionally it violates separation of concerns. My vote goes to (1a) as it
ps.: there is a variant of (3), let's call it (4) which only allows a single top level signature and keeps all other signatures at a lower level. It comes with the same two-phase signing process as (3) but simplifies everything. Most transactions only need a single top-level signature (which can be one of a multi-pubkey groups) and the few transactions multi multiple inner signatures can be processed in two steps. It is a radical change. Let me know if someone is interested to learn more. |
But that's not true because they still need to agree upon the list of addresses to include in the messages. If you wanted to coordinate 3 signers out of 5, you would still need to ask all 5, then get an ACK from 3 of them and then compose the messages and then have them sign it. I acknowledge that the weave model made this simpler and I think (3) gets pretty close. |
Based on discussions on discord, I have updated my original PR #6111 with the following:
I agree with @alexanderbez that we should try to keep our initial implementation as simple and as secure as possible. Even if we did come to consensus that (3) provides a better UX, it is beyond the scope of our current work to address this. But I think it's good to document the alternatives and then see what users ask for in practice. (1)(b) seems to be the simplest to implement with the most security guarantees and wouldn't preclude (3) and/or (1)(a) from existing as alternate signing modes in the future. |
What is left to get the work started? What needs to be merged? |
#6111, then maybe an init PR with the proto file and then implementing the signing logic. |
There are two types of multiple signatures currently discussed:
The later results in a single (nested) top-level signature, i.e. is one signer no matter which group participants signed. So no coordination required on which participants will sign. The first one is not suited to implement n/m multisig at all because the nature of the |
Problem Definition
The Cosmos SDK has historically used Amino JSON for signing of transactions whereas Amino binary is used for encoding.
During the SDK's migration to protobuf, we had made the preliminary decision to use a canonical protobuf JSON encoding for signing as described in https://github.com/regen-network/canonical-proto3.
As a consequence of #6030, the Cosmos SDK is moving in the direction of using protobuf's
Any
type for the transaction encoding and signing format. In this discussion, a number of participants have asked that we revisit the transaction signing discussion. The options that have been discussed/are available for consideration are outlined below.It should be noted that it is theoretically possible to support more than one of the following options via an enum flag on the signature. Whether that should or should not be done is a related question.
Proposals
Feel free to suggest updates to these alternatives in the comments
(1) Protobuf canonical JSON
The official proto3 spec defines a canonical mapping to JSON. This is not really deterministic, however, so we define a canonical encoding on top of that using https://gibson042.github.io/canonicaljson-spec/.
Pros:
Cons:
Timestamp
, although that malleability is likely un-exploitable.some_field
becomessomeField
(2) Protobuf canonical binary
This involves re-encoding the protobuf used for transaction encoding canonically for signing - meaning that fields must be ordered and defaults omitted. This is how Weave does signing.
Pros:
Cons:
Any
(because the full type URL is included andoneof
's are not copied manually) and breaking change checkers like Buf/Prototool(3) Protobuf binary as encoded in transaction
This simply uses the protobuf encoding as broadcast in the transaction. This becomes a little easier for both signature creation and verification because of
Any
(although it could be done without Any too). BecauseAny
wraps the raw bytes of thesdk.Msg
, it is pretty easy to use these same exact bytes for signing and verification and only require thatSignDoc
itself is encoded canonically, rather than everyMsg
Pros:
Cons:
Any
(4) Amino JSON
This is how tx's are signed currently. The reason this is under consideration is because breaking Amino JSON signing would break many clients especially the ledger app. Transactions could still be encoded with protobuf and the
/tx/encode
endpoint could accept Amino JSON and return protobuf rather than amino binary for tx broadcasting - some upfront work would be required to enable this but it is possiblePros:
Cons:
Signature
so maybe this is a non-issue(5) Custom Proto JSON
Extend (1) to support custom encoding of certain types like bech32 addresses
Pros:
(multiple signing algorithms?)
Cons:
Related Question: Should we allow multiple signing algorithms?
Theoretically we can allow clients to use multiple signing algorithms and indicate which one they used as an enum flag on the
Signature
struct.Pros:
Cons:
For Admin Use
The text was updated successfully, but these errors were encountered: