Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Metadata for offline signers #46
Metadata for offline signers #46
Changes from 32 commits
2fbe803
8d05445
36eb717
acffee2
0f800bf
8870436
2c006b4
9b2beb2
888223a
0bb74b0
c2708a8
4f7d07b
439e119
ab1c0d4
4c42ca9
613d1bb
674498d
5ea8c9b
9c4ccb0
9418178
4f317c5
7b472d1
4763f2a
b192d19
958e934
a2d8791
6d9603b
10bfa00
17b046c
21dc94a
78b8e67
ebfa55f
a00ecf7
91e9bdf
2de8cd7
6cb7381
601c623
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't 100% understand this. Do you mean something like:
Additional information (on top of the metadata) that's required to decode extrinsics and runtime constants for a given runtime version must be included in the digest?
But what are those information? Shouldn't metadata itself includes everything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, metadata does not contain all information. A few extra things like base currency name ("DOT") and base currency value in
planck
s are often (not always) not included there. This is a known issue that required us to separately transfer metadata and "additional network specifications" to cold wallets in past. Here we just combine those.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If metadata does not contain all the necessary metadata, I will suggest just fix the metadata. I raised a somewhat related issue before paritytech/substrate#3686
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice solution; however, for things like "units" we'll need more robust automation, those are not arbitrary values, but items declared in runtime. We'll need to make them properly coded part of metadata, otherwise parachain upgrading will end up in mess.
I'm all for fixing metadata issues, but I think this should be done one at a time. This particular issue that units are not mentioned anywhere was discussed for a long time and no effective solution was proposed. I think we should move on with this project and then, if there is a proper way to include units in metadata for all chains, we'll keep wasting couple byte in transfer protocol for empty values and then eventually remove this extra information if an upgrade to shortened metadata happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By this you mean that we may change the version to fix a critical bug and we only need to roll this out on chain to fix it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, and all wallets would have to explicitly change this value, no chance of partial compatibility here. There are many strategies for communicating correct version per chain to cold signer if these diverge for some reason, I think it's out of scope here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get what you mean by this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say that the device may choose not to verify the data. It'd be like current situation with PJS extension for example, that gets the metadata and uses it to interpret the extrinsic, without verifying its authenticity (probably because it can't).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's something we have no chance of enforcing here really. So we just state that it is not provable at this part of the architecture and expect the possibility of absence of this check when designing security strategies (for example, this would be required check in app code audits).
Also, tools like staking payout machine do not really need to check any metadata authenticity since they are limited to small number of calls and are often connected to petty cash accounts that are not worth being attacked anyway. We have an expetimental project ourselves where a signer resides on a physical key for physical lock, it does not need metadata verification as it only signs very narrow range of extrinsics (and it would severely degrade performance to do the check as hardware there is really humble); this is alternative - and very centralized in a sense, you just trust specialized firmware developer - approach to the same problem. To support signers like these we need to keep this optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean the chain doesn't really care if you checked the metadata or not or if you just added some hash that you got from someone. This is just some way to make it able for people to ensure that they got the correct metadata. TLDR, why do we need to list this here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just as a reminder to everybody that this is a fact - you should not rely on this actually happening. Explicit statement that proving that the check actually happened is not planned at all, just that the tool that could have done it had all required information in genuine form. I think this is important for future security analysts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know why "one at a time" is "ultimately" because parallel transmitting could be more efficient when appliable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should be all varieties of options ideally, to fit growing metadata into as small footprint systems as possible. There should also be options to send it in bulk data-efficient format where transmission channel is limiting. This will ensure that we are future-compatible with unpredictably developing hardware (well, and software) systems. Proposed solution solves this challenge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these chunks really self descriptive? Self descriptive for would be for me something like using
json
for transferring them.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, they are if you accept that under SCALE-encoding nothing is self-descriptive by itself, but it relies on
RuntimeMetadataV14
registry type and basic SCALE types.Both parts should know what
Registry
and SCALE is.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean they are stripped, when sending them to the offline signer? Does this mean that each variant is a new chunk?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily. When sending data to the device, only used variants are sent (all in one type). This differs from the proof generation, which is explained afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The manner of these being sent to device is not specified really, they could be sent as separate chunks or as fewer variant enums or even as full enums, why not. What is important, they are indeed separate chunks for proof generation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is
ExtrinsicMetadata
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this is string, not u32?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we've seen parachains and standalone chains where this was not
u32
. Those went extinct some time ago though. This is not forbidden anywhere and people are experimenting sometimes, there is no reason to punish them. Or is there? We certainly should change it intou32
(it would be easier and cleaner indeed) if this is planned to be forced, but that has to be documented well in chain developer docs first, which is out of my reach.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is SCALE encoded right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about for the chain that does not have native token or multiple native tokens? I guess we can pass 0 and empty vec for symbol but best to be explicit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure we need to explain how a merkle tree works nor what the naming of the individual parts is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just for conventions - what exactly notation would be used further. Not sure it is used much right now, but I don't see how this hurts either. Might help ask questions for reviewers in same definitions to speed up this process a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really part of the stuff you are describing here? I have the impression that you are describing here how to build the hash, not sure we need to explain how to add it to the signed data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a general flow of the whole construction; the (4) just indicates final destination of result, as mentioned elsewhere (we say that we are adding a constant to
additionalSigned
and do not give any name to that step; thus here we just repeat those words to connect flow).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want to use V15?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We clearly should not mention the
docs
field, as I assume that this isn't present in the "shortened metadata" representation. In general this algorithm does not make that much difference, if you don't know the types.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Types registry comes directly from full, non-shortened metadata, that has all fields, and this (1) is the stage where it loses
docs
, not earlier. As we describe algorithm here, this is an essential step. The only other way to describe this would be copying all other fields to new structure, but that is just longer description of the same step.Also, there is ambiguity about docs field missing altogether vs being blank that would probably surface across implementations; this statement should resolve it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it the other way around, you associate chunks with leaf indices?
In general I would also by intuition, order all leaves in ascending order. Then I would chunk them into two and hash them. The output of this would follow the same procedure until only one hash (the root hash) is left.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean generally this is the same as you explain below, but yours sounds somewhat more complicated 😅
Maybe we also don't need to explain this, as this is the default way for building a binary merkle tree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are other approaches; this is kind of canonical for CBMT, but there could be implementers that do not know canonical stuff. I know we've independently made exactly same implementation for hot-to-cold proof transfer protocol with @carlosala based on natural order of universe, but I'm afraid here we should be a bit more explicit as canonical here is not the best construction path, just a convenient one. Ordering matters here, and also there is an issue of pairing last odd leaf that could be resolved in more awkward or even slightly less efficient way (making tree incomplete, but I'm not sure it's obvious for everyone).
Also, just following my description verbatim, it's possible to construct the tree without knowing much of background. That should just make implementations easier hopefully. I was thinking about also writing pseudocode representation, but that's probably too much already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, I can not directly follow you description verbatim because it is not 100% unambiguously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why you do it for the nodes in different order than for the leaves?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it should be fine to just say that you use the root hash plus the hash of the metadata descriptor as digest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you suggesting removal of line 206? I was kind of afraid this would be a bit difficult to follow if we don't state what's the next step, but yes, we can remove it. Should we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I meant for this entire section, but I don't remember exactly what I meant 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not the entire digest is added to additional signed? Only the hash? The version will already be added to the "signed extension data" that gets signed.
I would also maybe drop
additionalSigned
and just explain that it gets added to the data that gets hashed as part of the extrinsic or not.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we add 1 byte that is version/inclusion byte, and then we add 32-byte digest to that part of data. Not explaining that it is part of
additionalSigned
would just add to unmeasurable amount of confusion around this obscure entity. I lost count of times I've been asked how Signed Extensions work and I spent way too much time figuring it out myself first, probably massively pestered you as well, I don't remember. I'm pretty solid that this should stay here if we want to have any hope of more implementations appearing eventually.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant just to drop
additionalSigned
and describe it in a more human friendly way. Because outside of Substrate it is maybe not calledadditionalSigned
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would I explain it differently? It's a field that's called that way in metadata and at least in major Rust and JS codebases, if we drop the name or change it nobody would know where this belongs. Things that go into signature and their order are extremely confusing. It's not described anywhere (yet) in full. So if we just say that it is added to data that gets signed, the manner of addition would remain unclear - and I'm not describing this step anywhere else in the document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we do already have a version byte in extrinsic format. maybe we can just use it instead of a new byte?