-
Notifications
You must be signed in to change notification settings - Fork 88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add e2hs file format spec #368
base: master
Are you sure you want to change the base?
Conversation
BlockTuple := CompressedHeader | CompressedBody | CompressedReceipts | ||
----- | ||
Version := {type: 0x3265, data: nil} | ||
CompressedHeader := {type: 0x03, data: snappyFramed(rlp(HeaderWithProof))} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a new type it should use a unique type number
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I thought, but then I saw that era1
and e2ss
use the same version number, so I wasn't sure what to do. But I agree it should be something unique
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the only type that is reused is 0x03
for snappyFramed(rlp(header))
, but it's the same for all 3 existing types: era
, era1
and e2ss
.
You define new type snappyFramed(rlp(HeaderWithProof))
, so it should be different.
CompressedHeader := {type: 0x03, data: snappyFramed(rlp(HeaderWithProof))} | ||
CompressedBody := {type: 0x04, data: snappyFramed(rlp(BlockBody))} | ||
CompressedReceipts := {type: 0x05, data: snappyFramed(rlp(Receipts))} | ||
Accumulator := {type: 0x06, data: hash_tree_root(List(block_hash, 8192))} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would look into the shortfalls of historical roots and summaries, Beacon slots can be empty, so I am not sure if this ridge requirement would work for the other accumulators, so I am assuming this accumulator section is under specified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tbh I'm not entirely sure what the best approach is for this field. I'm not even sure it's necessary, since all the headers will have accompanying proofs that can be directly verified. I need to double-check exactly how it's handled, but even in era
files they have a complete list of block_roots
even if there are missed slots, so we can probably just copy their approach for handling missed slots. But I'll dig into this
Then my final question to everyone is does this Most links are to PR's to projects, or lined commit links. So I am not sure if this should belong in the |
Also, just noting here that this spec will also need to handle the final, pre-merge truncated epoch, so that post-merge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I'm in favor of this direction
BlockTuple := CompressedHeader | CompressedBody | CompressedReceipts | ||
----- | ||
Version := {type: 0x3265, data: nil} | ||
CompressedHeader := {type: 0x03, data: snappyFramed(rlp(HeaderWithProof))} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the only type that is reused is 0x03
for snappyFramed(rlp(header))
, but it's the same for all 3 existing types: era
, era1
and e2ss
.
You define new type snappyFramed(rlp(HeaderWithProof))
, so it should be different.
BlockTuple := CompressedHeader | CompressedBody | CompressedReceipts | ||
----- | ||
Version := {type: 0x3265, data: nil} | ||
CompressedHeader := {type: 0x03, data: snappyFramed(rlp(HeaderWithProof))} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would use snappyFramed(ssz(HeaderWithProof))
, because we don't encode HeaderWithProof
with rlp anywhere at the moment (but we do ssz encoding). And even if you would like to rlp encode it, i think it wouldn't work well.
I think BlockBody and Receipts are currently encoded both with rlp and ssz, depending on the context. But if we encode HeaderWithProof with ssz, I would also encode body and receitps with it as well (for consistency).
|
||
``` | ||
e2hs := Version | BlockTuple* | OtherEntry* | Accumulator | BlockIndex | ||
BlockTuple := CompressedHeader | CompressedBody | CompressedReceipts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in comparison to era1 format, we lost total_difficulty. Is reasoning that we don't need it now that proofs are already part of the CompressedHeader
?
Would we need it in some other context?
The file format is defined as: | ||
|
||
``` | ||
e2hs := Version | BlockTuple* | OtherEntry* | Accumulator | BlockIndex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is OtherEntry?
Why do we have to align So if we want one clean type, primarily for portal network usage, I would suggest that:
Basically, it would be each of the HistoryContentValue types, encoded as they are in portal-spec (ssz). And potentially some other meta-data, like starting index, total_difficulty, etc. |
Ahh, that's a fair point. The only strong argument I can see for "aligning" with I'm ok with ssz-ing everything, I think @KolbyML might have preferred rlp-ing receipts/bodies in a side chat? I'm a little unclear in my understanding of the purpose of the
Which I would understand as an index for each block is necessary for indexing directly to a specific block without iterating over the whole file. Though, looking at our I can't think of any reason why the |
I just want to do whatever is less work. The nice thing about rlp is we know the format will never change. But maybe we will change our format for ssz encoding them? I am fine either way, ssz-ing everything just seemed like it was more work, for little to no gain. But if there is a gain I think we should do it, it wasn't apparent to me there would be value, expessially as Our ssz bodies and receipts are just wrappers around rlp, it seemed pretty pointless, when we already have the infrastructure in place, why create custom types for something where there is already a good solution i.e. why reinvent the wheel when nothing is broken. For HeaderWithProof it makes way more sense for it to be in ssz, as we don't have code to encode/decode rlp for it. HeaderWithProof is far more ssz native, hence why I said what I did |
Regarding Regarding rlp vs ssz, I'm not strongly set one way or the other... The benefits of ssz:
|
This is a proposal for a new file storage format to be used in the History network. The goals of this format...
This format will require additional architecture to generate new
e2hs
files for each finalized epoch, as well as some bridge work to support this format.Maybe the only concern I can think of is that the new era files will be based on a portal-defined type, rather than more "established" native ethereum types. so, let's say we want to change HeaderWithProof at some point in the future, we will have to re-generate all of the files (though, imo after the recent union removal, this type has solidified).
Any feedback on the specifics or any pushback on using such a format would be great, there might be other short-comings to this idea that I haven't noticed.