refactor: transaction rlp encoding #1536
I'm not extremely happy with this approach, as it complicates the Transaction object, which is very commonly used (even though the complexity is mostly hidden away). The only other reasonable approach that I see is to have another type (e.g. |
Couldn't we implement the alloy_rlp length trait with the pre-existing code as well? If neither of the proposed refactors is a "happy approach", we can just implement the length trait and call it a day, no? That would also be much simpler than the proposed changes. I think making the manual rlp calls instead of deriving them is fine because, as you said, the "manual" path is hardly used. So is it worth the complexity to implement auto derives for a rare use case in the grand scheme of things? |
Oh yeah this is starting to come back to me, it's been about 3 years! Since you're looking for alternatives, I'll share what we did in py-evm. We found it helpful to have a unified pre-RLP-encoded representation. So that means the primitives of byte-strings and lists (er, vectors), which can contain more byte-strings and vectors. When we correctly serialize into this format, then the RLP library should handle the actual encoding for us (by prepending the length of the payload recursively). So the approach we took in py-evm was to offer two separate methods on the global transaction type:
For a legacy transaction, that looks like:
For a typed transaction, split it into two layers, a wrapping layer that identifies the transaction type and a payload layer that is fully defined by that type ID. The payload layer works effectively the same as a legacy transaction, for serialization. In the wrapping layer:
Then the standard RLP encoding should work on a list of transactions that have each been run through This is a lot, so I'm happy to get on a call, if that would help. |
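A minimal Rust sketch of the "pre-RLP representation" idea described above, written from the comment's description rather than taken from py-evm or this repository. The names `Item`, `encode`, `legacy_item`, and `typed_item` are hypothetical, and transaction fields are assumed to be already serialized to byte strings. A legacy transaction becomes a list of fields; a typed transaction becomes one opaque byte string made of the type ID followed by the RLP of its payload list.

```rust
/// Pre-RLP representation: a byte string, or a list of nested items.
#[derive(Debug, Clone, PartialEq)]
pub enum Item {
    Bytes(Vec<u8>),
    List(Vec<Item>),
}

/// RLP prefix for a payload of `len` bytes.
/// `offset` is 0x80 for byte strings and 0xc0 for lists.
fn prefix(len: usize, offset: u8) -> Vec<u8> {
    if len <= 55 {
        vec![offset + len as u8]
    } else {
        // Long form: big-endian length without leading zeros.
        let be = len.to_be_bytes();
        let start = be.iter().position(|b| *b != 0).unwrap_or(be.len());
        let mut out = vec![offset + 55 + (be.len() - start) as u8];
        out.extend_from_slice(&be[start..]);
        out
    }
}

/// Encodes an item, prepending payload lengths recursively.
pub fn encode(item: &Item) -> Vec<u8> {
    match item {
        // A single byte below 0x80 is its own encoding.
        Item::Bytes(b) if b.len() == 1 && b[0] < 0x80 => b.clone(),
        Item::Bytes(b) => {
            let mut out = prefix(b.len(), 0x80);
            out.extend_from_slice(b);
            out
        }
        Item::List(items) => {
            let payload: Vec<u8> = items.iter().flat_map(encode).collect();
            let mut out = prefix(payload.len(), 0xc0);
            out.extend(payload);
            out
        }
    }
}

/// Legacy transaction (hypothetical helper): a plain list of fields.
pub fn legacy_item(fields: Vec<Vec<u8>>) -> Item {
    Item::List(fields.into_iter().map(Item::Bytes).collect())
}

/// Typed transaction (hypothetical helper): wrapping layer `type_id || rlp(payload list)`,
/// exposed to the outer encoder as a single byte string.
pub fn typed_item(type_id: u8, fields: Vec<Vec<u8>>) -> Item {
    let mut bytes = vec![type_id];
    bytes.extend(encode(&legacy_item(fields)));
    Item::Bytes(bytes)
}
```

With this shape, encoding a list of transactions is just `encode(&Item::List(...))` over items produced by the two helpers; the typed ones pick up their extra byte-string header automatically.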
nit since this is adding a new ability in the rpc, I think naming commit something like this would be clearer: Also, it could make sense to split into a |
Hm, the longer I look at this, the more I think it's potentially an orthogonal problem. I guess it's possible we never dealt with this funny case just when calculating block size, where just the typed transactions are double-rlp-encoded (it doesn't seem familiar to me). So I guess if I don't have any better ideas. 🤷🏻♂️ I guess it's somewhat related how we have to add an extra rlp-encoding to legacy transactions when calculating the transaction root. |
This comment was marked as outdated.
Nvm, I think I was wrong about the performance claims I made, as after reading the implementation for the So with this refactor, assuming everything implements length(), whether through a macro or manually, there should be a performance increase, as all primitives should rely on .len() instead of encoding the RLP. |
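To illustrate why a dedicated length computation is cheaper than encoding and measuring, here is a small standalone sketch that derives RLP lengths from `.len()` alone. It does not use alloy_rlp; the function names `prefix_len`, `bytes_length`, and `list_length` are hypothetical.

```rust
/// Size in bytes of the RLP prefix for a payload of `payload_len` bytes.
pub fn prefix_len(payload_len: usize) -> usize {
    if payload_len <= 55 {
        1
    } else {
        // One marker byte plus the big-endian length without leading zeros.
        let bits = (usize::BITS - payload_len.leading_zeros()) as usize;
        1 + (bits + 7) / 8
    }
}

/// Encoded length of a byte string, computed without allocating or encoding.
pub fn bytes_length(data: &[u8]) -> usize {
    if data.len() == 1 && data[0] < 0x80 {
        1 // a single small byte is its own encoding
    } else {
        prefix_len(data.len()) + data.len()
    }
}

/// Encoded length of a list, given the encoded lengths of its items.
pub fn list_length(item_lengths: &[usize]) -> usize {
    let payload: usize = item_lengths.iter().sum();
    prefix_len(payload) + payload
}
```

A struct can then sum `bytes_length` over its fields and pass the results to `list_length`, so no buffer is written just to learn its size.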
I gave this PR a brief review. We should implement I will give a more thorough review once all the tests are passing. Sorry about the confusion in my earlier messages. |
@@ -123,17 +123,16 @@ impl BlockBody { | |||
/// Returns reference to uncle headers. | |||
/// | |||
/// Returns None post Merge fork. |
this doc is no longer true
PR looks good. I think this solution is fine, as I would want to avoid a wrapper type; it just kinda feels off to me.
let size = None; | ||
// Note: transactions are encoded with header | ||
let size = { | ||
let payload_size = header.length() |
Our Header type here doesn't use the macro or implement length in its implementation of Encodable, so we are still encoding to find the length of Header. Would you want to implement length() for Header in this PR, or make an issue for it so we can solve it in another PR?
This would be the only place in calculating size where we are still encoding to get the length, I believe.
I will do it in a separate PR (no need to open issue about it, I will do it right now).
@KolbyML @carver I created another PR (very similar to this one) where I used a separate type that is just a wrapper around Now that I see both of them, I like that one a bit more because it doesn't complicate the |
I like it more too. I thought it would look weirder in my head. |
Prefer the other approach
Closing in favor of #1539 |
What was wrong?
A transaction can be RLP encoded/decoded with (less common) or without (most common) an additional RLP header.
Currently, we have to deal with the less common case manually, which isn't great. As of now, we have two use cases for it: inside BlockBody and as the block size inside eth_getBlockBy* calls (which is not implemented).
How was it fixed?
Added a const generic argument to the Transaction struct.
The default value represents the most common case, which means that we don't have to specify it in most cases.
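A rough sketch of what a defaulted const generic can look like, not the PR's actual code. The parameter name `WITH_HEADER`, the `raw` field, and the `string_header` helper are assumptions for illustration; here `raw` stands in for the transaction bytes as encoded in the common case, and the `true` variant wraps them in an RLP byte-string header.

```rust
/// RLP byte-string header for a payload of `len` bytes (hypothetical helper).
fn string_header(len: usize) -> Vec<u8> {
    if len <= 55 {
        vec![0x80 + len as u8]
    } else {
        let be = len.to_be_bytes();
        let start = be.iter().position(|b| *b != 0).unwrap_or(be.len());
        let mut out = vec![0xb7 + (be.len() - start) as u8];
        out.extend_from_slice(&be[start..]);
        out
    }
}

/// Stand-in for the real transaction type. The default (`false`) is the
/// common case, so plain `Transaction` needs no explicit argument.
pub struct Transaction<const WITH_HEADER: bool = false> {
    /// Placeholder for real fields: bytes as encoded without the extra header.
    pub raw: Vec<u8>,
}

impl<const WITH_HEADER: bool> Transaction<WITH_HEADER> {
    /// Encodes with or without the additional RLP header, chosen at compile time.
    pub fn encode(&self) -> Vec<u8> {
        // A single byte below 0x80 is its own RLP encoding and gets no header.
        let single_small_byte = self.raw.len() == 1 && self.raw[0] < 0x80;
        if !WITH_HEADER || single_small_byte {
            return self.raw.clone();
        }
        let mut out = string_header(self.raw.len());
        out.extend_from_slice(&self.raw);
        out
    }
}
```

Call sites in the common case keep writing `Transaction`, while the rare ones (such as a block body) would spell out `Transaction<true>`. The trade-off raised in the conversation is that the flag becomes part of a very widely used type, which is what the wrapper-type alternative avoids.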
To-Do