Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New proposal for upgradability changes. #95

Closed
wants to merge 1 commit into from

Conversation

mfornet
Copy link
Member

@mfornet mfornet commented Jul 18, 2020

  • On Handshake we should use (version, oldest_supported_version) to actually allow connection between peers with different versions. (Actually this needs to be fixed anyway).

  • Enums in versioned Data Structures are only used at the client level but not at the network level.

Advantages:

  1. We avoid one byte for encoding the version in the enum, and actually we completely remove the (virtual) issue with the maximum number of enum options of 256.
  2. We can freely modify the layout of a message without creating a breaking change between versions. Right now for every data structure that needs to be updated and don't have an enum wrapping it we need to add the enum and it is a breaking change, which removes the whole point of upgradability. This can be partially addressed by wrapping PeerMessage inside an enum, but it will still remain very complex, since modifying any deep data structure will require versioning all parent data structures as well.
  3. It will allow writing better backward compatible changes that check we are actually backward compatible up to the version we claim.
  4. Regarding implementation it will allow to write cleaner and simpler code with macros in the following way:

Suppose we decide to move from our 256 bits hash to a stronger 1024 bits hash. The change on the data structures will look like:

struct BlockHeader {
    #[version(until=30)]
    prev_hash_old: CryptoHash,
    #[version(from=31)]
    prev_hash: StrongCryptoHash,
    ...
}

What should happen in the background is that some new data structures are created:

struct BlockHeaderV1 {
    prev_hash_old: CryptoHash,
    ...
}

struct BlockHeaderV31 {
    prev_hash: StrongCryptoHash,
    ...
}

/// The most recent layout is the canonical
type BlockHeader = BlockHeaderV31;

/// We should had implementation converting from each type to canonical
/// This can be handled automatically from the macro if all changes are of the form:
/// - Removing a field
/// - Adding a new field with default or initializer value
///
/// But it makes sense to have custom way to convert from one type to another, for
/// example in this case we can convert from one hash to the other.
impl From<BlockHeader> for BlockHeaderV1 {
    // ...
}

impl From<BlockHeaderV1> for BlockHeader {
    // ...
}

/// To serialize and deserialize we use the minimum among our version and our peers version
impl BlockHeader {
    fn serialize(message: BlockHeader, version: u32) -> Vec<u8> {
        match version {
            1..30 => message.into<BlockHeaderV1>(),
            31 => message.into<BlockHeaderV31>(),
            _ => unreachable!(),
        }
    }

    fn deserialize(buffer: Vec<u8>, version: u32) -> BlockHeader {
        match version {
            1..30 => buffer.deserialize<BlockHeaderV1>().into(),
            31 => buffer.deserialize<BlockHeaderV31>().into(),
            _ => unreachable!(),
        }
    }
}

* On Handshake we should use (version, oldest_supported_version) to actually allow connection between peers with different versions. (Actually this needs to be fixed anyway).

* Enums in versioned Data Structures are only used at the client level but not at the network level.

Advantages:

1. We avoid one byte for encoding the version in the enum, and actually we completely remove the (virtual) issue with the maximum number of enum options of 256.
2. We can freely modify the layout of a message without creating a breaking change between versions. Right now for every data structure that needs to be updated and don't have an enum wrapping it we need to add the enum and it is a breaking change, which removes the whole point of upgradability. This can be partially addressed by wrapping `PeerMessage` inside an enum, but it will still remain very complex, since modifying any deep data structure will require versioning all parent data structures as well.
3. It will allow writing better backward compatible changes that check we are actually backward compatible up to the version we claim.
4. Regarding implementation it will allow to write cleaner and simpler code with macros in the following way:

Suppose we decide to move from our 256 bits hash to a stronger 1024 bits hash. The change on the data structures will look like:

```rust
struct BlockHeader {
    #[version(until=30)]
    prev_hash_old: CryptoHash,
    #[version(from=31)]
    prev_hash: StrongCryptoHash,
    ...
}
```

What should happen in the background is that some new data structures are created:

```rust
struct BlockHeaderV1 {
    prev_hash_old: CryptoHash,
    ...
}

struct BlockHeaderV31 {
    prev_hash: StrongCryptoHash,
    ...
}

/// The most recent layout is the canonical
type BlockHeader = BlockHeaderV31;

/// We should had implementation converting from each type to canonical
/// This can be handled automatically from the macro if all changes are of the form:
/// - Removing a field
/// - Adding a new field with default or initializer value
///
/// But it makes sense to have custom way to convert from one type to another, for
/// example in this case we can convert from one hash to the other.
impl From<BlockHeader> for BlockHeaderV1 {
    // ...
}

impl From<BlockHeaderV1> for BlockHeader {
    // ...
}

/// To serialize and deserialize we use the minimum among our version and our peers version
impl BlockHeader {
    fn serialize(message: BlockHeader, version: u32) -> Vec<u8> {
        match version {
            1..30 => message.into<BlockHeaderV1>(),
            31 => message.into<BlockHeaderV31>(),
            _ => unreachable!(),
        }
    }

    fn deserialize(buffer: Vec<u8>, version: u32) -> BlockHeader {
        match version {
            1..30 => buffer.deserialize<BlockHeaderV1>().into(),
            31 => buffer.deserialize<BlockHeaderV31>().into(),
            _ => unreachable!(),
        }
    }
}
```
@render
Copy link

render bot commented Jul 18, 2020

@bowenwang1996
Copy link
Collaborator

bowenwang1996 commented Jul 18, 2020

What happens when you want to add a new message type or add a new field to some existing message type?

@mfornet
Copy link
Member Author

mfornet commented Jul 19, 2020

What happens when you want to add a new message type or add a new field to some existing message type?

The only important thing is that we don't change layout of the handshake (because that is the first message we will be using before knowing version) and it sounds like a good compromise.

If you add a new field, you should have something similar to a default value for this field, or if some custom handling is required we can add a field with the version of this message (but this field can be marked with borsh_skip and inferred at deserialization time).

The whole point of this update, is that currently I don't see how it is possible making a backward compatible change that updates message layout. I saw the BlockHeaderV1 example, but it was actually not backward compatible, was it? Because it changed the layout and hence previous node was unable to deserialize that message.

@bowenwang1996
Copy link
Collaborator

It is backward compatible in the sense that nodes with new version can process blocks with the old version. It is obviously true that the node with old version will not be able to process blocks with the new version, but that doesn't matter.

@SkidanovAlex
Copy link
Contributor

I like the idea of annotating fields instead of explicitly creating the datastrucutres a lot.
Do we know if rust marcos actually allow doing this?

@bowenwang1996
Copy link
Collaborator

An issue that I see is that when you want to call some method on BlockHeader, you first have to convert to one of the BlockHeaderV.. types. This does create some potential overhead.

@mfornet
Copy link
Member Author

mfornet commented Jul 22, 2020

I like the idea of annotating fields instead of explicitly creating the datastrucutres a lot.
Do we know if rust marcos actually allow doing this?

@SkidanovAlex Basically if we have a clear idea of which code should be generated, from my "little" experience with rust macros they will allow us to do it. In particular they work good annotating fields.

An issue that I see is that when you want to call some method on BlockHeader, you first have to convert to one of the BlockHeaderV.. types. This does create some potential overhead.

@bowenwang1996 in my design this not an issue. Versioned messages are only used to serialize+deserialize but they are not used anywhere else in the code. After receiving a message you deserialize it using the expected version and then it is converted immediately in the current canonical data structure, so we are abstracted from versions (*). When you send a message the canonical data structure is converted to the versioned data structured and serialized like it.

So methods and fields are implemented for the canonical version and can be changed without problem, we will only convert from canonical to versioned after deserializing / before serializing. i.e. whenever possible most of our system should not be aware of the multiple versions.

(*) We will be abstracted from versions as much as the logic allows it ofc. It is possible that some changes will require handling each version separately. In that case we can add some flag/enum to the canonical data structure and just make sure when we convert from the versioned data structure to the canonical we populate the flag/enum correctly, but notice how the versioned data structure doesn't know about that flag/enum, it all happens in the conversion.

@ilblackdragon
Copy link
Member

Did this end up being implemented?

@bowenwang1996
Copy link
Collaborator

Did this end up being implemented?

Unfortunately no. I think details were not fully figured out here

@frol frol added the WG-protocol Protocol Standards Work Group should be accountable label Sep 5, 2022
@bowenwang1996 bowenwang1996 added the S-draft/needs-moderator-review A NEP in the DRAFT stage that needs a moderator review. label Sep 28, 2022
@ori-near ori-near added the A-NEP A NEAR Enhancement Proposal (NEP). label Oct 13, 2022
@ori-near
Copy link
Contributor

Hi @mfornet, this seems like an old proposal before we had the NEP process. Is it still relevant? Or should we close it?

@mfornet mfornet closed this Feb 2, 2023
@frol frol added S-retracted A NEP that was retracted by the author or had no activity for over two months. and removed S-draft/needs-moderator-review A NEP in the DRAFT stage that needs a moderator review. labels Feb 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-NEP A NEAR Enhancement Proposal (NEP). S-retracted A NEP that was retracted by the author or had no activity for over two months. WG-protocol Protocol Standards Work Group should be accountable
Projects
Status: RETRACTED
Development

Successfully merging this pull request may close these issues.

6 participants