Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simple Mapping Storage Primitive #945

Closed
Tracked by #9354
HCastano opened this issue Sep 28, 2021 · 8 comments
Closed
Tracked by #9354

Simple Mapping Storage Primitive #945

HCastano opened this issue Sep 28, 2021 · 8 comments
Assignees
Labels
A-ink_storage [ink_storage] Work Item B-enhancement New feature or request B-research Research task that has open questions that need to be resolved.

Comments

@HCastano
Copy link
Contributor

Motivation

At the moment, ink! data structures, such as the ink_storage::LazyHashMap, are quite
complicated. This is because they were designed in a very "polite" way. That's to say,
they did a lot of internal bookkeeping to ensure that they cleaned up after themselves,
and that they were only going to the contract storage database when it was absolutely
neccessary.

While this was great from a user's point of view, this approach to data structure in
ink! has lead to contract sizes being larger than they probably need to be.

As such, we want to experiment with some simpler data structures, such as a Solidity-like
mapping.

ink_storage::Mapping

The new ink_storage::Mapping would be a paper thin type-safe wrapper around some of the
ink_storage and ink_env storage primitives.

The Mapping type would look as follows:

struct Mapping<K, V>(PhantomData<K>, PhantomData<V>);

Note that we may also need some sort of Key field to differentiate between different
instances of Mapping and avoid collisions. This would be similar to what LazyHashMap
does.

The API of the Mapping would look (roughly) like this:

impl<K, V: Default> Mapping<K, V> {
    ...

    /// Write value to `key`.
    fn push(&mut self, key: K, value: V) {
        todo!()
    }

    /// Get the value at `key`.
    ///
    /// If there is no value at the `key` it will return the default value.
    fn get(&self, key: K) -> V {
        todo!()
    }

    ...
}

It would also need to implement the SpreadLayout and PackedLayout traits. This is
what would actually be doing the heavy lifting as far as talking to the contract storage
goes.

Another thing that will have to get done is adding some new low level push/pull
functions which allow the return values to be empty. At the moment these functions assume
that if you're querying for a key, that it already exists.

This would not be the case with the new Mapping, since you can query any key and if it
doesn't exist you'd get the Default value back.

ERC-20 Benchmarking

After implementing the new Mapping type we want to implement a new ERC-20 example
contract with it. The thing to look for here is how much better (or worse) the contract
sizes are compared to our current ERC-20 example.

@HCastano HCastano added B-enhancement New feature or request A-ink_storage [ink_storage] Work Item B-research Research task that has open questions that need to be resolved. labels Sep 28, 2021
@HCastano HCastano self-assigned this Sep 28, 2021
@xgreenx
Copy link
Collaborator

xgreenx commented Sep 28, 2021

Hi, I've already tested the change like this, so I can briefly describe the result and later provide the full report.
The using of simple mapping saves 16 KB for the current ERC-20 example.

@Robbepop
Copy link
Collaborator

@xgreenx Is there anywhere we can look up your implementation for the Solidity-like storage mapping?

@ascjones
Copy link
Collaborator

/// Write value to `key`.
    fn push(&mut self, key: K, value: V) {
        todo!()
    }

Nit: insert instead of push?

@xgreenx
Copy link
Collaborator

xgreenx commented Sep 29, 2021

@xgreenx Is there anywhere we can look up your implementation for the Solidity-like storage mapping?

Yes, you can find SimpleHashMap here, diff.

The problem with this solution is that after the creation of the map, we don't know the storage key of the map. So we can't immediately start insert and get items.
You can find a hack here to force an update of the storage key(I flushed and loaded the structure back).

This map only shows how the size of the contract can be reduced.

Current master with default storage hash map Erc20:
Original wasm size: 68.0K, Optimized: 28.8K
Current master with simple hash map Erc20:
Original wasm size: 37.8K, Optimized: 12.9K

Later I will work on resolving the issue with the storage key. I see three possible solutions:

  1. Introduce some simple lazy structure to store entries and flush them only on push_layout. But it will increase the size of the contract. But maybe the usage of a linked list or a vector doesn't take a lot of space, it requires testing.
  2. Introduce a new concept - StorageKey. Each struct that implements SpreadLayout also will store a storage key. And when the developer creates an instance of the simple hash map, he must pass some structure, which implements SpreadLayout. It will be a "parent" structure, for this hash map. Based on the parent's storage key we can calculate the key of the current hash map. In this solution we don't need to store the storage key directly in the storage, it can be evaluated during pull_layout.
  3. The developer must specify the storage key by self for each map, ink! will provide some macros which will simplify this process like:
balance_of: SimpleHashMap::new(storage_key!("balance_of"));

In this case, we need to store the storage key directly in the storage and load it during pull_layout.

@Robbepop
Copy link
Collaborator

Thank you for linking your solution.
I wonder why you have a len field.

Some remarks and questions:

  • I think implementing the PackedLayout for the ink_storage::Mapping<K, V> is not trivial and maybe not a good idea.
    For now I'd leave it away and just implement SpreadLayout.
  • Why is key of type Option<Key>? I think it should be just of type Key. If the user wants to lazily load everything they can still wrap the whole thing in an ink_storage::Lazy<T>.
  • We should concentrate on the fact that the ink_storage::Mapping (aka your SimpleHashMap) really is not a storage data structure. It does not own any elements or is aware of what it has inserted or erased. For the same reason I'd remove the len field.
  • Your Encode and Decode implementations encode and decode the len field instead of the key. I wonder why.

The idea is to store no state in the contract storage for the field that is associated to the ink_storage::Mapping. You currently store a len but you could just leave this away and still had a correct implementation. It is just unnecessary overhead. :)
The key field can be fully derived by the SpreadLayout's key_ptr parameter which is basically what you already do.

@xgreenx
Copy link
Collaborator

xgreenx commented Sep 29, 2021

The map has a len field because I tried to create a pattern for any data structure which will work directly with storage. In Solidity, you know the length of the array, so I implemented the same for the map=)

PackedLayout was implemented to test the case with a hash map of hash maps=) Also in another project, we tried to have a hash map inside of another hash map via Box. But we found some issues with overwriting of storage data. So I tried to support it from default. But I agree that it requires investigation and discussions.

Yes, it should be Key, as I mentioned above it depends on the solution of the problem with not initialized storage key. BTW, ink_storage::Lazy generates 1KB of the code=D

In the current implementation, the storage key is optional, because it can be evaluated during pull_layout and can be not specified by the user. But it is not the final solution=)

@Robbepop
Copy link
Collaborator

Robbepop commented Sep 29, 2021

In the current implementation, the storage key is optional, because it can be evaluated during pull_layout and can be not specified by the user. But it is not the final solution=)

Ah yeah makes sense!

Yes, it should be Key, as I mentioned above it depends on the solution of the problem with not initialized storage key. BTW, ink_storage::Lazy generates 1KB of the code=D

holy! ... x(

The map has a len field because I tried to create a pattern for any data structure which will work directly with storage. In Solidity, you know the length of the array, so I implemented the same for the map=)

For Solidity arrays it makes sense to store a len but imo not for this kind of mapping.

@Robbepop
Copy link
Collaborator

Implemented in #946. Closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ink_storage [ink_storage] Work Item B-enhancement New feature or request B-research Research task that has open questions that need to be resolved.
Projects
None yet
Development

No branches or pull requests

4 participants