-
Notifications
You must be signed in to change notification settings - Fork 432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Macro based storage rework #1134
Comments
I played with that idea, tested different implementations and cases, checked the sizes. First of all to reduce the size, better to avoid frequent usage of Instead of: #[ink(storage)]
pub struct Flipper {
value: Storage<bool>,
value2: Storage<bool>,
value3: Storage<u32>,
value4: Storage<u32>,
} Better to: #[ink(storage)]
pub struct Flipper {
value: Storage<Test>,
}
struct Test {
value: bool,
value2: bool,
value3: u32,
value4: u32,
} Or #[ink(storage)]
pub struct Flipper {
value: bool,
value2: bool,
value3: u32,
value4: u32,
} The second point - introduction of accessors like: // we create accessors for each field with the correct mutability
// this way we keep the natural way of marking messages as
// mutable and getters
pub fn a(&self) -> &AFieldStorageValue { return &self.a };
pub fn b(&self) -> &BFieldMapping { return &self.b };
pub fn a_mut(&mut self) -> &mut AFieldStorageValue { return &mut self.a };
pub fn b_mut(&mut self) -> &mut BFieldMapping { return &mut self.b }; Will be very unclear for the user + we will have the problem with highlighting in IDE. Based on the results I want to propose the next:
ink! will provide types and structures that allow moving some fields into their own cell. These types will have two modes:
That solution provides control over the storage keys. The developer can decide by himself that key use and where. The solution introduces only one limitation - all storage-related structures should be defined via the That solution doesn't affect multi-files because everyone will use that rules to work with storage -> structures are marked with // `value` and `value2` are `Test2`. All inner storage keys are the same -> self.value.get().value4.set(&144) will also affect self.value2.get().value4
#[ink(storage)]
pub struct Flipper {
value: Storage<Test2, ManualKey<1>>,
value2: Storage<Test2, ManualKey<4>>,
}
#[ink(storage)]
struct Test2 {
value: Storage<bool>,
value2: Storage<bool>,
value3: Storage<u32>,
value4: Storage<u32>,
} That problem has a workaround with manual offset: #[ink(storage)]
pub struct Flipper {
value: Storage<Test2<100>, ManualKey<100>>,
value2: Storage<Test2<200>, ManualKey<200>>,
}
#[ink(storage)]
struct Test2<const Offset: u128> {
value: Storage<bool, ManualKey<{ Offset + 1 }>>,
value2: Storage<bool, ManualKey<{ Offset + 2 }>>,
value3: Storage<u32, ManualKey<{ Offset + 3 }>>,
value4: Storage<u32, ManualKey<{ Offset + 4 }>>,
} But then we need to return an error if the storage layout contains two same keys to prevent the user from bugs=) Below the example of the code that I used for testing(It is only MVP). The same idea can be implemented in #[derive(Default)]
struct AutoKey;
#[derive(Default)]
struct ManualKey<const KEY: u128>;
#[derive(Default)]
struct Storage<T: Decode + Encode, Mode = AutoKey>(PhantomData<(T, Mode)>);
#[cfg(feature = "std")]
impl<T: Decode + Encode + scale_info::TypeInfo + 'static, Mode> ::ink_storage::traits::StorageLayout for Storage<T, Mode>
where Self: KeyTrait {
fn layout(_key_ptr: &mut KeyPtr) -> Layout {
Layout::Cell(CellLayout::new::<T>(LayoutKey::from(
&<Self as KeyTrait>::key(),
)))
}
}
trait KeyTrait {
fn key() -> Key;
}
trait StorageTrait: KeyTrait {
type Value: Decode + Encode;
fn get(&self) -> Self::Value {
ink_env::get_contract_storage(&Self::key()).unwrap().unwrap()
}
fn set(&self, value: &Self::Value) {
ink_env::set_contract_storage(&Self::key(), &value);
}
}
impl<T: Decode + Encode, const KEY: u128> KeyTrait for Storage<T, ManualKey<KEY>> {
#[inline]
fn key() -> Key {
Key::from(unsafe { core::mem::transmute::<(u128, u128), [u8; 32]>((KEY, 0)) })
}
}
impl<T: Decode + Encode, const KEY: u128> StorageTrait for Storage<T, ManualKey<KEY>> {
type Value = T;
}
impl<T: Decode + Encode> KeyTrait for Storage<T, AutoKey> {
#[inline]
fn key() -> Key {
// Here we will put `__ink_error...`
panic!()
}
}
impl<T: Decode + Encode> StorageTrait for Storage<T, AutoKey> {
type Value = T;
}
trait IsAutoKey<const KEY: u128> {
type StorageType;
}
impl<T: Decode + Encode, const KEY: u128> IsAutoKey<KEY> for Storage<T, AutoKey> {
type StorageType = Storage<T, ManualKey<KEY>>;
} |
Is that somewhere reflected in the code you posted? I think it contradicts the rest you are saying: We are selecting different keys with manual or automatic keys. In your examples I don't see the overarching main contract structure that ties all of that together.
This is why we only want that one layer deep. You can only have different storage cells by putting different fields into the main storage structure. Also, would be nice to have the example with |
I didn't implement The idea is that the developer for some type will create an empty implementation of So all atomic types are loaded with contract. Other types doesn't require loading.
I will prepare more detailed example=)
Seems I find the solution for that problem. struct Test2<const KEY: u128 = 0> {
value: bool,
value2: bool,
value3: u32,
value4: u32,
} So our
Will try to add an example for |
No matter, we can't use that I will prepare the simple version only with a manual key and with generated.
I think we don't need to limit the developer. We only need to throw an error if he is using the same storage keys in several places. We can do that check during the generation of metadata. |
If we can catch this during compilation then it is fine. Would be better to catch it during the compilation of the wasm, though.
This is not completely unlike how it is currently working. We also store the whole contract structure under a single root key and the storage types like
I have some further questions:
|
I managed to implement auto-incrementing of the storage key during compilation(it is a hard variant where we don't need to worry about keys)=) Here is an example. It contains only an example with
Nope!=) Currently, we are storing each field under the own storage key. If we put all fields under If we generate storage keys during compilation, then the empty implementation of
Usage like
Yeah, I want to propose using some integer as a storage key, because in terms of the contract we don't need to worry about the collision of fields. Yes, in the hard version that I posted it is auto-incremented.
Storage key implements let key = (&Mapping::StorageKey, &Key).encode().as_ref();
let value = value.encode().as_ref();
seal_get_storage(key, key.len(), value, value.len())
|
I mentioned that issue: rust-lang/rust#26925 But seems we can handle it. The derive macro can be expanded before the |
You are meaning the one on your I looked through your prototype and I think this would a a very good design. Having a numeric counter at compile team is the best of both worlds:
This looks really promising to me. Once this progresses we also need to add a new version of storage functions to pallet_contracts that allow variably sized keys (with a limit though). The current size of 32bytes is to small if contracts do not hash the key because you need at least |
I tested it with current ink's codegen, so it was simpler to transmute
It is caused by the
Yep, you are right, I forgot to update the comment. // The user defiend the next struct:
//
// #[derive(Default, Debug)]
// #[ink(storage)]
// struct Test {
// value: bool,
// value2: Storage<bool>,
// value3: u32,
// value4: Storage<u32>,
// }
Yep. I didn't think yet about the question: Is that possible to find duplicate keys during compilation? I plan to think about it. BUT, to avoid duplicate keys we can remove
I think resolving that issue will close that question as we discussed in the element channel. I also want to highlight another problem in that solution. I added We can generate the impl<OldParent: NextStorageKey, NewParent: NextStorageKey> AsRef<Test<NewParent>> for Test<OldParent> {
#[inline]
fn as_ref(&self) -> &Test<NewParent> {
unsafe { core::mem::transmute::<&Test<OldParent>, &Test<NewParent>>(self) }
}
}
impl<OldParent: NextStorageKey, NewParent: NextStorageKey> AsMut<Test<NewParent>> for Test<OldParent> {
#[inline]
fn as_mut(&mut self) -> &mut Test<NewParent> {
unsafe { core::mem::transmute::<&mut Test<OldParent>, &mut Test<NewParent>>(self) }
}
} |
Yes. This will kill two birds with one stone.
If it makes things easier we should drop the manual key for now. Right now the keys are also derived from order. We can add this later. Maybe even as a post processing step on metadata to not further complicate the code (as you initially suggested).
Yeah this is kind of a bummer. Didn't understand the code well enough to help with that right now. But I am sure that using transmute for this isn't the right way. |
@xgreenx We think the best way forward would be to iterate on your example and get rid of the unsafe transmute. For the host functions you could mock them for now, so to not be blocked on that. What are your thoughts? |
We need to decide on several questions and I think we can move forward.
1.1. We can add a generic type during the codegen as described above. By default, it will be Pros:
Cons:
1.2. The user should explicitly specify the generic type for Pros:
Cons:
The definition looks like: #[derive(Default, Debug)]
#[ink(storage)]
struct Test<Last: NextStorageKey> {
value: bool,
value2: bool,
value3: u32,
value4: u32,
}
For example, we can use Or you can suggest your ideas=)
|
1.1 vs 1.2: Given that these are the only two options I am leaning to the explicit solution. That said, I didn't put nearly as much thought into it as you did. So what is your recommendation and is there really no better way (I can't tell)?
I would suggest keeping
I agree. If this is visible to the user a simple name as |
I also prefer the explicit solution. Because the developer can google Rust information and find what is going on.
We want to know the key during the compilation time of each Here we have two options:
Previously used cells should affect the storage key of the current struct -> The struct should know about the parent -> We need to use Generic. Pros:
Cons:
Pros:
Cons:
But we can identify during metadata generation that someone used the struct more than one time and throw an error with description. And the user should use another struct.
I'm okay with that solution=) |
Can you please remind me what you mean when you say "atomic struct"? I think we should go with the first solution (numeric key and explicit generics). The second one has the collision problem and will also lead to much bigger keys compiled into the data section of the contract as opposed to simple numbers. I think before you start implementing, it would be best if you'd post a complete example here on how this stuff would be used. No need to write out all the types involved as in the first mock you posted. Only from a user perspective. With |
"atomic struct" - all fields are atomic. All types are atomic except
We still can use
const ROOT_KEY: u32 = 123;
#[ink(storage(ROOT_KEY))]
pub struct Flipper {
value1: AtomicStruct,
value2: NonAtomicStruct,
value3: Storage<AtomicStruct>,
// Possible
value4: Storage<NonAtomicStruct>,
// Possible
value5: Mapping<u128, AtomicStruct>,
// Compilation Error, it is not possible
// value6: Mapping<u128, NonAtomicStruct>,
}
impl Flipper {
fn get_value1(&self) -> &AtomicStruct {
&self.value1
}
fn get_value2<T: Storage>(&self) -> &NonAtomicStruct<T> {
&self.value2
}
fn get_value3<T: Storage>(&self) -> &Storage<AtomicStruct, T> {
&self.value3
}
fn get_value4<T1: Storage, T2: Storage>(&self) -> &Storage<NonAtomicStruct<T2>, T1> {
&self.value4
}
fn get_value5<T: Storage>(&self) -> &Mapping<u128, AtomicStruct, T> {
&self.value5
}
}
#[ink::storage_ext]
pub struct AtomicStruct<Last: Storage = ()> {
value1: bool,
value2: u32,
value3: Vec<u32>,
value4: String,
}
#[ink::storage_ext]
pub struct NonAtomicStruct<Last: Storage = ()> {
value1: Mapping<u32, bool>,
value2: Storage<Vec<u32>>,
}
impl<Last: Storage> NonAtomicStruct<Last> {
fn get_value1<T: Storage>(&self) -> &Mapping<u32, bool, T> {
&self.value1
}
fn get_value2<T: Storage>(&self) -> &Storage<Vec<u32>, T> {
&self.value2
}
} |
I don't get it. How would we do that if we are generating the key from a name? Hash the name at compile time and truncate to 8 byte? We would still have the problem of collisions if a struct was used multiple times. I think with a numeric solution we could even get away with Regarding your example. I think this looks fine. Currently, we ask users to derive a lot of strange traits on storage structs which is not better. So I think we can manage with a few generics as long as you can also see them properly in the rust docs and it is all explicit. Just one question: Why is the type parameter on the storage structs called |
Yes, we can do the same as for selectors, we will truncate it to
In WASM it still is
Okay=) It looks normal to me. I only worry that someone from Solidity can be afraid of it=)
Because actually, we will pass the last data type(that type took the previous cell) that will be used to get the next storage key. We can use |
I think those constants will go into the wasm data section and can be packed densely there. Of course they will be loaded into an
Some renaming suggestions:
Do you have a preference regarding numeric keys vs truncated 64bit hashes? My thoughts on truncated hashes: Con:
Pro:
Let's do some napkin math regarding the size benefits of numeric keys. Let's say numeric keys are 1byte in contract size (stored in data section which is linear memory) and truncated keys are 8byte. So we look at 7byte overhead for each field. How many fields does the average contract have? I would say less than 20 (maybe even way less). That would be 140 byte size overhead for the truncated hash version. ERC20 has 4 fields and multisig would probably have 7 fields if we had Given these numbers I might be inclined to say that truncated hash version might be better if it is substantially simpler and gets rid of generics for the user. |
I also prefer the version with truncated hash, because It will be simpler for me as for a smart contract developer. I don't like to put generics everywhere in cases where I plan to use only atomic structures(or not atomic struct but one time).
Yea, the overhead seems not too high + it requires investigation regarding 1byte=)
If someone wants to use some struct twice or more, he can specify the generic by himself. /// The final storage key = Key ^ XOR.
struct ManualKey<const Key: u64, const XOR: u64 = 0>(PhantomData<(Key, XOR)>);
const ROOT_KEY: u64 = 123;
#[ink(storage(ROOT_KEY))]
pub struct Flipper {
value: TestTwice<hash_u64!("Flipper::value")>,
value2: StoredValue<TestTwice<hash_u64!("Flipper::value2")>>,
value3: TestOnce,
}
#[ink::storage_ext]
struct TestTwice<const Key: u64> {
value: StoredValue<bool, ManualKey<hash_u64!("Test::value"), Key>>,
value2: StoredValue<bool, ManualKey<hash_u64!("Test::value2"), Key>>,
value3: StoredValue<u32, ManualKey<hash_u64!("Test::value3"), Key>>,
value4: StoredValue<u32, ManualKey<hash_u64!("Test::value4"), Key>>,
}
#[ink::storage_ext]
struct TestOnce {
value: StoredValue<bool, ManualKey<hash_u64!("Hello_World"), 42>>,
value2: StoredValue<bool, ManualKey<hash_u64!("Hello_World")>>,
value3: StoredValue<u32>,
value4: StoredValue<u32>,
} |
Having to manually key every field in |
We can add /// The final storage key = Key ^ XOR.
struct ManualKey<const Key: u64, const XOR: u64 = 0>;
/// The auto-generated storage key(based on the name of the stuct and field) will be XORed with `XOR` value
struct AutoKey<const XOR: u64 = 0>;
const ROOT_KEY: u64 = 123;
#[ink(storage(ROOT_KEY))]
pub struct Flipper {
value: TestTwice<hash_u64!("Flipper::value")>,
value2: StoredValue<TestTwice<hash_u64!("Flipper::value2")>>,
value3: TestOnce,
}
#[ink::storage_ext]
struct TestTwice<const Key: u64> {
value: StoredValue<bool, AutoKey<Key>>,
value2: StoredValue<bool, AutoKey<Key>>,
value3: StoredValue<u32, AutoKey<Key>>,
value4: StoredValue<u32, AutoKey<Key>>,
}
#[ink::storage_ext]
struct TestOnce {
value: StoredValue<bool, ManualKey<hash_u64!("Hello_World"), 42>>,
value2: StoredValue<bool, ManualKey<hash_u64!("Hello_World")>>,
value3: StoredValue<u32, AutoKey<42>>,
value4: StoredValue<u32>,
} |
We can try to automate that process. For that, we need some marker to understand that it is generic for the storage key. Maybe we can define an alias in Then we can simplify: #[ink::storage_ext]
struct TestTwice<const Key: StorageKey> {
value: StoredValue<bool, AutoKey<Key>>,
value2: StoredValue<bool, AutoKey<Key>>,
value3: StoredValue<u32, AutoKey<Key>>,
value4: StoredValue<u32, AutoKey<Key>>,
} Into: #[ink::storage_ext]
struct TestTwice<const Key: StorageKey> {
value: StoredValue<bool>,
value2: StoredValue<bool>,
value3: StoredValue<u32>,
value4: StoredValue<u32>,
} |
Yes this is what I had in mind. Sure we can't detect the re-alias but I don't think this is a case we need to consider. This is all looks pretty solid to me. The latest example looks like something I can very much live with. We should have the rest of the ink! team look over it so we are all on the same page. |
This discussion is pretty dense, but I'll try and take a look later this week |
Exactly, it is the same as knowing the name of the struct!=) We can know the value of the attribute, but it will be the same value for all structs like that=D It is why we want to use a generic to have a "new" type of struct, specified by generic value. |
Ha ha yes I am just dumb. That's all. I will update the summary. |
Thanks for that summary Alex, super helpful! Some more thoughts/questions
Green, would it be too much to ask for an updated Playground example and example of |
If the user uses the struct twice or more, only in that case the parent key should be specified for
The user will be able to do that, but it is useless in that way. But it can be useful if
The value of the
Yep, the question here do we want to have explicit
I can create a new example, but I think it better to spend that time for real change=D Because most behaviors and cases are already discussed. |
Enforcing it to be part of the root struct would be difficult I think. Only thing that comes to mind is that
AFAIK yes. This could work. Not sure how nice the code would be, though. Since the root struct will be reachable through
It does not matter where to define the
Yes
Do we need some kind of marker trait for keys? Because non atomic types will have an (empty)
Talked to @cmichi . We are basically OK with continuing the rest of this discussion in a PR. What do you think @HCastano ? It all depends on your willingness @xgreenx to make bigger changes to your PR (or restart it completely) when we notice some bigger stumbling stones late in the process. |
Sure, that sounds alright with me |
Will also be interesting to see how the contract metadata will then change. Chances are that with this redesign we could close #767, which then would enable implementing use-ink/cargo-contract#101. @xgreenx Yeah so from our side you're good to go whenever you're up for it :-). |
Hey @xgreenx! Any updates here? |
Hi, we've started work on it, but we are not fully involved at the moment. The previous week was related to stabilizing of OpenBrush(to use the stable ink!, speedup CI, adapting our projects), fixing issues found by users. In that week we are focused on upgradable contract, Proxy, and Diamond standards. We plan to be fully involved in that issue after one week. it is part of 5th milestone of the Grant, so we plan to implement that=) |
Closed via #1331 |
https://use.ink/ink-vs-solidity
Can this document be updated? |
Most of the existing storage types were removed because they are simply to inefficient in size and executed instructions. Right now we only have
Mapping
. However, this type doesn't work very well with the existing storage infrastructure and needs a band aid namedSpreadAllocate
andinitialize_contract
.We want to pivot more in the direction of how storage works in FRAME. We should keep the overarching contract data structure but use the storage macro to generate the supporting code instead of a elaborate system of traits.
This would also allow us more flexibility in how to generate the storage keys for these data structures and more control over what code is generated. The aim is to reduce the contract sizes by removing monomorphization and copying of the root keys.
Please note that the following is only a sketch of the generated code. It is probably not how the generated code should look like. @Robbepop noted that we should not use private sub modules. Please feel free to post your suggestions and I will update the top post.
We add a new trait that will be implemented by the storage macro to supply our storage traits with a key. Our storage traits have default implementations for every function. The macro will only set the associated types.
Then the storage macro creates a private module where it dumps all the generated code:
The biggest downside of this approach is the same as substrate's: New storage collections always need changes to be made in codegen in order to support them. However, I assume this will safe us a lot of code passing around all those keys and stateful objects (lazy data structures) at runtime. It is also much simpler and easier to reason about (because it has less features).
Open Questions
How does this affect/play with things like:
The text was updated successfully, but these errors were encountered: