Prevent bug when reusing type ids in hashing #1075

tadeohepperle · 2023-07-17T15:07:54Z

I first thought a simple restructuring approach like @jsdw mentioned in #1066 would work, but correct me if I'm wrong, I think it cannot solve the recursive type problem, because creating a new HashSet in get_type_id() will eventually run into an infinite loop.

Instead of using a HashSet for keeping track of the visited ids, we can use a HashMap that saves some info about if the type has been visited yet and also if a complete hash has been calculated yet.

I added a test to make sure it works (Without my code changes, it fails).

jsdw · 2023-07-17T15:24:02Z

because creating a new HashSet in get_type_id() will eventually run into an infinite loop.

What I meant was that get_type_id could be exposed to the rest of the module to get type IDs, and this would internally create a new HashSet or whatever.

And then "internally" (and not exposed to the rest of the crate) one could use a get_type_id_with_cache or whatever that looks like the current one does.

Maybe some stuff would end up in a different module or something to support that.

And the point of doing this would just be to guarantee that get_type_id always starts from a fresh state and doesn't re-use anything from other invocations which could change the result or make it worse :)

But anyways, maybe that's irrelevant now and I'll have a gander at the approach you took!

jsdw · 2023-07-17T15:47:21Z

metadata/src/utils/validation.rs

    }
-
+    cache.insert(id, 0);


Ooh, I like the idea of caching the hashes like this in general; good thinking!

There's maybe a small chance that the first 4 bytes of type_hash are actually [0,0,0,0], which would make that type look like the special recursive type that leads to an early return.

I also wonder; is it better to store the full [u8; HASH_LEN] type hash in the cache? We use more memory with this, but then don't need to hash it, and it means that the cache is truly caching the computed hashes for various types rather than caching something that's not actually the original hash we computed for a type. It feels "safer" to reuse the cache in other places if we know that it is storing the actual computed hashes over something a bit different.

Combining the two thoughts above, I wonder whether the cache should look like:

type Cache = HashMap<u32, CachedValue>; enum CachedValue { Recursive, Hash([u8; HASH_LEN]) }

Where we insert CachedValue::Recursive initially and update that to a final hash when done. (if get_type_hash looks in the cache and sees a CachedValue::Recursive it can return with eg MAGIC_RECURSIVE_HASH_VALUE, which is functionally the same as right now where it would return with hash(0)).

I'm suggesting a new enum above over using Option just because it would be a little more self-documenting, but either works really!

We could definitely do that, thought about something like that as well, it is more secure at the cost of a little bit more memory. I will change it! :)

That'd be great, and then perhaps also we could add a test which uses the cache across a couple of invocations of hashing the same type to comfirm that indeed you get the same hash back both times (first time caches the hash, second time just returns it straight off). This should even work if the type has a recursive definition in which is cool :)

Ultimately whether we pass a new cache or the same cache in to multiple calls, the results should all be the same.

jsdw · 2023-07-18T09:51:12Z

metadata/src/utils/validation.rs

+    //    if the cache has an entry for the id, just return a hash derived from it.
+    //    if the type has not been seen yet, mark it with `CachedHash::Recursive` in the cache and proceed to `get_type_def_hash()`.
+    //        -> During the execution of get_type_def_hash() we might get into get_type_hash(id) again for the original id
+    //            -> in this case the `CachedHash::Recursive` provokes an early return.


Nit: it's more like "in this case, the CachedHash::Recursive returns a fixed "magic" hash rather than recursing into the same type again" :)

metadata/src/utils/validation.rs

jsdw

This looks great! Just one tiny suggestion to add a test showing that the cache is cahcing actual hashes and thus can safely be reused across multiple calls to hash the same types or whatever, but otherwise looks good to me!

…n-hashing

jsdw · 2023-07-18T17:19:32Z

metadata/src/utils/validation.rs

+        let CachedHash::Hash(a_cache_hash) = cache[&a_type_id] else { panic!() };
+        let CachedHash::Hash(b_cache_hash) = cache[&b_type_id] else { panic!() };
+
+        assert_eq!(a_hash, a_cache_hash);
+        assert_eq!(b_hash, b_cache_hash);


Can we also check that a_hash == b_hash; imo that's more iomportant than looking into the cached stuff :)

I don't think we want the hashes to be the same, because A and B are different structs, with different fields:

struct A { pub b: Box<B> } struct B { pub a: Box<A> }

Maybe I understood you wrong, what did you mean?

Ah, sorry yeah, I mean, it'd be nice to have a test like:

let a_type_id = registry.register_type(&meta_type::<A>()).id; let a_hash1 = get_type_hash(&registry, a_type_id, &mut cache); let a_hash2 = get_type_hash(&registry, a_type_id, &mut cache); assert_eq!(a_hash1, a_hash2);

ie, we want to check that sharing the cache across different uses doesn't change the hashes that are produced for some type.

…n-hashing

* practice TDD * implement a hashmap 2-phases approach * use nicer types * add test for cache filling * adjust test --------- Co-authored-by: James Wilson <[email protected]>

* remove config, doc tests are expected to fail (book not adjusted yet) * make doc tests pass * Prevent bug when reusing type ids in hashing (#1075) * practice TDD * implement a hashmap 2-phases approach * use nicer types * add test for cache filling * adjust test --------- Co-authored-by: James Wilson <[email protected]> * remove the unnecessary intos --------- Co-authored-by: James Wilson <[email protected]>

* routing and signing example * cliipy fix * submitting extrinsics * change order of lines * Skip call variants if there aren't any (#980) Co-authored-by: Niklas Adolfsson <[email protected]> * Tidy up some metadata accessing (#978) * Reduce some repetition when obtaining metadata pallets/runtime_traits * make them pub * fix docs and clippy * Bump tokio from 1.28.1 to 1.28.2 (#984) Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.28.1 to 1.28.2. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](tokio-rs/tokio@tokio-1.28.1...tokio-1.28.2) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump regex from 1.8.2 to 1.8.3 (#986) Bumps [regex](https://github.com/rust-lang/regex) from 1.8.2 to 1.8.3. - [Release notes](https://github.com/rust-lang/regex/releases) - [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md) - [Commits](rust-lang/regex@1.8.2...1.8.3) --- updated-dependencies: - dependency-name: regex dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump quote from 1.0.27 to 1.0.28 (#983) Bumps [quote](https://github.com/dtolnay/quote) from 1.0.27 to 1.0.28. - [Release notes](https://github.com/dtolnay/quote/releases) - [Commits](dtolnay/quote@1.0.27...1.0.28) --- updated-dependencies: - dependency-name: quote dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump proc-macro2 from 1.0.58 to 1.0.59 (#985) Bumps [proc-macro2](https://github.com/dtolnay/proc-macro2) from 1.0.58 to 1.0.59. - [Release notes](https://github.com/dtolnay/proc-macro2/releases) - [Commits](dtolnay/proc-macro2@1.0.58...1.0.59) --- updated-dependencies: - dependency-name: proc-macro2 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * restrict sign_with_address_and_signature interface (#988) * changing js bridge * dryrunresult ok * submitting extrinsic working * tiny up code and ui * formatting * remove todos * support tip and mortality * Prevent bug when reusing type ids in hashing (#1075) * practice TDD * implement a hashmap 2-phases approach * use nicer types * add test for cache filling * adjust test --------- Co-authored-by: James Wilson <[email protected]> * small adjustment * Merge branch 'master' into tadeo-hepperle-browser-extension-signing-example * fix lock file * tell users how to add Alice account to run signing example * adjust to PR comments * fmt --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: James Wilson <[email protected]> Co-authored-by: Niklas Adolfsson <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

tadeohepperle added 2 commits July 17, 2023 15:49

practice TDD

b725367

implement a hashmap 2-phases approach

5216e25

tadeohepperle requested a review from a team as a code owner July 17, 2023 15:07

tadeohepperle changed the title ~~Tadeo hepperle prevent reuse of type ids in hashing~~ Prevent bug when reusing type ids in hashing Jul 17, 2023

jsdw reviewed Jul 17, 2023

View reviewed changes

use nicer types

7878d42

jsdw reviewed Jul 18, 2023

View reviewed changes

metadata/src/utils/validation.rs Show resolved Hide resolved

jsdw approved these changes Jul 18, 2023

View reviewed changes

tadeohepperle and others added 2 commits July 18, 2023 16:18

add test for cache filling

2e148e7

Merge branch 'master' into tadeo-hepperle-prevent-reuse-of-type-ids-i…

1b8b78b

…n-hashing

jsdw reviewed Jul 18, 2023

View reviewed changes

jsdw and others added 2 commits July 19, 2023 10:32

Merge branch 'master' into tadeo-hepperle-prevent-reuse-of-type-ids-i…

c9d8672

…n-hashing

adjust test

9cbb5f6

jsdw approved these changes Jul 19, 2023

View reviewed changes

Merge branch 'master' into tadeo-hepperle-prevent-reuse-of-type-ids-i…

3d0be91

…n-hashing

lexnv approved these changes Jul 19, 2023

View reviewed changes

tadeohepperle merged commit 475a141 into master Jul 19, 2023

tadeohepperle deleted the tadeo-hepperle-prevent-reuse-of-type-ids-in-hashing branch July 19, 2023 17:49

jsdw mentioned this pull request Jul 24, 2023

Prep for 0.30 release #1089

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Prevent bug when reusing type ids in hashing #1075

Prevent bug when reusing type ids in hashing #1075

tadeohepperle commented Jul 17, 2023 •

edited

Loading

jsdw commented Jul 17, 2023

jsdw Jul 17, 2023 •

edited

Loading

tadeohepperle Jul 17, 2023

jsdw Jul 17, 2023 •

edited

Loading

jsdw Jul 18, 2023

jsdw left a comment

jsdw Jul 18, 2023 •

edited

Loading

tadeohepperle Jul 19, 2023

jsdw Jul 19, 2023

Prevent bug when reusing type ids in hashing #1075

Prevent bug when reusing type ids in hashing #1075

Conversation

tadeohepperle commented Jul 17, 2023 • edited Loading

jsdw commented Jul 17, 2023

jsdw Jul 17, 2023 • edited Loading

Choose a reason for hiding this comment

tadeohepperle Jul 17, 2023

Choose a reason for hiding this comment

jsdw Jul 17, 2023 • edited Loading

Choose a reason for hiding this comment

jsdw Jul 18, 2023

Choose a reason for hiding this comment

jsdw left a comment

Choose a reason for hiding this comment

jsdw Jul 18, 2023 • edited Loading

Choose a reason for hiding this comment

tadeohepperle Jul 19, 2023

Choose a reason for hiding this comment

jsdw Jul 19, 2023

Choose a reason for hiding this comment

tadeohepperle commented Jul 17, 2023 •

edited

Loading

jsdw Jul 17, 2023 •

edited

Loading

jsdw Jul 17, 2023 •

edited

Loading

jsdw Jul 18, 2023 •

edited

Loading