
feat: add new message_hash column #2127

Closed
ABresting wants to merge 1 commit

Conversation

ABresting (Contributor)

Description

Adding the message_hash (messageHash) as a new column in the Waku archive protocol. To compute the messageHash, we refer to rfc_guide. The computeDigest function has been modified to also take the pubSubTopic as input without breaking the existing test cases. In future PRs, the existing id attribute/column in the databases (SQLite and PostgreSQL) will be removed, with due testing and pagination support.

The change provides a message-index-like attribute that will be used by sync-related protocols for archive Waku nodes.
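To make the description concrete, here is a minimal sketch (not the actual nwaku code) of what the extended proc could look like, assuming the digest is a nimcrypto sha256 over the pubsub topic, content topic and payload; the exact field set and ordering mandated by the referenced rfc_guide may differ, and the stand-in types below replace the real waku_core definitions.

import nimcrypto
import stew/byteutils

# Minimal stand-ins for the real nwaku types/constants, so the sketch compiles
# on its own; in nwaku these come from the waku_core / archive modules.
type
  WakuMessage = object
    payload: seq[byte]
    contentTopic: string

  MessageDigest = MDigest[256]

const DefaultPubsubTopic = "/waku/2/default-waku/proto"

proc computeDigest(msg: WakuMessage, pubSubTopic: string = DefaultPubsubTopic): MessageDigest =
  var ctx: sha256
  ctx.init()
  defer: ctx.clear()

  ctx.update(pubSubTopic.toBytes())       # new input: the pubsub topic
  ctx.update(msg.contentTopic.toBytes())
  ctx.update(msg.payload)

  return ctx.finish()                     # 256-bit digest stored as message_hash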

Changes

  • The message_hash attribute should be present in SQLite.
  • The message_hash attribute should be present in Postgres (a hypothetical migration sketch follows below).
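A hypothetical migration sketch (not nwaku's driver code) illustrating the kind of schema change these two items imply; the table name messages, the column name messageHash and the column types are assumptions, and Nim's stdlib db_sqlite is used purely for illustration.

import std/db_sqlite            # db_connector/db_sqlite on Nim 2.x

let db = open("waku_archive.sqlite3", "", "", "")

# Store the 256-bit digest as a raw blob on the SQLite side; a comparable
# Postgres statement could be: ALTER TABLE messages ADD COLUMN messageHash VARCHAR NULL;
db.exec(sql"ALTER TABLE messages ADD COLUMN messageHash BLOB")

db.close()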

Issue

#2112

@github-actions

This PR may contain changes to the database schema of one of the drivers.

If you are introducing any changes to the schema, make sure the upgrade from the latest release to this change passes without any errors/issues.

Please make sure the release-notes label is added so that the upgrade instructions properly highlight this change.

@github-actions

You can find the image built from this PR at

quay.io/wakuorg/nwaku-pr:2127

Built from 7d9408e

ABresting self-assigned this on Oct 13, 2023
alrevuelta (Contributor) left a comment

Approach looks good. Can we add a simple unit test?

Also not sure on the migration strategy, since it modifies the schema.

@@ -18,14 +19,16 @@ import

 type MessageDigest* = MDigest[256]

-proc computeDigest*(msg: WakuMessage): MessageDigest =
+proc computeDigest*(msg: WakuMessage, pubSubTopic: string = DefaultPubsubTopic): MessageDigest =
Member

Since every message would come through a pubSub topic, what is the point of the default here? (Feels like it would only lead to issues in the future if we forget to pass in a pubSub topic and then get inconsistent behaviour)

Member

Ah, that is for the test cases...would it make sense to rather fix the tests?

ABresting (Contributor Author)

Yes, but the pubSub topic needs to be passed to the computeDigest function, since it is not a WakuMessage attribute.

The default is used because the existing function is already used in so many test cases; one way to handle that (without changing all the other test cases) is to:

  • extend the function with a default (what I've used), or
  • add another function with a similar name and redirect the queue driver and Waku archive code to it for the production workflow

The better of the two seems to be the former. WDYT @vpavlin?
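For comparison, a rough sketch of the second option (the proc name is made up, and the same nimcrypto/stew imports and WakuMessage/MessageDigest stand-ins from the earlier sketch are assumed): keep the one-argument computeDigest untouched for the existing tests, and redirect the queue driver and Waku archive code to a topic-aware proc in production.

proc computeMessageHash(pubsubTopic: string, msg: WakuMessage): MessageDigest =
  var ctx: sha256
  ctx.init()
  defer: ctx.clear()
  ctx.update(pubsubTopic.toBytes())       # mandatory: no DefaultPubsubTopic fallback
  ctx.update(msg.contentTopic.toBytes())
  ctx.update(msg.payload)
  return ctx.finish()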

Member

Why is "update the tests" not an option?:)

ABresting (Contributor Author)

Approach looks good. Can we add a simple unit test?

I am wondering what a simple unit test for messageHash would look like. Just like id, it is computationally generated, and there is no existing test case for id either.

If I let my imagination run: a hash implies something unique, so maybe a hash-collision test, to make sure that the computed hash is indeed unique? IDK if this is overkill... WDYT @alrevuelta?

vpavlin (Member) commented Oct 16, 2023

I am wondering what a simple unit test for messageHash would look like. Just like id, it is computationally generated, and there is no existing test case for id either.

You could just test that if a different pubsub topic is used, or a metadata field has a different value, the hashes are different.

I'd say it is less about testing the hash function output and more about making sure the digest proc's inputs actually influence the output.

I.e. if someone later changes computeDigest back to the previous version by removing the pubSub topic from it, your test should catch the regression.

Testing whether the hash collides does not make sense, unless we are actually implementing the hash algorithm ourselves.
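A minimal sketch of the regression test described above, assuming the extended computeDigest(msg, pubSubTopic) signature from this PR; the import path, message fields and topic strings are illustrative rather than the exact nwaku ones.

import std/unittest
import ../../waku/waku_archive/common   # assumed location of computeDigest / WakuMessage

suite "Waku archive - message digest":
  test "digest changes when the pubsub topic changes":
    let msg = WakuMessage(payload: @[byte 1, 2, 3],
                          contentTopic: "/waku/2/default-content/proto")

    # Hash the same message under two different pubsub topics.
    let digestA = computeDigest(msg, "/waku/2/rs/0/1")
    let digestB = computeDigest(msg, "/waku/2/rs/0/2")

    # If someone removes the pubsub topic from the digest inputs again,
    # these two values become equal and this check catches the regression.
    check digestA != digestB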

Ivansete-status (Collaborator) left a comment

Thanks for the PR! I think it is a good step forward!

Nevertheless, would it be possible to split it into three PRs?

  1. Changing the Postgres schema.
  2. Changing the SQLite schema.
  3. Changing the queue schema.

With respect to Postgres, we will need to coordinate the change with the infra team so that the shards.test fleet is properly updated beforehand.

Changing the SQLite schema will require adding a new "migration" script where we actually perform the schema change, and we will need to test it manually and check that the schema is properly updated.
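As a minimal sketch of the manual check described here (again using Nim's stdlib db_sqlite purely for illustration, and reusing the assumed messages table and messageHash column names from the migration sketch above): after running the migration, read the table info and confirm the new column exists.

import std/db_sqlite            # db_connector/db_sqlite on Nim 2.x
import std/[sequtils, strutils]

let db = open("waku_archive.sqlite3", "", "", "")

# PRAGMA table_info returns one row per column; field 1 holds the column name.
let columns = db.getAllRows(sql"PRAGMA table_info(messages)").mapIt(it[1])

doAssert "messageHash" in columns, "migration did not add the messageHash column"
echo "schema looks good, columns: ", columns.join(", ")

db.close()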

ABresting (Contributor Author)

Closing this PR so that new PRs can be split and targeted towards the specific components such as Postgres, SQLite, the queue schema, etc. Thanks @Ivansete-status for the suggestions, and thanks for the feedback/review from Alvaro and Vaclav!

ABresting closed this on Oct 16, 2023