feat: add new message_hash column #2127
Conversation
This PR may contain changes to the database schema of one of the drivers. If you are introducing any changes to the schema, make sure the upgrade from the latest release to this change passes without any errors/issues. Please make sure the label
You can find the image built from this PR at
Built from 7d9408e
Approach looks good. Can we add a simple unit test?
Also, I'm not sure about the migration strategy, since it modifies the schema.
@@ -18,14 +19,16 @@ import

 type MessageDigest* = MDigest[256]

-proc computeDigest*(msg: WakuMessage): MessageDigest =
+proc computeDigest*(msg: WakuMessage, pubSubTopic: string = DefaultPubsubTopic): MessageDigest =
Since every message would come through a pubSub topic, what is the point of the default here? (Feels like it would only lead to issues in the future if we forget to pass in a pubSub topic and then get inconsistent behaviour)
Ah, that is for the test cases...would it make sense to rather fix the tests?
Yes, but the pubSub topic needs to be passed to the computeDigest function, as it is not a WakuMessage attribute.
The default is used since the existing function is already called in so many test cases; one way to get rid of it (without changing all the other test cases) is to:
- extend the function with a default (what I've used), or
- add another function with a similar name and call it from the queue driver and Waku archive code for the production workflow.
The better of the two seems to be the former. WDYT @vpavlin?
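To make the trade-off concrete, here is a rough Python sketch of the two options (the real code is Nim; `DEFAULT_PUBSUB_TOPIC` and the sha256-over-topic-plus-payload scheme below are illustrative assumptions, not the actual nwaku implementation):

```python
import hashlib

# Hypothetical default, standing in for nwaku's DefaultPubsubTopic constant.
DEFAULT_PUBSUB_TOPIC = "/waku/2/default-waku/proto"

# Option 1: extend the existing function with a defaulted topic parameter,
# so existing test-suite call sites keep working unchanged.
def compute_digest(payload: bytes, pubsub_topic: str = DEFAULT_PUBSUB_TOPIC) -> bytes:
    ctx = hashlib.sha256()
    ctx.update(pubsub_topic.encode())
    ctx.update(payload)
    return ctx.digest()

# Option 2: keep the old call shape and add a separately named function that
# production code (queue driver, archive) calls with an explicit topic.
def compute_digest_with_topic(payload: bytes, pubsub_topic: str) -> bytes:
    return compute_digest(payload, pubsub_topic)

# Old call sites still work; new code passes the topic explicitly.
legacy = compute_digest(b"hello")
explicit = compute_digest_with_topic(b"hello", "/waku/2/rs/0/1")
```

Option 1 keeps the diff small, at the cost of a silent fallback if a production caller forgets to pass the topic, which is exactly the concern raised above.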
Why is "update the tests" not an option? :)
I am wondering what a simple unit test for this would look like. If I allow my imagination to run: hash implies something unique, so maybe a hash collision test, to make sure that the computed hash is indeed unique? IDK if this is overkill..? WDYT @alrevuelta?
You could just test that if a different pubsub topic is used, or there is a different metadata field value, the hashes are different. I'd say it is less about testing the hash function output and more about making sure the digest proc inputs actually influence the output. I.e. if someone later changes computeDigest back to the previous version by removing the pubSub topic from it, your test should catch the regression. Testing whether the hash collides does not make sense, unless we are actually implementing the hash algo ourselves.
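A regression test along those lines might look like the following Python sketch (the real computeDigest is Nim; the sha256-over-topic-plus-payload stand-in here is an assumption for illustration, not the actual digest scheme):

```python
import hashlib

def compute_digest(payload: bytes, pubsub_topic: str) -> bytes:
    # Stand-in for computeDigest: hash the pubsub topic together with the payload.
    ctx = hashlib.sha256()
    ctx.update(pubsub_topic.encode())
    ctx.update(payload)
    return ctx.digest()

def test_digest_depends_on_inputs() -> None:
    payload = b"same payload"
    d1 = compute_digest(payload, "/waku/2/topic-a")
    d2 = compute_digest(payload, "/waku/2/topic-b")
    # Regression guard: if the pubsub topic is ever removed from the digest
    # inputs, these two digests become equal and the test fails.
    assert d1 != d2
    # Same inputs must stay deterministic.
    assert d1 == compute_digest(payload, "/waku/2/topic-a")

test_digest_depends_on_inputs()
```

The point is to pin down that each input influences the output, not to test SHA-256 itself.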
Thanks for the PR! I think it is a good step forward!
Nevertheless, would it be possible to split it into three PRs?
- Changing the Postgres schema.
- Changing the SQLite schema.
- Changing the queue schema.
With respect to Postgres, we will need to coordinate the change with the infra team so that the shards.test fleet is properly updated beforehand.
Changing the SQLite schema will require adding a new "migration" script where we actually perform the schema change, and we will need to test it manually and check that the schema is being properly updated.
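As an illustration, a minimal versioned SQLite migration could look like this Python sketch (the table/column names and the use of `PRAGMA user_version` are assumptions for the example, not taken from the actual nwaku migration scripts):

```python
import sqlite3

# Hypothetical migration: "message" and "messageHash" are illustrative names.
MIGRATION_SQL = """
ALTER TABLE message ADD COLUMN messageHash BLOB;
"""

def migrate(conn: sqlite3.Connection, target_version: int = 8) -> None:
    # SQLite stores a free-form schema version in PRAGMA user_version.
    (current,) = conn.execute("PRAGMA user_version").fetchone()
    if current >= target_version:
        return  # already migrated; running twice must be a no-op
    conn.executescript(MIGRATION_SQL)
    # PRAGMA arguments cannot be bound as parameters, hence the f-string.
    conn.execute(f"PRAGMA user_version = {target_version}")
    conn.commit()
```

Testing such a script manually amounts to running it against a database created from the previous release's schema and checking that the new column and the bumped version are present, and that re-running it changes nothing.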
Closing this PR so that new PRs are split, targeted towards the specific components such as the Postgres, SQLite, and queue schemas. Thanks @Ivansete-status for the suggestions, and for the feedback/review from Alvaro and Vaclav!
Description
Adding the message_hash (messageHash) as a new column in the Waku archive protocol. To compute the messageHash, we refer to rfc_guide. The computeDigest function has been modified to take the pubSubTopic in conjunction with the message, without breaking the existing test cases. In future PRs, the existing id attribute/column of the databases (SQLite and PostgreSQL) shall be removed, with due testing and pagination support.
The change will help attain a message-index-type attribute which will be used in sync-related protocols for archive Waku nodes.
Changes
Issue
#2112