Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(gossipsub): prevent erroneously duplicate message IDs #3716

Merged
merged 11 commits into from
Apr 4, 2023

Conversation

dariusc93
Copy link
Member

@dariusc93 dariusc93 commented Mar 31, 2023

Description

Previously, we only mutably borrowed the last_seq_no in the current scope but did not modify the underlying number. This is because u64 is copy and calling wrapping_add consumes self so the compiler just copied it. We introduce a new-type instead that is not Copy.

Additionally, wrapping_add and initializing with a random u64 might actually warp the number and thus not give us sequential numbers as intended in #3551. To solve this, we initialize with the current unix timestamp in nanoseconds. This allows a node to publish 1000000 messages a second and still not reuse sequence numbers even after a restart / re-initialization of the configuration. This is also what the go implementation does.

Resolves #3714.

Co-authored-by: Thomas Eizinger [email protected]

Notes & open questions

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • A changelog entry has been made in the appropriate crates

@thomaseizinger thomaseizinger changed the title fix(gossip): Mutablely borrow last_seq_no fix(gossipsub): prevent erroneously duplicate message IDs Mar 31, 2023
Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the title and PR description, let me know if I understood the problem and fix correctly!

protocols/gossipsub/CHANGELOG.md Outdated Show resolved Hide resolved
protocols/gossipsub/CHANGELOG.md Show resolved Hide resolved
@thomaseizinger
Copy link
Contributor

cc @divagant-martian In case you are also upgrading to 0.51.2 already.

@thomaseizinger
Copy link
Contributor

Lockfile needs updating!

dariusc93 and others added 5 commits March 31, 2023 09:52
We should never need to clone this, it might break the uniqueness
guarantee of sequence numbers.
Basic data type are easily handled the wrong way because they implement
Copy.

This newtype doesn't implement Copy and increments every time to access
the sequence number.
@thomaseizinger
Copy link
Contributor

I've pushed an improved fix that should make handling this slightly safer.

cc @AgeManning I've also changed this to start with the current time. The thinking was that if we want to be strictly increasing, wrapping_add is the wrong function as we might actually not increase the sequence number if we start with a really high random number. Instead, I am using milliseconds of the current unix timestamp. This means, even if a node would start and burst out messages, then shut down and start again, they have to send more than 1000 messages per second before they'd get into a state where they'd send out messages that other nodes perceive as duplicates. I think that is reasonably safe.

I also chose to expect the wrapping. Even if we send 1 million messages a second, we'd only run out of space after 584942 years if my calculation is correct.

@thomaseizinger
Copy link
Contributor

cc @MarcoPolo This PR changes our implementation to be the exact same as go.

Copy link
Contributor

@AgeManning AgeManning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep this makes sense to me, the panic at u64::max() is for someone living 600k years in the future's problem (if they can keep a libp2p node running that long without updating :).)

This also doesn't effect us as we don't use sequence numbers, so all good from our end, thanks for the ping :).

@thomaseizinger
Copy link
Contributor

Thank you @dariusc93 for the initial discovery of the bug and the pull request.

@mergify mergify bot merged commit f2a7457 into libp2p:master Apr 4, 2023
@mxinden
Copy link
Member

mxinden commented Apr 4, 2023

Thanks @dariusc93 for debugging and proposing a patch!


@thomaseizinger shall I cut a new patch release already, or do you want to wait for #3625?

@mxinden mxinden mentioned this pull request Apr 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Duplicated error with gossipsub messages
4 participants