
refactor: refactor gossip usage of kbuckets #1480

Merged
morph-dev merged 3 commits into ethereum:master from gossip_refactor on Sep 25, 2024

Conversation

morph-dev (Collaborator)

What was wrong?

The gossip logic holds the kbuckets lock across a non-trivial number of code lines, which makes my planned improvements to kbuckets locking somewhat hard. So I want to refactor this logic upfront.

How was it fixed?

Extracted the logic that uses locks into separate functions: interested_enrs and batch_interested_enrs. These functions will be moved into a separate structure in the next PR, so they shouldn't contain gossip-specific functionality (e.g. calling select_gossip_recipients).
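
For context, a minimal self-contained sketch of the shape of the two helpers (toy stand-in types; trin's real distance, radius, and kbuckets types differ):

use std::collections::HashMap;

type ContentId = [u8; 32];
type NodeId = [u8; 32];
type Radius = [u8; 32];

// XOR distance between a content id and a node id.
fn distance(content_id: &ContentId, node_id: &NodeId) -> [u8; 32] {
    let mut out = [0u8; 32];
    for i in 0..32 {
        out[i] = content_id[i] ^ node_id[i];
    }
    out
}

// Single-id variant: all nodes whose radius (inclusive) covers the content id.
fn interested_enrs(content_id: &ContentId, nodes: &[(NodeId, Radius)]) -> Vec<NodeId> {
    nodes
        .iter()
        .filter(|(node_id, radius)| &distance(content_id, node_id) <= radius)
        .map(|(node_id, _)| *node_id)
        .collect()
}

// Batch variant: one pass over the (locked) table serves many content ids, so
// the lock does not have to be taken once per content item.
fn batch_interested_enrs(
    content_ids: &[ContentId],
    nodes: &[(NodeId, Radius)],
) -> HashMap<ContentId, Vec<NodeId>> {
    content_ids
        .iter()
        .map(|id| (*id, interested_enrs(id, nodes)))
        .collect()
}

Neither helper knows anything about gossip recipients, which is what keeps them movable into a separate structure later.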

To-Do

@morph-dev (Collaborator, Author)

My intention was NOT to change any functionality. But I observed some things in the code that I'm questioning, so I would like to highlight them here, and we can discuss whether they are correct.

  1. For gossip logic, we remove all non-connected nodes. I don't understand the details of how the underlying protocol works, but I assume nodes are "not connected" only while transferring data to/from them. Otherwise, this sounds like a mistake.
  2. We gossip only to nodes whose radius indicates that they are interested in the content. Is this according to spec?
    • What if the average radius ends up being really small (e.g. imagine a populated state network) and none of the nodes in our routing table want that specific content? The gossip would die with us.
    • I assume this is ok/desired if gossip is done as a follow-up to receiving content via OFFER, but it sounds wrong if we are handling a JSON-RPC gossip request (in which case maybe we want to do some form of Recursive Gossip?).
  3. (somewhat unrelated to this change) Looking at OverlayService::handle_offer, it seems that if we accept the offer but there is an error while receiving the content (e.g. a network error, or validation fails), we call fallback_find_content, which to my understanding tries to get the content from some other peer.
    • First, this doesn't work at all for the state network (as the content we receive that way is not verifiable).
    • Second, doesn't this expose an attack vector where someone can spam us with offer requests that we accept (they fall within our radius), but they never provide us with any content, and we end up spamming other peers with requests for non-existent content?
  4. Enr used to be converted to String (using .to_base64()) and later converted back to Enr (using Enr::from_str(..)); see the sketch after this list.
    • I removed this because I don't understand why it was done like that. Maybe a leftover from some previous refactoring?!
    • If this is desired, I will revert it.
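
To make point 4 concrete, here is the shape of the removed round-trip (a hedged sketch; CombinedKey is just one possible key type from the enr crate):

use std::str::FromStr;
use enr::{CombinedKey, Enr};

// The removed pattern: serialize an Enr to base64 to key a map, then parse it
// back into an Enr later. The refactor keeps the Enr value around instead.
fn round_trip(enr: &Enr<CombinedKey>) -> Enr<CombinedKey> {
    let base64: String = enr.to_base64();
    Enr::from_str(&base64).expect("a valid ENR survives the base64 round-trip")
}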

@njgheorghita (Collaborator) left a comment

  1. IIRC there's some reason that non-connected nodes are in our routing table... maybe because we only evict nodes if a bucket is full? So even if a node disconnects but the bucket isn't full, we keep it in, and we need to filter these out from active gossip.
  2. I don't think this is spec-defined; it's up to the client implementation... In the past, I've attempted to integrate an RFN lookup (so we include some nodes, that might not be in our routing table, in the gossip)... idk why, but it just didn't work very well (most likely due to implementation error), though it's probably worth another effort.

3.1 I think you're right that this doesn't work for state. We'll need to implement some special handling depending on network type.

3.2 IIUC fallback_find_content only triggers concurrent subsequent requests if offers are received from other peers during transmission/validation for the given content key. So if a peer spams us with random content keys, but we don't receive offers for those content keys from another peer during the transfer, then the process will stop. Unless a malicious party spins up 100 nodes and offers us the same spammy content key from each node... which I guess is an attack vector...

  4. Probably just a leftover from earlier refactoring.

This is just off the top of my head though, but I'm happy to dig deeper / verify any assumptions above.

// Key is base64 string of node's ENR.
let mut enrs_and_content: HashMap<String, Vec<(TContentKey, Vec<u8>)>> = HashMap::new();
// Map from content_ids to interested ENRs
let mut interested_ends = batch_interested_enrs::<TMetric>(&content_ids, kbuckets);
Collaborator

interested_enrs

Collaborator

Or I guess

Suggested change
- let mut interested_ends = batch_interested_enrs::<TMetric>(&content_ids, kbuckets);
+ let mut interested_enr_batch = batch_interested_enrs::<TMetric>(&content_ids, kbuckets);

Since interested_enrs is used below for those interested in a single ID.

morph-dev (Collaborator, Author)

I settled for content_id_to_interested_enrs

@carver (Collaborator) left a comment

Yup, looks great! 👍🏻

For your questions. Agree with Nick, but to add:
2. I think it's ok (good?) that a peer is only effective at gossiping to nearby peers. Let the content gossip die out if there are no known interested peers. Recursive gossip sounds like the job of the bridge, to me. So no change necessary.
3. I stumbled on that code-block myself recently and had the same concern. I don't understand Nick's take.

Comment on lines 299 to 301
let random_farther_enrs = farther_enrs
.into_iter()
.choose_multiple(&mut rand::thread_rng(), NUM_FARTHER_NODES);
Collaborator

Surely this isn't of practical importance to performance, because I wouldn't expect enrs to get too huge, but I got curious and thought I would report my findings:

  • Randomly selecting from an iterator seems like it needs to iterate the whole thing, making it O(farther_enrs.len()) here, which is supported by the docs
  • The algorithmically fastest option I found is partial_shuffle() on a slice, which is O(NUM_FARTHER_NODES), as makes sense

@morph-dev (Collaborator, Author) commented Sep 25, 2024

I also think it's of little practical importance, but since we started the debate, let's go deeper :)

I see 3 ways to do it:

// 1. Shuffle: clean, very readable, no cloning. BAD: O(farther_enrs.len())
farther_enrs.shuffle(&mut rand::thread_rng());
farther_enrs.truncate(NUM_FARTHER_NODES);
enrs.extend(farther_enrs);

// 2. Partial shuffle: clean, O(NUM_FARTHER_NODES), BAD: we have to clone ENRs
let (selected_farther_enrs, _other_farther_enrs) =
   farther_enrs.partial_shuffle(&mut rand::thread_rng(), NUM_FARTHER_NODES);
enrs.extend_from_slice(selected_farther_enrs);

// 3. Manual: simple, O(NUM_FARTHER_NODES), no cloning. BAD: not as clean, very manual
let mut rng = rand::thread_rng();
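// Note: this assumes farther_enrs.len() >= NUM_FARTHER_NODES (presumably
// guaranteed by the surrounding code); otherwise gen_range over an emptied
// vec would panic.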
for _ in 0..NUM_FARTHER_NODES {
    let enr = farther_enrs.swap_remove(rng.gen_range(0..farther_enrs.len()));
    enrs.push(enr);
}

The first one is the cleanest, but has the worst performance (the question is how much that matters).

The second one has good performance, but we have to clone ENRs (again, it's only 4 of them, does it matter?).

The last one, which I ended up going with, has both the best performance and no cloning. The downside is that I don't find it as "clean" as the other two, but I think it's simple and clean enough.

I don't have strong preferences for any of them, so if you do, I'm willing to change it.

Collaborator

Hah, a good bit of performance fun. Good catch on the cloning. LGTM!

let permit = match utp_controller {
    Some(ref utp_controller) => match utp_controller.get_outbound_semaphore() {
        Some(permit) => Some(permit),
-       None => continue,
+       None => {
+           debug!("Permit for gossip not acquired! Skipping gossiping to enr: {enr}");
Collaborator

IMO a log that could spew inside a loop like this belongs in a trace!


.filter(|node| {
-     XorMetric::distance(&content_id, &node.key.preimage().raw()) < node.value.data_radius()
+     TMetric::distance(content_id, &node.key.preimage().raw()) <= node.value.data_radius
Collaborator

I imagine we are inconsistent about < and <= for comparing to the radius. I just scanned the specs and don't see an official disambiguation. It's fine to change, because I think it's practically irrelevant. Do you think it's important? If so, maybe we should put it in the spec.

morph-dev (Collaborator, Author)

In my mental model, the radius was always inclusive. The reason being: if the radius is MAX, we want all content, even content that is the exact opposite of our node_id (in practice, not likely to happen).

As I was typing this, I realized I can see the argument for the radius being non-inclusive: if the radius is zero, we don't want any content, even content whose id equals our node_id. If I have to pick which of the two impossible scenarios makes more sense, I would still go with the first one (you can interpret it as "I'm responsible for my node_id +/- radius", in which case you are always "responsible" for at least your node_id, because it's about you).

When deciding whether we should store content (link), we use the radius as inclusive, so at least now we will be somewhat consistent.

Either way, I think it's practically irrelevant, so I don't think we need to update the spec. We can ask others; maybe somebody has a strong opinion or a good reason for one way or the other.
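
A tiny illustration of the inclusive convention argued for above (u128 is a stand-in for the actual distance/radius type):

// Inclusive check: radius == MAX accepts everything (even the exact opposite
// of our node_id), and radius == 0 still accepts content whose id equals our
// node_id (distance == 0).
fn is_interested(distance: u128, radius: u128) -> bool {
    distance <= radius
}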

@morph-dev (Collaborator, Author) commented Sep 25, 2024

  1. I think the main question causing my confusion is: when is a node considered connected? From my understanding, we don't keep open connections to other nodes. Or do we?! I guess it's time for me to read Discv5 in detail...

  3.1 I will open a bug to track this for the state network (as it doesn't work in that case).

> 3. I stumbled on that code-block myself recently and had the same concern. I don't understand Nick's take.

I think it tries to cover the scenario where we receive multiple offers for the same content id (e.g. new data being seeded into the network, and we are offered the same content from multiple other nodes as part of their gossip). There is logic to accept only one Offer request for one content key, but if that one fails, we try to get the content from another peer that offered it to us. @njgheorghita, is that correct? See the sketch below.
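
A self-contained toy model of that mechanism as I understand it (names and types are assumed, not trin's actual code): record every peer that offers a content key while a transfer is in flight; on failure, retry only against those recorded peers, so the process stops if nobody else offered the key.

use std::collections::HashMap;

struct OfferTracker {
    // content key -> peers that offered it while a transfer was in flight
    fallback_peers: HashMap<Vec<u8>, Vec<String>>,
}

impl OfferTracker {
    // Called for each duplicate offer received during transmission/validation.
    fn record_offer(&mut self, content_key: Vec<u8>, peer: String) {
        self.fallback_peers.entry(content_key).or_default().push(peer);
    }

    // Called when the accepted transfer fails: drain the recorded peers. An
    // empty result means no fallback requests go out at all.
    fn fallback_candidates(&mut self, content_key: &[u8]) -> Vec<String> {
        self.fallback_peers.remove(content_key).unwrap_or_default()
    }
}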

@morph-dev morph-dev merged commit 24ed772 into ethereum:master Sep 25, 2024
9 checks passed
@morph-dev morph-dev deleted the gossip_refactor branch September 25, 2024 09:41
@njgheorghita (Collaborator)

@morph-dev yup! It's basically protection for the "thundering herd" problem, as you described, though maybe the actual mechanism could use a better name, since "fallback" might seem to imply that it will go out to seek all failed content txs from the network...

@carver (Collaborator) commented Sep 25, 2024

> 1. I think the main question causing my confusion is: when is a node considered connected? From my understanding, we don't keep open connections to other nodes. Or do we?! I guess it's time for me to read Discv5 in detail...

Right, there is no formal open connection. As I recall, a node is considered disconnected if it recently failed to respond to a ping or other message. Then we sporadically ping "disconnected" nodes to see if they're back online.

ETA an example of node status changing to Disconnected after a request timeout:
https://github.com/sigp/discv5/blob/682b51e317216b1e2ec3d75b541cdff66d27cf32/src/service.rs#L1526
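
A minimal sketch (not discv5's real types) of the liveness model described above: there is no socket to keep open, just a status derived from recent responses.

// Nodes carry a status rather than a connection; a timeout flips it.
enum ConnectionState {
    Connected,    // responded to the most recent ping/request
    Disconnected, // recently timed out; kept in the table until its bucket needs the slot
}

// Mirrors the linked discv5 code path: a request timeout marks the node Disconnected.
fn on_request_timeout(state: &mut ConnectionState) {
    *state = ConnectionState::Disconnected;
}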
