-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: dont gossip to disconnected nodes if none are connected #1042
chore: dont gossip to disconnected nodes if none are connected #1042
Conversation
1047c77
to
e3adcda
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is one more case missing
fn init_find_nodes_query(
&mut self,
target: &NodeId,
callback: Option<oneshot::Sender<Vec<Enr>>>,
) -> Option<QueryId> {
let target_key = Key::from(*target);
let mut closest_enrs = self.closest_connected_nodes(&target_key, self.query_num_results);
// All nodes may be disconnected at boot, before they respond to pings.
// If no connected nodes, use the closest nodes regardless of connectivity.
if closest_enrs.is_empty() {
closest_enrs = self
.kbuckets
.write()
.closest_values(&target_key)
.map(|closest| closest.value.enr)
.take(self.query_num_results)
.collect();
// `closest_values` return is empty if querying our own Node ID
if closest_enrs.is_empty() {
closest_enrs = self.table_entries_enr();
}
}
Should be changed to
fn init_find_nodes_query(
&mut self,
target: &NodeId,
callback: Option<oneshot::Sender<Vec<Enr>>>,
) -> Option<QueryId> {
let target_key = Key::from(*target);
let closest_enrs = self.closest_connected_nodes(&target_key, self.query_num_results);
if closest_enrs.is_empty() {
// If there are no nodes whatsoever in the routing table the query cannot proceed.
warn!("No nodes in routing table, find nodes query cannot proceed.");
if let Some(callback) = callback {
let _ = callback.send(vec![]);
}
return None;
}
Or something like that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛵
6b0e6df
to
c815599
Compare
@KolbyML Well, I agree with your suggestion, especially if we're following the convention that "disconnected" nodes should not be a concept outside of the routing table. However, making your suggested change does seem to have broken a few of our tests. At first glance, it seems like on startup, our routing table is designed to expect to operate w/ disconnected nodes. And the assumption is that as the initial find nodes queries process, they'll become connected. On one hand, we have this convention that we're trying to enforce "no disconnected nodes outside of the routing table" - on the other hand, we have broken tests and possibly significant amount of work to strictly enforce this convention. Imo, we should revert your suggested change, and track it in an issue so that it doesn't get dropped, and we can merge the other fixes asap. Thoughts? |
@njgheorghita the failing test appears to be for the same reason hive tests will fail, just merge the PR without the change and I will make a new one with the fix for the test. My priority is to move fast and fix bugs, so it is fine to merge and create an issue for this kind of thing. It might be a good idea to mark these kind of issues with a shelf life and low priority tag as well |
@KolbyML I strongly disagree that we should be merging broken tests into master |
I am not saying to merge with broken tests. I said i.e. remove the change which broke the tests then merge this PR. Then I will make a new PR with the changes "which broke the test" and fix the tests. I said nothing about merging failing CI code into master. This isn't a issue with this PR this is a issue with how the tests were written is it not? and being written under the assumption of "gossip to disconnected nodes if none are connected".
I said that is fine. I will just fix the miswritten tests in a follow up PR. I agreed with this and said I would fix it after this is merged.
I am not sure where this is coming from, maybe confusion over text which happens. But if something is unclear saying |
c815599
to
0f0a3fd
Compare
@KolbyML My bad, I just misunderstood your comment, sorry if it came across a bit harsh. When I read |
No worries
…On Thu, Nov 30, 2023 at 8:31 AM Nick Gheorghita ***@***.***> wrote:
@KolbyML <https://github.com/KolbyML> My bad, I just misunderstood your
comment, sorry if it came across a bit harsh. When I read just merge the
PR without the change and I will make a new one with the fix for the test.
I interpreted it as saying don't make any changes to the pr as is (eg.
merge it with failing tests) and you'll correct the failing tests in a
following pr. Now I understand that the change you were referencing was
actually the change that broke the tests. Again sorry for the
misunderstanding, communication over github can be tricky at times.
—
Reply to this email directly, view it on GitHub
<#1042 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AHRTWZB2BBQ5J3DWBD6XT73YHCRGJAVCNFSM6AAAAAA74LD3A2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQMZUGAYDMOBSGQ>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
What was wrong?
From a conversation on discord,
disconnected
nodes should only be used within the routing table, and shouldn't be exposed in exterior processes. Responsibility for pingingdisconnected
nodes falls within the scope of the routing table, and nowhere else.How was it fixed?
Removed usage of disconnected nodes from our gossip / rfc processes.
To-Do