Update persistent networking parameters manager. #1762
Conversation
@@ -62,6 +76,13 @@ pub trait NetworkingParametersRegistry: Send + Sync {

    /// Enables Clone implementation for `Box<dyn NetworkingParametersRegistry>`
    fn clone_box(&self) -> Box<dyn NetworkingParametersRegistry>;

    /// Triggers when we removed the peer address from the permanent storage. Returns optional
I think this should be a bit more specific: we don't really remove it from persistent storage when this event is triggered. Maybe call it `on_unreachable` or something of that sort?
Hmm, I like `on_unreachable_address`, but `NetworkingParametersRegistry` only knows about address removal, nothing about connectivity. I could add a comment about the intended usage though.
But it does have a timer, and it does essentially check how long an address has been unreachable.
I updated the comment.
I still think we need to change it. The current name is misleading because the address is literally not removed from anywhere.
Ok. I'll change it to `on_unreachable_address`.
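The renamed hook being agreed on here could look roughly like the sketch below. Only the name `on_unreachable_address` comes from the thread; the struct, its fields, the `notify` method, and the handler signature are hypothetical stand-ins for illustration, not the crate's real API:

```rust
// Illustrative sketch only: callers register a handler that fires when an
// address has stayed unreachable past the registry's internal timeout,
// replacing the misleading "removed from permanent storage" wording.
struct NetworkingParametersRegistrySketch {
    // Each handler receives the address and whether it was the peer's last one.
    handlers: Vec<Box<dyn FnMut(&str, bool)>>,
}

impl NetworkingParametersRegistrySketch {
    fn new() -> Self {
        Self { handlers: Vec::new() }
    }

    // The renamed hook discussed above: subscribe to unreachable-address events.
    fn on_unreachable_address(&mut self, handler: Box<dyn FnMut(&str, bool)>) {
        self.handlers.push(handler);
    }

    // Would be fired by the registry's timer once an address times out.
    fn notify(&mut self, address: &str, last_address: bool) {
        for handler in &mut self.handlers {
            handler(address, last_address);
        }
    }
}
```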
        last_address
    };

    address_removed_events.push(address_removed);
Why not call event handlers from here? Looks like you're collecting them into a vector here just to iterate it afterwards. Or you can turn this `for_each` into `filter_map` and return an iterator of these things without collecting first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's easier to test this way. I don't expect a lot of elements in this vector.
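The trade-off in this exchange can be sketched in isolation. The event type, its fields, and the input shape below are invented stand-ins (the real code works over the registry's address records), but they show the `for_each`-into-`Vec` pattern next to the suggested `filter_map` iterator:

```rust
// Stand-in for the real address-removal event.
#[derive(Debug, PartialEq)]
struct AddressRemoved {
    peer_id: u32,
    last_address: bool,
}

// Current approach: push into a vector, iterate afterwards (easier to unit-test).
fn collect_removed(entries: &[(u32, bool, bool)]) -> Vec<AddressRemoved> {
    let mut events = Vec::new();
    entries.iter().for_each(|&(peer_id, expired, last_address)| {
        if expired {
            events.push(AddressRemoved { peer_id, last_address });
        }
    });
    events
}

// Reviewer's suggestion: yield events lazily, no intermediate allocation.
fn removed_iter(entries: &[(u32, bool, bool)]) -> impl Iterator<Item = AddressRemoved> + '_ {
    entries.iter().filter_map(|&(peer_id, expired, last_address)| {
        expired.then(|| AddressRemoved { peer_id, last_address })
    })
}
```

Both produce the same events; the iterator version lets the caller invoke handlers while driving it, at the cost of a slightly less direct unit test.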
if event.last_address {
    self.swarm
        .behaviour_mut()
        .kademlia
        .remove_peer(&event.peer_id);
} else {
    self.swarm
        .behaviour_mut()
        .kademlia
        .remove_address(&event.peer_id, &event.address);
}
Just curious, will Kademlia really keep a peer in the routing table that we know no address of? Also, maybe you can query Kademlia for the list of addresses rather than sending `last_address` with the event? It just looks like an awkward API.
Yes, it does. However, I think there is a bug here anyway. The problem will arise when we remove known addresses that are unresponsive and also remove other addresses (via `remove_peer`) that we don't know about because we didn't dial them previously. I changed the logic to handle the `UnroutablePeer` event instead.
Okay, that makes sense, and I think we should then replace `handle_removed_address_event` with `handle_removed_peer`. But even then it seems problematic that we may not remember all the peers in Kademlia anyway, and are thus unable to remove them. Should known peers be unbounded in memory then and bounded when written to disk (by unbounded I mean bounded by Kademlia externally)?
I'm not sure I followed here. I moved the `remove_peer` call from there to the `UnroutablePeer` handler. This way I both remove unreachable addresses and handle unroutable peers. Am I missing something?
What I meant is that our known-peers cache is limited in size. It seems possible that some peers will be added to Kademlia but not to known peers. If that is the case, they will never be removed from Kademlia, and we do not have an event for things evicted from the LRU cache of known peers.
Can we derive cache limits in your pending PR from the Kademlia internals?
Not dynamically, but if we have fixed Kademlia parameters and we know the upper bound, then yes. I agree this PR is an improvement; I'm just trying to understand whether it is a 100% solution to the problem.
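A back-of-the-envelope sketch of the sizing question raised above. The constants are libp2p's commonly cited defaults, not values read from this crate: a Kademlia routing table over a 256-bit key space keeps at most `K_VALUE` peers in each of its 256 k-buckets, so a known-peers cache with at least that capacity could not be forced to evict a peer that Kademlia still tracks:

```rust
// Assumed libp2p defaults, for illustration only.
const K_VALUE: usize = 20; // default k-bucket size (replication factor)
const NUM_BUCKETS: usize = 256; // one k-bucket per bit of the key space

// Worst-case number of peers the routing table can hold at once; a
// known-peers cache sized to at least this bound is effectively
// "bounded by Kademlia externally", as the thread puts it.
fn known_peers_upper_bound() -> usize {
    K_VALUE * NUM_BUCKETS
}
```

In practice far buckets are nearly empty, so the real peer count sits well below this bound; it is a safe capacity, not an expected size.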
crates/subspace-networking/src/behavior/persistent_parameters.rs (outdated; resolved)
@@ -397,10 +398,16 @@ where
            .kademlia
            .remove_peer(&event.peer_id);
    } else {
        // Remove both versions of the address
Is it really possible to have both? It seems odd to not have an identity in the address; how would you establish a secure connection to someone whose identity you don't know?
Technically, it's possible: https://github.com/libp2p/rust-libp2p/blob/f95f39f634d08dad8df72d7ecf92312f9ad2d826/swarm/src/dial_opts.rs#L79. We don't use this feature though. AFAIK, in each case when we need to establish a connection with a peer, we have a function signature with both `PeerId` and `Multiaddress`.
I wouldn't handle it then if it is not used
I meant that when we use (implicitly) an address without the `P2p(peer_id)` protocol, we still have the `PeerId` known. But that's Kademlia internals. I saw both types of addresses in the logs, which is why I remove both types as well.
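The "both versions of the address" idea can be illustrated at the string level. This helper is entirely hypothetical (the real code works on `Multiaddr` values, not strings): from either form of an address it derives the bare variant and the `/p2p/<peer_id>`-suffixed variant, so both could be handed to Kademlia's `remove_address`:

```rust
// Hypothetical helper: normalize one multiaddr string into both the bare
// form and the form carrying a trailing /p2p/<peer_id> component.
fn address_variants(addr: &str, peer_id: &str) -> (String, String) {
    let suffix = format!("/p2p/{peer_id}");
    // Strip the suffix if present; otherwise the input is already bare.
    let bare = addr.strip_suffix(suffix.as_str()).unwrap_or(addr).to_string();
    let with_peer = format!("{bare}{suffix}");
    (bare, with_peer)
}
```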
Well, I'm not sure when we use an address without a Peer ID, which is why I'm confused.
We use addresses without a Peer ID when we add them as self-reported addresses to Kademlia. They become self-reported when they are taken from the listening and external address lists on the other peer.
But we do not add our own address to our Kademlia, do we? I don't think I follow you here.
Actually, we do (implicitly): when we add external addresses and specify listening addresses. When we request providers, Kademlia fills in the providers with those addresses (both lists chained).
# Conflicts:
#   crates/subspace-networking/src/behavior/tests.rs
#   crates/subspace-networking/src/bin/subspace-bootstrap-node/main.rs
#   crates/subspace-networking/src/create.rs
…pace#1791 autonomys/subspace#1786 autonomys/subspace#1787 autonomys/subspace#1785 autonomys/subspace#1783 autonomys/subspace#1761 autonomys/subspace#1782 autonomys/subspace#1784 autonomys/subspace#1778 autonomys/subspace#1776 autonomys/subspace#1762 autonomys/subspace#1772 autonomys/subspace#1777 autonomys/subspace#1767 autonomys/subspace#1775 autonomys/subspace#1768 autonomys/subspace#1771 autonomys/subspace#1760 autonomys/subspace#1766 autonomys/subspace#1742 autonomys/subspace#1765 autonomys/subspace#1770 autonomys/subspace#1764
This PR contains two changes to the persistent networking parameters manager:
Code contributor checklist: