-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
protocols/kad: Implement S-Kademlia's lookup over disjoint paths #1412
Conversation
The extension paper S-Kademlia includes a proposal for lookups over disjoint paths. Within vanilla Kademlia, queries keep track of the closest nodes in a single bucket. Any adversary along the path can thus influence all future paths, in case they can come up with the next-closest (not overall closest) hops. S-Kademlia tries to solve the attack above by querying over disjoint paths using multiple buckets. To adjust the libp2p Kademlia implementation accordingly this change-set introduces the notion of `PathId`s into the `ClosestPeersIter` and the ability to specify the amount of disjoint paths to pursue in the `ClosestPeersIterConfig`. `ClosestPeersIter` explores each path in a round-robin fashion. It iterates the not-yet-contacted nodes one by one starting with the closest one. When the given peer origins from the initial set or from a past query with the path id currently selected, it is tagged with that path id and returned to the upper layer to be queried. Once iterated through all peers without success, one chooses the one with the lowest distance from the target detached from its path id (a bit like a work-steeling queue). Setting the `disjoint_paths` to `1` within the `ClosestPeersIterConfig` effectively disables the feature.
PeerState::Succeeded => | ||
if let Some(ref mut cnt) = result_counter { | ||
*cnt += 1; | ||
// If `num_results` successful results have been delivered for the | ||
// closest peers, the iterator is done. | ||
if *cnt >= self.config.num_results { | ||
if *cnt > self.config.num_results { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that I first iterate all peers looking for timeouts and successes, the function does not behave as before anymore even when disjoint_paths
is set to 1
(disabled). Only tests failing without the line above are put_record
and add_provider
. I still need to investigate this further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH for simplicity I would not even keep the old behaviour as it is obviously insecure and I can't see a use case for ever not enabling disjoint paths, even for testing/debugging purposes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that I first iterate all peers looking for timeouts and successes, the function does not behave as before anymore even when
disjoint_paths
is set to1
(disabled).
I don't currently see how this can possibly result in correct behaviour. By first counting all successes while skipping over not-yet-contacted peers, not only does that seem to violate the main termination condition for iterative queries (whether S/Kademlia or regular Kademlia), but it also seems to still permit a single malicious node to route the entire query to nodes of its own choosing and prematurely terminate the query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mxinden Apologies for taking so long to take a look. I was occupied with #1440 for the past weeks. As I mentioned in another comment, I don't currently see how this implementation can achieve the desired goals.
In terms of implementation strategy, I currently do not think changing the ClosestPeersIter
in any significant way is the way to go. Rather, I think a promising approach is to adapt the existing iterative query implementation in query.rs
to internally employ d
instances (one per desired "disjoint path") of ClosestPeersIter
. The initial starting peers as well as the desired number of results and the concurrency factor \alpha
of the query would be split among these instances (at least that is one possible approach and the one that does not result in more I/O than previously). As the query (i.e. it still looks like a single query on the API of query.rs
) gets results reported back from the network (behaviour), it can forward these to the iterators. Whenever an iterator yields a new peer to query, the query.rs
layer can check if another iterator already yielded that peer and if so, immediately report this peer for that iterator as "failed" (on_failure
). Rinse and repeat until the iterator returned a peer that no other iterator has used yet, which is then in turn yielded to the network behaviour to perform the I/O. The query terminates, once all iterators terminate (by the regular termination condition) and the resulting peers of the query is the union of the results of all iterators. This is a rough sketch, I'm not claiming that I've thought through all details and possible complications.
Also, the S/Kademlia paper seems to imply that such an implementation is coupled with an implementation of the "sibling list". Personally, I would actually like to see that implemented, because the current k-bucket implementation is based on a "flat array", not a tree, and thus does not implement the special-case bucket splitting from the original Kademlia paper, meaning the current implementation does not provide the same guarantees w.r.t. full knowledge of a node's "close neighbourhood" and it seems convenient that the sibling list supposedly serves as a replacement for that special-handling from the original paper.
PeerState::Succeeded => | ||
if let Some(ref mut cnt) = result_counter { | ||
*cnt += 1; | ||
// If `num_results` successful results have been delivered for the | ||
// closest peers, the iterator is done. | ||
if *cnt >= self.config.num_results { | ||
if *cnt > self.config.num_results { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that I first iterate all peers looking for timeouts and successes, the function does not behave as before anymore even when
disjoint_paths
is set to1
(disabled).
I don't currently see how this can possibly result in correct behaviour. By first counting all successes while skipping over not-yet-contacted peers, not only does that seem to violate the main termination condition for iterative queries (whether S/Kademlia or regular Kademlia), but it also seems to still permit a single malicious node to route the entire query to nodes of its own choosing and prematurely terminate the query.
Kademlia's iterative query should proceed until the k closest nodes were queried successfully, or there are no further nodes to query that have not succeeded, failed or timed out.
a8f7815 split the loop within `next()` into two. The first one checking for success, the second one checking for future nodes to connect to. This violates the main termination condition of Kademlia. One would return once one has `config.num_results` successfull replies, even though one is still aware of better peers to query. This patch fixes the regression. Overall this patch-set does force a query to use different disjoint paths. But in the case where a malicious node can reply significantly faster than non-malicious nodes, it can still control the result. The iterator picks peers from other paths if none are available for the current one. In addition it does not enforce the final result-set to have different paths. Thus if a malicious node is able to quickly fill up the success slots it controls the final result.
Closing in favor of #1473. |
The extension paper S-Kademlia [1] includes a proposal for lookups over
disjoint paths. Within vanilla Kademlia, queries keep track of the
closest nodes in a single bucket. Any adversary along the path can thus
influence all future paths, in case they can come up with the
next-closest (not overall closest) hops. S-Kademlia tries to solve the
attack above by querying over disjoint paths using multiple buckets.
To adjust the libp2p Kademlia implementation accordingly this change-set
introduces the notion of
PathId
s into theClosestPeersIter
and theability to specify the amount of disjoint paths to pursue in the
ClosestPeersIterConfig
.ClosestPeersIter
explores each path in a round-robin fashion. Ititerates the not-yet-contacted nodes one by one starting with the
closest one. When the given peer origins from the initial set or from a
past query with the path id currently selected, it is tagged with that
path id and returned to the upper layer to be queried.
Once iterated through all peers without success, one chooses the one
with the lowest distance from the target detached from its path id (a
bit like a work-steeling queue).
Setting the
disjoint_paths
to1
within theClosestPeersIterConfig
effectively disables the feature.
[1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.68.4986&rep=rep1&type=pdf
Status
This is still a work-in-progress pull request. I think the overall design of the change is already worth discussing (e.g. do we want this in the first place, should this be a separate
ClosestPeersIter
, ...). This is the least intrusive change, probably not as resilient as one could be and definitely not complete in regards to @infinity0 research. There is still a borrowing issue, some clean-up and plenty more tests to write.