protocols/kad: Implement S-Kademlia's lookup over disjoint paths #1412

mxinden · 2020-01-25T23:58:38Z

The extension paper S-Kademlia [1] includes a proposal for lookups over
disjoint paths. Within vanilla Kademlia, queries keep track of the
closest nodes in a single bucket. Any adversary along the path can thus
influence all future paths, in case they can come up with the
next-closest (not overall closest) hops. S-Kademlia tries to solve the
attack above by querying over disjoint paths using multiple buckets.

To adjust the libp2p Kademlia implementation accordingly this change-set
introduces the notion of PathIds into the ClosestPeersIter and the
ability to specify the amount of disjoint paths to pursue in the
ClosestPeersIterConfig.

ClosestPeersIter explores each path in a round-robin fashion. It
iterates the not-yet-contacted nodes one by one starting with the
closest one. When the given peer origins from the initial set or from a
past query with the path id currently selected, it is tagged with that
path id and returned to the upper layer to be queried.

Once iterated through all peers without success, one chooses the one
with the lowest distance from the target detached from its path id (a
bit like a work-steeling queue).

Setting the disjoint_paths to 1 within the ClosestPeersIterConfig
effectively disables the feature.

[1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.68.4986&rep=rep1&type=pdf

Status

This is still a work-in-progress pull request. I think the overall design of the change is already worth discussing (e.g. do we want this in the first place, should this be a separate ClosestPeersIter, ...). This is the least intrusive change, probably not as resilient as one could be and definitely not complete in regards to @infinity0 research. There is still a borrowing issue, some clean-up and plenty more tests to write.

The extension paper S-Kademlia includes a proposal for lookups over disjoint paths. Within vanilla Kademlia, queries keep track of the closest nodes in a single bucket. Any adversary along the path can thus influence all future paths, in case they can come up with the next-closest (not overall closest) hops. S-Kademlia tries to solve the attack above by querying over disjoint paths using multiple buckets. To adjust the libp2p Kademlia implementation accordingly this change-set introduces the notion of `PathId`s into the `ClosestPeersIter` and the ability to specify the amount of disjoint paths to pursue in the `ClosestPeersIterConfig`. `ClosestPeersIter` explores each path in a round-robin fashion. It iterates the not-yet-contacted nodes one by one starting with the closest one. When the given peer origins from the initial set or from a past query with the path id currently selected, it is tagged with that path id and returned to the upper layer to be queried. Once iterated through all peers without success, one chooses the one with the lowest distance from the target detached from its path id (a bit like a work-steeling queue). Setting the `disjoint_paths` to `1` within the `ClosestPeersIterConfig` effectively disables the feature.

mxinden · 2020-01-26T00:01:10Z

protocols/kad/src/query/peers/closest.rs

                PeerState::Succeeded =>
                    if let Some(ref mut cnt) = result_counter {
                        *cnt += 1;
                        // If `num_results` successful results have been delivered for the
                        // closest peers, the iterator is done.
-                        if *cnt >= self.config.num_results {
+                        if *cnt > self.config.num_results {


Given that I first iterate all peers looking for timeouts and successes, the function does not behave as before anymore even when disjoint_paths is set to 1 (disabled). Only tests failing without the line above are put_record and add_provider. I still need to investigate this further.

TBH for simplicity I would not even keep the old behaviour as it is obviously insecure and I can't see a use case for ever not enabling disjoint paths, even for testing/debugging purposes.

Given that I first iterate all peers looking for timeouts and successes, the function does not behave as before anymore even when disjoint_paths is set to 1 (disabled).

I don't currently see how this can possibly result in correct behaviour. By first counting all successes while skipping over not-yet-contacted peers, not only does that seem to violate the main termination condition for iterative queries (whether S/Kademlia or regular Kademlia), but it also seems to still permit a single malicious node to route the entire query to nodes of its own choosing and prematurely terminate the query.

romanb

@mxinden Apologies for taking so long to take a look. I was occupied with #1440 for the past weeks. As I mentioned in another comment, I don't currently see how this implementation can achieve the desired goals.

In terms of implementation strategy, I currently do not think changing the ClosestPeersIter in any significant way is the way to go. Rather, I think a promising approach is to adapt the existing iterative query implementation in query.rs to internally employ d instances (one per desired "disjoint path") of ClosestPeersIter. The initial starting peers as well as the desired number of results and the concurrency factor \alpha of the query would be split among these instances (at least that is one possible approach and the one that does not result in more I/O than previously). As the query (i.e. it still looks like a single query on the API of query.rs) gets results reported back from the network (behaviour), it can forward these to the iterators. Whenever an iterator yields a new peer to query, the query.rs layer can check if another iterator already yielded that peer and if so, immediately report this peer for that iterator as "failed" (on_failure). Rinse and repeat until the iterator returned a peer that no other iterator has used yet, which is then in turn yielded to the network behaviour to perform the I/O. The query terminates, once all iterators terminate (by the regular termination condition) and the resulting peers of the query is the union of the results of all iterators. This is a rough sketch, I'm not claiming that I've thought through all details and possible complications.

Also, the S/Kademlia paper seems to imply that such an implementation is coupled with an implementation of the "sibling list". Personally, I would actually like to see that implemented, because the current k-bucket implementation is based on a "flat array", not a tree, and thus does not implement the special-case bucket splitting from the original Kademlia paper, meaning the current implementation does not provide the same guarantees w.r.t. full knowledge of a node's "close neighbourhood" and it seems convenient that the sibling list supposedly serves as a replacement for that special-handling from the original paper.

romanb · 2020-02-10T10:53:12Z

protocols/kad/src/query/peers/closest.rs

                PeerState::Succeeded =>
                    if let Some(ref mut cnt) = result_counter {
                        *cnt += 1;
                        // If `num_results` successful results have been delivered for the
                        // closest peers, the iterator is done.
-                        if *cnt >= self.config.num_results {
+                        if *cnt > self.config.num_results {


Given that I first iterate all peers looking for timeouts and successes, the function does not behave as before anymore even when disjoint_paths is set to 1 (disabled).

I don't currently see how this can possibly result in correct behaviour. By first counting all successes while skipping over not-yet-contacted peers, not only does that seem to violate the main termination condition for iterative queries (whether S/Kademlia or regular Kademlia), but it also seems to still permit a single malicious node to route the entire query to nodes of its own choosing and prematurely terminate the query.

Kademlia's iterative query should proceed until the k closest nodes were queried successfully, or there are no further nodes to query that have not succeeded, failed or timed out.

a8f7815 split the loop within `next()` into two. The first one checking for success, the second one checking for future nodes to connect to. This violates the main termination condition of Kademlia. One would return once one has `config.num_results` successfull replies, even though one is still aware of better peers to query. This patch fixes the regression. Overall this patch-set does force a query to use different disjoint paths. But in the case where a malicious node can reply significantly faster than non-malicious nodes, it can still control the result. The iterator picks peers from other paths if none are available for the current one. In addition it does not enforce the final result-set to have different paths. Thus if a malicious node is able to quickly fill up the success slots it controls the final result.

mxinden · 2020-04-06T08:20:29Z

Closing in favor of #1473.

mxinden commented Jan 26, 2020

View reviewed changes

romanb self-requested a review February 4, 2020 10:38

romanb reviewed Feb 10, 2020

View reviewed changes

mxinden added 3 commits February 11, 2020 19:39

protocols/kad/src/query/peers/closest: Revert num_result hack

e2897a4

protocols/kad/src/query/peers/closest: Add test for wrong early finish

aab9f3b

Kademlia's iterative query should proceed until the k closest nodes were queried successfully, or there are no further nodes to query that have not succeeded, failed or timed out.

mxinden mentioned this pull request Feb 26, 2020

protocols/kad: Implement S-Kademlia's lookup over disjoint paths v2 #1473

Merged

mxinden closed this Apr 6, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

protocols/kad: Implement S-Kademlia's lookup over disjoint paths #1412

protocols/kad: Implement S-Kademlia's lookup over disjoint paths #1412

mxinden commented Jan 25, 2020

mxinden Jan 26, 2020

infinity0 Jan 31, 2020

romanb Feb 10, 2020

romanb left a comment

romanb Feb 10, 2020

mxinden commented Apr 6, 2020

protocols/kad: Implement S-Kademlia's lookup over disjoint paths #1412

protocols/kad: Implement S-Kademlia's lookup over disjoint paths #1412

Conversation

mxinden commented Jan 25, 2020

Status

mxinden Jan 26, 2020

Choose a reason for hiding this comment

infinity0 Jan 31, 2020

Choose a reason for hiding this comment

romanb Feb 10, 2020

Choose a reason for hiding this comment

romanb left a comment

Choose a reason for hiding this comment

romanb Feb 10, 2020

Choose a reason for hiding this comment

mxinden commented Apr 6, 2020