Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

protocols/kad: Implement S-Kademlia's lookup over disjoint paths #1412

Closed
wants to merge 4 commits into from

Conversation

mxinden
Copy link
Member

@mxinden mxinden commented Jan 25, 2020

The extension paper S-Kademlia [1] includes a proposal for lookups over
disjoint paths. Within vanilla Kademlia, queries keep track of the
closest nodes in a single bucket. Any adversary along the path can thus
influence all future paths, in case they can come up with the
next-closest (not overall closest) hops. S-Kademlia tries to solve the
attack above by querying over disjoint paths using multiple buckets.

To adjust the libp2p Kademlia implementation accordingly this change-set
introduces the notion of PathIds into the ClosestPeersIter and the
ability to specify the amount of disjoint paths to pursue in the
ClosestPeersIterConfig.

ClosestPeersIter explores each path in a round-robin fashion. It
iterates the not-yet-contacted nodes one by one starting with the
closest one. When the given peer origins from the initial set or from a
past query with the path id currently selected, it is tagged with that
path id and returned to the upper layer to be queried.

Once iterated through all peers without success, one chooses the one
with the lowest distance from the target detached from its path id (a
bit like a work-steeling queue).

Setting the disjoint_paths to 1 within the ClosestPeersIterConfig
effectively disables the feature.

[1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.68.4986&rep=rep1&type=pdf

Status

This is still a work-in-progress pull request. I think the overall design of the change is already worth discussing (e.g. do we want this in the first place, should this be a separate ClosestPeersIter, ...). This is the least intrusive change, probably not as resilient as one could be and definitely not complete in regards to @infinity0 research. There is still a borrowing issue, some clean-up and plenty more tests to write.

The extension paper S-Kademlia includes a proposal for lookups over
disjoint paths. Within vanilla Kademlia, queries keep track of the
closest nodes in a single bucket. Any adversary along the path can thus
influence all future paths, in case they can come up with the
next-closest (not overall closest) hops. S-Kademlia tries to solve the
attack above by querying over disjoint paths using multiple buckets.

To adjust the libp2p Kademlia implementation accordingly this change-set
introduces the notion of `PathId`s into the `ClosestPeersIter` and the
ability to specify the amount of disjoint paths to pursue in the
`ClosestPeersIterConfig`.

`ClosestPeersIter` explores each path in a round-robin fashion. It
iterates the not-yet-contacted nodes one by one starting with the
closest one. When the given peer origins from the initial set or from a
past query with the path id currently selected, it is tagged with that
path id and returned to the upper layer to be queried.

Once iterated through all peers without success, one chooses the one
with the lowest distance from the target detached from its path id (a
bit like a work-steeling queue).

Setting the `disjoint_paths` to `1` within the `ClosestPeersIterConfig`
effectively disables the feature.
PeerState::Succeeded =>
if let Some(ref mut cnt) = result_counter {
*cnt += 1;
// If `num_results` successful results have been delivered for the
// closest peers, the iterator is done.
if *cnt >= self.config.num_results {
if *cnt > self.config.num_results {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that I first iterate all peers looking for timeouts and successes, the function does not behave as before anymore even when disjoint_paths is set to 1 (disabled). Only tests failing without the line above are put_record and add_provider. I still need to investigate this further.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH for simplicity I would not even keep the old behaviour as it is obviously insecure and I can't see a use case for ever not enabling disjoint paths, even for testing/debugging purposes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that I first iterate all peers looking for timeouts and successes, the function does not behave as before anymore even when disjoint_paths is set to 1 (disabled).

I don't currently see how this can possibly result in correct behaviour. By first counting all successes while skipping over not-yet-contacted peers, not only does that seem to violate the main termination condition for iterative queries (whether S/Kademlia or regular Kademlia), but it also seems to still permit a single malicious node to route the entire query to nodes of its own choosing and prematurely terminate the query.

@romanb romanb self-requested a review February 4, 2020 10:38
Copy link
Contributor

@romanb romanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mxinden Apologies for taking so long to take a look. I was occupied with #1440 for the past weeks. As I mentioned in another comment, I don't currently see how this implementation can achieve the desired goals.

In terms of implementation strategy, I currently do not think changing the ClosestPeersIter in any significant way is the way to go. Rather, I think a promising approach is to adapt the existing iterative query implementation in query.rs to internally employ d instances (one per desired "disjoint path") of ClosestPeersIter. The initial starting peers as well as the desired number of results and the concurrency factor \alpha of the query would be split among these instances (at least that is one possible approach and the one that does not result in more I/O than previously). As the query (i.e. it still looks like a single query on the API of query.rs) gets results reported back from the network (behaviour), it can forward these to the iterators. Whenever an iterator yields a new peer to query, the query.rs layer can check if another iterator already yielded that peer and if so, immediately report this peer for that iterator as "failed" (on_failure). Rinse and repeat until the iterator returned a peer that no other iterator has used yet, which is then in turn yielded to the network behaviour to perform the I/O. The query terminates, once all iterators terminate (by the regular termination condition) and the resulting peers of the query is the union of the results of all iterators. This is a rough sketch, I'm not claiming that I've thought through all details and possible complications.

Also, the S/Kademlia paper seems to imply that such an implementation is coupled with an implementation of the "sibling list". Personally, I would actually like to see that implemented, because the current k-bucket implementation is based on a "flat array", not a tree, and thus does not implement the special-case bucket splitting from the original Kademlia paper, meaning the current implementation does not provide the same guarantees w.r.t. full knowledge of a node's "close neighbourhood" and it seems convenient that the sibling list supposedly serves as a replacement for that special-handling from the original paper.

PeerState::Succeeded =>
if let Some(ref mut cnt) = result_counter {
*cnt += 1;
// If `num_results` successful results have been delivered for the
// closest peers, the iterator is done.
if *cnt >= self.config.num_results {
if *cnt > self.config.num_results {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that I first iterate all peers looking for timeouts and successes, the function does not behave as before anymore even when disjoint_paths is set to 1 (disabled).

I don't currently see how this can possibly result in correct behaviour. By first counting all successes while skipping over not-yet-contacted peers, not only does that seem to violate the main termination condition for iterative queries (whether S/Kademlia or regular Kademlia), but it also seems to still permit a single malicious node to route the entire query to nodes of its own choosing and prematurely terminate the query.

Kademlia's iterative query should proceed until the k closest nodes were
queried successfully, or there are no further nodes to query that have
not succeeded, failed or timed out.
a8f7815 split the loop within `next()` into two. The first one
checking for success, the second one checking for future nodes to
connect to. This violates the main termination condition of Kademlia.
One would return once one has `config.num_results` successfull replies,
even though one is still aware of better peers to query.

This patch fixes the regression.

Overall this patch-set does force a query to use different disjoint
paths. But in the case where a malicious node can reply significantly
faster than non-malicious nodes, it can still control the result. The
iterator picks peers from other paths if none are available for the
current one. In addition it does not enforce the final result-set to
have different paths. Thus if a malicious node is able to quickly fill
up the success slots it controls the final result.
@mxinden
Copy link
Member Author

mxinden commented Apr 6, 2020

Closing in favor of #1473.

@mxinden mxinden closed this Apr 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants