More insight into Kademlia queries. #1567
More insight: The API allows iterating over the active queries and inspecting their state and execution statistics. More control: The API allows aborting queries prematurely at any time. To that end, API operations that initiate new queries return the query ID and multi-phase queries such as `put_record` retain the query ID across all phases, each phase being executed by a new (internal) query.
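As a rough illustration of the summary above, here is a minimal, self-contained model of a pool of queries that can be iterated, inspected, and aborted by ID. It is a sketch only: `QueryPool`, `Query`, `start`, `iter_active`, and `record` are hypothetical names invented for this example and are not the libp2p API; only the idea of a query ID, per-query statistics, and a `finish()` operation comes from the description.

```rust
use std::collections::BTreeMap;
use std::time::Duration;

/// Toy query identifier (hypothetical stand-in, not libp2p's type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct QueryId(pub u64);

/// Execution statistics, mirroring the fields named in the description.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct QueryStats {
    pub requests: u32,
    pub success: u32,
    pub failure: u32,
    pub duration: Duration,
}

/// A single query with its statistics and a "finished" flag.
#[derive(Debug)]
pub struct Query {
    pub stats: QueryStats,
    pub finished: bool,
}

impl Query {
    /// Record the outcome of one request issued by this query.
    pub fn record(&mut self, succeeded: bool) {
        self.stats.requests += 1;
        if succeeded {
            self.stats.success += 1;
        } else {
            self.stats.failure += 1;
        }
    }

    /// Abort the query prematurely.
    pub fn finish(&mut self) {
        self.finished = true;
    }
}

/// Hypothetical pool that owns all queries, keyed by ID.
#[derive(Default)]
pub struct QueryPool {
    next_id: u64,
    queries: BTreeMap<QueryId, Query>,
}

impl QueryPool {
    /// Start a new query and hand its ID back to the caller.
    pub fn start(&mut self) -> QueryId {
        let id = QueryId(self.next_id);
        self.next_id += 1;
        self.queries
            .insert(id, Query { stats: QueryStats::default(), finished: false });
        id
    }

    /// Iterate over all queries that have not been finished yet.
    pub fn iter_active(&self) -> impl Iterator<Item = (QueryId, &Query)> + '_ {
        self.queries
            .iter()
            .filter(|(_, q)| !q.finished)
            .map(|(id, q)| (*id, q))
    }

    /// Mutable access to a query by ID, e.g. to abort it.
    pub fn query_mut(&mut self, id: QueryId) -> Option<&mut Query> {
        self.queries.get_mut(&id)
    }
}
```

In this model a caller keeps the `QueryId` returned by `start`, later looks the query up with `query_mut(id)`, and calls `finish()` on it; `iter_active` then no longer yields that query.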
Thanks a bunch for this patch set! Among other things, this will help a lot in understanding the impact of #1473. I will deploy this pull request to kademlia-exporter.max-inden.de so we can test it out.
Would you mind mentioning the changes of this pull request in the changelog?
@@ -188,7 +199,7 @@ fn handle_input_line(kademlia: &mut Kademlia<MemoryStore>, line: String) {
publisher: None,
expires: None,
};
- kademlia.put_record(record, Quorum::One);
+ kademlia.put_record(record, Quorum::One).expect("Failed to store record locally.");
Instead of leaving the interpretation of the error up to the user (in this case the `expect` string), what do you think of having `RecordStore::put` return a proper error? Not saying this should happen within this pull request. I am happy to do that as a follow-up.
The interpretation isn't up to the user; the error type is a `store::Error`.
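To make the point about a typed store error concrete, here is a minimal sketch of a store whose `put` returns a `Result` with an error enum, so the caller can match on the cause rather than guess it. Everything here is an assumption for illustration: `BoundedStore`, `StoreError`, its variants, and the limits are hypothetical and are not taken from the libp2p `RecordStore` trait.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical error type for a bounded record store.
#[derive(Debug, PartialEq, Eq)]
pub enum StoreError {
    /// The store already holds the maximum number of records.
    MaxRecords,
    /// The value is larger than the store accepts.
    ValueTooLarge,
}

impl fmt::Display for StoreError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StoreError::MaxRecords => write!(f, "store is at record capacity"),
            StoreError::ValueTooLarge => write!(f, "value exceeds the size limit"),
        }
    }
}

impl std::error::Error for StoreError {}

/// Hypothetical in-memory store with a record limit and a value size limit.
pub struct BoundedStore {
    max_records: usize,
    max_value_bytes: usize,
    records: HashMap<Vec<u8>, Vec<u8>>,
}

impl BoundedStore {
    pub fn new(max_records: usize, max_value_bytes: usize) -> Self {
        BoundedStore { max_records, max_value_bytes, records: HashMap::new() }
    }

    /// Store a record, reporting why the store rejected it if it did.
    pub fn put(&mut self, key: Vec<u8>, value: Vec<u8>) -> Result<(), StoreError> {
        if value.len() > self.max_value_bytes {
            return Err(StoreError::ValueTooLarge);
        }
        // Overwriting an existing key does not grow the store.
        if !self.records.contains_key(&key) && self.records.len() >= self.max_records {
            return Err(StoreError::MaxRecords);
        }
        self.records.insert(key, value);
        Ok(())
    }
}
```

With an error type like this, a caller that uses `expect` still gets the concrete variant in the panic message, and a caller that wants to recover can `match` on it.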
protocols/kad/src/behaviour.rs
@@ -732,72 +770,114 @@ where
target = kbucket::Key::new(PeerId::random());
}
target
- }).collect::<Vec<_>>();
+ }).collect::<Vec<_>>().into_iter();
Why not make this iterator lazy, delaying the heavy computations until they are needed?
What exactly do you have in mind? Note the following:
- The absolute maximum size of this vector is 254 elements, but this is extremely unlikely (probabilistically). The realistic sizes are ~1-10 elements (iterations).
- There are no heavy computations (at least I don't consider hashing, xor and sampling bytes from a PRNG to be heavy computations).
- We need to know the length of this vector / iterator immediately below, so if we wanted to leave it as an iterator, it must be an `ExactSizeIterator`. `remaining` is put into the `QueryInfo`, for which I'd like to avoid adding type parameters.
Nevertheless, please let me know if you have a concrete approach in mind that is not invasive, maybe I'm missing something!
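The trade-off described in the list above can be sketched as follows. Collecting into a `Vec` and calling `into_iter()` yields a `std::vec::IntoIter<T>`, which is a concrete type (so the struct holding it needs no extra type parameter) and implements `ExactSizeIterator` (so the remaining count is available via `len()`). A lazy adapter such as `Filter` offers neither. `RefreshState`, `num_remaining`, and `next_target` are hypothetical names for this example, and `u32` stands in for the real key type.

```rust
use std::vec;

/// Hypothetical state holding the targets that are still to be queried.
/// The field type is concrete, so no generic parameter is needed here.
pub struct RefreshState {
    remaining: vec::IntoIter<u32>,
}

impl RefreshState {
    /// Eagerly collect the targets so that the exact count is known up front.
    pub fn new(targets: impl Iterator<Item = u32>) -> Self {
        RefreshState { remaining: targets.collect::<Vec<_>>().into_iter() }
    }

    /// Number of targets not yet handed out (via `ExactSizeIterator::len`).
    pub fn num_remaining(&self) -> usize {
        self.remaining.len()
    }

    /// Hand out the next target, if any.
    pub fn next_target(&mut self) -> Option<u32> {
        self.remaining.next()
    }
}
```

For example, `RefreshState::new((0u32..10).filter(|n| n % 3 == 0))` accepts a lazy `Filter` iterator, whose length is not known in advance, and turns it into a state with an exact remaining count.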
I have updated the Kademlia exporter with this patch and made it expose the new statistics as metrics. In addition I added a new graph (Number of hops per query) to the two (all-dhts, specific dht) dashboards at kademlia-exporter.max-inden.de. (It ignores
I will update the changelog once the PR has passed review just before merging. That way I don't need to adjust the changelog together with ongoing changes resulting from reviews. Just a personal preference.
This looks good.
Docker hub is currently building a new version of my Kademlia exporter with the recent changes of this pull request incorporated. I would suggest holding off merging until it is deployed.
New version of exporter is deployed on kademlia-exporter.max-inden.de.
I've been tagged as reviewer, but I don't really have the time/motivation to dig into the Kademlia code, and I'd trust Max's review here.
This is ready to be merged from my side (just missing the changelog entries). 95% of all get queries on Kusama without disjoint paths involve at most ~180 failed and ~60 succeeded requests. 95% of all get queries on Kusama without disjoint paths involve at most ~240 failed and ~60 succeeded requests. (
romanb#4 should fix the existing merge conflicts.
Merge branch 'libp2p/master' into kad-query-info
@mxinden Thanks a lot.
This is an extended proposal for providing more insight into and control over Kademlia queries, based on earlier discussions at #1494.

More insight: The API allows iterating over the active queries and inspecting their state and execution statistics (`Kademlia::iter_queries` via `QueryRef` and `Kademlia::iter_queries_mut` via `QueryMut`). Both `QueryRef` and `QueryMut` give access to the `QueryInfo` and `QueryStats`. I went a bit back-and-forth about whether to expose `QueryInfo` or not but in the end decided to do so. Any query state that is purely internal can still be kept in the internal `QueryInner`, which is not exposed. Whenever a query finishes, successfully or with an error, execution statistics are also reported in the `KademliaEvent`. The `QueryStats` include: number of requests initiated, number of successful requests, number of failed requests, as well as query duration.

More control: The API allows aborting queries prematurely at any time: `Kademlia::query_mut(id)` yields a `QueryMut` that provides `QueryMut::finish()`. This functionality existed before internally, but was not exposed.

To that end, API operations that initiate new queries return the query ID and multi-phase queries such as `put_record` retain the query ID across all phases, each phase being executed by a new (internal) query.

Tangentially, the behaviour of `Kademlia::bootstrap` has changed (for the better): Previously, once the initial self-lookup finished, queries to refresh all buckets beyond the first non-empty bucket would be initiated all at the same time; now they are initiated one after the other, which is more sane. The bootstrap events reported via a `KademliaEvent` also report the remaining number of bootstrap queries. When that counter reaches 0, bootstrapping is complete. This was not really detectable before.
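
The sequential bootstrap behaviour with a remaining counter can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the code from this pull request: `Bootstrap`, `BootstrapProgress`, and `on_query_finished` are hypothetical names, `u32` stands in for the key type, and the refresh targets are passed in up front, whereas the description says they are determined after the self-lookup.

```rust
use std::collections::VecDeque;

/// Hypothetical progress report emitted each time a bootstrap query finishes.
#[derive(Debug, PartialEq, Eq)]
pub struct BootstrapProgress {
    /// The target whose query just finished.
    pub target: u32,
    /// How many bootstrap queries are still to run; 0 means bootstrapping is done.
    pub num_remaining: usize,
}

/// Hypothetical bootstrap driver that runs one query at a time.
pub struct Bootstrap {
    pending: VecDeque<u32>,
}

impl Bootstrap {
    /// Queue the self-lookup first, followed by the bucket refresh targets.
    pub fn new(self_key: u32, refresh_targets: Vec<u32>) -> Self {
        let mut pending = VecDeque::with_capacity(refresh_targets.len() + 1);
        pending.push_back(self_key);
        pending.extend(refresh_targets);
        Bootstrap { pending }
    }

    /// Called when the current query finishes. Reports its target together
    /// with the number of queries left, then advances to the next query.
    /// Returns `None` if there was no query in progress.
    pub fn on_query_finished(&mut self) -> Option<BootstrapProgress> {
        let target = self.pending.pop_front()?;
        Some(BootstrapProgress { target, num_remaining: self.pending.len() })
    }
}
```

A caller of this model can treat the first report with `num_remaining == 0` as the signal that bootstrapping is complete.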