feat(iroh-net): combine discovery services and add heuristics when to start discovery #2056
Conversation
///
/// This will be called from a tokio task, so it is safe to spawn new tasks.
/// These tasks will be run on the runtime of the [`super::MagicEndpoint`].
fn publish(&self, _info: &AddrInfo) {}
Why are these not async?
I can answer this at least for the original discovery trait.
They are intended to be "fire and forget". At the site where you call these you do not want to wait for them to complete. E.g. if we publish to an HTTP PUT endpoint we don't want to wait for a successful publish before continuing, since that would probably create havoc inside the magicsock and in any case not help anybody.
E.g. in the case of the pkarr discovery, this just puts the record somewhere and starts a loop to publish and republish to the DHT. Publishing to the DHT can take seconds; you don't want to wait for that. And what are you going to do if it fails, e.g. because you are offline?
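To make the "fire and forget" shape concrete, here is a minimal std-only sketch (not iroh-net's actual implementation; `AddrInfo` is reduced to a stand-in and the slow network PUT is replaced by a background worker thread): the non-async `publish` only pushes into a channel and returns immediately.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for iroh-net's `AddrInfo`; the real type
// carries more than a relay URL.
#[derive(Debug, Clone, PartialEq)]
pub struct AddrInfo {
    pub relay_url: Option<String>,
}

pub struct ChannelPublisher {
    tx: mpsc::Sender<AddrInfo>,
    handle: Option<thread::JoinHandle<Vec<AddrInfo>>>,
}

impl ChannelPublisher {
    pub fn new() -> Self {
        let (tx, rx) = mpsc::channel::<AddrInfo>();
        // Background worker standing in for the slow HTTP/DHT publish;
        // the caller of `publish` never waits on it.
        let handle = thread::spawn(move || {
            let mut published = Vec::new();
            for info in rx {
                // A real implementation would PUT to the endpoint here,
                // which can take seconds.
                published.push(info);
            }
            published
        });
        Self { tx, handle: Some(handle) }
    }

    /// Fire and forget: returns immediately, errors are ignored.
    pub fn publish(&self, info: &AddrInfo) {
        let _ = self.tx.send(info.clone());
    }

    /// Close the channel and collect what the worker saw (demo only).
    pub fn finish(mut self) -> Vec<AddrInfo> {
        drop(self.tx);
        self.handle.take().unwrap().join().unwrap()
    }
}
```

The caller's hot path (the magicsock) only ever pays for a channel send.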
Well, from a task management perspective I would have expected handling more like this:

// in the magicsock
let publish_task = tokio::task::spawn(discovery.publish());
// track the task or drop it, depending on the management

because this way you can e.g. cancel things on shutdown.
The task is owned by the discovery service, so it's the job of the discovery service to cancel the task on drop. I think there is not necessarily a 1:1 relation between tasks that need to be spawned and publish calls.
E.g. in the pkarr discovery there is just a single long-lived republish task owned by the discovery, and calling publish just replaces the value to be published with the new one.
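The "publish just replaces the value" pattern can be sketched as a latest-value slot shared between `publish` and a single republish loop (std-only sketch with a hypothetical `AddrInfo` stand-in, not iroh-net's code):

```rust
use std::sync::{Arc, Mutex};

// Hypothetical stand-in for iroh-net's `AddrInfo`.
#[derive(Debug, Clone, PartialEq)]
pub struct AddrInfo {
    pub relay_url: Option<String>,
}

/// Latest-value slot shared between `publish` and one long-lived
/// republish loop: `publish` swaps in the new value and returns,
/// and the loop reads whatever is current on its next tick.
#[derive(Clone, Default)]
pub struct PublishSlot(Arc<Mutex<Option<AddrInfo>>>);

impl PublishSlot {
    /// Called by the discovery's `publish`: replace, never block.
    pub fn publish(&self, info: &AddrInfo) {
        *self.0.lock().unwrap() = Some(info.clone());
    }

    /// What the republish loop would read each time it wakes up.
    pub fn current(&self) -> Option<AddrInfo> {
        self.0.lock().unwrap().clone()
    }
}
```

Older values are simply overwritten, which is why there is no 1:1 relation between publish calls and spawned tasks.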
How would I communicate that I want the publish to end, then? There is no shutdown method or anything like that if this service is long-running.
Should we maybe pass CancellationTokens into both resolve and publish? This would give full flexibility to both. The downside is that implementors would likely have to use tokio::select!. Still, even for the resolve case it could make implementations simpler. Depending on how a resolver works, detecting that the returned stream was dropped may not be straightforward or instant. E.g. with channels, you might only realize once you try to send, and so you might have been doing unneeded work already up to that point.
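The token idea can be sketched without tokio: a minimal blocking stand-in for `tokio_util::sync::CancellationToken` (the `resolve` function and its `&str` candidates are hypothetical, just to show a loop stopping early instead of discovering a dropped stream too late):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Minimal stand-in for `tokio_util::sync::CancellationToken`:
/// cloneable, and every clone observes `cancel()`.
#[derive(Clone, Default)]
pub struct CancelToken(Arc<AtomicBool>);

impl CancelToken {
    pub fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }
    pub fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// A resolve loop that polls the token instead of waiting to notice
/// that its output channel was dropped, so it stops doing unneeded
/// work as soon as the caller cancels.
pub fn resolve(token: &CancelToken, candidates: &[&str]) -> Vec<String> {
    let mut found = Vec::new();
    for c in candidates {
        if token.is_cancelled() {
            break;
        }
        found.push((*c).to_string());
    }
    found
}
```

In async code the equivalent is `tokio::select!` over `token.cancelled()` and the next unit of work.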
Or we could add an optional fn shutdown() to the discovery trait, which will be called in MagicEndpoint::close and can be used by the discovery service to abort all tasks it started.
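A sketch of that trait extension: a defaulted `shutdown` hook, so existing implementations keep compiling unchanged (this is a proposal sketch, not iroh-net's actual trait; `publish` takes a `&str` here instead of `AddrInfo` to keep it self-contained):

```rust
/// Proposed trait shape: `shutdown` has a default no-op body.
pub trait Discovery {
    fn publish(&self, info: &str);

    /// Would be called from `MagicEndpoint::close`; long-running
    /// services can use it to abort the tasks they spawned.
    fn shutdown(&mut self) {}
}

/// A service with a long-lived republish loop, modeled as a flag.
pub struct LoopingDiscovery {
    pub running: bool,
}

impl Discovery for LoopingDiscovery {
    fn publish(&self, _info: &str) {}

    fn shutdown(&mut self) {
        // A real service would abort its spawned tasks here.
        self.running = false;
    }
}
```

Services without background tasks simply inherit the no-op default.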
I like the cancellation token approach
When I saw the doc comment here I was wondering why the signature does not give you a handle to the runtime to spawn on, instead of having it just in the documentation that "tokio spawn is fine".
Given the cancellation discussion, maybe it could even be a JoinSet that you're politely asked to spawn into, and then you get aborting for free when the magicsock drops the JoinSet.
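The "aborting for free on drop" property of a tokio `JoinSet` can be sketched with std threads (a hypothetical `TaskSet`; unlike `JoinSet::abort_all`, threads cannot be force-aborted, so this sketch signals a stop flag and joins in `Drop`):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// std-thread analogue of dropping a tokio `JoinSet`: when `TaskSet`
/// is dropped, every task spawned into it is stopped and joined.
pub struct TaskSet {
    stop: Arc<AtomicBool>,
    handles: Vec<JoinHandle<()>>,
}

impl TaskSet {
    pub fn new() -> Self {
        Self {
            stop: Arc::new(AtomicBool::new(false)),
            handles: Vec::new(),
        }
    }

    /// Spawn a loop body that runs until the set is dropped.
    pub fn spawn<F: FnMut() + Send + 'static>(&mut self, mut body: F) {
        let stop = self.stop.clone();
        self.handles.push(thread::spawn(move || {
            while !stop.load(Ordering::SeqCst) {
                body();
                thread::yield_now();
            }
        }));
    }
}

impl Drop for TaskSet {
    fn drop(&mut self) {
        // Signal all spawned loops, then wait for them to finish.
        self.stop.store(true, Ordering::SeqCst);
        for handle in self.handles.drain(..) {
            let _ = handle.join();
        }
    }
}
```

If the magicsock owned such a set and discovery services spawned into it, shutdown would not need any extra protocol.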
Leaving this for a followup and created an issue here: #2066
Force-pushed from dd18f90 to 489ee46
…sock actor (#2058)

## Description

While working on #2056 I spotted that we use the actor inbox with return channels for information that is readily available already on the shared inner magicsock. This removes the unneeded complexity and thus simplifies `get_mapping_addr`, `endpoint_info` and `endpoint_infos` to return the information non-async and infallible. Yay!

## Change checklist

- [x] Self-review.
Force-pushed from a0604fc to fb60d74
Force-pushed from fb60d74 to 2db91c8
I added more docs and tests, and implemented a
This is ready to review now. I will rebase the DNS discovery on top of this also, I think.
Force-pushed from d6e3ac7 to adbc41f
pub addr_info: AddrInfo,
}

/// A discovery service that combines multiple discovery sources.
Can you add a comment saying whether it resolves in parallel or serially?
Maybe even call it ConcurrentDiscovery?
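The name fits because each source resolves independently and the results are merged as they arrive, rather than querying one source after the other. A std-only sketch of that merge (hypothetical `resolve_concurrent` helper with sources modeled as closures, not iroh-net's actual stream-based API):

```rust
use std::sync::mpsc;
use std::thread;

/// Run every source on its own thread and merge results as they
/// arrive; a slow source (DNS, HTTP, DHT) never delays the others.
pub fn resolve_concurrent(sources: Vec<Box<dyn Fn() -> String + Send>>) -> Vec<String> {
    let (tx, rx) = mpsc::channel();
    let mut handles = Vec::new();
    for source in sources {
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            let _ = tx.send(source());
        }));
    }
    drop(tx); // receiver ends once every source has reported
    let mut results: Vec<String> = rx.into_iter().collect();
    for handle in handles {
        let _ = handle.join();
    }
    results.sort(); // arrival order is nondeterministic
    results
}
```

In the async version the same merging is what a combined stream of `DiscoveryItem`s would do.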
iroh-net/src/discovery.rs
Outdated
/// Start a discovery task.
pub fn start(ep: MagicEndpoint, node_id: NodeId) -> Result<Self> {
    if ep.discovery().is_none() {
        bail!("No discovery services configured");
Could use `ensure!` here.
Some nits, but overall nice improvements.
## Description

- Change the `resolve` method of the discovery trait to return a `Stream` of `DiscoveryItem`s
- Add `CombinedDiscovery` to combine multiple discovery services
- Move discovery from `magicsock` to new module `discovery`

## Notes & open questions

## Change checklist