Dialer v2: modular and composable dialer #122
base: master
Conversation
@@ -12,14 +12,14 @@ import (
 const maxDialDialErrors = 16

 // DialError is the error type returned when dialing.
-type DialError struct {
+type Error struct {
Sidenote: we could borrow the technique that multierror uses to defer allocs until/if an error is recorded.
var dErr *dial.Error
dErr = dErr.recordErr(addr, err) // does new(dial.Error) if the receiver is nil
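A minimal sketch of that deferred-allocation pattern, assuming a trimmed-down Error type (the field names and the recordErr helper are illustrative, not the PR's actual code):

```go
package dial

import ma "github.com/multiformats/go-multiaddr"

// Error collects per-address dial failures (fields trimmed for the sketch).
type Error struct {
	Errs []error
}

func (e *Error) Error() string { return "dial failed" }

// recordErr defers allocation until the first failure is recorded: it is safe to
// call on a nil *Error, in which case it allocates, so the happy path pays nothing.
func (e *Error) recordErr(addr ma.Multiaddr, err error) *Error {
	if e == nil {
		e = new(Error)
	}
	_ = addr // the real type would record the address alongside the error
	e.Errs = append(e.Errs, err)
	return e
}
```

Callers would then write `dErr = dErr.recordErr(addr, err)`, mirroring the `result = multierror.Append(result, err)` idiom.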
Digging into this a bit, I don't see anything I'd do differently; this looks like a big, big improvement over the current system.
Random notes/thoughts:
- Would be nice to test in go-ipfs (sharness / real network / gateways)
- If we had a way to suspend pipelines, plus an additional transport layer, we might be able to implement connection migration quite easily. (Leave this for the future.)
- I like this, it seems to provide a good base for further development
	case StatusBlocked:
		return "Status(Blocked)"
	default:
		return fmt.Sprintf("Status(%d)", s)
I'd use %x (or %b) for readability.
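To illustrate the suggestion: if Status is a set of bit flags (an assumption for this example; the constants below are made up, not the PR's values), %b makes the set bits visible where %d does not:

```go
package main

import "fmt"

type Status uint32

// Made-up flag values purely for the demonstration.
const (
	StatusInflight Status = 1 << iota
	StatusCompleted
	StatusBlocked
)

func main() {
	s := StatusInflight | StatusBlocked
	fmt.Printf("Status(%d)\n", s) // Status(5): hard to tell which flags are set
	fmt.Printf("Status(%b)\n", s) // Status(101): each bit maps to one flag
	fmt.Printf("Status(%x)\n", s) // Status(5): compact, still maps cleanly to bits for larger sets
}
```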
	released <- job
}

func (t *throttler) throttleFd(job *Job) (throttle bool) { |
In the future it might be nice to expose this (and other throttler info) to planners, so that e.g. we won't defer dials to relays and/or defer dialing expensive transports.
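One hypothetical shape for that idea (nothing like this exists in the PR; the interface and method names are invented):

```go
package dial

import peer "github.com/libp2p/go-libp2p-core/peer"

// ThrottlerInfo is a read-only view a planner could consult when ordering or
// deferring jobs, e.g. preferring cheap direct dials while descriptors are scarce.
type ThrottlerInfo interface {
	// FDsInUse reports how many file descriptors current dials are consuming.
	FDsInUse() int
	// InflightToPeer reports how many dials are currently inflight to the given peer.
	InflightToPeer(p peer.ID) int
}
```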
@Stebalien @vyzo we probably want to start moving on this soon.
Calling out @vyzo @bigs specifically for a first round of review. A bunch of things are now depending on the dialler changes (self-dial, prioritising non-relay addresses, etc.), so we should move on this urgently to avoid dropping the ball. @magik6k thanks for the review! 🎉
Agreed. I've so far tested with the DHT crawler, which is the most dial-intensive thing we have, and it looks good memory- and CPU-wise. But I agree we should integrate into IPFS and test.
Could you elaborate on this? Suspending pipelines was on my mind, but for a different use case (e.g. activating transports only if we need to dial peers with those transports). It would allow for a more dynamic environment, but it's not an immediate requirement.
Yeah, optional self-dial can now be added by removing the default
This is going to take a while to review properly.
One first observation: this all seems a little static.
I would like to have a mechanism to supply, at a minimum, the address resolver for individual dial jobs.
Use case in mind: direct connection upgrade.
We will want to dial while already having an existing (relay) connection. For that to happen we will have to bypass the peerstore completely and only dial using user-specified addresses.
Can this be done with this design? Hard to say; if not, something must change.
@bigs has stepped up to move this PR over the finish line :-)
Force-pushed from 8ef0c2f to 84fd679.
@raulk @Stebalien just rebased this and updated the branch to reflect the core refactor, amongst other things. Tests are passing; this should be ready for a review.
//
// Likewise, the background discovery process must stop when the provided context is closed.
type AddressResolver interface {
	Resolve(req *Request) (known []ma.Multiaddr, more <-chan []ma.Multiaddr, err error)
@raulk this could probably just return a buffered channel, with size len(known); it would simplify the API a bit. Thoughts?
Good idea! Mind creating a branch with that change, and sending a WIP PR to this branch to see how the code simplifies?
	// channel to signal it has finished planning.
	//
	// The lifetime of asynchronous processes should obey the request context.
	NewPlan(req *Request, initial []ma.Multiaddr, out chan<- []*Job) (Plan, error)
furthermore, it would remove an argument here
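Putting the two suggestions together, the simplified signatures might look roughly like this; a hypothetical sketch, not the PR's actual API (Request, Job and Plan are stubbed):

```go
package dial

import ma "github.com/multiformats/go-multiaddr"

type (
	Request struct{} // placeholder for the PR's dial request type
	Job     struct{} // placeholder for the PR's dial job type
)

type Plan interface {
	JobComplete(completed *Job)
	ResolutionDone()
}

type AddressResolver interface {
	// Resolve returns a single channel, buffered to hold the addresses already known,
	// so callers no longer distinguish "known" from "discovered later".
	Resolve(req *Request) (addrs <-chan []ma.Multiaddr, err error)
}

type Planner interface {
	// NewPlan drops the initial []ma.Multiaddr argument; the pipeline simply forwards
	// everything it receives from the resolver's channel.
	NewPlan(req *Request, out chan<- []*Job) (Plan, error)
}
```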
	JobComplete(completed *Job)

	// ResolutionDone is called when the AddressResolver has finished.
	ResolutionDone()
Should this become an event on the eventbus? Perhaps a future change.
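For reference, a rough sketch of what that might look like with the go-eventbus package (EvtResolutionDone is a made-up event type; whether the dialer should publish such an event at all is exactly the open question above):

```go
package main

import (
	"fmt"

	eventbus "github.com/libp2p/go-eventbus"
)

// EvtResolutionDone is a hypothetical event published when address resolution
// for a dial request finishes.
type EvtResolutionDone struct{}

func main() {
	bus := eventbus.NewBus()

	// The dialer would hold an emitter and fire it instead of (or alongside)
	// calling Plan.ResolutionDone().
	em, err := bus.Emitter(new(EvtResolutionDone))
	if err != nil {
		panic(err)
	}
	defer em.Close()

	// Interested components subscribe rather than implementing a callback.
	sub, err := bus.Subscribe(new(EvtResolutionDone))
	if err != nil {
		panic(err)
	}
	defer sub.Close()

	_ = em.Emit(EvtResolutionDone{})
	fmt.Printf("received: %T\n", <-sub.Out())
}
```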
	// Run starts the throttling process. It receives planned jobs via the incoming channel, and emits jobs to dial via
	// the released channel. Both channels are provided by the caller, who should abstain from closing either during
	// normal operation to avoid panics. To safely shut down a throttler, first call Close() and await return.
	Run(incoming <-chan *Job, released chan<- *Job)
Since this also implements Closer, maybe it makes sense to have the Throttler own the released channel and, thus, be responsible for closing it. This would require an API change to:
Run(incoming <-chan *Job) (released <-chan *Job)
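A rough sketch of that ownership model, assuming a minimal throttler (the Job type is stubbed and the actual throttling decisions are elided):

```go
package dial

type Job struct{} // placeholder

type throttler struct {
	closed chan struct{}
}

// Run consumes planned jobs and returns the channel on which released jobs are
// delivered. Because the throttler allocates the channel, it can safely close it
// on shutdown, and callers never need to.
func (t *throttler) Run(incoming <-chan *Job) (released <-chan *Job) {
	out := make(chan *Job)
	go func() {
		defer close(out) // only the owner closes the channel
		for {
			select {
			case job, ok := <-incoming:
				if !ok {
					return
				}
				// (throttling decisions elided; jobs pass straight through here)
				select {
				case out <- job:
				case <-t.closed:
					return
				}
			case <-t.closed:
				return
			}
		}
	}()
	return out
}

// Close signals the run loop to stop; the released channel is closed as a result.
func (t *throttler) Close() error {
	close(t.closed)
	return nil
}
```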
Resuming this thread.
IMO, that would be too complex. We really don't want users of the swarm/network providing address resolvers per-dial; that leaks an abstraction. Instead, the address resolver can take contextual decisions based on the dial that it's processing.
If this becomes a common case, we can introduce a higher-level abstraction.
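To make the contextual-resolver idea concrete, here is an illustrative sketch; OverrideAddrs is an invented field, and the PR may expose this differently (or not at all):

```go
package dial

import ma "github.com/multiformats/go-multiaddr"

type Request struct {
	OverrideAddrs []ma.Multiaddr // hypothetical: explicit addresses for this dial only
}

type AddressResolver interface {
	Resolve(req *Request) (known []ma.Multiaddr, more <-chan []ma.Multiaddr, err error)
}

type contextualResolver struct {
	fallback AddressResolver // e.g. the default peerstore-backed resolver
}

func (r *contextualResolver) Resolve(req *Request) ([]ma.Multiaddr, <-chan []ma.Multiaddr, error) {
	if len(req.OverrideAddrs) > 0 {
		// Direct connection upgrade case: dial only what the caller supplied,
		// bypassing the peerstore entirely.
		return req.OverrideAddrs, nil, nil
	}
	return r.fallback.Resolve(req)
}
```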
This PR deprecates #88.
Description
Here we introduce a Pipeline construction for the dialer; the godoc in dial/doc.go gives the full overview.
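For orientation, here are the interfaces quoted throughout this conversation, gathered in one place; the grouping into Planner/Plan is my reading of the snippets and may not match the code exactly:

```go
package dial

import (
	"io"

	ma "github.com/multiformats/go-multiaddr"
)

type (
	Request struct{} // placeholder: carries the peer, context, options, etc.
	Job     struct{} // placeholder: a single address dial attempt
)

// AddressResolver yields the addresses known upfront plus a channel for addresses
// discovered in the background; discovery stops when the request context closes.
type AddressResolver interface {
	Resolve(req *Request) (known []ma.Multiaddr, more <-chan []ma.Multiaddr, err error)
}

// Planner produces a Plan that decides which jobs to emit, and when, on the out channel.
type Planner interface {
	NewPlan(req *Request, initial []ma.Multiaddr, out chan<- []*Job) (Plan, error)
}

// Plan is notified as jobs complete and when address resolution has finished.
type Plan interface {
	JobComplete(completed *Job)
	ResolutionDone()
}

// Throttler gates planned jobs (e.g. on file descriptors) before they are dialed.
type Throttler interface {
	Run(incoming <-chan *Job, released chan<- *Job)
	io.Closer
}
```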
Recommendations for review
Here's the recommended walkthrough to review this PR:
1. dial/doc.go for an overview of the solution.
2. dial/interfaces.go.
3. dial/request.go and dial/job.go.
4. dial/pipeline.go.