-
Notifications
You must be signed in to change notification settings - Fork 37
feat(transport): use parallel limited dialer #195
Conversation
Replaces #193 Fixes #194
src/transport.js
Outdated
} | ||
if (q.canceled) { | ||
log('dial canceled: %s', multiaddr.toString()) | ||
// clean up already done dials |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the conn being closed, if we already had dialed one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hahahaha I forgot to write it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably should use .close
instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/transport.js
Outdated
} else { | ||
log('setting up new queue') | ||
q = queue((multiaddr, cb) => { | ||
const conn = t.dial(multiaddr, (err) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does dial return an error to its callback? It doesn't seem to for the websocket transport
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixing this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I am finding a bug where the onConnect callback is being called even when the websocket 'open' event isn't called. Will file an issue. EDIT - actually scratch that, the problem is the websocket transport isn't handling errors correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See libp2p/js-libp2p-websockets#60
After this PR, websockets will indeed return an error to its callback, so your code will work great 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I got everything working, except that we need to upgrade both libp2p-tcp and libpp2-websocket to actually provide an error if one happened on the callback, otherwise the detection is horrific
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No there wasn't a pull-ws issue after all, just a libp2p-websockets error handling issue
and I've issued a PR to libp2p-websockets, see my comment above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well it is both I am afraid, I actually can't detect the error in libp2p-websockets easily as the connect callback is called even though an error occured :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think onConnect is meant to be called with errors as the first argument. Why is that difficult to check for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened then closed an issue on this as I realised that they sent errors to onConnect in several places. pull-stream/pull-ws#18
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just merged you pr, thank you it works great with that :)
src/transport.js
Outdated
if (q.canceled) { | ||
log('dial canceled: %s', multiaddr.toString()) | ||
// clean up already done dials | ||
pull(pull.empty(), conn) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other end will try to write to this conn as soon as it opens and since we are not reading those values, it will stay forever open.
Use .close/destroy and make sure it gets destroyed (we don't want to exhaust open sockets/fd limits)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, working on it
I've also added a timeout on the actual dial call so we don't hang indefinitely |
Seems that tests in swarm didn't catch all the cases, specially when there is WS + TCP involved. Getting these at libp2p-ipfs-nodejs:
TCP is missing the conn.close https://github.com/libp2p/js-libp2p-tcp/blob/master/src/index.js and also, WebRTC star doesn't have such a method https://github.com/libp2p/js-libp2p-webrtc-star/blob/master/src/index.js Seems that a |
Will finish the update on the new libp2p API for bitswap and js-ipfs and come back to this after that :) |
@diasdavid so should I simply change it to what I had before? |
Not quite the ideal solution, it has worked for us because we always end muxed streams and only close sockets when they get disconnected. Now we actually need to 'destroy' the socket. |
* feat(transport): use parallel limited dialer
* feat(transport): use parallel limited dialer
* feat(transport): use parallel limited dialer
* feat(transport): use parallel limited dialer
* feat(transport): use parallel limited dialer
* feat(transport): use parallel limited dialer
* feat(transport): use parallel limited dialer
* feat(transport): use parallel limited dialer
* feat(transport): use parallel limited dialer
* feat(transport): use parallel limited dialer
cc @jackkleeman @dryajov @diasdavid
I haven't added specific test cases, if you have some please drop them in the comments so I can add them.
Needs
Replaces #193
Fixes #194