This repository has been archived by the owner on Aug 23, 2019. It is now read-only.

feat(transport): use parallel limited dialer #195

Merged
merged 6 commits into from
Mar 27, 2017

Conversation

dignifiedquire
Member

@dignifiedquire dignifiedquire commented Mar 27, 2017

cc @jackkleeman @dryajov @diasdavid

I haven't added specific test cases, if you have some please drop them in the comments so I can add them.

Needs

Replaces #193

Fixes #194
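
For readers following along, here is a minimal sketch of what a "parallel limited dialer" could look like. It is not the code merged in this PR (the snippets quoted below use a `queue` helper instead); `dialLimited`, `dialFn` and the error messages are hypothetical, and `dialFn(addr, cb)` is assumed to be an error-first dial function.

```javascript
'use strict'

// Hypothetical helper (not this PR's implementation): try `addrs` with at
// most `limit` dials in flight and report the first one that succeeds.
// Assumes `limit >= 1` and that `dialFn(addr, cb)` calls `cb(err, conn)`.
function dialLimited (addrs, limit, dialFn, callback) {
  const pending = addrs.slice()
  const errors = []
  let inFlight = 0
  let finished = false

  if (pending.length === 0) {
    return callback(new Error('no addresses to dial'))
  }

  function next () {
    if (finished) return
    if (pending.length === 0) {
      // Nothing left to start; fail only once every dial has reported back.
      if (inFlight === 0) {
        finished = true
        callback(new Error('all dials failed (' + errors.length + ')'))
      }
      return
    }
    const addr = pending.shift()
    inFlight++
    dialFn(addr, (err, conn) => {
      inFlight--
      // A result that arrives after we finished is ignored here; a late
      // successful connection would still have to be released by the caller.
      if (finished) return
      if (err) {
        errors.push(err)
        return next()
      }
      finished = true
      callback(null, conn, addr)
    })
  }

  // Start up to `limit` dials; each failure pulls the next address.
  const workers = Math.min(limit, pending.length)
  for (let i = 0; i < workers; i++) next()
}
```

The limit bounds how many sockets are opened at once, and the `finished` flag is what makes "first success wins" safe against late callbacks.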

src/transport.js Outdated
}
if (q.canceled) {
log('dial canceled: %s', multiaddr.toString())
// clean up already done dials
Member

Where is the conn being closed, if we already had dialed one?

Member Author

hahahaha I forgot to write it

Member Author

fixed

Member Author

probably should use .close instead

src/transport.js Outdated
} else {
log('setting up new queue')
q = queue((multiaddr, cb) => {
const conn = t.dial(multiaddr, (err) => {

does dial return an error to its callback? It doesn't seem to for the websocket transport

Member Author

fixing this

@jackkleeman jackkleeman Mar 27, 2017

I think I am finding a bug where the onConnect callback is being called even when the websocket 'open' event isn't called. Will file an issue. EDIT - actually scratch that, the problem is the websocket transport isn't handling errors correctly.

@jackkleeman jackkleeman Mar 27, 2017

See libp2p/js-libp2p-websockets#60
After this PR, websockets will indeed return an error to its callback, so your code will work great 👍

Member Author

Okay, I got everything working, except that we need to upgrade both libp2p-tcp and libp2p-websockets to actually pass an error to the callback if one happened; otherwise the detection is horrific

@jackkleeman jackkleeman Mar 27, 2017

No there wasn't a pull-ws issue after all, just a libp2p-websockets error handling issue
and I've issued a PR to libp2p-websockets, see my comment above

Member Author

@dignifiedquire dignifiedquire Mar 27, 2017

well, it is both I am afraid. I actually can't detect the error in libp2p-websockets easily, as the connect callback is called even though an error occurred :(

I think onConnect is meant to be called with errors as the first argument. Why is that difficult to check for?

I opened then closed an issue on this as I realised that they sent errors to onConnect in several places. pull-stream/pull-ws#18

Member Author

Just merged your PR, thank you, it works great with that :)
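
To illustrate the contract this thread converged on (a transport's dial callback receives an error as its first argument when the connection fails), here is a rough sketch. It is not the libp2p-websockets or libp2p-tcp code: `dialRaw`, `connect` and the `'connect'`/`'error'` event names are assumptions modelled on a Node.js-style socket.

```javascript
'use strict'

// Wrap a callback so that it fires at most once.
function once (fn) {
  let called = false
  return function () {
    if (called) return
    called = true
    fn.apply(null, arguments)
  }
}

// Hypothetical transport dial: `connect(addr)` is assumed to return an
// EventEmitter-like socket that emits 'connect' on success and 'error'
// on failure.
function dialRaw (connect, addr, callback) {
  const done = once(callback)
  const socket = connect(addr)
  // Report failures error-first instead of signalling a successful connect.
  socket.once('error', (err) => done(err))
  socket.once('connect', () => done(null, socket))
  return socket
}
```

With this shape the caller only has to check `err` in the callback to know whether the dial failed, which is what the queue worker quoted above relies on.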

src/transport.js Outdated
if (q.canceled) {
log('dial canceled: %s', multiaddr.toString())
// clean up already done dials
pull(pull.empty(), conn)
Member

The other end will try to write to this conn as soon as it opens and since we are not reading those values, it will stay forever open.

Use .close/destroy and make sure it gets destroyed (we don't want to exhaust open sockets/fd limits)
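
A rough sketch of that idea, assuming the connection object exposes a `destroy()` method (an assumption; the transports may not provide one). `dialUnlessCanceled`, `dialFn` and `isCanceled` are hypothetical names, not part of this PR.

```javascript
'use strict'

// Hypothetical wrapper: if the dial finishes after the queue was canceled,
// destroy the connection instead of leaving the socket open.
// Assumes `dialFn(addr, cb)` is error-first and `conn.destroy()` exists.
function dialUnlessCanceled (dialFn, addr, isCanceled, callback) {
  dialFn(addr, (err, conn) => {
    if (err) return callback(err)
    if (isCanceled()) {
      // Another dial already won: release the socket/fd right away.
      conn.destroy()
      return callback(new Error('dial canceled: ' + addr))
    }
    callback(null, conn)
  })
}
```

Destroying rather than ending matters here because the remote side may keep writing to a connection that nobody reads from.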

Member Author

yeah, working on it

@dignifiedquire
Member Author

I've also added a timeout on the actual dial call so we don't hang indefinitely
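
A minimal sketch of a dial with a timeout, for illustration only. The stack trace further down suggests the merged code goes through the `async` library's timeout helper; this version uses a plain `setTimeout` instead, and `dialWithDeadline`, `dialFn` and the optional `conn.destroy()` are hypothetical.

```javascript
'use strict'

// Hypothetical helper: fail the dial if `dialFn` has not called back
// within `ms` milliseconds. `dialFn(addr, cb)` is assumed to be error-first.
function dialWithDeadline (dialFn, addr, ms, callback) {
  let done = false

  const timer = setTimeout(() => {
    if (done) return
    done = true
    callback(new Error('dial timed out after ' + ms + 'ms: ' + addr))
  }, ms)

  dialFn(addr, (err, conn) => {
    if (done) {
      // The timeout already fired; release a connection that arrives late
      // (only if the object happens to expose destroy()).
      if (!err && conn && typeof conn.destroy === 'function') conn.destroy()
      return
    }
    done = true
    clearTimeout(timer)
    callback(err || null, conn)
  })
}
```

The `done` flag guarantees the caller's callback runs exactly once, whichever of the timer and the dial finishes first.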

@daviddias daviddias merged commit a15e63c into master Mar 27, 2017
@daviddias daviddias deleted the feat/dial branch March 27, 2017 15:33
@daviddias
Member

daviddias commented Mar 27, 2017

Seems that the tests in swarm didn't catch all the cases, especially when WS + TCP are involved. Getting these at libp2p-ipfs-nodejs:

 1) libp2p-ipfs-nodejs libp2p.dial using PeerInfo nodeA to nodeB:
     Uncaught TypeError: conn.close is not a function
      at dialWithTimeout (node_modules/libp2p-swarm/src/transport.js:70:22)
      at injectedCallback (node_modules/async/timeout.js:62:30)
      at transport.dial (node_modules/libp2p-swarm/src/transport.js:199:7)
      at f (node_modules/once/once.js:25:25)
      at Socket.rawSocket.once (node_modules/libp2p-tcp/src/index.js:40:7)
      at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1074:10)

TCP is missing conn.close (https://github.com/libp2p/js-libp2p-tcp/blob/master/src/index.js), and WebRTC star doesn't have such a method either (https://github.com/libp2p/js-libp2p-webrtc-star/blob/master/src/index.js).

It seems that a .close never really made it into interface-connection (https://github.com/libp2p/interface-connection/blob/master/src/connection.js), and with good reason: it already has a way to 'end' the call by sending the 'FIN' packet, which gets fired once we do something like pull.empty(). What is still needed is only a 'destroy' event that would trickle down to the socket.

@daviddias
Member

Will finish the update on the new libp2p API for bitswap and js-ipfs and come back to this after that :)

@dignifiedquire
Member Author

@diasdavid so should I simply change it to what I had before? pull(pull.empty(), conn)?

@daviddias
Member

Not quite the ideal solution, it has worked for us because we always end muxed streams and only close sockets when they get disconnected. Now we actually need to 'destroy' the socket.
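
One way to avoid the `conn.close is not a function` crash above while transports differ is to feature-detect before calling. The following is only a sketch: `closeConn` is a hypothetical helper, and whether a given connection exposes `close` or `destroy` is exactly the open question in this thread.

```javascript
'use strict'

// Hypothetical helper: use whichever teardown method the connection
// offers and fall back to a caller-supplied strategy otherwise.
// Returns the name of the path taken, which is handy for logging.
function closeConn (conn, fallback) {
  if (conn && typeof conn.close === 'function') {
    conn.close()
    return 'close'
  }
  if (conn && typeof conn.destroy === 'function') {
    conn.destroy()
    return 'destroy'
  }
  // Neither method exists (the TCP and WebRTC star case reported above).
  fallback(conn)
  return 'fallback'
}

// Possible fallback, taken from the earlier revision of this PR:
//   closeConn(conn, (c) => pull(pull.empty(), c))
```

The fallback only ends the stream politely; as noted above, it does not guarantee that the underlying socket is destroyed.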

dryajov pushed a commit that referenced this pull request Mar 29, 2017
* feat(transport): use parallel limited dialer
dryajov pushed a commit that referenced this pull request Apr 2, 2017
* feat(transport): use parallel limited dialer
dryajov pushed a commit that referenced this pull request Apr 2, 2017
* feat(transport): use parallel limited dialer
dryajov pushed a commit that referenced this pull request Apr 2, 2017
* feat(transport): use parallel limited dialer
dryajov pushed a commit that referenced this pull request Apr 3, 2017
* feat(transport): use parallel limited dialer
dryajov pushed a commit that referenced this pull request Apr 7, 2017
* feat(transport): use parallel limited dialer
dryajov pushed a commit that referenced this pull request Apr 7, 2017
* feat(transport): use parallel limited dialer
dryajov pushed a commit that referenced this pull request Apr 7, 2017
* feat(transport): use parallel limited dialer
dryajov pushed a commit that referenced this pull request Apr 7, 2017
* feat(transport): use parallel limited dialer
dryajov pushed a commit that referenced this pull request Apr 7, 2017
* feat(transport): use parallel limited dialer
3 participants