-
Notifications
You must be signed in to change notification settings - Fork 18
Add WebAssembly support and the ability to Dial from browsers #51
Conversation
CI is failing because it uses Go 1.11. My PR is targeting the latest go version as of now which is 1.12. Can we update Travis to use the latest version? If you want to run the browser integration tests in Travis, we might also need to install Chrome. |
conn_js.go
Outdated
closeSignal chan struct{} | ||
} | ||
|
||
func (c *Conn) Read(b []byte) (int, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would appreciate it if someone could take a closer look at this implementation of Read
. I'm not super well-versed in the semantics/behavior libp2p expects.
Is there any reason to merge this here instead of to simply create an entirely new transport? This is basically just a different implementation of the websocket transport and there isn't any shared code. We'd then conditionally import the correct transport from go-libp2p. |
@Stebalien there is a decent amount of shared code. Every file without a build tag at the top is shared. This includes:
In terms of functionality, the shared code:
There is a benefit to keeping this shared code in one repository so that if we ever need to change it in the future, we only need to change it in one place. There is another big advantage to putting the WebAssembly-related code here: it allows us to write integration tests that ensure compatibility. The goal here is to make the API and behavior the same regardless of whether you are compiling to WebAssembly or a binary executable, and integration tests are an important part of making sure we adhere to that goal. It's possible to have integration tests if the code lives in two different repositories, but again that would make it more difficult to manage if we ever made changes in the future. Finally, keeping this code in the same place makes it easier for developers who want to enable/disable specific transports instead of just using the libp2p defaults. In that case, they can just import |
My general concern is that it means anyone wanting to modify the normal go transport needs to make sure they don't accidentally break the wasm transport. However, you're right. It's a lot simpler to explain "here's the websocket transport" instead of "here's the websocket transport for platform A, here's the websocket transport for platform B". Note, in practice:
However, this is still probably the best way to do it. |
Maybe |
That should be fine. The user will just get an error on construction if they ask the browser to listen on a websocket address. |
I understand that concern and I'm happy to help field any issues that might come up 😄This package is critical for my project at work so I can commit some time to help maintain it. I'd also be happy to add some documentation here to help get contributors get oriented since most people will not have prior experience with Wasm.
Yeah that's a good point. There will be some difference in configuration but it's still easier if you don't have to import different packages for different platforms. And you're right it's probably still a good idea to have go-libp2p choose the appropriate defaults. |
I should've shared this before. For anyone unfamiliar with how WebAssembly works in Go, these are good resources to get started: |
@Stebalien I addressed all your feedback. I still need to add more robust error handling. Feel free to take another look now or I'll ping you again after error handling has been improved. |
This allows us to avoid the rather obtuse FileReader API and makes it easier to read incoming data syncronously.
@Stebalien I took a stab at adding proper error handling. It was definitely tricky to work with the event-driven JavaScript API. It also has some other weird quirks like the fact that the The basic idea is:
|
I removed the WIP prefix from the title. @Stebalien I believe I've addressed all your feedback and this PR is ready for final review. Ideally we can also get the tests working in TravisCI. |
@Stebalien just a friendly reminder that I believe I've addressed all your feedback and am waiting for you to take another look. Let me know if I missed anything or if you have any new thoughts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. We should be getting close.
(and sorry for the delay).
c.currDataMut.RUnlock() | ||
if n == 0 { | ||
if c.firstErr != nil { | ||
return 0, c.firstErr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's firstErr
? If there's a connection error (not just a polite close), we should abort immediately without bothering to read the rest of the buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Stebalien Apparently, due to security concerns most browsers always use code 1006: connection unexpectedly closed
whenever a WebSocket connection is closed. I don't think there is any way for us to differentiate between a polite close and an unexpected error in the underlying WebSocket connection.
Given that constraint, what do you want to do here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not seeing anything to suggest that. From the looks of it, 1006 means "something went wrong". We just need to distinguish between that case and "the connection was closed correctly".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. I see https://stackoverflow.com/a/53340067. That's a bug in chrome and there's nothing we can do about it. Where are you seeing the talk about this being a security feature?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the link I shared above:
However, Chrome will rarely report any close code 1006 reasons to the javascript side. This is likely due to client security rules in the WebSocket spec to prevent abusing websocket. (such as using it to scan for open ports on a destination server, or for generating lots of connections for a denial-of-service attack).
Empirically, 1006
is the only error code I've observed from Chrome and I don't see a way to get other information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm.. After reading this again, the quote doesn't exactly back up what I said. It just says that if there is a code 1006
error there is no way to get additional information. Let me do some more research. Maybe the reason I've been seeing 1006
has to do with how the connection is closed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After doing some more testing, I have observed Chrome reporting status code 1005
(also with an empty reason
field). However, this doesn't really help us much. I can't find any way to produce code 1000
which is what we would expect for a polite close. This might be because of the bug in Chrome that you mentioned or because gorilla/websocket
isn't sending the appropriate close frame?
There is that comment you shared from StackOverflow that claims Chrome never reports code 1000
but I did find these tests in Chromium which suggest there are at least some cases where it does report code 1000
.
I'm not sure how to proceed and I have two open questions:
- Can we rely on close codes to differentiate polite closes from unexpected errors?
- How can we rectify the asynchronous JavaScript API (which uses the
onclose
event to communicate most errors) with the largely synchronous interface expected by libp2p (e.g. errors should be returned fromRead
andWrite
when they occur)? The way I do this now is by saving the first error that is reported viaonclose
in thefirstErr
field and then returning it at the next opportunity. It sounds like maybe you aren't satisfied with this approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking into this. Turns out, we aren't sending a nice close message: #54
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like maybe you aren't satisfied with this approach?
I'm fine with an async approach. The network itself is async so there's nothing we can do about that.
My concern is:
- When we receive an error, we should start failing reads/writes ASAP. This doesn't need to be synchronous, I just don't want to bother finishing reading.
- When we receive an EOF (nice close), we should continue reading to the end.
Note: We're mostly getting into pedantic correctness edge-cases here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, actually almost there now.
c.currDataMut.RUnlock() | ||
if n == 0 { | ||
if c.firstErr != nil { | ||
return 0, c.firstErr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking into this. Turns out, we aren't sending a nice close message: #54
c.currDataMut.RUnlock() | ||
if n == 0 { | ||
if c.firstErr != nil { | ||
return 0, c.firstErr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds like maybe you aren't satisfied with this approach?
I'm fine with an async approach. The network itself is async so there's nothing we can do about that.
My concern is:
- When we receive an error, we should start failing reads/writes ASAP. This doesn't need to be synchronous, I just don't want to bother finishing reading.
- When we receive an EOF (nice close), we should continue reading to the end.
Note: We're mostly getting into pedantic correctness edge-cases here.
c.signalClose() | ||
c.mut.Lock() | ||
// Store the error in c.firstErr. It will be returned by Read later on. | ||
c.firstErr = errorEventToError(args[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see we're taking the lock when setting this but not when reading it. We should either:
- Take the lock when reading it.
- Guard it with the channel:
- Set it within
signalClose(err)
. - Set it before closing.
- Only read it after observing that the close-signaling channel is actually closed.
|
||
// checkOpen returns an error if the connection is not open. Otherwise, it | ||
// returns nil. | ||
func (c *Conn) checkOpen() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just wait for the "close" event to happen then check the closeSignal
channel after that? That should save us a round-trip through the webasm runtime.
select {
case <-c.closeSignal:
return true
default:
return false
}
} | ||
|
||
// arrayBufferToBytes converts a JavaScript ArrayBuffer to a slice of bytes. | ||
func arrayBufferToBytes(buffer js.Value) []byte { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So... go 1.13 is out. Want to go ahead and use CopyBytesToGo
and CopyBytesToJs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My personal preference would be to release a version that works with Go 1.12 and then include this optimization. There are some speed bumps with upgrading to Go 1.13 so some projects (0x Mesh included) might not have updated yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM. Note: libp2p is likely to update to 1.13 before too long as we need it for ed25519+TLS.
@Stebalien Thanks for your latest feedback and for investigating that graceful close issue. I'm sorry for the delay on my end. At 0x we've been gearing up for our next big protocol upgrade (version 3) so I've had to temporarily de-prioritize WebAssembly/browser integration. I'll get back to this PR as soon as I can. |
Np. And you never need to apologize for a delay. If it comes to it, I can address my own feedback and merge. Thanks for the heads up and the awesome work! |
This PR is still a work in progress as there are some details to work out (especially around error handling). I should probably also do one more round of cleanup. However, the basics of the transport are working. This PR also includes an integration test which covers basic communication over a stream between a server (running the vanilla Go transport) and a headless Chrome browser (running the WebAssembly/JavaScript version).
After doing some more investigation, I think it makes more sense to introduce direct WebAssembly support here rather than adding it to the
gorilla/websocket
dependency. The main reason is that thegorilla/websocket
API has a relatively large surface area, their own set of configuration options, and their own way of establishing connections that don't translate well to Wasm/JavaScript. The subset of thegorilla/websocket
API that is used in this package is much easier to translate.The
Listen
method is not supported Wasm/JavaScript environments and calling it will always return an error. Browsers cannot listen for incoming WebSocket connections; they can only dial.I'm open to suggestions about how to organize the files, but right now I'm using the file name suffixes
_go
and_js
to denote files that only work in vanilla Go or JavaScript/Wasm environments, respectively. Files without any suffix work in both environments.So for example, conn.go has been split into three files:
While the implementations defer, the exported API in
_go
and_js
files is the same and a lot of code is shared.