Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change_dependencies feedback #91

Closed
hustf opened this issue Feb 18, 2018 · 55 comments
Closed

Change_dependencies feedback #91

hustf opened this issue Feb 18, 2018 · 55 comments

Comments

@hustf
Copy link
Collaborator

hustf commented Feb 18, 2018

Due to the complexity of supporting both HTTP and HttpServer, we liberally merge PRs to this branch for easier experimentation. There's also issue #84, but that doesn't need to be cluttered with 'caps and tabs' details.

'Code review' and TO DO comments go here.

I'm working on a file-by-file list, but this is the most core. On branches HTTP/master and Websockets/Change_dependencies.

julia> begin
           using HTTP
           using WebSockets
           port = 8000
           @async HTTP.listen("127.0.0.1",UInt16(port)) do http
                  if WebSockets.is_upgrade(http.message)
                      WebSockets.upgrade(http) do ws
                          while ws.state == WebSockets.CONNECTED
                              msg = String(read(ws))
                              println("Echoing $msg") # Write the received message to the REPL
                              write(ws,msg)
                          end
                      end
                  end
            end
            sleep(2)
            WebSockets.open("ws://127.0.0.1:$(port)") do ws
                   write(ws, "Foo")
                   write(ws, "Bar")
                   println("Got back ", String(read(ws)) )
                   println("Got back ", String(read(ws)))
                   close(ws)
           end
       end
I- Listening on: 127.0.0.1:8000
I- Accept:  🔗    0↑     0↓    127.0.0.1:8000:8000 ≣16
Echoing Foo
Got back Foo
Echoing Bar
Got back Bar
INFO: Server received OPCODE_CLOSE
Echoing
HTTP.MessagesE- .Error:   💀    1↑     1↓🔒   127.0.0.1:8000:8000 ≣16Response
:|
e = """read: connection reset by peer (ECONNRESET)

HTTP/1.1 101 Switching Protocols
|  Upgrade: websocket
catch_stacktrace() = Connection: Upgrade
StackFrame[]Sec-WebSocket-Accept: aWCLPwDjhD4k2F6In03w96kczvs=


I- """Closed:  💀    1↑     1↓🔒   127.0.0.1:8000:8000 ≣16

@hustf
Copy link
Collaborator Author

hustf commented Feb 18, 2018

Oh, I forgot: @EricForgy.

As part of the review, and also as a kind of template for assisting in modifying the test suite, I added a more verbose / explicit chat example. I'll ask you to kindly review it when we make a PR for merging the branch.

It contains an alternative approach to doing without IDs. It also does away with JSON (which is very slow, this will improve with JSON2 and Julia 0.7). The slowness doesn't matter in examples, but it's nice to provide 'scaleable' examples.

The motivation is otherwise listed at the top of chat_explore.jl

Occasionaly I still manage to get untrappable errors from HttpServer. WebSockets is not part of the stack trace. I will report here if I manage to provoke it again; it would be lovely if we could modify HttpServer to trap the so-called errors.

Next step for me will be to extend chat_explore.jl with a simultaneous HTTP port server. Then provide show(io, STDOUT) for the types defined here.

@EricForgy
Copy link
Contributor

I am super happy @samoconnor found the problem with the for loops and this PR reflects the fix: #92 😊

image

It should be much easier to write tests now and ultimately do some benchmarking / comparisons with HTTP.WebSockets. I'm super optimistic about the outcome of all this 👍

@hustf
Copy link
Collaborator Author

hustf commented Feb 20, 2018

Not a comment tied directly to any particular lines of code now. Just a way to structure my own thoughts here.

So my current understanding is this: We're always working on a single thread of execution. In normal code, we don't expect our tasks to be interrupted except when we sleep() or yield(). But when we're reacting to events like sockets opening or closing, the new tasks take immediate control. When e.g. a browser closes all of its tabs or we're restarting a server with several open tabs, all the tasks are just barely able to start before they're interrupted by the next close or open event. Browsers open lots of sockets very quickly, even when we're just typing in urls. And the same kind of thing happens while errors are being thrown. The call stack output is ready quite a while after the initial error event.

So assigning place for a socket in a collection, and THEN putting in the reference to it is bound to cause trouble. That's one reason why httpserver sometimes just fails while opening sockets.

Not a computer guy, but this is what I think is good practice just now: If we can't update our mutable reference collections in just one line, we shouldn't do it at all.

In other words,

push!(COLLECTION, client) 

is fine. Some other solutions will cause random pain (glaring at HttpServer).

println(STDOUT, 1, 2) 

...cause random pain. We'll often find '2' way below in the logs. Hence, write to a temporary buffer, then output the buffer.

@hustf
Copy link
Collaborator Author

hustf commented Feb 20, 2018

Ref. WebSockets/HTTP.jl:33-37 and 58-61.

        try
            f(ws)
        finally
            close(ws)
        end

We are not giving the user the opportunity to write his own gatekeeper / selector functions, for example based on header "Sec-WebSocket-Protocol" (edit: Wrong. The user could define f in the context of the outer call, defining a 'nested function' in Julia terminology. In the outer call, http and binary is defined).

A neater and more backwards-compatible style could be achieved by also accepting two- or three-argument f. For example:

f(ws)
f(ws, http)
f(ws, http, binary)
f(ws, subprotocol)

But how to find out which arguments f take? I don't know any way to dispatch elegantly on the number of arguments that argument 'f' take. So would you instead approve of a pull request based on something like this? Or something of the sort?

  upgrade_ws(f::Function, http::HTTP.Stream; binary=false)
  upgrade_ws_http(f::Function, http::HTTP.Stream; binary=false)
  upgrade_ws_http_binary(f::Function, http::HTTP.Stream; binary=false)

@samoconnor
Copy link

new tasks take immediate control [...] tasks are just barely able to start before they're interrupted ...

I don't think its quite right to thing about tasks being interrupted or other tasks taking control. Tasks are not interrupted or preempted by other tasks. The only way an inactive task ever gets to run is if an active task calls wait(). Inside wait() the runtime runs scheduled tasks in a loop and/or calls process_events(). Inside process_events() libuv checks all the currently active files/sockets/pipes for activity and runs IO callbacks.

So, if you have a task that never calls wait() no other task will run.
The places that wait() is called include: sleep, yield and any place that an IO operation has to wait for the network, disk access etc.

The key thing to realise is that the task "scheduler" is not an all-powerful omnipresent entity that lives in a parallel dimension (a thread scheduler is like that). The task "scheduler" is just an ordinary Julia function that has to be explicitly called by another Julia function.

The "scheduler" is the wait() function: https://github.com/JuliaLang/julia/blob/master/base/event.jl#L255

function wait()
    while true
        if isempty(Workqueue)
            c = process_events(true)
            if c == 0 && eventloop() != C_NULL && isempty(Workqueue)
                # if there are no active handles and no runnable tasks, just
                # wait for signals.
                pause()
            end
        else
            reftask = poptask()
            if reftask !== nothing
                result = try_yieldto(ensure_rescheduled, reftask)
                process_events(false)
                # return when we come out of the queue
                return result
            end
        end
    end
    # unreachable
end

@samoconnor
Copy link

... gatekeeper / selector functions, for example based on header "Sec-WebSocket-Protocol"

You could add a request::HTTP.Request field to struct Websocket and then define:

header(ws::WebSocket, k, d="") = header(ws.request, k, d)
hasheader(ws::WebSocket, k) = hasheader(ws.request, k)

That would give the f(ws) code easy access to the request headers.

@samoconnor
Copy link

Re dispatching based on different function signatures in a do function:
HTTP.Servers.handle_stream has an example of this using applicable(f, ...):

function handle_stream(f::Function, http::Stream)

    try
        if applicable(f, http)
            f(http)
        else
            handle_request(f, http)
        end
    catch e
        ...
    end

    closeread(http)
    closewrite(http)
    return
end

@EricForgy
Copy link
Contributor

A neater and more backwards-compatible style could be achieved by also accepting two- or three-argument f. For example:

f(ws)
f(ws, http)
f(ws, http, binary)
f(ws, subprotocol)

I have some ideas to make this clean for users. Will try some things out and report back 😊

@EricForgy
Copy link
Contributor

I think we are close to having WebSockets.jl in shape to work as a server with both HTTP and HttpServer and as a client with HTTP. Once we have a more complete / robust set of tests, then I would suggest comparing / contrasting WebSockets.WebSocket and HTTP.WebSockets.WebSocket more deeply and consider revising the WebSocket type accordingly. For example, the String bytes issue together with the expected performance enhancements, I am liking @samoconnor's txpayload and rxpayload. HTTP.WebSockets was immune to the for loop String bytes issue. On the offer hand, I like the three states ( CONNECTED, CLOSING and CLOSED) of WebSockets.WebSocket.

My wish is that a single WebSockets module will emerge from this that we're all happy with 🙏

@hustf
Copy link
Collaborator Author

hustf commented Feb 21, 2018

Sam, thank you for taking the time to correct my bad! It's so easy to construct wrong explanations, and hard to get out of it without help.

The example below shows you're right and also shows me how I can debug some other code. It also demonstrates some seeming randomness in Julia's behaviour that I saw in the initial browser tests. This has to do with compiling, and starts to produce consistent output on the third run of the begin-end block. The first run may not print anything at all.

const CLOSEAFTER = Base.Dates.Microsecond(1)
function gg()
           print("|", "|")
           print("|", "|")
end
function ff()
           print("-", "-")
           print("-", "-")
end
begin
         t0 = now()
        while now() - t0 < CLOSEAFTER
            @async ff()
            @async gg()
        end
end

So for consistent message reception, perhaps good practice is simply sending a couple of echo messages at each application start. And absolutely avoid @schedule and @async in loops. Increase CLOSEAFTER if you dare.

@hustf
Copy link
Collaborator Author

hustf commented Feb 21, 2018

Eric, I'm sure you are right. With a small exception perhaps. Some of the symbols used for logging status in HTTP are not displayable in Windows. They work fine in cygwin, but are not covered by the Dejavu Sans Mono, which is the best we have in Atom, REPL or VSCode editors.

You are wisely not choosing now at once between the two suggested approaches: a) Extend Websocket, b) Extend the function (s).

I implemented the function approach with Sam's suggestion by overwriting the update function. chat_explore.jl

The example is now serving chat_explore.html on BOTH dependencies. HTTP fans can open localhost 8080, and the old guys can open 8000.

Currently, I'm trying to share the websocket handler and get this message, which I can't interpret today.

julia> include("chat_explore.jl")
INFO: Loading HTTP methods...
INFO: Loading HttpServer methods...
Chat server listening on 8000
INFO: Start HTTP server on port 8080
Listening on 0.0.0.0:8000...

I- Listening on: 127.0.0.1:8080
julia> Received: userName:fe
Tell everybody about fe
Received: fff
I- Accept:  🔗    0↑     0↓    127.0.0.1:8080:8080 ≣16
I- Accept:  🔗    0↑     0↓    127.0.0.1:8080:8080 ≣16
I- Accept:  🔗    0↑     0↓    127.0.0.1:8080:8080 ≣16
I- Accept:  🔗    0↑     0↓    127.0.0.1:8080:8080 ≣16
I- Closed:  💀    1↑     1↓🔒   127.0.0.1:8080:8080 ≣16
I- Closed:  💀    1↑     1↓🔒   127.0.0.1:8080:8080 ≣16
I- Accept:  🔗    0↑     0↓    127.0.0.1:8080:8080 ≣16
E- MethodError(gatekeeper_newtype, (WebSockets.WebSocket{TCPSocket}(TCPSocket(Base.Libc.WindowsRawSo                              cket(Ptr{Void} @0xffffffffffffffff) closed, 0 bytes waiting), true, CLOSED::WebSockets.ReadyState =                               3),), 0x0000000000005583)
|  catch_stacktrace() = StackFrame[#upgrade#34(::Bool, ::Function, ::#gatekeeper_newtype, ::HTTP.Str                              eams.Stream{HTTP.Messages.Request,HTTP.ConnectionPool.Transaction{TCPSocket}}) at HTTP.jl:59, server                              _def_newtype at chat_explore.jl:271 [inlined], handle_stream(::#server_def_newtype, ::HTTP.Streams.S                              tream{HTTP.Messages.Request,HTTP.ConnectionPool.Transaction{TCPSocket}}) at Servers.jl:446, (::HTTP.                              Servers.##46#47{#server_def_newtype,HTTP.ConnectionPool.Transaction{TCPSocket},HTTP.Streams.Stream{H                              TTP.Messages.Request,HTTP.ConnectionPool.Transaction{TCPSocket}}})() at task.jl:335]
I- Closed:  💀    1↑     1↓🔒   127.0.0.1:8080:8080 ≣16
I- Closed:  💀    0↑     0↓🔒   127.0.0.1:8080:8080 ≣16 inactive 70.0s
I- Closed:  💀    0↑     0↓🔒   127.0.0.1:8080:8080 ≣16 inactive 70.0s

@samoconnor
Copy link

Some of the symbols used for logging status in HTTP are not displayable in Windows.

Menlo is a good programming font: https://github.com/hbin/top-programming-fonts

@samoconnor
Copy link

In your ff() gg() example you are doing console IO in the async code (println), this will lead to unpredictable task switching.

@hustf
Copy link
Collaborator Author

hustf commented Feb 21, 2018

Sam, thanks for the font tip.
The unpredictable task switching doesnt seem to happen in this case because both functions take exactly the same time. But my initial logs were full of it.

I like the log style where you use symbols like skulls and chains. That comment about the font wasnt a good one. If it works out, lets put the font tip in readme.md and define similar show methods.

@hustf
Copy link
Collaborator Author

hustf commented Feb 22, 2018

Installed the font, Sam. It's nice but those symbols still don't display in other than the mintty cygwin terminal. Example below.

Also updated the chat example. It's closing after one message, will fix. But it is now working for exploring types so we can find a good common interface.

Old-style websockets pass a HttpCommon.request to the gatekeeper function. The useful part is its request.headers field, which is a Dict{String, String} currently.

For HTTP origin websockets, the http.message.headers field ought to be sufficient info. The current example in chat_explore.jl doesn't make a Dict out of it.

The gatekeeper function can't do the subprotocol upgrade. This would be an easy addition to add to WebSockets/HTTP.jl, I think.

Main> I- Accept:  �    0↑     0↓    127.0.0.1:8080:8080 ≣16
I- Accept:  �    0↑     0↓    127.0.0.1:8080:8080 ≣16
I- Accept:  �    0↑     0↓    127.0.0.1:8080:8080 ≣16
I- Accept:  �    0↑     0↓    127.0.0.1:8080:8080 ≣16
I- Closed:  �    1↑     1↓�   127.0.0.1:8080:8080 ≣16
I- Closed:  �    1↑     1↓�   127.0.0.1:8080:8080 ≣16
Received: userName:HTTPfan@8080
Tell everybody about HTTPfan@8080
I- Closed:  �    0↑     0↓�   127.0.0.1:8080:8080 ≣16 inactive 26.2s
Received: userName:OldBoy@8000
Tell everybody about OldBoy@8000
Received: I like Mike
Closing connection with OldBoy@8000
Received: Shouldn't have said that, OldBoy!
Closing connection with HTTPfan@8080
I- Closed:  �    1↑     1↓�   127.0.0.1:8080:8080 ≣16 inactive 184.5s

@EricForgy
Copy link
Contributor

True. Minor issue, but the cool symbols do no render in Windows 😊

@hustf
Copy link
Collaborator Author

hustf commented Feb 25, 2018

Just an update / status.
I'm working on an efficient logging system, meaning that there should be practically no delay when logging to Base.DevNullStream. It's implemented in a private server package so far. Nothing to commit just yet.

  • chat_explore.html is modernized and simplified to my eyes. This will serve as a part template for rewriting socket tests. It is tempting to extend it with drag-n-drop for various file types, demonstrating binary sockets.
  • chat_explore.jl distinguishes between various error types and has an added close_from_outside() function. The aim with this file has moved towards a template for 'real' websocket server. I.e. one that runs all day without hazzle.

Base.show methods are still not added.
Need to check/ implement subprotocols for HTTP source websockets.

How much time this takes to finish depends on the local weather here. Eric, do you expect to make more contributions here?

@EricForgy
Copy link
Contributor

EricForgy commented Feb 26, 2018

Hi @hustf,

As far as I'm concerned, I'm happy with the current state of things. I expect some private packages might break, but active packages in METADATA seem unaffected.

I like the fact WebSockets now has almost no real dependencies so others can extend it easily.

I still think we need a shootout between WebSockets.jl and HTTP.WebSockets.jl which will likely lead to more changes to WebSockets.jl as the two converge, e.g. we may want to add payload vectors to eliminate the need for allocation, but that is second order and can probably done later.

Personally, I don't expect to make any significant changes any time soon 😊

@samoconnor
Copy link

Hi @hustf, @EricForgy,

I have some questions that maybe you can answer given your experience using WebSockets with real live browser and js framework implementations...

  • What's up with fragmentation and frames? The RFC says "Unless specified otherwise by an extension, frames have no semantic meaning". Do you have to think about frames in real world WebSockets apps? Are there any commonly used "extensions" that specify a meaning for frames? The read method goes to some effort (with efficiency cost) to collect all the fragments of a frame before returning. Is this an essential feature in real-world use?

function Base.read(ws::WebSocket)
if !isopen(ws)
error("Attempt to read from closed WebSocket")
end
frame = read_frame(ws)
# Handle control (non-data) messages.
if is_control_frame(frame)
# Don't return control frames; they're not interesting to users.
handle_control_frame(ws,frame)
return frame.data
end
# Handle data message that uses multiple fragments.
if !frame.is_last
return vcat(frame.data, read(ws))
end
return frame.data
end

  • Is a WebSocket more like a Channel (a message stream) or an IO stream (a byte stream)? i.e. in real world apps, are you receiving odd chunks of bytes into a buffer, then checking for delimiters, complete JSONs, etc? Or do you always receive a nice complete chunk containing exactly one JSON message?

  • Do real world applications care about text-mode vs binary mode? or does everyone tend to use the same mode?

@hustf
Copy link
Collaborator Author

hustf commented Feb 26, 2018

Looking fwd. to reply. In the meantime: "we may want to add payload vectors to eliminate the need for allocation".

Logging buffered amount? Access to txpayload rxpayload? Please elaborate. Also, what have you found in the thinking box about which info the user might need to modulate streaming speed?

@hustf
Copy link
Collaborator Author

hustf commented Feb 26, 2018

Sam,
at parties, people always talk about TCP and changing packet sizes. So it's tempting to pretend I know something. But I don't. Websockets are a layer on top of TCP, so the minimum buffer is a packet size. Frame sizes could be made to fit in a packet if you need ultimate responsiveness. Last time I listened, there was a 64 KiB definite upper limit to packet size.

We'd like to support somebody using Julia to get rich on second-to-second trading, or somebody making RC cars that send intermittent bursts of sensor data. Data must be buffered on a frame-to-frame basis. The sensors / PLCs wouldn't know the length of the message when they start sending.

Chrome obscures the difference between message and frame (check F12 / network / ws - there's no distinction). The javascript user don't get a chance to decide. Perhaps the browser adapts based on the current network? So extended logs from our local browser tests wouldn't give a definite answer.

Regarding binary, there's an example in the tests. Javascript doesn't get access to the binary array. It can make views into it, but it's job is really to just pass a reference to the lower-level part of the browser that actually does the job of displaying the picture (in Internet Explorer, we do process it). It's hard for us now to imagine how this, and the evolving subprotocol standards, can be applied. The libraries/ subprotocols that I read up on tries to hide this kind of detail from the front-end-developer ('install our library on the server, try with our server first!'). Is there any benefit for us in removing binary support?

@samoconnor
Copy link

at parties, people always talk about TCP

You should try to get invited to better parties 😂

Is there any benefit to removing binary support?

Not that I know of, it's more a matter of learning what are the most common real-world use cases in order to be able to reason about performance tradeoffs.

It seems like there is a choice with messages and frames.

  1. make the WebSockets.jl API message-based:
    write(::WebSocket, message) and take!(::WebSocket) -> message, or...
  2. make WebSockets.jl API frame based:
    write(::WebSocket, frame, last=true) and readavailable(::WebSocket) -> frame

If we take option 2. and basically just ignore the concept of messages, there are some performance benefits (less state to deal with, don't have to buffer-up fragments before returning result).

This is an issue that only effects the receiving of data, and given that the main use case is Julia server receiving data from chrome/safari/ie, the questions become:

  • Is it more convenient to have API 1. or API 2.?
  • Are there degenerate performance cases for API 1. e.g. if the js code does send(Blob) for a huge file it wouldn't be good to buffer the whole message before returning.

Perhaps the answer is to provide take!(::WebSocket) -> message and readavailable(::WebSocket) -> frame and let the person writing the Julia server decide what suits them best.

@EricForgy
Copy link
Contributor

EricForgy commented Feb 27, 2018

Just two cents before heading into a day-long meeting...

WebSockets are for "web" (obviously 😃 ), so for WebSockets clients, I would look at the JavaScript API and for WebSockets servers, I would look at the Node API (which I would think is essentially the same?). We should not be too far off from the JS implementations I would think.

Edit:

Perhaps the answer is to provide take!(::WebSocket) -> message and readavailable(::WebSocket) -> frame and let the person writing the Julia server decide what suits them best.

This sounds ideal to me too.

@hustf
Copy link
Collaborator Author

hustf commented Mar 1, 2018

WebSockets were intended for "web", but WebSockets for Julia just could be the optimal choice for controlling hardware or calculation-heavy back-end. It's hard to beat node.js for the general web user.

Zmq defines some non-exported functions that you start diving into when meeting buffering problems. It would be nice to support API 2, but since it's clearly for highly optimized applications I suggest we only export message-based methods.

@hustf
Copy link
Collaborator Author

hustf commented Mar 1, 2018

Just in case you guys see a solution: Would it be worth to optimize this little function for the logger? It's used in conjuction with a truncating function to show a little of the content in a Response. I don't really know when RegExp's are faster.

function _compact_format(s::String)
   				replace(s, "\r\n", "\n") |>
   				s-> replace(s, "\n\n", "\n") |>
   				s -> replace(s, "\t\t", "\t") |>
   				s -> replace(s, "\n\t", "\n") |>
   				s -> replace(s, "\n", "\n\t") 
end

@hustf
Copy link
Collaborator Author

hustf commented Mar 18, 2018

Just to let you know things are coming along. I'm working on adapting the specialized logger, which is quite time consuming. With the current pace, it may take yet another month before merging this branch, but again, dependent on local weather.

@EricForgy
Copy link
Contributor

I am not concerned at all, but I am confused why a logger would hold up this PR. Isn't it two completely separate things? 😊

@hustf
Copy link
Collaborator Author

hustf commented Mar 18, 2018

I admit to being in doubt here too, but I think we need to be thorough this time. A logger is already implemented in the test suite, but it uses globals for redirection of output (which allocates some memory at each call and affects speed for small messages).

Websockets is a very enabling technology so I hope we can reduce the effort for new users, and also make the test suite easier to maintain by including the improved logger with show methods and @requires dependencies into the main code.

We also introduce a new and very enabling feature, but I am loath to go out and proudly recommend dropping zmq without having more examples. Search discourse zmq for recent examples. Should we add links to eg Pages.jl?

@hustf
Copy link
Collaborator Author

hustf commented Mar 31, 2018

Comparing ZMQ to the revamped WebSockets: ZMQ has plenty speed, but on Windows also a lot of latency unless we yield() to it several times per message. It might do better in a threaded test (?). For user interfaces, latency and warm-up is really critical, so this branch looks great so far for tying together GUIs with backends and with little code.

Small bed time story: I imported WebSockets.open, then modified 'open' in order to include optionalProtocol. Around two pages up from the bottom of the stack trace I find:

 [29] (::HTTP.#kw##open)(::Array{Any,1}, ::HTTP.#open, ::Function, ::String, ::String, ::Array{Pair{String,String},1}) at .\<missing>:0
 [30] #open#5(::Bool, ::Bool, ::String, ::Array{Any,1}, ::Function, ::Base.#readstring, ::String) at C:\Users\F\.julia\v0.6\Graphloop\src\instructions.jl:478
 [31] #readdlm_auto#11(::Array{Any,1}, ::Function, ::String, ::Char, ::Type{T} where T, ::Char, ::Bool) at .\datafmt.jl:134

...so reading a text file ends up in trying to open a HTTP connection... Just another story about the wrong path. The 'open' methods are perhaps a little too generic, although they are not normally exported. '_open' might be better style?

On another note, user code could be clearer and shorter if we added more information into the WebSocket type rather than calling multiple methods (like in the current chat_explore.jl upgrade function). If we do, we can more often re-use functions. This information:

  1. optionalProtocol (the term used in the standard - let's stop using 'subprotocol')
  2. binary or text
  3. headers identifying who's calling. Less important.

If this info could be used in dispatching, it would make for elegant code, depending on preferences. But T in WebSocket{T} is already used to distinguish between socket types so I guess we don't dare touch that.

So a couple of more fieldnames to WebSocket is preferrable. What were the disadvantages that led you to take away the .id field? Types in Julia usually have very few fields. Why is that?

@samoconnor
Copy link

... imported WebSockets.open, then modified 'open' ...

Importing HTTP.WebSockets.open into a module that relies on open being Base.open isn't right. HTTP.WebSockets.open and Base.open are two different functions. If you want to define a new method of HTTP.WebSockets.open for some new parameter types, you can define it like this:

module MyModule
    HTTP.WebSockets.open(f::Function, x::Dict) = HTTP.WebSockets.open(f, x["url"])
end

What were the disadvantages that led you to take away the .id field?

It just smelled bad. It suggested unneeded coupling, poor separation of concerns, or leaky abstraction layering.

Wikipedia has some decent summaries of basic software engineering principles:
https://en.wikipedia.org/wiki/Code_smell
https://en.wikipedia.org/wiki/Separation_of_concerns
https://en.wikipedia.org/wiki/Coupling_(computer_programming)
https://en.wikipedia.org/wiki/Information_hiding
https://en.wikipedia.org/wiki/Software_design_pattern
https://en.wikipedia.org/wiki/Composition_over_inheritance
https://en.wikipedia.org/wiki/Abstraction_layer
https://en.wikipedia.org/wiki/Code_refactoring
https://en.wikipedia.org/wiki/Extensibility

@hustf
Copy link
Collaborator Author

hustf commented Apr 1, 2018

Thanks for the reply, Sam. I appreciate those hand-picked links. Having read up on words, I may use some of them against you in future discussions!

You are right about importing HTTP.WebSockets.open into a module which relies on 'open' would be bad. But I didn't. My module relied on 'readdlm' with a file name as an argument. Which is accessible from any module. This type of errors used to occur much more often with Julia a couple of versions ago. People are getting more used to the code style which makes Julia actually work.

I don't know what is the best code style for making Julia actually work. One idea has been to export the functions expected to be used outside. The best documentation has sometimes been what you get when you type 'using Colors;whos(Colors)'.

HTTP follows a different style than that, which is OK. Another style idea is prefixing internal functions with underscore: WebSockets._checkupgrade(..) would be an example.

Thank also for the explanation about Websocket.id. Makes sense, and a bit more so after reading the wikipedia links.

I'll add another reason for keeping the number of fields down. It becomes very difficult to make flexible type constructors when there's more than say four fields in a type. Long argument lists in general are a pain.

@hustf
Copy link
Collaborator Author

hustf commented Apr 1, 2018

I just pushed some changes after testing with Firefox. This is just a preliminary, working version before implementing other changes:

  1. optionalProtocol checking moving into WebSockets. A turned-down socket connection should receive some standardized response. No need for every user to learn that.
  2. Changing the field names. Still open to discussion of course.

@samoconnor
Copy link

[29] (::HTTP.#kw##open)(...)
[30] #open#5(...) at C:\Users\F.julia\v0.6\Graphloop\src\instructions.jl:478
[31] #readdlm_auto#11(...) at .\datafmt.jl:134

If you have defined a method of HTTP.WebSockets.open, or HTTP.open and Base.readdlm has called it instead of Base.open that is a bug in Julia that should be reported here: https://github.com/JuliaLang/julia/issues

@hustf
Copy link
Collaborator Author

hustf commented Apr 2, 2018

..but I dont see this as a bug at all. There is no reason to import HTTP.open normally.

Httpserver and the old websocket avoids this little trap by defining lots of ad hoc types like httphandler and websockethandler. HTTP avoids it by not exporting, not bringing it into the user code scope.

I have also not tried to reproduce. There might be other explanations, although the stack frame seems clear.

If there's an actual crash with base syntax, which is always in scope of course, it is clear where the change should be made. Open is probably not moving out of base ever, although there used to be many more such functions to avoid in previous versions.

@hustf
Copy link
Collaborator Author

hustf commented Apr 3, 2018

Just pushed an update, removing some lines with temporary debug code that remained by accident.
Also, checking "Connection => "upgrade" or "keep-alive, upgrade" in a faster way. The previous commit called 'contains upgrade'.

This temporary code remains in HTTP.ugrade:

    catch err
        warn(STDERR, "WebSockets.HTTP.upgrade: Caught unhandled error while calling argument function f, the handler / gatekeeper:\n\t")
        mt = typeof(f).name.mt
        fnam = splitdir(string(mt.defs.func.file))[2]
        print_with_color(:yellow, STDERR, "f = ", string(f) * " at " * fnam * ":" * string(mt.defs.func.line) * "\nERROR:\t")
        showerror(STDERR, err, backtrace())

I am not sure we should do any try-catch in the upgrade function at all. HTTP would catch the errors anyway. It would probably be better to play along with HTTP's error catching and logging.

@hustf
Copy link
Collaborator Author

hustf commented Apr 8, 2018

r. status provokes an error if r is is a Request and the method is not GET. From Http.jl:90:

function is_upgrade(r::HTTP.Message)
    if (r isa HTTP.Request && r.method == "GET" || r.status == 101) 
   ....

The error is

E- ErrorException("type Request has no field status")

I guess the use case for checking a Response is debugging? If that's interesting I believe this would work:

if (r isa HTTP.Request && r.method == "GET")  || (r isa HTTP.Request && r.status == 101) 

@EricForgy
Copy link
Contributor

This was fixed in HTTP.jl, but maybe is copy / pasted here before the fix?

PS: Flight was delayed. It is 3:24am and I just checked into the hotel with meetings in the morning so this is a drive by comment without looking at the code 😅

@hustf
Copy link
Collaborator Author

hustf commented Apr 9, 2018

I hope the meeting turned out well anyway! I just pushed the correction above.
Also, it's fixed in the same way in HTTP. I must do Pkg.update() more often!

@hustf
Copy link
Collaborator Author

hustf commented Apr 9, 2018

Regarding "binary" websockets.
Ehem. There really is no such thing. To explain why that distinction ever got important: We just recently added optionalProtocol (subprotocol) to this package.

A optional protol could choose to have every 42nd message be text if there was a reason for it, although all frames in a message should be of the same type. Eric, you actually do something like that in the rewritten binary test. You transfer the UUID as text, then in the next message you read binary data.

On the browser side, the 'user', i.e. the javascript guy, needs to poke in at a slightly deeper level to speed up how incoming binary messages are interpreted. So the binaryType is "arraybuffer" or "blob". The very same websocket can happily continue to receive text messages. Ideally, javascript never accesses the contents of the binary but passes references to where they are needed.

But this browser distinction does not apply to clients generated by e.g. Julia or Python. So we can choose to test for 'binary' websockets but this would rather belong in the examples folder.

And the Websocket object or the upgrade function shouldn't convey information about 'binary'. That info is fully defined by the optionalProtocol.

P.S.
The HttpServer handshake isn't working in the pushed version, due to lowercase / uppercase. My bad, will fix.

@hustf
Copy link
Collaborator Author

hustf commented Apr 12, 2018

Just pushed some changes, still a temporary version.

Modified the HTTP upgrade function to be in line with the old upgrade function:

  • Check length of key before calling generate_websocket_key
  • Postponing setting response status to 101 before all checks are made.
  • Check for supported subprotocols
  • Includes subprotocol header in response
  • Added comment to upgrade field
  • No longer responds with a WebSocketError when incoming connections are turned down.
    The error messages trapped by HTTP would not be shown in the browser, so this debug mechanism is
    less useful. Chrome and Firefox may show some more useful error messages in the developer console
    when we reply according to protocol.
  • Removed the 'binary' parameter from the call to the user's handler.

Modified the HttpServer upgrade function:

  • Improved comment

Modifed WebSockets.jl
hasprotocol(s::String ) - hasprotocol(s::AbstractString)

@hustf
Copy link
Collaborator Author

hustf commented Apr 12, 2018

Calling WebSockets.open is still making noises when not done in quite the right way. Here is another example for later checking:

julia> @edit WebSockets.open(myfun, SERVERURL, optionalProtocol = "relay_backend")
ERROR: could not determine location of method definition
Stacktrace:
 [1] functionloc(::Method) at .\reflection.jl:873
 [2] edit(::Function, ::Any) at .\interactiveutil.jl:88

julia> methods(WebSockets.open)
# 1 method for generic function "open":
open(f::Function, url; binary, verbose, optionalProtocol, kw...) in WebSockets at C:\Users\F\.julia\v0.6\WebSockets\src\HTTP.jl:5

julia> WebSockets.open(myfun, SERVERURL, optionalProtocol = "relay_backend")
18:47:58.57  WebSocket{TCPSocket}(client, (✓) CONNECTED, 0)
HTTP.Messages.Response:
"""
HTTP/1.1 101 Switching Protocols
Sec-WebSocket-Protocol: relay_backend
Sec-WebSocket-Accept: KrL8ARZgLQgIPzge8gF1rCwJW+k=
Upgrade: websocket
Connection: Upgrade

"""

@hustf
Copy link
Collaborator Author

hustf commented Apr 12, 2018

Also, some thoughts about making the interface extendable beyond local experimenting on a computer. We don't need to do this now, but it's a good time to pick the best parts from HttpServer and HTTP.
I would appreciate your comments, but in the meantime I'm going to update and add examples with the current interface I think.

It is recommended to do some further checks of the upgrade request beyond what 'upgrade' does now. For local experimenting, this can of course be neglected, so the user shouldn't have to bother about it.

  • The URI field is intended for identifying from where this particular connection is coming, for
    example which test or which html file.
  • "Origin" header: Included by clients in browsers, e.g: => "http://localhost:8000", but not by
    other local programs.
  • "Sec-WebSocket-Extensions" => "permessage-deflate" (for example)

So one way to tell 'upgrade' which connections to accept might be requiring a WebSockethandler object with a dictionary of acceptance criteria, or just fields with liberal starting value. This is more like the stricter interface of HttpServer. Loosely, a WebSockethandler could be defined with:
.handlerfunction
.acceptURI
acceptprotocols
.acceptsource
.acceptorigin
.acceptzip...

For reference, these are typical headers from Chrome:
"Connection" => "keep-alive, Upgrade"
"Sec-WebSocket-Version" => "13"
"http_minor" => "1"
"Keep-Alive" => "1"
"User-Agent" => "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:59.0) Gecko/20100101 Fi…
"Accept-Encoding" => "gzip, deflate"
"Cache-Control" => "no-cache"
"Origin" => "http://localhost:8000"
"Sec-WebSocket-Key" => "R9b6CHWxy9cg3H+1WuCFCA=="
"Sec-WebSocket-Protocol" => "relay_frontend"
"Sec-WebSocket-Extensions" => "permessage-deflate"
"Host" => "localhost:8000"
"Upgrade" => "websocket"
"Pragma" => "no-cache"
"http_major" => "1"
"Accept" => "text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8"
"Accept-Language" => "en-US,en;q=0.5"

@hustf
Copy link
Collaborator Author

hustf commented Apr 16, 2018

I experimented a bit locally with changing the @requires, because they lead to more recompiling. But the @requires is the only way I get Atom to work. It seems some parts of Atom use HTTP and some parts use HttpServer.

Here's a graph layout of the dependencies neighborhood from a couple of months back. This includes testing dependencies. The lighter ones are the dependencies on this package.

In change_dependencies branch:
examples/serve_verbose/svg/ws_neighborhood.svg

@hustf
Copy link
Collaborator Author

hustf commented May 13, 2018

Just pushed a commit with added benchmarks and logging utilities. See commit message.
An example result with plots is included in benchmark/log

The other significant change is adding a non-blocking read with timeout during closing handshakes. This hopefully gets rid of most ECONNRESETS.

There will be no large benefit to merging and tagging this while on Julia 0.6, since it adds seconds of loading time to e.g. Atom. That is less of a worry if a faster alternative is ready for 0.7.

@hustf
Copy link
Collaborator Author

hustf commented May 24, 2018

Just pushed commit 8edb459, which reverts a major change from master.

Control frames, like PONG, now iterates to the next control frame instead of returning.
For example, OPCODE_CLOSE or _PONG would make read(ws) return some binary data. Now, read() does not return at all. An alternative that enables listening for possible PONG is using or modifying the non-exported read_nonblocking(ws). It would also be nice to have something like read_timed. Low priority.

The commit also includes ServerWS, serve(::ServeWS, etc...) and some constructors.
This can be avoided if HTTP.Serve can be extended slightly. It also defines a Websockethandler wrapper type for HTTP that would be better off as HTTP.WebsocketHandler.

@EricForgy
Copy link
Contributor

EricForgy commented May 25, 2018

Cool to see progress 👍 I doubt HTTP.Serve will be extended. In fact, I believe removing it was discussed and is likely in favor of "listen".

@hustf
Copy link
Collaborator Author

hustf commented May 26, 2018

Commit bb9af69 reinstates readframe(ws) checking for mask.

    if mask != ws.server
        error("WebSocket reader cannot handle incoming messages without mask. " *
            "See http://tools.ietf.org/html/rfc6455#section-5.3")
    end

The original test was probably removed to enable WebSockets|client, but since the function now takes a WebSocket and not just any IO, we can do this.

@hustf
Copy link
Collaborator Author

hustf commented May 26, 2018

It would be interesting to know how the Serve discussion turns out. I added this interface to try and get rid of console output from HTTP, but was only able to turn off some messages.

'ServeWS' assumes that user understands http requests and responses, but doesn't know what streams are. This is nice for some. But then it demands that users know about these ad-hoc function wrappers which don't really add any functionality: HttpHandle, WebSocketHandle.

They do add some structure to defining servers, though. This structure would read clearer using keyword arguments. Using a specific type for these function wrappers is also beneficial for logging dispatch.

I especially don't like the term WebsocketHandler. A handler is a function that reacts to an outside event and returns something useful. A WebSocketHandler doesn't do that. I would say it's an event handling subroutine that does various stuff with the argument. But then, I'm not a computer guy.

If you can think of alternative names or interfaces, I would appreciate the input.

@hustf
Copy link
Collaborator Author

hustf commented May 26, 2018

Commit 15adbdf introduces status codes for closing handshakes, server and client side.

@hustf
Copy link
Collaborator Author

hustf commented Jun 2, 2018

Commit ac86833 brings test coverage to ~93%, before pulling in browsertests.
There is some code duplication regarding launching of browsers in /benchmark and in /test. Julia 0.7 will reduce the need for operating system dependant code there.

Julia 0.7 alpha is out, so it's time to merge this branch very soon.

@EricForgy
Copy link
Contributor

I thought it was merged already :)

@hustf
Copy link
Collaborator Author

hustf commented Jun 3, 2018

One would, looking at the dates. Blame the weather!
Commit b510f88 improves exception handling, provokes the exceptions and checks the new and nice reasons for closing. There's just one instance where the provocations lead to a 'hundreds of lines'-output. We're mostly containing it now.

I'll have a look at the documents before merging. We're at 100% compliance with the standard I hope, except for checking that no-one used the reserved bits.

@hustf
Copy link
Collaborator Author

hustf commented Jun 4, 2018

Pretty much a large reshuffle of Readme.md.
I have tried to emphasize the bits which confused me the most, namely how the workflow actually works.
I thought for a long time that releasing references to a websocket was what made the callers release resources. The examples, with infinite while..end loops, led me to that misunderstanding.

@hustf
Copy link
Collaborator Author

hustf commented Jun 6, 2018

Commit ada7f2d reshuffles Readme.md again to make the actual interfaces / starting point clear. Also improved inline docs for WebSockets.serve.

Remaining: Restructure browsertest. Include the 'client' example from readme in examples. Update wiki. But the wiki is edited in the repo web interface, and interface is settled, so I'm merging this to master now.

@hustf
Copy link
Collaborator Author

hustf commented Jun 6, 2018

Closing this long side issue.

@hustf hustf closed this as completed Jun 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants