Stream audio/image input values into and out of the server #2249
@@ -0,0 +1,133 @@
defmodule LivebookWeb.Helpers.Codec do
Beautifully done!
Beautiful. I have added some additional comments and the previous note about tests, then we can ship it! :)
Co-authored-by: José Valim <[email protected]>
ship it
Currently we pass the values for audio/image as LV events to and from the client encoded with base64. For large files (especially audio) base64 encoding takes a notable amount of time. More importantly, passing such large payloads using the LV socket is bad because it can block other events.
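As a rough illustration of the base64 overhead (a standalone sketch, not code from this PR): every 3 raw bytes become 4 encoded bytes, so payloads grow by about a third before they even reach the socket, on top of the encoding time itself.

```elixir
# Standalone illustration (not from this PR): base64 inflates binary
# payloads by 4/3, which compounds the cost of sending large audio
# values over the LV socket.
raw = :crypto.strong_rand_bytes(300_000)
encoded = Base.encode64(raw)

IO.puts(byte_size(raw))     # 300000
IO.puts(byte_size(encoded)) # 400000
```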
This PR introduces out-of-band streaming in both directions. For sending the binary value to the server we use LV uploads, and for getting the value preview to all clients we fetch the data from a separate endpoint.

In case of audio, the endpoint always returns a WAV file, so we can use it as the `<audio>` `src` directly; if the data we keep is PCM, we encode on the fly. The endpoint also supports Range requests, so that seeking works as expected.

In case of images, we always return the whole binary; for png/jpg we just set the `<img>` `src` directly; for pixel data we `fetch` the binary and fill in a canvas.

In line with all these changes, we no longer keep the values in memory as binaries; instead, we store them in files (essentially mirroring the file input). We are making this change all the way to Kino, so `Kino.Input.read(...)`
is going to return a file reference rather than the binary itself.

Random highlights

In case of audio, we decode the file on the client and get some metadata, like `sampling_rate` and `channels`. We need to send that metadata to the server alongside the actual binary. To do that, we build an annotated binary, which is `<<metadata_size, metadata_json, binary>>`. However, when doing the LV upload, we don't want all of this to end up in the uploaded file; instead, we want to consume the metadata and then store only the actual binary in a file. Doing that was very straightforward with a custom upload writer, namely `LivebookWeb.AnnotatedTmpFileWriter`. // EDIT: changed in favour of phoenixframework/phoenix_live_view@d386704

In the controller endpoint, we need to know the input value that we want to return. We could ask the session process, but this (a) increases the load on that single process and (b) does not work with per-user inputs (only the LV knows their values). Both of these are elegantly solved by generating a `Phoenix.Token`
with the LV pid and input id, which we decode in the controller and use to ask the LV for the input value directly (which also ensures the LV is still running).

Demo

Perhaps the coolest thing of all is that, for a large audio file, we now show both a decoding indicator and a progress bar during the actual upload:

input_stream.mp4
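As a footnote on the annotated binary described under the highlights, the layout can be sketched with plain binary construction and pattern matching. This is a standalone sketch: the 32-bit big-endian size prefix and the sample values are assumptions for illustration, not taken from the PR (and the custom writer itself was superseded by the linked LiveView change).

```elixir
# Sketch of the <<metadata_size, metadata_json, binary>> layout.
# The 32-bit size prefix is an assumed encoding for illustration.
metadata = ~s/{"sampling_rate":44100,"channels":2}/
binary = <<0, 1, 2, 3>>

# Building the annotated binary on the client side
annotated = <<byte_size(metadata)::32, metadata::binary, binary::binary>>

# Consuming it on the server: read the size, split the metadata JSON
# from the payload, and keep only the payload for the file
<<size::32, meta::binary-size(size), payload::binary>> = annotated

IO.puts(meta)       # {"sampling_rate":44100,"channels":2}
IO.inspect(payload) # <<0, 1, 2, 3>>
```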
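Similarly, the token scheme from the highlights can be sketched with `Phoenix.Token` directly. This is a minimal standalone sketch, not Livebook's actual code: the salt, `max_age`, and the shape of the signed data are assumptions for illustration.

```elixir
# Pull in Phoenix for Phoenix.Token (version constraint is an assumption)
Mix.install([{:phoenix, "~> 1.7"}])

# A string secret key base works as the signing context; a real app
# would use its endpoint instead.
secret_key_base = :crypto.strong_rand_bytes(64) |> Base.encode64()

# The LV signs its own pid together with the input id...
data = %{pid: self(), input_id: "input1"}
token = Phoenix.Token.sign(secret_key_base, "session input", data)

# ...and the controller verifies the token and can then message the LV
# pid for the input value (verification also bounds the token's age).
{:ok, %{pid: pid, input_id: input_id}} =
  Phoenix.Token.verify(secret_key_base, "session input", token, max_age: 3600)

IO.inspect(input_id) # "input1"
```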