Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should readable streams start right away? #269

Closed
domenic opened this issue Jan 21, 2015 · 9 comments
Closed

Should readable streams start right away? #269

domenic opened this issue Jan 21, 2015 · 9 comments

Comments

@domenic
Copy link
Member

domenic commented Jan 21, 2015

Inspired by w3c/ServiceWorker#606, it occurs to me that this might be a more general problem.

Right now, creating a readable stream is intended to start the flow of data into it. At least one chunk should (eventually) be enqueued, to move the stream from "waiting" to "readable". If this never happens, then the stream never becomes readable, and is useless. In terms of the API, this manifests in how constructing the stream kicks off a call to the underlying source's start(), and then if start() succeeds we immediately call pull().

However, there's another argument, which is that we want readable streams to be able to "represent" resources like HTTP request bodies or files. And in this case, it's often advantageous to have an "inert" request body or file object, which hasn't started doing any I/O (or at least hasn't done the "last mile" of I/O, e.g. IPC to move the chunks into the JS thread's process), until you ask it to.

In w3c/ServiceWorker#606 the current design seems to be guiding us toward the Request object being the "inert" request body, with req.stream() being a method that says "reify this into a JS-visible stream by doing IPC to get the chunks into the appropriate process." In that thread I phrased the two mental models as:

  • "Requests have a body which is a stream, plus convenience methods like .text() for reading the stream." vs.
  • "Requests are opaque objects which don't necessarily have an associated body. You can call .stream() to reify the body as a stream, or .text() to reify the body as text, etc."

You could also imagine a similar situation for files, where e.g. const file = fs.getFile("file.flac") gives you an opaque file object without doing any I/O, allowing e.g. file.moveTo("/tmp") but then also file.stream() to start reifying the data as a stream. (This is not how Node.js's fs API works, for the record.)

I see a few paths forward on this:

  1. Keep the current design. APIs that need the separation discussed here will need to manually stratify themselves into opaque objects with stream-vending methods, as appropriate---e.g. the req.stream() pattern.
  2. Don't have streams start immediately upon creation. Maybe by adding some kind of rs.start() method (eww), or by making rs.ready start the flow (not great either)?
  3. Move to async read. (Perhaps also motivated by Support reading bytes into buffers allocated by user code on platforms where only async read is available #253?) Then the first call to read() is a signal to start the flow of data, and the stream could avoid starting until that happens.

I am leaning toward just keeping the current design, but I wanted to bring it up.

@tyoshino
Copy link
Member

This is kinda related to #146 (comment).

I suggested removal of wait() as it's a little strange that we pull() automatically on data shortage but the initial pull() was invoked only when wait() was called.

And, instead, I planned ... to propose some API to allow the user to tell the Stream to start/stop pull() possibly with some number to indicate how much to pull(). Sorry for huge overdue...

@tyoshino
Copy link
Member

Sorry. I put a wrong issue ID in the last comment. Just fixed.

@tyoshino
Copy link
Member

In the Fetch's Response's case, the underlying source lives and is hidden in the Response?

It doesn't look so weird if the text(), body(), etc. ask the internal underlying source to "start" the stream when they are called. What do you think this reasoning if combined with w3c/ServiceWorker#606 (comment)?

@tyoshino
Copy link
Member

3 comments above are about (2) of Domenic's first post, i.e. introducing "inert" state to the ReadableStream API.

@domenic
Copy link
Member Author

domenic commented Jan 22, 2015

In the Fetch's Response's case, the underlying source lives and is hidden in the Response?

Agreed.

It doesn't look so weird if the text(), body(), etc. ask the internal underlying source to "start" the stream when they are called.

Agreed.

What do you think this reasoning if combined with w3c/ServiceWorker#606 (comment)?

I think it's valid reasoning. But there's another point of view which is equally valid IMO, if you look only at the public API and not how it's implemented. Which is that, the body getter returns the stream which has an underlying source that is already started (but not yet finished starting/pulling). The developer cannot tell the difference, based on the public API only, between:

  1. body getter lazily creates a ReadableStream (which automatically starts the underlying source)
  2. body getter returns an already-created ReadableStream which hasn't finished starting/pulling yet.

So I am OK with either a method (which kind of implies interpretation 1) or a getter (which kind of implies interpretation 2).

@tyoshino
Copy link
Member

The developer cannot tell the difference, based on the public API only, between ...

Agreed. Ah, so, the proposal of mine is just a variant of (1) in your first post.

@yutakahirano
Copy link
Member

Sorry, I don't understand. @tyoshino, do you still plan to add an API you mentioned at #269 (comment)?

@tyoshino
Copy link
Member

@yutakahirano I do. But it's just one of the solutions we have to address the issue (w3c/ServiceWorker#606).

#269 (comment) which is elaborated by Domenic in #269 (comment) is categorized into approaches that add a trigger point to make the response "inert" to "non-inert" on the Response object, not on ReadableStream.

#269 (comment) is categorized into approaches that add a trigger point to do so on the ReadableStream. The category includes (2) in Domenic's #269 (comment), too.

Currently, I'm thinking we can go with the former approach.

@domenic
Copy link
Member Author

domenic commented Mar 12, 2015

I think that, especially now that we have the flexibility provided by the separate getReader() and async read(), this issue can be considered solved. Although ReadableStream instances will start immediately, readable streams in general don't have to, and with some clever underlying source code you can even get a ReadableStream that doesn't really start immediately.

@domenic domenic closed this as completed Mar 12, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

3 participants