Add Streams support #72

Closed
annevk opened this issue Sep 27, 2016 · 91 comments

Comments

@annevk
Member

annevk commented Sep 27, 2016

In 9224c4c#commitcomment-19169480 @jakearchibald @tyoshino @ricea are having a discussion about adding https://streams.spec.whatwg.org/ to the Encoding Standard that I'd prefer us to continue in this issue (and the eventual pull request). That will make it easier to track for other folks participating in this repository. Hope that's okay.

@inexorabletash
Member

I'd love PRs for https://github.com/inexorabletash/text-encoding too. hint hint

@ricea
Collaborator

ricea commented Oct 4, 2016

I think the first issue to resolve is whether the API should be

var stringReadable = byteReadable.pipeThrough(TextDecoder.stream());

or

var stringReadable = byteReadable.pipeThrough(new TextDecoder());

My spec patch currently implements the first one. The second one is clearly more readable, but it makes it unclear what this does:

var decoder = new TextDecoder();
decoder.decode(new Uint8Array([0xE0]), {stream: true});
setTimeout(() => decoder.decode(new Uint8Array([0xA0]), {stream: true}), 1);
var stringReadable = byteReadable.pipeThrough(decoder);

@jakearchibald in 9224c4c#commitcomment-19186596 suggested that decode() should throw if either the readable or writable streams are locked. This gives predictable behaviour: in the above example, the first call to decode() would work, and the byteReadable stream would have to start with a byte in the range 0xA0 to 0xBF. The second call to decode() would throw, because pipeThrough() has locked the writable member of decoder.

I expressed some other concerns about the second option in 9224c4c#commitcomment-19186596.
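For readers less familiar with `{stream: true}`, here is a minimal sketch of the buffering behaviour the example above relies on, using only the existing decode() API. The byte values are illustrative: 0xE0 0xA0 0x80 is the UTF-8 encoding of U+0800, so a lead byte of 0xE0 has to be followed by a byte in the range 0xA0 to 0xBF.

```javascript
// A three-byte UTF-8 sequence (U+0800) split across two decode() calls.
const decoder = new TextDecoder();

// The lead byte alone is incomplete, so the decoder buffers it and emits nothing.
const first = decoder.decode(new Uint8Array([0xE0]), {stream: true});

// The continuation bytes complete the sequence.
const second = decoder.decode(new Uint8Array([0xA0, 0x80]), {stream: true});

// A final call without {stream: true} flushes the decoder; nothing is pending here.
const rest = decoder.decode();

console.log(JSON.stringify(first), second === "\u0800", JSON.stringify(rest));
```

This hidden buffered state is what makes mixing decode() with a piped stream on the same object hard to reason about.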

@annevk
Member Author

annevk commented Oct 4, 2016

The main problem with the first option, as I see it, is that stream() ends up creating an object that there's no way to create through a constructor. And while we have some of that in the platform, the preference from the developer community is that we avoid such magical creation patterns.

As for the concern about sharing logic with Streams with the second option, that should be doable. We can refactor both standards as needed. That leaves the performance issue which I don't feel qualified to comment on.

@jakearchibald

@ricea can you talk a bit more about the performance issue?

If the piped stream is enqueued using JS, won't there be a similar de-opt? What about:

const decoder = new TextDecoder();

jsCreatedStream.pipeTo(decoder.writable, {preventClose: true}).then(() => {
  return fetchStream.pipeThrough(decoder);
});

How would that be different from:

const decoder = new TextDecoder();

decoder.decode(whatever, {stream: true});
fetchStream.pipeThrough(decoder);

@ricea
Collaborator

ricea commented Oct 4, 2016

@jakearchibald Both ReadableStream and WritableStream contain a queue. A natural optimisation for TransformStream is to eliminate one or both of those queues. This works even if both ends of the pipe are JS. If either end of the pipe is an inbuilt stream then more exciting optimisations become possible.

The object returned by TextDecoder.stream() in my draft spec is an instance of TransformStream, so any optimisations that have been implemented for TransformStream automatically apply. I am assuming that TextDecoder will never be a subclass of TransformStream. Any optimisation that can be implemented for TransformStream can also be implemented for TextDecoder, but unless user agents carefully share implementation code with this in mind, optimisations applied to one will not automatically apply to the other.

Concretely speaking, in Chrome the implementation of TextDecoder is in C++ but TransformStream is due to be implemented in JavaScript. My expectation is that general piping optimisations will apply to TextDecoder but optimisations specific to TransformStream are unlikely to be ported to TextDecoder unless there is specific demand.

I don't mean to suggest this is a showstopper. It is just something I am interested in feedback on.

With regards to:

jsCreatedStream.pipeTo(decoder.writable, {preventClose: true}).then(() => {
  return fetchStream.pipeThrough(decoder);
});

I can describe how I think it will work in the optimisation proof-of-concept that @tyoshino has been working on. The decoder stream starts out with a JavaScript source, so it is in "visible side-effects" mode. Even after switching to reading from fetchStream it remains in that mode until the data from jsCreatedStream has been completely drained. At that point it can transparently switch to "full optimisation" mode. Hopefully @tyoshino will correct me if I've got this completely wrong.
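As a rough illustration of the two queues mentioned at the start of this comment, the sketch below fills a standalone ReadableStream and a standalone WritableStream and inspects `desiredSize` on each. The high-water mark of 4 and the chunk values are arbitrary choices for the example; it says nothing about how any engine optimises the queues away.

```javascript
// Each side has its own queue; desiredSize is highWaterMark minus what is queued.
let readableController;
const readable = new ReadableStream({
  start(controller) {
    readableController = controller;
    controller.enqueue("a");
    controller.enqueue("b");
  },
}, {highWaterMark: 4});

const writable = new WritableStream({
  write(chunk) { /* the sink accepts chunks one at a time */ },
}, {highWaterMark: 4});
const writer = writable.getWriter();
writer.write("x");
writer.write("y");
writer.write("z");

// Two chunks are waiting in the readable's queue, three in the writable's.
const readableSpace = readableController.desiredSize; // 4 - 2
const writableSpace = writer.desiredSize;             // 4 - 3
console.log(readableSpace, writableSpace);
```

A transform that sits between two such streams therefore pays for two queues unless the implementation can elide one of them.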

@jakearchibald

jakearchibald commented Oct 4, 2016

@ricea

Thanks for the detailed explanation!

I am assuming that TextDecoder will never be a subclass of TransformStream

It could be. If that isn't possible, there could be a behind-the-scenes TransformStream and its writable and readable are bound on TextDecoder. Turning TextDecoder into a full transform stream would be my preference, but maybe I'm missing the blocker.

Ideally .decode/.encode would be respecced to use the underlying transform stream. The tricky bit (as far as I can tell) is allowing the legacy methods to return synchronously.
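A minimal sketch of the "behind-the-scenes TransformStream" idea, assuming a runtime with TransformStream available. The class name `StreamingTextDecoder` and the helpers `readAll` and `makeByteSource` are hypothetical and exist only for this illustration; this is not the spec patch.

```javascript
// Hypothetical wrapper: exposes the readable/writable of an internal TransformStream.
class StreamingTextDecoder {
  constructor(label = "utf-8", options = {}) {
    const decoder = new TextDecoder(label, options);
    const transform = new TransformStream({
      transform(chunk, controller) {
        // {stream: true} keeps incomplete byte sequences buffered between chunks.
        const text = decoder.decode(chunk, {stream: true});
        if (text !== "") controller.enqueue(text);
      },
      flush(controller) {
        // Emit whatever is still buffered when the writable side closes.
        const text = decoder.decode();
        if (text !== "") controller.enqueue(text);
      },
    });
    this.readable = transform.readable;
    this.writable = transform.writable;
  }
}

// Collects every chunk from a ReadableStream of strings into one string.
async function readAll(readable) {
  const reader = readable.getReader();
  let result = "";
  for (;;) {
    const {value, done} = await reader.read();
    if (done) return result;
    result += value;
  }
}

// Hypothetical source: U+0800 split across two chunks.
function makeByteSource() {
  return new ReadableStream({
    start(controller) {
      controller.enqueue(new Uint8Array([0xE0]));
      controller.enqueue(new Uint8Array([0xA0, 0x80]));
      controller.close();
    },
  });
}
```

Usage would look like `readAll(makeByteSource().pipeThrough(new StreamingTextDecoder()))`. The sketch does not address the synchronous decode() problem: the wrapped decoder is private to the stream.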

@annevk
Member Author

annevk commented Oct 4, 2016

Either a subclass or shared implementation seems doable. Subclass might be workable here, but not sure it is for all objects that need to interoperate with streams in due course.

@domenic
Member

domenic commented Oct 4, 2016

The main problem with the first option, as I see it, is that stream() ends up creating an object that there's no way to create through a constructor. And while we have some of that in the platform, the preference from the developer community is that we avoid such magical creation patterns.

There is a way to create it: it directly uses the TransformStream constructor. There's no magic going on here, just an algorithm for assembling the constructor arguments.

In general I think a clean break between the stream interface and the old interface would be preferable, instead of a confusing mishmash which has both APIs on the same object, and which raises questions about what happens when you mix them. (Answerable questions, but the answers are not necessarily intuitive.)

@ricea
Collaborator

ricea commented Oct 26, 2016

I am leaning slightly towards the "overloaded class" approach because it will look better on a slide. In other words, this syntax:

var stringReadable = byteReadable.pipeThrough(new TextDecoder());

with .decode() throwing an exception if called when the stream is in use.

I think I can spec it relatively simply by delegating to a TransformStream. Whether or not to inherit from TransformStream can then be postponed to a separate discussion.

The spec for TransformStream is not ready yet, so we don't need to make a decision immediately. I expect it will be ready in the first half of November.

Incidentally, I made minor updates to the Stream patch at http://htmlpreview.github.io/?https://github.com/ricea/encoding-streams/blob/master/patch.html; however, it still uses the TextDecoder.stream() syntax.

@annevk
Member Author

annevk commented Nov 16, 2016

Is it correct that this proposal automatically takes buffer reuse into account as stated in #69? If so, we should probably duplicate that issue against this one.

@domenic
Member

domenic commented Nov 16, 2016

Hmm. I think right now we haven't quite integrated readable byte streams into transform streams. See whatwg/streams#601

@ricea
Collaborator

ricea commented Nov 17, 2016

Hmm. I think right now we haven't quite integrated readable byte streams into transform streams. See whatwg/streams#601

I had doubts whether byte support in TransformStream was useful when it didn't exist in WritableStream. But #69 provides a clear use-case.

@ricea
Collaborator

ricea commented Feb 9, 2017

I have updated the proposed patch for the spec to make the TextEncoder & TextDecoder objects behave as stream transformers directly as we discussed here. Internally they delegate to a TransformStream object which provides the glue logic.

http://htmlpreview.github.io/?https://github.com/ricea/encoding-streams/blob/master/patch.html

@yhirano identified an issue during review. Consider the following code:

let dec = new TextDecoder();
let writer = dec.writable.getWriter();
writer.write(incompleteByteSequence);
writer.releaseLock();
await doSomethingWith(dec.readable);
let what = dec.decode(moreBytes);

Here dec.decode() does not throw because the lock on writable was released. I am assuming that the lock on readable has also been released.

The issue is that the contents of what now depend on whether doSomethingWith() read from readable or not. TransformStream does not call transform() until a read happens, in order to respect backpressure[1]. If doSomethingWith() piped readable to a native stream it may be nondeterministic whether anything was read or not.

Our proposed solution is to make TextDecoder lock into a particular mode the first time decode() or the decode and enqueue chunk algorithm is called. Once it is locked into one mode, attempting to use it the other way would be an error.

This doesn't make the above code deterministic, but it means that if incompleteByteSequence had indeed been processed then decode() would throw, which is hopefully better than producing garbage output.

Thoughts?

[1] This is a gross simplification.
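One way to picture the mode-locking proposal is the sketch below. It is only an illustration under stated assumptions: the class name `ModeLockedDecoder`, the private `#mode` field, and the choice of TypeError are all hypothetical, and the real proposal is expressed in spec algorithms rather than a wrapper class.

```javascript
// Hypothetical wrapper that locks into "method" or "stream" mode on first use.
class ModeLockedDecoder {
  #decoder = new TextDecoder();
  #mode = null; // null until first use

  constructor() {
    const transform = new TransformStream({
      transform: (chunk, controller) => {
        this.#enter("stream");
        const text = this.#decoder.decode(chunk, {stream: true});
        if (text !== "") controller.enqueue(text);
      },
      flush: (controller) => {
        this.#enter("stream");
        const text = this.#decoder.decode();
        if (text !== "") controller.enqueue(text);
      },
    });
    this.readable = transform.readable;
    this.writable = transform.writable;
  }

  // Records the first mode used and rejects any later use of the other mode.
  #enter(mode) {
    if (this.#mode === null) this.#mode = mode;
    if (this.#mode !== mode) {
      throw new TypeError(`decoder is already in ${this.#mode} mode`);
    }
  }

  decode(input, options) {
    this.#enter("method");
    return this.#decoder.decode(input, options);
  }
}

// Once a chunk has gone through the stream, decode() throws instead of mixing state.
async function demo() {
  const dec = new ModeLockedDecoder();
  const writer = dec.writable.getWriter();
  const reader = dec.readable.getReader();
  writer.write(new Uint8Array([0x68, 0x69])); // "hi"
  const {value} = await reader.read(); // the read is what triggers transform()
  let error = null;
  try {
    dec.decode(new Uint8Array([0x21]));
  } catch (e) {
    error = e;
  }
  return {value, error};
}
```

As noted above, this does not make the mixed-use code deterministic; it only turns the "chunk was already processed" case into an exception.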

@annevk
Member Author

annevk commented Feb 9, 2017

Can't we lock the moment you invoke getWriter()?

@ricea
Collaborator

ricea commented Feb 9, 2017

Can't we lock the moment you invoke getWriter()?

The Streams Standard doesn't track whether a stream has ever been locked, just whether it is locked at the moment. It also doesn't export any hooks to notify observers when a stream is locked, so the Encoding Standard cannot do the bookkeeping itself.
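A small sketch of what is observable from script, using only the public `locked` getter: once the writer is released, the stream looks the same as one that was never locked.

```javascript
const stream = new WritableStream();

const before = stream.locked;   // false: never locked
const writer = stream.getWriter();
const during = stream.locked;   // true: a writer holds the lock
writer.releaseLock();
const after = stream.locked;    // false again, with no trace of the earlier lock

console.log(before, during, after);
```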

@annevk
Member Author

annevk commented Feb 9, 2017

Hmm, I don't know enough about the particulars, but poking at the internals of streams, e.g., using https://streams.spec.whatwg.org/#is-writable-stream-locked, should be okay. Fetch does so too.

@jakearchibald

We can tell if it's locked, but not if it's unlocked but was previously locked.

@ricea

If doSomethingWith() piped readable to a native stream it may be nondeterministic whether anything was read or not.

Wouldn't dec.decode throw because it couldn't get a lock on the readable?

@yutakahirano
Member

The Fetch API uses a "disturbed" property which is turned on when the first read is made and will never be turned off, although I'm not a big fan of the property. Regarding

It also doesn't export any hooks to notify observers when a stream is locked

, I think having decode_called boolean in the decoder and having the following logic in the decode function would work.

 if (!decode_called) {
   decode_called = true;
   readable.getReader();  // lock readable
   writable.getWriter();  // lock writable
 }

@ricea
Collaborator

ricea commented Feb 9, 2017

@jakearchibald Imagine that doSomethingWith() looks like this:

async function doSomethingWith(readable) {
  let writable = getAwesomeNativeStream();
  await readable.pipeTo(writable, {preventCancel: true});
  return;
}

writable closes after accepting zero or more chunks of text, and readable is unlocked and can be reused.

@jakearchibald

So with dec.decode(moreBytes), moreBytes may be added to an incomplete byte sequence. But isn't the same true if readable is piped somewhere else after doSomethingWith?

@ricea
Collaborator

ricea commented Feb 10, 2017

@jakearchibald Do you mean like

let dec = new TextDecoder();
let writer = dec.writable.getWriter();
writer.write(incompleteByteSequence);
writer.releaseLock();
await doSomethingWith(dec.readable);
dec.readable.pipeTo(someOtherWritable());

?

In this case what is written to someOtherWritable() does indeed depend on what was read by doSomethingWith(). The differences are:

  1. This is consistent with how streams normally work. What data a second pipeTo() gets always depends on how much the previous one read.
  2. Regardless of where the characters end up, they are always the same characters. There is no possibility of getting different output depending on timing.

The way @yhirano explained it to me was that the nondeterministic behaviour is caused by writable having its own buffer in addition to the one inside the TextDecoder object. A clean solution would be to force dec.decode() to use that buffer too, but that would be very complex to implement and require sophisticated optimisation to avoid a major performance regression.

@jakearchibald

A clean solution would be to force dec.decode() to use that buffer too

Ahh, this is where my understanding was lacking. I thought .decode would be explained using the underlying stream, but I guess it's incompatible since it's sync.

In that case, the "mode locking" proposals sound good.

@ricea
Collaborator

ricea commented Feb 28, 2017

I uploaded a prollyfill for this functionality at https://github.com/GoogleChrome/text-encode-transform-prollyfill. "prollyfill" because it implements a proposed change to the standard rather than a standard itself.

I have tested it in Chrome Canary (with the experimental flag) and Safari Technology Preview. Neither of these browsers has TransformStream, but you can get a polyfill here: https://github.com/whatwg/streams/blob/transform-stream-polyfill/reference-implementation/contrib/transform-stream-polyfill.js.

It doesn't yet incorporate the changes to the way concurrent use of both APIs is prevented that we've discussed here. This doesn't matter unless you are planning on holding it wrong.

@domenic
Member

domenic commented Feb 5, 2018

Regarding locking, to me the question is whether we want to say that either API is layered on top of the other, or whether they are two separate APIs that just happen to share the same object.

If we say that the dec.decode() API is layered on top of the dec.readable / dec.writable API, we'd write it something like

TextDecoder.prototype.decode = function (input, options) {
  const writer = this.writable.getWriter();
  const reader = this.readable.getReader();

  writer.write(input);
  const output = reader._readSync(); // private API I guess

  writer.releaseLock();
  reader.releaseLock();

  return output;
};

In this case the code in #72 (comment) seems fine to me, nondeterminism included.

Am I missing something?

(If they are separate APIs that happen to share the same object, then some kind of locking/disturbedness makes sense, I guess.)

@ricea
Collaborator

ricea commented Feb 6, 2018

The non-determinism creates a footgun which in my opinion means we should strongly discourage mixing the APIs.

I'm also skeptical of explaining the behaviour of decode() in terms of a fictional synchronous streams API. I feel it doesn't mesh well with the very concrete way that Encoding and Streams are specified.

Of course, we could really add a synchronous API to Streams, but I think that would be a lose-lose-lose proposition for standard authors, implementers and developers alike.

@domenic
Member

domenic commented Feb 7, 2018

The non-determinism creates a footgun which in my opinion means we should strongly discourage mixing the APIs.

I mean, this seems like any other case where you use both a high-level API and a low-level API together to operate on the same underlying data. The interactions will be subtle and perhaps unpredictable. But you can get nondeterminism when using streams in lots of ways. E.g. the code feels analogous to

let dec = new TransformStream();

let writer = dec.writable.getWriter();
writer.write(chunk1);
writer.releaseLock(); // release so a new writer can be acquired below

await doSomethingWith(dec.readable);

writer = dec.writable.getWriter();
writer.write(chunk2);

let reader = dec.readable.getReader();
let { value: what } = await reader.read();

To be clear, here the contents of what depend on whether doSomethingWith read from readable or not. (If it did, what is chunk2; if it did not, what is chunk1.)

I'm not sure why we're concerned about #72 (comment), but not concerned about the above.

I'm also skeptical of explaining the behaviour of decode() in terms of a fictional synchronous streams API. I feel it doesn't mesh well with the very concrete way that Encoding and Streams are specified.

I guess I was more getting at the idea that they both could conceptually operate on the same underlying stream queues. As opposed to them being completely separate, but occupying the same object. I wasn't proposing actually adding a sync API.

@ricea
Collaborator

ricea commented May 14, 2018

I wrote an explainer for the Streams integration, and uploaded it as a pull request: #143. I can put it somewhere else if here is not appropriate.

ricea added a commit to GoogleChromeLabs/text-encode-transform-polyfill that referenced this issue May 22, 2018
Use constructors named TextDecoderStream and TextEncoderStream for the
streaming versions. This is based on the agreement in discussion on the
Encoding Standard, particularly
whatwg/encoding#72 (comment).

The TextEncoder and TextDecoder prototypes are no longer modified, since
the new constructors are completely distinct objects. Creation of the
underlying TransformStream is no longer done lazily, as there's no
longer any concern of regressing performance for the method interfaces.

Update the tests to use the new constructors. Remove tests that involved
the encode() and decode() methods.

Update the README and design documents for the new change.
@ricea
Collaborator

ricea commented May 30, 2018

Here's a strawman proposal for the new IDL:

dictionary TextDecoderOptions {
  boolean fatal = false;
  boolean ignoreBOM = false;
};

dictionary TextDecodeOptions {
  boolean stream = false;
};

interface mixin HasEncoding {
  readonly attribute DOMString encoding;
};

interface mixin IsDecoder {
  readonly attribute boolean fatal;  
  readonly attribute boolean ignoreBOM;
};

IsDecoder includes HasEncoding;

[Constructor(optional DOMString label = "utf-8", optional TextDecoderOptions options),
 Exposed=(Window,Worker)]
interface TextDecoder {
  USVString decode(optional BufferSource input, optional TextDecodeOptions options);
};

TextDecoder includes IsDecoder;

[Constructor,
 Exposed=(Window,Worker)]
interface TextEncoder {
  [NewObject] Uint8Array encode(optional USVString input = "");
};

TextEncoder includes HasEncoding;

interface mixin GeneralTransformStream {
  readonly attribute ReadableStream readable;
  readonly attribute WritableStream writable;
};

[Constructor(optional DOMString label = "utf-8", optional TextDecoderOptions options),
 Exposed=(Window,Worker)]
interface TextDecoderStream {
};

TextDecoderStream includes IsDecoder;
TextDecoderStream includes GeneralTransformStream;

[Constructor,
 Exposed=(Window,Worker)]
interface TextEncoderStream {
};

TextEncoderStream includes HasEncoding;
TextEncoderStream includes GeneralTransformStream;

Correction welcome.

Questions:

  • Is HasEncoding too trivial to be a mixin?
  • Is it okay that all the members of TextEncoderStream and TextDecoderStream come from mixins?
  • Should readable and writable be annotated [SameObject]?

@annevk
Member Author

annevk commented May 30, 2018

Interface mixins cannot include interface mixins, yet: whatwg/webidl#537.

I'd really much prefer inheritance over GeneralTransformStream. This seems like something that should be a subclass, so we'd get any benefits added to transform streams.

@ricea
Collaborator

ricea commented May 30, 2018

I'd really much prefer inheritance over GeneralTransformStream. This seems like something that should be a subclass, so we'd get any benefits added to transform streams.

Subclassing the "real" TransformStream would be difficult.

From a spec point of view, we have to make WebIDL know about TransformStream. The simplest way I can think of would be to add a [NonEnumerable] extended attribute to WebIDL so that it can express the existing interface.

From an implementation point of view, it's probably easy when TransformStream is implemented using IDL and difficult otherwise. In Chrome, the IDL parser knows nothing about TransformStream, so we'd have to either special-case it in the IDL bindings, reimplement TransformStream in C++, or implement TextDecoderStream without using IDL bindings at all.

From a design point of view, inheritance of implementation has led to tight coupling and poor maintainability in code bases I have worked on and it's not a practice I want to encourage.

@ricea
Collaborator

ricea commented Jun 13, 2018

Sorry for bringing up the inheritance of implementation thing. That's a much wider issue that should be discussed elsewhere.

I've discussed this extensively with @domenic and we see TransformStream as a convenience factory for a { readable, writable } "structural object" to be consumed by pipeThrough(). It provides the nice properties that most people would want, but it's not the only way to get a { readable, writable } pair. https://streams.spec.whatwg.org/#example-both is an example of another way. TextEncoderStream would be another of those other ways.
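To make the "structural object" point concrete, here is a minimal sketch of a { readable, writable } pair that is not a TransformStream instance but still works with pipeThrough(). The function names `makeUppercasePair` and `collect` are hypothetical, and the pair deliberately ignores backpressure to stay short.

```javascript
// Hypothetical pair: a plain object, assembled by hand from two separate streams.
function makeUppercasePair() {
  let controller;
  const readable = new ReadableStream({
    // start() runs synchronously in the constructor, so controller is set before use.
    start(c) { controller = c; },
  });
  const writable = new WritableStream({
    write(chunk) { controller.enqueue(String(chunk).toUpperCase()); },
    close() { controller.close(); },
    abort(reason) { controller.error(reason); },
  });
  return {readable, writable};
}

// Collects every chunk from a ReadableStream of strings into one string.
async function collect(readable) {
  const reader = readable.getReader();
  let result = "";
  for (;;) {
    const {value, done} = await reader.read();
    if (done) return result;
    result += value;
  }
}

async function demo() {
  const source = new ReadableStream({
    start(c) {
      c.enqueue("text ");
      c.enqueue("streams");
      c.close();
    },
  });
  // pipeThrough() only needs an object with readable and writable properties.
  return collect(source.pipeThrough(makeUppercasePair()));
}
```

pipeThrough() never checks for a TransformStream brand here, which is the property TextEncoderStream and TextDecoderStream would rely on.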

We're not looking to extend TransformStream to have extra attributes beyond readable and writable because it would be preferable to find ways to make every { readable, writable } pair more functional instead.

@annevk
Member Author

annevk commented Jun 13, 2018

If we go with duck typing, why does this use string properties rather than symbols (as iterators do)? I thought promises using then() was a special case that was acceptable because it was already widespread, but not a pattern worthy of copying.

@domenic
Member

domenic commented Jun 13, 2018

I don't think duck-typing and symbols are very related. There are many duck-types throughout the platform (and the history of computing) based on string names. For example, everything on the platform that accepts a dictionary argument instead of a class. (Concrete example: DOMPointInit.)

Symbols are mostly used for syntax-triggered protocols like iteration. Although I'm not sure the dividing line is that clearly thought out.

@annevk
Member Author

annevk commented Jun 13, 2018

It seems like this approach would also require pipeThrough() to be taught about all possible objects that are actually known to be streams for optimizations? (Implementation detail, but falls out of this kind of design.)

I guess I'm wishing we had a bit more precedent or a more principled vision for this kind of object. Is it worth asking TC39?

@domenic
Member

domenic commented Jun 13, 2018

Well, I think TC39 is pretty far on the side of duck-typing, more so than we are comfortable with on the web platform. But the way I think of this is just that pipeThrough accepts a dictionary. This is generally what you want, to enable using it with cases like https://streams.spec.whatwg.org/#example-both.

Indeed there would be a lot of optimizations under the hood, but that's kind of built in to the whole piping idea. (And symbols vs. strings wouldn't really change that.)

@annevk
Member Author

annevk commented Jun 13, 2018

Yeah, inheritance would though, and has a lot of precedent on the platform too.

@domenic
Member

domenic commented Jun 13, 2018

I don't think so, at least, not the types of optimizations we're thinking of for Chrome. For example, you can use the specialized knowledge that TextEncoderStream is synchronous to do optimizations far beyond what you could do on any generic TransformStream.

The optimizations you could do on a generic TransformStream are pretty limited, in fact, given that generically speaking, TransformStream instances call out to user JavaScript code all the time. So optimizations will generally be done on specific, recognized, branded pairs of { readable, writable }. That could even include a WebSocketDuplexStream in the style of #example-both if we ever got around to adding that to the platform.

@annevk
Member Author

annevk commented Jun 14, 2018

I guess the theoretical problem that remains is if you have something that wants to accept either a transform stream or something else and something else happens to have similarly named properties. But I suppose in that case you just provide two methods. That would be a bit different from established precedent in web platform APIs though.

@ricea
Collaborator

ricea commented Jun 14, 2018

I guess the theoretical problem that remains is if you have something that wants to accept either a transform stream or something else and something else happens to have similarly named properties.

In practice using the APIs I find I don't pass around TransformStreams. I either supply them directly to pipeThrough(), or I split them into the readable and writable and do separate things with each part. pipeThrough() is specified to check that readable and writable properties exist, but I expect that to be the exception.

If we go with duck typing, why does this use string properties rather than symbols (as iterators do)?

Iterators do use string properties. An iterator is an object that has a next() method. It returns objects with properties value and done.

Iterables use symbol properties. I assume this is because they have to avoid name collisions and problems with enumeration.

I think a transform stream is more like an iterator than an iterable in this respect.
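The iterator/iterable distinction can be sketched in a few lines. The names `makeCountdown` and `countdownFromThree` are made up for the illustration.

```javascript
// An iterator: recognised purely by its string-named next() method,
// which returns objects with string-named value and done properties.
function makeCountdown(from) {
  let n = from;
  return {
    next() {
      return n > 0 ? {value: n--, done: false} : {value: undefined, done: true};
    },
  };
}

// An iterable: recognised by the Symbol.iterator key, which cannot collide
// with any string-named property the object already has.
const countdownFromThree = {
  [Symbol.iterator]() {
    return makeCountdown(3);
  },
};

const values = [...countdownFromThree]; // spread syntax looks up Symbol.iterator
console.log(values);
```

A { readable, writable } pair resembles the first object: it is consumed directly by an API call rather than discovered through syntax.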

It seems like this approach would also require pipeThrough() to be taught about all possible objects that are actually known to be streams for optimizations? (Implementation detail, but falls out of this kind of design.)

Implementations are not prevented from sharing optimisation infrastructure because we don't inherit from TransformStream. Because pipeThrough() just delegates to pipeTo(), the pipe optimisation strategy I expect is to expose an internal API on WritableStream that says "here's the kinds of pipe optimisation I know how to participate in".

On the other hand, inheritance can get in the way of implementations that want to optimise by bypassing TransformStream machinery, as @domenic alluded to.

@domenic
Member

domenic commented Jun 14, 2018

I guess the theoretical problem that remains is if you have something that wants to accept either a transform stream or something else

I think we've learned that overloading is in general not a great idea on the platform. I can't recall any overloads of the form (X or object), certainly.

@annevk
Member Author

annevk commented Jun 15, 2018

If it's actually a TransformStream it wouldn't be (X or object). And I think overloading as in (Blob or DOMString) is totally fine. It's changing the meaning or number of arguments that's a rather dubious practice.

@domenic
Member

domenic commented Jun 15, 2018

I guess I don't really understand the point you're trying to make, then. But regardless, taking a step back, it still seems like TextEncoderStream being its own standalone class that conforms to the { readable, writable } pattern (i.e., the pattern that all lowercase-transform-stream-accepting APIs accept) should be OK?

@annevk
Member Author

annevk commented Jun 15, 2018

The point is that if someone wanted (TransformStream or SomeOtherThing) that wouldn't work, but they could have separate methods instead. And yeah, the current design seems okay to me. Thanks for going through the alternatives.

@ricea
Collaborator

ricea commented Jun 19, 2018

New attempt at the IDL, getting closer to what I'd like to standardise. I've attempted to improve the naming. PTAL.

dictionary TextDecoderOptions {
  boolean fatal = false;
  boolean ignoreBOM = false;
};

dictionary TextDecodeOptions {
  boolean stream = false;
};

interface mixin TextEncoderAttributes {
  readonly attribute DOMString encoding;
};

interface mixin TextDecoderAttributes {
  readonly attribute DOMString encoding;
  readonly attribute boolean fatal;  
  readonly attribute boolean ignoreBOM;
};

[Constructor(optional DOMString label = "utf-8", optional TextDecoderOptions options),
 Exposed=(Window,Worker)]
interface TextDecoder {
  USVString decode(optional BufferSource input, optional TextDecodeOptions options);
};

TextDecoder includes TextDecoderAttributes;

[Constructor,
 Exposed=(Window,Worker)]
interface TextEncoder {
  [NewObject] Uint8Array encode(optional USVString input = "");
};

TextEncoder includes TextEncoderAttributes;

interface mixin GenericTransformStream {
  readonly attribute ReadableStream readable;
  readonly attribute WritableStream writable;
};

[Constructor(optional DOMString label = "utf-8", optional TextDecoderOptions options),
 Exposed=(Window,Worker)]
interface TextDecoderStream {
};

TextDecoderStream includes TextDecoderAttributes;
TextDecoderStream includes GenericTransformStream;

[Constructor,
 Exposed=(Window,Worker)]
interface TextEncoderStream {
};

TextEncoderStream includes GenericTransformStream;

@ricea
Collaborator

ricea commented Jun 19, 2018

I missed

TextEncoderStream includes TextEncoderAttributes;

@domenic
Member

domenic commented Jun 19, 2018

This is not a strong preference on my part, but this seems to be a more aggressive mixin-based factoring than we have in other parts of the platform. Personally I might just collapse all the interface mixins into their interfaces. They don't save much typing, and I don't think we have the usual motivation of wanting their target interfaces to be unaware of changes to the mixin.

@annevk
Member Author

annevk commented Jun 19, 2018

Hmm, it seems okay to me. (And whenever we don't do this, as with postMessage(), we tend to somewhat regret not having a shared abstraction I think.)

@ricea
Collaborator

ricea commented Jun 19, 2018

I'm using mixins based on the theory that having the mixins will make the standard text less repetitive. Of course, I haven't written the text yet. Maybe we can defer judgement until we see what it looks like?

ricea added a commit to ricea/encoding that referenced this issue Jul 18, 2018
Integrate with the streams standard by adding TextEncoderStream and
TextDecoderStream transform streams to the standard. These enable
binary<>text conversions on a ReadableStream using the `pipeThrough()`
method (see https://streams.spec.whatwg.org/#rs-pipe-through).

A TextEncoderStream object can be used to transform a stream of strings
to a stream of bytes in UTF-8 encoding. A TextDecoderStream object can
be used to transform a stream of bytes in the encoding passed to the
constructor to strings.

There is a prollyfill and tests for the new functionality at
https://github.com/GoogleChromeLabs/text-encode-transform-prollyfill.

Closes whatwg#72.
@ricea
Collaborator

ricea commented Jul 18, 2018

I started a new pull request at #149 as the old one had become confusing.

What I'd like feedback on first is whether the mixins are useful or not. I'm leaning towards saying they are not useful.

annevk pushed a commit that referenced this issue Aug 29, 2018
Integrate with the streams standard by adding TextEncoderStream and
TextDecoderStream transform streams to the standard. These enable
byte<>string conversions on a ReadableStream using the pipeThrough()
method (see https://streams.spec.whatwg.org/#rs-pipe-through).

A TextEncoderStream object can be used to transform a stream of strings
to a stream of bytes in UTF-8 encoding. A TextDecoderStream object can
be used to transform a stream of bytes in the encoding passed to the
constructor to strings.

Tests: web-platform-tests/wpt#12430.

There is a prollyfill and tests for the new functionality at
https://github.com/GoogleChromeLabs/text-encode-transform-prollyfill.

Closes #72.
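
For reference, a minimal usage sketch of the two constructors named in the commit message above, assuming a runtime that ships TextEncoderStream and TextDecoderStream as globals. The helper `roundTrip` and its input are hypothetical.

```javascript
// Hypothetical helper: round-trips strings through UTF-8 bytes and back.
async function roundTrip(strings) {
  const source = new ReadableStream({
    start(controller) {
      for (const s of strings) controller.enqueue(s);
      controller.close();
    },
  });

  const decoded = source
    .pipeThrough(new TextEncoderStream())          // strings -> Uint8Array chunks
    .pipeThrough(new TextDecoderStream("utf-8"));  // Uint8Array chunks -> strings

  // Drain the decoded stream into a single string.
  const reader = decoded.getReader();
  let result = "";
  for (;;) {
    const {value, done} = await reader.read();
    if (done) return result;
    result += value;
  }
}
```

Calling `roundTrip(["héllo ", "wörld"])` should resolve to the concatenated input, since both stages use UTF-8.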