diff --git a/encoding.bs b/encoding.bs
index 43d7e5d..b25b880 100644
--- a/encoding.bs
+++ b/encoding.bs
@@ -15,6 +15,8 @@ Translate IDs: dictdef-textdecoderoptions textdecoderoptions,dictdef-textdecodeo
 spec:infra;
     type:dfn;
         text:code point
         text:ascii case-insensitive
+spec:streams;
+    type:interface; text:ReadableStream
@@ -1038,36 +1040,26 @@ function decodeArrayOfStrings(buffer, encoding) {
-
-dictionary TextDecoderOptions {
-  boolean fatal = false;
-  boolean ignoreBOM = false;
-};
-
-dictionary TextDecodeOptions {
-  boolean stream = false;
-};
-
-[Constructor(optional DOMString label = "utf-8", optional TextDecoderOptions options),
- Exposed=(Window,Worker)]
-interface TextDecoder {
+interface mixin TextDecoderCommon {
   readonly attribute DOMString encoding;
   readonly attribute boolean fatal;
   readonly attribute boolean ignoreBOM;
-  USVString decode(optional BufferSource input, optional TextDecodeOptions options);
-};
+};
+
-
A {{TextDecoder}} object has an associated encoding,
-decoder, stream,
-ignore BOM flag (initially unset),
-BOM seen flag (initially unset),
-error mode (initially "replacement
"), and
-do not flush flag (initially unset).
+
The {{TextDecoderCommon}} interface mixin defines attributes shared between
+{{TextDecoder}} and {{TextDecoderStream}} objects. These objects have an associated
+encoding,
+ignore BOM flag (initially unset),
+BOM seen flag (initially unset), and
+error mode (initially
+"replacement
").
-
A {{TextDecoder}} object also has an associated -serialize stream algorithm, that given a +
These objects also have an associated +serialize stream algorithm that, given a stream stream, runs these steps:
While true:
Let token be the result of - reading from stream. +
Let token be the result of reading from stream.
If encoding is UTF-8, UTF-16BE, or UTF-16LE, and - ignore BOM flag and BOM seen flag are unset, then: +
If encoding is UTF-8, UTF-16BE, or + UTF-16LE, and ignore BOM flag and + BOM seen flag are unset, then:
If token is U+FEFF, then set BOM seen flag. +
If token is U+FEFF, then set BOM seen flag.
Otherwise, if token is not end-of-stream, then set - BOM seen flag and append token to output. + BOM seen flag and append token to output.
Otherwise, return output.
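The BOM handling in the serialize stream steps can be sketched in plain JavaScript. This is an illustration only: `serializeStream` and the `state` object (with `encoding`, `ignoreBOM`, and `bomSeen` fields) are hypothetical names standing in for the algorithm and its associated flags, and tokens are modelled as an array of single-character strings.

```javascript
// Hypothetical helper, not part of the standard: mirrors the serialize stream steps.
// `state` stands in for the object's encoding, ignore BOM flag, and BOM seen flag.
function serializeStream(tokens, state) {
  const bomEncodings = ["UTF-8", "UTF-16BE", "UTF-16LE"];
  let output = "";
  for (const token of tokens) {
    const checkBOM =
      bomEncodings.includes(state.encoding) && !state.ignoreBOM && !state.bomSeen;
    if (checkBOM) {
      // The first token settles the question, so the BOM seen flag is set either way.
      state.bomSeen = true;
      // A leading U+FEFF is dropped rather than appended.
      if (token === "\uFEFF") continue;
    }
    output += token;
  }
  // Running out of tokens plays the role of end-of-stream.
  return output;
}

const state = { encoding: "UTF-8", ignoreBOM: false, bomSeen: false };
console.log(JSON.stringify(serializeStream(["\uFEFF", "h", "i"], state))); // "hi"
```

Because the flag lives in `state`, a second call with the same object keeps any later U+FEFF, which is the behavior the BOM seen flag exists to provide for streamed input.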
The encoding
+attribute's getter, when invoked, must return this object's encoding's
+name in ASCII lowercase.
+
+
The fatal
+attribute's getter, when invoked, must return true if this object's
+error mode is "fatal
", and false otherwise.
+
+
The
+ignoreBOM
+attribute's getter, when invoked, must return true if this object's
+ignore BOM flag is set, and false otherwise.
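As a quick illustration of the three getters, the snippet below constructs a decoder and reads the attributes back. It assumes an environment where `TextDecoder` is a global (current browsers and recent Node.js releases); the label `"UTF8"` is just one of the labels that map to UTF-8.

```javascript
// Labels are matched ASCII case-insensitively; the getter reports the
// encoding's name in ASCII lowercase.
const decoder = new TextDecoder("UTF8", { fatal: true });

console.log(decoder.encoding);  // "utf-8"
console.log(decoder.fatal);     // true
console.log(decoder.ignoreBOM); // false
```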
+
+
+
+dictionary TextDecoderOptions {
+  boolean fatal = false;
+  boolean ignoreBOM = false;
+};
+
+dictionary TextDecodeOptions {
+  boolean stream = false;
+};
+
+[Constructor(optional DOMString label = "utf-8", optional TextDecoderOptions options),
+ Exposed=(Window,Worker)]
+interface TextDecoder {
+  USVString decode(optional BufferSource input, optional TextDecodeOptions options);
+};
+TextDecoder includes TextDecoderCommon;
+
+
+
A {{TextDecoder}} object has an associated decoder, +stream, and do not flush flag (initially +unset). +
decoder = new TextDecoder([label = "utf-8" [, options]])
decoder . encoding
- decoder . encoding
+ decoder . fatal
- Returns true if error mode is "fatal
", and false
- otherwise.
+
decoder . fatal
+ Returns true if error mode is "fatal
", and
+ false otherwise.
-
decoder . ignoreBOM
- Returns true if ignore BOM flag is set, and false otherwise. +
decoder . ignoreBOM
+ Returns true if ignore BOM flag is set, and false + otherwise.
decoder . decode([input [, options]])
Returns the result of running encoding's decoder. The
- method can be invoked zero or more times with options's stream
set to
+
Returns the result of running encoding's decoder.
+ The method can be invoked zero or more times with options's stream
set to
true, and then once without options's stream
(or set to false), to process
a fragmented stream. If the invocation without options's stream
(or set to
false) has no input, it's clearest to omit both arguments.
@@ -1140,9 +1171,9 @@ while(buffer = next_chunk()) {
}
string += decoder.decode(); // end-of-stream
-
If the error mode is "fatal
" and
- encoding's decoder returns error, throws a
- {{TypeError}}.
+
If the error mode is "fatal
" and
+ encoding's decoder returns error,
+ throws a {{TypeError}}.
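The difference between the two error modes can be seen with a byte that is never valid in UTF-8. A minimal sketch, assuming `TextDecoder` is available as a global; the variable names are illustrative.

```javascript
// "a" followed by 0xFF, which cannot appear anywhere in well-formed UTF-8.
const invalid = new Uint8Array([0x61, 0xff]);

// Default error mode ("replacement"): the bad byte becomes U+FFFD.
const lenient = new TextDecoder("utf-8").decode(invalid);
console.log(lenient === "a\uFFFD"); // true

// Error mode "fatal": decode() throws instead of substituting.
let caught = null;
try {
  new TextDecoder("utf-8", { fatal: true }).decode(invalid);
} catch (e) {
  caught = e;
}
console.log(caught instanceof TypeError); // true
```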
The @@ -1156,33 +1187,25 @@ constructor, when invoked, must run these steps:
Let dec be a new {{TextDecoder}} object. -
Set dec's encoding to encoding. +
Set dec's encoding to encoding.
If options's fatal
member is true, then set dec's
- error mode to "fatal
".
+ error mode to "fatal
".
If options's ignoreBOM
member is true, then set dec's
- ignore BOM flag.
+ ignore BOM flag.
Return dec.
The encoding
attribute's getter must return
-encoding's name in ASCII lowercase.
-
-
The fatal
attribute's getter must return true
-if error mode is "fatal
", and false otherwise.
-
-
The ignoreBOM
attribute's getter must return
-true if ignore BOM flag is set, and false otherwise.
-
The decode(input, options)
method, when invoked, must run these steps:
If the do not flush flag is unset, set decoder - to a new encoding's decoder, set stream - to a new stream, and unset the BOM seen flag. + to a new encoding's decoder, set + stream to a new stream, and unset the + BOM seen flag.
If options's stream
is true, set the
do not flush flag, and unset the do not flush flag
@@ -1207,7 +1230,8 @@ method, when invoked, must run these steps:
If token is end-of-stream and the do not flush flag - is set, then return output, serialized. + is set, then return output, + serialized.
The way streaming works is to not handle end-of-stream here when the do not flush flag is set and to not unset that flag. That way in a @@ -1220,10 +1244,10 @@ method, when invoked, must run these steps:
Let result be the result of processing token for decoder, stream, output, and - error mode. + error mode.
If result is finished, then return output, - serialized. + serialized.
Otherwise, if result is error, then throw a {{TypeError}}. @@ -1231,6 +1255,20 @@ method, when invoked, must run these steps:
+interface mixin TextEncoderCommon {
+  readonly attribute DOMString encoding;
+};
+
+
+
The {{TextEncoderCommon}} interface mixin defines attributes shared between +{{TextEncoder}} and {{TextEncoderStream}} objects. + +
The encoding
+attribute's getter, when invoked, must return "utf-8
".
+
A {{TextEncoder}} object has an associated encoder. @@ -1254,7 +1293,7 @@ requires buffering of scalar values.
encoder = new TextEncoder()
Returns a new {{TextEncoder}} object. -
encoder . encoding
+ encoder . encoding
Returns "utf-8
".
encoder . encode([input = ""])
@@ -1272,9 +1311,6 @@ constructor, when invoked, must run these steps:
Return enc.
The encoding
attribute's getter must return
-"utf-8
".
-
The encode(input)
method, when invoked,
must run these steps:
@@ -1305,6 +1341,396 @@ must run these steps:
+
The {{GenericTransformStream}} interface mixin represents the concept of a +transform stream in IDL. It is not a {{TransformStream}}, though it has the same interface +and it delegates to one. + +
+interface mixin GenericTransformStream {
+  readonly attribute ReadableStream readable;
+  readonly attribute WritableStream writable;
+};
+
+
+
An object that includes {{GenericTransformStream}} has an associated +transform of type {{TransformStream}}. + +
The readable
attribute's getter,
+when invoked, must return this object's transform.\[[readable]].
+
+
The writable
attribute's getter,
+when invoked, must return this object's transform.\[[writable]].
+
+
+
+[Constructor(optional DOMString label = "utf-8", optional TextDecoderOptions options),
+ Exposed=(Window,Worker)]
+interface TextDecoderStream {
+};
+TextDecoderStream includes TextDecoderCommon;
+TextDecoderStream includes GenericTransformStream;
+
+
+
A {{TextDecoderStream}} object has an associated +decoder and stream. + +
decoder = new
+ TextDecoderStream([label =
+ "utf-8" [, options]])
+ Returns a new {{TextDecoderStream}} object. +
If label is either not a label or is a label for replacement, + throws a {{RangeError}}. + +
decoder . encoding
+ decoder . fatal
+ Returns true if error mode is "fatal
", and
+ false otherwise.
+
+
decoder . ignoreBOM
+ Returns true if ignore BOM flag is set, and false + otherwise. + +
decoder . readable
+ Returns a readable stream whose chunks are strings resulting from running + encoding's decoder on the chunks written to + {{GenericTransformStream/writable}}. + +
decoder . writable
+ Returns a writable stream which accepts {{BufferSource}} chunks and runs them through + encoding's decoder before making them available to + {{GenericTransformStream/readable}}. + +
Typically this will be used via the {{ReadableStream/pipeThrough()}} method on a + {{ReadableStream}} source. + +
+var decoder = new TextDecoderStream(encoding);
+byteReadable
+ .pipeThrough(decoder)
+ .pipeTo(textWritable);
+
+ If the error mode is "fatal
" and
+ encoding's decoder returns error, both
+ {{GenericTransformStream/readable}} and {{GenericTransformStream/writable}} will be errored with a
+ {{TypeError}}.
+
The
+TextDecoderStream(label,
+options)
constructor, when invoked, must run these steps:
+
+
Let encoding be the result of getting an encoding from label. + +
If encoding is failure or replacement, then throw a {{RangeError}}. + +
Let dec be a new {{TextDecoderStream}} object. + +
Set dec's encoding to encoding. + +
If options's fatal
member is true, then set dec's
+ error mode to "fatal
".
+
+
If options's ignoreBOM
member is true, then set dec's
+ ignore BOM flag.
+
+
Set dec's decoder to a new decoder + for dec's encoding, and set dec's + stream to a new stream. + +
Let startAlgorithm be an algorithm that takes no arguments and returns nothing. + +
Let transformAlgorithm be an algorithm which takes a chunk argument + and runs the decode and enqueue a chunk algorithm with dec and + chunk. + +
Let flushAlgorithm be an algorithm which takes no arguments and runs the flush + and enqueue algorithm with dec. + +
Let transform be the result of calling + CreateTransformStream(startAlgorithm, transformAlgorithm, + flushAlgorithm). + +
Set dec's transform to transform. + +
Return dec. +
The decode and enqueue a chunk algorithm, given a {{TextDecoderStream}} object +dec and a chunk, runs these steps: + +
Let bufferSource be the result of + converting chunk to a {{BufferSource}}. If this + throws an exception, then return a promise rejected with that exception. + +
Push a copy of bufferSource to + dec's stream. If this throws an exception, then return a + promise rejected with that exception. + +
Let controller be dec's + transform.\[[transformStreamController]]. + +
Let output be a new stream. + +
While true, run these steps: + +
Let token be the result of reading from dec's stream. + +
If token is end-of-stream, run these steps: +
Let outputChunk be output, + serialized. + +
If outputChunk is non-empty, call + TransformStreamDefaultControllerEnqueue(controller, + outputChunk). + +
Return a new promise resolved with undefined. +
Let result be the result of processing token for + dec's decoder, dec's + stream, output, and dec's + error mode. + +
If result is error, then return a new promise rejected with a + {{TypeError}} exception. +
The flush and enqueue algorithm, which handles the end of data from the input +{{ReadableStream}} object, given a {{TextDecoderStream}} object dec, runs these steps: + +
Let output be a new stream. + +
Let result be the result of processing end-of-stream for + dec's decoder, dec's + stream, output, and dec's + error mode. + +
If result is finished, run these steps: +
Let outputChunk be output, + serialized. + +
Let controller be dec's + transform.\[[transformStreamController]]. + +
If outputChunk is non-empty, call + TransformStreamDefaultControllerEnqueue(controller, + outputChunk). + +
Return a new promise resolved with undefined. +
Otherwise, return a new promise rejected with a {{TypeError}} exception. +
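The overall shape of the constructor, the decode and enqueue a chunk algorithm, and the flush and enqueue algorithm can be approximated in script by wrapping a `TextDecoder` in a `TransformStream`. This is a rough sketch rather than the specified algorithm: it substitutes `decode()` with the `stream` option for the token-level steps, and `SketchTextDecoderStream` is a hypothetical name. It assumes `TextDecoder` and `TransformStream` are available as globals.

```javascript
// Hypothetical approximation: delegates to a TransformStream, as an object
// that includes GenericTransformStream does, but decodes via TextDecoder.
class SketchTextDecoderStream {
  constructor(label = "utf-8", options = {}) {
    // Throws a RangeError for unknown labels, like the specified constructor.
    const decoder = new TextDecoder(label, options);
    const transform = new TransformStream({
      // Counterpart of "decode and enqueue a chunk".
      transform(chunk, controller) {
        const text = decoder.decode(chunk, { stream: true });
        // Only non-empty output is enqueued.
        if (text !== "") controller.enqueue(text);
      },
      // Counterpart of "flush and enqueue": signals end-of-stream to the decoder.
      flush(controller) {
        const text = decoder.decode();
        if (text !== "") controller.enqueue(text);
      },
    });
    this.encoding = decoder.encoding;
    this.fatal = decoder.fatal;
    this.ignoreBOM = decoder.ignoreBOM;
    this.readable = transform.readable;
    this.writable = transform.writable;
  }
}

const sketch = new SketchTextDecoderStream("utf-8", { fatal: true });
console.log(sketch.encoding, sketch.fatal); // utf-8 true
```

In fatal mode, an exception thrown by `decode()` inside `transform` or `flush` errors both sides of the underlying stream, which lines up with the rejected promises in the steps above.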
+[Constructor,
+ Exposed=(Window,Worker)]
+interface TextEncoderStream {
+};
+TextEncoderStream includes TextEncoderCommon;
+TextEncoderStream includes GenericTransformStream;
+
+
+
A {{TextEncoderStream}} object has an associated encoder +and pending high surrogate (initially null). + +
A {{TextEncoderStream}} object offers no label argument as it +only supports UTF-8. + +
encoder = new TextEncoderStream()
+ Returns a new {{TextEncoderStream}} object. + +
encoder . encoding
+ Returns "utf-8
".
+
+
encoder . readable
+ Returns a readable stream whose chunks are {{Uint8Array}}s resulting from running + UTF-8's encoder on the chunks written to {{GenericTransformStream/writable}}. + +
encoder . writable
+ Returns a writable stream which accepts string chunks and runs them through + UTF-8's encoder before making them available to + {{GenericTransformStream/readable}}. + +
Typically this will be used via the {{ReadableStream/pipeThrough()}} method on a + {{ReadableStream}} source. + +
+textReadable
+ .pipeThrough(new TextEncoderStream())
+ .pipeTo(byteWritable);
+The
+TextEncoderStream()
+constructor, when invoked, must run these steps:
+
+
Let enc be a new {{TextEncoderStream}} object. + +
Let startAlgorithm be an algorithm that takes no arguments and returns nothing. + +
Let transformAlgorithm be an algorithm which takes a chunk argument + and runs the encode and enqueue a chunk algorithm with enc and chunk. + +
Let flushAlgorithm be an algorithm which takes no arguments and runs the encode and flush + algorithm with enc. + +
Let transform be the result of calling + CreateTransformStream(startAlgorithm, transformAlgorithm, + flushAlgorithm). + +
Set enc's transform to transform. + +
Return enc. +
The encode and enqueue a chunk algorithm, given a {{TextEncoderStream}} object +enc and chunk, runs these steps: + +
Let input be the result of converting + chunk to a {{DOMString}}. If this throws an exception, then return a promise rejected + with that exception. + +
{{DOMString}} is used here so that a surrogate pair that is split between chunks can + be reassembled into the appropriate scalar value. The behavior is otherwise identical to + {{USVString}}. In particular, lone surrogates will be replaced with U+FFFD. + +
Convert input to a stream. + +
Let output be a new stream. + +
Let controller be enc's + transform.\[[transformStreamController]]. + +
While true, run these steps: + +
Let token be the result of reading from input. + +
If token is end-of-stream, run these steps: + +
Convert output into a byte sequence. + +
If output is non-empty, run these steps: + +
Let chunk be a {{Uint8Array}} object wrapping an {{ArrayBuffer}} containing + output. + +
Call TransformStreamDefaultControllerEnqueue(controller, + chunk). +
Return a new promise resolved with undefined. +
Let result be the result of executing the convert code unit to scalar + value algorithm with enc, token and input. + +
If result is not continue, then process result for + enc's encoder, input, and output. + +
The convert code unit to scalar value algorithm, given a {{TextEncoderStream}} object +enc, token, and stream input, runs these steps: + +
If enc's pending high surrogate is non-null, run these steps: + +
Let high surrogate be enc's pending high surrogate. + +
Set enc's pending high surrogate to null. + +
If token is in the range U+DC00 to U+DFFF, inclusive, then return a code point + whose value is 0x10000 + ((high surrogate − 0xD800) << 10) + + (token − 0xDC00). + +
Prepend token to input. + +
Return U+FFFD. +
If token is in the range U+D800 to U+DBFF, inclusive, then set enc's pending high + surrogate to token and return continue. + +
If token is in the range U+DC00 to U+DFFF, inclusive, then return U+FFFD. + +
Return token. +
This is equivalent to the "convert a JavaScript string into a scalar +value string" algorithm from the Infra Standard, but allows for surrogate pairs that are split +between strings. [[!INFRA]] + +
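The reassembly of a surrogate pair that is split between chunks can be sketched as follows. `SurrogateJoiner`, `push`, and `flush` are hypothetical names used only for this illustration; scalar values are returned as numbers, and the "prepend token to input" step is modelled by falling through so the same code unit is examined again.

```javascript
// Hypothetical sketch of the convert code unit to scalar value steps,
// with the pending high surrogate carried across chunks.
class SurrogateJoiner {
  constructor() {
    this.pendingHighSurrogate = null;
  }

  // Processes one string chunk and returns the scalar values it yields.
  push(chunk) {
    const out = [];
    for (let i = 0; i < chunk.length; i++) {
      const unit = chunk.charCodeAt(i);
      if (this.pendingHighSurrogate !== null) {
        const high = this.pendingHighSurrogate;
        this.pendingHighSurrogate = null;
        if (unit >= 0xdc00 && unit <= 0xdfff) {
          out.push(0x10000 + ((high - 0xd800) << 10) + (unit - 0xdc00));
          continue;
        }
        // Unpaired high surrogate: emit U+FFFD, then reprocess this unit below.
        out.push(0xfffd);
      }
      if (unit >= 0xd800 && unit <= 0xdbff) {
        this.pendingHighSurrogate = unit; // "continue": nothing to emit yet
      } else if (unit >= 0xdc00 && unit <= 0xdfff) {
        out.push(0xfffd); // lone low surrogate
      } else {
        out.push(unit);
      }
    }
    return out;
  }

  // End of input: a still-pending high surrogate becomes U+FFFD.
  flush() {
    return this.pendingHighSurrogate === null ? [] : [0xfffd];
  }
}

const joiner = new SurrogateJoiner();
console.log(joiner.push("\uD83D")); // []
console.log(joiner.push("\uDE00").map((v) => v.toString(16))); // [ '1f600' ]
```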
The encode and flush algorithm, given a {{TextEncoderStream}} object enc, +runs these steps: + +
If enc's pending high surrogate is non-null, run these steps: + +
Let controller be enc's + transform.\[[transformStreamController]]. + +
Let output be the byte sequence 0xEF 0xBF 0xBD. + +
This is the replacement character U+FFFD encoded as UTF-8. + +
Let chunk be a {{Uint8Array}} object wrapping an {{ArrayBuffer}} containing + output. + +
Call TransformStreamDefaultControllerEnqueue(controller, + chunk). +
Return a new promise resolved with undefined. +
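The byte sequence used by the flush steps can be checked against the existing `TextEncoder`, which applies the same replacement to a lone surrogate. A small sketch, assuming `TextEncoder` is available as a global; the `hex` helper is introduced here for display only.

```javascript
// Formats a Uint8Array as space-separated lowercase hex bytes.
const hex = (bytes) => Array.from(bytes, (b) => b.toString(16)).join(" ");

// U+FFFD encoded as UTF-8.
console.log(hex(new TextEncoder().encode("\uFFFD"))); // ef bf bd

// A lone high surrogate is replaced with U+FFFD before encoding,
// so it produces the same three bytes.
console.log(hex(new TextEncoder().encode("\uD83D"))); // ef bf bd
```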