Commit
Add support for binary data (#12) (#15)
* Add support for binary data (#12)

* Request implementations now supply a `Uint8Array` of binary data through to the `chunkParser`.
* `defaultChunkParser` converts the `chunkBytes` into a string for processing
* Overhauled how parserState is handled; no longer assume that the `chunkParser` will emit a string; instead we use a `state` object whose contract is wholly owned by the parser (we just shuffle it around internally).
* Added test-coverage for UTF-8 characters
* Added dependency on `utf-8` lib for handling (un)marshalling of binary data to text.
* Updated documentation
* Updated npm dependencies.

Inspiration provided by @mwitkow and @ariutta -- many thanks! :)

* Use TextEncoder / Decoder over `utf-8`.

* Update README to convey which polyfills are required for crappy browsers.

* Remove typedarray polyfill

Would appear that IE10 has typedarray support :)

* Add test to ensure we always get a Uint8Array supplied.

* Instantiate TextEncoder/Decoder only once. Use stream option.

* Patch up flaky test.

Browsers using the fallback `xhr` transport should expect an additional call to their `chunkParser` as we flush out the state of the `TextEncoder` when the XHR connection is closed.

* Remove leftover utility methods.
jonnyreeves authored Aug 26, 2016
1 parent f6f8be9 commit 251c178
Showing 14 changed files with 198 additions and 65 deletions.
8 changes: 8 additions & 0 deletions .eslintrc.json
Original file line number Diff line number Diff line change
@@ -5,6 +5,14 @@
"no-var": 2,
"prefer-const": 2
},
"env": {
"browser": true
},
"globals": {
"Uint8Array": false,
"TextEncoder": false,
"TextDecoder": false
},
"parserOptions": {
"ecmaVersion": 6,
"sourceType": "module"
79 changes: 72 additions & 7 deletions README.md
@@ -16,6 +16,11 @@ $ npm install chunked-request

or as a standalone ES5 browser script by obtaining `dist/chunked-request.js` from a [tagged release](https://github.com/jonnyreeves/chunked-request/releases).

## Browser Support
This library is tested against IE 10, Safari, Firefox and Chrome. It relies on browser support for the [TypedArray](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/TypedArray), [TextEncoder](https://developer.mozilla.org/en-US/docs/Web/API/TextEncoder) and [TextDecoder](https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder) browser APIs; for legacy environments such as Safari and IE 10, you will need to supply one or more of the polyfills listed below:

* [TextEncoder / TextDecoder Polyfill](https://www.npmjs.com/package/text-encoding) (IE10, Safari)

## API

```js
@@ -52,16 +57,14 @@ Determine if HTTP cookies will be sent along with the request, one of `same-orig
A function which implements the following interface:

```js
(rawChunk, previousChunkSuffix, isFinalChunk) => [ parsedChunk, chunkSuffix ]
(chunkBytes, state, flush) => [ parsed, state ]
```

The `chunkParser` takes the raw, textual chunk response returned by the server and converts it into the value passed to the `onChunk` callback (see `options.onChunk`). The function may also yield an optional chunkSuffix which will be not be passed to the `onChunk` callback but will instead be supplied as the `previousChunkSuffix` value the next time the `chunkParser` is invoked.

If the `chunkParser` throws an exception, the chunk will be discarded and the error that was raised will be passed to the `onChunk` callback augmented with a `rawChunk` property consisting of the textual chunk for logging / recovery.
The chunk parser converts the supplied `Uint8Array` of bytes into structured data which will be supplied to the `onChunk` callback. If no `chunkParser` function is supplied, the `defaultChunkParser` will be used, which expects the data to be JSON literals delimited by newline (`\n`) characters.

If no `chunkParser` is supplied the `defaultChunkParser` will be used which expects the chunks returned by the server to consist of one or more `\n` delimited lines of JSON object literals which are parsed into an Array.
See [Writing a Custom Chunk Parser](#writing-a-custom-chunk-parser) below for more details on how to implement this interface.

`chunkParser` will be called with `isFinalChunk` as `true` when the response has completed and there was a non-empty `chunkSuffix` from the last chunk. The `rawChunk` will be an empty string and the `previousChunkSuffix` will be the last returned `chunkSuffix`.
If the `chunkParser` throws an exception, the chunk will be discarded and the error that was raised will be passed to the `onChunk` callback, augmented with a `chunkBytes` property containing the `Uint8Array` that was supplied to the parser and a `parserState` property containing the state that was supplied (see below).
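As a sketch of this error contract, the hypothetical harness below mirrors the behaviour described above (it is illustrative, not the library's internal implementation): a throwing parser has `chunkBytes` and `parserState` attached to the error before it is handed to `onChunk`.

```javascript
// Illustrative harness: invoke a chunkParser and report failures to
// onChunk with the augmented error, as described above.
function invokeParser(chunkParser, chunkBytes, parserState, onChunk) {
  try {
    const [ parsed, nextState ] = chunkParser(chunkBytes, parserState, false);
    onChunk(null, parsed);
    return nextState;
  } catch (e) {
    // Attach the raw bytes and parser state for logging / recovery.
    e.chunkBytes = chunkBytes;
    e.parserState = parserState;
    onChunk(e, null);
    return parserState;
  }
}
```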

#### onChunk (optional)
A function which implements the following interface:
@@ -94,6 +97,68 @@ A function which implements the following interface:
({ url, headers, method, body, credentials, onComplete, onRawChunk }) => undefined
```

The underlying function to use to make the request, see the provided implementations if you wish to provide a custom extension.
The underlying function used to make the request; see the provided implementations if you wish to create a custom extension. Note that your implementation must supply a `Uint8Array` to the `onRawChunk` callback.

If no value is supplied, the `chunkedRequest.transportFactory` function will be invoked to determine which transport method to use. The default `transportFactory` will attempt to select the best available method for the current platform, but you can override this function to substitute a test double or a custom implementation.
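A minimal test-double transport might look like the following sketch. It is based on the transport interface documented above; the exact callback names are an assumption and may need adjusting to your version of the library. In a test you would assign a factory returning this function to `chunkedRequest.transportFactory`.

```javascript
// A fake transport that synchronously emits two newline-delimited JSON
// chunks and then completes. Note that transports must supply a
// Uint8Array to onRawChunk.
const encoder = new TextEncoder();

function fakeTransport({ onRawChunk, onComplete }) {
  onRawChunk(encoder.encode('{ "name": "Jonny" }\n'));
  onRawChunk(encoder.encode('{ "name": "Frank" }\n'));
  onComplete({ statusCode: 200 });
}
```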


## Writing a Custom Chunk Parser
The `chunkParser` takes a 'chunk' of bytes in the form of a `Uint8Array`, as provided by the remote server, and converts it into the value passed to the `onChunk` callback (see `options.onChunk`). In its simplest form the `chunkParser` acts as a pass-through; the following example converts the supplied bytes into a string:

```js
chunkedRequest({
  chunkParser(bytes) {
    const str = utf8BytesToString(bytes);
    return [ str ];
  },
  onChunk(err, str) {
    console.log(`Chunk received: ${str}`);
  }
});
```
Chunk parsers will typically be dealing with structured data (e.g. JSON literals) where a message can only be parsed if it is well formed (i.e. a complete JSON literal). Because of the nature of chunked transfer, the server may end up flushing a chunk of data to the browser that contains an incomplete data structure. The example below illustrates this: the first chunk from the server (Chunk 1) ends with an incomplete JSON literal which is subsequently completed by the next chunk (Chunk 2).
```
Server (Chunk 1)> { "name": "Jonny" }\n{ "name": "Frank" }\n{ "na
Server (Chunk 2)> me": "Bob" }
```
A naive chunk parser implementation would attempt to parse the JSON literals contained in each chunk like so:
```js
chunkParser(bytes) {
  const jsonLiterals = utf8BytesToString(bytes).split("\n");
  // This will not work; the last element, `{ "na`, is an incomplete JSON
  // literal and will cause a SyntaxError from JSON.parse.
  return [ jsonLiterals.map(v => JSON.parse(v)) ];
}
```
Instead, the chunkParser should make use of the `state` object to retain any incomplete messages so they can be processed in the next pass:
```js
chunkParser(bytes, state = {}) {
  const jsonLiterals = utf8BytesToString(bytes).split("\n");

  // Does the state object contain any data that was not parsed
  // in a previous pass (see below)?
  if (state.trailer) {
    // Glue the data back together for a (potentially) complete literal.
    jsonLiterals[0] = `${state.trailer}${jsonLiterals[0]}`;
    state.trailer = '';
  }

  // If the chunk did not end with a message delimiter, the final element
  // of the split will be a non-empty, incomplete literal.
  if (jsonLiterals[jsonLiterals.length - 1] !== '') {
    // Move the last entry into the parser's state as it's incomplete; we
    // can process it on the next pass.
    state.trailer = jsonLiterals.pop();
  }

  // Filter out the empty strings left behind by trailing delimiters.
  return [ jsonLiterals.filter(v => v.trim() !== '').map(v => JSON.parse(v)), state ];
}
```
Finally, stateful chunk parsers must observe the third argument, `flush`. This flag will be `true` when the server has closed the connection, indicating that there will be no further data; the `chunkParser` must process any remaining data in the state object at this point.
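Putting these pieces together, a complete stateful parser might look like the following sketch, using the standard `TextDecoder` in place of the hypothetical `utf8BytesToString` helper so that partial multi-byte UTF-8 sequences are also carried across chunk boundaries:

```javascript
// A complete newline-delimited JSON chunk parser handling both the
// trailer state and the final flush.
function ndjsonChunkParser(bytes, state = {}, flush = false) {
  if (!state.decoder) {
    state.decoder = new TextDecoder();
  }
  // stream: true makes the decoder buffer any partial multi-byte UTF-8
  // sequence until the next call; on flush we drain whatever remains.
  const text = state.decoder.decode(bytes, { stream: !flush });
  const jsonLiterals = `${state.trailer || ''}${text}`.split('\n');
  state.trailer = '';

  if (!flush && jsonLiterals[jsonLiterals.length - 1] !== '') {
    // The chunk did not end on a delimiter; keep the tail for next time.
    state.trailer = jsonLiterals.pop();
  }

  const parsed = jsonLiterals
    .filter(v => v.trim() !== '')
    .map(v => JSON.parse(v));
  return [ parsed, state ];
}
```

Note that `TextDecoder` is a standard browser (and Node.js) global; as described in the Browser Support section, legacy environments need the `text-encoding` polyfill.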
2 changes: 2 additions & 0 deletions karma.conf.js
@@ -68,6 +68,7 @@ module.exports = function(config) {

// list of files / patterns to load in the browser
files: [
'node_modules/text-encoding/lib/encoding.js',
'build/integration-tests.js'
],

@@ -78,6 +79,7 @@

proxies: {
'/chunked-response': 'http://localhost:2001/chunked-response',
'/chunked-utf8-response': 'http://localhost:2001/chunked-utf8-response',
'/split-chunked-response': 'http://localhost:2001/split-chunked-response',
'/error-response': 'http://localhost:2001/error-response',
'/echo-response': 'http://localhost:2001/echo-response'
35 changes: 22 additions & 13 deletions package.json
@@ -5,7 +5,14 @@
"jsnext:main": "src/index.js",
"repository": "https://github.com/jonnyreeves/chunked-request",
"license": "MIT",
"keywords": [ "request", "chunked", "transfer", "comet", "xhr", "fetch" ],
"keywords": [
"request",
"chunked",
"transfer",
"comet",
"xhr",
"fetch"
],
"scripts": {
"prepublish": "npm run clean && npm run build:lib",
"clean": "rm -rf build/*",
@@ -17,19 +24,21 @@
"release": "./release.sh ${npm_package_version}"
},
"devDependencies": {
"babel-cli": "^6.6.5",
"babel-preset-es2015": "^6.6.0",
"babelify": "^7.2.0",
"browserify": "^13.0.0",
"cookie": "^0.2.3",
"eslint": "^2.4.0",
"babel-cli": "^6.11.4",
"babel-preset-es2015": "^6.13.2",
"babelify": "^7.3.0",
"browserify": "^13.1.0",
"cookie": "^0.3.1",
"eslint": "^3.3.1",
"jasmine": "^2.4.1",
"jasmine-core": "^2.4.1",
"karma": "^0.13.22",
"karma-chrome-launcher": "^0.2.2",
"karma-jasmine": "^0.3.8",
"karma-sauce-launcher": "^0.3.1",
"lodash": "^4.6.1",
"karma": "^1.2.0",
"karma-chrome-launcher": "^1.0.1",
"karma-jasmine": "^1.0.2",
"karma-sauce-launcher": "^1.0.0",
"lodash": "^4.15.0",
"text-encoding": "^0.6.0",
"url": "^0.11.0"
}
},
"dependencies": {}
}
6 changes: 0 additions & 6 deletions src/.eslintrc.json
@@ -1,10 +1,4 @@
{
"env": {
"browser": true
},
"globals": {
"Uint8Array": false
},
"rules": {
"no-var": 2
}
28 changes: 18 additions & 10 deletions src/defaultChunkParser.js
@@ -3,26 +3,34 @@ const entryDelimiter = '\n';
// The defaultChunkParser expects the response from the server to consist of new-line
// delimited JSON, eg:
//
// { "chunk": "#1", "data": "Hello" }
// { "chunk": "#1", "data": "Hello" }\n
// { "chunk": "#2", "data": "World" }
//
// It will correctly handle the case where a chunk is emitted by the server across
// delimiter boundaries.
export default function defaultChunkParser(rawChunk, prevChunkSuffix = '', isFinalChunk = false) {
let chunkSuffix;

const rawChunks = `${prevChunkSuffix}${rawChunk}`
.split(entryDelimiter);
export default function defaultChunkParser(bytes, state = {}, flush = false) {
if (!state.textDecoder) {
state.textDecoder = new TextDecoder();
}
const textDecoder = state.textDecoder;
const chunkStr = textDecoder.decode(bytes, { stream: !flush })
const jsonLiterals = chunkStr.split(entryDelimiter);
if (state.trailer) {
jsonLiterals[0] = `${state.trailer}${jsonLiterals[0]}`;
state.trailer = '';
}

if (!isFinalChunk && !hasSuffix(rawChunk, entryDelimiter)) {
chunkSuffix = rawChunks.pop();
// Is this a complete message? If not; push the trailing (incomplete) string
// into the state.
if (!flush && !hasSuffix(chunkStr, entryDelimiter)) {
state.trailer = jsonLiterals.pop();
}

const processedChunks = rawChunks
const jsonObjects = jsonLiterals
.filter(v => v.trim() !== '')
.map(v => JSON.parse(v));

return [ processedChunks, chunkSuffix ];
return [ jsonObjects, state ];
}

function hasSuffix(s, suffix) {
3 changes: 1 addition & 2 deletions src/impl/fetch.js
@@ -3,7 +3,6 @@ import { isObject } from '../util';
export const READABLE_BYTE_STREAM = 'readable-byte-stream';

export default function fetchRequest(options) {
const decoder = new TextDecoder();
const { onRawChunk, onRawComplete, method, body, credentials } = options;
const headers = marshallHeaders(options.headers);

@@ -17,7 +16,7 @@
raw: res
});
}
onRawChunk(decoder.decode(result.value));
onRawChunk(result.value);
return pump(reader, res);
});
}
9 changes: 1 addition & 8 deletions src/impl/mozXhr.js
@@ -4,14 +4,7 @@ export default function mozXhrRequest(options) {
const xhr = new XMLHttpRequest();

function onProgressEvent() {
const view = new Uint8Array(xhr.response);
let len = view.length;

const rawString = new Array(len);
while(len--) {
rawString[len] = String.fromCharCode(view[len]);
}
options.onRawChunk(rawString.join(''));
options.onRawChunk(new Uint8Array(xhr.response));
}

function onLoadEvent() {
7 changes: 5 additions & 2 deletions src/impl/xhr.js
@@ -1,16 +1,19 @@
export const XHR = 'xhr';

export default function xhrRequest(options) {
const textEncoder = new TextEncoder();
const xhr = new XMLHttpRequest();
let index = 0;

function onProgressEvent() {
const rawChunk = xhr.responseText.substr(index);
const rawText = xhr.responseText.substr(index);
index = xhr.responseText.length;
options.onRawChunk(rawChunk);
options.onRawChunk(textEncoder.encode(rawText, { stream: true }));
}

function onLoadEvent() {
// Force the textEncoder to flush.
options.onRawChunk(textEncoder.encode(null, { stream: false }));
options.onRawComplete({
statusCode: xhr.status,
transport: XHR,
23 changes: 12 additions & 11 deletions src/index.js
@@ -20,31 +20,32 @@ export default function chunkedRequest(options) {
chunkParser = defaultChunkParser
} = options;

let prevChunkSuffix = "";
// parserState can be utilised by the chunkParser to hold on to state; the
// defaultChunkParser uses it to keep track of any trailing text the last
// delimiter in the chunk. There is no contract for parserState.
let parserState;

function processRawChunk(rawChunk, isFinalChunk = false) {
function processRawChunk(chunkBytes, flush = false) {
let parsedChunks = null;
let parseError = null;
let suffix = "";

try {
[ parsedChunks, suffix ] = chunkParser(rawChunk, prevChunkSuffix, isFinalChunk);
prevChunkSuffix = suffix || "";
[ parsedChunks, parserState ] = chunkParser(chunkBytes, parserState, flush);
} catch (e) {
parseError = e;
parseError.rawChunk = rawChunk;
parseError.prevChunkSuffix = prevChunkSuffix;
parseError.chunkBytes = chunkBytes;
parseError.parserState = parserState;
} finally {
if (parseError || (parsedChunks !== null && parsedChunks.length > 0)) {
if (parseError || (parsedChunks && parsedChunks.length > 0)) {
onChunk(parseError, parsedChunks);
}
}
}

function processRawComplete(rawComplete) {
if (prevChunkSuffix != "") {
// Call the parser with isFinalChunk=true to flush the prevChunkSuffix
processRawChunk("", true);
if (parserState) {
// Flush the parser to process any remaining state.
processRawChunk(new Uint8Array(), true);
}
onComplete(rawComplete);
}
2 changes: 1 addition & 1 deletion src/util.js
@@ -4,4 +4,4 @@ export function isObject(value) {

export function noop() {
/* No operation */
}
}
4 changes: 3 additions & 1 deletion test/integ/.eslintrc.json
@@ -1,6 +1,8 @@
{
"env": {
"browser": true,
"jasmine": true
},
"rules": {
"no-console": 0
}
}