v8: expose new V8 5.5 serialization API #11048

Closed
wants to merge 3 commits into from
251 changes: 251 additions & 0 deletions doc/api/v8.md
@@ -157,3 +157,254 @@ setTimeout(function() { v8.setFlagsFromString('--notrace_gc'); }, 60e3);
[`vm.Script`]: vm.html#vm_new_vm_script_code_options
[here]: https://github.com/thlorenz/v8-flags/blob/master/flags-0.11.md
[`GetHeapSpaceStatistics`]: https://v8docs.nodesource.com/node-5.0/d5/dda/classv8_1_1_isolate.html#ac673576f24fdc7a33378f8f57e1d13a4

## Serialization API

Member

If we are going to introduce this API, even tho I know this is going into the v8 module that is already clearly marked as being fluid based on what v8 chooses to do, we should mark this explicitly as being Experimental for the time being.

Member Author

@jasnell done!

> Stability: 1 - Experimental

The serialization API provides a means of serializing JavaScript values in a way
that is compatible with the [HTML structured clone algorithm][].
The format is backward-compatible (i.e. safe to store to disk).

*Note*: This API is under development, and changes (including incompatible
changes to the API or wire format) may occur until this warning is removed.

### v8.serialize(value)
<!--
added: REPLACEME
-->

* Returns: {Buffer}

Uses a [`DefaultSerializer`][] to serialize `value` into a buffer.

### v8.deserialize(buffer)
<!--
added: REPLACEME
-->

* `buffer` {Buffer|Uint8Array} A buffer returned by [`serialize()`][].

Uses a [`DefaultDeserializer`][] with default options to read a JS value
from a buffer.
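
A minimal round-trip sketch, not part of the patch itself, may help illustrate
the two convenience functions. It assumes a Node.js build in which
`v8.serialize()` and `v8.deserialize()` are available; the sample object is
purely illustrative.

```js
const v8 = require('v8');

// Values that JSON cannot represent faithfully but structured clone can.
const original = {
  when: new Date(0),
  tags: new Set(['a', 'b']),
  lookup: new Map([['answer', 42]]),
};
original.self = original; // Circular references are preserved.

const serialized = v8.serialize(original);
const copy = v8.deserialize(serialized);

console.log(Buffer.isBuffer(serialized));
console.log(copy.when instanceof Date, copy.tags.has('a'), copy.self === copy);
```

The result is a deep copy: `copy` shares no objects with `original`.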

### class: v8.Serializer
<!--
added: REPLACEME
-->

#### new Serializer()
Creates a new `Serializer` object.

#### serializer.writeHeader()

Writes out a header, which includes the serialization format version.

#### serializer.writeValue(value)

Serializes a JavaScript value and adds the serialized representation to the
internal buffer.

This throws an error if `value` cannot be serialized.
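
As a sketch under the assumption that the classes behave as documented in this
section, the low-level calls can be chained by hand, and a value that cannot be
cloned (a function, here) makes `writeValue()` throw. A separate serializer is
used for the failing write, because the state of the buffer after a failed
write is not defined.

```js
const v8 = require('v8');

// Manual equivalent of v8.serialize() using the base class.
const ser = new v8.Serializer();
ser.writeHeader();
ser.writeValue({ hello: 'world' });
const wire = ser.releaseBuffer();

// Manual equivalent of v8.deserialize().
const des = new v8.Deserializer(wire);
des.readHeader();
const value = des.readValue();
console.log(value.hello);

// Functions are not cloneable, so this write is expected to throw.
let writeError = null;
try {
  const failing = new v8.Serializer();
  failing.writeHeader();
  failing.writeValue(() => {});
} catch (err) {
  writeError = err;
}
console.log(writeError instanceof Error);
```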

#### serializer.releaseBuffer()
Member

If we want to keep close to the V8 API, this should be called release(). Did you keep it like that because a Buffer is returned?

Member Author

@targos I would assume V8 calling it Release is because ReleaseBuffer was already taken by the legacy method. But yeah, it’s nice that releaseBuffer() tells you the (otherwise not obvious) return type.


Returns the stored internal buffer. This serializer should not be used once
the buffer is released. Calling this method results in undefined behavior
if a previous write has failed.
Contributor

Is that accurate? On failure, you aren't allowed to cleanup?

Contributor

how do writes fail? do they throw errors? writeHeader/Value don't say they can throw.

Member Author

> On failure, you aren't allowed to cleanup?

I guess that depends on what you mean by “cleanup”? Any resources held by the serializer will be returned when it gets gc’ed.

> how do writes fail? do they throw errors? writeHeader/Value don't say they can throw.

Yes, writeValue may throw errors. I’ve noted that in the corresponding section.


#### serializer.transferArrayBuffer(id, arrayBuffer)

* `id` {integer} A 32-bit unsigned integer.
* `arrayBuffer` {ArrayBuffer} An `ArrayBuffer` instance.

Marks an `ArrayBuffer` as having its contents transferred out of band.
Pass the corresponding `ArrayBuffer` in the deserializing context to
[`deserializer.transferArrayBuffer()`][].

#### serializer.writeUint32(value)

* `value` {integer}

Write a raw 32-bit unsigned integer.
For use inside of a custom [`serializer._writeHostObject()`][].

#### serializer.writeUint64(hi, lo)

* `hi` {integer}
* `lo` {integer}

Write a raw 64-bit unsigned integer, split into high and low 32-bit parts.
For use inside of a custom [`serializer._writeHostObject()`][].
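
Because the value is passed as two 32-bit halves, a caller holding a single
64-bit quantity has to split it first. The helpers below are a hypothetical
sketch (the names `splitUint64` and `joinUint64` are not part of this API) and
assume a runtime with `BigInt` support.

```js
// Split an unsigned 64-bit BigInt into [hi, lo] 32-bit unsigned halves,
// matching the argument order of serializer.writeUint64(hi, lo).
function splitUint64(value) {
  const hi = Number((value >> 32n) & 0xffffffffn);
  const lo = Number(value & 0xffffffffn);
  return [hi, lo];
}

// Inverse operation, for the [hi, lo] pair returned by readUint64().
function joinUint64(hi, lo) {
  return (BigInt(hi) << 32n) | BigInt(lo);
}

const [hi, lo] = splitUint64(0x0000000100000002n);
console.log(hi, lo, joinUint64(hi, lo) === 0x0000000100000002n);
```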

#### serializer.writeDouble(value)

* `value` {number}

Write a JS `number` value.
For use inside of a custom [`serializer._writeHostObject()`][].

#### serializer.writeRawBytes(buffer)

* `buffer` {Buffer|Uint8Array}

Write raw bytes into the serializer’s internal buffer. The deserializer
will require a way to compute the length of the buffer.
For use inside of a custom [`serializer._writeHostObject()`][].

#### serializer.\_writeHostObject(object)

* `object` {Object}

This method is called to write some kind of host object, i.e. an object created
by native C++ bindings. If it is not possible to serialize `object`, a suitable
exception should be thrown.

This method is not present on the `Serializer` class itself but can be provided
by subclasses.

#### serializer.\_getDataCloneError(message)

* `message` {string}

This method is called to generate error objects that will be thrown when an
object cannot be cloned.

This method defaults to the [`Error`][] constructor and can be overridden on
subclasses.

#### serializer.\_getSharedArrayBufferId(sharedArrayBuffer)

* `sharedArrayBuffer` {SharedArrayBuffer}

This method is called when the serializer is going to serialize a
`SharedArrayBuffer` object. It must return an unsigned 32-bit integer ID for
the object, using the same ID if this `SharedArrayBuffer` has already been
serialized. When deserializing, this ID will be passed to
[`deserializer.transferArrayBuffer()`][].

If the object cannot be serialized, an exception should be thrown.

This method is not present on the `Serializer` class itself but can be provided
by subclasses.

#### serializer.\_setTreatArrayBufferViewsAsHostObjects(flag)

* `flag` {boolean}

Indicate whether to treat `TypedArray` and `DataView` objects as
host objects, i.e. pass them to [`serializer._writeHostObject`][].

The default is not to treat those objects as host objects.

### class: v8.Deserializer
<!--
added: REPLACEME
-->

#### new Deserializer(buffer)

* `buffer` {Buffer|Uint8Array} A buffer returned by [`serializer.releaseBuffer()`][].

Creates a new `Deserializer` object.

#### deserializer.readHeader()

Reads and validates a header (including the format version).
May, for example, reject an invalid or unsupported wire format. In that case,
an `Error` is thrown.

#### deserializer.readValue()

Deserializes a JavaScript value from the buffer and returns it.

#### deserializer.transferArrayBuffer(id, arrayBuffer)

* `id` {integer} A 32-bit unsigned integer.
* `arrayBuffer` {ArrayBuffer|SharedArrayBuffer} An `ArrayBuffer` instance.

Marks an `ArrayBuffer` as having its contents transferred out of band.
Pass the corresponding `ArrayBuffer` in the serializing context to
[`serializer.transferArrayBuffer()`][] (or return the `id` from
[`serializer._getSharedArrayBufferId()`][] in the case of `SharedArrayBuffer`s).

#### deserializer.getWireFormatVersion()

* Returns: {integer}

Reads the underlying wire format version. This is likely to be useful mostly to
legacy code that reads old wire format versions. It may not be called before
`.readHeader()`.
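
A short sketch of the required call order, assuming the behavior described
above. The concrete version number depends on the V8 release, so it is only
checked for being an integer here.

```js
const v8 = require('v8');

const des = new v8.Deserializer(v8.serialize('hello'));
des.readHeader(); // Must come first; it parses the version from the header.
const version = des.getWireFormatVersion();
const payload = des.readValue();

console.log(Number.isInteger(version), payload);
```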

#### deserializer.readUint32()

* Returns: {integer}

Read a raw 32-bit unsigned integer and return it.
For use inside of a custom [`deserializer._readHostObject()`][].

#### deserializer.readUint64()

* Returns: {Array}

Read a raw 64-bit unsigned integer and return it as an array `[hi, lo]`
with two 32-bit unsigned integer entries.
For use inside of a custom [`deserializer._readHostObject()`][].

#### deserializer.readDouble()

* Returns: {number}

Read a JS `number` value.
For use inside of a custom [`deserializer._readHostObject()`][].

#### deserializer.readRawBytes(length)

* Returns: {Buffer}

Read raw bytes from the deserializer’s internal buffer. The `length` parameter
must correspond to the length of the buffer that was passed to
[`serializer.writeRawBytes()`][].
For use inside of a custom [`deserializer._readHostObject()`][].

#### deserializer.\_readHostObject()

This method is called to read some kind of host object, i.e. an object that is
created by native C++ bindings. If it is not possible to deserialize the data,
a suitable exception should be thrown.

This method is not present on the `Deserializer` class itself but can be
provided by subclasses.
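
The host object hooks on both sides have to agree on a layout. The following
sketch shows one possible pairing; the class names `ViewSerializer` and
`ViewDeserializer` are hypothetical, and the length-prefixed layout is an
assumption chosen for illustration, not the format used by
`DefaultSerializer`. It always reads views back as `Uint8Array`.

```js
const v8 = require('v8');

class ViewSerializer extends v8.Serializer {
  constructor() {
    super();
    // Route TypedArray and DataView objects through _writeHostObject().
    this._setTreatArrayBufferViewsAsHostObjects(true);
  }

  _writeHostObject(view) {
    // Length prefix first, so the reader knows how many raw bytes follow.
    this.writeUint32(view.byteLength);
    this.writeRawBytes(new Uint8Array(view.buffer,
                                      view.byteOffset,
                                      view.byteLength));
  }
}

class ViewDeserializer extends v8.Deserializer {
  _readHostObject() {
    const length = this.readUint32();
    // Copy the bytes so the result does not alias the input buffer.
    return new Uint8Array(this.readRawBytes(length));
  }
}

const ser = new ViewSerializer();
ser.writeHeader();
ser.writeValue({ payload: Uint8Array.of(1, 2, 3) });

const des = new ViewDeserializer(ser.releaseBuffer());
des.readHeader();
const result = des.readValue();
console.log(Array.from(result.payload));
```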

### class: v8.DefaultSerializer
<!--
added: REPLACEME
-->

A subclass of [`Serializer`][] that serializes `TypedArray`
(in particular [`Buffer`][]) and `DataView` objects as host objects, and only
stores the part of their underlying `ArrayBuffer`s that they are referring to.

### class: v8.DefaultDeserializer
<!--
added: REPLACEME
-->

A subclass of [`Deserializer`][] corresponding to the format written by
[`DefaultSerializer`][].
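
To illustrate why only the referenced part of the `ArrayBuffer` is stored, the
sketch below serializes a small `Buffer`. It assumes that small buffers may be
carved out of a larger shared allocation, as the implementation comments in
`lib/v8.js` describe; the exact sizes printed depend on the Node.js version.

```js
const v8 = require('v8');

const small = Buffer.from('abc');
const wire = v8.serialize(small);
const copy = v8.deserialize(wire);

// The backing ArrayBuffer may be far larger than the three bytes in use,
// yet the serialized form only carries those bytes plus a few header bytes.
console.log(small.buffer.byteLength, wire.length, copy.toString());
```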

[`Buffer`]: buffer.html
[`Error`]: errors.html#errors_class_error
[`deserializer.transferArrayBuffer()`]: #v8_deserializer_transferarraybuffer_id_arraybuffer
[`deserializer._readHostObject()`]: #v8_deserializer_readhostobject
[`serializer.transferArrayBuffer()`]: #v8_serializer_transferarraybuffer_id_arraybuffer
[`serializer.releaseBuffer()`]: #v8_serializer_releasebuffer
[`serializer.writeRawBytes()`]: #v8_serializer_writerawbytes_buffer
[`serializer._writeHostObject()`]: #v8_serializer_writehostobject_object
[`serializer._getSharedArrayBufferId()`]: #v8_serializer_getsharedarraybufferid_sharedarraybuffer
[`Serializer`]: #v8_class_v8_serializer
[`Deserializer`]: #v8_class_v8_deserializer
[`DefaultSerializer`]: #v8_class_v8_defaultserializer
[`DefaultDeserializer`]: #v8_class_v8_defaultdeserializer
[`serialize()`]: #v8_v8_serialize_value
[HTML structured clone algorithm]: https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm
4 changes: 3 additions & 1 deletion lib/buffer.js
@@ -1398,4 +1398,6 @@ Buffer.prototype.toLocaleString = Buffer.prototype.toString;

// Put this at the end because internal/buffer has a circular
// dependency on Buffer.
-exports.transcode = require('internal/buffer').transcode;
+const internalBuffer = require('internal/buffer');
+exports.transcode = internalBuffer.transcode;
+internalBuffer.FastBuffer = FastBuffer;
119 changes: 119 additions & 0 deletions lib/v8.js
@@ -14,7 +14,14 @@

'use strict';

const Buffer = require('buffer').Buffer;

const v8binding = process.binding('v8');
const serdesBinding = process.binding('serdes');
Member

perhaps something more explicit such as v8serdes would be better? Not sure about that tho.
@nodejs/node-chakracore have you all considered implementing this mechanism yet?

Member Author

Yeah, I guess that depends if and how node-chakracore would want to implement this. I don’t think the binding name really matters a lot anyway.

const bufferBinding = process.binding('buffer');

const { objectToString } = require('internal/util');
const { FastBuffer } = require('internal/buffer');

// Properties for heap statistics buffer extraction.
const heapStatisticsBuffer =
@@ -80,3 +87,115 @@ exports.getHeapSpaceStatistics = function() {

return heapSpaceStatistics;
};

/* V8 serialization API */
Member

Tiniest of style nits but is there a reason for mixing C and C++-style comments?

Member Author

@bnoordhuis I would say C-style comments feel a bit more heading-y than C++-style comments? I never consciously noticed but I think I use // for text that only refers to the next one or two statements, whereas /* … */ refers to a longer section of code. That also seems to match how we use eslint-disable comments in our codebase.

If you feel strongly about it, I have no problem changing the format in either way. :)

Member

Never looked at it that way. I suppose it's fine, lib/ is a mixture of both; // is more prevalent but that's probably also because of the copyright header.


const Serializer = exports.Serializer = serdesBinding.Serializer;
const Deserializer = exports.Deserializer = serdesBinding.Deserializer;

/* JS methods for the base objects */
Serializer.prototype._getDataCloneError = Error;

Deserializer.prototype.readRawBytes = function(length) {
  const offset = this._readRawBytes(length);
  // `this.buffer` can be a Buffer or a plain Uint8Array, so just calling
  // `.slice()` doesn't work.
  return new FastBuffer(this.buffer.buffer,
                        this.buffer.byteOffset + offset,
                        length);
};

/* Keep track of how to handle different ArrayBufferViews.
 * The default Serializer for Node does not use the V8 methods for serializing
 * those objects because Node's `Buffer` objects use pooled allocation in many
 * cases, and their underlying `ArrayBuffer`s would show up in the
 * serialization. Because a) those may contain sensitive data and the user
 * may not be aware of that and b) they are often much larger than the `Buffer`
 * itself, custom serialization is applied. */
const arrayBufferViewTypes = [Int8Array, Uint8Array, Uint8ClampedArray,
                              Int16Array, Uint16Array, Int32Array, Uint32Array,
                              Float32Array, Float64Array, DataView];

const arrayBufferViewTypeToIndex = new Map();

{
  const dummy = new ArrayBuffer();
  for (const [i, ctor] of arrayBufferViewTypes.entries()) {
    const tag = objectToString(new ctor(dummy));
    arrayBufferViewTypeToIndex.set(tag, i);
  }
}

const bufferConstructorIndex = arrayBufferViewTypes.push(Buffer) - 1;

class DefaultSerializer extends Serializer {
  constructor() {
    super();

    this._setTreatArrayBufferViewsAsHostObjects(true);
  }

  _writeHostObject(abView) {
    let i = 0;
    if (abView.constructor === Buffer) {
      i = bufferConstructorIndex;
    } else {
      const tag = objectToString(abView);
      i = arrayBufferViewTypeToIndex.get(tag);

      if (i === undefined) {
        throw this._getDataCloneError(`Unknown host object type: ${tag}`);
      }
    }
    this.writeUint32(i);
    this.writeUint32(abView.byteLength);
    this.writeRawBytes(new Uint8Array(abView.buffer,
                                      abView.byteOffset,
                                      abView.byteLength));
  }
}

exports.DefaultSerializer = DefaultSerializer;

class DefaultDeserializer extends Deserializer {
  constructor(buffer) {
    super(buffer);
  }

  _readHostObject() {
    const typeIndex = this.readUint32();
    const ctor = arrayBufferViewTypes[typeIndex];
    const byteLength = this.readUint32();
    const byteOffset = this._readRawBytes(byteLength);
    const BYTES_PER_ELEMENT = ctor.BYTES_PER_ELEMENT || 1;

    // `offset` is relative to the underlying ArrayBuffer, whereas
    // `byteOffset` is relative to `this.buffer` (the view itself).
    const offset = this.buffer.byteOffset + byteOffset;
    if (offset % BYTES_PER_ELEMENT === 0) {
      return new ctor(this.buffer.buffer,
                      offset,
                      byteLength / BYTES_PER_ELEMENT);
    } else {
      // Copy to an aligned buffer first. The source range is given relative
      // to `this.buffer`, so `byteOffset` is used here rather than `offset`.
      const copy = Buffer.allocUnsafe(byteLength);
      bufferBinding.copy(this.buffer, copy, 0,
                         byteOffset, byteOffset + byteLength);
      return new ctor(copy.buffer,
                      copy.byteOffset,
                      byteLength / BYTES_PER_ELEMENT);
    }
  }
}

exports.DefaultDeserializer = DefaultDeserializer;

exports.serialize = function serialize(value) {
  const ser = new DefaultSerializer();
  ser.writeHeader();
  ser.writeValue(value);
  return ser.releaseBuffer();
};

exports.deserialize = function deserialize(buffer) {
  const der = new DefaultDeserializer(buffer);
  der.readHeader();
  return der.readValue();
};