Support loading models with weights above 2GB on Chrome #7609

mattsoulanille · 2023-04-20T18:09:55Z

Chrome throws allocation errors for ArrayBuffers larger than 2GB. This makes it impossible to load TFJS models above this size in Chrome (even with weight sharding), because model loading concatenates all the weights into a single ArrayBuffer.

This PR avoids this concatenation. Instead of slicing the weight tensors out of a single concatenated ArrayBuffer, it keeps the weight buffers in their original shards and slices them using the CompositeArrayBuffer class created in #7598.
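The idea of slicing across shards without first concatenating them can be sketched as follows. This is only an illustration under stated assumptions: `ShardedBuffer` is a hypothetical name and is not the `CompositeArrayBuffer` implementation from #7598. It allocates memory for the requested slice only, never for the sum of all shards.

```typescript
// Hypothetical sketch; NOT the tfjs CompositeArrayBuffer implementation.
class ShardedBuffer {
  private readonly shards: ArrayBuffer[];
  // starts[i] is the global byte offset at which shards[i] begins.
  private readonly starts: number[] = [];
  readonly byteLength: number;

  constructor(shards: ArrayBuffer[]) {
    this.shards = shards;
    let offset = 0;
    for (const shard of shards) {
      this.starts.push(offset);
      offset += shard.byteLength;
    }
    this.byteLength = offset;
  }

  // Copies global bytes [start, end) into a new ArrayBuffer sized for the
  // slice only, reading from whichever shards overlap that range.
  slice(start = 0, end = this.byteLength): ArrayBuffer {
    start = Math.max(0, Math.min(start, this.byteLength));
    end = Math.max(start, Math.min(end, this.byteLength));
    const out = new ArrayBuffer(end - start);
    const view = new Uint8Array(out);
    let written = 0;
    for (let i = 0; i < this.shards.length; i++) {
      const shardStart = this.starts[i];
      const shardEnd = shardStart + this.shards[i].byteLength;
      if (shardEnd <= start) continue;  // Shard ends before the slice begins.
      if (shardStart >= end) break;     // Shard begins after the slice ends.
      const from = Math.max(start, shardStart) - shardStart;
      const to = Math.min(end, shardEnd) - shardStart;
      view.set(new Uint8Array(this.shards[i], from, to - from), written);
      written += to - from;
    }
    return out;
  }
}

// Usage: three shards holding the bytes 1..9; the slice spans all three.
const shardA = new Uint8Array([1, 2, 3]).buffer;
const shardB = new Uint8Array([4, 5]).buffer;
const shardC = new Uint8Array([6, 7, 8, 9]).buffer;
const sharded = new ShardedBuffer([shardA, shardB, shardC]);
const middle = new Uint8Array(sharded.slice(2, 6));  // bytes 3, 4, 5, 6
```

A linear scan over the shards keeps the sketch short; with many shards, a binary search over `starts` would find the first overlapping shard faster.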

To see the logs from the Cloud Build CI, please join either our discussion or announcement mailing list.


mattsoulanille · 2023-04-20T21:21:07Z

tfjs-node/src/io/io_utils.ts

-
-// TODO(cais): Use explicit tf.io.ModelArtifactsInfo return type below once it
-// is available.
-/**
- * Populate ModelArtifactsInfo fields for a model with JSON topology.
- * @param modelArtifacts
- * @returns A ModelArtifactsInfo object.
- */
-export function getModelArtifactsInfoForJSON(
-    modelArtifacts: tf.io.ModelArtifacts) {
-  if (modelArtifacts.modelTopology instanceof ArrayBuffer) {
-    throw new Error('Expected JSON model topology, received ArrayBuffer.');
-  }
-  return {
-    dateSaved: new Date(),
-    modelTopologyType: 'JSON',
-    modelTopologyBytes: modelArtifacts.modelTopology == null ?
-        0 :
-        Buffer.byteLength(JSON.stringify(modelArtifacts.modelTopology), 'utf8'),
-    weightSpecsBytes: modelArtifacts.weightSpecs == null ?
-        0 :
-        Buffer.byteLength(JSON.stringify(modelArtifacts.weightSpecs), 'utf8'),
-    weightDataBytes: modelArtifacts.weightData == null ?
-        0 :
-        modelArtifacts.weightData.byteLength,
-  };
-}


This is duplicated in tfjs-core/src/io/io_utils.ts

mattsoulanille · 2023-04-20T21:21:38Z

tfjs-core/src/io/io_utils.ts

@@ -285,7 +291,7 @@ const useNodeBuffer = typeof Buffer !== 'undefined' &&
 */
 export function stringByteLength(str: string): number {
  if (useNodeBuffer) {
-    return Buffer.byteLength(str);
+    return Buffer.byteLength(str, 'utf8');


tfjs-node used utf8 in its implementation, so I think it should also be here.
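For context, Node's `Buffer.byteLength` already defaults to `'utf8'` when no encoding is given, so the change above makes the encoding explicit rather than altering the result. The sketch below shows why a UTF-8 byte count differs from `str.length`, which is what matters for fields such as `modelTopologyBytes`. `utf8ByteLength` is a hypothetical helper written for illustration and is not part of tfjs.

```typescript
// Hypothetical helper: counts the bytes a string occupies when encoded as
// UTF-8, without relying on Buffer or Blob.
function utf8ByteLength(str: string): number {
  let bytes = 0;
  for (let i = 0; i < str.length; i++) {
    const code = str.charCodeAt(i);
    if (code < 0x80) {
      bytes += 1;  // ASCII
    } else if (code < 0x800) {
      bytes += 2;
    } else if (code >= 0xd800 && code <= 0xdbff && i + 1 < str.length &&
               str.charCodeAt(i + 1) >= 0xdc00 &&
               str.charCodeAt(i + 1) <= 0xdfff) {
      bytes += 4;  // Valid surrogate pair: one code point above U+FFFF.
      i++;         // Skip the low surrogate.
    } else {
      bytes += 3;  // Rest of the BMP; a lone surrogate is counted as U+FFFD.
    }
  }
  return bytes;
}

const accented = 'h\u00e9llo';  // 5 UTF-16 code units, 6 UTF-8 bytes
const accentedBytes = utf8ByteLength(accented);
```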

chunnienc · 2023-04-20T21:16:19Z

tfjs-core/src/io/io_utils.ts

 * @returns Result of concatenating `buffers` in order.
 */
-export function concatenateArrayBuffers(buffers: ArrayBuffer[]): ArrayBuffer {
+export function concatenateArrayBuffers(buffers: ArrayBuffer[]


Can we move this to be part of CompositeArrayBuffer? For example, as a static method CompositeArrayBuffer.join(buffers: ArrayBuffer[]), or as a public method, new CompositeArrayBuffer(buffers).toArrayBuffer(). Either would make it easier to bridge CompositeArrayBuffer with native ArrayBuffer and to pass CompositeArrayBuffer around in the future if needed.

mattsoulanille · Apr 20, 2023

I was considering that, and my original implementation actually used new CompositeArrayBuffer(buffers).slice(), but I removed it in favor of concatenateArrayBuffers because of an issue with the types in tfjs-converter tests (here was my fix for it in the spy_ops.ts file, but it's a bit hacky).

I'm fine with using the converter spy_ops.ts fix if it'll make the core implementation cleaner. What do you think?

Edit: ...and we can add a toArrayBuffer or static join method instead of using .slice.

Alternatively, I can move composite_array_buffer.ts out of io/

chunnienc · Apr 21, 2023

I took a look at the usage of spyOnAllFunctions in tests, and I think the test is something we should fix. A hacky fix like yours is probably fine.

In general, instead of automatically replacing everything with spies via spyOnAllFunctions, we should explicitly create an ioSpy object that contains only the functions we want to spy on, which makes the test more controllable and reliable. Some things exported from io apparently should not be spied on, like getWeightSpecs, which is an io helper function rather than a function that does io.
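The explicit-spy approach suggested above can be sketched without a test framework. Everything here is hypothetical: `IoLike`, `fakeIo`, and `makeSpy` are stand-ins invented for illustration, not the tfjs `io` namespace, the contents of spy_ops.ts, or Jasmine's API. The point is only that one IO function is replaced while pure helpers keep their real behavior.

```typescript
// Hypothetical stand-in for a module's exports; not the real tfjs `io` API.
type IoLike = {
  loadWeights: (path: string) => number[];
  getWeightSpecs: (names: string[]) => string[];
};

const fakeIo: IoLike = {
  // Performs "real" IO, which a unit test should never reach.
  loadWeights: (path) => {
    throw new Error(`real IO attempted: ${path}`);
  },
  // Pure helper: safe to leave un-spied.
  getWeightSpecs: (names) => names.map((n) => `spec:${n}`),
};

// Minimal hand-rolled spy: records each call's arguments and returns a
// canned value.
function makeSpy<A extends unknown[], R>(result: R) {
  const calls: A[] = [];
  const fn = (...args: A): R => {
    calls.push(args);
    return result;
  };
  return {fn, calls};
}

// Spy only on the function that performs IO; everything else stays real.
const loadWeightsSpy = makeSpy<[string], number[]>([1, 2, 3]);
const ioSpy: IoLike = {...fakeIo, loadWeights: loadWeightsSpy.fn};

const loadedWeights = ioSpy.loadWeights('model.bin');
const realSpecs = ioSpy.getWeightSpecs(['dense/kernel']);
```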

mattsoulanille · May 3, 2023

@chunnienc I've replaced concatenateArrayBuffers with CompositeArrayBuffer.join in tfjs-core and deprecated concatenateArrayBuffers. We can't replace it in other packages yet because that would introduce a breaking change: downstream packages could no longer be used with an earlier version of tfjs-core that does not implement CompositeArrayBuffer (see #7273 for an example of why this is important). We can apply this change to all the packages in the next major release.

Actually, it's fine to use it in tests, since users will never run those. I'll swap concatenateArrayBuffers for CompositeArrayBuffer.join in the test files.
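For reference, the operation being renamed in this thread, joining several ArrayBuffers into one, can be sketched as below. `joinBuffers` is a hypothetical free function written for illustration; it is not the code of `CompositeArrayBuffer.join` or `concatenateArrayBuffers`. Because it allocates a single ArrayBuffer for the total size, it remains subject to the allocation limit described in the PR summary and suits only payloads that fit in one buffer.

```typescript
// Hypothetical helper illustrating the "join" operation; not tfjs code.
function joinBuffers(buffers: ArrayBuffer[]): ArrayBuffer {
  // First pass: compute the total size so only one allocation is needed.
  let totalBytes = 0;
  for (const buffer of buffers) {
    totalBytes += buffer.byteLength;
  }

  // Second pass: copy each buffer into place, in order.
  const joined = new ArrayBuffer(totalBytes);
  const joinedView = new Uint8Array(joined);
  let offset = 0;
  for (const buffer of buffers) {
    joinedView.set(new Uint8Array(buffer), offset);
    offset += buffer.byteLength;
  }
  return joined;
}

const firstPart = new Uint8Array([1, 2]).buffer;
const secondPart = new Uint8Array([3, 4, 5]).buffer;
const joinedBytes = new Uint8Array(joinBuffers([firstPart, secondPart]));
```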


pyu10055

Reviewed 4 of 18 files at r1, 1 of 1 files at r2, 13 of 13 files at r3, 4 of 4 files at r4, all commit messages.
Reviewable status: complete! 2 of 1 approvals obtained (waiting on @chunnienc)

mattsoulanille force-pushed the large_model_weights branch 4 times, most recently from fe844d2 to b21b302 Compare April 20, 2023 20:21

mattsoulanille requested review from pyu10055 and chunnienc April 20, 2023 20:26

mattsoulanille marked this pull request as ready for review April 20, 2023 20:26


chunnienc requested changes Apr 20, 2023


mattsoulanille mentioned this pull request Apr 21, 2023

Support loading large model weights #7610

Merged

mattsoulanille added 7 commits May 3, 2023 14:11

Support using a list of ArrayBuffers as model weight data

71365a0

Avoid 'Array.flat()'

b6de597

Simplify some of the tests

ffcfa68

Do not export 'CompositeArrayBuffer' from tfjs-core

d2fdede

Update doc for weightData

121396d

Fix tfjs-node

feaa673

Remove unused import

94ed22e

mattsoulanille force-pushed the large_model_weights branch from 0276e01 to 94ed22e Compare May 3, 2023 21:11

mattsoulanille added 4 commits May 3, 2023 15:00

Replace concatenateArrayBuffers implementation with CompositeArrayBuffer.slice()

de30e69

Rename CompositeArrayBuffer.concatenateArrayBuffers to .join

eb279cc

Replace concatenateArrayBuffers with CompositeArrayBuffer.join in core

dd23261

Change concatenateArrayBuffers to CompositeArrayBuffer.join for test files

0568a30

mattsoulanille requested a review from chunnienc May 3, 2023 22:37

chunnienc approved these changes May 4, 2023


Merge branch 'master' into large_model_weights

c89fe48

pyu10055 approved these changes May 4, 2023


mattsoulanille added 2 commits May 4, 2023 12:09

Merge branch 'master' into large_model_weights

281e847

Merge branch 'master' into large_model_weights

74209e6

mattsoulanille merged commit 086e9d8 into tensorflow:master May 4, 2023

mattsoulanille mentioned this pull request May 19, 2023

Saving / Loading Large Models in IndexedDB Causes OOM #7702

Closed
