
Support dump with graphmodel.execute #6953

Merged (14 commits) Nov 7, 2022

Conversation


@axinging axinging commented Oct 19, 2022

Currently, only graph model prediction with executeAsync supports dump. This has two drawbacks:

  1. Models that predict with execute don't support dump.
  2. Some models have a wrapping layer over the graph model (such as bodypix, pose-detection); to support dumping via graphModel.executeAsync, non-trivial changes are required. Example change: Support model debug in pose-detection tfjs-models#841.

This change removes these limitations and enables dumping on more models.


@axinging axinging changed the title Support dump with predict/execute Support dump with graphmodel.execute Oct 19, 2022
@axinging axinging force-pushed the dump_sync branch 2 times, most recently from b0a9784 to 900ad48 Compare October 25, 2022 01:23
@axinging
Copy link
Contributor Author

@qjia7 @xhcao @haoyunfeix @gyagp PTAL

@axinging axinging marked this pull request as ready for review October 25, 2022 02:01
private tensorsPendingDisposalForAsync: NamedTensorsMap;
private idsKeepForAsync: Set<number>;
// Variables with the Sync suffix are used for dumping by execute.
private tensorsPendingDisposalForSync: Tensor[];
Contributor:

Can you unify to use tensorsPendingDisposal or intermediateTensorsPendingDisposal for both async and sync for simplicity?

Contributor Author:

Done

private tensorsMap: NamedTensorsMap;
private keepTensorForDebug = false;
private dumpMode = DumpMode.Default;
Contributor:

Is a boolean variable keepTensorForDebug, as before, enough? I think tidy won't be used for async execute.

Contributor Author:

This is for:
https://github.com/axinging/tfjs/blob/ca5dbc94cbae0848af0a38d61f58aefe81240c4b/tfjs-converter/src/executor/graph_executor.ts#L346

In Default mode it disposes the tensor; in Sync mode it does nothing; in Async mode it pushes the tensor onto the disposal queue.

                if (this.dumpMode === DumpMode.Default) {
                  tensor.dispose();
                } else if (this.dumpMode === DumpMode.Async) {
                  this.tensorsPendingDisposal.push(tensor);
                }
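For context, the branch above can be sketched end-to-end. The DumpMode enum, FakeTensor, and Executor below are illustrative stand-ins, not the actual tfjs-core/tfjs-converter types:

```typescript
// Hypothetical stand-ins for the real tfjs types, for illustration only.
enum DumpMode { Default, Sync, Async }

class FakeTensor {
  disposed = false;
  dispose() { this.disposed = true; }
}

class Executor {
  tensorsPendingDisposal: FakeTensor[] = [];
  constructor(private dumpMode: DumpMode) {}

  checkTensorForDisposal(tensor: FakeTensor) {
    if (this.dumpMode === DumpMode.Default) {
      // Normal inference: free intermediate tensors eagerly.
      tensor.dispose();
    } else if (this.dumpMode === DumpMode.Async) {
      // Dump via executeAsync: defer disposal so tensors can be inspected.
      this.tensorsPendingDisposal.push(tensor);
    }
    // Sync dump mode: do nothing here; tensors are kept elsewhere.
  }
}
```

The sketch only shows the three-way disposal policy the comment describes; the real method also tracks node names and output usage.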

Contributor:

In sync mode, won't it skip this code, since tensor.kept = true because you call keepTensors before it?

Contributor Author (@axinging, Oct 27, 2022):

You are right. But I think putting the code under "(this.dumpMode === DumpMode.Async)" helps signal that it is dump-specific.
BTW, I moved this logic out of checkTensorForDisposal in the latest change.

@@ -45,10 +52,11 @@ export class GraphExecutor implements FunctionExecutor {
private _functions: {[key: string]: Graph} = {};
private _functionExecutorMap: {[key: string]: FunctionExecutor} = {};
private _resourceManager: ResourceManager;
private intermediateTensors: NamedTensorsMap = {};
private keepIds: Set<number>;
// Variables with the Async suffix are used for dumping by executeAsync.
Contributor:

nit: Update the annotation.

Contributor Author:

done


Object.keys(inputs).forEach(name => {
const [nodeName, index] = parseNodeName(name);
const tensors: Tensor[] = [];
tensors[index] = inputs[name];
tensorsMap[nodeName] = tensors;
// Input tensors should be disposed by user.
this.keepTensors(tensors);
Contributor:

Why do we need to call keepTensors for the inputs, compared with the original code?

Contributor Author:

I add some comments above this line:

        // For some models, such as bodypix, the input tensors are disposed
        // in the model's top-level tidy. In dump mode these tensors are
        // required, so call keep to ensure they are preserved. However, this
        // comes with a side effect in dump mode: a tensor leak.

Contributor:

Use this.keepTensors(tensors, this.tensorsPendingDisposal); to avoid tensor leak?

if (this.keepIntermediateTensors)
{
  this.keepTensors(tensors, this.tensorsPendingDisposal);
}

Contributor Author:

This leak only happens in dump mode. And 'tensors' includes tensors that will be used in later inference, especially in e2e, which runs inference on two different backends.

Contributor:

Anyway, I think we need to fix it. We need to find which place releases it, debug why it's released and then reused again in debug mode, or give a reason why the tensor leak is unavoidable.

if (this.dumpMode === DumpMode.Sync) {
this.tensorsMap = tensorsMap;
} else {
this.tensorsMap = null;
Contributor:

Is it safe to say this.tensorsMap = tensorsMap, since L280 makes sure that async won't go down this path?

Contributor Author:

Yes. This 'if' helps signal that it is dump-specific. In non-dump mode, this.tensorsMap will never be used.

this.tensorsPendingDisposal = null;
}
if (this.dumpMode === DumpMode.Async) {
this.disposeTensorsMap();
Contributor:

Why do we need to call this.disposeTensorsMap() only in async mode?

Contributor Author:

This is intended behaviour:
In sync mode, intermediate tensors are kept in tensorsPendingDisposal.
In async mode, intermediate tensors = tensorsMap - tensorsToKeep.

This is why we have two different dispose paths.
Please note that in the updated version I tried to align these two paths, which means that, in both modes, all intermediate tensors are kept in tensorsPendingDisposal.

@axinging axinging force-pushed the dump_sync branch 2 times, most recently from 0128d75 to a6949d3 Compare October 27, 2022 07:16

private tensorsMap: NamedTensorsMap;
private keepTensorForDebug = false;
private keepTensorsForDump = false;
Contributor:

nit: For me, keepIntermediateTensors is a better name, since the value is set by KEEP_INTERMEDIATE_TENSORS and its purpose is not only dump.

Contributor Author:

done

tfjs-converter/src/executor/graph_executor.ts (outdated comment, resolved)
@qjia7 (Contributor) left a comment:

Thanks Xing.
The code looks good on my side, but the tensor leak issue still needs more investigation to figure out the reason.
Will add more reviewers. Thanks.


@pyu10055 (Collaborator) left a comment:

Reviewable status: 0 of 1 approvals obtained (waiting on @axinging, @jinjingforever, and @qjia7)


tfjs-converter/src/executor/graph_executor.ts line 193 at r6 (raw file):

  }

  private keepTensors(

This assumes there will be no duplication of tensors in the tensorsToKeep array; it might be better to change the tensorsPendingDisposal array to a Set to enable auto-dedupe.


tfjs-converter/src/executor/graph_executor.ts line 268 at r6 (raw file):

Previously, qjia7 (Jiajia Qin) wrote…

Anyway, I think we need to fix it. We need to find which place releases it and debug why it's released and then reused again in debug mode. Or give reason that why tensor leak is unavoidable.

I am against keeping the input tensors as intermediate tensors of the graph model, since they are not generated inside the model. This is beyond the tfjs model dump; it is part of the model API pipeline dump.


tfjs-converter/src/executor/graph_executor.ts line 308 at r6 (raw file):

                    this.intermediateTensors[nodeName][index] = tensor;
                  } else {
                    this.intermediateTensors[nodeName] = [];

The intermediateTensors map contains the names of the tensors; how does the current approach track the names?


axinging commented Oct 28, 2022

Reviewable status: 0 of 1 approvals obtained (waiting on @axinging, @jinjingforever, and @qjia7)

tfjs-converter/src/executor/graph_executor.ts line 193 at r6 (raw file):

  }

  private keepTensors(

this assumes there will be no duplication of tensors from the tensorsToKeep array, might be better to change tensorsPendingDisposal array to a set to enable auto-dedupe.

=> Done

tfjs-converter/src/executor/graph_executor.ts line 268 at r6 (raw file):

Previously, qjia7 (Jiajia Qin) wrote…
I am against keeping the input tensors as intermediate tensor of graph model, since they are not generated inside the model. This is beyond the tfjs model dump, it is part of the model API pipeline dump.

=> Done. With this change, for bodypix, we cannot get all the intermediate tensors because the inputs are disposed.

tfjs-converter/src/executor/graph_executor.ts line 308 at r6 (raw file):

                    this.intermediateTensors[nodeName][index] = tensor;
                  } else {
                    this.intermediateTensors[nodeName] = [];

the intermediateTensors contains the name for the tensors, how is the current approach tracking the name?

=> The original intermediateTensors is a misnomer; it should be tensorsPendingDisposal. The API getIntermediateTensors returns this.tensorsMap, so the names in the original intermediateTensors are of no use.

@pyu10055, updated, PTAL.

@axinging (Contributor Author):

@pyu10055 @jinjingforever
I will use bodypix to explain the background of whether or not to keep input tensors.

// https://github.com/tensorflow/tfjs-models/blob/master/body-pix/src/base_model.ts#L74
return tf.tidy(() => {
  const asFloat = this.preprocessInput(tf.cast(input, 'float32'));
  const asBatch = tf.expandDims(asFloat, 0);
  const results = this.model.predict(asBatch) as tf.Tensor4D[];

From the above code, we can see that asBatch will be disposed after tidy returns.
However, in the later dump, the input (asBatch) is required when running predictOp.
We use the input (asBatch) together with other inputs to construct a tensorMap and run predictOp.
So if asBatch has been disposed, predictOp will fail at this op.

We have two options:

  1. Do not keep input tensors: some models, such as bodypix, will be un-dumpable.
  2. Keep input tensors: bodypix will have tensor leaks when dump is enabled.

Our goal is to support dumping on more models, and this tensor leak happens only when dump is on, so I personally prefer keeping these input tensors. WDYT?

@pyu10055 (Collaborator) left a comment:

Thanks @axinging for explaining. If I understand correctly, you are using the dump mode of the graph model to do a tensor audit for a tfjs-models API that does not share a specific model file?
That sounds quite different from the intermediate tensor dumping use case. It is still not clear, in this case, why the inputs cannot be disposed.

Reviewable status: 0 of 1 approvals obtained (waiting on @jinjingforever and @qjia7)


axinging commented Nov 1, 2022

Thanks @axinging for explaining. If I understand correctly, you are using the dump mode of the graph model to do a tensor audit for a tfjs-models API that does not share a specific model file? That sounds quite different from the intermediate tensor dumping use case; it is still not clear, in this case, why the inputs cannot be disposed.

Reviewable status: 0 of 1 approvals obtained (waiting on @jinjingforever and @qjia7)

About "why can the inputs not be disposed?":
I think we have two considerations regarding this question:

  1. graphModel.getIntermediateTensors is expected to get all tensors: input tensors, weights, intermediate output tensors, and outputs. If the inputs are disposed, graphModel.getIntermediateTensors is not fully functional.
  2. As we know, the previous debug mode has a problem: errors may accumulate, resulting in fake errors. For example, in the subgraph {opA => opB => opC}, if opA is wrong, both opB and opC will be wrong.
    To get rid of fake errors, the new dump works in two steps:
    First, dump tensors into files according to dumpLevel. This is similar to a normal predict, but all the intermediate tensors (input tensors, weights, intermediate output tensors, outputs) are preserved. Code: https://github.com/tensorflow/tfjs/blob/master/e2e/benchmarks/local-benchmark/index.html#L239
    Second, when tensor diffs are spotted, apply the following to each op related to a diffing tensor:
    use the reference as input and run the op again (via predictOp) under the actual backend, then dump all the results into files.
    Code: https://github.com/tensorflow/tfjs/blob/master/e2e/benchmarks/local-benchmark/dump.js#L204
    The predictOp in the second step ensures all fake errors are removed. It also requires all the input tensors; otherwise predictOp will fail, which means the model dump is incomplete.

Conclusion: both graphModel.getIntermediateTensors and fake-error removal require all input tensors.
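The fake-error-removal idea above can be sketched in miniature: re-running each op with *reference* inputs means an error in opA cannot cascade into opB and opC. The Op type, findRealErrors helper, and tolerance below are hypothetical simplifications of the actual e2e dump.js logic:

```typescript
// Each op names its input tensors; `run` is its computation.
type Op = { name: string; inputs: string[]; run: (ins: number[][]) => number[] };

// Compare each op's actual-backend output, computed from REFERENCE inputs,
// against the reference output; only genuinely wrong ops are flagged.
function findRealErrors(
    ops: Op[], reference: Map<string, number[]>,
    actualRun: (op: Op, ins: number[][]) => number[], tol = 1e-3): string[] {
  const bad: string[] = [];
  for (const op of ops) {
    const refInputs = op.inputs.map(name => reference.get(name)!);
    const expected = reference.get(op.name)!;
    const actual = actualRun(op, refInputs);  // actual backend, reference inputs
    const maxDiff = Math.max(...actual.map((v, i) => Math.abs(v - expected[i])));
    if (maxDiff > tol) bad.push(op.name);
  }
  return bad;
}
```

With this scheme, if only opA is miscomputed on the actual backend, opB still receives the correct reference input and passes, so the diff report contains opA alone rather than the whole downstream chain.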

Input tensors are used differently in e2e

In the current e2e, there are two kinds of input tensor usage scenarios; examples are MobileNetV3 and bodypix.

  1. MobileNetV3
    predictFunc: () => {
      const input = tf.randomNormal([1, 224, 224, 3]);
      return predictFunction(input);
    },

In MobileNetV3, the input is never disposed on the user side (this is a tensor leak). So the dump of MobileNetV3 is fully functional, and keeping this tensor in the graph model is meaningless for dump.

  2. bodypix
// https://github.com/tensorflow/tfjs-models/blob/master/body-pix/src/base_model.ts#L74
return tf.tidy(() => {
  const asFloat = this.preprocessInput(tf.cast(input, 'float32'));
  const asBatch = tf.expandDims(asFloat, 0);
  const results = this.model.predict(asBatch) as tf.Tensor4D[];

As mentioned before, the input tensor asBatch will be disposed in tidy. If we do not keep it (call keep) in the graph model, graphModel.getIntermediateTensors cannot get all the required tensors, and fake-error removal is incomplete. (This means the dump of bodypix is incomplete.)

Based on the above MobileNetV3 and bodypix cases, to have better dump support, we'd better keep the input tensors.

@pyu10055 @jinjingforever

@pyu10055 (Collaborator) left a comment:

@axinging Thanks, this makes a lot more sense now. One nitpick is the name: "intermediate tensors" is not very intuitive; it would probably be better called 'inferenceTensorAudit', which could contain inputs + intermediateTensors + outputs.
I think for both MobileNetV3 and bodypix, the input tensor should be disposed once the benchmark is completed.
So, given that the dump has tracked all tensors, when the caller requests disposal of those tensors, it should be able to remove them all.
Why would there be a memory leak?

Reviewable status: 0 of 1 approvals obtained (waiting on @jinjingforever and @qjia7)


axinging commented Nov 1, 2022

@axinging Thanks, this makes a lot more sense now, thanks. One nitpick is the name - intermediate tensors is not very intuitive, probably better be called 'inferenceTensorAudit', which could contains inputs + intermediateTensors+outputs. I think for both MobileNetV3 and bodypix, the input tensor should be disposed as the benchmark is completed. So, given that the dump has tracked for all tensors, when caller request dispose those tensors, it should be able to remove them all? Why there will be memory leak?

Reviewable status: 0 of 1 approvals obtained (waiting on @jinjingforever and @qjia7)

About "why would there be a memory leak?":

We can understand this through two models: e2e/MobileNetV3 and bodypix.

e2e/MobileNetV3: the tensor leak can be fixed.

    predictFunc: () => {
      const input = tf.randomNormal([1, 224, 224, 3]);
      return predictFunction(input);
    },

In the above code, 'input' is never disposed; this is a tensor leak (it existed long before and is unrelated to the dump feature). I think this is a potential bug, and it can be fixed by calling input.dispose after predict is done (I will try to work out a fix for this later).

bodypix: either no tensor leak, or un-dumpable.
The original bodypix has no tensor leak, but dump will fail because asBatch is disposed in tidy.

// https://github.com/tensorflow/tfjs-models/blob/master/body-pix/src/base_model.ts#L74
return tf.tidy(() => {
  const asFloat = this.preprocessInput(tf.cast(input, 'float32'));
  const asBatch = tf.expandDims(asFloat, 0);
  const results = this.model.predict(asBatch) as tf.Tensor4D[];

To support bodypix dump, or to avoid the tensor leak in dump mode, we have three options at the line below:
https://github.com/tensorflow/tfjs/pull/6953/files#diff-44e0a825cd7c6f31c03d9333db5dc21a8937d66a5b28d719c699863eef96ad8dR258

Option 1: do not keep input tensors.
Pros: no tensor leak in bodypix.
Cons: bodypix cannot be dumped.

      Object.keys(inputs).forEach(name => {
        const [nodeName, index] = parseNodeName(name);
        const tensors: Tensor[] = [];
        tensors[index] = inputs[name];
        tensorsMap[nodeName] = tensors;
      });

Option 2: keep input tensors but do not add them to this.tensorsPendingDisposal.
Pros: bodypix can be dumped.
Cons: the user needs to dispose these input tensors (requires changes in tfjs-models/bodypix); otherwise they leak.

      Object.keys(inputs).forEach(name => {
        const [nodeName, index] = parseNodeName(name);
        const tensors: Tensor[] = [];
        tensors[index] = inputs[name];
        tensorsMap[nodeName] = tensors;
        this.keepTensors(tensors);
      });

Option 3: keep input tensors and add them to this.tensorsPendingDisposal.
Pros: bodypix can be dumped.
Cons: the user needs to clone the input in e2e/MobileNetV3 (and other models that call predictFunction in e2e) so that it can be used for the second inference.

      Object.keys(inputs).forEach(name => {
        const [nodeName, index] = parseNodeName(name);
        const tensors: Tensor[] = [];
        tensors[index] = inputs[name];
        tensorsMap[nodeName] = tensors;
        this.keepTensors(tensors, nodeName, this.tensorsPendingDisposal);
      });

We need to consider e2e/MobileNetV3 together with e2e/bodypix here, because they use their inputs differently.
In e2e conformance, all models run two inferences. For e2e/MobileNetV3 the input tensors are shared by the two inferences; for e2e/bodypix they are not.
This means that for e2e/MobileNetV3 the input tensors cannot be disposed after the first inference (Option 3 would dispose them), while for bodypix it is OK to dispose the inputs. The fix is to clone the input tensors for e2e/MobileNetV3.

The basic workflow of current e2e conformance test:

// First inference
predictAndGetData(ExpectedBackend);
if(enableDump) {
 graphModel.getIntermediateTensors();
 graphModel.disposeIntermediateTensors();
}

// Second inference
// Input is shared between two backends. But it will be disposed at first disposeIntermediateTensors if "Input is added to the this.tensorsPendingDisposal".
predictAndGetData(ActualBackend);
if(enableDump) {
 graphModel.getIntermediateTensors();
 graphModel.disposeIntermediateTensors();
}

About the name inferenceTensorAudit

I prefer a name like GraphModel.getNamedTensorsMap, because getIntermediateTensors returns a NamedTensorsMap which contains all tensors:

  getIntermediateTensors(): NamedTensorsMap {
    return this.tensorsMap;
  }

BTW, if the name changes from getIntermediateTensors to getNamedTensorsMap, it no longer matches the flag KEEP_INTERMEDIATE_TENSORS by name. But logically, getNamedTensorsMap does return a NamedTensorsMap, and KEEP_INTERMEDIATE_TENSORS does keep all "intermediate tensors".

So to summarize my proposal:

  1. Name: GraphModel.getIntermediateTensors => GraphModel.getNamedTensorsMap.
  2. Keep-tensor policy: prefer "Option 2: keep input tensors but do not add them to this.tensorsPendingDisposal".
    https://github.com/tensorflow/tfjs/pull/6953/files#diff-44e0a825cd7c6f31c03d9333db5dc21a8937d66a5b28d719c699863eef96ad8dR258
    (Tensor leak in dump mode, but enables dump on bodypix.)
  3. Fix the e2e/MobileNetV3 tensor leak in a follow-up PR, and try "Option 3: keep input tensors and add them to this.tensorsPendingDisposal" in a follow-up PR.

WDYT?
@pyu10055 @jinjingforever

@pyu10055 (Collaborator) left a comment (edited):
I had an offline discussion with @mattsoulanille.
We believe it would be better to use tensor.clone() to keep the input and intermediate tensors:
clone() increases the ref count while not forcefully preventing the tensor from being disposed.
This means you need to store the names and cloned tensors instead of just the names.

@mattsoulanille please feel free to chime in.

@axinging

Reviewable status: 0 of 1 approvals obtained (waiting on @jinjingforever and @qjia7)

@mattsoulanille (Member) commented Nov 1, 2022:

To add to Ping's discussion, I think we can use tf.clone and references to solve this problem without needing special logic to determine when we can dispose each tensor.

Instead of using keep to determine when to dispose each tensor, we can tf.clone it to get another reference. To do this, instead of saving the tensorsMap by setting this.tensorsMap = tensorsMap;, we create a copy of it, where we clone each tensor in the map.

this.tensorsMap = Object.fromEntries(
    Object.entries(tensorsMap).map(([name, tensorsList]) => {
      const clonedList = tensorsList.map(tensor => {
        const cloned = tensor.clone();
        // This clone needs to be 'kept' because model.execute may be called
        // within a tidy(). We don't want tidy() to dispose these cloned
        // tensors because we need to look at them after the model has
        // finished executing (after the tidy()).
        //
        // However, we don't check whether the tensor is 'kept' when we
        // free it.
        keep(cloned);
        return cloned;
      });
      return [name, clonedList];
    }));

When a user calls getIntermediateTensors, they get this map of cloned tensors, which are guaranteed to not be disposed since nothing else interacts with this map (if tensorsMap is used elsewhere, we can just create an extra variable clonedTensorsMap and use it instead).

When we need to clean up the intermediate tensors, we no longer need to iterate tensorsPendingDisposal to make sure we only dispose tensors that we created. We can just dispose all the cloned tensors in tensorsMap, since they're just clones / extra references to the original tensors:

for (const tensorsList of Object.values(this.tensorsMap)) {
  for (const tensor of tensorsList) {
    // This is a clone of the real tensor (i.e. another reference), so it's
    // okay to dispose it. We're not disposing the user's input tensor here,
    // just the clone.
    tensor.dispose();
  }
}

If the user calls model.execute again before calling disposeIntermediateTensors, we just call it for them before we clear tensorsMap and run the model again.

Works when a tensor is reused

Suppose the user reuses an input tensor like this:

const input = tf.randomNormal([1, 224, 224, 3]);

function runModel() {
  return model.execute(input);
}

We can count the number of references to the underlying data of the input tensor as this runs.

  • When input is declared, there is 1 reference to the data. RC = 1
  • When we run model.execute with dump mode, we get another reference in tensorsData. RC = 2.
  • We can look at this cloned tensor (and the other intermediate tensors) now.
  • When we run model.disposeIntermediateTensors, the cloned input tensor is disposed, so RC = 1 again.
  • When we eventually dispose input, then RC = 0 and the data gets deleted.

Works when a new tensor is created inside a tidy()

Suppose the user creates a new tensor each time like this:

function runModel() {
  tf.tidy(() => {
    const input = tf.randomNormal([1, 224, 224, 3]);
    return model.execute(input);
  });
}

We can count the number of references to the underlying data of the input tensor as this runs.

  • When input is declared, there is 1 reference to the data. RC = 1
  • When we run model.execute with dump mode, we get another reference in tensorsData. RC = 2
  • When we exit the tidy, two things happen:
    • The original input tensor is disposed, so RC = 1 now.
    • The cloned input tensor is not disposed because it was marked as kept, so RC = 1.
  • We can now inspect the intermediate tensors, including the input tensor, because they were all kept through the tidy.
  • When we run model.disposeIntermediateTensors, the cloned input tensor is disposed, so RC = 0 and the data gets deleted.

@axinging axinging force-pushed the dump_sync branch 3 times, most recently from 8894c6f to 926c46e Compare November 2, 2022 07:39

axinging commented Nov 2, 2022

Thanks @pyu10055 @mattsoulanille. The updated change uses keep + clone; PTAL.

@pyu10055 (Collaborator) left a comment:

Thank you @axinging, the PR LGTM with some minor questions.

Reviewable status: 0 of 1 approvals obtained (waiting on @axinging, @jinjingforever, and @qjia7)


tfjs-converter/src/executor/graph_executor.ts line 193 at r6 (raw file):

Previously, axinging (Xu Xing) wrote…

=> Done

just want to confirm that inputs are included in the tensorsMap, thanks


tfjs-converter/src/executor/graph_executor.ts line 560 at r13 (raw file):

        const currentContext = context.currentContext;
        if (util.isPromise(tensors)) {
          if (this.keepIntermediateTensors) {

Why would keepIntermediateTensors not work with a promise?
These promises should be resolved to tensors and put into the tensorMap.

@pyu10055 (Collaborator) left a comment:

Reviewed 1 of 3 files at r1, 1 of 2 files at r14, 1 of 1 files at r15, all commit messages.
Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @jinjingforever and @qjia7)

@qjia7 (Contributor) left a comment:

LGTM with one nit.

@@ -256,6 +276,10 @@ export class GraphExecutor implements FunctionExecutor {
if (this.parent == null) {
context.dispose(tensorsToKeep);
}
if (this.keepIntermediateTensors) {
this.clonedTensorsMap = this.cloneTensorMap(tensorsMap);
}
Contributor:

nit: Should this if happen before L276?

Contributor Author:

done

@qjia7 (Contributor) left a comment:

LGTM, thanks.

@pyu10055 @mattsoulanille @jinjingforever Currently, getIntermediateTensors() returns all cloned tensors in the graph, which includes inputs, weights, intermediate tensors, and outputs. Maybe we need a more meaningful name for it in the future. Just listing it here to get your attention!

@mattsoulanille (Member) left a comment:

I'm going to take a look at this PR in a few minutes, but I'm sending this review now to prevent it from being merged before I can take a look (since it already has two approvals).

tfjs-converter/src/executor/graph_executor.ts (outdated comment, resolved)
Member

@mattsoulanille mattsoulanille left a comment

Thanks for waiting for my review!

tfjs-converter/src/executor/graph_executor.ts (outdated; resolved)
@@ -645,7 +651,7 @@ export class GraphExecutor implements FunctionExecutor {
private mapInputs(inputs: NamedTensorMap) {
const result: NamedTensorMap = {};
for (const inputName in inputs) {
const tensor = this._signature?.inputs?.[inputName];
const tensor = this._signature ?.inputs ?.[inputName];
Member

Suggested change
const tensor = this._signature ?.inputs ?.[inputName];
const tensor = this._signature?.inputs?.[inputName];

@@ -669,7 +675,7 @@ export class GraphExecutor implements FunctionExecutor {

private mapOutputs(outputs: string[]) {
return outputs.map(name => {
const tensor = this._signature?.outputs?.[name];
const tensor = this._signature ?.outputs ?.[name];
Member

Suggested change
const tensor = this._signature ?.outputs ?.[name];
const tensor = this._signature?.outputs?.[name];

Comment on lines 359 to 385
Object.entries(this.clonedTensorsMap).forEach(([, tensorsList]) => {
tensorsList.forEach(tensor => {
if (tensor && !tensor.isDisposed) {
tensor.dispose();
}
});
});
Member

This logic can probably be factored out into another function, since it's used in _executeAsync as well.

Suggested change
Object.entries(this.clonedTensorsMap).forEach(([, tensorsList]) => {
tensorsList.forEach(tensor => {
if (tensor && !tensor.isDisposed) {
tensor.dispose();
}
});
});
// Put this function outside of the class.
function tensorMapForEach(tensorsMap: NamedTensorsMap, f: (tensor: Tensor) => void) {
  for (const tensorsList of Object.values(tensorsMap)) {
    for (const tensor of tensorsList) {
      f(tensor);
    }
  }
}
tensorMapForEach(this.clonedTensorsMap, tensor => {
if (tensor && !tensor.isDisposed) {
tensor.dispose();
}
});

Contributor Author

@axinging axinging Nov 4, 2022

Maybe we can do this in a follow-up change? In fact, I do not see the benefit of this refactor, and the original version is more readable.

Member

I should have made this a Nit suggestion. We can keep the current implementation as-is.
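For reference, the proposed helper can be sketched in a self-contained, runnable form. The Tensor interface below is a minimal stand-in for tf.Tensor, limited to the members the loop touches, and makeTensor is an illustrative factory, not part of tfjs:

```typescript
// Minimal stand-in for tf.Tensor; only the members the helper touches.
interface Tensor {
  isDisposed: boolean;
  dispose(): void;
}

type NamedTensorsMap = {[name: string]: Tensor[]};

// Applies f to every tensor in the map. Lives outside the class, as suggested.
function tensorMapForEach(
    tensorsMap: NamedTensorsMap, f: (tensor: Tensor) => void): void {
  for (const tensorsList of Object.values(tensorsMap)) {
    for (const tensor of tensorsList) {
      f(tensor);
    }
  }
}

// Illustrative factory for the mock tensor above.
function makeTensor(): Tensor {
  return {
    isDisposed: false,
    dispose() {
      this.isDisposed = true;
    },
  };
}
```

With a helper like this, each dispose loop collapses to a single call such as tensorMapForEach(map, t => { if (t && !t.isDisposed) t.dispose(); }).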

Comment on lines 428 to 457
Object.entries(tensorsMap).forEach(([, tensorsList]) => {
tensorsList.forEach(tensor => {
if (tensor && !tensor.kept && !tensor.isDisposed &&
!keepIds.has(tensor.id)) {
tensor.dispose();
}
});
});
Member

I factored iterating over a tensor map into another function.

Suggested change
Object.entries(tensorsMap).forEach(([, tensorsList]) => {
tensorsList.forEach(tensor => {
if (tensor && !tensor.kept && !tensor.isDisposed &&
!keepIds.has(tensor.id)) {
tensor.dispose();
}
});
});
tensorMapForEach(tensorsMap, tensor => {
if (tensor && !tensor.kept && !tensor.isDisposed &&
!keepIds.has(tensor.id)) {
tensor.dispose();
}
});

tfjs-converter/src/executor/graph_executor.ts (resolved)
Comment on lines 275 to 282
if (this.keepIntermediateTensors) {
this.clonedTensorsMap = this.cloneTensorMap(tensorsMap);
}
Member

We should dispose all the original intermediate tensors in tensorsMap now that we've cloned them. Otherwise, we're leaking them.

Does it make more sense to clone each tensor in the above for loop? That way, we can immediately dispose it if necessary (in this.checkTensorForDisposal()). That also simplifies this.checkTensorForDisposal, which would no longer need to check this.keepIntermediateTensors, because the tensor is already cloned.

Contributor Author

@axinging axinging Nov 4, 2022

This is possible. But 1) this.cloneTensorMap cannot be reused; we would need some new logic to do this.
2) Your tensorMapForEach refactor mentioned above conflicts with this change.

Contributor

I think Matt is right. For the original intermediate tensors, we should clone each of them when we get them in L274. Otherwise, we are leaking them due to the changes (L330) you added in this.checkTensorForDisposal. Since you already use cloned tensors, theoretically this.checkTensorForDisposal should be unchanged, and the code logic you mentioned can be adjusted based on our requirements.
Similar for the async execute.
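The clone-when-produced pattern being discussed can be sketched in isolation. MockTensor and recordNodeOutputs below are illustrative stand-ins, not the real tf.Tensor or executor code: each node's outputs are cloned into the cloned map as soon as they are produced, so the originals can be disposed without losing the dump copies.

```typescript
class MockTensor {
  isDisposed = false;
  constructor(readonly id: number) {}
  clone(): MockTensor {
    return new MockTensor(this.id + 1000);  // the clone gets its own id
  }
  dispose(): void {
    this.isDisposed = true;
  }
}

type TensorsMap = {[nodeName: string]: MockTensor[]};

// Illustrative stand-in for one step of the execute loop: record the node's
// outputs and, when keeping intermediate tensors, clone them immediately.
function recordNodeOutputs(
    nodeName: string, outputs: MockTensor[], tensorsMap: TensorsMap,
    clonedTensorsMap: TensorsMap|null): void {
  tensorsMap[nodeName] = outputs;
  if (clonedTensorsMap != null) {  // i.e. keepIntermediateTensors is on
    clonedTensorsMap[nodeName] = outputs.map(t => t.clone());
  }
  // A disposal pass may now dispose entries of tensorsMap freely;
  // clonedTensorsMap keeps its own copies.
}
```

The point of cloning at creation time is that the disposal logic no longer needs any special case for the dump mode.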

tfjs-converter/src/executor/graph_executor.ts (resolved)
}

Object.entries(tensorsMap).forEach(([, tensorsList]) => {
tensorsList.forEach(tensor => {
if (tensor && !tensor.kept && !tensor.isDisposed &&
Member

Should this check tensor.kept? I think that's mostly reserved for tidy.

Contributor Author

@axinging axinging Nov 4, 2022

In fact, I am not sure why kept is checked here; it comes from the original version: https://github.com/tensorflow/tfjs/pull/5659/files#diff-44e0a825cd7c6f31c03d9333db5dc21a8937d66a5b28d719c699863eef96ad8dL369.

I kept it because I am not sure about it.

Member

My guess is this was from the previous logic that used kept to save the intermediate tensors, although I could be wrong.

It looks like the keepIds set is what prevents this from disposing the input, output, and weight tensors, so we're not using tensor.kept to prevent them from being disposed. If all the other tensors in tensorsMap are intermediate tensors, then I think it's safe to remove this check.

Contributor Author

removed
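The disposal filter after dropping the kept check can be sketched as follows. These are mock types, and disposeIntermediates is an illustrative name; keepIds is assumed to hold the ids of inputs, outputs, and weights, so everything else in the map is an intermediate:

```typescript
interface Tensor {
  id: number;
  isDisposed: boolean;
  dispose(): void;
}

type NamedTensorsMap = {[name: string]: Tensor[]};

// Disposes every tensor not protected by keepIds; no tensor.kept check needed.
function disposeIntermediates(
    tensorsMap: NamedTensorsMap, keepIds: Set<number>): void {
  for (const tensorsList of Object.values(tensorsMap)) {
    for (const tensor of tensorsList) {
      if (tensor && !tensor.isDisposed && !keepIds.has(tensor.id)) {
        tensor.dispose();
      }
    }
  }
}

// Illustrative factory for the mock tensor above.
function makeTensor(id: number): Tensor {
  return {
    id,
    isDisposed: false,
    dispose() {
      this.isDisposed = true;
    },
  };
}
```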

@axinging axinging force-pushed the dump_sync branch 2 times, most recently from 6cb4acd to 1c7fedf on November 4, 2022 00:40

Contributor Author

axinging commented Nov 4, 2022

Thanks @mattsoulanille for your great feedback.
Here are my responses to your comments:
1) "if the user does not call disposeIntermediateTensors."
I merged your changes, but I do not think this is necessary. In dump mode, even with your change, tensor leaks will still happen (in the last run). So I think the only way to ensure there is no tensor leak is to ask the user to call disposeIntermediateTensors.

2) tensorMapForEach:
I have not changed this yet. In my opinion, the original version is easier to read.
Original:

    Object.entries(this.clonedTensorsMap).forEach(([, tensorsList]) => {
      tensorsList.forEach(tensor => {
        if (tensor && !tensor.isDisposed) {
          tensor.dispose();
        }
      });
    });
    Object.entries(tensorsMap).forEach(([, tensorsList]) => {
        tensorsList.forEach(tensor => {
          if (tensor && !tensor.kept && !tensor.isDisposed &&
              !keepIds.has(tensor.id)) {
            tensor.dispose();
          }
        });
      });

Your suggestion:

    // Put this function outside of the class.
    function tensorMapForEach(tensorsMap: NamedTensorsMap, f: (tensor: Tensor) => void) {
      for (const tensorsList of Object.values(tensorsMap)) {
        for (const tensor of tensorsList) {
          f(tensor);
        }
      }
    }
    tensorMapForEach(this.clonedTensorsMap, tensor => {
      if (tensor && !tensor.isDisposed) {
        tensor.dispose();
      }
    });
    tensorMapForEach(tensorsMap, tensor => {
        if (tensor && !tensor.kept && !tensor.isDisposed &&
            !keepIds.has(tensor.id)) {
          tensor.dispose();
        }
      });

3) The return value of "private cloneTensorMap(tensorsMap: NamedTensorsMap) {":
It's interesting that "yarn lint" doesn't complain about this. Do you know why?

Member

@mattsoulanille mattsoulanille left a comment

Thanks for making the changes I requested! Here are a few more, and then I think this will be ready to merge.


Object.keys(inputs).forEach(name => {
const [nodeName, index] = parseNodeName(name);
const tensors: Tensor[] = [];
tensors[index] = inputs[name];
tensorsMap[nodeName] = tensors;
this.cloneTensorList(tensors, nodeName, this.clonedTensorsMap);
Member

Nit: Make this.cloneTensorsList return a cloned list of tensors instead of storing them in a map. I think this is a bit easier to read, and it matches this.cloneTensorsMap. Also, then this.cloneTensorsMap can use this.cloneTensorsList in its implementation.

Suggested change
this.cloneTensorList(tensors, nodeName, this.clonedTensorsMap);
this.clonedTensorsMap[nodeName] = this.cloneTensorList(tensors);

Contributor Author

done
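The returning-a-list shape adopted here can be sketched in isolation (CloneableTensor is a mock standing in for tf.Tensor); cloneTensorMap then builds directly on cloneTensorList:

```typescript
interface CloneableTensor {
  id: number;
  clone(): CloneableTensor;
}

type NamedTensorsMap = {[name: string]: CloneableTensor[]};

// Returns a cloned list instead of mutating a map passed in by the caller.
function cloneTensorList(tensors: CloneableTensor[]): CloneableTensor[] {
  return tensors.map(tensor => tensor.clone());
}

// The map-level clone is just cloneTensorList applied per node name.
function cloneTensorMap(tensorsMap: NamedTensorsMap): NamedTensorsMap {
  const cloned: NamedTensorsMap = {};
  for (const [name, tensors] of Object.entries(tensorsMap)) {
    cloned[name] = cloneTensorList(tensors);
  }
  return cloned;
}
```

Returning the list keeps both functions side-effect-free, which also avoids the optional-parameter question raised for the mutating signature.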

@@ -247,15 +290,18 @@ export class GraphExecutor implements FunctionExecutor {
`Please use model.executeAsync() instead.`);
}
tensorsMap[node.name] = tensors;
this.cloneTensorList(tensors, node.name, this.clonedTensorsMap);
Member

Does this need a check for this.keepIntermediateTensors?

Member

No, it doesn't, because this.clonedTensorsMap is null, but IMO that's a bit confusing. See my other comment on this.


Object.keys(inputs).forEach(name => {
const [nodeName, index] = parseNodeName(name);
const tensors: Tensor[] = [];
tensors[index] = inputs[name];
tensorsMap[nodeName] = tensors;
this.cloneTensorList(tensors, nodeName, this.clonedTensorsMap);
Member

Tensors should only be cloned if this.keepIntermediateTensors is set, right?

Member

Oh, I see, cloneTensorsList checks if this.clonedTensorsMap is null before running. I find that a bit confusing. Can we do the check here instead?

Also, I think it might be better for this.cloneTensorsList to return a list of cloned tensors instead of mutating the tensors map. See my nit.

Comment on lines 379 to 381
Object.values(this.clonedTensorsMap).forEach(tensorsList => {
tensorsList.forEach(tensor => {
if (tensor && !tensor.isDisposed) {
Member

Nit: Google's TS style guide prefers for (... of ...) instead of .forEach(...) (.map(...) is still okay when using the resulting value).

Contributor Author

done

}

Object.entries(tensorsMap).forEach(([, tensorsList]) => {
tensorsList.forEach(tensor => {
if (tensor && !tensor.kept && !tensor.isDisposed &&
Member

My guess is this was from the previous logic that used kept to save the intermediate tensors, although I could be wrong.

It looks like the keepIds set is what prevents this from disposing the input, output, and weight tensors, so we're not using tensor.kept to prevent them from being disposed. If all the other tensors in tensorsMap are intermediate tensors, then I think it's safe to remove this check.

}

private cloneTensorList(
tensors: Tensor[], nodeName: string, tensorsMap: NamedTensorsMap) {
Member

Suggested change
tensors: Tensor[], nodeName: string, tensorsMap: NamedTensorsMap) {
tensors?: Tensor[], nodeName?: string, tensorsMap?: NamedTensorsMap) {

We'd like to enable strictNullChecks in the future, so if you don't accept the nit that changes this function to return a list of tensors instead of mutating tensorsMap, please mark all these arguments as optional.

Contributor Author

return a list of tensors

tfjs-converter/src/executor/graph_executor.ts (resolved)
axinging and others added 14 commits November 7, 2022 08:56
Currently only graph model prediction with executeAsync supports dump.
This has two drawbacks:
1. Models that predict with execute don't support dump.
2. For models that have a wrapping layer over the graph model, such as
   bodypix and pose-detection, a lot of change is required to support
   dump; example change: tensorflow/tfjs-models#841.

This change removes these limitations, so that more models are supported
and dump is easier.
Contributor Author

@axinging axinging left a comment

}

private cloneTensorList(
tensors: Tensor[], nodeName: string, tensorsMap: NamedTensorsMap) {
Contributor Author

return a list of tensors


Object.keys(inputs).forEach(name => {
const [nodeName, index] = parseNodeName(name);
const tensors: Tensor[] = [];
tensors[index] = inputs[name];
tensorsMap[nodeName] = tensors;
this.cloneTensorList(tensors, nodeName, this.clonedTensorsMap);
Contributor Author

done

Comment on lines 379 to 381
Object.values(this.clonedTensorsMap).forEach(tensorsList => {
tensorsList.forEach(tensor => {
if (tensor && !tensor.isDisposed) {
Contributor Author

done

}

Object.entries(tensorsMap).forEach(([, tensorsList]) => {
tensorsList.forEach(tensor => {
if (tensor && !tensor.kept && !tensor.isDisposed &&
Contributor Author

removed

Member

@mattsoulanille mattsoulanille left a comment

LGTM. Thanks!

@mattsoulanille mattsoulanille merged commit e2e29e4 into tensorflow:master Nov 7, 2022