
Support dump with graphmodel.execute #6953

Merged (14 commits) Nov 7, 2022

Conversation


@axinging axinging commented Oct 19, 2022

Currently, only graph model prediction with executeAsync supports dump. This has two drawbacks:

  1. Models that predict with execute don't support dump.
  2. Some models have a wrapping layer over the graph model (such as bodypix, pose-detection); to support dumping via graphModel.executeAsync, non-trivial changes are required. Example change: Support model debug in pose-detection tfjs-models#841.

This change removes these limitations and enables dumping on more models.


@axinging axinging changed the title Support dump with predict/execute Support dump with graphmodel.execute Oct 19, 2022
@axinging axinging force-pushed the dump_sync branch 2 times, most recently from b0a9784 to 900ad48 Compare October 25, 2022 01:23
@axinging
Copy link
Contributor Author

@qjia7 @xhcao @haoyunfeix @gyagp PTAL

@axinging axinging marked this pull request as ready for review October 25, 2022 02:01
private tensorsPendingDisposalForAsync: NamedTensorsMap;
private idsKeepForAsync: Set<number>;
// Variables with the Sync suffix are used for dumping by execute.
private tensorsPendingDisposalForSync: Tensor[];
Contributor:

Can you unify to use tensorsPendingDisposal or intermediateTensorsPendingDisposal for both async and sync for simplicity?

Contributor Author:

Done

private tensorsMap: NamedTensorsMap;
private keepTensorForDebug = false;
private dumpMode = DumpMode.Default;
Contributor:

Is a boolean variable keepTensorForDebug, as before, enough? I think tidy won't be used for async execute.

Contributor Author:

This is for:
https://github.com/axinging/tfjs/blob/ca5dbc94cbae0848af0a38d61f58aefe81240c4b/tfjs-converter/src/executor/graph_executor.ts#L346

In Default mode it disposes the tensor; in Sync mode it does nothing; in Async mode it pushes the tensor onto the disposal queue.

                if (this.dumpMode === DumpMode.Default) {
                  tensor.dispose();
                } else if (this.dumpMode === DumpMode.Async) {
                  this.tensorsPendingDisposal.push(tensor);
                }
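For context, the branch above can be sketched end-to-end. The DumpMode enum, FakeTensor, and Executor below are illustrative stand-ins, not the actual tfjs-core/tfjs-converter types:

```typescript
// Hypothetical stand-ins for the real tfjs types, for illustration only.
enum DumpMode { Default, Sync, Async }

class FakeTensor {
  disposed = false;
  dispose() { this.disposed = true; }
}

class Executor {
  tensorsPendingDisposal: FakeTensor[] = [];
  constructor(private dumpMode: DumpMode) {}

  checkTensorForDisposal(tensor: FakeTensor) {
    if (this.dumpMode === DumpMode.Default) {
      // Normal inference: free intermediate tensors eagerly.
      tensor.dispose();
    } else if (this.dumpMode === DumpMode.Async) {
      // Dump via executeAsync: defer disposal so tensors can be inspected.
      this.tensorsPendingDisposal.push(tensor);
    }
    // Sync dump mode: do nothing here; tensors are kept elsewhere.
  }
}
```

The sketch only shows the three-way disposal policy the comment describes; the real method also tracks node names and output usage.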

Contributor:

In sync mode, won't it skip this code, since tensor.kept = true because you call keepTensors before it?

Contributor Author (@axinging, Oct 27, 2022):

You are right. But I think putting the code under "(this.dumpMode === DumpMode.Async)" helps signal that it is dump-specific.
BTW, I moved this logic out of checkTensorForDisposal in the latest change.

@@ -45,10 +52,11 @@ export class GraphExecutor implements FunctionExecutor {
private _functions: {[key: string]: Graph} = {};
private _functionExecutorMap: {[key: string]: FunctionExecutor} = {};
private _resourceManager: ResourceManager;
private intermediateTensors: NamedTensorsMap = {};
private keepIds: Set<number>;
// Variables with the Async suffix are used for dumping by executeAsync.
Contributor:

nit: Update the annotation.

Contributor Author:

done


Object.keys(inputs).forEach(name => {
const [nodeName, index] = parseNodeName(name);
const tensors: Tensor[] = [];
tensors[index] = inputs[name];
tensorsMap[nodeName] = tensors;
// Input tensors should be disposed by user.
this.keepTensors(tensors);
Contributor:

Why do we need to call keepTensors for the inputs, compared with the original code?

Contributor Author:

I add some comments above this line:

        // For some models, such as bodypix, the input tensors are disposed
        // in the model's top-level tidy. In dump mode these tensors are
        // required, so call keep to ensure they are preserved. However, this
        // comes with a side effect in dump mode: a tensor leak.

Contributor:

Use this.keepTensors(tensors, this.tensorsPendingDisposal); to avoid tensor leak?

if (this.keepIntermediateTensors)
{
  this.keepTensors(tensors, this.tensorsPendingDisposal);
}

Contributor Author:

This leak only happens in dump mode. And 'tensors' includes tensors that will be used in later inference, especially in e2e, which runs inference on two different backends.

Contributor:

Anyway, I think we need to fix it. We need to find which place releases it, debug why it's released and then reused again in debug mode, or give a reason why the tensor leak is unavoidable.

if (this.dumpMode === DumpMode.Sync) {
this.tensorsMap = tensorsMap;
} else {
this.tensorsMap = null;
Contributor:

Is it safe to say this.tensorsMap = tensorsMap, since L280 makes sure that async won't go down this path?

Contributor Author:

Yes. This 'if' helps signal that it is dump-specific. In non-dump mode, this.tensorsMap will never be used.

this.tensorsPendingDisposal = null;
}
if (this.dumpMode === DumpMode.Async) {
this.disposeTensorsMap();
Contributor:

Why do we need to call this.disposeTensorsMap() only in async mode?

Contributor Author:

This is intended behaviour:
In sync mode, intermediate tensors are kept in tensorsPendingDisposal.
In async mode, intermediate tensors = tensorsMap - tensorsToKeep.

This is why we have two different dispose paths.
Please note that in the updated version I tried to align these two paths, which means that, in both modes, all intermediate tensors are kept in tensorsPendingDisposal.

@axinging axinging force-pushed the dump_sync branch 2 times, most recently from 0128d75 to a6949d3 Compare October 27, 2022 07:16

private tensorsMap: NamedTensorsMap;
private keepTensorForDebug = false;
private keepTensorsForDump = false;
Contributor:

nit: For me, keepIntermediateTensors is a better name, since the value is set by KEEP_INTERMEDIATE_TENSORS and its purpose is not only dump.

Contributor Author:

done

tfjs-converter/src/executor/graph_executor.ts (outdated comment, resolved)
@qjia7 (Contributor) left a comment:

Thanks Xing.
The code looks good on my side, but the tensor leak issue still needs more investigation to figure out the reason.
Will add more reviewers. Thanks.


@pyu10055 (Collaborator) left a comment:

Reviewable status: 0 of 1 approvals obtained (waiting on @axinging, @jinjingforever, and @qjia7)


tfjs-converter/src/executor/graph_executor.ts line 193 at r6 (raw file):

  }

  private keepTensors(

This assumes there will be no duplication of tensors in the tensorsToKeep array; it might be better to change the tensorsPendingDisposal array to a Set to enable auto-dedupe.


tfjs-converter/src/executor/graph_executor.ts line 268 at r6 (raw file):

Previously, qjia7 (Jiajia Qin) wrote…

Anyway, I think we need to fix it. We need to find which place releases it and debug why it's released and then reused again in debug mode. Or give reason that why tensor leak is unavoidable.

I am against keeping the input tensors as intermediate tensors of the graph model, since they are not generated inside the model. This is beyond the tfjs model dump; it is part of the model API pipeline dump.


tfjs-converter/src/executor/graph_executor.ts line 308 at r6 (raw file):

                    this.intermediateTensors[nodeName][index] = tensor;
                  } else {
                    this.intermediateTensors[nodeName] = [];

The intermediateTensors map contains the names of the tensors; how does the current approach track the names?


axinging commented Oct 28, 2022

Reviewable status: 0 of 1 approvals obtained (waiting on @axinging, @jinjingforever, and @qjia7)

tfjs-converter/src/executor/graph_executor.ts line 193 at r6 (raw file):

  }

  private keepTensors(

this assumes there will be no duplication of tensors from the tensorsToKeep array, might be better to change tensorsPendingDisposal array to a set to enable auto-dedupe.

=> Done

tfjs-converter/src/executor/graph_executor.ts line 268 at r6 (raw file):

Previously, qjia7 (Jiajia Qin) wrote…
I am against keeping the input tensors as intermediate tensor of graph model, since they are not generated inside the model. This is beyond the tfjs model dump, it is part of the model API pipeline dump.

=> Done. With this change, for bodypix, we cannot get all the intermediate tensors because the inputs are disposed.

tfjs-converter/src/executor/graph_executor.ts line 308 at r6 (raw file):

                    this.intermediateTensors[nodeName][index] = tensor;
                  } else {
                    this.intermediateTensors[nodeName] = [];

the intermediateTensors contains the name for the tensors, how is the current approach tracking the name?

=> The original intermediateTensors is a misnomer; it should be tensorsPendingDisposal. The API getIntermediateTensors returns this.tensorsMap, so the names in the original intermediateTensors are of no use.

@pyu10055, updated, PTAL.

@axinging (Contributor Author):

@pyu10055 @jinjingforever
I will use bodypix to explain the background of whether or not to keep input tensors.

// https://github.com/tensorflow/tfjs-models/blob/master/body-pix/src/base_model.ts#L74
return tf.tidy(() => {
  const asFloat = this.preprocessInput(tf.cast(input, 'float32'));
  const asBatch = tf.expandDims(asFloat, 0);
  const results = this.model.predict(asBatch) as tf.Tensor4D[];

From the above code, we can see that asBatch will be disposed after tidy returns.
However, in the later dump, the input (asBatch) is required when running predictOp.
We use the input (asBatch) together with other inputs to construct a tensorMap and run predictOp.
So if asBatch has been disposed, predictOp will fail at this op.

We have two options:

  1. Do not keep input tensors: some models, such as bodypix, will be un-dumpable.
  2. Keep input tensors: bodypix will have tensor leaks when dump is enabled.

Our goal is to support dumping on more models, and this tensor leak happens only when dump is on, so I personally prefer keeping these input tensors. WDYT?

@pyu10055 (Collaborator) left a comment:

Thanks @axinging for explaining. If I understand correctly, you are using the dump mode of the graph model to do a tensor audit for a tfjs-models API that does not share a specific model file?
That sounds quite different from the intermediate tensor dumping use case. It is still not clear, in this case, why the inputs cannot be disposed.

Reviewable status: 0 of 1 approvals obtained (waiting on @jinjingforever and @qjia7)


axinging commented Nov 1, 2022

Thanks @axinging for explaining. If I understand correctly, you are using the dump mode of the graph model to do a tensor audit for a tfjs-models API that does not share a specific model file? That sounds quite different from the intermediate tensor dumping use case; it is still not clear, in this case, why the inputs cannot be disposed.

Reviewable status: 0 of 1 approvals obtained (waiting on @jinjingforever and @qjia7)

About "why can the inputs not be disposed?":
I think we have two considerations regarding this question:

  1. graphModel.getIntermediateTensors is expected to get all tensors: input tensors, weights, intermediate output tensors, and outputs. If the inputs are disposed, graphModel.getIntermediateTensors is not fully functional.
  2. As we know, the previous debug mode has a problem: errors may accumulate, resulting in fake errors. For example, in the subgraph {opA => opB => opC}, if opA is wrong, both opB and opC will be wrong.
    To get rid of fake errors, the new dump works in two steps:
    First, dump tensors into files according to dumpLevel. This is similar to a normal predict, but all the intermediate tensors (input tensors, weights, intermediate output tensors, outputs) are preserved. Code: https://github.com/tensorflow/tfjs/blob/master/e2e/benchmarks/local-benchmark/index.html#L239
    Second, when tensor diffs are spotted, apply the following to each op related to a diffing tensor:
    use the reference as input and run the op again (via predictOp) under the actual backend, then dump all the results into files.
    Code: https://github.com/tensorflow/tfjs/blob/master/e2e/benchmarks/local-benchmark/dump.js#L204
    The predictOp in the second step ensures all fake errors are removed. It also requires all the input tensors; otherwise predictOp will fail, which means the model dump is incomplete.

Conclusion: both graphModel.getIntermediateTensors and fake-error removal require all input tensors.
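The fake-error-removal idea above can be sketched in miniature: re-running each op with *reference* inputs means an error in opA cannot cascade into opB and opC. The Op type, findRealErrors helper, and tolerance below are hypothetical simplifications of the actual e2e dump.js logic:

```typescript
// Each op names its input tensors; `run` is its computation.
type Op = { name: string; inputs: string[]; run: (ins: number[][]) => number[] };

// Compare each op's actual-backend output, computed from REFERENCE inputs,
// against the reference output; only genuinely wrong ops are flagged.
function findRealErrors(
    ops: Op[], reference: Map<string, number[]>,
    actualRun: (op: Op, ins: number[][]) => number[], tol = 1e-3): string[] {
  const bad: string[] = [];
  for (const op of ops) {
    const refInputs = op.inputs.map(name => reference.get(name)!);
    const expected = reference.get(op.name)!;
    const actual = actualRun(op, refInputs);  // actual backend, reference inputs
    const maxDiff = Math.max(...actual.map((v, i) => Math.abs(v - expected[i])));
    if (maxDiff > tol) bad.push(op.name);
  }
  return bad;
}
```

With this scheme, if only opA is miscomputed on the actual backend, opB still receives the correct reference input and passes, so the diff report contains opA alone rather than the whole downstream chain.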

Input tensors are used differently in e2e

In the current e2e, there are two kinds of input tensor usage scenarios; examples are MobileNetV3 and bodypix.

  1. MobileNetV3
    predictFunc: () => {
      const input = tf.randomNormal([1, 224, 224, 3]);
      return predictFunction(input);
    },

In MobileNetV3, the input is never disposed on the user side (this is a tensor leak). So the dump of MobileNetV3 is fully functional, and keeping this tensor in the graph model is meaningless for dump.

  2. bodypix
// https://github.com/tensorflow/tfjs-models/blob/master/body-pix/src/base_model.ts#L74
return tf.tidy(() => {
  const asFloat = this.preprocessInput(tf.cast(input, 'float32'));
  const asBatch = tf.expandDims(asFloat, 0);
  const results = this.model.predict(asBatch) as tf.Tensor4D[];

As mentioned before, the input tensor asBatch will be disposed in tidy. If we do not keep it (call keep) in the graph model, graphModel.getIntermediateTensors cannot get all the required tensors, and fake-error removal is incomplete. (This means the dump of bodypix is incomplete.)

Based on the above MobileNetV3 and bodypix cases, to have better dump support, we'd better keep the input tensors.

@pyu10055 @jinjingforever

@pyu10055 (Collaborator) left a comment:

@axinging Thanks, this makes a lot more sense now. One nitpick is the name: "intermediate tensors" is not very intuitive; it would probably be better called 'inferenceTensorAudit', which could contain inputs + intermediateTensors + outputs.
I think for both MobileNetV3 and bodypix, the input tensor should be disposed once the benchmark is completed.
So, given that the dump has tracked all tensors, when the caller requests disposal of those tensors, it should be able to remove them all.
Why would there be a memory leak?

Reviewable status: 0 of 1 approvals obtained (waiting on @jinjingforever and @qjia7)


axinging commented Nov 1, 2022

@axinging Thanks, this makes a lot more sense now, thanks. One nitpick is the name - intermediate tensors is not very intuitive, probably better be called 'inferenceTensorAudit', which could contains inputs + intermediateTensors+outputs. I think for both MobileNetV3 and bodypix, the input tensor should be disposed as the benchmark is completed. So, given that the dump has tracked for all tensors, when caller request dispose those tensors, it should be able to remove them all? Why there will be memory leak?

Reviewable status: 0 of 1 approvals obtained (waiting on @jinjingforever and @qjia7)

About "why would there be a memory leak?":

We can understand this through two models: e2e/MobileNetV3 and bodypix.

e2e/MobileNetV3: the tensor leak can be fixed.

    predictFunc: () => {
      const input = tf.randomNormal([1, 224, 224, 3]);
      return predictFunction(input);
    },

In the above code, 'input' is never disposed; this is a tensor leak (it existed long before and is unrelated to the dump feature). I think this is a potential bug, and it can be fixed by calling input.dispose after predict is done (I will try to work out a fix for this later).

bodypix: either no tensor leak, or un-dumpable.
The original bodypix has no tensor leak, but dump will fail because asBatch is disposed in tidy.

// https://github.com/tensorflow/tfjs-models/blob/master/body-pix/src/base_model.ts#L74
return tf.tidy(() => {
  const asFloat = this.preprocessInput(tf.cast(input, 'float32'));
  const asBatch = tf.expandDims(asFloat, 0);
  const results = this.model.predict(asBatch) as tf.Tensor4D[];

To support bodypix dump, or to avoid the tensor leak in dump mode, we have three options at the line below:
https://github.com/tensorflow/tfjs/pull/6953/files#diff-44e0a825cd7c6f31c03d9333db5dc21a8937d66a5b28d719c699863eef96ad8dR258

Option 1: do not keep input tensors.
Pros: no tensor leak in bodypix.
Cons: bodypix cannot be dumped.

      Object.keys(inputs).forEach(name => {
        const [nodeName, index] = parseNodeName(name);
        const tensors: Tensor[] = [];
        tensors[index] = inputs[name];
        tensorsMap[nodeName] = tensors;
      });

Option 2: keep input tensors but do not add them to this.tensorsPendingDisposal.
Pros: bodypix can be dumped.
Cons: the user needs to dispose these input tensors (requires changes in tfjs-models/bodypix); otherwise they leak.

      Object.keys(inputs).forEach(name => {
        const [nodeName, index] = parseNodeName(name);
        const tensors: Tensor[] = [];
        tensors[index] = inputs[name];
        tensorsMap[nodeName] = tensors;
        this.keepTensors(tensors);
      });

Option 3: keep input tensors and add them to this.tensorsPendingDisposal.
Pros: bodypix can be dumped.
Cons: the user needs to clone the input in e2e/MobileNetV3 (and other models that call predictFunction in e2e) so that it can be used for the second inference.

      Object.keys(inputs).forEach(name => {
        const [nodeName, index] = parseNodeName(name);
        const tensors: Tensor[] = [];
        tensors[index] = inputs[name];
        tensorsMap[nodeName] = tensors;
        this.keepTensors(tensors, nodeName, this.tensorsPendingDisposal);
      });

We need to consider e2e/MobileNetV3 together with e2e/bodypix here, because they use their inputs differently.
In e2e conformance, all models run two inferences. For e2e/MobileNetV3 the input tensors are shared by the two inferences; for e2e/bodypix they are not.
This means that for e2e/MobileNetV3 the input tensors cannot be disposed after the first inference (Option 3 would dispose them), while for bodypix it is OK to dispose the inputs. The fix is to clone the input tensors for e2e/MobileNetV3.

The basic workflow of current e2e conformance test:

// First inference
predictAndGetData(ExpectedBackend);
if(enableDump) {
 graphModel.getIntermediateTensors();
 graphModel.disposeIntermediateTensors();
}

// Second inference
// Input is shared between two backends. But it will be disposed at first disposeIntermediateTensors if "Input is added to the this.tensorsPendingDisposal".
predictAndGetData(ActualBackend);
if(enableDump) {
 graphModel.getIntermediateTensors();
 graphModel.disposeIntermediateTensors();
}

About the name inferenceTensorAudit

I prefer a name like GraphModel.getNamedTensorsMap, because getIntermediateTensors returns a NamedTensorsMap which contains all tensors:

  getIntermediateTensors(): NamedTensorsMap {
    return this.tensorsMap;
  }

BTW, if the name changes from getIntermediateTensors to getNamedTensorsMap, it no longer matches the flag KEEP_INTERMEDIATE_TENSORS by name. But logically, getNamedTensorsMap does return a NamedTensorsMap, and KEEP_INTERMEDIATE_TENSORS does keep all "intermediate tensors".

So to summarize my proposal:

  1. Name: GraphModel.getIntermediateTensors => GraphModel.getNamedTensorsMap.
  2. Keep-tensor policy: prefer "Option 2: keep input tensors but do not add them to this.tensorsPendingDisposal".
    https://github.com/tensorflow/tfjs/pull/6953/files#diff-44e0a825cd7c6f31c03d9333db5dc21a8937d66a5b28d719c699863eef96ad8dR258
    (Tensor leak in dump mode, but enables dump on bodypix.)
  3. Fix the e2e/MobileNetV3 tensor leak in a follow-up PR, and try "Option 3: keep input tensors and add them to this.tensorsPendingDisposal" in a follow-up PR.

WDYT?
@pyu10055 @jinjingforever

@pyu10055 (Collaborator) left a comment (edited):
I had an offline discussion with @mattsoulanille.
We believe it would be better to use tensor.clone() to keep the input and intermediate tensors:
clone() increases the ref count while not forcefully preventing the tensor from being disposed.
This means you need to store the names and cloned tensors instead of just the names.

@mattsoulanille please feel free to chime in.

@axinging

Reviewable status: 0 of 1 approvals obtained (waiting on @jinjingforever and @qjia7)

@mattsoulanille (Member) commented Nov 1, 2022:

To add to Ping's discussion, I think we can use tf.clone and references to solve this problem without needing special logic to determine when we can dispose each tensor.

Instead of using keep to determine when to dispose each tensor, we can tf.clone it to get another reference. To do this, instead of saving the tensorsMap by setting this.tensorsMap = tensorsMap;, we create a copy of it, where we clone each tensor in the map.

this.tensorsMap = Object.fromEntries(
    Object.entries(tensorsMap).map(([name, tensorsList]) => {
      const clonedList = tensorsList.map(tensor => {
        const cloned = tensor.clone();
        // This clone needs to be 'kept' because model.execute may be called
        // within a tidy(). We don't want tidy() to dispose these cloned
        // tensors because we need to look at them after the model has
        // finished executing (after the tidy()).
        //
        // However, we don't check whether the tensor is 'kept' when we
        // free it.
        keep(cloned);
        return cloned;
      });
      return [name, clonedList];
    }));

When a user calls getIntermediateTensors, they get this map of cloned tensors, which are guaranteed to not be disposed since nothing else interacts with this map (if tensorsMap is used elsewhere, we can just create an extra variable clonedTensorsMap and use it instead).

When we need to clean up the intermediate tensors, we no longer need to iterate tensorsPendingDisposal to make sure we only dispose tensors that we created. We can just dispose all the cloned tensors in tensorsMap, since they're just clones / extra references to the original tensors:

for (const tensorsList of Object.values(this.tensorsMap)) {
  for (const tensor of tensorsList) {
    // This is a clone of the real tensor (i.e. another reference), so it's
    // okay to dispose it. We're not disposing the user's input tensor here,
    // just the clone.
    tensor.dispose();
  }
}

If the user calls model.execute again before calling disposeIntermediateTensors, we just call it for them before we clear tensorsMap and run the model again.

Works when a tensor is reused

Suppose the user reuses an input tensor like this:

const input = tf.randomNormal([1, 224, 224, 3]);

function runModel() {
  return model.execute(input);
}

We can count the number of references to the underlying data of the input tensor as this runs.

  • When input is declared, there is 1 reference to the data. RC = 1
  • When we run model.execute with dump mode, we get another reference in tensorsData. RC = 2.
  • We can look at this cloned tensor (and the other intermediate tensors) now.
  • When we run model.disposeIntermediateTensors, the cloned input tensor is disposed, so RC = 1 again.
  • When we eventually dispose input, then RC = 0 and the data gets deleted.

Works when a new tensor is created inside a tidy()

Suppose the user creates a new tensor each time like this:

function runModel() {
  tf.tidy(() => {
    const input = tf.randomNormal([1, 224, 224, 3]);
    return model.execute(input);
  });
}

We can count the number of references to the underlying data of the input tensor as this runs.

  • When input is declared, there is 1 reference to the data. RC = 1
  • When we run model.execute with dump mode, we get another reference in tensorsData. RC = 2
  • When we exit the tidy, two things happen:
    • The original input tensor is disposed, so RC = 1 now.
    • The cloned input tensor is not disposed because it was marked as kept, so RC = 1.
  • We can now inspect the intermediate tensors, including the input tensor, because they were all kept through the tidy.
  • When we run model.disposeIntermediateTensors, the cloned input tensor is disposed, so RC = 0 and the data gets deleted.

@axinging axinging force-pushed the dump_sync branch 3 times, most recently from 8894c6f to 926c46e Compare November 2, 2022 07:39

axinging commented Nov 2, 2022

Thanks @pyu10055 @mattsoulanille. The updated change uses keep + clone; PTAL.

@pyu10055 (Collaborator) left a comment:

Thank you @axinging, the PR LGTM with some minor questions.

Reviewable status: 0 of 1 approvals obtained (waiting on @axinging, @jinjingforever, and @qjia7)


tfjs-converter/src/executor/graph_executor.ts line 193 at r6 (raw file):

Previously, axinging (Xu Xing) wrote…

=> Done

just want to confirm that inputs are included in the tensorsMap, thanks


tfjs-converter/src/executor/graph_executor.ts line 560 at r13 (raw file):

        const currentContext = context.currentContext;
        if (util.isPromise(tensors)) {
          if (this.keepIntermediateTensors) {

Why would keepIntermediateTensors not work with a promise?
These promises should be resolved to tensors and put into the tensorMap.

@pyu10055 (Collaborator) left a comment:

Reviewed 1 of 3 files at r1, 1 of 2 files at r14, 1 of 1 files at r15, all commit messages.
Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @jinjingforever and @qjia7)

@qjia7 (Contributor) left a comment:

LGTM with one nit.

@@ -256,6 +276,10 @@ export class GraphExecutor implements FunctionExecutor {
if (this.parent == null) {
context.dispose(tensorsToKeep);
}
if (this.keepIntermediateTensors) {
this.clonedTensorsMap = this.cloneTensorMap(tensorsMap);
}
Contributor:

nit: Should this if happen before L276?

Contributor Author:

done

@qjia7 (Contributor) left a comment:

LGTM, thanks.

@pyu10055 @mattsoulanille @jinjingforever Currently, getIntermediateTensors() returns all cloned tensors in the graph, which includes inputs, weights, intermediate tensors, and outputs. Maybe we need a more meaningful name for it in the future. Just listing it here to get your attention!

@mattsoulanille (Member) left a comment:

I'm going to take a look at this PR in a few minutes, but I'm sending this review now to prevent it from being merged before I can take a look (since it already has two approvals).

tfjs-converter/src/executor/graph_executor.ts (outdated comment, resolved)
Member

@mattsoulanille mattsoulanille left a comment

Thanks for waiting for my review!

tfjs-converter/src/executor/graph_executor.ts (outdated; resolved)
@@ -645,7 +651,7 @@ export class GraphExecutor implements FunctionExecutor {
private mapInputs(inputs: NamedTensorMap) {
const result: NamedTensorMap = {};
for (const inputName in inputs) {
const tensor = this._signature?.inputs?.[inputName];
const tensor = this._signature ?.inputs ?.[inputName];
Member

Suggested change
const tensor = this._signature ?.inputs ?.[inputName];
const tensor = this._signature?.inputs?.[inputName];

@@ -669,7 +675,7 @@ export class GraphExecutor implements FunctionExecutor {

private mapOutputs(outputs: string[]) {
return outputs.map(name => {
const tensor = this._signature?.outputs?.[name];
const tensor = this._signature ?.outputs ?.[name];
Member

Suggested change
const tensor = this._signature ?.outputs ?.[name];
const tensor = this._signature?.outputs?.[name];

Comment on lines 359 to 385
Object.entries(this.clonedTensorsMap).forEach(([, tensorsList]) => {
tensorsList.forEach(tensor => {
if (tensor && !tensor.isDisposed) {
tensor.dispose();
}
});
});
Member

This logic can probably be factored out into another function, since it's used in _executeAsync as well.

Suggested change
Object.entries(this.clonedTensorsMap).forEach(([, tensorsList]) => {
tensorsList.forEach(tensor => {
if (tensor && !tensor.isDisposed) {
tensor.dispose();
}
});
});
// Put this function outside of the class.
function tensorMapForEach(tensorsMap: NamedTensorsMap, f: (tensor: Tensor) => void) {
  for (const tensorsList of Object.values(tensorsMap)) {
    for (const tensor of tensorsList) {
      f(tensor);
    }
  }
}
tensorMapForEach(this.clonedTensorsMap, tensor => {
if (tensor && !tensor.isDisposed) {
tensor.dispose();
}
});

Contributor Author

@axinging axinging Nov 4, 2022

Maybe we can do this in a follow-up change? In fact, I do not see the benefit of this refactor, and the original version is more readable.

Member

I should have made this a Nit suggestion. We can keep the current implementation as-is.
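For reference, the proposed helper can be sketched in a self-contained, runnable form. The Tensor interface below is a minimal stand-in for tf.Tensor, limited to the members the loop touches, and makeTensor is an illustrative factory, not part of tfjs:

```typescript
// Minimal stand-in for tf.Tensor; only the members the helper touches.
interface Tensor {
  isDisposed: boolean;
  dispose(): void;
}

type NamedTensorsMap = {[name: string]: Tensor[]};

// Applies f to every tensor in the map. Lives outside the class, as suggested.
function tensorMapForEach(
    tensorsMap: NamedTensorsMap, f: (tensor: Tensor) => void): void {
  for (const tensorsList of Object.values(tensorsMap)) {
    for (const tensor of tensorsList) {
      f(tensor);
    }
  }
}

// Illustrative factory for the mock tensor above.
function makeTensor(): Tensor {
  return {
    isDisposed: false,
    dispose() {
      this.isDisposed = true;
    },
  };
}
```

With a helper like this, each dispose loop collapses to a single call such as tensorMapForEach(map, t => { if (t && !t.isDisposed) t.dispose(); }).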

Comment on lines 428 to 457
Object.entries(tensorsMap).forEach(([, tensorsList]) => {
tensorsList.forEach(tensor => {
if (tensor && !tensor.kept && !tensor.isDisposed &&
!keepIds.has(tensor.id)) {
tensor.dispose();
}
});
});
Member

I factored iterating over a tensor map into another function.

Suggested change
Object.entries(tensorsMap).forEach(([, tensorsList]) => {
tensorsList.forEach(tensor => {
if (tensor && !tensor.kept && !tensor.isDisposed &&
!keepIds.has(tensor.id)) {
tensor.dispose();
}
});
});
tensorMapForEach(tensorsMap, tensor => {
if (tensor && !tensor.kept && !tensor.isDisposed &&
!keepIds.has(tensor.id)) {
tensor.dispose();
}
});

tfjs-converter/src/executor/graph_executor.ts (resolved)
Comment on lines 275 to 282
if (this.keepIntermediateTensors) {
this.clonedTensorsMap = this.cloneTensorMap(tensorsMap);
}
Member

We should dispose all the original intermediate tensors in tensorsMap now that we've cloned them. Otherwise, we're leaking them.

Does it make more sense to clone each tensor in the above for loop? That way, we can immediately dispose it if necessary (in this.checkTensorForDisposal()). That also simplifies this.checkTensorForDisposal, which would no longer need to check this.keepIntermediateTensors, because the tensor is already cloned.

Contributor Author

@axinging axinging Nov 4, 2022

This is possible. But 1) this.cloneTensorMap cannot be reused; we would need some new logic to do this.
2) Your tensorMapForEach refactor mentioned above conflicts with this change.

Contributor

I think Matt is right. For the original intermediate tensors, we should clone each of them when we get them in L274. Otherwise, we are leaking them due to the changes (L330) you added in this.checkTensorForDisposal. Since you already use cloned tensors, theoretically this.checkTensorForDisposal should be unchanged, and the code logic you mentioned can be adjusted based on our requirements.
Similar for the async execute.
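The clone-when-produced pattern being discussed can be sketched in isolation. MockTensor and recordNodeOutputs below are illustrative stand-ins, not the real tf.Tensor or executor code: each node's outputs are cloned into the cloned map as soon as they are produced, so the originals can be disposed without losing the dump copies.

```typescript
class MockTensor {
  isDisposed = false;
  constructor(readonly id: number) {}
  clone(): MockTensor {
    return new MockTensor(this.id + 1000);  // the clone gets its own id
  }
  dispose(): void {
    this.isDisposed = true;
  }
}

type TensorsMap = {[nodeName: string]: MockTensor[]};

// Illustrative stand-in for one step of the execute loop: record the node's
// outputs and, when keeping intermediate tensors, clone them immediately.
function recordNodeOutputs(
    nodeName: string, outputs: MockTensor[], tensorsMap: TensorsMap,
    clonedTensorsMap: TensorsMap|null): void {
  tensorsMap[nodeName] = outputs;
  if (clonedTensorsMap != null) {  // i.e. keepIntermediateTensors is on
    clonedTensorsMap[nodeName] = outputs.map(t => t.clone());
  }
  // A disposal pass may now dispose entries of tensorsMap freely;
  // clonedTensorsMap keeps its own copies.
}
```

The point of cloning at creation time is that the disposal logic no longer needs any special case for the dump mode.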

tfjs-converter/src/executor/graph_executor.ts (resolved)
}

Object.entries(tensorsMap).forEach(([, tensorsList]) => {
tensorsList.forEach(tensor => {
if (tensor && !tensor.kept && !tensor.isDisposed &&
Member

Should this check tensor.kept? I think that's mostly reserved for tidy.

Contributor Author

@axinging axinging Nov 4, 2022

In fact, I am not sure why kept is checked here; it comes from the original version: https://github.com/tensorflow/tfjs/pull/5659/files#diff-44e0a825cd7c6f31c03d9333db5dc21a8937d66a5b28d719c699863eef96ad8dL369.

I kept it because I am not sure about it.

Member

My guess is this was from the previous logic that used kept to save the intermediate tensors, although I could be wrong.

It looks like the keepIds set is what prevents this from disposing the input, output, and weight tensors, so we're not using tensor.kept to prevent them from being disposed. If all the other tensors in tensorsMap are intermediate tensors, then I think it's safe to remove this check.

Contributor Author

removed
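The disposal filter after dropping the kept check can be sketched as follows. These are mock types, and disposeIntermediates is an illustrative name; keepIds is assumed to hold the ids of inputs, outputs, and weights, so everything else in the map is an intermediate:

```typescript
interface Tensor {
  id: number;
  isDisposed: boolean;
  dispose(): void;
}

type NamedTensorsMap = {[name: string]: Tensor[]};

// Disposes every tensor not protected by keepIds; no tensor.kept check needed.
function disposeIntermediates(
    tensorsMap: NamedTensorsMap, keepIds: Set<number>): void {
  for (const tensorsList of Object.values(tensorsMap)) {
    for (const tensor of tensorsList) {
      if (tensor && !tensor.isDisposed && !keepIds.has(tensor.id)) {
        tensor.dispose();
      }
    }
  }
}

// Illustrative factory for the mock tensor above.
function makeTensor(id: number): Tensor {
  return {
    id,
    isDisposed: false,
    dispose() {
      this.isDisposed = true;
    },
  };
}
```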

@axinging axinging force-pushed the dump_sync branch 2 times, most recently from 6cb4acd to 1c7fedf on November 4, 2022 00:40

Contributor Author

axinging commented Nov 4, 2022

Thanks @mattsoulanille for your great feedback.
Here are my responses to your comments:
1) "if the user does not call disposeIntermediateTensors."
I merged your changes, but I do not think this is necessary. In dump mode, even with your change, tensor leaks will still happen (in the last run). So I think the only way to ensure there is no tensor leak is to ask the user to call disposeIntermediateTensors.

2) tensorMapForEach:
I have not changed this yet. In my opinion, the original version is easier to read.
Original:

    Object.entries(this.clonedTensorsMap).forEach(([, tensorsList]) => {
      tensorsList.forEach(tensor => {
        if (tensor && !tensor.isDisposed) {
          tensor.dispose();
        }
      });
    });
    Object.entries(tensorsMap).forEach(([, tensorsList]) => {
        tensorsList.forEach(tensor => {
          if (tensor && !tensor.kept && !tensor.isDisposed &&
              !keepIds.has(tensor.id)) {
            tensor.dispose();
          }
        });
      });

Your suggestion:

    // Put this function outside of the class.
    function tensorMapForEach(tensorsMap: NamedTensorsMap, f: (tensor: Tensor) => void) {
      for (const tensorsList of Object.values(tensorsMap)) {
        for (const tensor of tensorsList) {
          f(tensor);
        }
      }
    }
    tensorMapForEach(this.clonedTensorsMap, tensor => {
      if (tensor && !tensor.isDisposed) {
        tensor.dispose();
      }
    });
    tensorMapForEach(tensorsMap, tensor => {
        if (tensor && !tensor.kept && !tensor.isDisposed &&
            !keepIds.has(tensor.id)) {
          tensor.dispose();
        }
      });

3) The return value of "private cloneTensorMap(tensorsMap: NamedTensorsMap) {":
It's interesting that "yarn lint" doesn't complain about this. Do you know why?

Member

@mattsoulanille mattsoulanille left a comment

Thanks for making the changes I requested! Here are a few more, and then I think this will be ready to merge.


Object.keys(inputs).forEach(name => {
const [nodeName, index] = parseNodeName(name);
const tensors: Tensor[] = [];
tensors[index] = inputs[name];
tensorsMap[nodeName] = tensors;
this.cloneTensorList(tensors, nodeName, this.clonedTensorsMap);
Member

Nit: Make this.cloneTensorsList return a cloned list of tensors instead of storing them in a map. I think this is a bit easier to read, and it matches this.cloneTensorsMap. Also, then this.cloneTensorsMap can use this.cloneTensorsList in its implementation.

Suggested change
this.cloneTensorList(tensors, nodeName, this.clonedTensorsMap);
this.clonedTensorsMap[nodeName] = this.cloneTensorList(tensors);

Contributor Author

done
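The returning-a-list shape adopted here can be sketched in isolation (CloneableTensor is a mock standing in for tf.Tensor); cloneTensorMap then builds directly on cloneTensorList:

```typescript
interface CloneableTensor {
  id: number;
  clone(): CloneableTensor;
}

type NamedTensorsMap = {[name: string]: CloneableTensor[]};

// Returns a cloned list instead of mutating a map passed in by the caller.
function cloneTensorList(tensors: CloneableTensor[]): CloneableTensor[] {
  return tensors.map(tensor => tensor.clone());
}

// The map-level clone is just cloneTensorList applied per node name.
function cloneTensorMap(tensorsMap: NamedTensorsMap): NamedTensorsMap {
  const cloned: NamedTensorsMap = {};
  for (const [name, tensors] of Object.entries(tensorsMap)) {
    cloned[name] = cloneTensorList(tensors);
  }
  return cloned;
}
```

Returning the list keeps both functions side-effect-free, which also avoids the optional-parameter question raised for the mutating signature.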

@@ -247,15 +290,18 @@ export class GraphExecutor implements FunctionExecutor {
`Please use model.executeAsync() instead.`);
}
tensorsMap[node.name] = tensors;
this.cloneTensorList(tensors, node.name, this.clonedTensorsMap);
Member

Does this need a check for this.keepIntermediateTensors?

Member

No, it doesn't, because this.clonedTensorsMap is null, but IMO that's a bit confusing. See my other comment on this.


Object.keys(inputs).forEach(name => {
const [nodeName, index] = parseNodeName(name);
const tensors: Tensor[] = [];
tensors[index] = inputs[name];
tensorsMap[nodeName] = tensors;
this.cloneTensorList(tensors, nodeName, this.clonedTensorsMap);
Member

Tensors should only be cloned if this.keepIntermediateTensors is set, right?

Member

Oh, I see, cloneTensorsList checks if this.clonedTensorsMap is null before running. I find that a bit confusing. Can we do the check here instead?

Also, I think it might be better for this.cloneTensorsList to return a list of cloned tensors instead of mutating the tensors map. See my nit.

Comment on lines 379 to 381
Object.values(this.clonedTensorsMap).forEach(tensorsList => {
tensorsList.forEach(tensor => {
if (tensor && !tensor.isDisposed) {
Member

Nit: Google's TS style guide prefers for (... of ...) instead of .forEach(...) (.map(...) is still okay when using the resulting value).

Contributor Author

done

}

Object.entries(tensorsMap).forEach(([, tensorsList]) => {
tensorsList.forEach(tensor => {
if (tensor && !tensor.kept && !tensor.isDisposed &&
Member

My guess is this was from the previous logic that used kept to save the intermediate tensors, although I could be wrong.

It looks like the keepIds set is what prevents this from disposing the input, output, and weight tensors, so we're not using tensor.kept to prevent them from being disposed. If all the other tensors in tensorsMap are intermediate tensors, then I think it's safe to remove this check.

}

private cloneTensorList(
tensors: Tensor[], nodeName: string, tensorsMap: NamedTensorsMap) {
Member

Suggested change
tensors: Tensor[], nodeName: string, tensorsMap: NamedTensorsMap) {
tensors?: Tensor[], nodeName?: string, tensorsMap?: NamedTensorsMap) {

We'd like to enable strictNullChecks in the future, so if you don't accept the nit that changes this function to return a list of tensors instead of mutating tensorsMap, please mark all these arguments as optional.

Contributor Author

return a list of tensors

tfjs-converter/src/executor/graph_executor.ts (resolved)
axinging and others added 14 commits November 7, 2022 08:56
Currently only graph model prediction with executeAsync supports dump.
This has two drawbacks:
1. Models that predict with execute don't support dump.
2. For models that have a wrapping layer over the graph model, such as
   bodypix and pose-detection, a lot of change is required to support
   dump; example change: tensorflow/tfjs-models#841.

This change removes these limitations, so that more models are supported
and dump is easier.
Contributor Author

@axinging axinging left a comment

}

private cloneTensorList(
tensors: Tensor[], nodeName: string, tensorsMap: NamedTensorsMap) {
Contributor Author

return a list of tensors


Object.keys(inputs).forEach(name => {
const [nodeName, index] = parseNodeName(name);
const tensors: Tensor[] = [];
tensors[index] = inputs[name];
tensorsMap[nodeName] = tensors;
this.cloneTensorList(tensors, nodeName, this.clonedTensorsMap);
Contributor Author

done

Comment on lines 379 to 381
Object.values(this.clonedTensorsMap).forEach(tensorsList => {
tensorsList.forEach(tensor => {
if (tensor && !tensor.isDisposed) {
Contributor Author

done

}

Object.entries(tensorsMap).forEach(([, tensorsList]) => {
tensorsList.forEach(tensor => {
if (tensor && !tensor.kept && !tensor.isDisposed &&
Contributor Author

removed

Member

@mattsoulanille mattsoulanille left a comment

LGTM. Thanks!

@mattsoulanille mattsoulanille merged commit e2e29e4 into tensorflow:master Nov 7, 2022