-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new tensor diposal algorithm for sync graph execution #7392
Add new tensor diposal algorithm for sync graph execution #7392
Conversation
private checkTensorForDisposalWithNodeLiveUntilInfo( | ||
node: Node, tensorMap: NamedTensorsMap, tensorsToKeep: Set<number>, | ||
outputNodeNameSet: Set<string>, liveUntilNodes?: Node[]) { | ||
// Skip output nodes and any control flow nodes, since its dependency is | ||
// tricky to track correctly. | ||
if (isControlFlow(node) || outputNodeNameSet.has(node.name)) { | ||
return; | ||
} | ||
if (liveUntilNodes == null) { | ||
return; | ||
} | ||
|
||
for (const node of liveUntilNodes) { | ||
for (const tensor of tensorMap[node.name]) { | ||
if (!tensor || tensor.kept || tensorsToKeep.has(tensor.id)) { | ||
continue; | ||
} | ||
tensor.dispose(); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is only called in one place, and it's essentially checking some properties of node
and then deleting everything in liveUntilNodes
. While liveUntilNodes
comes from a map of nodes with a specific property about when and why its nodes may be disposed, this function's implementation doesn't use that property (since it's all done in the setup of the nodeLiveUntilMap
). Maybe we can rename this function to be more generic, like disposeNodes
and only pass in a list of nodes to dispose. We could also factor out this snippet from here and the original checkNodesForDisposal
.
if (isControlFlow(node) || outputNodeNameSet.has(node.name)) {
return;
}
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I intentionally make this function keep similar signature and name as the original checkNodesForDisposal
, to maintain the parity of tensor disposing functions in sync and async graph executor. I want to make it easier for future maintainer to understand that they are the same, at least from the aspect of intentions and behaviors.
I think a better rewrite may be renaming both of them to something like "function disposeNodes...(...)", since the original function not only "checks" the tensors but also disposes tensors in it. What do you think?
For factoring out some stuff, I don't see huge gains on making a separate function for that single if check. I'm thinking to rewrite the async disposing algorithm to utilize the node dependency and reduce loops over tensor edges, but I haven't figured out a better algo (and I need to learn more about async control flow to make sure the new algo won't break it). Maybe we can have a better picture of which parts can be shared then.
export function parseNodeName(name: string): [string, number, string] { | ||
export function parseNodeName( | ||
name: string, context?: ExecutionContext): [string, number, string?] { | ||
if (name === '') { | ||
return ['', 0, undefined]; | ||
} | ||
|
||
const isCacheEnabled = context != null && context.parseNodeNameCache != null; | ||
if (isCacheEnabled) { | ||
const cachedResult = context.parseNodeNameCache.get(name); | ||
if (cachedResult != null) { | ||
return cachedResult; | ||
} | ||
} | ||
const parts = name.split(':'); | ||
let result: [string, number, string?]; | ||
if (parts.length === 1) { | ||
return [name, 0, undefined]; | ||
result = [name, 0, undefined]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is from #7391, so that PR should be merged before this one.
Co-authored-by: Matthew Soulanille <[email protected]>
Co-authored-by: Matthew Soulanille <[email protected]>
Co-authored-by: Matthew Soulanille <[email protected]>
Co-authored-by: Matthew Soulanille <[email protected]>
Co-authored-by: Matthew Soulanille <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
See code comments for more details.
It only works for sync execution since the new algorithm depends on a fixed node execution order. The time complexity decreases from
O(NumEdges*NumTensors)
or more preciseO(NumTensorEdges)
toO(NumTensors)
.The time saved is around 5% for MobileNetV3 (again, tested on my laptop and may be imprecise).
To see the logs from the Cloud Build CI, please join either our discussion or announcement mailing list.
This change is