
Add cache for parseNodeName #7391

Merged

chunnienc merged 5 commits from graph-exec-perf into tensorflow:master on Feb 22, 2023
Conversation

chunnienc
Collaborator

@chunnienc chunnienc commented Feb 17, 2023

This PR adds a cache for parseNodeName, an expensive string manipulation function in the graph executor. The time saved is around 2% to 3% for MobileNetV3 (tested on my laptop, so the numbers may be imprecise).
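For readers unfamiliar with the function, here is a minimal sketch of the idea (not the actual tfjs-converter code; the signature and cache wiring are illustrative): parseNodeName turns a tensor reference such as "MobilenetV3/Conv/BiasAdd:0" into a node name plus output index, and a Map keyed by the raw string lets repeated lookups skip the re-parsing.

```ts
// Illustrative sketch only, not the code in this PR.
// Uncached version: split "nodeName:outputIndex" into its parts.
function parseNodeNameUncached(name: string): [string, number] {
  const i = name.lastIndexOf(':');
  if (i === -1) {
    return [name, 0];  // No output index given; default to output 0.
  }
  return [name.substring(0, i), Number(name.substring(i + 1))];
}

// Cached version: the caller supplies a Map, so repeated references to the
// same tensor name hit the cache instead of re-running the string work.
function parseNodeName(
    name: string,
    cache?: Map<string, [string, number]>): [string, number] {
  const hit = cache?.get(name);
  if (hit != null) {
    return hit;
  }
  const parsed = parseNodeNameUncached(name);
  cache?.set(name, parsed);
  return parsed;
}
```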




@chunnienc chunnienc marked this pull request as ready for review February 17, 2023 18:15
Member

@mattsoulanille mattsoulanille left a comment

LGTM. I was going to recommend using a closure to memoize parseNodeName instead of adding it to the context, but it looks like you need to pass this context to other functions like getParamValue.
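As a sketch of the alternative mentioned here (purely hypothetical, with made-up names): a closure-based memoizer would keep the cache private to parseNodeName, which is exactly why it could not be shared with getParamValue or tied to a particular graph's lifetime.

```ts
// Hypothetical closure-based memoization, for comparison with the
// context-based cache in this PR. The cache lives for as long as the
// memoized function does, independent of any graph.
function memoize<R>(fn: (name: string) => R): (name: string) => R {
  const cache = new Map<string, R>();
  return (name: string) => {
    let value = cache.get(name);
    if (value === undefined) {
      value = fn(name);
      cache.set(name, value);
    }
    return value;
  };
}

// Example usage against the uncached parser sketched above:
// const parseNodeNameMemoized = memoize(parseNodeNameUncached);
```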

@chunnienc
Collaborator Author

chunnienc commented Feb 17, 2023

Yeah. I think the context is the right place since it covers calls from parseNodeName itself as well as from getTensor and getParamValue, and it ties the lifecycle of the cached strings to the graph, so these string keys can be GC'ed together with the graph.
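A sketch of what "the cache lives on the context" could look like (class and field names are illustrative, not the real ExecutionContext API): because the Map is owned by the per-graph context, getTensor and getParamValue share the same cache, and the cached strings become unreachable once the graph and its context are dropped.

```ts
// Illustrative only: a per-graph context that owns the parse cache.
class ExecutionContext {
  // Dropped together with the context, so cached keys are GC'ed with the graph.
  readonly nodeNameCache = new Map<string, [string, number]>();

  parseNodeName(name: string): [string, number] {
    let parsed = this.nodeNameCache.get(name);
    if (parsed == null) {
      const i = name.lastIndexOf(':');
      parsed = i === -1 ?
          [name, 0] :
          [name.substring(0, i), Number(name.substring(i + 1))];
      this.nodeNameCache.set(name, parsed);
    }
    return parsed;
  }
}

// Helpers like this one reuse the same context-owned cache.
function getTensor(name: string, tensors: Map<string, unknown[]>,
                   context: ExecutionContext): unknown {
  const [nodeName, index] = context.parseNodeName(name);
  return tensors.get(nodeName)?.[index];
}
```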

@Linchenn
Collaborator

Thanks! I think this approach would benefit small models a lot, since the cache lookup avoids repeated string operations.

Have you benchmarked some large models? For large models the number of nodes can be significant, so I am not sure whether, when the cache is very large, the cache lookup is still faster than the string operations.

@chunnienc
Collaborator Author

> Thanks! I think this approach would benefit small models a lot, since the cache lookup avoids repeated string operations.
>
> Have you benchmarked some large models? For large models the number of nodes can be significant, so I am not sure whether, when the cache is very large, the cache lookup is still faster than the string operations.

(Discussed offline) Yes, it's always faster.
Benchmark: https://jsbench.me/8eleeumdpi/1
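In the same spirit as the linked jsbench (the names and sizes below are made up, not the actual benchmark), the comparison is essentially a pre-populated Map lookup versus re-parsing the string every time:

```ts
// Rough sketch of the comparison; runs in any modern JS/TS environment.
const names = Array.from({length: 10000}, (_, i) => `node_${i}/output:0`);

const parse = (n: string): [string, number] => {
  const i = n.lastIndexOf(':');
  return i === -1 ? [n, 0] : [n.substring(0, i), Number(n.substring(i + 1))];
};

// Pre-populated cache, as the graph executor would have after a warm-up run.
const cache = new Map(names.map(n => [n, parse(n)] as const));

console.time('cache lookup');
for (const n of names) cache.get(n);
console.timeEnd('cache lookup');

console.time('string parse');
for (const n of names) parse(n);
console.timeEnd('string parse');
```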

Collaborator

@Linchenn Linchenn left a comment


LGTM, thanks!

@chunnienc chunnienc merged commit 7dfcc76 into tensorflow:master Feb 22, 2023
@chunnienc chunnienc deleted the graph-exec-perf branch February 22, 2023 06:07