Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make protoName unCamelCase lazy #606

Merged
merged 2 commits into from
Apr 19, 2022
Merged

Make protoName unCamelCase lazy #606

merged 2 commits into from
Apr 19, 2022

Conversation

jiahaog
Copy link
Member

@jiahaog jiahaog commented Apr 11, 2022

In a large 1p Flutter android application, package:protobuf accounts for ~220ms (8%) of the CPU time during startup. This _uncamelCase operation takes up ~25ms, or 11% of this 220ms which occurs whenever a proto is constructed.

Looking through the code, it appears that this protoName field is only needed for conversion to JSON which is less common. As such, make it a lazy getter to avoid this.

Internal link to pivot profile view which highlights this method (top half shows callers, bottom half shows callees)

In a large 1p Flutter android application, package:protobuf accounts for ~220ms (8%) of the CPU time during startup. This `_uncamelCase` operation takes up ~25ms, or 11% of this 220ms which occurs whenever a proto is constructed.

Looking through the code, it appears that this `protoName` field is only needed for conversion to JSON which is less common. As such, make it a lazy getter to avoid this.

Internal link to pivot profile view which highlights this method (top half shows callers, bottom half shows callees):

pprof/user-profile?id=d8d43f484e3e7b4064a29737cbdabf77&tab=flame&flamesearch=unCamel&pivot=Precompiled_____unCamelCase_19070143_4307
@jiahaog jiahaog marked this pull request as ready for review April 11, 2022 03:54
Copy link
Member

@osa1 osa1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @jiahaog.

This _uncamelCase operation takes up ~25ms, or 11% of this 220ms which occurs whenever a proto is constructed.

What are the numbers like with this PR? I'm wondering how much difference this change makes.

LGTM assuming it makes a difference in the numbers.

protobuf/lib/src/protobuf/field_info.dart Outdated Show resolved Hide resolved
@osa1 osa1 requested review from sigurdm and mraleph April 11, 2022 08:53
@jiahaog
Copy link
Member Author

jiahaog commented Apr 11, 2022

I ran the startup benchmark with this change and we see a 17ms or ~0.3% in one of the tracked metrics. I also collected a profile locally and noticed that the uncamelcase no longer appears. It's a small improvement but I'm hoping that we can find more things separately which might add up to have an eventual noticeable impact on latency.

@osa1
Copy link
Member

osa1 commented Apr 11, 2022

I ran the startup benchmark with this change and we see a 17ms or ~0.3% in one of the tracked metrics

Just trying to understand the benchmark here -- in the PR description it says "~220ms (8%)", so the entire startup is 2750ms. If we reduce that 220ms to 17ms the startup time becomes 2547ms, and 17ms of 2547ms is 0.6%, not 0.3%. Am I missing anything?

Not that it matters too much, the result looks great..


It seems like the suggestion I made was made using my gmail account. Could you maybe squash the commits with only you as the author? Or you could use [email protected] as my email. (For the future suggestions I will see if I can make suggestions with my Google account) Nvm, I signed the CLA from my personal account.

@jiahaog
Copy link
Member Author

jiahaog commented Apr 11, 2022

Sorry for being unclear. I'll try and elaborate further:

The original profile for startup I captured locally (aggregated across 3 local runs) showed that package:protobuf took 220ms in total as part of this profile. This is 8% so the entire startup (aggregated) is indeed 2750ms. In this profile, we can see that the uncamelCase takes ~25ms.

I then patched this PR into Google3 and ran a similar benchmark in the device lab which is supposedly more accurate (and is what we use to track the historic performance of the app across different CLs). This showed an improvement of 17ms or ~0.3% against another run of the same benchmark without this change patched for a related metric. I used the device lab instead of doing it locally because the performance of my uncalibrated local device can fluctuate quite a bit and the device lab provides much more precision. Hope this clears things up, also happy to share links to the internal benchmark runs internally 🙂

@osa1
Copy link
Member

osa1 commented Apr 11, 2022

Ah, OK, so this PR improves runtime 0.3% in a benchmark. Thanks for the clarification. I will document this in the commit message so that we will know why it was made lazy.

I'd like to wait for @sigurdm or @mraleph's approval, but I think they're not available until next week so this may have to wait a litte bit.

@jiahaog
Copy link
Member Author

jiahaog commented Apr 11, 2022

Sure, thanks!

Copy link
Member

@mraleph mraleph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine. This field has very limited usage (I did internal Code Search) and I am somewhat surprised that it even exists in the first place.

@mraleph
Copy link
Member

mraleph commented Apr 19, 2022

@jiahaog feel free to create CL internally with the same change, package is currently out of sync with internal copy so you have to do backport manually.

@osa1 osa1 merged commit a0021c7 into google:master Apr 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants