Improve binary read performance #459
Conversation
Hi @kbongort, appreciate your patience on this. I know we spoke offline last week. We have been gathering some baseline metrics on our performance to see all the areas that need to be addressed. I can confirm that this does improve binary read performance by an average of ~2.5x. However, one concern is the change to modify the IBinaryReader interface.
What would you suggest?
Let me talk it over with the team. We know that the fromBinary > readMessage path is a bottleneck, but I'm not sure if there is a good way to reuse the BinaryReader like you're doing without breaking one of the interfaces.

In the meantime, would you mind splitting out the two improvements (BinaryReader vs. readScalar)? I think we can probably get the latter implemented and at least get a quick win.
What type of breakage are you concerned with, exactly? This does add a couple of methods to IBinaryReader. Are you worried that clients may have IBinaryReader implementations that will no longer conform to the interface, or something else?
I've updated the PR to not modify IBinaryReader. Since the new methods are implemented strictly in terms of existing ones, it's possible to refactor them into static utility methods. So that's what I did.
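To make the shape of that refactor concrete, here is a minimal sketch; readDelimited and SimpleReader are hypothetical illustrations of the approach, not the actual protobuf-es API:

```typescript
// The existing interface stays untouched, so third-party implementations
// keep conforming. (Simplified stand-in for IBinaryReader.)
interface IBinaryReader {
  pos: number;
  uint32(): number;              // read a varint (single byte here, for brevity)
  bytes(length: number): Uint8Array;
}

// New behavior lives in a utility function implemented strictly in terms
// of the existing interface surface, instead of a new interface method.
function readDelimited(reader: IBinaryReader): Uint8Array {
  const length = reader.uint32();  // size-delimited: length prefix first
  return reader.bytes(length);     // then that many payload bytes
}

// Minimal concrete reader over a buffer, for demonstration only.
class SimpleReader implements IBinaryReader {
  pos = 0;
  constructor(private buf: Uint8Array) {}
  uint32(): number {
    return this.buf[this.pos++];
  }
  bytes(length: number): Uint8Array {
    const out = this.buf.subarray(this.pos, this.pos + length);
    this.pos += length;
    return out;
  }
}
```

Because readDelimited needs nothing beyond what IBinaryReader already exposes, existing implementations of the interface get the new capability for free.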
@smaye81 PTAL – Let me know if you have any further concerns, thanks!
Hi Steve,
I believe that I've addressed your concerns with this PR. Can you please
take another look?
Thanks,
Ken
…On Tue, May 9, 2023 at 3:13 PM Steve Ayers ***@***.***> wrote:
Let me talk it over with the team. We know that the fromBinary >
readMessage path is a bottleneck, but I'm not sure if there is a good way
to reuse the BinaryReader like you're doing without breaking one of the
interfaces.
In the meantime, would you mind splitting out the two improvements (
BinaryReader vs. readScalar). I think we can probably get the latter
implemented and at least get a quick win.
Hi @kbongort -- yes, apologies for the delay. We have been a bit heads-down on some other work at the moment, but will take a look at this hopefully this week. Thanks again for the PR. cc @timostamm
Hey @kbongort 👋
We added basic performance benchmarks in #491. They confirm what you already measured, just provide a little bit more detail.
The downside of keeping the BinaryReader is that it makes it more difficult to generate speed-optimized code, which is the best lever for performance we have. But looking at the change in isolation, it does speed up the test "large google.protobuf.FileDescriptorSet" by ~35%. This seems worth it.
I pushed up 126ba29 to avoid T.fromBinary() in readMapEntry(), and 77460d3 to move BinaryReaderUtil.readMessage() to a local function. It keeps the performance improvement, and reduces the bundle size increase to a negligible 0.02%.
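The bundle-size benefit of that move can be sketched roughly as follows; BinaryReaderUtil and readLengthPrefix are hypothetical names for illustration, not the code in those commits:

```typescript
// Before (sketch): behavior exposed as a static method on an exported class.
// The `BinaryReaderUtil.readLengthPrefix` property access must survive
// minification, so every call site carries the lookup.
class BinaryReaderUtil {
  static readLengthPrefix(buf: Uint8Array): number {
    return buf[0]; // simplified: treat the first byte as the length prefix
  }
}

// After (sketch): a module-local function. Bundlers are free to rename or
// inline it, which is why the bundle-size cost shrinks to near zero.
function readLengthPrefix(buf: Uint8Array): number {
  return buf[0];
}
```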
The readScalar() change is a 1% increase in bundle size (that's the reason it wasn't a switch statement in the first place). But it's a really decent performance bump of 55% for "large google.protobuf.FileDescriptorSet". This is worth it. Good catch 🙂
Thank you for the contribution, LGTM!
Thanks for the review & further work! 🍾
This PR makes a couple of improvements to binary read performance:

1. Adds a messageField() method to BinaryReader that reads a size-delimited message field without constructing a new BinaryReader (Message.fromBinary is refactored to share code). A knock-on effect of this change is that we pass options along without repeated calls to makeReadOptions, which otherwise needlessly spreads options into readDefaults even when options is already identical to readDefaults.
2. Reads scalar values with an inline switch statement instead of via reader[method](), which seems to incur an implicit call to .bind() on the reader method (i.e. it is equivalent to reader[method].bind(reader)()). Avoiding that indirection uses a faster path.

I put together a simple benchmark (https://github.com/kbongort/protobuf-es/pull/1) using the following message definitions:
The benchmark uses both a more complete version of the User message and one with only a few fields set:
After warming up the cache by reading a serialized userMessage 1000 times, it measures the time to read userMessage, to read a UserList of 1000 userMessages, and to read a UserList of 1000 smallUserMessages. The amount of time it takes to call JSON.parse() on the JSON version of the message is provided as a point of comparison.

Without the changes in this PR, the results are:

After the changes in this PR, the results are:
The three test cases show speedups of 1.3x, 1.7x, and 3.9x, respectively. The effect of (1) is especially pronounced when there are many small messages, as in the last benchmark test.
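The dispatch difference behind improvement (2) can be illustrated with a small sketch; Reader, ScalarType, and the two-method surface are simplified stand-ins, not the protobuf-es generated code:

```typescript
// Simplified stand-in for the scalar types a reader can decode.
enum ScalarType { INT32 = 1, BOOL = 2 }

// Simplified stand-in for a binary reader over pre-decoded values.
class Reader {
  constructor(private values: number[], private i = 0) {}
  int32(): number { return this.values[this.i++]; }
  bool(): boolean { return this.values[this.i++] !== 0; }
}

// Dynamic form: reader[method]() looks the method up by name on every
// call, which (per the PR's observation) behaves like
// reader[method].bind(reader)() -- an extra layer of indirection.
function readDynamic(reader: Reader, method: "int32" | "bool") {
  return reader[method]();
}

// Switch form: direct call sites the engine can optimize, avoiding the
// dynamic property lookup entirely.
function readSwitch(reader: Reader, type: ScalarType) {
  switch (type) {
    case ScalarType.INT32: return reader.int32();
    case ScalarType.BOOL:  return reader.bool();
  }
}
```

Both forms return the same values; the switch trades a small amount of code size for the faster direct-dispatch path, matching the 1% bundle-size / 55% speed trade-off discussed in the review.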