Reducing allocations when (de)serializing frames/commands. #732
Conversation
Updated (see above), now using memory buffers when serializing frames as well. Huge reductions in allocations again :)
@stebet thank you. We'd appreciate some guidance on how much of #734 can be folded into this PR. If most of it (an ideal scenario IMO), we should also make sure to give @Anarh2404 full credit for their contribution :)
Not a lot can be merged into this PR, as it doesn't include the pipelines improvements from #706. I mentioned to @lukebakken the other day that I'd like to split the allocation reductions off from the pipelines work, to keep the PRs saner to review and not mix concepts too much. The work from @Anarh2404 definitely applies in general, and it would be awesome to get it in as well, but I think it applies a bit more to the pipelines PR. The async improvements (ValueTask etc.) involve a lot of API changes that I think are better suited for 7.0, as they are a bit bigger, but I'd love to see what from there could apply here.
The biggest remaining source of allocations can be removed if the following behavior change can take place: when commands/messages are handled, make it an explicit and documented rule that a client wanting to hang on to the bytes of a consumed message MUST copy them within the scope in which they are received/handled, since they'll be disposed after all handlers have run. If we dispose them, we can fetch them from array pools like the (de)serialized versions, and sending and receiving messages then runs almost entirely on recycled buffers, making allocations predictable and nearly constant. This would make the Command.Body property a ReadOnlyMemory<byte>.
What do you say @lukebakken, @michaelklishin and @bording? Is that an acceptable API change to make for 6.0?
To give you an idea what this means, this is what it looks like with the same program as above (sending and receiving 50k messages, 512-byte body size): Even if I bump the body size up to 16384 bytes, this is what it looks like: So memory usage is pretty much constant, regardless of body size :)
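The copy-within-scope contract described above can be sketched roughly like this (hypothetical handler and field names, not the actual client API; it assumes the delivery body is exposed as a ReadOnlyMemory<byte> backed by a pooled buffer):

```csharp
using System;

static class ConsumerExample
{
    public static byte[] Saved;

    // Hypothetical handler; assumes the body arrives as ReadOnlyMemory<byte>
    // backed by a pooled buffer that the library recycles once all handlers
    // have run.
    public static void HandleDelivery(ReadOnlyMemory<byte> body)
    {
        // Wrong: storing `body` itself would later observe recycled bytes.
        // Right: copy the bytes while still inside the handler scope.
        Saved = body.ToArray();
    }
}
```

The key point is that the copy (ToArray) is the consumer's explicit, documented responsibility, which is what lets the library return the underlying buffer to the pool.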
@stebet that seems reasonable to me, but I'm not an end-user of this library. @acogoluegnes what is the lifetime of the message data bytes in the Java client? Do users have a well-defined scope in which to either copy or use the data for de-serialization?
@michaelklishin part of #734 doesn't make sense without #706. But I think some other changes can be ported to the current master without breaking changes. I can try to do it on the weekend.
A minimum possible lifetime is the lifetime of the delivery handler. Beyond that, the client does not really control it. This would be more or less the same if/when we change the API to use
Yeah, that's what I'm thinking. If we dispose the bytes when all handlers have finished running, we can get rid of these allocations too, since we can recycle those buffers. As long as it's well documented as a change in behavior, I don't see why not. Do you want me to add those changes to this PR? @michaelklishin @lukebakken @acogoluegnes
@stebet yes, please add them in a single commit, in the unlikely event we have to remove it. Thanks!!
@lukebakken @michaelklishin PR updated. Pretty close to the final version. A lot of unused internal stuff related to NetworkBinaryReader/Writer was deleted (after the merge with the public API reduction PR). Also updated the screenshot at the top with the latest memory profile :)
I made some cosmetic changes on my first pass. There's a lot going on here, so I'll re-review on Monday. Thanks for a lot of hard work, @stebet.
Don't hesitate to ask if something isn't clear or if you want me to comment on some changes for explanations :)
Haven't added this yet, but here's a glimpse of what using async directly (instead of work pools) achieves, with close to no functional changes except making most of the Handle* methods return Task. Here's my fork PR: stebet#2. This also improves throughput considerably, at least in my test environment (both the benchmark app and the tests run faster).
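A rough sketch of the shape of that change (all names here are hypothetical, not the actual client API):

```csharp
using System;
using System.Threading.Tasks;

interface IAsyncConsumer
{
    Task HandleDeliveryAsync(ReadOnlyMemory<byte> body);
}

static class Session
{
    // Before (roughly): a void HandleBasicDeliver enqueued the delivery onto
    // a shared work pool, so every message paid for the queue hop.
    // After: the Handle* method itself returns a Task and awaits the
    // consumer callback directly, cutting out the work-pool indirection.
    public static async Task HandleBasicDeliverAsync(
        IAsyncConsumer consumer, ReadOnlyMemory<byte> body)
    {
        await consumer.HandleDeliveryAsync(body).ConfigureAwait(false);
    }
}
```

ConfigureAwait(false) keeps continuations off any captured context, which matters for library code on hot paths.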
stebet force-pushed the branch from c8cc691 to 5f33780, and later from 5f33780 to b1c16b4. Commit messages:
- …nd removing unused internal classes and tests. Adding NetworkByteOrder tests.
- …tQueue + SemaphoreSlim to get rid of a blocking wait for better concurrency.
- Removing use of SynchronizedList and replacing with just two longs to keep track of outstanding confirms.
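The two-longs idea from the last commit can be sketched like this (a simplification with made-up names; among other things it ignores out-of-order single acks and nacks, which a real implementation must handle):

```csharp
using System.Threading;

// Sketch: outstanding publisher confirms tracked with two long counters
// instead of a synchronized list of pending sequence numbers.
class ConfirmTracker
{
    private long _published; // sequence number of the last published message
    private long _confirmed; // highest sequence number acked by the broker

    public long NextPublishSeqNo() => Interlocked.Increment(ref _published);

    // With multiple=true the broker confirms everything up to deliveryTag,
    // so one counter store replaces removing a range of entries from a list.
    public void HandleAck(long deliveryTag) =>
        Interlocked.Exchange(ref _confirmed, deliveryTag);

    public long Outstanding =>
        Interlocked.Read(ref _published) - Interlocked.Read(ref _confirmed);
}
```

Besides removing the list allocations, this replaces a blocking lock with lock-free Interlocked operations.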
As always very impressive work @stebet - the RabbitMQ team appreciates it a lot!
My pleasure :)
Proposed Changes
This PR starts the work needed to reduce allocations when (de)serializing commands and frames that eventually get sent or received over the wire.
The only way to properly do this is to use Memory and slicing to read/write objects, but this brings some questions along with it since it does impact the Public API.
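For illustration, here is a rough sketch (not this PR's actual code; the names are made up) of serializing an AMQP frame header directly into a buffer rented from ArrayPool<byte>, rather than going through a MemoryStream:

```csharp
using System;
using System.Buffers;
using System.Buffers.Binary;

static class FrameWriter
{
    // AMQP 0-9-1 frame header: type (1 byte), channel (2 bytes, big-endian),
    // payload size (4 bytes, big-endian); the payload and the 0xCE frame-end
    // byte follow. The caller must return the buffer to the pool when done.
    public static byte[] RentAndWriteHeader(
        byte type, ushort channel, int payloadSize, out int written)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(7 + payloadSize + 1);
        Span<byte> span = buffer.AsSpan();
        span[0] = type;
        BinaryPrimitives.WriteUInt16BigEndian(span.Slice(1), channel);
        BinaryPrimitives.WriteUInt32BigEndian(span.Slice(3), (uint)payloadSize);
        written = 7;
        return buffer;
    }
}
```

Because the pool recycles buffers, steady-state serialization allocates nothing on the hot path, which is where the flat memory profiles above come from.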
Currently, a lot of allocations come from MemoryStream objects being created temporarily. If we want to get rid of them, we need to make some changes, so I propose the following:
- Make Frame and MethodBase internal, at least. This should reduce the exposed public API quite a lot. Relates to Reduce public API as much as possible #714
- Get rid of MemoryStream and NetworkBinary(Reader/Writer) and work directly with Memory<byte> instances when reading/writing types deriving from MethodBase.
Types of Changes
Checklist
- I have read the CONTRIBUTING.md document
Further Comments
Tagging @bording for input.