Reducing allocations when (de)serializing frames/commands. #732
Conversation
Updated (see above), now using memory buffers when serializing frames as well. Huge reductions in allocations again :)
@stebet thank you. We'd appreciate some guidance on how much of #734 can be folded into this PR. If most of it (an ideal scenario IMO), we should also make sure to give @Anarh2404 full credit for their contribution :)
Not a lot can be merged into this PR, as it doesn't include the pipelines improvements from #706. I mentioned to @lukebakken the other day that I'd like to split the allocation reductions off from the pipelines work, to keep the PRs saner to review and not mix concepts too much. The work from @Anarh2404 definitely applies in general, and it would be awesome to get it in as well, but I think it applies a bit more to the pipelines PR. The async improvements (ValueTask etc.) involve a lot of API changes that I think are better suited for 7.0, as they are a bit bigger, but I'd love to see what from there could apply here.
The biggest remaining source of allocations can be removed if the following behavior change can take place: when commands/messages are handled, make it an explicit and documented rule that a client wanting to hang on to the bytes of a consumed message MUST copy them within the scope in which they are received/handled, since they'll be disposed after all handlers have run. If we dispose them, we can fetch them from array pools like the (de)serialized versions, and sending and receiving messages then runs almost entirely on recycled buffers, making allocations predictable and nearly constant. This would make the Command.Body property a ReadOnlyMemory<byte>.
What do you say @lukebakken, @michaelklishin and @bording? Is that an acceptable API change to make for 6.0?
To give you an idea what this means, this is what it looks like with the same program as above (sending and receiving 50k messages, 512-byte body size): Even if I bump the body size up to 16384 bytes, this is what it looks like: So memory usage is pretty much constant, regardless of body size :)
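The copy-within-scope contract described above can be sketched roughly like this (hypothetical handler and field names, not the actual client API; it assumes the delivery body is exposed as a ReadOnlyMemory<byte> backed by a pooled buffer):

```csharp
using System;

static class ConsumerExample
{
    public static byte[] Saved;

    // Hypothetical handler; assumes the body arrives as ReadOnlyMemory<byte>
    // backed by a pooled buffer that the library recycles once all handlers
    // have run.
    public static void HandleDelivery(ReadOnlyMemory<byte> body)
    {
        // Wrong: storing `body` itself would later observe recycled bytes.
        // Right: copy the bytes while still inside the handler scope.
        Saved = body.ToArray();
    }
}
```

The key point is that the copy (ToArray) is the consumer's explicit, documented responsibility, which is what lets the library return the underlying buffer to the pool.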
@stebet that seems reasonable to me, but I'm not an end-user of this library. @acogoluegnes what is the lifetime of the message data bytes in the Java client? Do users have a well-defined scope in which to either copy or use the data for de-serialization?
@michaelklishin part of #734 doesn't make sense without #706. But I think some other changes can be ported to the current master without breaking changes. I can try to do it on the weekend.
A minimum possible lifetime is the lifetime of the delivery handler. Beyond that, the client does not really control it. This would be more or less the same if/when we change the API to use
Yeah, that's what I'm thinking. If we dispose the bytes when all handlers have finished running, we can get rid of these allocations too, since we can recycle those buffers. As long as it's well documented as a change in behavior, I don't see why not. Do you want me to add those changes to this PR? @michaelklishin @lukebakken @acogoluegnes
@stebet yes, please add them in a single commit, in the unlikely event we have to remove it. Thanks!!
@lukebakken @michaelklishin PR updated. Pretty close to the final version. A lot of unused internal stuff related to NetworkBinaryReader/Writer was deleted (after the merge with the public API reduction PR). Also updated the screenshot at the top with the latest memory profile :)
I made some cosmetic changes on my first pass. There's a lot going on here, so I'll re-review on Monday. Thanks for a lot of hard work, @stebet.
Don't hesitate to ask if something isn't clear or if you want me to comment on some changes for explanations :)
Haven't added this yet, but here's a glimpse of what using async directly (instead of work pools) achieves, with close to no functional changes except making most of the Handle* methods return Task. Here's my fork PR: stebet#2. This also improves throughput considerably, at least in my test environment (both the benchmark app and the tests run faster).
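A rough sketch of the shape of that change (all names here are hypothetical, not the actual client API):

```csharp
using System;
using System.Threading.Tasks;

interface IAsyncConsumer
{
    Task HandleDeliveryAsync(ReadOnlyMemory<byte> body);
}

static class Session
{
    // Before (roughly): a void HandleBasicDeliver enqueued the delivery onto
    // a shared work pool, so every message paid for the queue hop.
    // After: the Handle* method itself returns a Task and awaits the
    // consumer callback directly, cutting out the work-pool indirection.
    public static async Task HandleBasicDeliverAsync(
        IAsyncConsumer consumer, ReadOnlyMemory<byte> body)
    {
        await consumer.HandleDeliveryAsync(body).ConfigureAwait(false);
    }
}
```

ConfigureAwait(false) keeps continuations off any captured context, which matters for library code on hot paths.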
stebet force-pushed the branch from c8cc691 to 5f33780, and later from 5f33780 to b1c16b4. Commit messages:
- …nd removing unused internal classes and tests. Adding NetworkByteOrder tests.
- …tQueue + SemaphoreSlim to get rid of a blocking wait for better concurrency.
- Removing use of SynchronizedList and replacing with just two longs to keep track of outstanding confirms.
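The two-longs idea from the last commit can be sketched like this (a simplification with made-up names; among other things it ignores out-of-order single acks and nacks, which a real implementation must handle):

```csharp
using System.Threading;

// Sketch: outstanding publisher confirms tracked with two long counters
// instead of a synchronized list of pending sequence numbers.
class ConfirmTracker
{
    private long _published; // sequence number of the last published message
    private long _confirmed; // highest sequence number acked by the broker

    public long NextPublishSeqNo() => Interlocked.Increment(ref _published);

    // With multiple=true the broker confirms everything up to deliveryTag,
    // so one counter store replaces removing a range of entries from a list.
    public void HandleAck(long deliveryTag) =>
        Interlocked.Exchange(ref _confirmed, deliveryTag);

    public long Outstanding =>
        Interlocked.Read(ref _published) - Interlocked.Read(ref _confirmed);
}
```

Besides removing the list allocations, this replaces a blocking lock with lock-free Interlocked operations.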
As always very impressive work @stebet - the RabbitMQ team appreciates it a lot!
My pleasure :)
Proposed Changes
This PR starts the work needed to reduce allocations when (de)serializing commands and frames that eventually get sent or received over the wire.
The only way to properly do this is to use Memory and slicing to read/write objects, but this brings some questions along with it since it does impact the Public API.
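For illustration, here is a rough sketch (not this PR's actual code; the names are made up) of serializing an AMQP frame header directly into a buffer rented from ArrayPool<byte>, rather than going through a MemoryStream:

```csharp
using System;
using System.Buffers;
using System.Buffers.Binary;

static class FrameWriter
{
    // AMQP 0-9-1 frame header: type (1 byte), channel (2 bytes, big-endian),
    // payload size (4 bytes, big-endian); the payload and the 0xCE frame-end
    // byte follow. The caller must return the buffer to the pool when done.
    public static byte[] RentAndWriteHeader(
        byte type, ushort channel, int payloadSize, out int written)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(7 + payloadSize + 1);
        Span<byte> span = buffer.AsSpan();
        span[0] = type;
        BinaryPrimitives.WriteUInt16BigEndian(span.Slice(1), channel);
        BinaryPrimitives.WriteUInt32BigEndian(span.Slice(3), (uint)payloadSize);
        written = 7;
        return buffer;
    }
}
```

Because the pool recycles buffers, steady-state serialization allocates nothing on the hot path, which is where the flat memory profiles above come from.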
Currently, a lot of allocations come from MemoryStream objects being created temporarily. If we want to get rid of them, we need to make some changes, so I propose the following:
- Make Frame and MethodBase internal, at least. This should reduce the exposed public API quite a lot. Relates to Reduce public API as much as possible #714
- Get rid of MemoryStream and NetworkBinary(Reader/Writer) and work directly with Memory<byte> instances when reading/writing types deriving from MethodBase.
Types of Changes
Checklist
- I have read the CONTRIBUTING.md document
Further Comments
Tagging @bording for input.