-
Notifications
You must be signed in to change notification settings - Fork 10.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Too many full GCs when receive big responses #13236
Comments
You're right, this is happening because for each message sent, we serialize in a byte[] and then let the buffer to be garbage collected (for large objects, they are always taken from the large object generation which requires FullGC to garbage collect). Same thing happens for deserialization. Do you have any specific suggestions how the change should be made? |
I first thought that it would work if I replace As to the native layer, I think it is OK if we pass a larger array than required in the native layer, and ArrayPool will return a larger array than we specify. As to marshallers, I thought of the below code, but existing deserializers will slow down. So I couldn't have a good idea. public class RentedBuffer : IDisposable
{
public ArraySegment<byte> Buffer { get; }
public void Dispose();
}
public class Marshaller<T>
{
Func<T, byte[]> bytesSerializer;
Func<byte[], T> bytesDeserializer;
Func<T, RentedBuffer> segmentSerializer;
Func<ArraySegment<byte>, T> segmentDeserializer;
public Marshaller(Func<T, byte[]> serializer, Func<byte[], T> deserializer)
{
this.bytesSerializer = GrpcPreconditions.CheckNotNull(serializer, nameof(serializer));
this.bytesDeserializer = GrpcPreconditions.CheckNotNull(deserializer, nameof(deserializer));
}
public Marshaller(Func<T, RentedBuffer> serializer, Func<ArraySegment<byte>, T> deserializer)
{
this.segmentSerializer = GrpcPreconditions.CheckNotNull(serializer, nameof(serializer));
this.segmentDeserializer = GrpcPreconditions.CheckNotNull(deserializer, nameof(deserializer));
}
public Func<T, byte[]> Serializer
{
get
{
if (this.bytesSerializer == null)
{
this.bytesSerializer = x =>
{
using (var serializationResult = this.segmentSerializer(x))
{
var seg = serializationResult.Buffer;
// slow, but not used in Grpc.Core
var bs = new byte[seg.Count];
Buffer.BlockCopy(seg.Array, seg.Offset, bs, 0, seg.Count);
return seg;
}
};
}
return this.bytesSerializer;
}
}
public Func<byte[], T> Deserializer
{
get
{
if (this.bytesDeserializer == null)
{
this.bytesDeserializer = bs => this.segmentDeserializer(new ArraySegment<byte>(bs));
}
return this.bytesDeserializer;
}
}
public Func<T, RentedBuffer> SegmentSerializer
{
get
{
if (this.segmentSerializer == null)
{
this.segmentSerializer = x => new RentedBuffer(this.bytesSerializer(x));
}
return this.segmentSerializer;
}
}
public Func<ArraySegment<byte>, T> segmentDeserializer
{
get
{
if (this.segmentDeserializer == null)
{
this.segmentDeserializer = seg =>
{
if (seg.Count == seg.Array.Length)
{
return this.bytesDeserializer(seg.Array);
}
// slow
var bs = new byte[seg.Count];
Buffer.BlockCopy(seg.Array, seg.Offset, bs, 0, seg.Count);
return this.bytesDeserializer(bs);
}
}
return this.segmentDeserializer;
}
}
} |
Just to think of, I am using this baby when processing realtime audio, where had the same problem (not with gRPC): https://github.com/Microsoft/Microsoft.IO.RecyclableMemoryStream. It seems very efficient in avoiding the large GCs. But it has a stream interface, not an array. |
This was partially addressed by merging #18865. Grpc now allows using a deserializer that allows reading the received payload as ReadOnlySequence - which involves no copying.
|
@jtattermusch, .NET Core has added a lot of classes to reuse memory allocations, In addition to the [de]serialization you mention, Protobuffers all operate with the System.String on the surface API level (the proto's |
This issue/PR has been automatically marked as stale because it has not had any update (including commits, comments, labels, milestones, etc) for 180 days. It will be closed automatically if no further update occurs in 1 day. Thank you for your contributions! |
Optimization work on this topic continues to be relevant. |
Once this protobuf PR is available, we will be able deserialize without incurring the GC for large objects protocolbuffers/protobuf#5888 |
This issue/PR has been automatically marked as stale because it has not had any update (including commits, comments, labels, milestones, etc) for 30 days. It will be closed automatically if no further update occurs in 7 day. Thank you for your contributions! |
protocolbuffers/protobuf#7351 has been merged. Once that feature gets release (and we add bindings for the new de-serialization API on the gRPC side), this issue can be marked as fixed. |
@jtattermusch has the new serialization API been exposed in gRPC C# yet? I am trying to figure out how to use |
It's been available for more than a year now, see the PR for details: #23485 (it also demonstrates the PayloadAsReadOnlySequence usage). C# doesn't support the |
@jtattermusch thank you! This work has significantly reduced our GC time! There is still one area causing large GC though, which is the fact that we can't pass rented buffers from an |
Related issues / PRs are here: protocolbuffers/protobuf#8238, protocolbuffers/protobuf#7828, protocolbuffers/protobuf#7645 Right now there's no plan to enable deserializing messages using pre-allocated buffers, since there is currently no API that would allow one to control the lifetime of rented memory (C# protobuf messages/fields have no "Dispose" or "Return" method). As it's been pointed out previously, implementing this properly would be a major effort that would to be carefully designed (and we currently don't have the cycles for it). And we don't want a half-baked solution. I'm going to close this issue as there's currently nothing to do here. Some general advice for GC improvements:
|
I found that a lot of full GCs are run when the client receives big responses (about 100KB).
Here is an example code: https://gist.github.com/azyobuzin/7fee0a968a8780fc56b7e8a829c60d9b#file-grpcmemorytest-cs
In the example, the client will receive 100KB responses every 0.1 seconds.
I ran the code, profiling with CLR Profiler. Then I found that full(gen 2) GCs were run about every 3 seconds, and a lot of allocation happened in
BatchContextSafeHandle.GetReceivedMessage
.The result of profiling: https://gist.github.com/azyobuzin/7fee0a968a8780fc56b7e8a829c60d9b#file-grpcmemorytest-log
I think the situation where full GCs are run many times is not good. I would suggest it use an object pool, for example
System.Buffers.ArrayPool
.What version of gRPC and what language are you using?
gRPC C# 1.7.1 from NuGet
What operating system (Linux, Windows, …) and version?
Windows (.NET Framework 4.7)
The text was updated successfully, but these errors were encountered: