Why should we use arena? #4327

Closed
little-bird-in-china opened this issue Feb 22, 2018 · 12 comments

Comments

@little-bird-in-china

As far as I know, Arena is not a memory pool that reuses allocated memory by maintaining a free list; it just caches more and more memory as messages are created on it. Isn't Google's tcmalloc a better and more straightforward way to improve overall performance? I just want to take advantage of protobuf for network transmission.

@xfxyjwf
Contributor

xfxyjwf commented Feb 22, 2018

Right now, using arena with open-source protobuf doesn't gain you much, but inside Google we have seen massive improvements from adopting arena. I think protobuf arena has two advantages that tcmalloc can't offer:

  1. The ability to deallocate an entire proto message tree in big chunks. With arena, it's possible to allocate everything in a proto message tree in one or several bulk chunks of memory. When you are done with the message, you just need to deallocate these few large chunks. Without arena, deleting a proto message tree results in numerous small delete calls, one for every small object a proto may hold. Basically, with arena we can skip all the destructor calls, which we can't do otherwise.
  2. Better locality. With protobuf arena, objects belonging to the same proto message tree are placed in adjacent memory, whereas tcmalloc doesn't know whether an object is part of a proto message tree and is likely to interleave protos with non-protos.
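Both points can be illustrated with a toy bump allocator. This is a minimal sketch and not protobuf's actual Arena; `ToyArena`, `Node`, and `BuildList` are hypothetical names chosen for illustration. Every node is carved out of a few large blocks, neighbouring nodes end up adjacent in memory, and destroying the arena frees the blocks without visiting any node.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <vector>

// Toy bump allocator (NOT protobuf's Arena): hands out slices of large
// blocks and frees the blocks in bulk without running any destructors.
class ToyArena {
 public:
  explicit ToyArena(std::size_t block_size) : block_size_(block_size) {}

  // Returns `size` bytes aligned to `align` (assumed to be a power of two
  // no larger than alignof(std::max_align_t)).
  void* Allocate(std::size_t size,
                 std::size_t align = alignof(std::max_align_t)) {
    assert(size <= block_size_);
    std::size_t offset = (used_ + align - 1) & ~(align - 1);
    if (blocks_.empty() || offset + size > block_size_) {
      // Current block is full: start a new one.
      blocks_.push_back(std::make_unique<unsigned char[]>(block_size_));
      offset = 0;
    }
    used_ = offset + size;
    return blocks_.back().get() + offset;
  }

  std::size_t block_count() const { return blocks_.size(); }

 private:
  std::size_t block_size_;
  std::size_t used_ = 0;  // bytes consumed in the newest block
  std::vector<std::unique_ptr<unsigned char[]>> blocks_;
};

// Trivially destructible, so skipping its destructor is safe.
struct Node {
  int value;
  Node* next;
};

// Builds a linked list of n nodes inside the arena and returns its head.
Node* BuildList(ToyArena& arena, int n) {
  Node* head = nullptr;
  for (int i = 0; i < n; ++i) {
    void* mem = arena.Allocate(sizeof(Node), alignof(Node));
    head = new (mem) Node{i, head};
  }
  return head;
}
```

With a 4096-byte block size, `BuildList(arena, 100)` needs a single block, so tearing down the whole list costs one deallocation instead of one hundred `delete` calls.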

I think most of the benefit we saw came from (1). This unfortunately isn't the case with open-source protobuf because string fields are not allocated in the arena. Internally we have a hack to allocate something that looks like a string in the arena and cast it to a string when accessed, but that isn't portable. We also don't have ctype=STRING_PIECE support in open source, which can help with the issue. I know there are some users using arena with their own patch to implement ctype=STRING_PIECE. I don't think arena can be widely used until we address the string issue.
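The string problem can be reproduced with plain standard C++. The following is a sketch under stated assumptions, not protobuf code: a stack buffer stands in for arena memory, and `PointsInto` and `StringBytesLiveInArena` are hypothetical helper names. Placing a `std::string` object in the buffer does not place its characters there once the text is longer than the implementation's small-string buffer, so the destructor still has to run to release the heap allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <string>

// Returns true if p points into [buf, buf + size).
bool PointsInto(const void* p, const unsigned char* buf, std::size_t size) {
  assert(buf != nullptr);
  auto addr = reinterpret_cast<std::uintptr_t>(p);
  auto begin = reinterpret_cast<std::uintptr_t>(buf);
  return addr >= begin && addr < begin + size;
}

// Constructs a std::string inside "arena" storage (here just a stack
// buffer) and reports whether its characters also live in that storage.
bool StringBytesLiveInArena(const std::string& text) {
  alignas(std::string) unsigned char arena[256];
  std::string* s = new (arena) std::string(text);
  bool inside = PointsInto(s->data(), arena, sizeof(arena));
  // Required: skipping the destructor would leak any heap buffer the
  // string allocated for its characters.
  s->~basic_string();
  return inside;
}
```

For a 200-character string the function returns `false`: the object sits in the buffer but its contents are on the heap, which is why an arena cannot simply drop such fields in bulk.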

@little-bird-in-china
Author

little-bird-in-china commented Feb 23, 2018

@xfxyjwf Thanks for your quick reply. I still have two more questions about my scenario, a server that holds tens of thousands of TCP connections kept alive with heartbeat packets:

  1. Should I set a threshold for each connection to control overall memory usage and, when the threshold is reached, free all messages by resetting the corresponding arena? Or should I use arena in a different way?
  2. Since I only need to hold a few messages per thread in memory, can I detach out-of-date messages and reuse the memory they hold, so that I don't need to request memory from the OS again?

@xfxyjwf
Contributor

xfxyjwf commented Feb 23, 2018

The common patterns:

  1. With arena: one arena for one message. Something like:
{
  google::protobuf::Arena arena;
  // The arena owns foo: do not wrap it in unique_ptr or delete it.
  Foo* foo = google::protobuf::Arena::CreateMessage<Foo>(&arena);
  foo->ParseFromString(data);
  // ... use foo ...
  // arena is destructed here, releasing foo and everything it allocated
}
  2. Without arena: reuse proto messages with a free-list.
Foo* foo = free_list_->Pop();
foo->ParseFromString(data);
// ... use foo ...
free_list_->Push(foo);

(1) works well if the message structure is complex. You can also fine-tune memory allocation using ArenaOptions. For example, you can provide an initial block, so that if the message fits into this block no memory allocation or deallocation happens. However, as I mentioned, string fields won't be allocated on the arena, so it doesn't help if you have lots of string fields.
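The initial-block idea can be tried without protobuf. The sketch below swaps in `std::pmr::monotonic_buffer_resource` from C++17 as a stand-in arena; it is an analogy, not the ArenaOptions API, and `CountingResource` and `HeapFallbacks` are hypothetical names. The counting upstream resource records how often the arena has to reach beyond the block it was given.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <vector>

// Upstream resource that counts how often the arena falls back to the heap.
class CountingResource : public std::pmr::memory_resource {
 public:
  int allocations() const { return allocations_; }

 private:
  void* do_allocate(std::size_t bytes, std::size_t align) override {
    ++allocations_;
    return std::pmr::new_delete_resource()->allocate(bytes, align);
  }
  void do_deallocate(void* p, std::size_t bytes, std::size_t align) override {
    std::pmr::new_delete_resource()->deallocate(p, bytes, align);
  }
  bool do_is_equal(
      const std::pmr::memory_resource& other) const noexcept override {
    return this == &other;
  }

  int allocations_ = 0;
};

// Fills a vector of n ints from an arena seeded with a 4 KiB initial block
// and returns how many times the arena needed memory beyond that block.
int HeapFallbacks(std::size_t n) {
  alignas(std::max_align_t) unsigned char initial_block[4096];
  CountingResource upstream;
  {
    std::pmr::monotonic_buffer_resource arena(
        initial_block, sizeof(initial_block), &upstream);
    std::pmr::vector<int> values(&arena);
    values.reserve(n);
    for (std::size_t i = 0; i < n; ++i) values.push_back(static_cast<int>(i));
    assert(values.size() == n);
  }  // the arena releases everything at once here
  return upstream.allocations();
}
```

`HeapFallbacks(100)` returns 0 because 100 ints fit in the initial block, while a request far larger than 4 KiB forces at least one upstream allocation.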

(2) was the most common pattern before we had arena support, and that's probably still true today. Protobuf objects have the property that proto.Clear() doesn't deallocate any memory but instead caches it for reuse. So if you reuse the same proto object, memory allocation is kept to a minimum. Compared to arena, proto.Clear() still has a cost because it needs to traverse the entire message tree structure, but it's much better than deleting the proto object and is therefore used very widely. This is likely the best pattern for your use case as well. You can use either a global free list or a per-thread free list. In its simplest form you can just reuse one single proto object again and again.

There is one catch: because proto.Clear() doesn't deallocate memory, the memory usage of the reused proto keeps increasing. The reused proto basically allocates enough memory to accommodate every message parsed into it. For example, if one message uses repeated field "a" and another message uses repeated field "b", the reused proto will keep both. The more complex your message structure is, the faster the memory usage increases. For this reason, a free-list implementation usually deletes an object after a certain number of uses, and the newly allocated object starts to accumulate memory afresh.
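A free list with a use limit might look roughly like the following. This is a sketch under stated assumptions, not code from protobuf: `ToyMessage` stands in for a generated message (its vectors keep their capacity across `Clear()`, mimicking the caching behaviour described above), and `MessagePool`, `Entry`, and `max_uses` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Stand-in for a generated message: Clear() empties the vectors but keeps
// their capacity, mirroring how a reused proto keeps its memory.
struct ToyMessage {
  std::vector<int> a;
  std::vector<int> b;
  void Clear() {
    a.clear();
    b.clear();
  }
  std::size_t ReservedInts() const { return a.capacity() + b.capacity(); }
};

// Hypothetical free list that retires an object after `max_uses` uses so
// that accumulated memory is eventually returned.
class MessagePool {
 public:
  struct Entry {
    std::unique_ptr<ToyMessage> msg;
    int uses = 0;
  };

  explicit MessagePool(int max_uses) : max_uses_(max_uses) {}

  // Hands out a cached message if one is available, else a fresh one.
  Entry Pop() {
    if (free_.empty()) return Entry{std::make_unique<ToyMessage>(), 0};
    Entry e = std::move(free_.back());
    free_.pop_back();
    return e;
  }

  // Returns a message to the pool, or drops it once it has been used
  // max_uses times.
  void Push(Entry e) {
    assert(e.msg != nullptr);
    if (++e.uses >= max_uses_) return;  // retired: memory is freed here
    e.msg->Clear();
    free_.push_back(std::move(e));
  }

 private:
  int max_uses_;
  std::vector<Entry> free_;
};
```

A per-thread variant would give each thread its own `MessagePool`; a global one would need a mutex around `Pop` and `Push`, which this sketch leaves out.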

@little-bird-in-china
Author

I think I got it.

@ryanolson

ryanolson commented Aug 12, 2018

@xfxyjwf

You mention strings not working well in arenas, but what about bytes? Bytes are pseudo-strings, but since they don't need to be marshaled into some object, my assumption would be that arenas would be excellent for receiving bytes.

Especially if you wanted to receive these bytes directly into some special block of pinned memory, e.g. cudaMallocHost memory, using ArenaOptions.

Do arenas make sense for FlatBuffers? It seems like this might be the mechanism to do zero copy directly in and out of the memory blocks you reserve for messages.

@xfxyjwf
Contributor

xfxyjwf commented Aug 12, 2018

@ryanolson In the protobuf C++ API, string fields and bytes fields are both stored as std::string, so the same issue applies: neither of them will be stored efficiently in a protobuf arena. That can be solved by open-sourcing the zero-copy support (see #1896), which includes StringPiece (basically std::string_view) support and will allow a string or bytes field to alias memory in the arena directly.
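What "aliasing memory in the arena" means can be sketched with standard C++17 types. This is an illustration only, not the StringPiece implementation: `std::pmr::memory_resource` stands in for the arena and `CopyIntoArena` is a hypothetical helper. The bytes are copied into arena memory once, and the field is then just a pointer and a length with no destructor to run.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory_resource>
#include <string_view>

// Copies `bytes` into the arena once and returns a view aliasing that copy.
// The view is only valid while the arena is alive.
std::string_view CopyIntoArena(std::pmr::memory_resource& arena,
                               std::string_view bytes) {
  if (bytes.empty()) return {};
  assert(bytes.data() != nullptr);
  char* dst = static_cast<char*>(arena.allocate(bytes.size(), alignof(char)));
  std::memcpy(dst, bytes.data(), bytes.size());
  return std::string_view(dst, bytes.size());
}
```

Because `std::string_view` is trivially destructible, a message holding only such views could be dropped together with its arena, which is the property the std::string representation lacks.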

@0x007004

@xfxyjwf Hi, I have a question about arena.
protobuf 3.6.1 now supports creating strings in the arena, so regarding the advice "I don't think arena can be widely used until we address the string issue":
can I now use this version to improve performance?

Sorry, my English is bad. Thank you.
Looking forward to your reply.

@acozzette
Member

@ly82882592 No, unfortunately we still do not have a solution for this. We will probably need to introduce a string ctype based on std::string_view to be able to store string data directly on the arena.

@0x007004

@acozzette Oh, thank you.

@liuzhijiang
Contributor

Is an arena-allocated string class going to be included in official protobuf releases any time soon?

@aagor

aagor commented Nov 22, 2024

@acozzette As protobuf now supports features.(pb.cpp).string_type = VIEW, are there any plans to support allocating string contents on the arena (rather than only the std::string object on the arena with its contents on the heap)?

With this, using protobuf without dynamic memory allocation should be possible.

@acozzette
Member

Yes, we do plan to have VIEW-type strings support arena allocation. We don't have a specific timeline for it, though.

7 participants