An effective way to compress Large Object Heap #4076
I think I've found an effective way to compress the LOH:
If the CLR always allocated every large object at the beginning of a RAM page (usually 4 KB per page), then the large object heap (LOH) could be compacted without much cost: the CLR could compact the LOH by modifying the RAM page tables and TLB instead of copying data. Small fragments might still exist (less than one page per fragment), but there would be no large fragmentation, and compaction would be very fast because nothing is copied. To do this, OS support may be needed; fortunately, Windows and Visual Studio are both Microsoft software, so Microsoft could implement this at least on Windows.
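A minimal sketch of this idea on Linux, using `mremap` as a stand-in for the proposed page-table edits (an illustration under assumed names and sizes, not code from this issue; the CLR does not do this today):

```c
// Sketch: "compact" two page-aligned allocations by remapping pages
// instead of copying them. Linux-only (mremap is not portable);
// error handling omitted for brevity.
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define PAGE 4096

int main(void) {
    // Reserve 3 pages: [obj A][hole][obj B] -- every object page-aligned.
    char *base = mmap(NULL, 3 * PAGE, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    strcpy(base, "object A");
    strcpy(base + 2 * PAGE, "object B");

    // Treat the middle page as freed, then slide B down over the hole by
    // editing the page tables (MREMAP_FIXED moves the mapping, no copy).
    mremap(base + 2 * PAGE, PAGE, PAGE,
           MREMAP_MAYMOVE | MREMAP_FIXED, base + PAGE);

    printf("%s / %s\n", base, base + PAGE);  // object A / object B
    return 0;
}
```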
cc @Maoni0
To clarify: by "modify RAM page table", your idea here is that if the CLR could determine that a page in the LOH was free, we could modify the PTE to point to a new physical page?
Thanks for your interest in the GC, @ygc369. The feature you are talking about is called VA remapping or swapping - it's something that needs to be implemented in the VMM. We talked to the OS guys about it a few years ago. We don't have it yet.
@Maoni0 This is a great feature, and not only for the GC: many other operations that need to copy large amounts of memory could also benefit from it. I think you should talk to the OS guys about it again. It would obviously improve the performance of many programs without modifying their source code.
I like the idea very much. Yes, the LOH is likely to fragment, and memory for large objects should be allocated directly rather than managed through a heap. It is much better to do the address mapping at the OS level than to copy and move data.
👍
Also, I would suggest filing feature requests with the Linux and *BSD kernel teams. I believe (but could be wrong) that the last attempt to get this feature into Linux was tainted by association with the Azul Zing JVM, which – being proprietary and patent-encumbered – was looked down upon by the Linux team. The feature appeared to be only useful for a single, proprietary piece of software. The FOSS *nix kernel developers might be much more interested if they saw that a free-software VM would actually use address remapping.
@drbo Is this feature very hard to implement?
@ygc369 I don't know – I was referring to the now-defunct Managed Runtime Initiative, which (I believe) only ever managed to submit a patch to Linux. Azul has since resorted to shipping a custom proprietary kernel module.
Virtual machine software already uses this feature, so I don't think it would be too hard to implement.
Nobody is interested in this?
I don't want to comment directly on the merits of pursuing this proposal, but I will mention that on a 64-bit machine, fragmentation is not as problematic as you might assume. The GC does not touch freed memory, and since every object on the large object heap is > 85K (thus spanning at least 21 4K pages), most of those pages simply drop out of the working set and don't 'hurt' real memory consumption (only address space consumption, which is significantly cheaper). I don't want to claim that fragmentation of the large object heap does not matter, but my observation above suggests we need data showing that large object heap fragmentation is a problem in interesting scenarios.
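For illustration (not part of the original comment), this is roughly how a runtime can drop such a free "hole" out of the working set without giving up the address range; `madvise` is the Linux call, and `VirtualFree` with `MEM_DECOMMIT` is the Windows analogue:

```c
// Sketch, assuming Linux and anonymous private memory: return the
// physical pages behind a freed hole to the OS while keeping the
// virtual address range reserved for later reuse.
#include <sys/mman.h>

void release_hole(void *hole_start, size_t hole_len) {
    // Pages leave the working set; a later access yields zero-filled
    // pages again. (Windows: VirtualFree(hole_start, hole_len,
    // MEM_DECOMMIT) is the rough equivalent.)
    madvise(hole_start, hole_len, MADV_DONTNEED);
}
```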
@vancem
Not compacting the LOH does not mean we do not collect garbage on the LOH. We just don't compact it (unless you specifically ask for it, via GCSettings.LargeObjectHeapCompactionMode). We can collect the LOH as often as we need to if we think it's productive. As I mentioned above, we already talked to the OS group a few years ago about doing this, and the OS has yet to implement the VA remapping feature. I will talk to them again, but feel free to bring this up with the Windows group and other OS groups, as @drbo mentioned.
@Maoni0
I don't want to leave a comment on how it would be best to compress the LOH, but I want to ask whether that is even necessary. Compaction is a lot of work for the GC, and for that reason it will be done less often than it should be for memory management to stay efficient. What comes to my mind instead is a kind of linked page chain.

The reason for the fragmentation is the search for a free gap: a large object must fit into a gap of at least its own size, so over time the big gaps are filled by smaller objects and a lot of small gaps remain. It would be much better if the LOH chained 4K pages together instead of requiring one contiguous range per object. Since the LOH is used only for large allocations, the performance impact, plus the fragmentation from not filling the last 4K page of each object, would be much smaller than the fragmentation we have today. The additional memory consumed by an 8-byte pointer per 4K page would be about 8.4 MB per 4 GB allocated. Reading the memory could be a tiny bit slower, but this way the GC would never need to reorganize the LOH (see the sketch below).

If that is not enough, it could perhaps be optimized by using the first byte of each page to indicate whether reading should simply continue into the next page (as in the current LOH) or whether the page's last 8 bytes hold a reference to the next page. That would improve performance and memory usage when there is a lot of free memory or only a few LOH objects.
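A minimal sketch of the chained-page layout described above, assuming 4 KB pages and 8-byte links; `Page`, `PAYLOAD`, and `read_chained` are illustrative names, not CLR internals:

```c
// Each 4 KB page carries 4088 bytes of object data; its last 8 bytes
// link to the next page, so a large object need not be contiguous.
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096
#define PAYLOAD   (PAGE_SIZE - sizeof(void *))  /* 4088 data bytes */

typedef struct Page {
    uint8_t      data[PAYLOAD];
    struct Page *next;    /* last 8 bytes: link to the next page */
} Page;

// Copy `len` bytes of a chained large object into `dst`.
static void read_chained(const Page *p, uint8_t *dst, size_t len) {
    while (p && len > 0) {
        size_t n = len < PAYLOAD ? len : PAYLOAD;
        memcpy(dst, p->data, n);
        dst += n;
        len -= n;
        p = p->next;      /* follow the link instead of assuming the
                             object occupies adjacent pages */
    }
}
```

Note that every read has to follow the links rather than assume contiguity, which leads directly to the objection in the next comment.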
@FlorianRainer Wouldn't that break unsafe code that takes pointers into an object and assumes it is contiguous in memory?
@svick That's true: for unsafe code, and if you are working with pointers, this will not work.
@FlorianRainer
@Maoni0
AWE has existed since Server 2003; it's not new. The APIs are quite awkward for this purpose and likely not fast enough (what I talked about with the OS folks was much more targeted at GC usage); feel free to experiment with them. Linux has mremap, which seems much more suitable for this usage. I haven't gotten around to experimenting with it yet.
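For reference, a sketch of what the AWE route looks like, assuming the caller already holds SeLockMemoryPrivilege; the privilege setup and error handling that make these APIs awkward are omitted:

```c
// AWE lets a process point a virtual region at explicit physical pages,
// so a page can be "moved" to a new virtual address without copying.
#include <windows.h>

void awe_remap_demo(void) {
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    ULONG_PTR nPages = 1, pfn[1];

    // Grab one physical page and two page-sized AWE virtual regions.
    AllocateUserPhysicalPages(GetCurrentProcess(), &nPages, pfn);
    void *va1 = VirtualAlloc(NULL, si.dwPageSize,
                             MEM_RESERVE | MEM_PHYSICAL, PAGE_READWRITE);
    void *va2 = VirtualAlloc(NULL, si.dwPageSize,
                             MEM_RESERVE | MEM_PHYSICAL, PAGE_READWRITE);

    MapUserPhysicalPages(va1, nPages, pfn);   // page visible at va1
    ((char *)va1)[0] = 42;
    MapUserPhysicalPages(va1, nPages, NULL);  // unmap from va1...
    MapUserPhysicalPages(va2, nPages, pfn);   // ...remap at va2, no copy
    // ((char *)va2)[0] is now 42.
}
```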
I recently saw a CppCon talk where a clever technique for compacting heaps was presented. The key idea was to map the same physical page at multiple virtual locations. As long as objects at those virtual locations do not overlap physically, zero-copy compaction can be performed without relocating objects. It's not a full compaction; rather, it exploits suitable opportunities for this technique. Apparently, the authors found it to be a valuable optimization overall. This technique is different from simply releasing free-space "holes" by decommitting the pages. I'm posting this here for the GC team's consideration.
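The core of that technique, one physical page visible at two virtual addresses, can be sketched on Linux with `memfd_create` (an illustration only, not the allocator from the talk):

```c
// Back memory with a memfd so the same physical page can be mapped at
// two virtual addresses; writes through one view appear in the other.
#define _GNU_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    int fd = memfd_create("mesh-demo", 0);
    ftruncate(fd, 4096);

    // Two virtual views of the same physical page.
    char *v1 = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *v2 = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    v1[0] = 'M';           // write through the first view...
    assert(v2[0] == 'M');  // ...and observe it through the second; live
                           // objects from two sparse pages could share
                           // one physical page this way.
    return 0;
}
```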
Yeah, I saw this when Emery's paper was published. I have considered it for GC usage.
Would Regions help with this?
It's 2022; in three more years, this issue will be ten years old. Will the .NET team do something like a ZGC for .NET?
I also want to ask this question: will Regions help with this? @Maoni0
@Maoni0