An effective way to compress Large Object Heap #4076
I think I've found an effective way to compress the LOH:
If the CLR always allocated every large object at the beginning of a RAM page (usually 4 KB per page), then the large object heap (LOH) could be compacted without much cost: the CLR could compact the LOH by modifying the RAM page tables and TLB instead of copying data. Small fragments might still exist (less than one page per fragment), but there would be no large fragmentation, and compaction would be very fast because nothing is copied. To do this, OS support may be needed; fortunately, Windows and Visual Studio are both Microsoft software, so Microsoft could implement this at least on Windows.
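A minimal sketch of this idea on Linux, using `mremap` as a stand-in for the proposed page-table edits (an illustration under assumed names and sizes, not code from this issue; the CLR does not do this today):

```c
// Sketch: "compact" two page-aligned allocations by remapping pages
// instead of copying them. Linux-only (mremap is not portable);
// error handling omitted for brevity.
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define PAGE 4096

int main(void) {
    // Reserve 3 pages: [obj A][hole][obj B] -- every object page-aligned.
    char *base = mmap(NULL, 3 * PAGE, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    strcpy(base, "object A");
    strcpy(base + 2 * PAGE, "object B");

    // Treat the middle page as freed, then slide B down over the hole by
    // editing the page tables (MREMAP_FIXED moves the mapping, no copy).
    mremap(base + 2 * PAGE, PAGE, PAGE,
           MREMAP_MAYMOVE | MREMAP_FIXED, base + PAGE);

    printf("%s / %s\n", base, base + PAGE);  // object A / object B
    return 0;
}
```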
cc @Maoni0
To clarify: by "modify RAM page table", your idea here is that if the CLR could determine that a page in the LOH was free, we could modify the PTE to point to a new physical page?
Thanks for your interest in the GC, @ygc369. The feature you are talking about is called VA remapping or swapping - it's something that needs to be implemented in the VMM. We talked to the OS guys about it a few years ago. We don't have it yet.
@Maoni0 This is a great feature, and not only for the GC: many other operations that need to copy large amounts of memory could also benefit from it. I think you should talk to the OS guys about it again. It would obviously improve the performance of many programs without modifying their source code.
I like the idea very much. Yes, the LOH is likely to fragment, and memory for large objects should be allocated directly rather than managed through a heap. It is much better to do the address mapping at the OS level than to copy and move data.
👍
Also, I would suggest filing feature requests with the Linux and *BSD kernel teams. I believe (but could be wrong) that the last attempt to get this feature into Linux was tainted by association with the Azul Zing JVM, which – being proprietary and patent-encumbered – was looked down upon by the Linux team. The feature appeared to be only useful for a single, proprietary piece of software. The FOSS *nix kernel developers might be much more interested if they saw that a free-software VM would actually use address remapping.
@drbo Is this feature very hard to implement?
@ygc369 I don't know – I was referring to the now-defunct Managed Runtime Initiative, which (I believe) only ever managed to submit a patch to Linux. Azul has since resorted to shipping a custom proprietary kernel module.
Virtual machine software already uses this feature, so I don't think it would be too hard to implement.
Nobody is interested in this?
I don't want to comment directly on the merits of pursuing this proposal, but I will mention that on a 64-bit machine, fragmentation is not as problematic as you might assume. The GC does not touch freed memory, and since every object on the large object heap is > 85K (thus spanning at least 21 4K pages), most of those pages simply drop out of the working set and don't 'hurt' real memory consumption (only address space consumption, which is significantly cheaper). I don't want to claim that fragmentation of the large object heap does not matter, but my observation above suggests we need data showing that large object heap fragmentation is a problem in interesting scenarios.
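For illustration (not part of the original comment), this is roughly how a runtime can drop such a free "hole" out of the working set without giving up the address range; `madvise` is the Linux call, and `VirtualFree` with `MEM_DECOMMIT` is the Windows analogue:

```c
// Sketch, assuming Linux and anonymous private memory: return the
// physical pages behind a freed hole to the OS while keeping the
// virtual address range reserved for later reuse.
#include <sys/mman.h>

void release_hole(void *hole_start, size_t hole_len) {
    // Pages leave the working set; a later access yields zero-filled
    // pages again. (Windows: VirtualFree(hole_start, hole_len,
    // MEM_DECOMMIT) is the rough equivalent.)
    madvise(hole_start, hole_len, MADV_DONTNEED);
}
```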
@vancem
Not compacting the LOH does not mean we do not collect garbage on the LOH. We just don't compact it (unless you specifically ask for it, via GCSettings.LargeObjectHeapCompactionMode). We can collect the LOH as often as we need to if we think it's productive. As I mentioned above, we already talked to the OS group a few years ago about doing this, and the OS has yet to implement the VA remapping feature. I will talk to them again, but feel free to bring this up with the Windows group and other OS groups, as @drbo mentioned.
@Maoni0
I don't want to leave a comment on how it would be best to compress the LOH, but I want to ask whether that is even necessary. Compaction is a lot of work for the GC, and for that reason it will be done less often than it should be for memory management to stay efficient. What comes to my mind instead is a kind of linked page chain.

The reason for the fragmentation is the search for a free gap: a large object must fit into a gap of at least its own size, so over time the big gaps are filled by smaller objects and a lot of small gaps remain. It would be much better if the LOH chained 4K pages together instead of requiring one contiguous range per object. Since the LOH is used only for large allocations, the performance impact, plus the fragmentation from not filling the last 4K page of each object, would be much smaller than the fragmentation we have today. The additional memory consumed by an 8-byte pointer per 4K page would be about 8.4 MB per 4 GB allocated. Reading the memory could be a tiny bit slower, but this way the GC would never need to reorganize the LOH (see the sketch below).

If that is not enough, it could perhaps be optimized by using the first byte of each page to indicate whether reading should simply continue into the next page (as in the current LOH) or whether the page's last 8 bytes hold a reference to the next page. That would improve performance and memory usage when there is a lot of free memory or only a few LOH objects.
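A minimal sketch of the chained-page layout described above, assuming 4 KB pages and 8-byte links; `Page`, `PAYLOAD`, and `read_chained` are illustrative names, not CLR internals:

```c
// Each 4 KB page carries 4088 bytes of object data; its last 8 bytes
// link to the next page, so a large object need not be contiguous.
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096
#define PAYLOAD   (PAGE_SIZE - sizeof(void *))  /* 4088 data bytes */

typedef struct Page {
    uint8_t      data[PAYLOAD];
    struct Page *next;    /* last 8 bytes: link to the next page */
} Page;

// Copy `len` bytes of a chained large object into `dst`.
static void read_chained(const Page *p, uint8_t *dst, size_t len) {
    while (p && len > 0) {
        size_t n = len < PAYLOAD ? len : PAYLOAD;
        memcpy(dst, p->data, n);
        dst += n;
        len -= n;
        p = p->next;      /* follow the link instead of assuming the
                             object occupies adjacent pages */
    }
}
```

Note that every read has to follow the links rather than assume contiguity, which leads directly to the objection in the next comment.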
@FlorianRainer Wouldn't that break unsafe code that takes pointers into an object and assumes it is contiguous in memory?
@svick That's true: for unsafe code, and if you are working with pointers, this will not work.
@FlorianRainer
@Maoni0
AWE has existed since Server 2003; it's not new. The APIs are quite awkward for this purpose and likely not fast enough (what I talked about with the OS folks was much more targeted at GC usage); feel free to experiment with them. Linux has mremap, which seems much more suitable for this usage. I haven't gotten around to experimenting with it yet.
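For reference, a sketch of what the AWE route looks like, assuming the caller already holds SeLockMemoryPrivilege; the privilege setup and error handling that make these APIs awkward are omitted:

```c
// AWE lets a process point a virtual region at explicit physical pages,
// so a page can be "moved" to a new virtual address without copying.
#include <windows.h>

void awe_remap_demo(void) {
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    ULONG_PTR nPages = 1, pfn[1];

    // Grab one physical page and two page-sized AWE virtual regions.
    AllocateUserPhysicalPages(GetCurrentProcess(), &nPages, pfn);
    void *va1 = VirtualAlloc(NULL, si.dwPageSize,
                             MEM_RESERVE | MEM_PHYSICAL, PAGE_READWRITE);
    void *va2 = VirtualAlloc(NULL, si.dwPageSize,
                             MEM_RESERVE | MEM_PHYSICAL, PAGE_READWRITE);

    MapUserPhysicalPages(va1, nPages, pfn);   // page visible at va1
    ((char *)va1)[0] = 42;
    MapUserPhysicalPages(va1, nPages, NULL);  // unmap from va1...
    MapUserPhysicalPages(va2, nPages, pfn);   // ...remap at va2, no copy
    // ((char *)va2)[0] is now 42.
}
```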
I recently saw a CppCon talk where a clever technique for compacting heaps was presented. The key idea was to map the same physical page at multiple virtual locations. As long as objects at those virtual locations do not overlap physically, zero-copy compaction can be performed without relocating objects. It's not a full compaction; rather, it exploits suitable opportunities for this technique. Apparently, the authors found it to be a valuable optimization overall. This technique is different from simply releasing free-space "holes" by decommitting the pages. I'm posting this here for the GC team's consideration.
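The core of that technique, one physical page visible at two virtual addresses, can be sketched on Linux with `memfd_create` (an illustration only, not the allocator from the talk):

```c
// Back memory with a memfd so the same physical page can be mapped at
// two virtual addresses; writes through one view appear in the other.
#define _GNU_SOURCE
#include <assert.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    int fd = memfd_create("mesh-demo", 0);
    ftruncate(fd, 4096);

    // Two virtual views of the same physical page.
    char *v1 = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *v2 = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    v1[0] = 'M';           // write through the first view...
    assert(v2[0] == 'M');  // ...and observe it through the second; live
                           // objects from two sparse pages could share
                           // one physical page this way.
    return 0;
}
```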
Yeah, I saw this when Emery's paper was published. I have considered it for GC usage.
Would Regions help with this?
It's 2022; in three more years, this issue will be ten years old. Will the .NET team do something like a ZGC for .NET?
I also want to ask this question: will Regions help with this? @Maoni0
@Maoni0