Heap objects with custom allocator and explicit delete #5633
Comments
Destructible Types? dotnet/roslyn#161 |
That proposal seems to be safe, automatic resource management. My proposal is unsafe and manual all the way. This is about giving maximum control. Thanks for pointing out the "near duplicate", though. It is useful to contrast the two. @benaadams @stephentoub |
In any case this will be a very minor change, maybe even no change. The GC can already distinguish between its own objects and objects it does not own; ECMA-335 specifies that this must be allowed:
using System;
using System.Runtime.InteropServices;

class Program
{
    int _value;

    static unsafe void Main(string[] args)
    {
        // Managed pointer (ref) into unmanaged memory.
        IntPtr mem = Marshal.AllocHGlobal(4);
        Method(ref *(int*)mem);

        // Managed pointer (ref) into an object on the GC heap.
        Program p = new Program();
        Method(ref p._value);

        Marshal.FreeHGlobal(mem);
    }

    static void Method(ref int value)
    {
        value = 25;
    }
}
Here the GC has to update the managed pointer passed to |
@JanielS I did not even know that You are right. Here, |
@GSPP I don't know whether it is supposed to work on the C# side of things, but in the CLI it definitely is. ECMA-335 states:
I have personally used this feature in C++/CLI for clean wrapper code that can work with both unmanaged and managed memory (since a |
Alright. Does this not mean that we can immediately write this allocator system on the current CLR using a tiny C++/CLI library or using This crashes with internal corruption errors, though:
I tried type-punning the code of some library as a The |
Unmanaged/managed pointers and object references are not the same thing. It's true that managed pointers can point to unmanaged memory but that doesn't imply that object reference too can do that. I suspect that it is more or less technically possible but I don't think there's anything that allows this in the current ECMA spec.
I don't think that this should be a property of the type. Not only that this prevents allocating existing reference types outside of the GC heap but it's quite useless because you can mostly do this today with value types and unsafe code. |
@mikedn I'm aware of that. What I'm saying is that the ECMA-335 states that unmanaged pointers can be converted to managed pointers. For this to be supported the CLR has to be able to answer the question I quoted from the feature request - whether the GC owns an object at the specified address. @GSPP Maybe so. I know you can reinterpret objects with a structure with Explicit layout, however that doesn't quite allow you to examine the object representation. It could probably be done with some |
As a POC, this seems to work on desktop CLR:
using System;
using System.Runtime.InteropServices;

internal class ArenaAllocator : IDisposable
{
    private readonly IntPtr _mem;
    private IntPtr _cur;

    public ArenaAllocator()
    {
        _mem = Marshal.AllocHGlobal(0x100000);
        _cur = _mem;
    }

    public unsafe T Allocate<T>() where T : class
    {
        // Write the type handle at the address the object reference will point to.
        *(IntPtr*)_cur = typeof(T).TypeHandle.Value;
        IntPtr ptr = _cur;
        // Hand-build a TypedReference; this relies on its undocumented layout
        // (address of the value first, then the type handle).
        TypedReference reference = default(TypedReference);
        ((IntPtr*)&reference)[0] = (IntPtr)(&ptr);
        ((IntPtr*)&reference)[1] = typeof(T).TypeHandle.Value;
        // Reinterpret the raw address held in ptr as a reference to T.
        // _cur is never advanced because the size of T is not known here.
        return __refvalue(reference, T);
    }

    public void Dispose()
    {
        Marshal.FreeHGlobal(_mem);
    }
}
It's missing getting the size of T (not sure how -- probably through the type handle somehow), and sync block indices are not handled at all (I think these are negative offsets). On CoreCLR I don't think EDIT: And of course it's missing constructor invocation too, and does not handle special classes ( |
I agree with that now. @mikedn @JanielS That is a really nasty hack :) My next idea for a hack would have been to use I feel we should not derail this ticket further with meaningless chatter. I'm looking forward to the team responding. I also encourage anyone to post comments for why this would help their code and to +1 the opening post. Anyone doing games might be interested. The Stack Exchange folks posted about unsafe code tricks they did to make the tag engine perform acceptably. Would this help you, @mgravell? Or was it @mattwarren? Sorry for summoning everyone. |
You could construct an object perfectly, but you can't call new with it. I am not aware of a way to tell new to go to your own allocator. Regardless, this still doesn't integrate. If you assign this to an object field, obj.x = something_I_constructed_that_looks_like_a_managed_object, the GC will attempt to trace through it and it will fail, unless this is passed as a special type that tells the GC to ignore its references. But that again doesn't make it seamless. I am thinking about isolated heaps (that allow GCs on them individually instead of per process) though. I will post something hopefully soon. |
Yes, it has to be able to answer that and it does that. But managed pointers are quite restricted, they can live only on the stack. That makes them rather uncommon and so are any potential perf issues associated with answering the question.
It also has a good chance of corrupting memory or crashing as soon as you try to store a reference into such an object.
The issue is derailed from the beginning like all other similar issues because it fails to take into account various technical realities, existing possibilities and use cases. |
I've heard things like this before on many projects. It amounts to "we can't do it because we don't do it now" which is self-limiting. Never let historical decisions dictate future possibilities. |
Neah, this only has to do with people getting overly enthusiastic and claiming that a solution for a problem exists when even the problem is not understood, much less the solution. |
I'll definitely admit it wasn't well tested. I didn't do much more than a few allocations and GCs. And I definitely won't argue with @Maoni0 whether it will work or not. 😄 |
Just to respond to an explicit mention:
Not really. In general when I have data with this problem, I have lots But I share and echo the sentiment that the problem needs to be fully Marc
|
@mgravell you could still block-allocate those objects. Downside is they now waste 16 bytes on the object header each. Upside is you have normal managed references. No need to pass indexes and arrays around, or pointers. To clarify: This would not involve the GC at all. You also could have managed objects living in memory shared with the GPU. I think we could make those objects copyable through memcpy if we disable any managed function based on the object header. That would be locking and the identity hash code, I think. Those operations would throw. Also, this would obviate the need to call " |
It seems like you may be able to achieve this if/when the work being done in the Snowflake project arrives in CoreCLR, see Project Snowflake: Non-blocking safe manual memory management in .NET. The code sample below is from the paper; it shows the usages of
T Find(Predicate<T> match)
{
    using (Shield<T[]> s_items = _items.Defend())
    {
        for (int i = 0; i < _size; i++)
        {
            if (match(s_items.Value[i]))
                return s_items.Value[i];
        }
    }
    return default(T);
} |
@Maoni0 can you explain "fail"? I see the code @jakobbotsch working fine. |
@Maoni0 Custom objects could have a bit set in the object header marking them as such to the GC. That would be a cheap way to activate custom GC behavior on a per-instance (not per-type) basis. Would that work? |
@GSPP so there are several pieces of meta-data the GC keeps about objects. If these aren't backed by actual allocations things can go wrong
Now, placing a bit in the header requires the operations to know where the header is. This is not always the case for pointers from the stack into the heap. In particular, the write barrier does not know the header of the object it is updating, so it cannot check this bit. That means you are likely to get random segfaults when you try to write to a non-existent card table. The other data structures can also get touched based on the address, and may or may not exist for the address range you have allocated. If you didn't want the card table to cover the range you are managing, it would be much simpler. |
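To make the card-table point concrete, here is a toy model. It is not the CLR's write barrier: the `ToyCardTable` name, the 256-byte card size, and the address values are all assumptions chosen for illustration. The only thing it is meant to show is that the card slot is computed from the written address alone, so an address outside the covered range has no slot to write to.

```csharp
using System;

// Toy model of a card table; NOT the CLR implementation.
// Assumption: one card byte covers 2^CardShift bytes of address space.
public sealed class ToyCardTable
{
    private const int CardShift = 8;    // hypothetical card size: 256 bytes
    private readonly ulong _lowest;     // first address covered by the table
    private readonly ulong _highest;    // one past the last covered address
    private readonly byte[] _cards;

    public ToyCardTable(ulong lowest, ulong highest)
    {
        if (highest <= lowest)
            throw new ArgumentException("Empty range.");
        _lowest = lowest;
        _highest = highest;
        ulong cardSize = 1UL << CardShift;
        _cards = new byte[(int)((highest - lowest + cardSize - 1) >> CardShift)];
    }

    public bool Covers(ulong address) => address >= _lowest && address < _highest;

    // The barrier sees only the address of the field being written,
    // not the header of the object that contains the field.
    public bool TryMarkCard(ulong fieldAddress)
    {
        if (!Covers(fieldAddress))
            return false;               // no card exists for this address
        _cards[(int)((fieldAddress - _lowest) >> CardShift)] = 0xFF;
        return true;
    }

    public bool IsMarked(ulong address) =>
        Covers(address) && _cards[(int)((address - _lowest) >> CardShift)] != 0;
}
```

With `new ToyCardTable(0x10000, 0x20000)`, `TryMarkCard(0x10010)` succeeds, while `TryMarkCard(0x90000000)` returns false. A barrier that skipped the `Covers` check would compute an index far outside the table for that second address, which is the kind of stray write described above.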
@mjp41 as long as the object is allocated outside of the GC ranges, shouldn't it "just" work? The GC code has to check whether it is within range or not, no? Also, what happens if you would do this for an object with no fields, like strings? Could that work? |
There are of the order of 30 places that simply follow managed pointers by
Some bits do check if it is in the range of the heap, but not all of them. Many bits assume it will be able to find the GC heap/segment. If you restrict to types that do not contain GC references (blittable types), then you would not need to deal with the card table, so it would be changing just these traversal pieces. Our prototype put everything in separate address spaces for manually managed and GC managed, which led to a quick cheap check. Looking in the header on every step of tracing could be expensive, and affect performance of code not using this feature. |
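The "quick cheap check" that separate address spaces allow can be sketched as a single unsigned comparison. This is only an illustration of the idea, not code from the prototype; the `AddressRange` type and its members are hypothetical names.

```csharp
using System;

// Hypothetical helper: one contiguous region reserved for manually managed objects.
public readonly struct AddressRange
{
    private readonly ulong _start;
    private readonly ulong _length;

    public AddressRange(ulong start, ulong length)
    {
        _start = start;
        _length = length;
    }

    // For address < _start the unsigned subtraction wraps around to a huge
    // value, so one comparison covers both the lower and the upper bound.
    public bool Contains(ulong address) => unchecked(address - _start) < _length;
}
```

A tracing step could then branch on `manualRange.Contains(address)` without touching the object header, which is why code that does not use the feature would pay very little for it.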
Due to lack of recent activity, this issue has been marked as a candidate for backlog cleanup. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will undo this process. This process is part of our issue cleanup automation. |
I'll go ahead and keep this issue alive. This seems to be something that people are interested in. |
MS, what are you waiting for to start on this issue, or even on the outcome of the Snowflake project?
.NET only supports automatic lifetime for managed objects. The GC cleans up. This is fantastic for productivity. Sometimes, developers need tight control over latency, though. The GC can interfere with that goal.
This has been discussed at length in many places. I believe the team is aware of this issue. Although great strides have been made improving the GC, this is still an important concern. It is not clear that the GC can ever fully resolve this.
As a workaround we can place data in manually allocated memory and use pointers to access that data. But that data can never be a managed object. I cannot pass that data to other non-aware code. If I want to allocate an unmanaged buffer I cannot pass that buffer as a byte[] to other code. This is terrible for composability.
Please implement unsafe managed objects with user controlled lifetime. Like this:
I can ask the runtime to create and destroy objects on a custom allocator that I provide. An allocator is just a custom class:
Using this API developers can manage memory without involving the GC. They can devise their own lifetime schemes.
Benefits:
DeleteObject. The allocator can destroy all objects in constant time (arena allocation).
The usual perils of unsafe memory management apply:
This scheme lends itself to arena allocation. A game engine can allocate all per-frame objects in an arena and constant-time delete all of them at frame end. A REST service can arena allocate all data per-request. An XML parser can allocate all temporary buffers (temp strings, etc.) in a per-parse arena.
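For contrast, this is roughly what arena allocation looks like today with the pointer-based workaround. It is a minimal sketch, not the proposed API: `StructArena` and its members are hypothetical names, it needs unsafe code enabled, and it only handles `unmanaged` structs, so the results are raw pointers rather than managed objects. That limitation is exactly what this proposal is about.

```csharp
using System;
using System.Runtime.InteropServices;

// Bump-pointer arena for unmanaged structs (types with no GC references).
// Hypothetical sketch; compile with unsafe code enabled.
public sealed unsafe class StructArena : IDisposable
{
    private readonly byte* _start;
    private readonly int _capacity;
    private int _used;
    private bool _disposed;

    public StructArena(int capacityBytes)
    {
        _capacity = capacityBytes;
        _start = (byte*)Marshal.AllocHGlobal(capacityBytes);
    }

    public int UsedBytes => _used;

    // Returns a pointer to default-initialized storage for one T.
    public T* Allocate<T>() where T : unmanaged
    {
        if (_disposed)
            throw new ObjectDisposedException(nameof(StructArena));
        int offset = (_used + 7) & ~7;      // round up to an 8-byte boundary
        if (offset + sizeof(T) > _capacity)
            throw new OutOfMemoryException("Arena exhausted.");
        _used = offset + sizeof(T);
        T* p = (T*)(_start + offset);
        *p = default;
        return p;
    }

    // Constant-time "delete everything". Pointers handed out earlier
    // become dangling, which is the usual unsafe-memory peril.
    public void Reset() => _used = 0;

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;
        Marshal.FreeHGlobal((IntPtr)_start);
    }
}
```

A per-frame or per-request loop would call `Allocate<T>()` as needed and `Reset()` at the end of the frame or request. Because callers receive a `T*`, the data cannot be handed to code that expects ordinary object references.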
This proposal achieves very nice integration of unsafe memory management into an otherwise managed application. The idea is that most code is safe and managed but there are performance-critical islands of unmanaged memory that interoperate nicely.
The only CLR change required would be to teach the GC to ignore such custom objects. This could be done through a bit in the object header or based on type. I have left it open whether classes need to be declared as custom-allocated or whether any class can be allocated unsafely.