Object: Let debug lock handle callee destruction within call chain gracefully #96856

RandomShaper · 2024-09-11T12:55:07Z

AThousandShips · 2024-09-11T13:01:49Z

How does this relate to:

Player crashes randomly in Godot 4.2 on macOS #86788

And this fix:

[GDScript] Fix lifetime of call base object #86969

RandomShaper · 2024-09-11T13:07:20Z

How does this relate to:

Player crashes randomly in Godot 4.2 on macOS #86788

And this fix:

[GDScript] Fix lifetime of call base object #86969

Wow, I had totally forgotten I was involved there.

Now I understand the issue better, I can say that GDScript already ensures the lifetime of a RefCounted base object, because it stores it on the stack, so it will survive at least up to the end of the call. Therefore, I'd say there's no need for the extra temporaries added by the compiler in #86969.

RandomShaper · 2024-09-11T13:26:19Z

Of course, using Variant has the side effect of deferring the destruction of the object until after the call, so I've modified the approach in a way that the refcount is not artificially incremented, while still having safety if the object is released prior to returning.

lawnjelly · 2024-09-11T14:01:45Z

This looks pretty good and identical to my approach in the issue (great minds think alike and all that 😁 ), so am almost there to approve.

The only thing I wasn't sure about is how often this gets called in practice with every call to an object, and how expensive the ObjectDB lookup is (hashtable?) and whether we should be looking to avoid it, even in debug. i.e. does it appear on a profile. I suspect not, but would be nice to check.

An alternative would be to optionally store e.g. a pointer to a bool on each Object, which can inform the calling code when an object has been destroyed. This is O(1) but has the disadvantage of storing an extra pointer on each object.

If the Object::call() isn't used that often it isn't worth micro-optimizing though.

UPDATE:
Actually nearly every alternative I can think up either has potential bugs (due to e.g. the chain of calls containing more than one reference to the object), or may not be O(1) themselves, so maybe we should just go for this and work further if it introduces any performance issues.

AThousandShips

Makes sense and looks good to me

AThousandShips · 2024-09-11T14:57:42Z

Would be good to add some tests if we can reliably reproduce these issues

…acefully Co-authored-by: lawnjelly <[email protected]>

RandomShaper · 2024-09-12T07:09:45Z

Regarding performance, I wouldn't worry much since ObjectDB::get_instance() is very fast, and O(1), as it's based on generational ids. On the other hand, I think there's an opportunity to revamp this debug lock feature and maybe make it work only if the debugger is attached (like certain Variant object validity checks do) and/or make a very different implementation. After all, the only goal of the debug lock is prevent free() from being called if the object is at that point involved in a call chain.

I've force-pushed with an added likely(), and also (due to the striking coincidence) including @lawnjelly as a co-author (that's the least a gentleman can do 😀).

I'm adding some tests and will push again.

lawnjelly · 2024-09-12T07:28:39Z

I've force-pushed with an added likely(), and also (due to the striking coincidence) including @lawnjelly as a co-author (that's the least a gentleman can do 😀).

Ah yes I'll do that on mine too. 👍

RandomShaper · 2024-09-12T08:54:53Z

Pushed some tests, as "black-boxey" as possible. There are other cases of OBJ_DEBUG_LOCK used in object.cpp (emit_signalp for the emitting object and set_script()), but in those it wouldn't be safe that the object destroyed itself.

lawnjelly · 2024-09-12T10:04:55Z

I don't know the first thing about tests, but given the MRP only crashes with asan and debug build, does the test fail with the old code (with the normal tests build)?

RandomShaper · 2024-09-12T10:38:04Z

I don't know the first thing about tests, but given the MRP only crashes with asan and debug build, does the test fail with the old code (with the normal tests build)?

It will only fail with asan enabled, but luckily CI runs jobs that compile with sanitizers and runs tests on such builds.

RandomShaper · 2024-09-12T10:59:46Z

I've been able to come up with a way of letting a crash happen despite asan not enabled. Not as guaranteed, but makes the test crash with the prior implementation of _ObjectDebugLock.

tests/core/object/test_object.h

RandomShaper · 2024-09-13T09:10:41Z

Wow, getting the non-asan/tsan version of the crash induction to work on every platform has proven challenging.

Also, I've added tests for GDScript as its script instance and VM implementation have to guarantee the behavior. Those will only fail with asan/tsan, though.

I'm fairly happy with this now.

core/object/object.cpp

akien-mga · 2024-09-16T11:45:14Z

Thanks!

akien-mga · 2024-09-16T15:58:22Z

Cherry-picked for 4.3.1.

RandomShaper added bug topic:core cherrypick:4.3 Considered for cherry-picking into a future 4.3.x release labels Sep 11, 2024

RandomShaper added this to the 4.4 milestone Sep 11, 2024

RandomShaper requested a review from a team as a code owner September 11, 2024 12:55

RandomShaper force-pushed the selfdestruct_correctness branch from b0ec68a to 05e3c71 Compare September 11, 2024 12:58

akien-mga added cherrypick:3.x Considered for cherry-picking into a future 3.x release cherrypick:3.6 Considered for cherry-picking into a future 3.6.x release labels Sep 11, 2024

RandomShaper force-pushed the selfdestruct_correctness branch from 05e3c71 to b70bbba Compare September 11, 2024 13:29

lawnjelly mentioned this pull request Sep 11, 2024

[3.x] Object::call() prevent debug lock accessing dangling pointer #96862

Merged

akien-mga removed cherrypick:3.x Considered for cherry-picking into a future 3.x release cherrypick:3.6 Considered for cherry-picking into a future 3.6.x release labels Sep 11, 2024

lawnjelly approved these changes Sep 11, 2024

View reviewed changes

AThousandShips approved these changes Sep 11, 2024

View reviewed changes

Object: Let debug lock handle callee destruction within call chain gr…

10e2318

…acefully Co-authored-by: lawnjelly <[email protected]>

RandomShaper force-pushed the selfdestruct_correctness branch from b70bbba to 10e2318 Compare September 12, 2024 06:51

RandomShaper requested a review from a team as a code owner September 12, 2024 08:52

RandomShaper force-pushed the selfdestruct_correctness branch from 2736124 to a9ff30d Compare September 12, 2024 10:58

RandomShaper force-pushed the selfdestruct_correctness branch from a9ff30d to 271729d Compare September 12, 2024 11:01

RandomShaper commented Sep 12, 2024

View reviewed changes

tests/core/object/test_object.h Outdated Show resolved Hide resolved

RandomShaper force-pushed the selfdestruct_correctness branch 6 times, most recently from 3218a35 to e4e0087 Compare September 13, 2024 08:19

RandomShaper requested a review from a team as a code owner September 13, 2024 08:19

akien-mga reviewed Sep 14, 2024

View reviewed changes

core/object/object.cpp Outdated Show resolved Hide resolved

Object: Add tests about the safety of tail destruction

bb77520

RandomShaper force-pushed the selfdestruct_correctness branch from e4e0087 to bb77520 Compare September 16, 2024 07:58

akien-mga merged commit f7daa0f into godotengine:master Sep 16, 2024
20 checks passed

This was referenced Sep 16, 2024

Player crashes randomly in Godot 4.2 on macOS #86788

Closed

[GDScript] Fix lifetime of call base object #86969

Closed

akien-mga removed the cherrypick:4.3 Considered for cherry-picking into a future 4.3.x release label Sep 16, 2024

AThousandShips mentioned this pull request Sep 27, 2024

Potential use after free on GDScript callable methods; methods do not capture self object when used as callables. #97523

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Object: Let debug lock handle callee destruction within call chain gracefully #96856

Object: Let debug lock handle callee destruction within call chain gracefully #96856

RandomShaper commented Sep 11, 2024

AThousandShips commented Sep 11, 2024

RandomShaper commented Sep 11, 2024

RandomShaper commented Sep 11, 2024

lawnjelly commented Sep 11, 2024 •

edited

Loading

AThousandShips left a comment

AThousandShips commented Sep 11, 2024

RandomShaper commented Sep 12, 2024

lawnjelly commented Sep 12, 2024

RandomShaper commented Sep 12, 2024

lawnjelly commented Sep 12, 2024

RandomShaper commented Sep 12, 2024

RandomShaper commented Sep 12, 2024

RandomShaper commented Sep 13, 2024

akien-mga commented Sep 16, 2024

akien-mga commented Sep 16, 2024

Object: Let debug lock handle callee destruction within call chain gracefully #96856

Object: Let debug lock handle callee destruction within call chain gracefully #96856

Conversation

RandomShaper commented Sep 11, 2024

AThousandShips commented Sep 11, 2024

RandomShaper commented Sep 11, 2024

RandomShaper commented Sep 11, 2024

lawnjelly commented Sep 11, 2024 • edited Loading

AThousandShips left a comment

Choose a reason for hiding this comment

AThousandShips commented Sep 11, 2024

RandomShaper commented Sep 12, 2024

lawnjelly commented Sep 12, 2024

RandomShaper commented Sep 12, 2024

lawnjelly commented Sep 12, 2024

RandomShaper commented Sep 12, 2024

RandomShaper commented Sep 12, 2024

RandomShaper commented Sep 13, 2024

akien-mga commented Sep 16, 2024

akien-mga commented Sep 16, 2024

lawnjelly commented Sep 11, 2024 •

edited

Loading