Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When removing a TileMap node (or its parent), remove_child method sometimes causes a crash. #22589

Open
koodikulma-fi opened this issue Oct 1, 2018 · 21 comments

Comments

@koodikulma-fi
Copy link

koodikulma-fi commented Oct 1, 2018

Edit:
This comment is outdated / misleading. The core issue is TileMaps and physics processing - see below.

Issue:
I have a situation where I need to switch a node between being "active" and "inactive" - so that when inactive (or disabled) its script and processes etc. stop running. However, using remove_child() will (almost always) cause the whole game/engine to crash (at least on Win10) - tried using call_deferred, set_process_internal(false) and waiting some idle_frames too. Using queue_free(), there will not be a crash but the node cannot be added back to the tree (because it's not in memory anymore).

Workaround hacks:

  1. My workaround now is to hold the original node in memory, and when I want to insert it, I duplicate the node and insert that duplicate - upon "disabling" the node, I call queue_free() on the duplicate. As is clear, this is not optimal in any sense (especially if a very complex node with a lot of kids), and it also brings other problems (eg. with AnimationPlayers inside the duplicated node - at least autoplay won't trigger, didn't check what the problem exactly is yet).
  2. I know one other kind of workaround is to save the original node as a PackedScene and then just load and instance it again, but it is not the same as remove_child + add_child (which holds all the node's properties in memory). More importantly, this method also adds other complications - such as instead of saving some scene (eg. some place in a world) should make a "export mini scenes" functionality (eg. a dock tool), and handle all its special cases (tried it initially), as well as keeping book of all the miniscenes after renaming the main scene etc.

Suggestion:
I think either of the below two suggestions would fix this "problem":

  1. A queue_remove_child() or similar method - would function like queue_free() but only removes from the tree, not from memory.
  2. A disable_node() method, which would stop the scripts and processes etc. from running - or at least so, that remove_child could be called without crash after this. ... Or maybe a call_queued() method..? (I know "disabling node" has been discussed - but didn't found a solution in the discussions that would work in my case.)
@koodikulma-fi koodikulma-fi changed the title Feature request: queue_remove_child method (or alternatively a disable_node method) Feature request: queue_remove_child method (because remove_child can cause a crash) Oct 1, 2018
@KellyThomas
Copy link
Contributor

KellyThomas commented Oct 1, 2018

Have you tried removing it on SceneTree's idle_frame signal. That should be issued between _process() cycles.
http://docs.godotengine.org/en/3.0/classes/class_scenetree.html#signals

@koodikulma-fi
Copy link
Author

Thanks for the suggestion. If you mean doing yield(get_tree(), "idle_frame"), then yes. Tried it along with using call_deferred. If I remember correctly the crashes did not happen that often, but they still happened.

@ghost
Copy link

ghost commented Oct 1, 2018

Do you have an example project of the crash? Removing children shouldn't cause a crash in most cases unless other systems require it as a dependency.

@KellyThomas
Copy link
Contributor

Are there any messages when it crashes?

@koodikulma-fi
Copy link
Author

koodikulma-fi commented Oct 1, 2018

I'm so sorry, I have to rebuild that situation to do further checks. Will be back later with more info. Thanks!
(I did the system about two weeks ago, but now was getting annoyed by the further problems (like the AnimationPlayer thing), so came back to it now.) ... was using Godot 3.1 alpha.

@koodikulma-fi
Copy link
Author

koodikulma-fi commented Oct 1, 2018

Okay. I don't have a project yet to share (have to separate the case from the other systems). But here is some error reports (quite new to Godot, please bear that in mind) :

  • Output tab gives me: drivers/windows/stream_peer_tcp_winsock.cpp:203 - Server disconnected! when the crash happens - ie. after calling remove_child.

  • Then Debugger's Errors tab gives twice: Condition ' !area_in && !E ' is true. With further info:

C Error: Condition ' !area_in && !E ' is true.
C Source: scene/2d/area_2d.cpp:262
C Function: _area_inout

... So possibly some Area2D node has problems when inside a child to be removed..? The child is a relatively complex scene (~ a small place in a 2D world).

Edit: If I remove all the nested nodes individually (recursively), I do not get the Debugger's part of the error - only the "Server disconnected!" error and the crash.

@koodikulma-fi
Copy link
Author

koodikulma-fi commented Oct 1, 2018

And rechecked that "idle_frame" thing. So, was like I said: the crashes happen less frequently when using idle_frame signal, but they still happen (about 50% of the time). Without idle_frame happens every time.

... Just thought that maybe this is related to Area2D still having an area entered inside.. ? Because an Area2D area_entered signal is the trigger for the whole process, and when the switching happens the character is still inside this area (due to game logic). ... Will try some checks related to this now.
Edit 2: No it is not related to this. But it is related to physics, potentially using TileMaps (or maybe just any collider). Will continue exploration.

Edit: Sorry about to close/open thing and unfinished message. The browser was being funky.

@ghost
Copy link

ghost commented Oct 1, 2018

Sounds like the game is looking for a collision check with a Node that isn't there.

@koodikulma-fi
Copy link
Author

Okay, thanks. Any ideas what to do about it? At this point, I'm not sure whether this is a bug or my misuse of the engine (tho queue_free works without any problems). Trying to rebuild a simplified scene, but so far no errors.

@koodikulma-fi
Copy link
Author

Did some further tests:

  • The issue happens only with TileMaps. So when removing a node that is a TileMap or has a TileMap nested within. It does not happen with, say, a StaticBody2D.
  • Also, it does not happen if I put collision_layer and collision_mask to 0 and then wait an idle_frame. (This is perfectly fine for my case - just have to keep track of what the layers were originally.)

However, in conclusion, I think this is a bug, and it's related to TileMap class and remove_child() method. The bug is that when removing a TileMap (or its parent) from the tree, the internal physics are still being processed (although not always), which causes a crash. To stop processing must manually nullify collision layers (see above) and then wait until the next frame. Changed the title to reflect this, and will try to make a simplified scene to reproduce the problem later. Thanks all for help!

@koodikulma-fi koodikulma-fi changed the title Feature request: queue_remove_child method (because remove_child can cause a crash) When removing a TileMap node (or its parent), remove_child method sometimes causes a crash. Oct 1, 2018
@Piet-G
Copy link
Contributor

Piet-G commented Oct 2, 2018

Could this be an issue with the physicsserver not updating emmediately when the node is removed from the tree?

For example the Physicsserver doesn't update emmediately when setting the position of a node #20107

@koodikulma-fi
Copy link
Author

koodikulma-fi commented Oct 2, 2018

Thanks for the comment. I initially tried waiting several idle_frames (and tried a timer too), because thought it would be something like this, but made no difference. (One idle_frame makes a difference tho, see above.) Edit Sorry, misread your answer. You probably didnt' mean this. ... As far as I know, It could be that the physics server is being lazy - however, I have no idea what to do about it.

More info:
While modifying the system yesterday, found out that after I put the collision masks to 0, if I then do not put them back to something else (when readding the nodes), there will be another crash when removing another tilemap whose collision layer was at 0... In other words, perhaps it has something to do with removing tilemaps that have overlapping colliders (on same collision layers). .. I'l try to do further tests later.

Also found out that the second error (that Area2D thing) is perhaps not related. After switching from queue_free to remove_child approach, I now get that error all the time (no crash). Found this q&a saying it's probably a harmless error.
<--- Also found this closed issue about removing overlapping Area2D's. ... So now thinking, maybe the issue is similar to that.

@hasahmed
Copy link
Contributor

I have the same issue. It happens when a a kinematicbody2d with a collision area is overlapping another area and then I try to remove it from there scene with get_tree().current_scene.remove_child(kinematicbody2d_node). It doesn't crash my game, just prints an error. My current hack-around is to move the object inside a _physics_process so that it isn't overlapping the area and only then remove it from the scene. Its quite a hack and is quite annoying to deal with.

@KoBeWi
Copy link
Member

KoBeWi commented Jul 29, 2020

Can anyone still reproduce this bug in Godot 3.2.3 rc2 or any later release?

If yes, please ensure that an up-to-date Minimal Reproduction Project (MRP) is included in this report (a MRP is a zipped Godot project with the minimal elements necessary to reliably trigger the bug). You can upload ZIP files in an issue comment with a drag and drop.

@KoBeWi
Copy link
Member

KoBeWi commented Sep 28, 2020

Closing due to lack of update. Please comment if you can still reproduce this issue in the latest stable version of Godot.

@KoBeWi KoBeWi closed this as completed Sep 28, 2020
@EricWRogers
Copy link

I am running into this problems on version Godot v3.2.3. My use case is very similar my player is a kinematicbody2d and on the body_entered of my area2d pickup I am trying to remove_child which causes a crash.

@KoBeWi
Copy link
Member

KoBeWi commented Dec 15, 2020

@EricWRogers Can you provide a minimal reproduction project for the issue?

@theatischbein
Copy link

theatischbein commented Jun 2, 2021

I can reprocude the error.

$> /opt/godot/godot --version
3.2.3.stable.official

$> lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 21.04
Release:	21.04
Codename:	hirsute

$> uname -a
Linux padlizsan 5.11.0-17-generic #18-Ubuntu SMP Thu May 6 20:10:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Also did it successfully in a minimal reproduction project:
bug_remove_child_updated2.zip

Edit: Updated Project with more debug prints.

The Project consists of:

  • a navigation node
    • navigation mesh (baked)
    • mesh instance (ground)
      • static body
        • collision shape
  • Inherent Node 1 from Node
    • MeshInstance
      • Area
        • CollisionShape
  • Inherent Node 2 from Node [Duplicate of Inherent Node 1]
    • MeshInstance
      • Area
        • CollisionShape
  • Camera
  • Player
    • Kinematic Body
      • MeshInstance
      • CollisionShape

I'm unsure here if the navigation node is needed and was just rebuilding it from my project where I encounter this error.

The player will move_and_slides along the x-axis and will collide with 2 inherent Node. First one will float above the ground, while the second overlaps the ground.
The nodes' area event bodyEntered will emits a signal on _on_Area_Body_entered to the player, which will remove the child from the root scene with get_node("/root/Main").remove_child(body).

The player passes the first one and the Object disappears. When hitting the last one the application crashes. Even former debug prints will be obmitted.

bug_remove_child

Please tell me, if and how I can provide more detailed logs, if needed :)

@theatischbein
Copy link

theatischbein commented Jun 2, 2021

I guess it is caued by wrong usage due to area entered/exited events and using remove_child, but I think the application shouldn't crash.

In the debugger I got this error, which lead me to #21459 and https://godotengine.org/qa/1754/how-to-change-the-parent-of-a-node-from-gdscript suggestion to use call_deferred in this situation.

E 0:00:01.951   _clear_monitoring: This function can't be used during the in/out signal.
  <C++ Error>   Condition "locked" is true.
  <C++ Source>  scene/3d/area.cpp:221 @ _clear_monitoring()
  <Stack Trace> Player.gd:11 @ _on_Node_entered()
                Node.gd:15 @ _on_Area_body_entered()

Do I understand it correctly that it crash because my second Node is still waiting for the ground to exit again and locked ?

Can't we always call remove_child internal deferred by adding here

godot/scene/main/node.cpp

Lines 1286 to 1294 in 4e52b84

void Node::remove_child(Node *p_child) {
ERR_FAIL_NULL(p_child);
ERR_FAIL_COND_MSG(data.blocked > 0, "Parent node is busy setting up children, remove_node() failed. Consider using call_deferred(\"remove_child\", child) instead.");
int child_count = data.children.size();
Node **children = data.children.ptrw();
int idx = -1;
something like:

~/godot-src$ git diff
diff --git a/scene/main/node.cpp b/scene/main/node.cpp
index dc7057f339..0ad89d07da 100644
--- a/scene/main/node.cpp
+++ b/scene/main/node.cpp
@@ -1285,6 +1285,10 @@ void Node::_propagate_validate_owner() {
 }
 
 void Node::remove_child(Node *p_child) {
+       call_deferred("_remove_child", p_child);
+}
+
+void Node::_remove_child(Node *p_child) {
        ERR_FAIL_NULL(p_child);
        ERR_FAIL_COND_MSG(data.blocked > 0, "Parent node is busy setting up children, remove_node() failed. Consider using call_deferred(\"remove_child\", child) instead.");
 
diff --git a/scene/main/node.h b/scene/main/node.h
index 6ca2317d9e..2b067cecf1 100644
--- a/scene/main/node.h
+++ b/scene/main/node.h
@@ -178,6 +178,8 @@ private:
        TypedArray<Node> _get_children() const;
        Array _get_groups() const;
 
+       void _remove_child(Node *p_child);
+
        Variant _rpc_bind(const Variant **p_args, int p_argcount, Callable::CallError &r_error);
        Variant _rpc_unreliable_bind(const Variant **p_args, int p_argcount, Callable::CallError &r_error);
        Variant _rpc_id_bind(const Variant **p_args, int p_argcount, Callable::CallError &r_error);

Edit: Unfortunately it crashes the editor. Any idea if the concept is reasonable ?

@theatischbein
Copy link

theatischbein commented Jun 3, 2021

Bug seems to be closed in version 4e52b84

Compiled it with my system described #22589 (comment) and updated the project's script to newer version:
bug_remove_child_latest.zip

If anyone can confirm this, I guess the bug and be closed again ?

@sszigeti
Copy link

sszigeti commented Dec 4, 2021

I've just seen this very bug with v3.4.stable.official [206ba70]

E 0:00:24.569   remove_child: Parent node is busy setting up children, remove_node() failed. Consider using call_deferred("remove_child", child) instead.
  <C++ Error>   Condition "data.blocked > 0" is true.
  <C++ Source>  scene/main/node.cpp:1183 @ remove_child()

Interesting right after that I had its opposite too:

E 0:00:24.588   add_child: Parent node is busy setting up children, add_node() failed. Consider using call_deferred("add_child", child) instead.
  <C++ Error>   Condition "data.blocked > 0" is true.
  <C++ Source>  scene/main/node.cpp:1135 @ add_child()

In my case it has nothing to do with collisions or physics. I'm working on game that has a generic purpose dialog scene, using Control-derived nodes only. When the errors happened, the dialog content was updated, which first clears any old content (which can be a list of custom scenes, or just a single RichTextLabel), then adds the new content (again, this can be a list of custom scenes or a RichTextLabel).
This has happened when I was clicking a button very quickly in a "shop" selling items. But I don't think I should be that fast.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants