Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instantiate big Tilemap on thread dont work. #87402

Closed
grusad opened this issue Jan 20, 2024 · 23 comments · Fixed by #87478
Closed

Instantiate big Tilemap on thread dont work. #87402

grusad opened this issue Jan 20, 2024 · 23 comments · Fixed by #87478

Comments

@grusad
Copy link

grusad commented Jan 20, 2024

Tested versions

Godot 4.1 and 4.2

System information

Macbook Pro M2

Issue description

When instantiate a big Tilemap on a separate thread and try to return it to the main thread, issues occur. I've encounter various bugs while trying to find a working solution. It works fine if i call Thread::wait_to_finish() on the main thread. But passing the built scene with call_deferred, problems occur.

On the attached project, the game freezes on 4.2. Ive also encountered crashes (mostly on 4.1) as shownhere:

handle_crash: Program crashed with signal 11
Engine version: Godot Engine v4.2.1.stable.official (b09f793f564a6c95dc76acc654b390e68441bd01)
Dumping the backtrace. Please include this when reporting the bug to the project developer.
[1] 1   libsystem_platform.dylib            0x00000001834f2a84 _sigtramp + 56
[2] SpriteFrames::get_frame_duration(StringName const&, int) const
[3] NavigationObstacle2D::get_vertices() const
[4] NavigationObstacle2D::get_vertices() const
[5] NavigationObstacle2D::get_vertices() const
[6] VariantInitializer<Vector<Color>>::init(Variant*)
[7] Object* ClassDB::creator<ENetConnection>()
[8] Object::get_instance_id() const
[9] JSON::get_data() const
[10] VariantInitializer<Vector<Color>>::init(Variant*)
[11] Object* ClassDB::creator<ENetConnection>()
[12] Object::get_instance_id() const
[13] JSON::get_data() const
[14] VariantInitializer<Vector<Color>>::init(Variant*)
[15] Object* ClassDB::creator<ENetConnection>()
[16] MultiplayerAPI::is_server()
[17] AnimationPlayerEditor::unpin()
[18] CallableCustomExtension::default_compare_less(CallableCustom const*, CallableCustom const*)
[19] SceneTree::get_root() const
[20] Node::get_index(bool) const
[21] Node::get_index(bool) const
[22] RendererCompositorRD::_create_current()
[23] void rx::mtl::GetMatrixUniformMetal<unsigned int>(unsigned int, unsigned int*, unsigned int const*, bool)
[24] RendererCompositorRD::_create_current()
[25] 25  dyld                                0x000000018316bf28 start + 2236

What is going on, is this a bug? Or how do i solve it. One must be able to instantiate a bigger scene on a separate thread.

EDIT:

On the current attached minimal project i get the following error multiple times in the log-file:

USER ERROR: Condition "p_elem->_root != this" is true.
   at: remove (./core/templates/self_list.h:80)

It does however work with smaller tilemaps

Steps to reproduce

Create a big TileMap and try to loading it and instantiate it using a Thread.

Minimal reproduction project (MRP)

playground.zip

@grusad
Copy link
Author

grusad commented Jan 21, 2024

Ok, I can now confirm this error is generated based on Tilemap size. I tried out different sizes of the map to see when and where the error message was generated (USER ERROR: Condition "p_elem->_root != this" is true. at: remove (./core/templates/self_list.h:80)).

It is now clear that if I instantiate a Tilemap with the size of 750x500 tiles (roughly), the error is shown and everything breaks.

Lower than that, and the instantiation works fine on a separate Thread.

@bs-mwoerner
Copy link
Contributor

That's a bit weird. I do get spammed with remove: Condition "p_elem->_root != this" is true. <C++ Source> ./core/templates/self_list.h:80 @ remove() upon calling instantiate() on the tilemap scene (~1.3 million cells) but only if I set a breakpoint there and use "Step Over". I don't get an error when I run the project normally or use "Continue" from the breakpoint. Changing basically anything in the function (such as adding a pass after that line) or setting the breakpoint on the line after instantiate() makes the problem go away, so I wonder if that's actually an issue with the debugger.

As for your immediate problems, @grusad:

  • _dont_work_0() doesn't work because the thread upon completion calls a function thread_complete that doesn't exist. Change this to context.call_deferred("thread_complete_1", instance).
  • _dont_work_1() doesn't work because the thread upon completion calls the function thread_complete_2, which requires a parameter. Change this to context.call_deferred("thread_complete_2", instance).

@grusad
Copy link
Author

grusad commented Jan 22, 2024

Silly me, attached wrong version of the project. On the attached one I had done som random tests. Still, the problem that occurs for you is present on my end as well. Im not sure what you mean by setting a breakpoint. Godots dont break on breakpoints inside thread, or am i missing something?

As I mentioned before, the error is thrown only on big tilemaps. Smaller ones works fine. So I'm a bit skeptical it has anything to do with the debugger.

@bs-mwoerner
Copy link
Contributor

Yes, reducing the tilemap size also prevents the error. However, other than this peculiarity when stepping through the code, I can't reproduce any crashes, freezes or other problems on 4.2.1 on Windows.

To me, that stack trace reads like a fever dream. Maybe there's a possibility of memory corruption, at least on Mac, for huge TileMaps? Hard to say without a reliable reproduction.

@AThousandShips
Copy link
Member

This is due to the same issue as:

Looking into a different improvement on this side as well

@grusad
Copy link
Author

grusad commented Jan 23, 2024

To me, that stack trace reads like a fever dream. Maybe there's a possibility of memory corruption, at least on Mac, for huge TileMaps? Hard to say without a reliable reproduction.

I did try it on Linux mint as well, same result.

@grusad
Copy link
Author

grusad commented Jan 23, 2024

This is due to the same issue as:

Looking into a different improvement on this side as well

Interesting. But one could assume it's not unreasonable to load a bigger TileMap on a separate thread, correct?

@AThousandShips
Copy link
Member

It's indeed not unreasonable, it's due to a bug, which I'm currently working on fixing, if you can please try testing:

@grusad
Copy link
Author

grusad commented Jan 25, 2024

It's indeed not unreasonable, it's due to a bug, which I'm currently working on fixing, if you can please try testing:

I was not able to run the build from the artifacts on my M2 for whatever reason, i will try on my linux machine later today.

@alberto-mco
Copy link

I get the same error when i try to load tiles in a separated thread. In my case, i'm making an open world game (procedural map generation), and i have chunks to load the map while player is moving, and sometimes, i have the same error. The tilemap isn't inside the SceneTree, i just have a thread that do the next:

  • duplicate(0) the current tilemap
  • edit the copy adding or removing tiles
  • insert in the scene tree with call_deferred
  • Remove the last one with queue_free.

@grusad
Copy link
Author

grusad commented Feb 5, 2024

I was not able to test the version on linux, sorry for late response. What's the status on this one?

@AThousandShips
Copy link
Member

AThousandShips commented Feb 5, 2024

If you're able please try compiling it on your own, I don't know why it wouldn't work maybe some configuration error

The PR is still in progress, needs some further testing but it does solve the base problem, so would be nice if there can get more testing on it to confirm it

@AThousandShips
Copy link
Member

AThousandShips commented Mar 5, 2024

Unsure if this is a bug as such, instantiating the same resource in two different threads might not be safe to begin with, I'm not sure

Does this happen if you instantiate it once on a thread? Or if you load it only on one thread?

Loading on multiple threads is not safe:

Still, this is only really useful if you have one thread loading data. Attempting to load or create scene chunks from multiple threads may work, but you risk resources (which are only loaded once in Godot) tweaked by the multiple threads, resulting in unexpected behaviors or crashes.

The limitation here is as described above, one must be careful not to load the same resource from multiple threads at once

CC @RandomShaper @reduz

@AThousandShips AThousandShips removed this from the 4.3 milestone Mar 5, 2024
@AThousandShips AThousandShips reopened this Mar 5, 2024
@AThousandShips
Copy link
Member

So I tested your MRP and it is deeply broken and the code doesn't work, I fixed it and once I did there were no crashes, there were some errors relating to instantiating the script though, but this is likely the same thing of thread safety that has been well documented

So no crash, and as long as you load it once on a thread and instantiate it once on a thread, which is the suggested behavior, it seems to work

So I will close this again, if there are problems still occuring when using these types safely please open a new issue for those problems, will a working MRP attatched

@AThousandShips AThousandShips added this to the 4.3 milestone Mar 5, 2024
@alberto-mco
Copy link

I had the same issue with Godot 4.2 (or previous versions), but now i tested my game with Godot v4.3 dev 4, and i can't reproduce the issue. It works perfectly. In my case, the problem is solved with Godot 4.3 dev 4. Nice!

@AThousandShips
Copy link
Member

Then the general threading issue is confirmed solved, thank you for verifying

@grusad
Copy link
Author

grusad commented Mar 12, 2024

playground2.zip

Before closing this issue completely, the attached minimal project do NOT work on my machine. Tested in both 4.2 and 4.3 dev 4 build with no success.

My computer is an Macbook pro M2. Maybe the problem is computer related?

@alberto-mco
Copy link

playground2.zip

Before closing this issue completely, the attached minimal project do NOT work on my machine. Tested in both 4.2 and 4.3 dev 4 build with no success.

My computer is an Macbook pro M2. Maybe the problem is computer related?

I don't know if this can help you, but i tried on Windows 11 with RTX4070 and the output is:

Godot Engine v4.3.dev4.official.df78c0636 - https://godotengine.org
OpenGL API 3.3.0 NVIDIA 551.76 - Compatibility - Using Device: NVIDIA - NVIDIA GeForce RTX 4070 Laptop GPU

<PackedScene#-9223372011319851734>
Node2D:<Node2D#26122126561>
this is not printed
--- Debugging process stopped ---

I don't see any crash, and i tried at least 10 times.

@grusad
Copy link
Author

grusad commented Mar 12, 2024

That helps quite a lot. I think the issue is narrowed down quite a bit. It's not the first time I face issues with my Apple silicon chip.

@grusad
Copy link
Author

grusad commented Mar 12, 2024

Worked on my M1, i'm loss for words..

@akien-mga
Copy link
Member

@grusad Are you able to compile Godot from the master branch on the M2 with scons dev_build=yes (to have debug symbols), and see what stacktrace you get if/when it crashes?

@grusad
Copy link
Author

grusad commented Mar 14, 2024

I made it work on my M2 as well. I found that Godot had generated 250GB of files somewhere (probobly left the crashed program over night or something). I clean up my computer, and after that the branch 4.3 works like a charm and i can finally instantiate nodes on a thread!

@AThousandShips
Copy link
Member

Great to hear it working! And if any new issues arise don't hesitate to open a new issue and tag me, I don't expect the change to cause any issues but having people use it will help identify any issues that might still exist

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants