-
-
Notifications
You must be signed in to change notification settings - Fork 254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
What optimizations are used in this module? #150
Comments
The implementation in this module is more or less fancy mesh batching, with some extra effort to still cull faces of non-cube shapes. There is no greedy meshing, it really is close to Minecraft in that sense. Fast to generate and simple code, working for any kind of shape (but obviously working best with cubes).
That plays a big role I think. Godot's culling system gets very slow with a lot of chunks. 16x16x16 is a sweet spot between culling performance and edition speed. 32x32x32 is probably a bit faster if you don't mind slower chunk updates. There is still room for better culling but Godot doesn't have an API to implement such techniques without modifying VisualServer.
This module offers two alternatives for collision with blocky terrain:
There are several simple things this module does to optimize memory. Most of them can be done in GDScript too.
C++ makes things really fast otherwise. I can line up the data efficiently for the CPU to crunch more quickly than GDScript, with less overhead and fewer allocations. Besides, I use 2 threads at once to produce visual meshes, and a third thread to generate voxel data. The rest (mesh instances and colliders) being on the main thread due to Godot not supporting these things being done in a thread. |
Thank you for such a detailed answer!! I already saw that bug report about Bullet collision mesh being slow to generate. But after it's generated, is it slower to check collision against, compared to calculating collisions manually in GDScript instead of using Bullet? I understood that using VisualServer directly allows us to add a mesh to the scene without creating a Node (MeshInstance). That will save some memory for a Node object, but does having less Nodes also increase performance? Is this the only difference (other than batching) between using VisualServer and "doing it the normal way"? And finally, "fancy mesh batching". I know this is the most significant factor for a huge FPS increase, but I could not find any information at all about how it works with Godot. Is there anywhere I could read about it, or could you explain how did you do it? I do not see any way to "group" meshes by material to indicate they should be batched. I would assume they are drawn in the same order they were created, but there is no way to control this order on chunk rebuild. I don't suppose all of them are automatically split into surfaces and then all those surfaces are grouped by material?.. Are they even drawn in one call, or simply drawn without changing the state? What "fancy" things did you have to do other than simply create meshes via VisualServer instead of MeshInstances? |
That has nothing to do with GDScript. Once a collider is generated, its efficiency only depends on the physics engine (assuming the geometry is the same).
Less objects to manage leads to more performance because they dont need to be created, enter tree, leave tree or be freed, but it mostly reduces memory usage.
I don't know about Godot specifically, but I've been using that technique several times for a while in various engines. It's notably described here: https://0fps.net/2012/01/14/an-analysis-of-minecraft-like-engines/
Godot doesn't do batching on its own. The fact I combine meshes into one or more surfaces into an
Order doesn't matter, I don't know why you'd need this.
If a chunk has 2 materials (i.e opaque and transparent), then it will have two surfaces. That happens usually when voxels composing it have geometry including such materials (opaque voxels go into first vertex array, transparent voxels go into the second). |
I mean, is Bullet collision detection slow compared to a custom GDScript implementation of voxel-based collision? (Something simple like "push the player out if there is a voxel here".) About batching, I thought you were further optimizing it in some way, because you're up to ten times faster than Voxelverse, which is supposed to do the same thing. But upon closer examination, seems like that's not what it's doing, so I'll examine it more closely. Thank you again! |
|
What about occlusion culling, I don't think it's implemented, right? I imagine in a voxel game like Minecraft/Minetest it would sometimes significantly boost FPS depending on the player's location. Just wandering around the world with a lot mountains or exploring caves would result in most of the surrounding world not needing to be rendered at all (especially with high render distances). Although probably at the cost of slightly reduced FPS if you're seeing most of the world anyways and of course higher memory usage due to all the overhead. If it isn't implemented right now, I'm sure you have an important reason why that is. This might have been already mentioned in other issues, but I couldn't find anything. |
There is no occlusion culling. It's a relatively recent feature in Godot, and custom solutions such as BFS cave culling are a bit annoying to do because Godot doesn't expose a way to register custom culling systems (down to having to set mesh instances visible/not visible, which is more costly). See also #19 |
Yes, I think it was added with Godot 4.0 AFAIK. I noticed that the Maybe it's even possible to generate an occluder for each voxel depending on its visual mesh and transparency and then combine these array occluders to a single occluder instance similar to how it is with the physics engine collision (?). Like I said, easier said than done and I have no clue how exactly meshing works in this module, but that'd be my naive approach and I don't want to open up an entirely new discussion on that. |
That's a very bad idea considering the sheer amount of geometry this represents. Godot uses a CPU occlusion culling technique, so it will perform badly if given complex shapes.
Maybe. The |
That's a good point, I totally forgot that it needs baking. What a shame. |
Are there any plans to implement an LOD system like in VoxelLodTerrain for blocky meshers? Chunked LODs for blocky voxels would be another great optimizations especially at the larger view distances at which that would make sense. |
I'd like to have that at some point, I have done some prototypes, but for now there is nothing ready yet. It's significantly more complicated to setup in a way that looks good, compared to smooth LOD |
I am trying to figure out why is this module so much faster than a "normal" GDScript implementation that does the same thing with SurtaceTool or ArrayMesh. For example, this:
https://github.com/orosmatthew/Voxelverse
(Here I'm talking about the "blocky" terrain only.)
I understand that GDScript is slow, so the code itself will run faster - including terrain generation. But once everything is generated, there is supposedly the same thing! And yet terrain generated with this module is running at least twice as fast as Voxelverse, even at a higher draw distance!
Apparently, there is a number of optimizations; now I wonder what exactly, and whether those are possible to pull off in GDScript. (I can read GDScript, but I'm having trouble finding my way around C++ sources here.)
Both implementations draw only visible faces and do not draw faces in between blocks, naturally.
If I'm reading correctly, Voxelverse creates one mesh per 8x8x8 chunk, with only one surface for the whole mesh. It does not seem to use any "greedy meshing", so it's two polygons per each block face. Could it be that the gigantic performance boost is only due to "greedy meshing"? I find that hard to believe, especially seeing that the terrain generated in the demo for this module is all mountains - there are barely any flat faces more than one block in size.
According to godotengine/godot#7844, "mesh batching" is also used - which is the only other thing I can think of. But is it not the same as merging faces into a single surface per chunk? Maybe this module merges all of them into one mesh? But then does it rebuild the whole gigantic mesh in a split second every time a block is added or removed? Or maybe the chunk size is much larger, so there are fewer meshes?
Or maybe it's due to collision? As far as I can see, Voxelverse generates a collision mesh that is the same as the rendered mesh. And I remember that collisions are not very fast, so there is probably room for optimization. But I don't see the demo doing anything special in terms of a rewritten collision system, so it's the "normal" Godot collision as well, isn't it?
By the way, it also uses significantly less memory. What, does it use some kind of magic mesh that is not a normal Godot mesh, and therefore impossible to create in "normal" Godot?.. Oh, maybe it doesn't add each chunk as a MeshInstance? But you can't add a separate mesh for each chunk without either having one MeshInstance for each chunk, or rebuilding the whole mesh on every block modification...
The text was updated successfully, but these errors were encountered: