FPS drops considerably if there are a large number of Skeleton3Ds #99194

Open
Miurg opened this issue Nov 13, 2024 · 15 comments

Comments

@Miurg

Miurg commented Nov 13, 2024

Tested versions

Reproducible in: v4.3.stable.official [77dcf97]

System information

Godot v4.3.stable - Windows 10.0.19045 - Vulkan (Forward+) - dedicated AMD Radeon RX 7700 XT (Advanced Micro Devices, Inc.; 32.0.12019.1028) - AMD Ryzen 7 7700 8-Core Processor (16 Threads)

Issue description

I wanted to create an RTS game similar to BFME, but I ran into the problem that the engine performs very poorly with many units that each have a skeleton and skeleton-based animation. Using only the simplest model with the simplest animation, I already get 100 FPS with 1k units and below 10 FPS with 10k units, while the GPU and CPU are barely utilized. I can't use RenderingServer because I don't see skeletal animation implemented there in any way, and the documentation seems completely unfinished at this point.

Steps to reproduce

  1. Run scene.

Minimal reproduction project (MRP)

reproduction.zip

@Zireael07
Contributor

I do not think ANY game engine out there can handle 10k animated skeletons. Godot has had a big improvement when it comes to skeletons and performance.

Use LODs. Switch off animations on mobs outside the view. Switch off animations on mobs outside a certain range (too far away for you to see even if they're technically in view).

@mrjustaguy
Contributor

mrjustaguy commented Nov 13, 2024

Actually, only rendered skeletons tank performance. The cost scales directly with the number of mesh surfaces that have a skeleton, and with how many times they're rendered (so up to 5 times with a 4-split directional shadow plus the main view).

This isn't the first time this has popped up, and it did seem to be CPU bottlenecked the last time I looked into it, as a single thread gets hammered like crazy while the others are asleep.

Switching off animations will not resolve this. You'd have to yank the skeleton out of mobs that are being rendered by shadows and/or the camera for performance to improve, and this is not good.

See #93568

@fire
Member

fire commented Nov 14, 2024

I might look at this. As far as I know, Unreal Engine struggles in the 100-500 "skeletons" range, and we're trying to go 100x that.

I have not played with their ECS system in a long time, so things are probably completely different now.

@Capewearer

> I do not think ANY game engine out there can handle 10k animated skeletons. Godot has had a big improvement when it comes to skeletons and performance.
>
> Use LODs. Switch off animations on mobs outside the view. Switch off animations on mobs outside a certain range (too far away for you to see even if they're technically in view).

Take a look at the Serious Sam series, Total War, and musou games. You'll be surprised.

@Zireael07
Contributor

Those are bespoke engines, not the general-purpose game engines I was referring to.

@Capewearer

Anyway, the techniques they use are fully implementable in Godot. E.g. modern Total War uses GPU skinning, and Serious Sam 4 uses impostor rendering for large crowds. The latter is definitely applicable to Godot.

@fire
Member

fire commented Nov 14, 2024

We discussed this in animation meetings, but reviewing existing game technology reports like https://www.remedygames.com/article/how-northlight-makes-alan-wake-2-shine, we found that it would require a GPU-driven animation system on top of our existing GPU mesh skinning system.

As far as I know, we have yet to develop technologies for GPU-driven animations.

Impostors are approved for implementation in Godot Engine; feel free to submit work. https://github.com/zhangjt93/godot-imposter

VAT3 is also possible: https://github.com/G4ND44/Godot_VAT3

@Zireael07
Contributor

> Serious Sam 4 uses impostor rendering for large crowds

If you have impostors, they're no longer skeletons afaik so this doesn't fit your claim that said engine can render 10k animated SKELETONS

@Capewearer

Capewearer commented Nov 14, 2024

> > Serious Sam 4 uses impostor rendering for large crowds
>
> If you have impostors, they're no longer skeletons afaik so this doesn't fit your claim that said engine can render 10k animated SKELETONS

Because it handles not 10k but 100k entities. Of course it's all smoke and mirrors, but the order of magnitude is still higher than 10k. That's why animated impostors are used for more distant enemies. Anyway, 10k skeletons is still feasible in such an engine; otherwise the official 10x enemy multipliers would crash the game in the most intensive fights.

@smix8
Contributor

smix8 commented Nov 14, 2024

> Take a look at the Serious Sam series, Total War, and musou games. You'll be surprised.

The only thing that would surprise me here is if people really thought that any of those games use 100+ fully skinned skeleton characters that are updated every frame.

It is very obvious, even to untrained eyes, that all those games use a mix of LOD for skeleton, animation, and skinned mesh, as well as swapping everything out for animated sprites and vertex animation at a distance. Everything outside the camera focus or at a distance is updated at a very low FPS, not every frame.

Especially in the Total War series, they are not even trying to hide the LOD switch or the low animation FPS. It happens in plain sight when you move the camera back and forth over the focus bubble / distance threshold. Set your hardware settings to low quality and you will see the very aggressive LOD and sprite switches up close, and how all those things animate like a 5 fps flipbook animation.

Undeniably there are things that can be optimized in Godot's skinned mesh animation system, but especially for an RTS you also need to bring a lot of your own accumulated knowledge to the table on how to optimize for this kind of game if you want thousands of actors active.

@mrjustaguy
Contributor

I believe there is actually an example of a game that handles 100+ fully skinned skeleton characters that are updated every frame just fine: ARK Survival Evolved, an open-world dino survival game. I've seen some really dense dino farms there, easily packing a hundred raptors alone.

Do note, however, that the game is infamous for poor graphics optimization, and its remaster, ARK Survival Ascended, which moved to UE5, especially so. Yet if you dial the graphics settings down, they run fine from a CPU perspective even in such animation-intense scenarios.

There are even options in the original to simplify distant animations and use lower quality animations, but they basically don't change things much.

Multithreading animations in Godot would help massively. One of the previous threads about Godot's poor skeleton performance included a comparison to UE and Flax: the UE version used suffered in the same way as Godot, but Flax didn't, which was attributed to the multithreading support of the Flax engine. Godot fared by far the worst. See #90943

@Zireael07
Contributor

> I believe there is actually an example of a game that handles 100+

You're off by an order of magnitude or two. OP wants 10k or 100k, not a hundred.

@mrjustaguy
Contributor

mrjustaguy commented Nov 14, 2024

I could see 100k running with animation being multithreaded, rendering only for the main camera, shadows replaced by blob shadows (as real shadows would easily add another 1-4 render passes), and every skeleton having only one surface.

In my MRP I've got about 2k skeletons running at 30 fps on an i3 10105f using just one thread, and IIRC there are multiple surfaces per skeleton in the MRP, with the sun shadow rendering them all an additional 4 times.

Edit:
Retested my old MRP on 4.4 dev 4, and it's 2160 Skeletons, each with 6 surfaces, about 50 bones each, with Directional Shadows enabled, with each surface casting a shadow... running at 30 fps on a single thread of an i3 10105f, with the other threads practically idling.

2160 × 6 is 12,960 mesh surfaces; times 5 render passes, that is up to 64,800 skeletal mesh surfaces rendered each frame, which is right in the middle of the OP's range.
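The arithmetic above can be checked with a short sketch. The variable names are illustrative only, and the pass count assumes the stated upper bound of one main view plus one pass per split of a 4-split directional shadow.

```python
# Figures taken from the comment above; names are illustrative.
skeletons = 2160
surfaces_per_skeleton = 6
shadow_splits = 4                  # 4-split directional shadow
passes = 1 + shadow_splits         # main view plus one pass per shadow split (upper bound)

surfaces = skeletons * surfaces_per_skeleton
max_rendered_per_frame = surfaces * passes

print(surfaces)                # 12960
print(max_rendered_per_frame)  # 64800
```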

Do note, however, that there is no other logic running on the main thread that would hurt performance further.

@TokageItLab
Member

TokageItLab commented Nov 15, 2024

As I remember, the biggest Skeleton bottleneck in Godot 3 was that every global pose change made a skin change request to the render server, and the thread would lock up until the render server had finished its calculations. See also #42681.

This already showed up with several dozen bones (in practice several hundred requests, since bone deformations were recursive at the time), and it caused a significant FPS drop even at that point.

The rendering server accessing bottleneck "in one Skeleton" was solved with #51368, the prevention of unnecessary skin deformation was solved with #87888, and the recursion problem was solved with #97538.

So now the skin deformation request is done only once per skeleton. However, when the number of Skeletons exceeds 100, this rendering server accessing bottleneck will inevitably come up again. In other words, the Rendering Server access bottleneck has been solved per Skeleton, but not per Scene.

As I discussed with reduz yesterday, we think that stacking these skin change requests so that the render server can handle them all at once would be quite an improvement. In any case, we need to profile the case with and without skins first.
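A minimal sketch of the stacking idea described above, in plain Python rather than engine code. `FakeRenderServer`, `SkinUpdateQueue`, and all of their methods are hypothetical stand-ins invented for illustration; they are not Godot's RenderingServer API, and the `sync_points` counter only models "how many times the main thread has to wait on the render server".

```python
class FakeRenderServer:
    """Hypothetical stand-in for a render server; not Godot's API."""

    def __init__(self):
        self.sync_points = 0   # how often the caller had to synchronize
        self.skins = {}

    def set_skin(self, skeleton_id, bone_transforms):
        # Immediate path: one synchronization per skeleton.
        self.sync_points += 1
        self.skins[skeleton_id] = bone_transforms

    def set_skins_batched(self, updates):
        # Batched path: one synchronization for the whole scene.
        self.sync_points += 1
        self.skins.update(updates)


class SkinUpdateQueue:
    """Hypothetical per-frame queue that stacks skin change requests."""

    def __init__(self, server):
        self.server = server
        self.pending = {}

    def request(self, skeleton_id, bone_transforms):
        # A later request for the same skeleton replaces the earlier one.
        self.pending[skeleton_id] = bone_transforms

    def flush(self):
        # Hand everything to the server in a single call, once per frame.
        if self.pending:
            self.server.set_skins_batched(self.pending)
            self.pending = {}


# 2160 skeletons with 50 bones each, as in the MRP figures quoted earlier.
poses = {sid: [0.0] * 50 for sid in range(2160)}

immediate = FakeRenderServer()
for sid, pose in poses.items():
    immediate.set_skin(sid, pose)

batched = FakeRenderServer()
queue = SkinUpdateQueue(batched)
for sid, pose in poses.items():
    queue.request(sid, pose)
queue.flush()

print(immediate.sync_points, batched.sync_points)  # 2160 1
```

Whether the real gain is anywhere near this ratio depends on what a synchronization actually costs in the engine, which is why profiling with and without skins comes first.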

TokageItLab changed the title from "Absolutely terrible performance when using skeletons and skeletal animation" to "FPS drops considerably if there are a large number of Skeleton3Ds" on Nov 15, 2024
@Nazarwadim
Contributor

For those who want to have thousands of simple animated skeletons in a scene, you can bake the animation. After that, turn on skeleton animation for nearby objects, and use baked LOD animation at a low FPS for distant ones.
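The near/far split described above could be sketched roughly as follows, in plain Python rather than engine code. The thresholds, the `Unit` type, and the function names are all assumptions made for illustration; none of them come from the thread or from Godot.

```python
from dataclasses import dataclass

# Illustrative assumptions, not values from the thread.
NEAR_RADIUS = 30.0           # within this distance: full skeletal animation
FAR_RADIUS = 150.0           # beyond this distance: no animation updates at all
BAKED_EVERY_N_FRAMES = 12    # baked animation tick; 60 fps / 12 = 5 updates per second


@dataclass
class Unit:
    distance: float   # distance from the camera
    visible: bool     # inside the view frustum


def pick_mode(unit):
    """Choose how a unit is animated this frame."""
    if not unit.visible or unit.distance > FAR_RADIUS:
        return "off"
    if unit.distance <= NEAR_RADIUS:
        return "skeleton"
    return "baked"


def should_update(unit, frame_index):
    """Return True if the unit's animation advances on this frame."""
    mode = pick_mode(unit)
    if mode == "off":
        return False
    if mode == "skeleton":
        return True
    return frame_index % BAKED_EVERY_N_FRAMES == 0


units = {
    "near": Unit(distance=10.0, visible=True),
    "mid": Unit(distance=80.0, visible=True),
    "far": Unit(distance=500.0, visible=True),
    "hidden": Unit(distance=10.0, visible=False),
}

# Count animation updates over 60 simulated frames.
updates = {name: sum(should_update(u, f) for f in range(60)) for name, u in units.items()}
print(updates)  # {'near': 60, 'mid': 5, 'far': 0, 'hidden': 0}
```

A frame counter is used instead of accumulated time so the tick rate stays deterministic in the sketch; a real project would more likely key this off elapsed time.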

output.mp4
