Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[3.x] Rendering thread synchronization in Particles (GLES3) #60625

Open
m4nu3lf opened this issue Apr 29, 2022 · 5 comments
Open

[3.x] Rendering thread synchronization in Particles (GLES3) #60625

m4nu3lf opened this issue Apr 29, 2022 · 5 comments

Comments

@m4nu3lf
Copy link
Contributor

m4nu3lf commented Apr 29, 2022

Godot version

d62166f

System information

OS: Arch Linux, CPU: Ryzen 5, GPU: RX 6600

Issue description

Reading the "is_emitting" property in Particles nodes causes synchronisation of the main thread with the rendering thread.
The property is read internally at each frame for one-shot particles, always causing synchronisation.
In projects where both threads are under heavy load (like in mine), this will cause significant performance degradation.

The solution should be to cache the "is_emitting" property in the Particles object.

Steps to reproduce

  • Create a Particle node in a scene with emitting = false.
  • Set it to "one-shot".
  • Add enough load for both the rendering and the main thread.
  • Add a script to activate the particle system after a delay.
  • Run the scene and see the performance tank when the particle system is emitting.

Minimal reproduction project

particles_performace_test.zip

@TheDuriel
Copy link
Contributor

TheDuriel commented Apr 29, 2022

Can not confirm on Godot 3.4 release, Windows 10 latest, RTX2070S, I9-9900KF. Godot either completely locks up due to how high I need to set the loop value (gdscript incapable of using up hardware resources before the engine dies), or dies from running out of memory because of the absurd particle count I have to set to notice any kind of framerate dip. I can also observe no difference with Vsync disabled. With both versions maintaining identical frame timings at the same settings.

@Calinou Calinou added this to the 3.x milestone Apr 29, 2022
@Calinou Calinou changed the title [3.x] Rendering thread synchronization in Particles [3.x] Rendering thread synchronization in Particles (GLES3) Apr 29, 2022
@clayjohn
Copy link
Member

clayjohn commented May 1, 2022

Running the MRP with 3.5 beta 4 and I can't reproduce. I ran both the GPUParticles scene and the CPUParticles scene and had identical performance between both.

Pop OS 22.04, Intel i7-1165G7, Mesa Intel Xe Graphics

That being said, reading emitting does make a call into the rendering backend to check whether the particle is finished emitting or not. This will cause a synchronization between CPU threads (if the check is called from a separate thread).

The difficulty with caching the state of emitting is that the processing of particles is handled in the renderer itself. In order to obtain the accurate state, the Particles node would have to do poll the renderer every frame to check it on your behalf. Resulting in the exact situation that you have artificially created here.

I think a more sustainable solution is to avoid a situation where you are polling the rendering thread every single frame. Another solution is to have a signal that emits once the particles are finished (this is described in godotengine/godot-proposals#649)

@m4nu3lf
Copy link
Contributor Author

m4nu3lf commented May 1, 2022

Running the MRP with 3.5 beta 4 and I can't reproduce. I ran both the GPUParticles scene and the CPUParticles scene and had identical performance between both.

Pop OS 22.04, Intel i7-1165G7, Mesa Intel Xe Graphics

That being said, reading emitting does make a call into the rendering backend to check whether the particle is finished emitting or not. This will cause a synchronization between CPU threads (if the check is called from a separate thread).

The difficulty with caching the state of emitting is that the processing of particles is handled in the renderer itself. In order to obtain the accurate state, the Particles node would have to do poll the renderer every frame to check it on your behalf. Resulting in the exact situation that you have artificially created here.

I think a more sustainable solution is to avoid a situation where you are polling the rendering thread every single frame. Another solution is to have a signal that emits once the particles are finished (this is described in godotengine/godot-proposals#649)

The performance of the MRP is very machine-dependent. I can definitely see the effect of reading emitting on my machine in both the MRP and the game I'm working on. Avoiding reading emitting every frame will mitigate the issue. I think the signal solution is the best one, though.

@lawnjelly
Copy link
Member

I guess the reason why this has historically been difficult is probably because of the separation between the client side scene tree code and the VisualServer, and the synchronization, because the relationship is usually one way from client -> VisualServer, as reading back can potentially cause stalls.

Although it seems a bit strange, a timer is probably the most simple way of doing this and keeping it all scene tree side, if a formula like that suggested here works:
godotengine/godot-proposals#649 (comment)

There is also a callback mechanism which I put in for portals which could be re-used for sending a signal from the VisualServer if anyone attempts to implement this (and such a proposal was approved). servers/visual_server_callbacks.h contains the relevant stuff, you would just add an extra callback type, and pass the ObjectID so it could be relayed to the relevant scene tree object just after the next frame idle. This avoids the stalling problem in exchange for a potential delay before the signal is received, and has locking for getting the callback queue back to the client side.

This would work for CPUParticles at least, I'm not super familiar with the GPUParticles. One snag may be that the particle systems may not currently hold an ObjectID, so it might need e.g. an extra function to set this on particle systems.

But overall I would recommend a simple client side timer if at all possible, as it will be far less complex and have fewer failure points.

@Calinou
Copy link
Member

Calinou commented May 3, 2022

I wonder if this issue can potentially affect the implementation of #54716.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants