RetroArch shaders slower with vc4 than with brcm #97

Open
vanfanel opened this issue Mar 23, 2018 · 3 comments

Comments

@vanfanel

vanfanel commented Mar 23, 2018

Hi, Eric

I am using VC4 as my video driver for the Raspberry Pi 3, but I have noticed that the same shader is WAY slower on the VC4 driver than it is on the brcm vendor driver (which I don't use anymore).
For example, the dedicated Raspberry Pi CRT shader here:
https://github.com/RetroPie/common-shaders/blob/vc4-fixes/shaders/crt-pi.glsl
This shader was fullspeed with any core on the brcm driver, but it's SLOW on VC4.

Even this "simple" scanline shader
https://github.com/libretro/glsl-shaders/blob/master/misc/scanline.glsl
...isn't always fullspeed anymore. It's only a 1-pass shader.

Maybe there's some operation that could be optimized to get back the lost performance?
I am using recent upstream Mesa. I believe all your VC4-related optimizations are there, aren't they?

@anholt
Owner

anholt commented Mar 26, 2018

I don't see anything in those shaders that should be unable to sample faster than refresh rate fullscreen. Without specific instructions on how to reproduce the problem, I (or anyone else helping) probably won't be able to help. For debugging on your own, I'd recommend taking a look at the docs at https://github.com/anholt/mesa/wiki/VC4-Performance-Debugging

@vanfanel
Author

I didn't explain the issue properly. The shaders themselves are fullspeed, but when paired with CPU-demanding emulators (~70-80% CPU usage on one core, since emulation is not multithreaded), the Pi can't keep up and fullspeed isn't possible. They ARE fullspeed with emulators or native games that don't use so much CPU time. The problem is that they are always fullspeed with the brcm vendor driver, while they are not with your fantastic VC4 driver, which is the future Pi driver.

So, are shaders using CPU time, or maybe it could be due to memory bandwidth limitations?
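
To make the CPU-time half of that question concrete, here is a rough back-of-the-envelope sketch. It assumes a 60 Hz display and that the emulator and the GL driver's CPU-side work run one after the other on the same core; the function names and the driver cost passed in are hypothetical illustrations, not measurements from either driver.

```python
def frame_budget_ms(refresh_hz: float = 60.0) -> float:
    """Wall-clock time available per frame at the given refresh rate."""
    return 1000.0 / refresh_hz


def cpu_headroom_ms(emulator_cpu_fraction: float, refresh_hz: float = 60.0) -> float:
    """CPU time left in one frame after the emulator has taken its share.

    emulator_cpu_fraction is the share of one core the emulator uses
    (e.g. 0.75 for the ~70-80% mentioned above).
    """
    return frame_budget_ms(refresh_hz) * (1.0 - emulator_cpu_fraction)


def fits_in_frame(emulator_cpu_fraction: float,
                  driver_cpu_ms: float,
                  refresh_hz: float = 60.0) -> bool:
    """True if a (hypothetical) per-frame driver CPU cost fits in the headroom.

    Assumes emulator and driver work are serialized on the same core.
    """
    return driver_cpu_ms <= cpu_headroom_ms(emulator_cpu_fraction, refresh_hz)


if __name__ == "__main__":
    headroom = cpu_headroom_ms(0.75)
    print(f"frame budget: {frame_budget_ms():.2f} ms, headroom: {headroom:.2f} ms")
```

Under these assumptions, an emulator at 75% of one core leaves only about a quarter of each frame for everything else on that core, so even a small difference in per-frame driver CPU cost could decide whether a frame is dropped. This says nothing about GPU time or memory bandwidth, which would have to be measured separately.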

@anholt
Owner

anholt commented Mar 27, 2018

That link is precisely the thing for figuring out what's going on with performance, instead of speculating about what might be going on.


2 participants