-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cache framebuffer copies (for self-texturing) until the next TexFlush GPU instruction #17032
Conversation
… instruction. Fixes #17030 , or at least improves on it - for optimal performance that big framebuffer used for bloom should be split like in Killzone, but it's not trivial. The regression in 1.14 is fixed with this, at least. I tried it with a few other games with no issues - it seems games are using TexFlush when needed. But let's see if it really is safe to rely on that... There might also be other places we should call DiscardFramebufferCopy in.
Hm, this seems slightly dangerous but maybe okay. I did have reason to believe people's assertions about texsync or texflush were a bit off, but I think it was texsync. Let me check in a couple games, though. Especially concerned about Motorstorm for example. -[Unknown] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to discard in a few more situations, let me see real quick...
-[Unknown]
// If max is not > min, we probably could not detect it. Skip. | ||
// See the vertex decoder, where this is updated. | ||
// TODO: We're currently not hitting this path in Dante. See #17032 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably not using vertex coords that get checked, so maxU probably equals minU. We don't currently set the range for all vertex types, just 16-bit through iirc. This has been true forever, initially to avoid any perf impact to vertex decode for the few games this was initially helping.
-[Unknown]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. Possibly it would always be worth it to find the min/max in through mode, although then this change would need more sophistication. So, maybe one day :)
This avoids retaining the framebuffer copy any longer than the current framebuffer target.
GPU: Discard framebuffer copy when clearing
I'm gonna go for it, easy to revert if it causes problems, but it doesn't really seem to. As mentioned, future work will include tracking the affected areas more accurately. |
Fixes #17030 , or at least improves on it - for optimal performance that big framebuffer used for bloom should be split like in Killzone, but it's not trivial. So there's still a bunch of copies happening, just less of them.
The performance regression in Dante's Inferno in 1.14 is fixed with this, at least.
I tried it with a few other games with no issues - it seems games are using TexFlush when needed. But let's see if it really is safe to rely on that...
There might also be other places we should call DiscardFramebufferCopy in.
For these copies, ideally we should look at texture coordinates to figure out the regions to copy, and if the previously copied region was enough. Actually we do that already, in some cases - and I just realized that this isn't necessarily valid in that case. So need a couple more checks. EDIT: Added - works because that path isn't currently active in Dante for some reason.