-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shrink our main uniform buffer by 32 bytes #16103
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
unknownbrackets
approved these changes
Sep 25, 2022
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few things we could do:
- texEnvColor is fairly uncommon, so could maybe be uint.
- viewPos is only needed for fog.
- We could actually combine proj/proj_through/view into a single matrix (viewproj) + a single vec4 to calculate fog from worldpos.
- This would also get rid of fogCoef.
- We could still cache the matrices separately and only multiply if dirty before flush, so I don't think this would need to be that expensive.
- The fog vec4 would be cheap since it'd just be parts of the view matrix, though we'd scale by fogCoef.
- Even keeping proj_through, this would get rid of 40 bytes (remove 12 for view, remove 2 fog fogCoef, add 4 fog fogFromWorld.)
- I agree about blendFix, especially blendFixB should be uncommon.
-[Unknown]
…d to 128 bits. Allows us to save 16 bytes from the main uniform buffer, since there's free 32-bit spaces here and there to use.
hrydgard
force-pushed
the
optimize-shader-constants
branch
from
September 26, 2022 11:10
39e9313
to
d9f74d2
Compare
This is simpler and allows us to unify paths better.
GPU: Apply color test mask as a uint
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
That's half a 4x4 matrix, we're down to 480 bytes now.
Ideally I'd like to be able to squeeze two VR eye matrices in here without exceeding 512 bytes... but starting to look impossible. Though if we'd merge the two proj matrices, which should be doable, we'd get closer.
There are a bunch of float4 colors that could be easily squeezed into 32 bits each (fog color etc) but not sure how those will affect performance on old hardware. I guess the rarer ones like blendFixA/B would be fine.. There's value too in keeping uniforms as similar to GL as possible.
There's a reason we want to stay below 512 bytes, because the next step up is 768 on a lot of hardware as they can only align uniform buffers on 256-byte boundaries. Whether it actually makes much of a performance difference in practice, probably not hugely...
Another way to go would be to dynamically generate uniform buffers with just the constants that each pipeline needs, but the complexity would be huge. Very likely not worth it.