Shrink our main uniform buffer by 32 bytes #16103

Merged
merged 7 commits into from
Sep 26, 2022

hrydgard
Owner

@hrydgard hrydgard commented Sep 25, 2022

That's half a 4x4 matrix; we're down to 480 bytes now.

Ideally I'd like to be able to squeeze two VR eye matrices in here without exceeding 512 bytes... but that's starting to look impossible. Though if we merged the two proj matrices, which should be doable, we'd get closer.

There are a bunch of float4 colors that could easily be squeezed into 32 bits each (fog color etc.), but I'm not sure how that would affect performance on old hardware. I guess the rarer ones like blendFixA/B would be fine. There's also value in keeping the uniforms as similar to GL as possible.
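To make the "32 bits each" idea concrete, here is a minimal sketch of packing a float4 color into one RGBA8 word on the CPU, plus a CPU mirror of the unpack the shader would have to do. The helper names and the byte order (red in the lowest byte) are assumptions for illustration, not code from this PR.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper: pack four floats in [0, 1] into one 32-bit word,
// 8 bits per channel, red in the lowest byte.
inline uint32_t PackColorRGBA8(float r, float g, float b, float a) {
    auto toByte = [](float v) -> uint32_t {
        v = std::min(std::max(v, 0.0f), 1.0f);  // clamp to [0, 1]
        return static_cast<uint32_t>(std::lround(v * 255.0f));
    };
    return toByte(r) | (toByte(g) << 8) | (toByte(b) << 16) | (toByte(a) << 24);
}

// CPU mirror of the shader-side unpack: shift, mask, scale back to [0, 1].
inline void UnpackColorRGBA8(uint32_t c, float out[4]) {
    for (int i = 0; i < 4; i++) {
        out[i] = static_cast<float>((c >> (8 * i)) & 0xFFu) / 255.0f;
    }
}
```

This takes 16 bytes down to 4 per color, at the cost of integer shifts and masks in the shader, which is where the concern about old hardware comes from. Newer GLSL versions provide `unpackUnorm4x8` for this, but older targets would have to do it by hand.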

The reason we want to stay within 512 bytes is that the next step up is 768 on a lot of hardware, which can only align uniform buffers on 256-byte boundaries. Whether it actually makes much of a performance difference in practice is another matter; probably not hugely.
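The step sizes follow from rounding the buffer size up to the alignment. A small sketch, assuming a 256-byte alignment (the actual value is device-specific; in Vulkan it is reported as `minUniformBufferOffsetAlignment`). The function and constant names are made up for the example.

```cpp
#include <cstddef>

// Round size up to the next multiple of alignment.
// alignment must be a power of two.
constexpr std::size_t AlignUp(std::size_t size, std::size_t alignment) {
    return (size + alignment - 1) & ~(alignment - 1);
}

// Assumed for illustration; real code would query the device limit.
constexpr std::size_t kAssumedUboAlignment = 256;

static_assert(AlignUp(480, kAssumedUboAlignment) == 512, "480 bytes occupies a 512-byte slot");
static_assert(AlignUp(512, kAssumedUboAlignment) == 512, "exactly 512 still fits");
static_assert(AlignUp(516, kAssumedUboAlignment) == 768, "one float over jumps to 768");
```

So on such hardware anything from 257 to 512 bytes costs the same slot, and going even 4 bytes over 512 costs a full extra 256.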

Another way to go would be to dynamically generate uniform buffers with just the constants that each pipeline needs, but the complexity would be huge. Very likely not worth it.
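For a sense of what that would involve, here is a rough sketch of just the layout half: given a bitmask of the uniforms a pipeline uses, pack only those and record their offsets. The table, sizes, and names below are illustrative assumptions, not the real uniform list.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct UniformDesc {
    const char *name;
    uint32_t size;   // bytes
    uint32_t align;  // bytes, power of two
};

// Hypothetical uniform table; the bit index in the mask is the array index.
static const UniformDesc kUniforms[] = {
    {"proj", 64, 16},
    {"view", 48, 16},
    {"fogColor", 16, 16},
    {"texEnvColor", 16, 16},
    {"blendFixA", 16, 16},
    {"blendFixB", 16, 16},
};

struct PackedLayout {
    std::vector<int32_t> offsets;  // -1 for uniforms left out
    uint32_t totalSize = 0;
};

// Lay out only the uniforms whose bit is set in usedMask, in table order.
inline PackedLayout BuildLayout(uint32_t usedMask) {
    PackedLayout layout;
    uint32_t cursor = 0;
    const std::size_t count = sizeof(kUniforms) / sizeof(kUniforms[0]);
    for (std::size_t i = 0; i < count; i++) {
        if (!(usedMask & (1u << i))) {
            layout.offsets.push_back(-1);
            continue;
        }
        const UniformDesc &u = kUniforms[i];
        cursor = (cursor + u.align - 1) & ~(u.align - 1);
        layout.offsets.push_back(static_cast<int32_t>(cursor));
        cursor += u.size;
    }
    layout.totalSize = cursor;
    return layout;
}
```

The layout itself is the easy part. The shader generators would also have to emit a matching uniform block per mask, and the upload path would have to write each constant at its per-pipeline offset, which is where the complexity piles up.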

@hrydgard hrydgard added this to the v1.14.0 milestone Sep 25, 2022
Collaborator

@unknownbrackets unknownbrackets left a comment

A few things we could do:

  • texEnvColor is fairly uncommon, so it could maybe be a uint.
  • viewPos is only needed for fog.
    • We could actually combine proj/proj_through/view into a single matrix (viewproj) + a single vec4 to calculate fog from worldpos.
    • This would also get rid of fogCoef.
    • We could still cache the matrices separately and only multiply if dirty before flush, so I don't think this would need to be that expensive.
    • The fog vec4 would be cheap since it'd just be parts of the view matrix, though we'd scale by fogCoef.
    • Even keeping proj_through, this would get rid of 40 bytes (counting in floats: remove 12 for view, remove 2 for fogCoef, add 4 for fogFromWorld, a net saving of 10 floats.)
  • I agree about blendFix; blendFixB especially should be uncommon.

-[Unknown]

GPU/Vulkan/DrawEngineVulkan.cpp (outdated, resolved)
GPU/Common/FragmentShaderGenerator.cpp (outdated, resolved)
GPU/Common/FragmentShaderGenerator.cpp (outdated, resolved)
GPU/Common/ShaderUniforms.h (outdated, resolved)
@hrydgard hrydgard force-pushed the optimize-shader-constants branch from 39e9313 to d9f74d2 Compare September 26, 2022 11:10
@hrydgard hrydgard merged commit 89e6b10 into master Sep 26, 2022
@hrydgard hrydgard deleted the optimize-shader-constants branch September 26, 2022 17:33