-
Notifications
You must be signed in to change notification settings - Fork 194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Pack/Unpack for HLSL #2353
Conversation
The HLSL output for what I've implemented is now valid (I converted a wgsl file containing all the functions to hlsl, then copied that into https://shader-playground.timjones.io/ to make sure that it'd compile with DXC and FXC), but I'm not sure how to test that the output is correct. |
Please add HLSL here Line 448 in 907b7c7
so that we have and check the HLSL snapshots. CI will only make sure the output is valid HLSL but not run the file.
We aren't currently testing behavior in naga, what I'd do is create a compute shader to test the behavior and open a PR on wgpu (that temporarily links to your rev of naga). |
I hacked up a compute shader to test with (though I'm currently running it as an example and not a test, see the I've run into an issue where my packing spec: https://github.com/gpuweb/cts/blob/main/src/unittests/conversion.spec.ts |
The issue is that the WGSL spec defines the behavior of
We need to floor instead of round which is the behavior of SPV and what the PR is currently implementing. Having to polyfill this for SPV as well is unfortunate but I guess necessary since the rounding behavior might not always be the same across implementations of SPV.
|
I wonder how MSL behaves since they don't specify this in their spec. |
I filed gpuweb/gpuweb#4159 to hopefully iron out the behavior. Maybe we can wait a bit and see how to proceed. |
I could've saved so much time had I known about gpuweb/gpuweb#1189 (comment)... |
I feel you 😅 - the spec repo usually contains a lot of info in PRs/issues since the decisions are taken based on what's in the APIs/shading languages it targets. I find myself using |
readable version of what I'm doing:
|
Could you do |
Fails the edit: Added a ternary to guard against count == 0, now it works. |
The spec also calls for us to error on insertbits if we try and insert too much https://www.w3.org/TR/WGSL/#insertBits-builtin If
We don't currently do that for HLSL or Vulkan, and I don't know how to implement that. |
We can take care of that validation later once we got const-expressions fully working. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not yet looked at the behavior but still left some comments.
Also, it seems the resolution of gpuweb/gpuweb#4159 is that the rounding mode should have been unspecified so, we can use round
(same as here gpuweb/gpuweb#1189 (comment)) instead of floor(0.5 + x)
for the packing functions.
![image](https://github.com/bevyengine/bevy/assets/47158642/dbb62645-f639-4f2b-b84b-26fd915c186d) # Objective - Add Screen space ambient occlusion (SSAO). SSAO approximates small-scale, local occlusion of _indirect_ diffuse light between objects. SSAO does not apply to direct lighting, such as point or directional lights. - This darkens creases, e.g. on staircases, and gives nice contact shadows where objects meet, giving entities a more "grounded" feel. - Closes #3632. ## Solution - Implement the GTAO algorithm. - https://www.activision.com/cdn/research/Practical_Real_Time_Strategies_for_Accurate_Indirect_Occlusion_NEW%20VERSION_COLOR.pdf - https://blog.selfshadow.com/publications/s2016-shading-course/activision/s2016_pbs_activision_occlusion.pdf - Source code heavily based on [Intel's XeGTAO](https://github.com/GameTechDev/XeGTAO/blob/0d177ce06bfa642f64d8af4de1197ad1bcb862d4/Source/Rendering/Shaders/XeGTAO.hlsli). - Add an SSAO bevy example. ## Algorithm Overview * Run a depth and normal prepass * Create downscaled mips of the depth texture (preprocess_depths pass) * GTAO pass - for each pixel, take several random samples from the depth+normal buffers, reconstruct world position, raytrace in screen space to estimate occlusion. Rather then doing completely random samples on a hemisphere, you choose random _slices_ of the hemisphere, and then can analytically compute the full occlusion of that slice. Also compute edges based on depth differences here. * Spatial denoise pass - bilateral blur, using edge detection to not blur over edges. This is the final SSAO result. * Main pass - if SSAO exists, sample the SSAO texture, and set occlusion to be the minimum of ssao/material occlusion. This then feeds into the rest of the PBR shader as normal. --- ## Future Improvements - Maybe remove the low quality preset for now (too noisy) - WebGPU fallback (see below) - Faster depth->world position (see reverted code) - Bent normals - Try interleaved gradient noise or spatiotemporal blue noise - Replace the spatial denoiser with a combined spatial+temporal denoiser - Render at half resolution and use a bilateral upsample - Better multibounce approximation (https://drive.google.com/file/d/1SyagcEVplIm2KkRD3WQYSO9O0Iyi1hfy/view) ## Far-Future Performance Improvements - F16 math (missing naga-wgsl support https://github.com/gfx-rs/naga/issues/1884) - Faster coordinate space conversion for normals - Faster depth mipchain creation (https://github.com/GPUOpen-Effects/FidelityFX-SPD) (wgpu/naga does not currently support subgroup ops) - Deinterleaved SSAO for better cache efficiency (https://developer.nvidia.com/sites/default/files/akamai/gameworks/samples/DeinterleavedTexturing.pdf) ## Other Interesting Papers - Visibility bitmask (https://link.springer.com/article/10.1007/s00371-022-02703-y, https://cdrinmatane.github.io/posts/cgspotlight-slides/) - Screen space diffuse lighting (https://github.com/Patapom/GodComplex/blob/master/Tests/TestHBIL/2018%20Mayaux%20-%20Horizon-Based%20Indirect%20Lighting%20(HBIL).pdf) ## Platform Support * SSAO currently does not work on DirectX12 due to issues with wgpu and naga: * gfx-rs/wgpu#3798 * gfx-rs/naga#2353 * SSAO currently does not work on WebGPU because r16float is not a valid storage texture format https://gpuweb.github.io/gpuweb/wgsl/#storage-texel-formats. We can fix this with a fallback to r32float. --- ## Changelog - Added ScreenSpaceAmbientOcclusionSettings, ScreenSpaceAmbientOcclusionQualityLevel, and ScreenSpaceAmbientOcclusionBundle --------- Co-authored-by: IceSentry <[email protected]> Co-authored-by: IceSentry <[email protected]> Co-authored-by: Daniel Chia <[email protected]> Co-authored-by: Elabajaba <[email protected]> Co-authored-by: Robert Swain <[email protected]> Co-authored-by: robtfm <[email protected]> Co-authored-by: Brandon Dyer <[email protected]> Co-authored-by: Edgar Geier <[email protected]> Co-authored-by: Nicola Papale <[email protected]> Co-authored-by: Carter Anderson <[email protected]>
![image](https://github.com/bevyengine/bevy/assets/47158642/dbb62645-f639-4f2b-b84b-26fd915c186d) # Objective - Add Screen space ambient occlusion (SSAO). SSAO approximates small-scale, local occlusion of _indirect_ diffuse light between objects. SSAO does not apply to direct lighting, such as point or directional lights. - This darkens creases, e.g. on staircases, and gives nice contact shadows where objects meet, giving entities a more "grounded" feel. - Closes bevyengine#3632. ## Solution - Implement the GTAO algorithm. - https://www.activision.com/cdn/research/Practical_Real_Time_Strategies_for_Accurate_Indirect_Occlusion_NEW%20VERSION_COLOR.pdf - https://blog.selfshadow.com/publications/s2016-shading-course/activision/s2016_pbs_activision_occlusion.pdf - Source code heavily based on [Intel's XeGTAO](https://github.com/GameTechDev/XeGTAO/blob/0d177ce06bfa642f64d8af4de1197ad1bcb862d4/Source/Rendering/Shaders/XeGTAO.hlsli). - Add an SSAO bevy example. ## Algorithm Overview * Run a depth and normal prepass * Create downscaled mips of the depth texture (preprocess_depths pass) * GTAO pass - for each pixel, take several random samples from the depth+normal buffers, reconstruct world position, raytrace in screen space to estimate occlusion. Rather then doing completely random samples on a hemisphere, you choose random _slices_ of the hemisphere, and then can analytically compute the full occlusion of that slice. Also compute edges based on depth differences here. * Spatial denoise pass - bilateral blur, using edge detection to not blur over edges. This is the final SSAO result. * Main pass - if SSAO exists, sample the SSAO texture, and set occlusion to be the minimum of ssao/material occlusion. This then feeds into the rest of the PBR shader as normal. --- ## Future Improvements - Maybe remove the low quality preset for now (too noisy) - WebGPU fallback (see below) - Faster depth->world position (see reverted code) - Bent normals - Try interleaved gradient noise or spatiotemporal blue noise - Replace the spatial denoiser with a combined spatial+temporal denoiser - Render at half resolution and use a bilateral upsample - Better multibounce approximation (https://drive.google.com/file/d/1SyagcEVplIm2KkRD3WQYSO9O0Iyi1hfy/view) ## Far-Future Performance Improvements - F16 math (missing naga-wgsl support https://github.com/gfx-rs/naga/issues/1884) - Faster coordinate space conversion for normals - Faster depth mipchain creation (https://github.com/GPUOpen-Effects/FidelityFX-SPD) (wgpu/naga does not currently support subgroup ops) - Deinterleaved SSAO for better cache efficiency (https://developer.nvidia.com/sites/default/files/akamai/gameworks/samples/DeinterleavedTexturing.pdf) ## Other Interesting Papers - Visibility bitmask (https://link.springer.com/article/10.1007/s00371-022-02703-y, https://cdrinmatane.github.io/posts/cgspotlight-slides/) - Screen space diffuse lighting (https://github.com/Patapom/GodComplex/blob/master/Tests/TestHBIL/2018%20Mayaux%20-%20Horizon-Based%20Indirect%20Lighting%20(HBIL).pdf) ## Platform Support * SSAO currently does not work on DirectX12 due to issues with wgpu and naga: * gfx-rs/wgpu#3798 * gfx-rs/naga#2353 * SSAO currently does not work on WebGPU because r16float is not a valid storage texture format https://gpuweb.github.io/gpuweb/wgsl/#storage-texel-formats. We can fix this with a fallback to r32float. --- ## Changelog - Added ScreenSpaceAmbientOcclusionSettings, ScreenSpaceAmbientOcclusionQualityLevel, and ScreenSpaceAmbientOcclusionBundle --------- Co-authored-by: IceSentry <[email protected]> Co-authored-by: IceSentry <[email protected]> Co-authored-by: Daniel Chia <[email protected]> Co-authored-by: Elabajaba <[email protected]> Co-authored-by: Robert Swain <[email protected]> Co-authored-by: robtfm <[email protected]> Co-authored-by: Brandon Dyer <[email protected]> Co-authored-by: Edgar Geier <[email protected]> Co-authored-by: Nicola Papale <[email protected]> Co-authored-by: Carter Anderson <[email protected]>
![image](https://github.com/bevyengine/bevy/assets/47158642/dbb62645-f639-4f2b-b84b-26fd915c186d) # Objective - Add Screen space ambient occlusion (SSAO). SSAO approximates small-scale, local occlusion of _indirect_ diffuse light between objects. SSAO does not apply to direct lighting, such as point or directional lights. - This darkens creases, e.g. on staircases, and gives nice contact shadows where objects meet, giving entities a more "grounded" feel. - Closes bevyengine#3632. ## Solution - Implement the GTAO algorithm. - https://www.activision.com/cdn/research/Practical_Real_Time_Strategies_for_Accurate_Indirect_Occlusion_NEW%20VERSION_COLOR.pdf - https://blog.selfshadow.com/publications/s2016-shading-course/activision/s2016_pbs_activision_occlusion.pdf - Source code heavily based on [Intel's XeGTAO](https://github.com/GameTechDev/XeGTAO/blob/0d177ce06bfa642f64d8af4de1197ad1bcb862d4/Source/Rendering/Shaders/XeGTAO.hlsli). - Add an SSAO bevy example. ## Algorithm Overview * Run a depth and normal prepass * Create downscaled mips of the depth texture (preprocess_depths pass) * GTAO pass - for each pixel, take several random samples from the depth+normal buffers, reconstruct world position, raytrace in screen space to estimate occlusion. Rather then doing completely random samples on a hemisphere, you choose random _slices_ of the hemisphere, and then can analytically compute the full occlusion of that slice. Also compute edges based on depth differences here. * Spatial denoise pass - bilateral blur, using edge detection to not blur over edges. This is the final SSAO result. * Main pass - if SSAO exists, sample the SSAO texture, and set occlusion to be the minimum of ssao/material occlusion. This then feeds into the rest of the PBR shader as normal. --- ## Future Improvements - Maybe remove the low quality preset for now (too noisy) - WebGPU fallback (see below) - Faster depth->world position (see reverted code) - Bent normals - Try interleaved gradient noise or spatiotemporal blue noise - Replace the spatial denoiser with a combined spatial+temporal denoiser - Render at half resolution and use a bilateral upsample - Better multibounce approximation (https://drive.google.com/file/d/1SyagcEVplIm2KkRD3WQYSO9O0Iyi1hfy/view) ## Far-Future Performance Improvements - F16 math (missing naga-wgsl support https://github.com/gfx-rs/naga/issues/1884) - Faster coordinate space conversion for normals - Faster depth mipchain creation (https://github.com/GPUOpen-Effects/FidelityFX-SPD) (wgpu/naga does not currently support subgroup ops) - Deinterleaved SSAO for better cache efficiency (https://developer.nvidia.com/sites/default/files/akamai/gameworks/samples/DeinterleavedTexturing.pdf) ## Other Interesting Papers - Visibility bitmask (https://link.springer.com/article/10.1007/s00371-022-02703-y, https://cdrinmatane.github.io/posts/cgspotlight-slides/) - Screen space diffuse lighting (https://github.com/Patapom/GodComplex/blob/master/Tests/TestHBIL/2018%20Mayaux%20-%20Horizon-Based%20Indirect%20Lighting%20(HBIL).pdf) ## Platform Support * SSAO currently does not work on DirectX12 due to issues with wgpu and naga: * gfx-rs/wgpu#3798 * gfx-rs/naga#2353 * SSAO currently does not work on WebGPU because r16float is not a valid storage texture format https://gpuweb.github.io/gpuweb/wgsl/#storage-texel-formats. We can fix this with a fallback to r32float. --- ## Changelog - Added ScreenSpaceAmbientOcclusionSettings, ScreenSpaceAmbientOcclusionQualityLevel, and ScreenSpaceAmbientOcclusionBundle --------- Co-authored-by: IceSentry <[email protected]> Co-authored-by: IceSentry <[email protected]> Co-authored-by: Daniel Chia <[email protected]> Co-authored-by: Elabajaba <[email protected]> Co-authored-by: Robert Swain <[email protected]> Co-authored-by: robtfm <[email protected]> Co-authored-by: Brandon Dyer <[email protected]> Co-authored-by: Edgar Geier <[email protected]> Co-authored-by: Nicola Papale <[email protected]> Co-authored-by: Carter Anderson <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Left a few minor comments.
I wanted to point out that for We'll also end up with a lot of duplication if we want to do it in-line as opposed to calling a function we injected previously (this is how we deal with constructors currently). |
Note for a future reader (including myself): HLSL seems to use arithmetic right shifts for signed integers even if not specified in their docs - found this out by looking at the output of FXC and DXC. |
Co-authored-by: Teodor Tanasoaia <[email protected]>
Co-authored-by: Teodor Tanasoaia <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for implementing this, it looks great!✨
These are currently untested, and I'm not sure how to do the signed versions.edit: These have now all been tested against a re-implementation of the CTS tests here (set
WGPU_BACKEND
env variable to dx12 and run thenaga-valid
example, it'll print out any failed tests).Status:
Related: #1449
Closes #1929