webgpu: fix uniform block incorrectly #5487
The size of a uniform block is always a multiple of sizeof(vec4), so we must pad the end of the uniform block buffer when necessary.
@qjia7 @haoyunfeix @axinging @gyagp, please take a look, thank you.
Some D3D cbuffer packing examples are at https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl-packing-rules; cbuffer packing is similar to uniform block packing.
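The padding rule described above can be sketched as follows. This is only an illustration, not the code in this PR: the helper names `paddingToVec4` and `padUniformData` are hypothetical, and offsets are assumed to be counted in 4-byte elements, so one vec4 spans 4 elements.

```typescript
// Assumption: offsets are counted in 4-byte elements,
// so one vec4 spans 4 elements (16 bytes).
const VEC4_ELEMENTS = 4;

// Hypothetical helper: number of elements needed to reach the next
// multiple of a vec4.
function paddingToVec4(currentOffset: number): number {
  return Math.ceil(currentOffset / VEC4_ELEMENTS) * VEC4_ELEMENTS -
      currentOffset;
}

// Hypothetical helper: append zero-valued elements until the block
// length is a multiple of 4 elements.
function padUniformData(data: number[]): number[] {
  const padded = data.slice();
  const padding = paddingToVec4(data.length);
  for (let p = 0; p < padding; ++p) {
    padded.push(0);
  }
  return padded;
}
```

With these assumptions, a block of 10 elements (40 bytes) gets 2 padding elements and ends at 12 elements (48 bytes), while a block that is already a multiple of 4 elements is left unchanged.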
padding = Math.ceil(currentOffset / 4) * 4 - currentOffset;
for (let p = 0; p < padding; ++p) {
dimUniformsData.push({type: 'int32', data: [0]});
dataViewIndex++;
No need for dataViewIndex++; here?
@@ -653,6 +653,14 @@ export class WebGPUBackend extends KernelBackend {
currentOffset += d.data.length + padding;
});

// Force the resulting size of uniform block to be evenly divisible
// by sizeof(four-component vector).
We need more comments here explaining why these lines are needed, including the affected platform and the related test case. We should also provide a test case.
@@ -653,6 +653,14 @@ export class WebGPUBackend extends KernelBackend {
currentOffset += d.data.length + padding;
});

// Force the resulting size of uniform block to be evenly divisible
// by sizeof(four-component vector).
padding = Math.ceil(currentOffset / 4) * 4 - currentOffset;
Can 4 be replaced with baseAlignment?
According to the OpenGL ES 3.0 spec (page 61), the std140 uniform block layout guarantees specific packing behavior and does not require the application to query the offsets and strides. However, the minimum required size may still be queried from
Apologies for the delay; I haven't kept up with my email/reviews. I am not sure why that minimum size varies between GL implementations. In WebGPU we definitely intend to guarantee that the required size is consistent, so in this case either 40 on all systems or 48 on all systems. This seems related to gpuweb/gpuweb#1558. I'll try to verify with our shader team.
In WGSL, note that while layout rules are identical between uniform and storage buffers, uniform buffers have tighter restrictions on the permitted layouts. WGSL requires structures to be aligned to 16 bytes for uniform buffer usage. In terms of shader visibility, this only really affects the byte offset of a field that follows a field of a structure type. Tint reports to Dawn the uniform buffer size rounded up to 16 bytes. I don't think this is currently covered by the spec, but I'm also not sure whether this is actually observable to the client. The Dawn folks are probably better placed to answer that.
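As a worked example of the rounding mentioned above, the sketch below rounds a byte size up to the next multiple of 16. The names `roundUp` and `UNIFORM_STRUCT_ALIGNMENT` are hypothetical and chosen for illustration; this is not Tint's or Dawn's actual code.

```typescript
// Assumption: uniform buffer sizes are rounded up to 16 bytes,
// as described in the comment above.
const UNIFORM_STRUCT_ALIGNMENT = 16;

// Round `size` up to the nearest multiple of `alignment`.
function roundUp(alignment: number, size: number): number {
  return Math.ceil(size / alignment) * alignment;
}

// A 40-byte uniform block would be reported as 48 bytes.
const reportedSize = roundUp(UNIFORM_STRUCT_ALIGNMENT, 40);
```

This is the 40 versus 48 discrepancy discussed in this thread: the shader-side size is 48 bytes, so a 40-byte buffer would be 8 bytes short.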
The rounded-up size that Tint returns to Dawn is observable through 1) the use of
In that case, shouldn't it be impossible to use a buffer of size 40 with a
@Kangz @kainino0x At present, can tfjs get a valid
You can't reflect on a pipeline or shader module to determine the minimum buffer binding size for a binding. However, setting minBindingSize is optional: it defaults to 0, meaning there's no minimum. If it is 0, there may be additional draw-time or shader checks for accessing the binding (I don't remember the details, see gpuweb/gpuweb#678; I also have no idea whether Dawn implements any optimization). You can compute the actual minimum size yourself: https://gpuweb.github.io/gpuweb/wgsl/#memory-layouts ("The size of a structure ...")
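A rough sketch of computing that minimum size by hand, following the general structure layout rule in the WGSL memory-layout section (each member is placed at the next offset that satisfies its alignment, and the structure size is rounded up to the largest member alignment). The `Member` type, the constants, and `structSize` are hypothetical names for illustration; any extra constraints specific to uniform buffers are not modeled here.

```typescript
// Hypothetical description of a struct member: alignment and size in bytes.
interface Member {
  align: number;
  size: number;
}

// Alignment and size of a few 32-bit scalar and vector types.
const F32: Member = {align: 4, size: 4};
const VEC2: Member = {align: 8, size: 8};
const VEC3: Member = {align: 16, size: 12};
const VEC4: Member = {align: 16, size: 16};

function roundUpTo(alignment: number, n: number): number {
  return Math.ceil(n / alignment) * alignment;
}

// Size of a structure: lay out members in order, then round the end of
// the last member up to the structure's alignment (the largest member
// alignment).
function structSize(members: Member[]): number {
  let end = 0;          // byte offset just past the previous member
  let structAlign = 1;  // largest member alignment seen so far
  for (const m of members) {
    const offset = roundUpTo(m.align, end);
    end = offset + m.size;
    structAlign = Math.max(structAlign, m.align);
  }
  return roundUpTo(structAlign, end);
}

// Two vec4s followed by a vec2 end at byte 40 but occupy 48 bytes.
const exampleSize = structSize([VEC4, VEC4, VEC2]);
```

Under these assumptions, a value such as `exampleSize` is what one would pass as minBindingSize, and the buffer bound to that slot would need to be at least that large.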