Add minimumBufferSize to BGL entry #678
Would we validate this against the shader at pipeline creation? Or would it just be a "hint" to allow elision of a runtime check? If the former, we should mention that in the definition for the new field. (We can spec it later, when we have spec text for create*Pipeline.)
Yes, I described this here:
I agree more details should be provided in the documentation. I'll be happy to follow up after the initial discussion, if the group sees the value in pursuing this direction.
Thanks for putting this up! This is an interesting proposal and would avoid the need to do runtime checks for this. My biggest concern is that this reduces flexibility and requires another piece of data that applications were not used to providing. It's unclear how much harder it makes WebGPU to use (from not at all, to OMG I CAN'T PORT MY CONTENT). I think Jasper commented about this in the Matrix chatroom, and hacks that D3D developers were using to do unsized arrays in shaders. Another small thing to consider is maybe asking for the size of an extra array element to be in the |
IIRC, we agreed that this wasn't applicable to WGSL. If there is a concrete statement there, let's have it in this issue (or any other).
We clearly require more data already, and we clearly require the existing content to go through some transformation (together with another round of testing) before it can ship. Providing an extra value here shouldn't be a problem unless there is a really tricky use case where the same binding was used with 2 different shader structures. I haven't seen this in practice, but I'd be open to hearing more from ISVs if that's something they intend to do.
It's a valid implementation behavior for this PR, but I don't think it needs to be specified or requires any changes to the PR, does it? The semantics of "minimumBufferSize" are, well, that it's a lower bound. So the implementation can extract as much benefit as it can from knowing that ahead of time.
Not in the PR yet, but wouldn't it change the validation for pipeline creation (when validating the minimumBufferSize against the shader code)?
I think it would still be exactly as described:
In other words, the validation only cares about the structured part being covered. If we want to wiggle it, we also can: e.g. by removing all the validation here. If the user provided the minimum size - use it, otherwise - do something more complex. I do however think that just eliminating the complexity of access-guarding the structured fields as a class is most appealing, and it would require the validation as described.
I asked why people care about clamping in #546 (comment) and got no responses, so I guess I'm asking about the same here. Why do we need to support clamping? Does it have any advantages over zeroing?
Got it, and I'm not sure. We'll have to discuss it.
@Kangz said:
There might be a middle ground here, where applications can provide a minimum buffer size, but there are no requirements placed on what that size is. Applications which don’t know any minimum buffer sizes ahead of time can supply a value of “0” which would effectively mean “no minimum.” When an implementation compiles a shader, these minimum buffer sizes would be provided as an additional input, and any accesses which lie outside of these minimum sizes would be guarded. That way, an application which knows its minimum size can supply it to gain additional speed, but an application which doesn’t know it would omit it, and all accesses would be guarded, reducing speed. One tenet of this idea is that these minimum buffer size values would be supplied in This means that if an application wants to use the same shader module in two different contexts, one where it knows the minimum buffer sizes and one where it doesn’t, the application would have to compile its shader module multiple times. The duplicate compilation makes sense, because the compilation process is the one which produces the OOB guards. The multiple compilations would happen explicitly at the author’s request, not invisibly under the hood. Or, if they don’t want to compile multiple times, they could just re-use the shader that includes all the guards, if they are willing to sacrifice shader runtime performance to improve shader compile-time performance.
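A rough sketch of the guard decision described in this comment, written as a CPU-side model rather than real compiler code. The function name `needsGuard` and its parameters are hypothetical and are not part of any WebGPU API; the only assumption taken from the comment is that a minimum of 0 means "no minimum", so every access ends up guarded.

```typescript
// Hypothetical compile-time decision: does a buffer access need an
// out-of-bounds guard, given the minimum buffer size the application promised?
// A minimum of 0 means "no minimum", so every non-empty access is guarded.
function needsGuard(
  accessOffset: number,
  accessSize: number,
  minimumBufferSize: number
): boolean {
  // The access is provably in bounds only if it ends within the promised minimum.
  return accessOffset + accessSize > minimumBufferSize;
}

// A 16-byte load at offset 0 with a promised minimum of 64 bytes: no guard needed.
const insideMinimum = needsGuard(0, 16, 64);
// A 16-byte load at offset 60 crosses the promised 64 bytes: guard needed.
const crossesMinimum = needsGuard(60, 16, 64);
// No minimum supplied (0): the same load at offset 0 must be guarded.
const noMinimum = needsGuard(0, 16, 0);
```

Under this model, compiling the same shader with two different minimum sizes yields two different sets of guards, which is why the comment expects a separate compilation per context.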
Here's a quote from #33 that's an old issue talking about the same problem. It shows how to do robust access in SPIR-V assuming the following:
That is, the CPU-side validation ensures that the buffer is big enough, either through draw/dispatch-time validation or with a mechanism like How about having |
I'd prefer to start by requiring Given that we can't guard accesses to structured members, I think that a relaxed I'm still not sure whether providing space for one element of unsized arrays is needed or not. |
What validation will we do to ensure that If we're taking untrusted values from the user and using it to skip out-of-bounds checks, that seems like a bad thing. |
Well-formed content won't bind too-small buffers though, will it? What if we failed operations if the buffers are smaller than the size of part (1)? Is there a use case for minimumBufferSize smaller or larger than size-of-part-1 that I'm not seeing?
This change lets us do that validation at bind group creation time, which means at draw time we only have to validate the layout of the pipeline against the layouts of the bound bind groups (which we have to do anyway).
I'm in the middle of catching up on this. First, I strongly agree that the buffer size should at least cover the fixed portion of the arg-buffer. The (1) part. I can't think of a good use case. It would also add complexity to the index-clamping transforms, I think.
When clamping, the pattern is to still do the access after computing the clamped address: Before:
After:
As @litherum noted in #546 (comment):
If there is no backing store for the 0 element of the buffer, then it's an out of bounds access. I think it's a poor tradeoff to have to inject extra code to deliberately skip the access:
When you say "zeroing" do you mean having the implementation automatically expand the underlying storage, I think to include at least one element of the unsized array (2), and filling it with zero? From the perspective of injecting an index-clamping shader transform, that's fine. But does it break a contract with the programmer? When they specified a buffer that implies a zero-sized runtime array, do they mind that the actual length reported in the shader is more than zero? (GLSL .length() method, or SPIR-V OpArrayLength instruction)
I'm pretty sure "zeroing" meant [EDIT: |
That introduces control flow at every memory access. It seems like a lot for good code to pay for the bad cases.
I thought we could mask out the OOB access based on the index, but now I realize that the access itself is UB in Metal (for example), so branching is necessary, and I agree with @dneto0's concern about it being heavy...
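To make the trade-off in the last few comments concrete, here is a small CPU-side model of the two strategies. It is only an illustration under stated assumptions: the function names are hypothetical, a plain array stands in for the runtime-sized array in the buffer, and `undefined` stands in for what would be an out-of-bounds access on a GPU.

```typescript
// Clamping: compute a clamped index, then always perform the access.
// No branch is needed, but the buffer must back at least one element.
function clampedLoad(data: number[], index: number): number | undefined {
  const last = Math.max(data.length - 1, 0);
  const clamped = Math.min(Math.max(index, 0), last);
  // With an empty array, index 0 has no backing store: this models the OOB case.
  return data[clamped];
}

// Guarding ("zeroing"): branch on the index and skip the access when it is out
// of bounds, returning zero instead. Always safe, but adds control flow per access.
function guardedLoad(data: number[], index: number): number {
  if (index < 0 || index >= data.length) {
    return 0;
  }
  return data[index] as number; // index is in range here
}

const values = [10, 20, 30];
const clampedResult = clampedLoad(values, 7); // reads the last element
const guardedResult = guardedLoad(values, 7); // skips the read, yields zero
const clampedOnEmpty = clampedLoad([], 0); // no element 0 to fall back on
const guardedOnEmpty = guardedLoad([], 0); // still well defined
```

The empty-array case is the one the thread worries about: clamping stays branch-free only if the binding is guaranteed to cover at least one array element.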
For the |
Discussed on the June 8th, 2020 call.
Resolved: add minimumBufferSize but make it optional. If specified it must be >= the size required by the shader code. TBD whether unspecified defaults to Default GPUPipelineLayout is generated with BGLs with minimumBufferSize. |
We discussed this in the editors meeting but had no strong feelings. I have a slight preference for Argument in favor of Argument against Argument against @litherum, thoughts? EDIT: with #851 (comment) I might prefer a sentinel value instead, since we'd use one across all three of these dictionary members |
This is now rebased and updated, also linking to #851.
Looks good. Some grammar nits.
Co-authored-by: Justin Fan <[email protected]>
726: Basic support for minBufferBindingSize r=cwfitzgerald a=kvark

**Connections**
Has basic (partial) implementation of gpuweb/gpuweb#678
wgpu-rs update - gfx-rs/wgpu-rs#377

**Description**
This change allows users to optionally specify the expected minimum binding size for buffers. We are then validating this against both the pipelines and bind groups. If it's not provided, we'll need to validate at draw time - this PR doesn't do this (focus on API changes first).
It also moves out the `read_spirv`, since wgpu-types wasn't the right home for it ever.

**Testing**
Tested on wgpu-rs examples

Co-authored-by: Dzmitry Malyshau <[email protected]>
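The optional behavior described in that change could be modeled roughly as follows. This is a sketch, not the wgpu or WebGPU implementation: the `LayoutBufferEntry` type and all three functions are hypothetical, and the draw-time check is shown only because the description says it would be needed when no minimum is provided.

```typescript
// Hypothetical layout entry: the minimum binding size is optional.
interface LayoutBufferEntry {
  minBufferBindingSize?: number; // bytes; undefined means "not provided"
}

// Bind group creation: if a minimum was provided, the bound range must cover it.
function validAtBindGroupCreation(entry: LayoutBufferEntry, boundSize: number): boolean {
  return entry.minBufferBindingSize === undefined || boundSize >= entry.minBufferBindingSize;
}

// Pipeline creation: if a minimum was provided, it must cover what the shader needs.
function validAtPipelineCreation(entry: LayoutBufferEntry, shaderRequiredSize: number): boolean {
  return entry.minBufferBindingSize === undefined || entry.minBufferBindingSize >= shaderRequiredSize;
}

// Draw time: only entries without a minimum still need a size check,
// comparing the bound range directly against what the shader needs.
function validAtDraw(entry: LayoutBufferEntry, boundSize: number, shaderRequiredSize: number): boolean {
  return entry.minBufferBindingSize !== undefined || boundSize >= shaderRequiredSize;
}

const withMinimum: LayoutBufferEntry = { minBufferBindingSize: 64 };
const withoutMinimum: LayoutBufferEntry = {};

const earlyChecksPass =
  validAtBindGroupCreation(withMinimum, 128) && validAtPipelineCreation(withMinimum, 48);
const lateCheckFails = validAtDraw(withoutMinimum, 32, 48);
```

When a minimum is provided, the two early checks together imply the draw-time condition, so nothing about buffer sizes has to be re-checked per draw.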
This is a follow-up to #546 (comment) that is meant to start the discussion on how this issue can be addressed, and proposes a strawman solution :)
It's often the case where the buffers visible to a shader are structured, which means the shader sees it as:
This structure can be seen as 2 parts:
This strawman proposal is to require the minimum size of the (1) portion to be specified in the BGL entry. Any buffer binding created for it has to specify a size that is ">=" that minimum size. When a pipeline is created, we can inspect the structured parts of the buffer bindings of the shader, and validate that the minimum buffer size covers the (1) portion.
What this allows us to do is avoid the access address checks for named non-array fields of buffer bindings (as well as the fields of structures in arrays). This means more efficient shader code being generated and run on the GPU. Since uniform buffers are used a lot, having efficient loads from them is essential for performance (although we don't yet have concrete benchmarks showing how much it would cost to do a check for every field access).
What this prevents the user from doing is binding a buffer that only partially covers a structured buffer binding. My feeling is that this is a fine compromise, and users can always work around it by creating a slightly bigger buffer.
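The two validation points in the strawman can be sketched as a small model. Everything here is illustrative: the `LayoutEntry` and `BufferBinding` types and both functions are hypothetical stand-ins, not the WebGPU API, and `shaderFixedPartSize` is assumed to be the byte size of part (1) obtained by inspecting the shader.

```typescript
// Hypothetical model of a bind group layout entry in the strawman.
interface LayoutEntry {
  binding: number;
  minimumBufferSize: number; // bytes every bound buffer is promised to cover
}

// Hypothetical model of a buffer bound through a bind group.
interface BufferBinding {
  binding: number;
  size: number; // bytes actually bound
}

// Pipeline creation: the promised minimum must cover the fixed-size part (1)
// of the structure the shader declares for this binding.
function validatePipelineEntry(entry: LayoutEntry, shaderFixedPartSize: number): boolean {
  return entry.minimumBufferSize >= shaderFixedPartSize;
}

// Bind group creation: the bound range must be at least the promised minimum.
function validateBindGroupEntry(entry: LayoutEntry, bound: BufferBinding): boolean {
  return bound.binding === entry.binding && bound.size >= entry.minimumBufferSize;
}

const layoutEntry: LayoutEntry = { binding: 0, minimumBufferSize: 64 };
const pipelineOk = validatePipelineEntry(layoutEntry, 48); // 64 covers 48
const bindGroupOk = validateBindGroupEntry(layoutEntry, { binding: 0, size: 256 });
const bindGroupTooSmall = validateBindGroupEntry(layoutEntry, { binding: 0, size: 32 });
```

If both checks pass, any access to a field inside part (1) is within the bound range by transitivity, which is what lets the generated shader skip per-field address checks.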
Some more details on how the buffer access rules would work in the shader: