Add minimumBufferSize to BGL entry #678

kvark · 2020-04-01T16:06:31Z

This is a follow-up to #546 (comment) that is meant to start the discussion on how this issue can be addressed, and proposes a strawman solution :)

It's often the case where the buffers visible to a shader are structured, which means the shader sees it as:

type Particles = struct {
  [[offset 0]] particles : array<float, 10>;
  [[offset 40]] other_field: i32;
  [[offset 44]] unbound_array: array<i32, 0>;
};

This structure can be seen as 2 parts:

the bounded portion: it has named fields, some could be fixed-size arrays.
the unbound array of something, indexed dynamically.

This stawman proposal is to require the minimum size of (1) portion to be specified in BGL entry. Any buffer binding created for it has to specify the size that is ">=" that minimum size. When a pipeline is created, we can inspect the structured parts of the buffer bindings of the shader, and validate that the minimum buffer size covers the (1) portion.

What this allows us to do is basically avoiding the access address checks for named non-array fields of buffer bindings (as well as the fields of structures in a arrays). This means more efficient shader code being generated and ran on the GPU. Since uniform buffers are used a lot, having efficient loads from them is essential for performance (although, we don't have concrete benchmarks showing how much it would cost to do a check for every field access, yet).

What this limits the user to do is having a buffer that only partially covers a structured buffer binding. My feeling is that this is a fine compromise, and the users can always work around this by creating a slightly bigger buffer.

Some more details on how the buffer access rules would work in the shader:

when accessing arrays within a buffer, we check for index and consider the access to be OOB if the structure by that index doesn't fit into the binding (i.e. no partial structure access).
when accessing other fields (either top-level, or within an array by index that is in bounds), there need to be no checks done.

Preview | Diff

kainino0x · 2020-04-02T17:56:06Z

Would we validate this against the shader at pipeline creation? Or would it just be a "hint" to allow elision of a runtime check?

If the former, we should mention that in the definition for the new field. (can spec it later when we have spec for create*Pipeline)

kvark · 2020-04-02T17:58:21Z

Yes, I described this here:

When a pipeline is created, we can inspect the structured parts of the buffer bindings of the shader, and validate that the minimum buffer size covers the (1) portion.

I agree more details should be provided in the documentation. Will be happy to follow up after the initial discussion, if the group sees the value in pursuing this direction.

Kangz · 2020-04-03T15:45:44Z

Thanks for putting this up! This is an interesting proposal and would avoid the need to do runtime checks for this.

My biggest concern is that this reduces flexibility and requires another piece of data that applications were not used to providing. It's unclear how much harder it makes WebGPU to use (from not at all, to OMG I CAN'T PORT MY CONTENT). I think Jasper commented about this in the Matrix chatroom, and hacks that D3D developers were using to do unsized arrays in shaders.

Another small thing to consider is maybe asking for the size of an extra array element to be in the minimumBufferSize so that just clamping the array index is sufficient, since there's always valid data in the first element of the array.

kvark · 2020-04-03T21:37:17Z

@Kangz

I think Jasper commented about this in the Matrix chatroom, and hacks that D3D developers were using to do unsized arrays in shaders

IIRC, we agreed that this wasn't applicable to WGSL. If there is a concrete statement there, let's have it in this issue (or any other).

My biggest concern is that this reduces flexibility and requires another piece of data that applications were not used to providing.

We clearly require more data already, and we clearly require the existing content to go through some transformation (together with another round of testing) before it can ship. Providing an extra value here shouldn't be a problem unless there is a really tricky use-case where the same bindings was used with 2 different shader structures. I haven't seen this in practice, would be open to hear more from ISVs if that's something they intend to do.

Another small thing to consider is maybe asking for the size of an extra array element to be in the minimumBufferSize so that just clamping the array index is sufficient, since there's always valid data in the first element of the array.

It's a valid implementation behavior for this PR, but I don't think it needs to be specified or requires any changes to the PR, does it? The semantics of the "minimumBufferSize" is, well, that it's a lower bound. So the implementation can extract as much benefit as it can from knowing that ahead of time.

kainino0x · 2020-04-03T22:44:40Z

Not in the PR yet, but wouldn't it change the validation for pipeline creation (when validating the minimumBufferSize against the shader code)?

kvark · 2020-04-03T23:45:44Z

I think it would still be exactly as described:

When a pipeline is created, we can inspect the structured parts of the buffer bindings of the shader, and validate that the minimum buffer size covers the (1) portion.

In other words, the validation only cares about the structured part being covered.

If we want to wiggle it, we also can: e.g. by removing all the validation here. If the user provided the minimum size - use it, otherwise - do something more complex. I do however think that just eliminating the complexity of access-guarding the structured fields as a class is most appealing, and it would require the validation as described.

kainino0x · 2020-04-04T00:21:43Z

If I understand that and what @Kangz said correctly, this would validate the minimumBufferSize is at least 44, while @Kangz's option would require it to be 48: 44 + one unbounded array element (i32).

kvark · 2020-04-04T00:33:34Z

I asked why people care about clamping in #546 (comment) and got no responses, so I guess I'm asking about the same here. Why do we need to support clamping? Does it have any advantages over zeroing?

kainino0x · 2020-04-04T00:40:12Z

Got it, and I'm not sure. We'll have to discuss it.

litherum · 2020-04-06T20:15:44Z

@Kangz said:

requires another piece of data that applications were not used to providing.

There might be a middle ground here, where applications can provide a minimum buffer size, but there are no requirements placed on what that size is. Applications which don’t know any minimum buffer sizes ahead of time can supply a value of “0” which would effectively mean “no minimum.”

When an implementation compiles a shader, these minimum buffer sizes would be provided as an additional input, and any accesses which lie outside of these minimum sizes would be guarded. That way, an application which knows its minimum size can supply it to gain additional speed, but an application which doesn’t know it would omit it, and all accesses would be guarded, reducing speed.

One tenant of this idea is that these minimum buffer size values would be supplied in createShaderModule() instead of createPipeline().

This means that if an application wants to use the same shader module in two different contexts, one where it knows the minimum buffer sizes and one where it doesn’t, the application would have to compile their shader module multiple times. The duplicate compilation makes sense, because the compilation process is the one which produces the OOB guards. The multiple compilations would happen explicitly at the author’s request, not invisibly under-the-hood. Or, if they don’t want to compile multiple times, they could just re-use the shader that includes all the guards, if they are willing to sacrifice shader runtime performance to improve shader compile-time performance.

Kangz · 2020-04-07T17:14:03Z

Here's a quote from #33 that's an old issue talking about the same problem. It shows how to do robust access in SPIR-V assuming the following:

The top level of image views isn't empty.

Buffer views contain enough space for the "sized" part of the structures, and at least one element of the unsized part if present.

That is, the CPU-side validation ensures that the buffer is big enough, either through draw/dispatch-time validation or with a mechanism like minimumBufferSize. We discussed this briefly with @dneto0 and it doesn't seem very tractable to guard all memory accesses in case the CPU-side can't guarantee a minimum size. (plus I think that GPUs often prefetch uniform buffers?)

How about having minimumBufferSize but if it's not present we could do draw-time validation?

kainino0x · 2020-04-07T17:46:55Z

I'd prefer to start by requiring minimumBufferSize, and consider relaxing it later (making it optional) later, as a nonbreaking change, if early users find that it's necessary.

Given that we can't guard accesses to structured members, I think that a relaxed minimumBufferSize should be either present and large enough for the struct (validated at pipeline creation time), or not present, as I don't think anything in between is useful.

I'm still not sure whether providing space for one element of unsized arrays is needed or not.

RafaelCintron · 2020-04-07T17:57:34Z

What validation will we do to ensure that minimumBufferSize is a value for which we can actually skip out-of-bounds checks? If we can validate it, why can't we determine the value ourselves?

If we're taking untrusted values from the user and using it to skip out-of-bounds checks, that seems like a bad thing.

kainino0x · 2020-04-07T19:13:55Z

minimumBufferSize is added to the BindGroupLayout, but it can't be computed until Pipeline (or maybe ShaderModule) creation. We would validate the the two are compatible as part of the Pipeline-PipelineLayout matching rules.

kdashg · 2020-04-11T00:03:57Z

Well-formed content won't bind too-small buffers though, will it?

What if we failed operations if the buffers are smaller than the size of part (1)?
This would treat all content as if minimumBufferSize were always size-of-part-1.

Is there a use case for minimumBufferSize smaller or larger than size-of-part-1 that I'm not seeing?

kainino0x · 2020-04-11T00:26:15Z

This change lets us do that validation at bind group creation time, which means at draw time we only have to validate the layout of the pipeline against the layouts of the bound bind groups (which we have to do anyway)

dneto0 · 2020-04-23T21:33:58Z

I'm in the middle of catching up on this.

First, I strongly agree that the buffer size should at least cover the fixed portion of the arg-buffer. The (1) part. I can't think of a good use case. It would also add complexity to the index-clamping transforms, I think.

I asked why people care about clamping in #546 (comment) and got no responses, so I guess I'm asking about the same here. Why do we need to support clamping? Does it have any advantages over zeroing?

When clamping, the pattern is to still do the access after computing the clamped address:

Before:

    buf[i]

After:

    buf[clamp(0,i,buf.length())]

As @litherum noted in #546 (comment):

Clamping for OOB accesses doesn't work for zero-sized buffers.

If there is no backing store for the 0 element of the buffer, then it's an out of bounds access. I think it's a poor tradeoff to have to inject extra code to deliberately skip the access:

    if (buf.length() > 0) {   
          buf[clamp(0,i,buf.length())];
    }

When you say "zeroing" do you mean having the implementation automatically expand the underlying storage, I think to include at least one element of the unsized array (2), and filling it with zero?

From the perspective of injection of index-clamping shader transform, that's fine. But does it break a contract with the programmer? When they specified a buffer that implies a zero-sized runtime-array, do they mind that the actual length reported in the shader is more than zero? (GLSL .length() method, or SPIR-V OpArrayLength instruction)

kainino0x · 2020-04-23T21:51:45Z

When you say "zeroing" do you mean having the implementation automatically expand the underlying storage, I think to include at least one element of the unsized array (2), and filling it with zero?

I'm pretty sure "zeroing" meant (i + sizeof(buf[i]) <= buf.length()) ? buf[i] : 0.

[EDIT: ~~fixed~~ -- ~~wait still not fixed~~ -- ok now fixed]

dneto0 · 2020-06-01T13:44:38Z

I'm pretty sure "zeroing" meant (i + sizeof(buf[i]) <= buf.length()) ? buf[i] : 0.

That introduces control flow at every memory access. It seems like a lot for good code to pay for the bad cases.

kvark · 2020-06-01T16:22:53Z

I thought we could mask out the OOB access based on the index, but now I realize that accessing itself is UB in Metal (for example), so branching is necessary, and I agree with @dneto0 's concern about it being heavy...

kvark · 2020-06-01T16:24:29Z

For the OpArrayLength, we need to talk about how it's implemented on MSL and HLSL/DXIL (and the respected backends). If it needs to be emulated anyway, we might as well emulate it everywhere to have full control over what length the shader sees.

litherum · 2020-06-08T19:18:30Z

Discussed on the June 8th 2020 call

kainino0x · 2020-06-08T19:32:31Z

Resolved: add minimumBufferSize but make it optional. If specified it must be >= the size required by the shader code. TBD whether unspecified defaults to 0 or undefined.

Default GPUPipelineLayout is generated with BGLs with minimumBufferSize.

kainino0x · 2020-06-08T22:39:03Z

We discussed this in the editors meeting but had no strong feelings. I have a slight preference for 0 based on the following:

Argument in favor of 0: Semantically, it really does mean "the minimum buffer size, for a buffer in a bind group with this bind group layout, is 0." So in that sense 0 is not being used as a sentinel value but as a real one, in the interface between GPUBindGroupLayout and GPUBindGroup.

Argument against 0 and in favor of undefined: This is a special case because it implies the validation occurs at draw time (since it can't happen at createBindGroup time).

Argument against undefined: That's not exactly a special case involving GPUBindGroupLayout, but instead a special case of interaction between the bound bind groups and the bound pipeline (if the bound buffers aren't large enough for the pipeline - which, as it so happens, can only occur if minimumBufferSize was zero).

@litherum, thoughts?

EDIT: with #851 (comment) I might prefer a sentinel value instead, since we'd use one across all three of these dictionary members

kvark · 2020-06-08T22:39:39Z

This is now rebased and updated, also linking to #851

spec/index.bs

JusSn

Looks good. some grammar nits.

Co-authored-by: Justin Fan <[email protected]>

726: Basic support for minBufferBindingSize r=cwfitzgerald a=kvark **Connections** Has basic (partial) implementation of gpuweb/gpuweb#678 wgpu-rs update - gfx-rs/wgpu-rs#377 **Description** This change allows users to optionally specify the expected minimum binding size for buffers. We are then validating this against both the pipelines and bind groups. If it's not provided, we'll need to validate at draw time - this PR doesn't do this (focus on API changes first). It also moves out the `read_spirv`, since wgpu-types wasn't the right home for it ever. **Testing** Tested on wgpu-rs examples Co-authored-by: Dzmitry Malyshau <[email protected]>

)

kvark added the for meeting label Apr 2, 2020

kainino0x removed the for wgsl meeting label Apr 14, 2020

kvark mentioned this pull request Apr 20, 2020

Binding to zero-sized range of a buffer #686

Closed

JusSn closed this Jun 8, 2020

JusSn reopened this Jun 8, 2020

kvark mentioned this pull request Jun 8, 2020

Case for optional GPUBindGroupLayoutEntry entries #851

Open

kvark force-pushed the min-buf-size branch from 8cb71b2 to cac0aac Compare June 8, 2020 22:38

kvark requested review from kainino0x and JusSn June 8, 2020 22:39

kainino0x approved these changes Jun 8, 2020

View reviewed changes

spec/index.bs Outdated Show resolved Hide resolved

spec/index.bs Show resolved Hide resolved

spec/index.bs Outdated Show resolved Hide resolved

Kangz reviewed Jun 9, 2020

View reviewed changes

spec/index.bs Outdated Show resolved Hide resolved

kvark added 3 commits June 10, 2020 12:17

Add minimumBufferSize to BGL entry

e80fd50

Add pipeline validation text

75082d2

Rename to minBufferBindingSize, address review nots

83111fc

kvark force-pushed the min-buf-size branch from cac0aac to 83111fc Compare June 10, 2020 16:17

iraiter2 mentioned this pull request Jun 12, 2020

Add minimum buffer size for dawn reflection google/shaderc#1084

Merged

kvark mentioned this pull request Jun 12, 2020

metal uniform buffer size issue gfx-rs/wgpu#185

Closed

JusSn reviewed Jun 15, 2020

View reviewed changes

spec/index.bs Outdated Show resolved Hide resolved

JusSn reviewed Jun 15, 2020

View reviewed changes

spec/index.bs Outdated Show resolved Hide resolved

JusSn approved these changes Jun 15, 2020

View reviewed changes

kvark and others added 2 commits June 15, 2020 17:15

Update spec/index.bs

15d8a8d

Co-authored-by: Justin Fan <[email protected]>

Update spec/index.bs

23c2851

Co-authored-by: Justin Fan <[email protected]>

kvark merged commit 0e437f2 into gpuweb:master Jun 15, 2020

kvark deleted the min-buf-size branch June 15, 2020 21:27

kvark mentioned this pull request Jun 16, 2020

Basic support for minBufferBindingSize gfx-rs/wgpu#726

Merged

This was referenced Sep 24, 2021

webgpu: fix uniform block incorrectly tensorflow/tfjs#5487

Open

Remove no-op minBindingSize text #2141

Merged

ben-clayton pushed a commit to ben-clayton/gpuweb that referenced this pull request Sep 6, 2022

Fix validation errors in operation,vertex_state,index_format (gpuweb#678

191b407

)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add minimumBufferSize to BGL entry #678

Add minimumBufferSize to BGL entry #678

kvark commented Apr 1, 2020 •

edited by pr-preview bot

Loading

kainino0x commented Apr 2, 2020

kvark commented Apr 2, 2020

Kangz commented Apr 3, 2020

kvark commented Apr 3, 2020

kainino0x commented Apr 3, 2020

kvark commented Apr 3, 2020

kainino0x commented Apr 4, 2020

kvark commented Apr 4, 2020

kainino0x commented Apr 4, 2020

litherum commented Apr 6, 2020 •

edited

Loading

Kangz commented Apr 7, 2020

kainino0x commented Apr 7, 2020

RafaelCintron commented Apr 7, 2020

kainino0x commented Apr 7, 2020

kdashg commented Apr 11, 2020

kainino0x commented Apr 11, 2020

dneto0 commented Apr 23, 2020

kainino0x commented Apr 23, 2020 •

edited

Loading

dneto0 commented Jun 1, 2020

kvark commented Jun 1, 2020

kvark commented Jun 1, 2020

litherum commented Jun 8, 2020 •

edited

Loading

kainino0x commented Jun 8, 2020 •

edited

Loading

kainino0x commented Jun 8, 2020 •

edited

Loading

kvark commented Jun 8, 2020

JusSn left a comment

Add minimumBufferSize to BGL entry #678

Add minimumBufferSize to BGL entry #678

Conversation

kvark commented Apr 1, 2020 • edited by pr-preview bot Loading

kainino0x commented Apr 2, 2020

kvark commented Apr 2, 2020

Kangz commented Apr 3, 2020

kvark commented Apr 3, 2020

kainino0x commented Apr 3, 2020

kvark commented Apr 3, 2020

kainino0x commented Apr 4, 2020

kvark commented Apr 4, 2020

kainino0x commented Apr 4, 2020

litherum commented Apr 6, 2020 • edited Loading

Kangz commented Apr 7, 2020

kainino0x commented Apr 7, 2020

RafaelCintron commented Apr 7, 2020

kainino0x commented Apr 7, 2020

kdashg commented Apr 11, 2020

kainino0x commented Apr 11, 2020

dneto0 commented Apr 23, 2020

kainino0x commented Apr 23, 2020 • edited Loading

dneto0 commented Jun 1, 2020

kvark commented Jun 1, 2020

kvark commented Jun 1, 2020

litherum commented Jun 8, 2020 • edited Loading

kainino0x commented Jun 8, 2020 • edited Loading

kainino0x commented Jun 8, 2020 • edited Loading

kvark commented Jun 8, 2020

JusSn left a comment

Choose a reason for hiding this comment

kvark commented Apr 1, 2020 •

edited by pr-preview bot

Loading

litherum commented Apr 6, 2020 •

edited

Loading

kainino0x commented Apr 23, 2020 •

edited

Loading

litherum commented Jun 8, 2020 •

edited

Loading

kainino0x commented Jun 8, 2020 •

edited

Loading

kainino0x commented Jun 8, 2020 •

edited

Loading