Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling invalid/not-enabled chained structs on extensible Out-structs #115

Closed
litherum opened this issue Nov 16, 2021 · 19 comments · Fixed by #377
Closed

Handling invalid/not-enabled chained structs on extensible Out-structs #115

litherum opened this issue Nov 16, 2021 · 19 comments · Fixed by #377
Labels
errors Error reporting extensibility Adding features without breaking API changes has resolution Issue is resolved, just needs to be done

Comments

@litherum
Copy link

WebGPU's limits don't seem to be optional, so it doesn't make much sense why wgpuAdapterGetLimits() returns a bool.

@austinEng
Copy link
Collaborator

IIRC, this is for extensibility.

In the JS API, if/when WebGPU's limits gain additional members, they can be detected in the JS API by checking for the existence of properties in GPUSupportedLimits or perhaps by first querying if the limit's respective feature is available in GPUSupportedFeatures.

The C API can't just add members to the struct without breaking ABI compatibility, so additions would be expressed by doing something like:

WGPULimits limits;
WGPULimitsForNewFeature newFeatureLimits;
limits.nextInChain = &newFeatureLimits;

wgpuAdapterGetLimits(&limits);

and it will populate both limits and newFeatureLimits;

wgpuAdapterGetLimits would return false if the implementation cannot comprehend the struct type. Basically meaning that you're holding it wrong and the implementation couldn't fulfill your request.
You should first check with wgpuAdapterHasFeature if the implementation knows how to understand the struct WGPULimitsForNewFeature.

@litherum
Copy link
Author

"you're holding it wrong" 😐

@kainino0x
Copy link
Collaborator

If "you're holding it wrong" can wgpuAdapterGetLimits just crash?

@kainino0x kainino0x reopened this Nov 17, 2021
@austinEng
Copy link
Collaborator

why would crashing be a better option?

Is the implementation supposed to insert an actual crash? If so, it has to do something to detect the failure condition, so indicating to the developer that something went wrong seems better than an unrecoverable crash.
Or would it be a no-op that skips unknown structs in release builds and some assert in debug builds?

@kainino0x
Copy link
Collaborator

What is the application supposed to do about it though? If it's purely a programming error then the developer (e.g. browser developer) should deal with it, not the application (e.g. the browser code).

Analogously, we might reasonably crash if you pass in a null pointer where there shouldn't be one, or similar.

I was thinking an assert, debug-only would be reasonable.

@Kangz
Copy link
Collaborator

Kangz commented Nov 22, 2021

In general the API doesn't crash if you pass an incorrect sType: it produces a validation error. This is a bit similar where we don't crash but gracefully return that it is an error. Agreed that it's not that much better than than crashing in terms of what the code can do, but at least it can provide a correct error message to the user "whoops I did something bad" instead of just crashing the application/GPU process.

@kainino0x
Copy link
Collaborator

A web application wouldn't be able to trigger this though. E.g. it would be Chrome's responsibility to make sure it doesn't pass stuff to Dawn that Dawn doesn't understand (which should be easy).

TBH, thinking about it now, I think of the extension structs as being purely a part of the C bindings and think it would be fine if all functions crashed if given extension chains they don't understand. Calling this "validation" is a bit weird because I think it's blurring the line between bindings and behavior.

@Kangz
Copy link
Collaborator

Kangz commented Nov 23, 2021

I guess we never spelled out the non-crashability guarantees of this header. In my mind if you respected the following, then an application would be guaranteed to not crash:

  • Non-nullable pointer to types contain the correct type (with the size you promised), and if the type is an object, the object is not freed yet.
  • You don't index out of the range returned by Get[Const]MappedRange.
  • Extension structures are tagged with the correct sType.

So here the question is whether we add a new constraint:

  • Extension structs must only be used if the respective extension is enabled.

I have a slight preference to make non-crashability easier for applications, especially since it happens "for free" in (our) implement, but either way would be fine. Whatever we decide we should start describing it in docs.

@austinEng
Copy link
Collaborator

We could also say that unknown structs are completely skipped - so no crash, and no bool. One thing is that It could be slightly surprising if you pass a chained out-struct and only some of the structs are actually populated and the others are left uninitialized. Leaving those structs uninitialized is already the case when wgpuAdapterGetLimits returns false.

@kainino0x
Copy link
Collaborator

  • Extension structs must only be used if the respective extension is enabled.

Ah I was thinking about this wrong, I was thinking about an implementation that doesn't understand the sType at all, not just a device that doesn't have the feature enabled. In retrospect that doesn't really make sense because that would mean there was a version mismatch between header and implementation.

If the sType is something completely wrong (not recognized by the implementation, maybe because the sType is for a different extension chain) I think it makes sense to crash or debug-assert.

If the sType is something that would be valid with the correct feature enabled, but that feature isn't enabled, then leaving the struct uninitialized, or maybe zero-initializing it, would be reasonable.

@henrybetts
Copy link

Should wgpuAdapterGetProperties also return a bool, since it looks like this is also meant to be extensible?

I was also wondering about the consistency with wgpuShaderModuleGetCompilationInfo - I'm assuming that in this case, the implementation adds whatever extensions it thinks are relevant to WGPUCompilationInfo, rather than the user requesting them specifically. Could this approach also work for the limits and properties? I suppose the ownership/lifetime of the extensions makes it a bit tricky.

@kainino0x
Copy link
Collaborator

Should wgpuAdapterGetProperties also return a bool, since it looks like this is also meant to be extensible?

Yeah, I agree, looks like they should match (either both return bool or both return void).

I was also wondering about the consistency with wgpuShaderModuleGetCompilationInfo - I'm assuming that in this case, the implementation adds whatever extensions it thinks are relevant to WGPUCompilationInfo, rather than the user requesting them specifically.

Correct, and it's that way because wgpuShaderModuleGetCompilationInfo takes a callback so the library can own the return value.

Could this approach also work for the limits and properties? I suppose the ownership/lifetime of the extensions makes it a bit tricky.

We could do this for limits/properties if they also took callbacks (which could called inline rather than asynchronously), but I think most applications would want to hold onto the limits/properties data for longer than that. (They might want to do that with shader compilation info too, but it's harder because they don't know the size of the output ahead of time - we'd have to have a "get number of messages" followed by "write all the messages into this array for me", like Vulkan has in many of its APIs.)

@kryptan
Copy link

kryptan commented Feb 22, 2022

I was thinking about an implementation that doesn't understand the sType at all, not just a device that doesn't have the feature enabled. In retrospect that doesn't really make sense because that would mean there was a version mismatch between header and implementation.

Don't extension chains exist to provide ABI stability? If header should always match implementation then there is no need to have extension chains and new fields can simply be added to structs instead.

Maybe a boolean field can be added to WGPUChainedStructOut. Implementation would then set it to true/false depending on whether it initialized it or not. This way you won't need to check supported features before calling wgpuAdapterGetLimits, just add all structs that your care about to the chain and then check which ones were initialized.

@kainino0x
Copy link
Collaborator

Don't extension chains exist to provide ABI stability? If header should always match implementation then there is no need to have extension chains and new fields can simply be added to structs instead.

I think that depends on whether we want unidirectional compatibility (older header can be used with newer binary) or bidirectional compatibility (any header can be used with any binary) in our ABI.

@kainino0x kainino0x added the errors Error reporting label Aug 3, 2023
@kainino0x kainino0x added the extensibility Adding features without breaking API changes label Aug 19, 2023
@kainino0x
Copy link
Collaborator

kainino0x commented Aug 31, 2023

  • Error reporting vs undefined behavior
    • in WGPUFuture
    • for instance creation
      • wgpu doesn’t report errors on instance creation, but does on surface creation
      • dawn’s logging callback is on the device. AE: we want this to know which webpage’s devtools to report back to
      • add a logging callback to the instance descriptor? no logging callbacks yet
      • CF: no easy way in wgpu to route errors to a logging callback. Maybe just don’t standardize this yet?
    • for extension structs
      • CF: when does this come up?
      • AE: extra limits added by a feature, extra adapter properties
        • Why does wgpuAdapterGetLimits() return a bool? #115
      • KN: had proposed just crashing when we see a struct we don’t understand
      • AE: another case, export a platform-specific handle from a WGPUFuture
      • LK: overwrite the sType with Invalid?
      • CF: people won’t know to check it.
      • RM: webgpu.h doesn’t currently have a way to report any errors above the device (instance, surface)
      • LK: out-structs could have a valid-or-not-valid in the base struct of the chain.
      • AE: Similar to what we have with wgpuAdapterGetLimits returning a bool
      • CF: Every time we have a crash it’s a headache, because it’s on a user’s machine and it’s a high-priority bug to not do that anymore.
      • Can just return a enum probably.
        • KN: Don’t have an exhaustive list of places we’d need to do this, but works for GetLimits and WaitAny which both already basically do this.
      • CF: What about in-structs?
      • AE: Generate an error to the error scopes
      • CF: Emscripten will have to do the error emulation stuff to inject errors.
        • KN: Sigh but probably should just do it.
    • Synchronous validation in getMappedRange #64
      • — not discussed —

@kainino0x
Copy link
Collaborator

meeting:

  • What should this enum look like? A single one for all entrypoints? Just two variants, Success/Failure?
  • Yes, single two-variant enum (to be bikeshedded)

(See also the somewhat related #225.)

@kainino0x kainino0x changed the title Why does wgpuAdapterGetLimits() return a bool? Handling invalid or not-enabled chained structs on extensible *out*-structs Sep 7, 2023
@kainino0x kainino0x changed the title Handling invalid or not-enabled chained structs on extensible *out*-structs Handling invalid/not-enabled chained structs on extensible Out-structs Sep 7, 2023
@kainino0x
Copy link
Collaborator

kainino0x commented Sep 7, 2023

This should be a complete list of functions that take out-structs:

  • No return value (or currently returns bool)
    • wgpuAdapterGetLimits (WGPUSupportedLimits)
    • wgpuAdapterGetProperties (WGPUAdapterProperties)
    • wgpuDeviceGetLimits (WGPUSupportedLimits)
    • wgpuGetInstanceFeatures (WGPUInstanceFeatures, WGPUFuture #199)
    • wgpuSurfaceGetCapabilities (WGPUSurfaceCapabilities)
  • Non-extensible out-structs:
  • Note there are a few currently-Dawn-specific extensions too:
    • wgpuSharedTextureMemoryEndAccess (WGPUSharedTextureMemoryEndAccessState), currently returns bool but I don't remember why
    • wgpuSharedFenceExportInfo (WGPUSharedFenceExportInfo), returns void

Having verified that none of the standardized ones have return values already, we can apply the proposal above to all of them. Marking "has resolution".

@austinEng
Copy link
Collaborator

We never bikeshed the name of the status enum. Here's a suggestion.

typedef enum WGPUStatus {
    WGPUStatus_Success = 0x00000000,
    WGPUStatus_Error = 0x00000001,
    WGPUStatus_Force32 = 0x7FFFFFFF
} WGPUStatus WGPU_ENUM_ATTRIBUTE;

@austinEng austinEng added the !discuss Needs discussion (at meeting or online) label Apr 18, 2024
@kainino0x
Copy link
Collaborator

April 25th meeting:

  • [Proposal above]
  • need to have other enum values?
  • KN: I don’t think so - we intentionally decided not to complicate it
  • CF: might be useful to mark it as "must use"
  • KN: C23 has it
  • KN: Maybe could do it per-entrypoint
  • KN: probably fine to ignore it most of the time. if your application works, it works.
  • KN: How do you suppress it?
  • cast to void
  • KN: could do it for everything that returns a status / object. I’ll file another issue to track the work to add nodiscard to a bunch of stuff
  • Ok to proposal with simple Success/Error status enum
  • Ok to add [[nodiscard]]

@kainino0x kainino0x removed the !discuss Needs discussion (at meeting or online) label Jun 6, 2024
kainino0x added a commit to kainino0x/webgpu-headers that referenced this issue Sep 21, 2024
Not used yet.
The values are wrong (we intend to shift them to 1,2) but I'll fix that
along with the other enums because it may be a generator-wide change.

Issue: webgpu-native#115
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
errors Error reporting extensibility Adding features without breaking API changes has resolution Issue is resolved, just needs to be done
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants