Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document effects of vkFreeDescriptorSets and vkResetDescriptorPool failure, if any. #2317

Open
adrianlopezroche opened this issue Mar 1, 2024 · 17 comments

Comments

@adrianlopezroche
Copy link

adrianlopezroche commented Mar 1, 2024

The documentation for vkFreeDescriptorSets and vkResetDescriptorPool indicates return values of VK_SUCCESS on success, but it makes no mention of other possible return values and, more importantly, makes no mention of the state of the pool and descriptor sets after failure. Do all descriptor sets remain valid after failure? Might some be freed and others not? Could the pool itself end up in an inconsistent state after a failed reset, with e.g. some descriptor sets becoming invalid but not being returned to the pool? As it stands the documentation provides no guidance on what programs should expect after these calls have failed.

A related issue, #1070, mentions other return values which have now been removed. In the responses to that issue one contributor stated that including those other values "was probably an oversight, and if we can establish that all implementations only return successfully, we'll probably add a restriction to the allowed return value and a Note explaining the discrepancy." The return values have been removed without noting any restriction on the return value (beyond, perhaps, an implicit one by omission).

The standard needs to address this.

@krOoze
Copy link
Contributor

krOoze commented Mar 1, 2024

A return type though cannot reasonably be removed from function signature without breaking compatibility.

Meanwhile the spec is not exactly incorrect. It would return VK_SUCCESS as the specification says. While it has the return type it could perhaps additionally be able to return VK_ERROR_UNKNOWN or VK_ERROR_VALIDATION_FAILED_EXT.

Note the language already does not acknowlege any possibility of failure:

After calling vkFreeDescriptorSets, all descriptor sets in pDescriptorSets are invalid.

@adrianlopezroche
Copy link
Author

I'm not suggesting the return value be removed, but I think it's important to know what the state of the pool and descriptor sets may be when a value other than VK_SUCCESS is returned. Are descriptor sets still valid after a failed call? Are some of them valid and some of them not? Is the descriptor pool itself still in a well-defined state, and what is that state? Can you recover from such a failure or do you possibly have to destroy and recreate the pool? Those are the questions this raises for me.

@krOoze
Copy link
Contributor

krOoze commented Mar 3, 2024

What do you mean by failed call? How would the call fail, if only permitted return value is VK_SUCCESS?

@adrianlopezroche
Copy link
Author

What do you mean by failed call? How would the call fail, if only permitted return value is VK_SUCCESS?

The documentation for these calls specifically states that "on success, this command returns VK_SUCCESS." Such wording suggests to me that failure is a possibility given there's no rule to the effect that these calls must succeed. I suppose this wording can be excused by the fact that functions with a VkResult return type may return VK_ERROR_VALIDATION_FAILED_EXT or VK_ERROR_UNKNOWN without these codes being listed, but then I'd still worry about the state of the vkPool and vkDescriptorSet objects if the calls at issue here should return such an error.

With this in mind, I offer some followup questions:

  • Is it safe to ignore the return value of vkFreeDescriptorSets and vkResetDescriptorPool?
  • Should a call returning VK_ERROR_UNKNOWN be understood as resulting in undefined behavior?
  • Should a call returning VK_ERROR_VALIDATION_FAILED_EXT be understood as resulting in undefined behavior?
  • If the answer to either of these last two questions is no, then what is the expected state of the system? All descriptor sets freed even though vkFreeDescriptorSets has indicated failure? No descriptor sets freed? The pool being reset even though vkResetDescriptorPool has indicated failure? The pool not reset?

@polarina
Copy link
Contributor

polarina commented Mar 3, 2024

No function is ever allowed to return VK_ERROR_VALIDATION_FAILED_EXT.

What you observe is the result of undefined behaviour, in which case anything may happen and the validation layers just so happen to exploit that liberty. You cannot assume anything about the state of the descriptor pool, let alone the remainder of your program.

@adrianlopezroche
Copy link
Author

No function is ever allowed to return VK_ERROR_VALIDATION_FAILED_EXT.

From the VkResult manual page: "Any command returning a VkResult may return VK_ERROR_VALIDATION_FAILED_EXT if a violation of valid usage is detected, even though commands do not explicitly list this as a possible return code."

The same page states that VK_ERROR_UNKNOWN "will be returned by an implementation when an unexpected error occurs that cannot be attributed to valid behavior of the application and implementation. Under these conditions, it may be returned from any command returning a VkResult."

If either VK_ERROR_UNKNOWN or VK_ERROR_VALIDATION_FAILED_EXT can result in undefined behavior I believe it should be pointed out explicitly in the spec (though it's entirely possible that it does and I've missed it). If vkFreeDescriptorSets and vkResetDescriptorPool are only ever supposed to return VK_SUCCESS then I believe the standard should state so explicitly, perhaps to the effect of "this command must return VK_SUCCESS except in case of validation errors."

@adrianlopezroche
Copy link
Author

OK, so I'm going over the spec again and I see that section 3.7 (Valid Usage) clearly states: "The core layer assumes applications are using the API correctly. Except as documented elsewhere in the Specification, the behavior of the core layer to an application using the API incorrectly is undefined, and may include program termination."

Given such wording, the inevitable conclusion is that VK_ERROR_UNKNOWN and VK_ERROR_VALIDATION_FAILED_EXT are both indicative of possible undefined behavior. So the documentation is reasonably clear on this, though it wouldn't hurt to mention undefined behavior again in the documentation for VkResult.

With this in mind I propose the following changes, specifically:

  • In the documentation for vkFreeDescriptorSets and vkResetDescriptorPool, add language to the effect that the return value exists for historical reasons only and that it's safe to assume the functions always return VK_SUCCESS.
  • In the documentation for VkResult, add the following (shown here in bold) to the existing text: "VK_ERROR_UNKNOWN will be returned by an implementation when an unexpected error occurs that cannot be attributed to valid behavior of the application and implementation. Under these conditions, it may be returned from any command returning a VkResult. If VK_ERROR_UNKNOWN is returned the behavior of the core layer is undefined."
  • Likewise, "Any command returning a VkResult may return VK_ERROR_VALIDATION_FAILED_EXT if a violation of valid usage is detected, even though commands do not explicitly list this as a possible return code. If VK_ERROR_VALIDATION_FAILED_EXT is returned the behavior of the core layer is undefined."

@oddhack
Copy link
Contributor

oddhack commented Mar 6, 2024

The documentation for these calls specifically states that "on success, this command returns VK_SUCCESS."

That's just an artifact of the script which autogenerates this language.

@adrianlopezroche
Copy link
Author

The documentation for these calls specifically states that "on success, this command returns VK_SUCCESS."

That's just an artifact of the script which autogenerates this language.

That's true, but a search against the manual pages for Vulkan 1.2 reveals the only calls returning something on success and nothing on failure are vkFreeDescriptorSets and vkResetDescriptorPool. All other calls either have a void return type or return some sort of VK_ERROR_* code on failure. I haven't run a search on the Vulkan 1.3 manual pages, but I expect the results would be the same.

Since these are the only two functions in Vulkan that do this, I believe it would be appropriate to note explicitly that the return value exists for historical reasons only and that it's safe to assume the functions always return VK_SUCCESS.

@oddhack
Copy link
Contributor

oddhack commented Mar 8, 2024

Since these are the only two functions in Vulkan that do this, I believe it would be appropriate to note explicitly that the return value exists for historical reasons only and that it's safe to assume the functions always return VK_SUCCESS.

I think there are a handful of vendor extensions with similar lack-of-error behavior, but agreed, not many. One possibility is to recognize when a command has no error codes in the XML and generate some boilerplate about that situation, if that would help.

@krOoze
Copy link
Contributor

krOoze commented Mar 9, 2024

@oddhack Already ahead in that direction: #2325

@adrianlopezroche
Copy link
Author

@oddhack Already ahead in that direction: #2325

I believe this would make things clear enough.

@TomOlson
Copy link

#2325 is a good fix for this case. Dropping in a note to the WG to also consider vkResetCommandPool, which is listed as returning either VK_SUCCESS or OODM. Did we intend this, or should it also be logically void, i.e. not allowed to fail?

@krOoze
Copy link
Contributor

krOoze commented Mar 12, 2024

@TomOlson Please also consider vkResetCommandBuffer, vkResetFences, and vkResetEvent.

@oddhack
Copy link
Contributor

oddhack commented Mar 13, 2024

@TomOlson as we discussed just now, the couple of commands that explicitly have VK_ERROR_UNKNOWN errorcodes in the XML should probably also be considered in this context. I suspect it's just an oversight, these are from fairly obscure vendor extensions.

@oddhack oddhack closed this as completed Mar 13, 2024
@oddhack oddhack reopened this Mar 13, 2024
@TomOlson
Copy link

Tracking internally as vulkan#3824: vkResetCommandPool, vkResetCommandBuffer, vkResetFences, and vkResetEvent, plus any documented uses of VK_ERROR_UNKNOWN. Informally, it sounds like we do have some implementations that detect and report memory heap exhaustion when resetting command pools, but we'll confirm. We agree that developers should not have to waste time worrying about error returns that will never happen, so we'll fix any cases we find.

@oddhack
Copy link
Contributor

oddhack commented Mar 15, 2024

@TomOlson in addition to those commands, vkAcquireDrmDisplayEXT and vkGetDrmDisplayEXT contain spec language defining conditions under which VK_ERROR_UNKNOWN must be returned (though not XML errorcodes, oddly). This is a different form of the problem but probably should be addressed as well, ideally by having them return a meaningful error code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants