Vulkan loader/layer interface makes it difficult to query what extensions an ICD supports #866
Comments
I'm going to enumerate what I perceive to be the listed issues.
I don't have any concrete solutions for those problems. Much of the issue is that the design did indeed try to keep layers from having to know/deal with the multiple drivers on the system. One idea might be to allow querying the instance extensions active on a physical device. Downside is this has to occur after an instance is created. But the alternative is to make layers aware of the Drivers on the system, something which isn't possible now. There is no
I'm not sure as to what complexity you are referring to here. Layers should handle unknown function pointers uniformly, in that they simply query the next layer down (save for |
I am referring to the case where a layer intercepts Similarly, for extensions that alter the behaviour of some Vulkan entrypoints, the layer may have to intercept these entrypoints and undo the alterations introduced by extensions that the layer enabled but that the application is not expecting to be enabled. In an ideal world, the layer would instead tell the Vulkan loader which extensions it needs without having to hack the |
I'm not sure it's specified what happens when a layer enables an extension. It would stand to reason that a layer should be able to enable an extension; after all, layers can make additional Vulkan API calls while intercepting a Vulkan call. If a layer wishes to use an extension that the application didn't specify, that should be allowed, as it follows that a layer which needs to call an extension function must enable that extension first. What isn't clear is all the behavior around that. I see the logic of a layer returning NULL for entrypoints that the application doesn't expect to be valid: while the layer itself may load those entrypoints, the application didn't enable the extension. However, as far as the driver is concerned, the extension was enabled, so there is no issue if the application calls those entrypoints. This distinction leads me to think that layers shouldn't be responsible for returning NULL here, since the extension is being enabled regardless. I don't see anywhere in the specification which describes the behavior. The Vulkan-Loader will emit a debug message for any extension not supported by the ICD, but the validation of the extensions happens before calling down the chain. As for a specific solution to layers adding extensions, we could add a new struct to the pNext chain that holds the necessary info. I think it's not impossible to add a private loader-layer function which lets layers ask which physical devices support instance extensions.
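To make the "layer enables an extension the application didn't ask for" case concrete, here is a minimal sketch of the list manipulation involved. The helper name `layer_add_extension` is hypothetical and is not part of the Vulkan API or the loader; it only builds the merged name array and avoids adding a duplicate.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the Vulkan API or the loader.
 * Returns a newly allocated array holding the application's extension
 * names plus `extra` (unless it is already present). The strings are
 * borrowed; only the array is owned by the caller, who must free() it.
 * Returns NULL on allocation failure. */
static const char **layer_add_extension(const char *const *app_names,
                                        uint32_t app_count,
                                        const char *extra,
                                        uint32_t *out_count)
{
    assert(extra != NULL && out_count != NULL);

    /* Room for every application entry plus one extra. */
    const char **names = malloc(((size_t)app_count + 1) * sizeof *names);
    if (names == NULL)
        return NULL;

    int already_enabled = 0;
    for (uint32_t i = 0; i < app_count; i++) {
        names[i] = app_names[i];
        if (strcmp(app_names[i], extra) == 0)
            already_enabled = 1;
    }

    *out_count = app_count;
    if (!already_enabled)
        names[(*out_count)++] = extra;
    return names;
}
```

In an intercepted `vkCreateDevice`, a layer would presumably copy the create info, point `ppEnabledExtensionNames` and `enabledExtensionCount` at the merged array, call down the chain, and free the array afterwards. Whether the layer should then hide the extra extension from the application is exactly the open question discussed here.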
Implicit layers that intend to implement a Vulkan extension in a way that is transparent to the application would, however, end up introducing side effects that are "noticeable" to applications. This is why an implicit layer may want to intercept Returning to
I think this would be a good improvement that would benefit developers of Vulkan layers. In particular, it would make it easier to write layers that work properly in multiple-ICD setups.
A possible declaration for the function could be:
    typedef bool (VKAPI_PTR *PFN_SupportInstanceExtension)(VkPhysicalDevice physicalDevice, const char* extensionName);
With Alternatively, if the vkGetPhysicalDeviceProcAddr function was added to the Vulkan API proper, then a private implementation wouldn't be necessary, and it would also allow applications to query whether a driver actually supports instance-level commands rather than return NULL. This would close the information gap, in that apps can call any physical-device-level function (from instance extensions) but it won't necessarily do something useful. |
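As an illustration of how a layer might consume such a query, here is a minimal sketch. It uses stand-in types so it compiles without the Vulkan headers; `choose_route`, `Route` and `mock_support` are hypothetical names, and the callback has the same shape as the declaration proposed above (with `VKAPI_PTR` omitted).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in so the sketch compiles without the Vulkan headers. */
typedef struct FakePhysicalDevice *PhysicalDevice; /* stands in for VkPhysicalDevice */

/* Same shape as the proposed PFN_SupportInstanceExtension (VKAPI_PTR omitted). */
typedef bool (*PFN_SupportInstanceExtension)(PhysicalDevice physicalDevice,
                                             const char *extensionName);

typedef enum { HANDLE_IN_LAYER, CALL_DOWN } Route;

/* Hypothetical layer-side decision: prefer the ICD's implementation when
 * the loader reports one, otherwise use the layer's own. */
static Route choose_route(PFN_SupportInstanceExtension query,
                          PhysicalDevice physicalDevice,
                          const char *extensionName)
{
    assert(extensionName != NULL);
    /* No query function available (e.g. an older loader): stay in the layer. */
    if (query == NULL)
        return HANDLE_IN_LAYER;
    return query(physicalDevice, extensionName) ? CALL_DOWN : HANDLE_IN_LAYER;
}

/* Mock loader callback for illustration only: pretends the ICD behind any
 * physical device supports VK_KHR_surface and nothing else. */
static bool mock_support(PhysicalDevice physicalDevice, const char *extensionName)
{
    (void)physicalDevice;
    return strcmp(extensionName, "VK_KHR_surface") == 0;
}
```

With the mock above, a layer implementing VK_EXT_headless_surface would keep handling that extension itself, while anything the ICD reports as supported would be passed down.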
The prototype you propose seems good to me. It would definitely help the use case explained in the description of this issue: decide whether a layer can safely call down a Vulkan command that the layer itself implements. How is this function meant to operate, though? Are you thinking that the function could bypass all the layers and directly query the ICD? Consider the case where there are two layers both implementing MyNewCommand -> Layer A -> Layer B -> ICD Layer A can always call down, while Layer B can call down only if the ICD supports |
Those are very valid concerns. FYI the manifest containing a list of supported device extensions is something I would like to remove, or at least deprecate. It doesn't delineate what 'support' means, and many debugging layers 'intercept' all sorts of device extensions but don't declare it in that manifest. Additionally, support could be 'intercept', it could be 'implement', or it could be both implementing the behavior and calling down the chain. |
In your hypothetical, for instance and physical device functions, layers always call the loader's terminator for the function, not the driver directly. So Layer B would always have a valid function to call; the loader may just return immediately without calling into the ICD(s). Yes, there is very little visibility into which other layers do and don't intercept a function. I could imagine a solution to this that mirrors VK_EXT_tooling_info, where each layer adds its own information to it, but I find that to be very annoying to write for a layer developer, as it requires everyone to get it right for the output to be useful.
Looking at
    typedef bool (VKAPI_PTR *PFN_vk_layerEnumerateICDSupportedInstanceExtensionProperties)(VkPhysicalDevice physicalDevice,
                                                                                          uint32_t* pPropertyCount,
                                                                                          VkExtensionProperties* pProperties);
Something I realized while typing these replies out is that it is unclear whether a layer should care about individual ICDs supporting instance extensions. The way the loader is written, if one ICD doesn't support an instance extension that another does, the loader must handle this situation and make sure that any calls which hit the terminator won't cause a problem. From my vantage point, it seems that layers trying to cater to a specific ICD are going about things backwards, as layers aren't supposed to care about individual ICDs. Could you explain the rationale for needing to know which ICD supports which instance extension? I should add that with Vulkan 1.1/ |
This problem emerged while working on the layer at https://gitlab.freedesktop.org/mesa/vulkan-wsi-layer
In this layer we are implementing surface extensions like VK_KHR_wayland_surface and VK_EXT_headless_surface. One problem we have is to decide whether we should handle entrypoints ourselves or fall back to the ICDs, in case they implement these extensions themselves. See (note) below.
For example, our layer implements VK_EXT_headless_surface, but some ICDs might too. We thought we'd handle these situations by preferring to use the ICD's implementation, when available. The problem is that from the layer's position in the stack it is not possible to know whether calling down to the ICD would work. Instance extensions returned by the Vulkan API include those advertised by the layer, so VK_EXT_headless_surface is always included. Also, calling I guess one solution could be to change the terminators to assume that layers may themselves implement the entrypoint. I suspect this may be problematic as some entrypoints are expected to do something when they are exposed via
(note) VK_*_surface are somewhat odd extensions. They are partly implemented by the loader, but have consequences on the ICDs, as ultimately the physical devices advertise support for these for their queues. They also "leak" into other (device) extensions, such as VK_KHR_swapchain, which are necessarily tied to the windowing systems.
What we found in practice is that the loader terminator function complains because the ICDs do not implement the entrypoint. Consider the case I discussed before:
The prototype you propose looks interesting. It'd be nice if this was an ordinary Vulkan entrypoint (or would be treated as such), so that layers can intercept it. This would allow a layer to know what instance extensions each ICD, or layers below, support.
Another solution I thought about was to change the Vulkan loader implementation of
    if (NULL == icd_term->dispatch.GetPhysicalDeviceSurfaceSupportKHR) {
        // set pSupported to false as this driver doesn't support WSI functionality
        *pSupported = false;
        loader_log(loader_inst, VULKAN_LOADER_ERROR_BIT, 0,
                   "ICD for selected physical device does not export vkGetPhysicalDeviceSurfaceSupportKHR!\n");
        return VK_SUCCESS;
    }
However, this is how the loader implements
    if (!strcmp("vkGetPhysicalDeviceSurfaceSupportKHR", name)) {
        *addr = loader_inst->wsi_surface_enabled ? (void *)vkGetPhysicalDeviceSurfaceSupportKHR : NULL;
        return true;
    }
It could be argued the loader should not return a function pointer unless it is guaranteed to work. Therefore the condition One option would be to also check that |
The Vulkan loader architecture and implementation shield Vulkan applications as well as the Vulkan ICDs from the complexity of having multiple drivers accessible via a single Vulkan API.
Unfortunately, Vulkan layers can find themselves handling some of this complexity themselves. In particular, a Vulkan layer wanting to implement a Vulkan extension has to intercept a number of instance and device commands. In some circumstances, it is not straightforward for the layer to decide whether it should handle the call itself or it should rather hand it over to the loader and the ICDs.
For example, if the layer implements an instance extension, then it cannot easily find out whether the same instance extension is supported by the ICDs in the system. In particular, vkGetInstanceProcAddr is callable. The Vulkan loader may enable instance trampolines for extensions that the layer implements, even when there are no ICDs that implement that extension. In that case, by calling vkGetInstanceProcAddr the layer obtains pointers to the loader trampolines. When called, these trampolines abort as they are unable to find any ICD implementation for the corresponding Vulkan commands.
Another problem encountered by layers is enabling additional extensions that they need in order to operate. See for example #51. As I understand it, the conclusion is that enabling device extensions may be fine, but enabling instance extensions may be problematic. The boundary of what a layer is or is not able to do is not very well defined or documented. Also, I suspect that layers enabling additional extensions behind the application's back (e.g. by intercepting vkCreateDevice and adding more extensions to ppEnabledExtensionNames) should take care to intercept vkGet*ProcAddr and return NULL for commands that the application is not expecting to be enabled. It'd help to have some of this complexity, which all layers have to deal with, moved to the loader.
See also: https://gitlab.freedesktop.org/mesa/vulkan-wsi-layer/-/issues/18 where this discussion started
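The `vkGet*ProcAddr` filtering suggested above could look roughly like the following sketch. It is only an illustration under stated assumptions: `filter_proc_addr`, `CommandInfo`, the command table and `dummy_command` are hypothetical, and `VoidFunction` stands in for `PFN_vkVoidFunction` so that the code compiles without the Vulkan headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef void (*VoidFunction)(void); /* stands in for PFN_vkVoidFunction */

/* Hypothetical table: commands that belong to extensions the layer
 * enabled on its own, mapped to the extension that provides them. */
typedef struct {
    const char *command;
    const char *extension;
} CommandInfo;

static const CommandInfo layer_added_commands[] = {
    { "vkCreateSwapchainKHR", "VK_KHR_swapchain" },
};

/* True if the application itself listed `ext` at creation time. */
static bool app_enabled(const char *const *names, size_t count, const char *ext)
{
    for (size_t i = 0; i < count; i++)
        if (strcmp(names[i], ext) == 0)
            return true;
    return false;
}

/* Returns `next` (the pointer obtained from further down the chain) unless
 * the command belongs to an extension that only the layer enabled, in
 * which case the application gets NULL. */
static VoidFunction filter_proc_addr(const char *name, VoidFunction next,
                                     const char *const *app_exts,
                                     size_t app_ext_count)
{
    assert(name != NULL);
    size_t n = sizeof layer_added_commands / sizeof layer_added_commands[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(name, layer_added_commands[i].command) == 0 &&
            !app_enabled(app_exts, app_ext_count,
                         layer_added_commands[i].extension))
            return NULL;
    }
    return next;
}

/* Placeholder for a pointer returned by the next layer or the loader. */
static void dummy_command(void) {}
```

Every layer that silently enables extensions would need an equivalent of this table and lookup, which is the kind of duplicated bookkeeping the description proposes moving into the loader.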