Both Vulkan and OpenGL support up to 65536 bytes of groupshared memory, and on Vulkan a limit of at least 49152 bytes is even more commonly reported (even when taking into account integrated graphics cards in the statistics).
I personally would be happy with a new 49152 byte limit, but ideally it should be the maximum the hardware supports. For example, NVIDIA GPUs support up to 164KB of groupshared memory (albeit by carving it out of the unified data cache).
I know that DirectX generally prefers to only make changes when all hardware supports a feature, but I want to argue the following: there is inherent value in letting developers make vendor-specific shader permutations, which negates most of the compatibility concern (they are dealing with all of that themselves). Developers should be able to make microarchitecture- or vendor-specific optimisations regardless. Bear in mind that DirectX did, for example, support tensor cores and RT cores back when only NVIDIA had them in hardware.
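For comparison, this is how the per-device model already works on Vulkan today: the limit is not a fixed API-wide cap but a device property the application queries at runtime. Below is a minimal C++ sketch of that query; the function name `print_shared_memory_limit` is just for illustration, and it assumes a `VkPhysicalDevice` has already been selected (error handling omitted).

```cpp
#include <vulkan/vulkan.h>
#include <cstdio>

// Print the groupshared (compute shared memory) capacity reported by a device.
void print_shared_memory_limit(VkPhysicalDevice physicalDevice)
{
    VkPhysicalDeviceProperties props{};
    vkGetPhysicalDeviceProperties(physicalDevice, &props);

    // maxComputeSharedMemorySize is the device's shared-memory limit in bytes.
    // The Vulkan spec only mandates 16384, but 32768, 49152, and 65536 are all
    // commonly reported in practice.
    printf("%s: maxComputeSharedMemorySize = %u bytes\n",
           props.deviceName, props.limits.maxComputeSharedMemorySize);
}
```

Shader permutations can then be compiled against the queried value, which is essentially the behaviour being requested of DirectX here.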