Add optional driver workaround to RenderingDevice for Adreno 6XX. #91514
Force-pushed from d603c06 to a830e16.
Clay has added the device detection and confirmed this to be working. |
I just tested and can confirm this PR fixes it! Great work everyone! Shoutout to @clayjohn @DarioSamo and @TokageItLab !! :) |
The device detection code has the same problem as checking for "Windows 9". It will falsely flag Adreno 6000 if it ever comes out. The proper way to check is via vendor ID + device ID + name string. Like this:
if (physical_device_properties.vendorID == 0x5143 && // Qualcomm
        physical_device_properties.deviceID >= 0x6000000 && // Adreno 6xx
        physical_device_properties.deviceID < 0x7000000 &&
        strstr(physical_device_properties.deviceName, "Turnip") == nullptr) {
    // Enable workaround.
}
Of course there's still always the chance Qualcomm one day starts using unused deviceID entries in this range. We still use string evaluation to rule out Turnip (i.e. Mesa's FOSS driver). |
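For readers who want to try the suggested check outside the engine, here is a minimal, self-contained sketch of it. `DeviceInfo`, `VENDOR_QUALCOMM` and `is_proprietary_adreno_6xx` are hypothetical names introduced for illustration; they stand in for the `VkPhysicalDeviceProperties` fields quoted above and are not part of Godot or Vulkan.

```cpp
#include <cstdint>
#include <string_view>

// Hypothetical stand-in for the three VkPhysicalDeviceProperties fields the check reads.
struct DeviceInfo {
    uint32_t vendor_id;
    uint32_t device_id;
    std::string_view device_name;
};

// PCI-style vendor ID quoted in the comment above.
constexpr uint32_t VENDOR_QUALCOMM = 0x5143;

// True only for a Qualcomm device whose ID falls in the Adreno 6xx range
// and whose name does not identify Mesa's Turnip driver.
constexpr bool is_proprietary_adreno_6xx(const DeviceInfo &d) {
    return d.vendor_id == VENDOR_QUALCOMM &&
           d.device_id >= 0x6000000 &&
           d.device_id < 0x7000000 &&
           d.device_name.find("Turnip") == std::string_view::npos;
}
```

The range test is what avoids the "Windows 9" problem: a device ID at or above 0x7000000 is rejected even if its marketing name starts with "Adreno 6".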
@darksylinc I agree with your comment, but I'll leave @clayjohn to make the decision on that as he coded the current detection. I have no problems with upgrading it to do that. |
That sounds great to me! I just copied what we do for OpenGL which comes with a TODO: godot/drivers/gles3/storage/config.cpp Lines 170 to 173 in d8aa2c6
Edit: On second thought, this will not be trivial. It seems to rely on Vulkan-specific properties that we don't expose through the RD right now. So implementing @darksylinc's solution will require exposing a lot more information than we currently have access to. |
Vendor ID is already present ( |
This is a Vulkan-specific workaround though. Shouldn't the flag be set at the Vk implementation level? |
It's a bit of a mixed bag as the workaround must be implemented at a level above the driver due to how the ARG works, so detecting the workaround could be one of two possibilities:
|
I think these workarounds should be centralized in a global section so they can be documented and analyzed if they're still needed. e.g.
struct Workarounds {
    // Explanation about the workaround.
    bool adreno_compute_clip = false;
};
Workarounds workarounds;
Thinking about proper design is pointless IMHO because by their very nature workarounds can be anything, anywhere, and cross multiple isolation boundaries. Thus a global bool is the best choice, all in one place. |
Force-pushed from a830e16 to 94aef4b.
I've pushed a new version of the workaround detection based on driver version instead. For this I moved the logic specifically to be a part of the Vulkan driver. The workarounds structure is now part of RenderingContextDriver's Device structure, which makes it pretty straightforward to only enable it where it's relevant and with as much information as possible. The only additional question here is whether we want to specify a driver version range or just enable this on all devices with a driver older than this version. Right now, this PR is implementing the latter. Tagging @clayjohn and @TCROC to test again if possible to confirm the detection works. |
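As a rough illustration of the driver-version approach described above, the sketch below enables a workaround flag for every driver older than a cutoff. It is not the PR's code: `pack_driver_version`, `WORKAROUND_CUTOFF` and `detect_workarounds` are hypothetical names, the Qualcomm vendor check is an assumption, and the 512.503.0 cutoff is taken from the comparison quoted later in this thread.

```cpp
#include <cstdint>

// Same bit packing as the legacy VK_MAKE_VERSION macro:
// 10 bits major, 10 bits minor, 12 bits patch.
constexpr uint32_t pack_driver_version(uint32_t major, uint32_t minor, uint32_t patch) {
    return (major << 22) | (minor << 12) | patch;
}

// Per-device workaround flags, kept in one place so each can be documented.
struct Workarounds {
    // Set on drivers older than the cutoff below.
    bool avoid_compute_after_draw = false;
};

// Cutoff taken from the comparison quoted in this thread.
constexpr uint32_t WORKAROUND_CUTOFF = pack_driver_version(512, 503, 0);

// Assumption: the check is limited to Qualcomm's vendor ID (0x5143).
constexpr Workarounds detect_workarounds(uint32_t vendor_id, uint32_t driver_version) {
    Workarounds w;
    w.avoid_compute_after_draw = vendor_id == 0x5143 && driver_version < WORKAROUND_CUTOFF;
    return w;
}
```

Using a single upper bound rather than a version range matches "the latter" option: any older driver gets the workaround, at the cost of possibly enabling it on old drivers that never had the bug.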
I tested and it worked! :) |
Co-authored-by: Clay John <[email protected]>
Force-pushed from bc72c0c to d5789e0.
Looks great! Let's go ahead and merge.
For maintainers: this shouldn't be cherry-picked.
Thanks! |
p_device_properties.driverVersion < VK_MAKE_VERSION(512, 503, 0)
Somewhere I read that the
On my Adreno 6xx (Android 13):
I also tried your
//#ifdef ANDROID_ENABLED
print_line("======== Workarounds ========");
print_line("avoid_compute_after_draw: ", r_device.workarounds.avoid_compute_after_draw);
print_line("avoid_render_graph_reorder: ", r_device.workarounds.avoid_render_graph_reorder);
print_line("-----------------------------");
print_line("name: ", r_device.name);
print_line("vendor: ", r_device.vendor);
print_line("deviceID: ", p_device_properties.deviceID);
uint32_t variant = VK_API_VERSION_VARIANT(p_device_properties.driverVersion);
uint32_t major = VK_API_VERSION_MAJOR(p_device_properties.driverVersion);
uint32_t minor = VK_API_VERSION_MINOR(p_device_properties.driverVersion);
uint32_t patch = VK_API_VERSION_PATCH(p_device_properties.driverVersion);
print_line("driverVersion: ", vformat("%d.%d.%d.%d", variant, major, minor, patch));
uint32_t version = p_device_properties.driverVersion;
print_line("clayjohn driverVersion: ", vformat("%s.%s.%s.%s",
itos(VK_API_VERSION_VARIANT(version)),
itos(VK_API_VERSION_MAJOR(version)),
itos(VK_API_VERSION_MINOR(version)),
itos(VK_API_VERSION_PATCH(version))));
//#endif // ANDROID_ENABLED
If it's wrong, maybe the solution is somewhere there: https://github.com/SaschaWillems/VulkanCapsViewer |
@Alex2782 Here is the code that GPUinfo uses: https://github.com/SaschaWillems/vulkan.gpuinfo.org/blob/1e6ca6e3c0763daabd6a101b860ab4354a07f5d3/functions.php#L294 It looks like NVIDIA and one other vendor differ from the standard Vulkan mapping in a known way. He just falls back to the Vulkan mapping for everyone else. In this case, it looks like it might be using the old Vulkan mapping (which excludes the variant version). I ran a quick test with the old versions vs. the new versions
This prints:
In hindsight, we are using the old |
That is correct. However, Qualcomm follows the same convention of 10 bits for major (>> 22), 10 for minor (>> 12), and 12 for patch (& 0xfff). Later Vulkan added the variant field, and that's where the confusion comes from:
Takeaways:
|
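To make the bit-layout discussion above concrete, here is a small sketch that decodes the same 32-bit value both ways. The helper names (`Version`, `pack_legacy`, `decode_legacy`, `decode_with_variant`) are hypothetical; the shifts and masks mirror the legacy `VK_VERSION_*` macros (10/10/12 bits) and the newer `VK_API_VERSION_*` macros (3 bits variant, 7 major, 10 minor, 12 patch).

```cpp
#include <cstdint>

struct Version {
    uint32_t variant;
    uint32_t major;
    uint32_t minor;
    uint32_t patch;
};

// Legacy packing (VK_MAKE_VERSION): 10 bits major, 10 bits minor, 12 bits patch.
constexpr uint32_t pack_legacy(uint32_t major, uint32_t minor, uint32_t patch) {
    return (major << 22) | (minor << 12) | patch;
}

// Legacy decoding (VK_VERSION_MAJOR/MINOR/PATCH): no variant field,
// so the major version owns the top 10 bits.
constexpr Version decode_legacy(uint32_t v) {
    return Version{0, v >> 22, (v >> 12) & 0x3FF, v & 0xFFF};
}

// Newer decoding (VK_API_VERSION_*): the top 3 bits become the variant,
// leaving only 7 bits for the major version.
constexpr Version decode_with_variant(uint32_t v) {
    return Version{v >> 29, (v >> 22) & 0x7F, (v >> 12) & 0x3FF, v & 0xFFF};
}
```

For a driver version packed as 512.503.0, `decode_legacy` returns 512.503.0, while `decode_with_variant` returns variant 4 and major 0 (that is, 4.0.503.0), because 512 does not fit in 7 bits and its high bits spill into the variant field.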
Fixes #90459. The end user perceived this as a regression because #84976 was the change that introduced the crash. However, it turned out that the graph exposed a driver bug we were successfully dodging in Mobile. This could very well explain why Forward+ was prone to crashing on this family of devices.
Quoting from the PR itself for the in-depth explanation.
TODO: