-
-
Notifications
You must be signed in to change notification settings - Fork 21.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable Metal for macOS x86_64. #96158
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be at least not set as default driver for Intel macs, since most of them do not have Metal 3 capabilities.
if (supportsFamilyMetal3) { | ||
limits.maxPerStageSamplerCount = 1024; | ||
} else { | ||
limits.maxPerStageSamplerCount = 16; | ||
} | ||
if (supportsFamilyMetal3) { | ||
limits.maxPerStageTextureCount = 1024; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure? It seems to be 16
and 128
for as well.
https://developer.apple.com/metal/Metal-Feature-Set-Tables.pdf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I read the wrong numbers.
int version = (int)props.features.highestFamily; | ||
device.name = vformat("%s (Apple%d)", metal_device.name.UTF8String, version); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int version = (int)props.features.highestFamily; | |
device.name = vformat("%s (Apple%d)", metal_device.name.UTF8String, version); | |
if (props.features.highestFamily == MTLGPUFamilyMetal3) { | |
device.name = vformat("%s (Metal3)", metal_device.name.UTF8String); | |
} else { | |
device.name = vformat("%s (Apple%d)", metal_device.name.UTF8String, (int)props.features.highestFamily - (int)MTLGPUFamilyApple1 + 1); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love the enthusiasm for wanting to add x86 support!
From the outset, I approached the development of the Metal driver for Apple TBDR GPUs, with a unified memory architecture. This covers all M-series laptops, iPhones, iPads and AppleTVs. This code makes a number of assumptions that are incorrect for non-Apple GPUs, which could lead to memory corruption, or just not work on non-Apple GPUs.
Per this comment, my assumption is that MoltenVK, and their vast experience and presumable hardware access, would support non-Apple silicon (x86 laptops).
for (MTLGPUFamily family = MTLGPUFamilyMetal3; family >= MTLGPUFamilyApple1; --family) { | ||
if ([p_device supportsFamily:family]) { | ||
features.highestFamily = family; | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The behaviour of this loop is undefined, or at least incorrect. The previous version looped from MTLGPUFamilyApple9
(ordinal value 1009) to MTLGPUFamilyApple1
(ordinal value 1001), and made a maximum of 9 calls to [p_device supportsFamily:family]
with valid values of the MTLGPUFamily
enumeration.
MTLGPUFamilyMetal3
has an ordinal value is 5001
. This version could make 1000's of additional calls, and will likely produce undefined behaviour depending on the behaviour of [p_device supportsFamily:family]
with undefined values.
For reference, this is the definition of MTLGPUFamily
enumeration:
typedef NS_ENUM(NSInteger, MTLGPUFamily)
{
MTLGPUFamilyApple1 = 1001,
MTLGPUFamilyApple2 = 1002,
MTLGPUFamilyApple3 = 1003,
MTLGPUFamilyApple4 = 1004,
MTLGPUFamilyApple5 = 1005,
MTLGPUFamilyApple6 = 1006,
MTLGPUFamilyApple7 = 1007,
MTLGPUFamilyApple8 = 1008,
MTLGPUFamilyApple9 = 1009,
MTLGPUFamilyMac1 API_DEPRECATED_WITH_REPLACEMENT("MTLGPUFamilyMac2", macos(10.15, 13.0), ios(13.0, 16.0)) = 2001,
MTLGPUFamilyMac2 = 2002,
MTLGPUFamilyCommon1 = 3001,
MTLGPUFamilyCommon2 = 3002,
MTLGPUFamilyCommon3 = 3003,
MTLGPUFamilyMacCatalyst1 API_DEPRECATED_WITH_REPLACEMENT("MTLGPUFamilyMac2", macos(10.15, 13.0), ios(13.0, 16.0)) = 4001,
MTLGPUFamilyMacCatalyst2 API_DEPRECATED_WITH_REPLACEMENT("MTLGPUFamilyMac2", macos(10.15, 13.0), ios(13.0, 16.0)) = 4002,
MTLGPUFamilyMetal3 API_AVAILABLE(macos(13.0), ios(16.0)) = 5001,
} API_AVAILABLE(macos(10.15), ios(13.0));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, not good.
As the current implementation potentially lacks memory safety/coherency/consistency on the x86 based platforms, lets bin this PR; my bad for assuming things were ok after a little playing around with no obvious issues.
Further to your comment, the MoltenVK workarounds are probably not worth the trouble of implementing and supporting, given that Apple may deprecate x86 hardware next year.
This PR enables Metal for macOS x86_64.
Additionally,
rendering_context_driver_metal.mm
now logs the actual Metal GPU family, as the previously computed value was incorrect.