Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable Metal for macOS x86_64. #96158

Closed
wants to merge 1 commit into from
Closed

Enable Metal for macOS x86_64. #96158

wants to merge 1 commit into from

Conversation

Ozomahtli
Copy link

This PR enables Metal for macOS x86_64.

Additionally, rendering_context_driver_metal.mm now logs the actual Metal GPU family, as the previously computed value was incorrect.

Copy link
Member

@bruvzg bruvzg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be at least not set as default driver for Intel macs, since most of them do not have Metal 3 capabilities.

Comment on lines 280 to 286
if (supportsFamilyMetal3) {
limits.maxPerStageSamplerCount = 1024;
} else {
limits.maxPerStageSamplerCount = 16;
}
if (supportsFamilyMetal3) {
limits.maxPerStageTextureCount = 1024;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure? It seems to be 16 and 128 for as well.

https://developer.apple.com/metal/Metal-Feature-Set-Tables.pdf

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I read the wrong numbers.

Comment on lines 57 to 58
int version = (int)props.features.highestFamily;
device.name = vformat("%s (Apple%d)", metal_device.name.UTF8String, version);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int version = (int)props.features.highestFamily;
device.name = vformat("%s (Apple%d)", metal_device.name.UTF8String, version);
if (props.features.highestFamily == MTLGPUFamilyMetal3) {
device.name = vformat("%s (Metal3)", metal_device.name.UTF8String);
} else {
device.name = vformat("%s (Apple%d)", metal_device.name.UTF8String, (int)props.features.highestFamily - (int)MTLGPUFamilyApple1 + 1);
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

@akien-mga
Copy link
Member

CC @stuartcarnie

Copy link
Contributor

@stuartcarnie stuartcarnie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love the enthusiasm for wanting to add x86 support!

From the outset, I approached the development of the Metal driver for Apple TBDR GPUs, with a unified memory architecture. This covers all M-series laptops, iPhones, iPads and AppleTVs. This code makes a number of assumptions that are incorrect for non-Apple GPUs, which could lead to memory corruption, or just not work on non-Apple GPUs.

Per this comment, my assumption is that MoltenVK, and their vast experience and presumable hardware access, would support non-Apple silicon (x86 laptops).

Comment on lines +83 to 86
for (MTLGPUFamily family = MTLGPUFamilyMetal3; family >= MTLGPUFamilyApple1; --family) {
if ([p_device supportsFamily:family]) {
features.highestFamily = family;
break;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behaviour of this loop is undefined, or at least incorrect. The previous version looped from MTLGPUFamilyApple9 (ordinal value 1009) to MTLGPUFamilyApple1 (ordinal value 1001), and made a maximum of 9 calls to [p_device supportsFamily:family] with valid values of the MTLGPUFamily enumeration.

MTLGPUFamilyMetal3 has an ordinal value is 5001. This version could make 1000's of additional calls, and will likely produce undefined behaviour depending on the behaviour of [p_device supportsFamily:family] with undefined values.

For reference, this is the definition of MTLGPUFamily enumeration:

typedef NS_ENUM(NSInteger, MTLGPUFamily)
{
    MTLGPUFamilyApple1  = 1001,
    MTLGPUFamilyApple2  = 1002,
    MTLGPUFamilyApple3  = 1003,
    MTLGPUFamilyApple4  = 1004,
    MTLGPUFamilyApple5  = 1005,
    MTLGPUFamilyApple6  = 1006,
    MTLGPUFamilyApple7  = 1007,
    MTLGPUFamilyApple8  = 1008,
    MTLGPUFamilyApple9  = 1009,
    
    MTLGPUFamilyMac1 API_DEPRECATED_WITH_REPLACEMENT("MTLGPUFamilyMac2", macos(10.15, 13.0), ios(13.0, 16.0)) = 2001,
    MTLGPUFamilyMac2 = 2002,
    
    MTLGPUFamilyCommon1 = 3001,
    MTLGPUFamilyCommon2 = 3002,
    MTLGPUFamilyCommon3 = 3003,
    
    MTLGPUFamilyMacCatalyst1 API_DEPRECATED_WITH_REPLACEMENT("MTLGPUFamilyMac2", macos(10.15, 13.0), ios(13.0, 16.0)) = 4001,
    MTLGPUFamilyMacCatalyst2 API_DEPRECATED_WITH_REPLACEMENT("MTLGPUFamilyMac2", macos(10.15, 13.0), ios(13.0, 16.0)) = 4002,
    
    MTLGPUFamilyMetal3 API_AVAILABLE(macos(13.0), ios(16.0)) = 5001,
} API_AVAILABLE(macos(10.15), ios(13.0));

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, not good.

As the current implementation potentially lacks memory safety/coherency/consistency on the x86 based platforms, lets bin this PR; my bad for assuming things were ok after a little playing around with no obvious issues.

Further to your comment, the MoltenVK workarounds are probably not worth the trouble of implementing and supporting, given that Apple may deprecate x86 hardware next year.

@Ozomahtli Ozomahtli closed this Aug 29, 2024
@akien-mga akien-mga removed this from the 4.x milestone Aug 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants