Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GCC #80

Merged
merged 43 commits into from Aug 28, 2022
Merged

GCC #80

merged 43 commits into from Aug 28, 2022

Conversation

ghost
Copy link

@ghost ghost commented Aug 26, 2022

Now compiles on GCC and Clang.

@ArtemisX64
Copy link
Contributor

Can confirm using gcc also fixes the final linking error

@ArtemisX64
Copy link
Contributor

ArtemisX64 commented Aug 27, 2022

Weird but it no longer compiles on clang(clang 12)
Oops as expected. Won't compile on Windows too. (Checked in actions on my repo)

@ArtemisX64
Copy link
Contributor

It will work fine in clang14(libstdc++12)

Copy link
Contributor

@Tachi107 Tachi107 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! I've left a few comments below.

src/Cafe/Account/Account.cpp Show resolved Hide resolved
src/Cafe/CMakeLists.txt Outdated Show resolved Hide resolved
src/Cafe/HW/Espresso/PPCTimer.cpp Outdated Show resolved Hide resolved
src/Common/precompiled.h Outdated Show resolved Hide resolved
src/Common/precompiled.h Outdated Show resolved Hide resolved
src/Common/precompiled.h Outdated Show resolved Hide resolved
src/Common/precompiled.h Outdated Show resolved Hide resolved
src/Common/precompiled.h Outdated Show resolved Hide resolved
src/config/XMLConfig.h Show resolved Hide resolved
@Tachi107
Copy link
Contributor

Tachi107 commented Aug 27, 2022 via email

@ArtemisX64
Copy link
Contributor

What kind of UB did you encounter?

-- OpenPGP key: 66DE F152 8299 0C21 99EF A801 A8A1 28A8 AB1C EE49

Can't recollect. But I think it will be fine. (As long as it receives uint64_t as a param)

@Tachi107
Copy link
Contributor

Tachi107 commented Aug 27, 2022

What kind of UB did you encounter?

Can't recollect. But I think it will be fine. (As long as it receives uint64_t as a param)

Maybe the issue was that the parameter type was uint64 instead of uint64_t, and numeric_limits<uint64>​::​digits doesn't work on some platforms?

@ghost ghost marked this pull request as ready for review August 27, 2022 18:46
@ghost ghost requested a review from Crementif August 28, 2022 12:29
Copy link
Member

@Crementif Crementif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Crementif Crementif merged commit 454b587 into cemu-project:main Aug 28, 2022
@ghost ghost deleted the gcc branch August 28, 2022 13:33
@ghost ghost mentioned this pull request Aug 28, 2022
@bitscher
Copy link
Contributor

bitscher commented Sep 2, 2022

The addition of the -mavx2 flag is causing issues on pre-Haswell CPUs #144

@ghost
Copy link
Author

ghost commented Sep 2, 2022

-mavx2 is required for all the _mm256_... functions.
-mssse3 is for _mm_shuffle_epi8

Other instructions used in Cemu:

  • sse _mm_prefetch, _mm_pause, _mm_setcsr
  • sse2 _mm_mfence, _mm_set_epi8, _mm_set_epi16, _mm_loadu_si128, _mm_store_si128
  • sse4.1 _mm_min_epu16, _mm_max_epu16

So it may actually be a good idea to add additional flags.

@Tachi107
Copy link
Contributor

Tachi107 commented Sep 2, 2022

What about adding -march=x86-64-v2 and trying to stick to the x86_64-v2 psABI?

Edit: nevermind, AVX2 is part of v3

@Exzap
Copy link
Member

Exzap commented Sep 2, 2022

Both clang and gcc have a way to selectively enable CPU extensions for individual functions. Here is an example. It's better to do it this way than to set -mavx2 or similar globally as Cemu doesn't have a hard requirement on any of the optional CPU extensions.

@ghost
Copy link
Author

ghost commented Sep 2, 2022

Both clang and gcc have a way to selectively enable CPU extensions for individual functions. Here is an example. It's better to do it this way than to set -mavx2 or similar globally

I agree, I can make a pr changing this if you wish but I don't think it would help insofar as compiling Cemu for pre-haswell CPUs on GCC.
From https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes

For instance, on an x86, you could declare one function with the target("sse4.1,arch=core2") attribute and another with target("sse4a,arch=amdfam10"). This is equivalent to compiling the first function with -msse4.1 and -march=core2 options, and the second function with -msse4a and -march=amdfam10 options. It is up to you to make sure that a function is only invoked on a machine that supports the particular ISA it is compiled for

@Exzap
Copy link
Member

Exzap commented Sep 2, 2022

I agree, I can make a pr changing this if you wish

It would be appreciated.

I don't think it would help insofar as compiling Cemu for pre-haswell CPUs on GCC.

Why not? As it says in your quote from the spec, you have to manually make sure the CPU extensions are supported at runtime before running the function with the intrinsics. This is something we already do. Optionally running the hand-optimized functions if the CPU extensions are supported and otherwise falling back to a generic implementation that can run on any CPU. The code for handling GPU drawcall indices is a good example. On Windows this is all pretty well tested and people run Cemu with the most ancient of CPUs.

@ghost
Copy link
Author

ghost commented Sep 2, 2022

Alright, I didn't realise you already checked this, thanks.

@ghost
Copy link
Author

ghost commented Sep 2, 2022

Here you go #152

@ghost
Copy link
Author

ghost commented Sep 2, 2022

I must have missed these clang specific pragmas when I made #80. Sorry about that.

#pragma clang attribute push (__attribute__((target("avx2"))), apply_to=function)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants