Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow building on Windows using Clang without MinGW #77233

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

SlugFiller
Copy link
Contributor

@SlugFiller SlugFiller commented May 19, 2023

Changes to the SCons script so that building with use_llvm=yes and use_mingw=no is possible on Windows. Additionally, both use_llvm and use_mingw can now be specified in custom.py, so it is no longer necessary to specify them in the command line on every build. Also, LLVM builds on Windows now produce a PDB by default. The option use_dwarf is added for creating DWARF debug symbols with LLVM instead.

LLVM is easier to install and update than MSVC and MinGW-LLVM. It produces faster executables than MinGW. And allows generating debug symbols in PDB format, which can be used by any standard debugging and profiling tools on Windows.

However, LLVM lacks compatibility certain assembly and intrinsics that are vendor-specific to either MinGW or MSVC. As a result, architecture-specific acceleration in libtheora in R128 is disabled.

Testers needed

This PR was tested on a system where MSVC is installed. LLVM automatically detects the presence of MSVC libraries and linker, and makes use of them. However, this should also work on a "clean" system with nothing but LLVM and Windows SDK installed. This requires testing, however, since lld-link is not guaranteed to be full compatible with the MSVC linker.

Bugsquad edit (keywords for easier searching): clang-cl

@SlugFiller SlugFiller requested review from a team as code owners May 19, 2023 10:30
@SlugFiller SlugFiller force-pushed the scons-windows-llvm-no-mingw branch 2 times, most recently from ba40552 to 2132302 Compare May 19, 2023 10:36
@SlugFiller SlugFiller force-pushed the scons-windows-llvm-no-mingw branch from 2132302 to 73705f0 Compare May 19, 2023 11:06
@Calinou
Copy link
Member

Calinou commented May 19, 2023

LLVM is easier to install and update than MSVC and MinGW-LLVM. It produces faster executables than MinGW.

Note that this assertion may not hold true with GDScript, which has been optimized with GCC/LLVM compilers in mind rather than MSVC. Likewise, MSVC's LTO equivalent (LTCG) is known to cause issues with Godot, whereas LTO works just fine.

@SlugFiller
Copy link
Contributor Author

GDScript is optimized? My performance assertion is based on benchmarking my Bently-Ottmann implementation, which was 60 FPS (MSVC, LLVM) vs 40 FPS (MinGW) on equal load. Is there any GDScript benchmark which shows a similar difference between MinGW and MSVC/LLVM?

Similarly, is there an MVP showing issues with LTO on MSVC? It would require testing on LLVM set-up as well. I did notice that enabling LTO as a compile flag instead of link flag causes linking to fail (saying the object file is corrupt). Is this what you mean?

@Faless
Copy link
Collaborator

Faless commented May 21, 2023

GDScript is optimized?

GDScript VM uses computed gotos (i.e. the labels as values extension) in places of switches when compiling with GCC or LLVM (MSVC does not support it).

Is there any GDScript benchmark which shows a similar difference between MinGW and MSVC/LLVM?

See #11518 for an example which you can use to compare the two implementations.

@SlugFiller
Copy link
Contributor Author

@Faless Okay, so I used the provided benchmark. I modified it a bit to be an EditorScript, and also made some changes because it didn't work in Godot 4.

@tool
extends EditorScript

func _run() -> void:
	var time_before = Time.get_ticks_msec()

	var r
	for i in range(10000000):
		var next
		var first = 1
		var second = 2
		for c in range(40):
			next = first + second
			first = second
			second = next
		r = next
	print(r)

	var total_time = Time.get_ticks_msec() - time_before
	print("Time taken: " + str(total_time))

I tested it with my LLVM build, vs a build I made with the latest MinGW from Chocolatey (no LLVM):
MinGW results:

MinGW
433494437
Time taken: 32258
433494437
Time taken: 31589
433494437
Time taken: 32091

LLVM results:

433494437
Time taken: 38229
433494437
Time taken: 40386
433494437
Time taken: 38032

So, clearly, this optimization is big. But there's a catch: This optimization is limited to GNU, but the feature that it uses exists in CLang since 5.0. Making this small change in modules/gdscript/gdscript_vm.cpp:

#if defined(__GNUC__) || defined(__clang__)

Gives the following results on LLVM build:

433494437
Time taken: 32916
433494437
Time taken: 32765
433494437
Time taken: 33212

Should I add this change to this PR, or make a separate PR for it?

P.S. Adding type hints (: int) to the benchmark, gives the following result:

433494437
Time taken: 20510

Although I don't think that takes away from the improvement of using calculated goto.

P.P.S. For some reason my MinGW build is giving an error spam about connecting signals. I haven't done anything special with it, other than release with debug symbols, and lto=full. The issue isn't present in the LLVM build, with the same settings (Although LLVM debug symbols are PDB, and MinGW are DWARF).

@akien-mga
Copy link
Member

akien-mga commented May 22, 2023

But there's a catch: This optimization is limited to GNU, but the feature that it uses exists in CLang since 5.0. Making this small change in modules/gdscript/gdscript_vm.cpp:

That's weird, AFAIK Clang does define __GNUC__ too. Or does it depend on whether it's Clang + GNU libstdc++ or Clang + LLVM libc++?

If this is a difference specific to LLVM-on-Windows as opposed to LLVM on Linux or Apple LLVM (which AFAIK both pose as __GNUC__), then there might be a bunch of places where we made the assumption that __GNUC__ covers Clang which might need changes... Or we could define __GNUC__ manually for LLVM-on-Windows if it doesn't make things explode (I assume they had a reason not to define it for that platform).

@SlugFiller
Copy link
Contributor Author

It might be dependent. It's also important to remember that not everything that works in GNU works in CLang. For instance, I had to disable every feature that uses GNU-style inline assembly, because CLang was failing to process it. So force-defining __GNUC__ might indeed make things explode. They share features but are not 100% compatible.

@SlugFiller SlugFiller force-pushed the scons-windows-llvm-no-mingw branch from 73705f0 to a9f77ec Compare May 23, 2023 09:19
@SlugFiller SlugFiller requested a review from a team as a code owner May 23, 2023 09:19
@SlugFiller SlugFiller force-pushed the scons-windows-llvm-no-mingw branch from a9f77ec to 8ae859f Compare July 3, 2023 19:21
@SlugFiller
Copy link
Contributor Author

Rebased, and removed Mbed-TLS patch notes since it was merged upstream (master and LTS)

@akien-mga akien-mga changed the title Allow building on Windows using CLang without MinGW Allow building on Windows using Clang without MinGW Jun 20, 2024
@kisg
Copy link
Contributor

kisg commented Oct 2, 2024

I just learned of this.

I think that at least the Clang GDScript optimization should be prioritized, because both Apple and Android platforms build using Clang, and thus on these platforms Godot could receive a ~10+% GDScript performance improvement with minimal work.

@akien-mga Would you be willing to accept just the Clang GDScript optimization in a separate PR?

@SlugFiller Would you have time to create a separate PR for just this? (I saw that you already offered previously.)

@akien-mga
Copy link
Member

akien-mga commented Oct 2, 2024

Yeah it makes sense to me to split anything that's not related to making Windows Clang builds specifically to its own PR(s).

An optimization for GDScript on all Clang platforms should definitely be standalone as it will impact more than just Windows.

Edit: As discussed in #97727 the optimization was actually already active for most platforms where we use Clang for official builds (macOS, iOS, Android, Web). Those define __GNUC__ for compatibility with GCC.
But this will benefit our Windows arm64 builds which use clang-cl which may not pose as __GNUC__ - IIUC, I didn't check all compilers to confirm.


As for the Windows Clang build, I admit I'm a bit confused by the multiple options in that space:

CC @bruvzg @Repiteo

@SlugFiller
Copy link
Contributor Author

This PR has indeed become a bit of a chimera. It achieves a few things.

For said main purpose, I'm not sure this gives any advantage over #92316. During the link process, differences between clang and clang-cl are largely erased. Except perhaps any C++ ABI used in GDExtensions. But that would be a bug in GDExtensions, since C++ does not have standard ABI, hence any plugin interface should use C ABI only. It might have effect for any LibGodot usage, but at that point, it might be better to just use MinGW-LLVM. I'm fairly sure raw LLVM ABI is compatible with GCC (Did not test, though).

As an added bonus, I tested both #92316 and a cleaned up version of this PR, and both use MSVC as a standard library, triggering the ThorVG issue if MSVC2017 is installed. I still need to test if there's a way to use it directly with Windows SDK. But, if there is, it likely applies to both PRs as well.

@bruvzg
Copy link
Member

bruvzg commented Oct 2, 2024

C++ ABI used in GDExtensions

Extension API is pure C. C++ is used on both sides but never interact directly, so it should not be an issue.

@SlugFiller SlugFiller force-pushed the scons-windows-llvm-no-mingw branch from 8ae859f to a5b1284 Compare October 2, 2024 15:13
@SlugFiller SlugFiller force-pushed the scons-windows-llvm-no-mingw branch from a5b1284 to 5075d6b Compare October 2, 2024 15:21
@SlugFiller
Copy link
Contributor Author

Rebased, and cleaned up the extra fluff. However, I'm considering moving this to draft, since it has a couple of issues:

  • It conflicts with Add support for compiling with VS clang-cl toolset #92316. For the two to coexists, there needs to be some way to specify if clang or clang-cl is desired.
  • Features that weren't building were simply disabled. This is far from a clean solution, and it's unclear if there's a way to force them to work.

And, as stated above, the advantage of using this over clang-cl is not clear. But I'm leaving this up if anyone wants to test.

Extension API is pure C

I recall discussion about pointers to C++ objects being passed for use as non-opaque objects. It was mentioned in relation to integration of another language. While not as volatile as name-mangling, differences in virtual function table alignment or calling convention could still cause incompatibilities. However, it was a long time ago, and is probably no longer relevant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants