Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[android] Update to NDK 23 #39045

Merged
merged 1 commit into from
Oct 16, 2021
Merged

Conversation

finagolfin
Copy link
Member

@finagolfin finagolfin commented Aug 25, 2021

The latest Long Term Support NDK finally removed binutils, including the bfd/gold linkers and libgcc. This simplifies our Android support, including making lld the default linker for Android. Disable three reflection tests that now fail, likely related to issues with swift-reflection-dump and switching to lld.

Also, add the libatomic dependency for Android armv7, just as on linux.

I have updated my daily Android CI to use the new NDK 23, finagolfin/swift-android-sdk#14, and will be releasing Swift 5.4 Android SDKs built against the new NDK soon. Using a version of this pull and a couple other patches, like libdispatch and #36917, I was able to cross-compile the trunk stdlib, corelibs, and SPM for Android with the latest official trunk snapshot build for linux.

@drodriguez, this may break the trunk Android CI, which you've said uses the old Android NDK 19c. If it does, we should update that CI to NDK 23. We shouldI have also removed the now unused --android-ndk-gcc-version build flag, but I left that for now.

module malloc {
header "${GLIBC_INCLUDE_PATH}/malloc.h"
export *
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed because Foundation won't build without it, as I noted with NDK 22 earlier this year. Something about the NDK update gets the Swift compiler to not find that declaration from that header anymore, not sure what.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

malloc.h is deprecated and should not be included. It is supposed to be included from stdlib.h. This is a Linux-ism. That direct include from CoreFoundation can probably be replaced for stdlib.h and the problem will go away.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, will try it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried the following patch, makes no difference:

diff --git a/CoreFoundation/Base.subproj/CoreFoundation_Prefix.h b/CoreFoundation/Base.subproj/CoreFoundation_Prefix.h
index 35e08036..4c06088e 100644
--- a/CoreFoundation/Base.subproj/CoreFoundation_Prefix.h
+++ b/CoreFoundation/Base.subproj/CoreFoundation_Prefix.h
@@ -284,7 +284,7 @@ typedef unsigned long fd_mask;
 #undef interface
 #endif
 
-#include <malloc.h>
+//#include <malloc.h>
 CF_INLINE size_t malloc_size(void *memblock) {
     return malloc_usable_size(memblock);
 }
diff --git a/CoreFoundation/Base.subproj/ForFoundationOnly.h b/CoreFoundation/Base.subproj/ForFoundationOnly.h
index 99407b78..fb469e55 100644
--- a/CoreFoundation/Base.subproj/ForFoundationOnly.h
+++ b/CoreFoundation/Base.subproj/ForFoundationOnly.h
@@ -51,7 +51,7 @@ CF_IMPLICIT_BRIDGING_DISABLED
 #pragma mark - CFRuntime
 
 #if TARGET_OS_LINUX
-#include <malloc.h>
+#include <stdlib.h>
 #elif TARGET_OS_BSD
 #include <stdlib.h> // malloc()
 #elif TARGET_OS_MAC

The error seems to indicate that it doesn't matter where the declaration comes from in the CoreFoundation headers, it cannot use that declaration in an @inlinable Swift function because CoreFoundation is implementation-only:

swift-corelibs-foundation/Sources/Foundation/Data.swift:88:20: error: global function 'malloc' cannot be used in an '@inlinable' function because 'CoreFoundation' was imported implementation-only
return malloc(size)

Rather, I think it was getting malloc() from the Glibc module that referenced prior NDKs, but something changed in the NDK sysroot from 22 onwards so Glibc couldn't find that C declaration anymore. Back then, I tried building with the exact same source and Swift toolchain, but simply passing in the newer NDK and its clang to build-script and I'd get this error, implying an NDK sysroot change was interacting badly with our Swift import configuration somehow.

I found this week that this header addition to the Bionic modulemap works around that but I don't know what causes this problem, unlike the libdispatch issue we discussed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed this header from this pull, as we only need it to compile the corelibs, so I'll add it to #38441 instead.

Copy link
Contributor

@drodriguez drodriguez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will see if I can install new NDK in the CI machines. I will see if I can use this changes as is. As far as I understand 19 was already in the new layout.

One thing that doesn't seem to be stressed enough in the changes is that LLD becomes the default and the only option, right? It kinda becomes a requisite?

Edit: A change of NDK is not automatic, we need help from @shahmishal to do the variable change in CI.

lib/Driver/UnixToolChains.cpp Outdated Show resolved Hide resolved
@finagolfin
Copy link
Member Author

As far as I understand 19 was already in the new layout.

Yes, layout won't be an issue, but the lld used back then was much older.

LLD becomes the default and the only option, right? It kinda becomes a requisite?

Yep, the only linker left in the NDK now is lld.

@drodriguez
Copy link
Contributor

The NDK 23 has been installed in the CI machines (in ~/android-ndk-r23/). I have not tested it yet. What is the order that we should do the changes? Should we change to NDK 23 without this changes first, or will these changes work with r19?

About the LLD: you should add that requirement in the documentation, because as far as I can see, LLD was not required before, and the NDK LLD is not being used (the system LLD is used, specially after removing the -B${tools_path} from a couple of places in this commit).

@finagolfin
Copy link
Member Author

What is the order that we should do the changes? Should we change to NDK 23 without this changes first, or will these changes work with r19?

After testing to make sure it works, I think you'll want to do both simultaneously: switch to NDK 23 and merge this pull, as I don't think any of the other change combos will work.

About the LLD: you should add that requirement in the documentation

Is it a requirement if it's the only option in the NDK now? Didn't know if that's worth mentioning, can add it if you want.

LLD was not required before

It was actually the default linker used by this Swift compiler/stdlib build on non-Apple platforms until last May, when Saleem changed that Unix default to gold while setting it to lld for Windows, c6cc769.

the NDK LLD is not being used (the system LLD is used, specially after removing the -B${tools_path} from a couple of places in this commit).

Maybe the system lld is used to link the compiler itself, but I think the NDK lld will be used to link the Android stdlib, which is what we care about.

If you look, the two -B${tools_path} locations were always disabled for lld and always worked, so even if we are using the system lld again, apparently that always worked until it was changed last May. I can now add that line back to make sure the NDK is used for lld for the first time, if you think that's needed.

@finagolfin
Copy link
Member Author

I think you'll want to do both simultaneously

Oh, if you just meant which to do slightly before, probably this pull first, as otherwise it will definitely break because there's no gold linker in the new NDK. Whereas this pull might still work with NDK 19, though the libgcc changes will probably break with that old NDK 19c.

swift_android_prebuilt_host_name(prebuilt_build)
set(${sysroot_var_name} "${SWIFT_ANDROID_NDK_PATH}/toolchains/llvm/prebuilt/${prebuilt_build}/sysroot" PARENT_SCOPE)
string(TOLOWER ${CMAKE_HOST_SYSTEM_NAME} platform)
set(${sysroot_var_name} "${SWIFT_ANDROID_NDK_PATH}/toolchains/llvm/prebuilt/${platform}-x86_64/sysroot" PARENT_SCOPE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be change shortly when Google releases native NDK for Darwin-arm64. I think the previous function was better in case further changes were needed. The Windows/windows change is fine, but it was probably was working because Windows filesystem is normally case insensitive.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can always change this later if we ever support that mooted arm64 NDK.

lib/Driver/UnixToolChains.cpp Outdated Show resolved Hide resolved
test/lit.cfg Show resolved Hide resolved
module malloc {
header "${GLIBC_INCLUDE_PATH}/malloc.h"
export *
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

malloc.h is deprecated and should not be included. It is supposed to be included from stdlib.h. This is a Linux-ism. That direct include from CoreFoundation can probably be replaced for stdlib.h and the problem will go away.

Copy link
Contributor

@drodriguez drodriguez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLD from the NDK (or much of the NDK toolchain, for that matter) doesn't seem to be used with this changes (except for the tests), but I might be wrong.

I remember having to install LLD in the CI machines because some change in the defaults. At some point the system LLD was being used, and I think that's what the removed commend from AddSwift.cmake refers about: the system LLD can handle Android just fine. In case other linker was the default, one need to provide the path to the NDK tools. With the changes, I think everyone needs to have LLD installed in their systems.

I will try to test these changes + r19 to see if we can go that way. "Simultaneously" can be quite hard, because changing the CI doesn't depend on me.

@finagolfin
Copy link
Member Author

LLD from the NDK (or much of the NDK toolchain, for that matter) doesn't seem to be used with this changes

It only uses the NDK for me, as I do a standalone stdlib build with an official prebuilt Swift snapshot toolchain for linux with --native-clang-tools-path passing in the NDK clang, then cross-compile the corelibs and SPM with the same setup. You're probably right that this pull doesn't do the same with a fresh build of the Swift toolchain though.

With the changes, I think everyone needs to have LLD installed in their systems.

Yep, but as you noted, that was the default for Android till last May. I'll bring back the tools path and apply it to lld for the first time, as I agree that we should standardize on the NDK lld now.

"Simultaneously" can be quite hard, because changing the CI doesn't depend on me.

It sounds like getting the CI changed is more difficult than getting this pull merged, so let's make sure we have this pull working as we want, then get him to update the CI, and then we can finally merge this pull.

endif()
# Make sure the Android NDK lld is used.
swift_android_tools_path(${CFLAGS_ARCH} tools_path)
list(APPEND result "-B" "${tools_path}")
Copy link
Member Author

@finagolfin finagolfin Aug 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Brought the NDK tools path back here and tested it by passing the prebuilt clang that comes with the Swift trunk snapshot toolchain, as that's the clang a fresh build would use too, to --native-clang-tools-path when cross-compiling the standalone stdlib, only to hit another problem: it's trying to link against libgcc. I dug into this difference and the NDK clang hardcodes using compiler-rt instead, along with other changes related to libunwind.

We may be better off changing the clang used to link for Android with a freshly-built Swift toolchain to the NDK clang too, let me know what you think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@drodriguez, just waiting to hear from you about this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the radio silence. It has been a busy couple of weeks lately.

What I have done:

  • Upgraded the CI machines to Ubuntu 18.04. This was required with the recent drop of support for 16.04, which this machines were using.
  • Checked the rebranch that was supposed to happen soon.
  • NDK r23 was installed on the machines. Not being used, but ready to be used.
  • All the Termux libraries that you required as dependencies are installed in two directories, ready to be used. They are all installed as if they were in Termux, which makes using them a question of adding an include and a library path, which should simplify things (I hope).

What I have not done:

  • Checked the rebranch recently.
  • Checked that this changes work on CI.
  • Checked if Termux ICU can replace the aging ICU libraries that currently CI uses.

I will probably try to find out some time for the rebranch check (because the clock is ticking on that one). I will try to get to the others as soon as I can.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been getting the stdlib built for Android with the Swift-forked clang and I just noticed that clang 13 switches Android to building with compiler-rt and libunwind, llvm/llvm-project@91f8aacc0, which was upstreamed from the NDK clang. Since the upcoming Swift rebranch has that patch too, I think it will break the community Android CI, unless we switch it to Android NDK 23.

We should probably get this NDK 23 pull in before the rebranch: the good news is that I have the Swift-forked clang building the stdlib for Android now, with the patches in my first link. I've just kicked off a build to make sure the validation suite passes for me locally on linux, just as the Android CI runs with --host-test --skip-test-linux, and will update this pull with those changes once that passes. You can push the work with Termux packages till we get the corelibs built with #38441, as those Android library packages are only really needed for that pull, not this one.

@finagolfin
Copy link
Member Author

Alright, I've updated this pull after testing on linux with this command, very similar to the build preset that's run on the Android CI, that I ran separately for both armv7 and aarch64:

./swift/utils/build-script --release --assertions --no-swift-stdlib-assertions --android
--android-ndk /home/butta/android-ndk-r23/ --android-arch armv7 --android-api-level 24
--android-icu-uc /home/butta/swift-5.5-android-armv7-24-sdk/usr/lib/libicuuc.so
--android-icu-uc-include /home/butta/swift-5.5-android-armv7-24-sdk/usr/include/
--android-icu-i18n /home/butta/swift-5.5-android-armv7-24-sdk/usr/lib/libicui18n.so
--android-icu-i18n-include /home/butta/swift-5.5-android-armv7-24-sdk/usr/include/
--android-icu-data /home/butta/swift-5.5-android-armv7-24-sdk/usr/lib/libicudata.so
--test --validation-test --long-test --stress-test --test-optimized --lit-args="-v --time-tests"
--build-swift-static-stdlib --build-swift-static-sdk-overlay --build-swift-stdlib-unittest-extra
--host-test --skip-test-linux -j9 --skip-build-compiler-rt

I also had to apply #30170, which I recently rebased, only because I don't build libicu from source like the preset and CI do, and that pull allows to me link the linux and Android stdlib against the Fedora system libicu package and the Termux Android packages from my Android SDKs instead. I think swiftlang/swift-corelibs-libdispatch#568 is also needed, so I applied that patch too.

Only three tests fail for both arches, all related to reflection, after I split the one linker test apart for lld in this pull: Reflection/capture_descriptors.sil, Reflection/typeref_decoding_imported.swift, and AutoDiff/validation-test/reflection.swift. It appears to be related to switching the linker to lld but I don't know why: maybe you know, @drodriguez?

I also went ahead and got rid of the NDK GCC version variables, since they are unused after NDK 23 ditched binutils.

set(SWIFT_ANDROID_NDK_GCC_VERSION "" CACHE STRING
"The GCC version to use when building for Android. Currently only 4.9 is supported.")
set(SWIFT_ANDROID_NDK_CLANG_VERSION "12.0.5" CACHE STRING
"The Clang version to use when building for Android.")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need to update this every year when a new LTS NDK comes out.

# We need to add the math library, which is linked implicitly by libc++
list(APPEND result "-lm")
list(APPEND result "-rtlib=compiler-rt")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove this line after the rebranch, as this is the default for Android with clang 13, but the next line is the path in the NDK for libunwind.a, so we'll need to pass that since we're linking with the Swift-forked clang.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed it since the rebranch already adds it.

@finagolfin
Copy link
Member Author

I looked into the three failing reflection tests and while it's most likely related to this pull switching to lld 12.0.5 from the NDK, these same tests passed for me with lld 12.0.1 a month ago, when run natively on my Android device.

Maybe there's some difference in those lld versions: 42a3da2 from last year says swift-reflection-dump is sensitive to differences in the ELF image. What do you think, @3405691582?

@3405691582
Copy link
Contributor

@buttaface I'm still seeing segfaults in swift-reflection-dump (only in the Reflection/typeref_decoding_concurrency.swift test now though). I would guess the particular problem in SR-12893 still persists though I haven't walked through a recent coredump to try and confirm this.

@finagolfin
Copy link
Member Author

In my case, the dump produces empty output for those three tests, with one exception, the Autodiff test gets swift-reflection-dump to crash, but only for Android armv7.

I just tried switching to linking with the lld 10 that is freshly built with the Swift toolchain and that fails the same way.

@3405691582
Copy link
Contributor

FWIW, the good news is that if this is indeed the same bug, I don't think think the impact is terribly high because the code only resides inside swift-reflection-dump and that AIUI is only really useful for the tests. I haven't looked too much further into it unfortunately, and last I recall, fixing this properly isn't trivial.

@finagolfin finagolfin force-pushed the android-ndk-23 branch 2 times, most recently from 478647c to 7e915c2 Compare October 8, 2021 08:42
@finagolfin
Copy link
Member Author

Rebased, disabled those three now-failing reflection tests, and added some tweaks from building with this patch natively on my Android phone with clang 13, such as not adding the NDK path to libunwind when building in Termux.

@drodriguez, this is ready to go. Along with swiftlang/swift-corelibs-libdispatch#568, it should get the community Android CI working with NDK 23 and avoid any breakage from the coming rebranch to clang 13 in #39473.

@finagolfin
Copy link
Member Author

finagolfin commented Oct 10, 2021

Btw, unrelated to this pull but I'm seeing about 30 Foundation tests related to time zones failing when built and run natively on Android 11 AArch64 from the Oct. 5 source snapshot. Guessing it's related to the Monterey merge last month, as everything was fine when I built the Sep. 2 source snapshot before that got in. I don't know if you're still running the Foundation tests on Android, but if you are, would be good if you can confirm.

Update: found and submitted a fix in swiftlang/swift-corelibs-foundation#3088

@finagolfin
Copy link
Member Author

As I predicted, the rebranch broke the community Android CI because of the switch to linking against compiler-rt and libunwind with clang 13:

/home/ubuntu/android-ndk-r19c/toolchains/aarch64-linux-android-4.9/prebuilt/linux-x86_64/aarch64-linux-android/bin/ld.gold: error: cannot open /home/ubuntu/jenkins/workspace/oss-swift-RA-linux-ubuntu-16.04-android-arm64/buildbot_linux/llvm-linux-x86_64/lib/clang/10.0.0/lib/linux/libclang_rt.builtins-aarch64-android.a: No such file or directory
/home/ubuntu/android-ndk-r19c/toolchains/aarch64-linux-android-4.9/prebuilt/linux-x86_64/aarch64-linux-android/bin/ld.gold: error: cannot find libunwind.a

This pull and the libdispatch pull should get it running again, provided NDK 23 is installed on the CI.

The latest Long Term Support NDK finally removed binutils, including the bfd/gold
linkers and libgcc. This simplifies our Android support, including making lld the
default linker for Android. Disable three reflection tests that now fail, likely
related to issues with swift-reflection-dump and switching to lld.

Also, add the libatomic dependency for Android armv7, just as on linux.
@drodriguez
Copy link
Contributor

@swift-ci please test and merge

@drodriguez
Copy link
Contributor

@buttaface: seems to work on a quick test I did, using this on top of main and with NDK r23. Hopefully it will also fix the CI machines. Thanks!

@swift-ci swift-ci merged commit 46f3cc9 into swiftlang:main Oct 16, 2021
@finagolfin
Copy link
Member Author

Still failing because the libdispatch pull has not been merged, I mentioned above that that would be needed too.

@drodriguez, why don't we just get that in to fix the CI, swiftlang/swift-corelibs-libdispatch#568, and if you don't like that solution, we can always change it later?

@drodriguez
Copy link
Contributor

It didn't fail for me in my testing. Was it failing on CI with some problem from libdispatch?

I see this have been reverted for other reasons and there's discussion there about what to do with libdispatch. We can always merge this first (when it is correct and doesn't break some other platform), and merge the libdispatch changes after.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants