libc++ is linked improperly when using cmake or standalone toolchains #379

Closed
marijnvdwerf opened this issue May 2, 2017 · 25 comments

Comments

@marijnvdwerf

Is there a roadmap on providing stable exception support on ARM-devices when compiled using c++_shared?

@janisozaur

Or a list of known cases that generate invalid results?

@DanAlbert
Member

Do you have a bug report? We have no open bugs about exception handling in libc++ that I see.

@marijnvdwerf
Author

marijnvdwerf commented May 2, 2017

The documentation says so at https://developer.android.com/ndk/guides/cpp-support.html#ic (scroll up a bit):

Compatibility

The NDK's libc++ is not stable. Not all the tests pass, and the test suite is not comprehensive. Some known issues are:

  • Using c++_shared on ARM can crash when an exception is thrown.

On a project we noticed strange crashes after exceptions. I assumed this to be the cause. I think I could make a small proof of concept for it.

@DanAlbert
Member

Oh, that bullet point is way out of date. That was a KI in r11 or something.

libc++ is in much better shape than it was back then, but it's still not quite to the point that I'm comfortable recommending it. For a timeline on that, see our roadmap.

> On a project we noticed strange crashes after exceptions. I assumed this to be the cause. I think I could make a small proof of concept for it.

Please do. I can't fix bugs that don't get reported.

@DanAlbert
Member

(doc is fixed now, btw, sorry for the confusion)

@marijnvdwerf
Author

I dug a bit deeper into this, and now I know that libc++ isn't the source of the issue. I have a test case here: https://gist.github.com/marijnvdwerf/5ae2d0405dc4ce766d7ab3d08522a308.

When jansson is loaded, it crashes. When second-lib is loaded, it doesn't. For some reason, the externalproject_add library gets linked with a lot of gnu functions, while the other doesn't.

nm libjansson.so | grep 'gnu'
00008038 T __gnu_Unwind_Backtrace
00000000 w __gnu_Unwind_Find_exidx
00007f36 T __gnu_Unwind_ForcedUnwind
00007ed4 T __gnu_Unwind_RaiseException
00008570 T __gnu_Unwind_Restore_VFP
00008580 T __gnu_Unwind_Restore_VFP_D
00008590 T __gnu_Unwind_Restore_VFP_D_16_to_31
00008628 T __gnu_Unwind_Restore_WMMXC
000085a0 T __gnu_Unwind_Restore_WMMXD
00007f4a T __gnu_Unwind_Resume
00007f8c T __gnu_Unwind_Resume_or_Rethrow
00008578 T __gnu_Unwind_Save_VFP
00008588 T __gnu_Unwind_Save_VFP_D
00008598 T __gnu_Unwind_Save_VFP_D_16_to_31
0000863c T __gnu_Unwind_Save_WMMXC
000085e4 T __gnu_Unwind_Save_WMMXD
00007dd2 t __gnu_unwind_24bit.isra.1
0000874a T __gnu_unwind_execute
000089c0 T __gnu_unwind_frame
00007ca6 t __gnu_unwind_get_pr_addr
000080b2 t __gnu_unwind_pr_common

I'm assuming I'm doing something wrong, causing this. But I don't know what.
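For anyone who wants to automate the check above, here is a minimal sketch (not from the thread) that scans `nm` output for unwinder symbols exported as global text symbols (`T`). The prefix list and the function name `exported_unwind_symbols` are assumptions made for illustration, based only on the listing above.

```python
# Prefixes of unwinder symbols; an assumption based on the nm listing above.
UNWIND_PREFIXES = ("__gnu_Unwind_", "__gnu_unwind_", "_Unwind_")


def exported_unwind_symbols(nm_output: str) -> list[str]:
    """Return unwinder symbols that nm reports as global text symbols ('T')."""
    found = []
    for line in nm_output.splitlines():
        parts = line.split()
        # A defined symbol line looks like "<address> <type> <name>".
        # Undefined symbols have no address and are skipped here.
        if len(parts) != 3:
            continue
        _address, sym_type, name = parts
        if sym_type == "T" and name.startswith(UNWIND_PREFIXES):
            found.append(name)
    return found


sample = """\
00008038 T __gnu_Unwind_Backtrace
00000000 w __gnu_Unwind_Find_exidx
00007dd2 t __gnu_unwind_24bit.isra.1
0000874a T __gnu_unwind_execute
"""
print(exported_unwind_symbols(sample))
```

A non-empty result for a library that is not itself the unwinder suggests the library is re-exporting unwinder symbols, which is the situation seen in libjansson.so here.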

@DanAlbert
Member

> For some reason, the externalproject_add library gets linked with a lot of gnu functions, while the other doesn't.

This is probably the problem. Could you share how second-lib is built?

@marijnvdwerf
Author

marijnvdwerf commented May 3, 2017 via email

@DanAlbert
Member

Hmm. We need to rethink how we link STLs in cmake, and also in standalone toolchains.

Looking at the -v output for the link command for native-lib:

... CMakeFiles/native-lib.dir/native-lib.cpp.o /usr/local/google/home/danalbert/src/android-ndk-r14b/platforms/android-16/arch-arm/usr/lib/liblog.so lib/libjansson.so -lm /usr/local/google/home/danalbert/src/android-ndk-r14b/sources/cxx-stl/llvm-libc++/libs/armeabi-v7a/libc++.so -lstdc++ -lm -lgcc -ldl -lc -lgcc -ldl /usr/local/google/home/danalbert/src/android-ndk-r14b/platforms/android-16/arch-arm/usr/lib/../lib/crtend_so.o

In order:

  1. Object files
  2. Some shared libs (including libjansson, which exports the libgcc unwind symbols)
  3. libc++ (in reality a linker script that includes some static libraries, including libunwind.a)
  4. More shared libs
  5. And the usual prolog: -lgcc -ldl -lc -lgcc -ldl

If we look at the output for roughly the same project in ndk-build:

/work/src/armeabi-exceptions/obj/local/armeabi-v7a/objs/native-lib/native-lib.o /work/src/android-ndk-r14b/sources/android/support/../../cxx-stl/llvm-libc++/libs/armeabi-v7a/libandroid_support.a /work/src/android-ndk-r14b/sources/cxx-stl/llvm-libc++/libs/armeabi-v7a/libunwind.a -lgcc /work/src/armeabi-exceptions/obj/local/armeabi-v7a/libjansson.so /work/src/armeabi-exceptions/obj/local/armeabi-v7a/libc++_shared.so --fix-cortex-a8 --exclude-libs libunwind.a --build-id --no-undefined -z noexecstack -z relro -z now --warn-shared-textrel --fatal-warnings -llog -lc -lm -lstdc++ -lm -lgcc -ldl -lc -lgcc -ldl

  1. Object files
  2. All static libraries
  3. libgcc
  4. All shared libraries
  5. And the usual prolog

ndk-build gets this right. All static libraries must precede libgcc, which must precede all shared libraries. I think I can at least fix cmake by just not using the linker script, but I'm not sure what to do about standalone toolchains...
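The ordering rule can be expressed as a small check. The following is a rough sketch, not an NDK tool: it treats any token ending in `.a` as a static library, any token ending in `.so` as a shared library, and ignores all other flags and `-l` options. The function name `link_order_ok` and the shortened paths in the examples are assumptions for illustration.

```python
def link_order_ok(link_line: str) -> bool:
    """Check that static archives precede the first -lgcc,
    and that the first -lgcc precedes every shared library."""
    last_static = -1
    first_libgcc = None
    first_shared = None
    skip_next = False
    for index, token in enumerate(link_line.split()):
        if skip_next:
            # The argument of --exclude-libs names an archive but does not link it.
            skip_next = False
        elif token == "--exclude-libs":
            skip_next = True
        elif token == "-lgcc":
            if first_libgcc is None:
                first_libgcc = index
        elif token.startswith("-"):
            continue  # other flags and -l options are ignored in this sketch
        elif token.endswith(".a"):
            last_static = index
        elif token.endswith(".so"):
            if first_shared is None:
                first_shared = index
    if first_libgcc is None or last_static > first_libgcc:
        return False
    return first_shared is None or first_libgcc < first_shared


# Shortened versions of the two link lines quoted above.
cmake_line = (
    "native-lib.cpp.o liblog.so lib/libjansson.so -lm libc++.so "
    "-lstdc++ -lm -lgcc -ldl -lc -lgcc -ldl crtend_so.o"
)
ndk_build_line = (
    "native-lib.o libandroid_support.a libunwind.a -lgcc libjansson.so "
    "libc++_shared.so --exclude-libs libunwind.a -llog -lc -lm -lstdc++ "
    "-lm -lgcc -ldl -lc -lgcc -ldl"
)
print(link_order_ok(cmake_line))
print(link_order_ok(ndk_build_line))
```

The check only looks at the first `-lgcc`, since the trailing `-lgcc -ldl -lc -lgcc -ldl` group appears at the end of both link lines.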

@DanAlbert changed the title from "[question] libc++ exception support" to "libc++ is linked improperly when using cmake or standalone toolchains" on May 3, 2017
@DanAlbert self-assigned this on May 3, 2017

@marijnvdwerf
Author

Fyi, this is the real-world dependency-building project we (attempt to) use: https://github.com/marijnvdwerf/openrct2-dependencies-android.

@DanAlbert
Member

I don't think this can be truly fixed without patching CMake itself: https://github.com/Kitware/CMake/blob/master/Source/cmLinkLineComputer.cxx#L182

The standard libraries are always the last libraries to be linked. Even if we were to try to add the libraries in some way other than CMAKE_CXX_STANDARD_LIBRARIES_INIT, CMake links all libraries in the order they were provided, regardless of whether they were static or shared libraries: https://github.com/Kitware/CMake/blob/master/Source/cmLinkLineComputer.cxx#L62

This works in the typical CMake build because CMake usually isn't in the business of deciding how to link your stdlib, it just leaves that to the compiler. These sorts of issues can affect libraries beyond the stdlib too, it's just much less likely (DLL hell for static libraries, essentially).

There might be some things we can do to mitigate this problem. Could you try applying this patch to your cmake toolchain file and rebuilding? I've done this locally and can confirm that the unwind symbols in libjansson do get hidden. I believe this should alleviate the problems, even if it isn't a perfect solution.

Longer term we're going to be replacing both unwinders (libgcc and libunwind) with one that we've written from scratch. Having that should also help things.

@marijnvdwerf
Author

How can I download the patch file?

@DanAlbert
Member

Excellent question. I'm not sure, so here: https://gist.github.com/DanAlbert/e8593db1f22e3014463da02a54471927

@enh
Contributor

enh commented May 4, 2017

(there's the download drop-down in the top right corner that lets you download something you'll have to base64 decode or a zip file, or you can click on the individual file and there's a download icon next to where it says "Patch set 1" over the right hand pane.)

@marijnvdwerf
Author

This does fix the example project, and the real-world subprojects we use externalproject_add for. Any idea on how to solve it with the standalone toolchain/autoconf? (PNG seems to have hidden unwind symbols, while speexdsp had visible ones)

@DanAlbert
Member

The standalone toolchain case is a bit more haphazard, but I think we can do it by wrapping the linker in a script that would autoappend those flags (this is already how we add the -target flag to Clang).
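To illustrate the wrapper idea, here is a hypothetical sketch of a linker wrapper that appends symbol-hiding flags before handing off to the real linker. It is not the NDK's actual wrapper. The flag list is an assumption based on the `--exclude-libs` options that appear in link commands quoted in this thread, and `REAL_LD` is an invented environment variable.

```python
import os
import sys

# Assumed flags, taken from the link commands quoted in this thread.
HIDE_UNWINDER_FLAGS = ["--exclude-libs", "libgcc.a", "--exclude-libs", "libunwind.a"]


def wrapped_linker_args(real_linker: str, args: list[str]) -> list[str]:
    """Build the argv for the real linker with the hiding flags appended."""
    return [real_linker, *args, *HIDE_UNWINDER_FLAGS]


def main() -> None:
    # REAL_LD is a hypothetical variable naming the linker to delegate to.
    real_linker = os.environ.get("REAL_LD", "ld.gold")
    argv = wrapped_linker_args(real_linker, sys.argv[1:])
    os.execvp(argv[0], argv)  # replace this process with the real linker


# Show what the wrapper would run for a simple link, without executing it.
print(wrapped_linker_args("ld.gold", ["-shared", "-o", "libfoo.so", "foo.o"]))
```

A wrapper script would call `main()` as its entry point. The sketch deliberately does not call it, so running the block only prints the command it would build.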

I can go ahead and submit the CMake fix (I'll extend it to cover ndk-build too) and work on the standalone toolchain case separately.

@DanAlbert
Member

https://android-review.googlesource.com/389852 has a fix for standalone toolchains, but it occurs to me that we might not be able to use this. iirc this will break clang -fuse-ld=blah on Windows, since you have to actually do clang -fuse-ld=gold.exe, and now all those users would need to change that to clang -fuse-ld=gold.cmd. The existence of the wrapper would need to be transparent to the user (that's the whole point of the standalone toolchain).

I can verify this later when I have a Windows machine available. If I'm right, maybe we should teach the Clang driver to do this by default for Android. @stephenhines: thoughts?

We should also get the -fuse-ld=gold.exe thing fixed at some point. I thought we had, but we had a report of this recently.

@DanAlbert
Member

Yeah, confirmed that. Fixing standalone toolchains with the patch above will make things messy on Windows. We're not going to be able to tackle those in r15. I'll give it some more thought.

@Jaykob

Jaykob commented Jul 14, 2017

I'm having the same issue, however the provided patch doesn't help in my case.
My setup is as follows: A fat library called scankit.so depends on many other C++ libraries (opencv, tesseract, zxing, ...) which are all built with gradle, cmake and the NDK r15, similar to the project above. I'm using the SDK-bundled cmake version 3.6.4111459. The fat lib is then packaged into an AAR and used in the app project through gradle.

Now when an exception is thrown in one of these libraries and caught in the scankit, I get the same SIGABRT as above.
I tried both c++_static and c++_shared versions, the only thing that changed was the place where it happened. With static, an exception thrown from lib a was crashing and with shared, this exception went through fine but an exception thrown in lib b now leads to a crash instead.
I tried the readelf stuff mentioned here #289 and got rid of the undefined Unwinder stuff, but that didn't fix it.
Is there anything I'm missing here or any advice what I should check to narrow it down? The linker command looks like this:

/Users/xyz/android-ndk-r15/toolchains/llvm/prebuilt/darwin-x86_64/bin/clang++  --target=armv7-none-linux-androideabi 
--gcc-toolchain=/Users/xyz/android-ndk-r15/toolchains/arm-linux-androideabi-4.9/prebuilt/darwin-x86_64 
--sysroot=/Users/xyz/android-ndk-r15/sysroot -fPIC -isystem 
/Users/xyz/android-ndk-r15/sysroot/usr/include/arm-linux-androideabi -D__ANDROID_API__=19 -g -DANDROID 
-ffunction-sections -funwind-tables -fstack-protector-strong -no-canonical-prefixes -march=armv7-a -mfloat-abi=softfp 
-mfpu=vfpv3-d16 -fno-integrated-as -mthumb -Wa,--noexecstack -Wformat -Werror=format-security -std=c++11 -frtti 
-fexceptions -fexceptions -frtti -frtti -fexceptions -std=c++11 -Os -DNDEBUG  -Wl,--exclude-libs,libgcc.a --sysroot 
/Users/xyz/android-ndk-r15/platforms/android-19/arch-arm -Wl,--build-id -Wl,--warn-shared-textrel -Wl,--fatal-warnings 
-Wl,--fix-cortex-a8 -Wl,--exclude-libs,libunwind.a 
-L/Users/xyz/android-ndk-r15/sources/cxx-stl/llvm-libc++/libs/armeabi-v7a -Wl,--no-undefined -Wl,-z,noexecstack 
-Qunused-arguments -Wl,-z,relro -Wl,-z,now -shared -Wl,-soname,libscankit.so -o 
../../../../build/intermediates/cmake/release/obj/armeabi-v7a/libscankit.so 
CMakeFiles/scankit.dir/src/helper/helper.cpp.o CMakeFiles/scankit.dir/src/scankit/ReceiptScanner.cpp.o 
CMakeFiles/scankit.dir/src/scankit/ReceiptSearcher.cpp.o CMakeFiles/scankit.dir/src/scankit/ReceiptSearchResult.cpp.o 
CMakeFiles/scankit.dir/src/scankit/ReceiptSearchStatus.cpp.o CMakeFiles/scankit.dir/src/scankit/RelativePoint.cpp.o 
CMakeFiles/scankit.dir/src/scankit/binarize.cpp.o CMakeFiles/scankit.dir/src/scankit/InformationExtraction.cpp.o 
CMakeFiles/scankit.dir/platform-dependent/android/src/main/jni/cpp/ReceiptScanner.cpp.o 
CMakeFiles/scankit.dir/platform-dependent/android/src/main/jni/cpp/ReceiptSearchResult.cpp.o 
CMakeFiles/scankit.dir/platform-dependent/android/src/main/jni/cpp/ReceiptSearchStatus.cpp.o 
CMakeFiles/scankit.dir/platform-dependent/android/src/main/jni/cpp/ReceiptSearcher.cpp.o 
CMakeFiles/scankit.dir/platform-dependent/android/src/main/jni/cpp/RelativePoint.cpp.o 
CMakeFiles/scankit.dir/platform-dependent/android/src/support-lib/djinni_support.cpp.o  -llog 
../../../../src/main/jniLibs/armeabi-v7a/libopencv_java3.so ../../../../src/main/jniLibs/armeabi-v7a/libjpgt.so 
../../../../src/main/jniLibs/armeabi-v7a/libpngt.so ../../../../src/main/jniLibs/armeabi-v7a/liblept.so 
../../../../src/main/jniLibs/armeabi-v7a/libtess.so ../../../../src/main/jniLibs/armeabi-v7a/libzxing.so 
../../../../src/main/jniLibs/armeabi-v7a/libtre.so ../../../../src/main/excludedLibs/armeabi-v7a/libLottoKit.so -lm 
-lgcc -ldl -lc -lm "/Users/xyz/android-ndk-r15/sources/cxx-stl/llvm-libc++/libs/armeabi-v7a/libc++.so" 

@Jaykob

Jaykob commented Jul 18, 2017

BTW, can anybody explain to me the difference between libc++.so and libc++_shared.so in this case? My log shows that I'm linking libc++.so, while the other logs show libc++_shared.so being linked instead.

@DanAlbert
Member

#379 (comment)

> libc++ (in reality a linker script that includes some static libraries, including libunwind.a)

libc++_shared.so is the real library, libc++.so is a linker script that makes it less of a hassle to use standalone toolchains.
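One way to tell the two apart on disk is to look at the first four bytes: a real shared object starts with the ELF magic number, while a linker script is plain text. The sketch below assumes nothing about the script's actual contents; the `INPUT(...)` string is only an illustrative stand-in, and the helper names are made up for this example.

```python
ELF_MAGIC = b"\x7fELF"


def is_elf(header: bytes) -> bool:
    """Return True if the given leading bytes carry the ELF magic number."""
    return header[:4] == ELF_MAGIC


def describe_library(path: str) -> str:
    """Classify a file by its first four bytes."""
    with open(path, "rb") as f:
        header = f.read(4)
    if is_elf(header):
        return "ELF file"
    return "not ELF (possibly a linker script)"


# In-memory examples; the INPUT(...) text is illustrative only.
print(is_elf(b"\x7fELF\x01\x01\x01"))
print(is_elf(b"INPUT(-lc++_shared)"))
```

Calling `describe_library` on the two files in the NDK's libs directory should make the difference visible without needing readelf.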

fornwall added a commit to termux/termux-packages that referenced this issue Jul 27, 2017
See android/ndk#379

Fixes #1163

Fixes issues with gdb segfaulting on arm on an unrecognized command.
@kaa-awem

kaa-awem commented Sep 11, 2017

Hi Dan.

I'm using NDK r15c (15.2.4203891) and cmake 3.6.4111459 (I see the very same results on r12b and r14b and the previous version of cmake). I've checked the cmake scripts as well as the build.ninja file. It seems that -Wl,--exclude-libs,libgcc.a -Wl,--exclude-libs,libunwind.a is used as expected.
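A check like this can be scripted. The sketch below (an illustration, not part of the NDK) reports which of the two flags are absent from a single link command; the function name `missing_hide_flags` and the shortened example command are assumptions.

```python
REQUIRED_FLAGS = (
    "-Wl,--exclude-libs,libgcc.a",
    "-Wl,--exclude-libs,libunwind.a",
)


def missing_hide_flags(command: str) -> list[str]:
    """Return the required flags that do not appear as tokens in the command."""
    tokens = command.split()
    return [flag for flag in REQUIRED_FLAGS if flag not in tokens]


# A shortened, made-up link command that carries only one of the flags.
command = "clang++ -shared -Wl,--exclude-libs,libgcc.a -o libnative-lib.so native-lib.cpp.o"
print(missing_hide_flags(command))
```

Feeding it each link command from a verbose build gives a quick list of targets that are linked without the hiding flags.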

However, I have an issue similar to the one described here. When I statically link libc++, I get the following crash:

  * frame #0: 0xf11c0e44 libc.so`tgkill + 12
    frame #1: 0xf11be5e6 libc.so`pthread_kill + 38
    frame #2: 0xf11948a8 libc.so`raise + 14
    frame #3: 0xf11903f4 libc.so`__libc_android_abort + 38
    frame #4: 0xf118e038 libc.so`abort + 8
    frame #5: libnative-lib.so`__cxxabiv1::readEncodedPointer(data=<unavailable>, encoding=<unavailable>) at cxa_personality.cpp:304
    frame #6: libnative-lib.so`__cxxabiv1::scan_eh_tab(results=<unavailable>, actions=<unavailable>, native_exception=<unavailable>, unwind_exception=<unavailable>, context=<unavailable>)::scan_results&, _Unwind_Action, bool, _Unwind_Control_Block*, _Unwind_Context*) at cxa_personality.cpp:629
    frame #7: libnative-lib.so`::__gxx_personality_v0(state=<unavailable>, unwind_exception=0xcf665a28, context=<unavailable>) at cxa_personality.cpp:1098
    frame #8: 0xf14cf024 liblog.so`__gnu_Unwind_RaiseException + 112
    frame #9: 0xf14cfb78 liblog.so`_Unwind_RaiseException + 24
    frame #10: libnative-lib.so`::__cxa_throw(thrown_object=0xcf665a80, tinfo=<unavailable>, dest=<unavailable>)(void *)) at cxa_exception.cpp:223

When I dynamically link libc++, I get a crash here:

  * frame #0: 0xf14cf078 liblog.so`__gnu_Unwind_Resume + 4
    frame #1: 0xf14cfb9c liblog.so`_Unwind_Resume + 24

Please note that, for some reason, the crash with dynamically linked libc++ happens when the exception is thrown, but the call stack shows only _Unwind_Resume.

Note: it seems that the Unwind* functions were linked from liblog.so instead of being statically linked from libunwind.a. Following ndk-build's link order, we re-sorted all libraries in CMakeLists.txt and manually added libc++ between the static and dynamic libraries. Now only __gnu_Unwind_Find_exidx is left unresolved; we are not sure which piece of code or library introduced it. However, we no longer see crashes caused by exceptions.
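Spotting which module supplies the unwinder frames can be done mechanically from a backtrace in the format shown above. This is a rough sketch under that assumption; the regular expression and the function name `unwinder_modules` are invented for illustration and only cover frame lines shaped like the ones in this comment.

```python
import re

# Matches "frame #N: [0xADDR ]module`symbol" as printed in the backtraces above.
FRAME_RE = re.compile(
    r"frame #\d+: (?:0x[0-9a-f]+ )?(?P<module>[^`\s]+)`(?P<symbol>[^\s(]+)"
)


def unwinder_modules(backtrace: str) -> list[str]:
    """Return the sorted names of modules whose frames are _Unwind_ functions."""
    modules = set()
    for line in backtrace.splitlines():
        match = FRAME_RE.search(line)
        if match and "_Unwind_" in match.group("symbol"):
            modules.add(match.group("module"))
    return sorted(modules)


backtrace = """\
    frame #4: 0xf118e038 libc.so`abort + 8
    frame #7: libnative-lib.so`::__gxx_personality_v0(state=<unavailable>) at cxa_personality.cpp:1098
    frame #8: 0xf14cf024 liblog.so`__gnu_Unwind_RaiseException + 112
    frame #9: 0xf14cfb78 liblog.so`_Unwind_RaiseException + 24
"""
print(unwinder_modules(backtrace))
```

If the result names a library other than the one that contains the throwing code, the unwinder was probably resolved from that other library at load time.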

@KevinWrightCatDaddy

KevinWrightCatDaddy commented Sep 22, 2017

We ran into the same issue here. We use a custom script to call clang instead of using NDK build, Android Studio, or anything else. Sometime between setting up our build system and the latest NDK, the NDK build added an unwind.a (I think in NDK r15?) to its linking process. We added that and reordered the args we pass to the linker (based on the verbose output of the NDK build), and that resolved our issue.

@DanAlbert DanAlbert added this to the r19 milestone Mar 16, 2018
miodragdinic pushed a commit to MIPS/ndk that referenced this issue Apr 17, 2018
It seems we can't guarantee the link order of our libraries (objects,
then static libraries, then shared libraries) in cmake, which
prevents us from ensuring that libunwind is used with libc++ in the
event that some other library is exporting an unwinder.

The best we can do without patching cmake (and until the new unwinder
is available) is hide libgcc in any new libraries we build. This
can't protect us from libraries built without this patch (or with
build systems we don't control), but it will at least protect us from
ourselves.

Patch this in both ndk-build and cmake. ndk-build doesn't have the
same problems that cmake does, but libraries built by ndk-build can
cause problems for cmake. Standalone toolchains are also affected by
this. I'll fix them in a follow up patch.

Test: Built the test case attached to the bug, libgcc is hidden.
Bug: android/ndk#379
Change-Id: I3406c7e49f7194231bfe1e5f921a3e51d875dc84
miodragdinic pushed a commit to MIPS/ndk that referenced this issue Apr 17, 2018
Doing this here causes libgcc to be linked ahead of the STL, which
breaks unwinding and therefore exception handling.

Test: ./run_tests.py
Bug: android/ndk#379
Change-Id: I63cbbdc125fdc3231a21f3354c1b1cfb129ac12b
@DanAlbert
Member

A breakdown of the current status here:

  • ndk-build is unaffected by this bug as of r15, even if the build includes broken prebuilt libraries.
  • CMake should be unaffected as of r16, assuming there are no broken prebuilts.
  • Standalone toolchains and third-party build systems are partially fixed as of r19. See explanation below. Compatibility with broken prebuilts is up to the build system.

Since the issue with broken prebuilts can only be fixed by the build system, the NDK cannot do any more to solve that problem. ndk-build does not have this problem.

NDK r19 turns libgcc.a into a linker script to ensure that libunwind.a is linked before libgcc.a, which clears up most of the issues. This cannot make the symbols hidden, so it may not behave properly in the presence of broken dependencies, but the toolchain itself produces correct output.

The remaining work here is to lift -Wl,--exclude-libs,libgcc.a into the Clang driver for Android targets (realistically it's probably a good change to make for all targets when using a static libgcc, but idk if that will get traction). I've forked that into #823.

Anything beyond that is something that each build system will have to deal with individually.

@DanAlbert
Member

I should also mention that the correct build process is documented here: https://android.googlesource.com/platform/ndk/+/master/docs/BuildSystemMaintainers.md#Unwinding
