support-flto #310

Closed
headupinclouds opened this issue Feb 18, 2017 · 20 comments

@headupinclouds
Collaborator

headupinclouds commented Feb 18, 2017

UPDATE 2/21/17: clang toolchains seem to work for iOS and OS X builds. Android NDK toolchains (both clang and gcc) hit various roadblocks. See the discussion below.

See: http://stackoverflow.com/a/25649861 : -flto -O3 may optimize further but increase size, while -flto -Os should reduce size. Also, -flto -Os (or whatever optimization setting is used) must be used consistently for compilation and linking within the project and all external (Hunter) dependencies. Thus, -flto belongs in the toolchain, and we must specify a CMAKE_BUILD_TYPE of MinSizeRel. This can be set for the internal project and for the Hunter dependencies, respectively, using the following polly flags:

--config=MinSizeRel
--fwd HUNTER_CONFIGURATION_TYPES=MinSizeRel

Relates: https://github.com/ruslo/hunter/issues/22

Note: The polly toolchain requires the HUNTER_TOOLCHAIN_UNDETECTABLE_ID flag:

# There is no macro to detect this flag during toolchain calculation, so we must
# mark this toolchain explicitly.
list(APPEND HUNTER_TOOLCHAIN_UNDETECTABLE_ID "lto")

Experiment 1: -O3 -ffunction-sections -fdata-sections

$ du -sh _install/xcode-hid-sections/{bin,lib}/*

5.8M	_install/xcode-hid-sections/bin/drishti-acf
8.0M	_install/xcode-hid-sections/bin/drishti-eye
9.0M	_install/xcode-hid-sections/bin/drishti-face
4.7M	_install/xcode-hid-sections/bin/opencv_size
6.6M	_install/xcode-hid-sections/lib/libdrishti.0.8.0.dylib
4.0K	_install/xcode-hid-sections/lib/libdrishti.0.dylib
4.0K	_install/xcode-hid-sections/lib/libdrishti.dylib
4.1M	_install/xcode-hid-sections/lib/libdrishti_c.dylib

Experiment 2: -flto -O3 -ffunction-sections -fdata-sections

du -sh _install/xcode-hid-sections-lto/{bin,lib}/*

4.3M	_install/xcode-hid-sections-lto/bin/drishti-acf
5.3M	_install/xcode-hid-sections-lto/bin/drishti-eye
7.2M	_install/xcode-hid-sections-lto/bin/drishti-face
2.8M	_install/xcode-hid-sections-lto/bin/opencv_size
6.9M	_install/xcode-hid-sections-lto/lib/libdrishti.0.8.0.dylib
4.0K	_install/xcode-hid-sections-lto/lib/libdrishti.0.dylib
4.0K	_install/xcode-hid-sections-lto/lib/libdrishti.dylib
3.8M	_install/xcode-hid-sections-lto/lib/libdrishti_c.dylib

Experiment 3: -flto -Os -ffunction-sections -fdata-sections

Note: needs Eigen patch for MinSizeRel

3.7M	_install/xcode-hid-sections-lto/bin/drishti-acf
4.9M	_install/xcode-hid-sections-lto/bin/drishti-eye
6.6M	_install/xcode-hid-sections-lto/bin/drishti-face
2.5M	_install/xcode-hid-sections-lto/bin/opencv_size
5.5M	_install/xcode-hid-sections-lto/lib/libdrishti-MinSizeRel.0.8.0.dylib
4.0K	_install/xcode-hid-sections-lto/lib/libdrishti-MinSizeRel.0.dylib
4.0K	_install/xcode-hid-sections-lto/lib/libdrishti-MinSizeRel.dylib
2.9M	_install/xcode-hid-sections-lto/lib/libdrishti_c-MinSizeRel.dylib
 gzip _install/xcode-hid-sections-lto/lib/libdrishti_c-MinSizeRel.dylib
du -sh _install/xcode-hid-sections-lto/lib/libdrishti_c-MinSizeRel.dylib.gz 
1.1M	_install/xcode-hid-sections-lto/lib/libdrishti_c-MinSizeRel.dylib.gz

Experiment 4: -flto -Os -ffunction-sections -fdata-sections w/ cereal

DRISHTI_SERIALIZE_WITH_BOOST=OFF
DRISHTI_SERIALIZE_WITH_CEREAL=ON
DRISHTI_SERIALIZE_WITH_CVMATIO=OFF
2.7M	_install/xcode-hid-sections-lto/bin/drishti-acf
3.2M	_install/xcode-hid-sections-lto/bin/drishti-eye
4.1M	_install/xcode-hid-sections-lto/bin/drishti-face
2.5M	_install/xcode-hid-sections-lto/bin/opencv_size
5.1M	_install/xcode-hid-sections-lto/lib/libdrishti-MinSizeRel.0.8.0.dylib
4.0K	_install/xcode-hid-sections-lto/lib/libdrishti-MinSizeRel.0.dylib
4.0K	_install/xcode-hid-sections-lto/lib/libdrishti-MinSizeRel.dylib
2.5M	_install/xcode-hid-sections-lto/lib/libdrishti_c-MinSizeRel.dylib
gzip _install/xcode-hid-sections-lto/lib/libdrishti_c-MinSizeRel.dylib
du -sh _install/xcode-hid-sections-lto/lib/libdrishti_c-MinSizeRel.dylib.gz 
1004K	_install/xcode-hid-sections-lto/lib/libdrishti_c-MinSizeRel.dylib.gz
@headupinclouds
Collaborator Author

Eigen won't support MinSizeRel:

Build/Eigen/args.cmake -DCMAKE_MINSIZEREL_POSTFIX=-MinSizeRel -DCMAKE_BUILD_TYPE=MinSizeRel -DCMAKE_CONFIGURATION_TYPES=MinSizeRel -DCMAKE_INSTALL_PREFIX=/Users/dhirvonen/.hunter/_Base/1b9e720/61a0300/3e2bb40/Build/Eigen/Install -DCMAKE_TOOLCHAIN_FILE=/Users/dhirvonen/.hunter/_Base/1b9e720/61a0300/3e2bb40/Build/Eigen/toolchain.cmake -GXcode /Users/dhirvonen/.hunter/_Base/1b9e720/61a0300/3e2bb40/Build/Eigen/Source
loading initial cache file /Users/dhirvonen/.hunter/_Base/1b9e720/61a0300/3e2bb40/cache.cmake
loading initial cache file /Users/dhirvonen/.hunter/_Base/1b9e720/61a0300/3e2bb40/Build/Eigen/args.cmake
-- [polly] Used toolchain: Xcode / clang / LLVM Standard C++ Library (libc++) / c++11 support / hidden / data-sections / function-sections / LTO
-- The C compiler identification is AppleClang 8.0.0.8000042
-- The CXX compiler identification is AppleClang 8.0.0.8000042
-- Check for working C compiler: /Applications/develop/ide/xcode/8.1/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang
-- Check for working C compiler: /Applications/develop/ide/xcode/8.1/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang -- works
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /Applications/develop/ide/xcode/8.1/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++
-- Check for working CXX compiler: /Applications/develop/ide/xcode/8.1/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++ -- works
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Error at CMakeLists.txt:26 (message):
  Unknown build type "MinSizeRel".  Allowed values are Debug, Release,
  RelWithDebInfo (case-insensitive).
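
The error above comes from a whitelist of build types that does not include MinSizeRel. As a rough illustration of what a more permissive check could look like, here is a minimal sketch. It is not Eigen's actual CMakeLists.txt and not the patch referenced in this thread; the variable names (_allowed_build_types, _build_type_ok) are hypothetical.

```cmake
# Hypothetical build-type validation that also accepts MinSizeRel.
# Sketch only: variable names are illustrative, not taken from Eigen.
set(_allowed_build_types Debug Release RelWithDebInfo MinSizeRel)

# Fall back to Release when no build type was requested.
if(NOT CMAKE_BUILD_TYPE)
  set(CMAKE_BUILD_TYPE "Release")
endif()

# Compare case-insensitively against every allowed value.
string(TOLOWER "${CMAKE_BUILD_TYPE}" _build_type_lower)
set(_build_type_ok FALSE)
foreach(_type IN LISTS _allowed_build_types)
  string(TOLOWER "${_type}" _type_lower)
  if("${_build_type_lower}" STREQUAL "${_type_lower}")
    set(_build_type_ok TRUE)
  endif()
endforeach()

if(NOT _build_type_ok)
  message(FATAL_ERROR
    "Unknown build type \"${CMAKE_BUILD_TYPE}\". "
    "Allowed values are ${_allowed_build_types} (case-insensitive).")
endif()
```

With a check like this, a configure step run with -DCMAKE_BUILD_TYPE=MinSizeRel would pass validation instead of stopping at the message shown in the log.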

@headupinclouds
Collaborator Author

headupinclouds commented Feb 20, 2017

All NDK toolchains shown below report "plugin needed to handle lto object". Sorting through various posts to understand this.

TOOLCHAINS=(
    libcxx-hid-sections${POLLY_LTO_EXT}
    ios-10-1-dep-8-0-libcxx-hid-sections${POLLY_LTO_EXT}
    android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections${POLLY_LTO_EXT}  # plugin needed to handle lto object
    android-ndk-r10e-api-21-arm64-v8a-gcc-49-hid-sections${POLLY_LTO_EXT} #  plugin needed to handle lto object
    android-ndk-r10e-api-16-x86-hid-sections${POLLY_LTO_EXT} #  plugin needed to handle lto object
    android-ndk-r10e-api-21-x86-64-hid-sections${POLLY_LTO_EXT} #  plugin needed to handle lto object
)

See: android/ndk#108

@headupinclouds
Collaborator Author

headupinclouds commented Feb 21, 2017

Reading android/ndk#108, and android/ndk#137

As a starting point, it would be useful to create any working build of drishti w/ LTO using a polly android toolchain. Ideally, this would be for all toolchains in bin/toolchains.sh.

Both gcc and clang seem to turn up a few issues, which are described in the GitHub issue linked above.


android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto

Create new toolchain from *-hid-sections.cmake:

cp ${POLLY_ROOT}/android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections.cmake ${POLLY_ROOT}/android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto.cmake

Update include guard, polly_init, add lto.cmake flag, etc:

Add ANDROID toolchain:

include("${CMAKE_CURRENT_LIST_DIR}/utilities/polly_common.cmake")
include("${CMAKE_CURRENT_LIST_DIR}/flags/cxx11.cmake") # before toolchain!
include("${CMAKE_CURRENT_LIST_DIR}/flags/function-sections.cmake")
include("${CMAKE_CURRENT_LIST_DIR}/flags/data-sections.cmake")
include("${CMAKE_CURRENT_LIST_DIR}/flags/hidden.cmake")
include("${CMAKE_CURRENT_LIST_DIR}/flags/lto.cmake") #### ADD ME
include("${CMAKE_CURRENT_LIST_DIR}/os/android.cmake")

${POLLY_ROOT}/flags/lto.cmake

# Copyright (c) 2014-2017, Ruslan Baratov
# Copyright (c) 2017, David Hirvonen
# All rights reserved.

if(DEFINED POLLY_FLAGS_LTO_CMAKE_)
  return()
else()
  set(POLLY_FLAGS_LTO_CMAKE_ 1)
endif()

include(polly_add_cache_flag)

string(COMPARE EQUAL "${ANDROID_NDK_VERSION}" "" _not_android)

# TODO: test other platforms, CMAKE_CXX_FLAGS_INIT should work for all
if(_not_android)
  polly_add_cache_flag(CMAKE_CXX_FLAGS "-flto")
  polly_add_cache_flag(CMAKE_C_FLAGS "-flto")
else()
  polly_add_cache_flag(CMAKE_CXX_FLAGS_INIT "-flto")
  polly_add_cache_flag(CMAKE_C_FLAGS_INIT "-flto")

  # SECTION A
  #polly_add_cache_flag(CMAKE_EXE_LINKER_FLAGS_INIT "-fuse-ld=gold")
  #polly_add_cache_flag(CMAKE_SHARED_LINKER_FLAGS_INIT "-fuse-ld=gold")
  #polly_add_cache_flag(CMAKE_EXE_LINKER_FLAGS_INIT "-flto")
  #polly_add_cache_flag(CMAKE_SHARED_LINKER_FLAGS_INIT "-flto")

  # SECTION B
  #polly_add_cache_flag(CMAKE_EXE_LINKER_FLAGS_INIT "-Wl,-flto")
  #polly_add_cache_flag(CMAKE_SHARED_LINKER_FLAGS_INIT "-Wl,-flto")
  #polly_add_cache_flag(CMAKE_EXE_LINKER_FLAGS_INIT "-Wl,-fuse-ld=gold")
  #polly_add_cache_flag(CMAKE_SHARED_LINKER_FLAGS_INIT "-Wl,-fuse-ld=gold")
  
endif()

# There is no macro to detect this flag during toolchain calculation, so we must
# mark this toolchain explicitly.
list(APPEND HUNTER_TOOLCHAIN_UNDETECTABLE_ID "lto")

When we run this toolchain as shown above (SECTION A and SECTION B commented out), but with -flto passed to CMAKE_{C,CXX}_FLAGS, we end up with a build error (plugin needed to handle lto object) that first shows up here:

/usr/local/Cellar/cmake/3.7.2/bin/cmake -E cmake_link_script CMakeFiles/zlib.dir/link.txt --verbose=1
/Users/dhirvonen/pkg/android/android-ndk-r10e/toolchains/arm-linux-androideabi-4.9/prebuilt/darwin-x86_64/bin/arm-linux-androideabi-ar qc libz.a  CMakeFiles/zlib.dir/adler32.c.o CMakeFiles/zlib.dir/compress.c.o CMakeFiles/zlib.dir/crc32.c.o CMakeFiles/zlib.dir/deflate.c.o CMakeFiles/zlib.dir/gzclose.c.o CMakeFiles/zlib.dir/gzlib.c.o CMakeFiles/zlib.dir/gzread.c.o CMakeFiles/zlib.dir/gzwrite.c.o CMakeFiles/zlib.dir/inflate.c.o CMakeFiles/zlib.dir/infback.c.o CMakeFiles/zlib.dir/inftrees.c.o CMakeFiles/zlib.dir/inffast.c.o CMakeFiles/zlib.dir/trees.c.o CMakeFiles/zlib.dir/uncompr.c.o CMakeFiles/zlib.dir/zutil.c.o
BFD: CMakeFiles/zlib.dir/adler32.c.o: plugin needed to handle lto object

This seems to be due to a mismatched ar: the plain binutils ar is not loading the LTO plugin. We can tell CMake to use *gcc-ar by adding this at the bottom of the toolchain file:

set(CMAKE_AR "${ANDROID_NDK}/toolchains/arm-linux-androideabi-4.9/prebuilt/darwin-x86_64/bin/arm-linux-androideabi-gcc-ar")

But we still end up with

/Users/dhirvonen/pkg/android/android-ndk-r10e/toolchains/arm-linux-androideabi-4.9/prebuilt/darwin-x86_64/bin/arm-linux-androideabi-gcc-ar qc libz.a  CMakeFiles/zlib.dir/adler32.c.o CMakeFiles/zlib.dir/compress.c.o CMakeFiles/zlib.dir/crc32.c.o CMakeFiles/zlib.dir/deflate.c.o CMakeFiles/zlib.dir/gzclose.c.o CMakeFiles/zlib.dir/gzlib.c.o CMakeFiles/zlib.dir/gzread.c.o CMakeFiles/zlib.dir/gzwrite.c.o CMakeFiles/zlib.dir/inflate.c.o CMakeFiles/zlib.dir/infback.c.o CMakeFiles/zlib.dir/inftrees.c.o CMakeFiles/zlib.dir/inffast.c.o CMakeFiles/zlib.dir/trees.c.o CMakeFiles/zlib.dir/uncompr.c.o CMakeFiles/zlib.dir/zutil.c.o
/Users/dhirvonen/pkg/android/android-ndk-r10e/toolchains/arm-linux-androideabi-4.9/prebuilt/darwin-x86_64/bin/arm-linux-androideabi-ranlib libz.a
BFD: adler32.c.o: plugin needed to handle lto object

We can enable SECTION B in the lto.cmake file shown above and build again, but this causes a CMake compiler check to fail:

  /Users/dhirvonen/pkg/android/android-ndk-r10e/toolchains/arm-linux-androideabi-4.9/prebuilt/darwin-x86_64/bin/arm-linux-androideabi-gcc
  --sysroot=/Users/dhirvonen/pkg/android/android-ndk-r10e/platforms/android-19/arch-arm
  -flto -fvisibility=hidden -fdata-sections -ffunction-sections
  -march=armv7-a -marm -mfpu=neon -mfloat-abi=softfp -funwind-tables
  -no-canonical-prefixes -fexceptions -g -Wl,-fuse-ld=gold -Wl,-flto
  -Wl,--fix-cortex-a8 -fPIE -pie -Wl,--gc-sections -Wl,-z,nocopyreloc
  CMakeFiles/cmTC_f83e0.dir/testCCompiler.c.o -o cmTC_f83e0

  /Users/dhirvonen/pkg/android/android-ndk-r10e/toolchains/arm-linux-androideabi-4.9/prebuilt/darwin-x86_64/bin/../lib/gcc/arm-linux-androideabi/4.9/../../../../arm-linux-androideabi/bin/ld:
  fatal error: -f/--auxiliary may not be used without -shared

-- Configuring incomplete, errors occurred!
See also "/Users/dhirvonen/devel/elucideye/drishti/_builds/android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto-Release/CMakeFiles/CMakeOutput.log".
See also "/Users/dhirvonen/devel/elucideye/drishti/_builds/android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto-Release/CMakeFiles/CMakeError.log".

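So it seems we are getting the correct linker, but hitting an error where gold is confused by the option that is used to select it: -f/--auxiliary may not be used without -shared. A likely cause is that -Wl, forwards -fuse-ld=gold and -flto to the linker verbatim, and the linker parses the leading -f as its own -f/--auxiliary option.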

Note: With clang, according to this comment android/ndk#137 (comment), we pass -flto directly to the linker flags (not -Wl,-flto). Passing this directly to gcc seems to have no effect.

@headupinclouds headupinclouds added this to the 0.4 milestone Feb 21, 2017
@headupinclouds headupinclouds changed the title review-flto support-flto Feb 21, 2017
@headupinclouds
Collaborator Author

Note: Errors now from ranlib step. May be useful: http://stackoverflow.com/a/39256013

@headupinclouds
Collaborator Author

Possible solution here: https://bugs.archlinux.org/task/43367

The "fix" is to use 'gcc-ar' in place of 'ar' (which invokes 'ar' with --plugin=/usr/lib/gcc/x86_64-unknown-linux-gnu/4.9.2/liblto_plugin.so or similar).

@headupinclouds
Collaborator Author

headupinclouds commented Feb 21, 2017

Testing this now. The build is about halfway through. This seems to resolve some of the basic setup issues for gcc + ndk, though it could probably be done more cleanly.

include("${CMAKE_CURRENT_LIST_DIR}/os/android.cmake")

set(CMAKE_AR "${ANDROID_NDK}/toolchains/arm-linux-androideabi-4.9/prebuilt/darwin-x86_64/bin/arm-linux-androideabi-gcc-ar")
set(CMAKE_RANLIB "/usr/bin/true") # noop for gcc

message("CMAKE_AR: ${CMAKE_AR}")
  
# set(CMAKE_C_ARCHIVE_CREATE "<CMAKE_AR> qcs <TARGET> <LINK_FLAGS> <OBJECTS>")
# set(CMAKE_C_ARCHIVE_FINISH "true") # Or any other no-op command

# set(CMAKE_CXX_ARCHIVE_CREATE "<CMAKE_AR> qcs <TARGET> <LINK_FLAGS> <OBJECTS>")
# set(CMAKE_CXX_ARCHIVE_FINISH "true") # Or any other no-op command

# Need to use ${CMAKE_AR} here else opencv build breaks
set(CMAKE_C_ARCHIVE_CREATE "${CMAKE_AR} qcs <TARGET> <LINK_FLAGS> <OBJECTS>")
set(CMAKE_C_ARCHIVE_FINISH "true") # Or any other no-op command

set(CMAKE_CXX_ARCHIVE_CREATE "${CMAKE_AR} qcs <TARGET> <LINK_FLAGS> <OBJECTS>")
set(CMAKE_CXX_ARCHIVE_FINISH "true") # Or any other no-op command

The build is much closer but does hit a few errors.

@ruslo ruslo self-assigned this Feb 22, 2017
@ruslo
Collaborator

ruslo commented Feb 22, 2017

Eigen won't support MinSizeRel

#314

@ruslo
Collaborator

ruslo commented Feb 22, 2017

@ruslo
Collaborator

ruslo commented Feb 23, 2017

From GCC documentation:

To create static libraries suitable for LTO, use gcc-ar and gcc-ranlib instead of ar and ranlib

So instead of '/usr/bin/true' it should be arm-linux-androideabi-gcc-ranlib:
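
A minimal sketch of that suggestion for the toolchain file, assuming the NDK r10e GCC 4.9 directory layout that appears in the logs earlier in this thread. The names _ndk_gcc_bin, _lto_ar and _lto_ranlib are hypothetical helpers, and whether arm-linux-androideabi-gcc-ranlib is present in a given NDK install should be verified locally, which is why the sketch checks for both tools before using them:

```cmake
# Sketch only: paths follow the NDK r10e / GCC 4.9 / darwin-x86_64 layout
# from the logs in this thread. _ndk_gcc_bin is a hypothetical helper name.
set(_ndk_gcc_bin
    "${ANDROID_NDK}/toolchains/arm-linux-androideabi-4.9/prebuilt/darwin-x86_64/bin")

# The gcc-* wrappers invoke ar/ranlib with the LTO plugin loaded.
set(_lto_ar "${_ndk_gcc_bin}/arm-linux-androideabi-gcc-ar")
set(_lto_ranlib "${_ndk_gcc_bin}/arm-linux-androideabi-gcc-ranlib")

# Fail early with a clear message if either wrapper is missing.
foreach(_tool IN ITEMS "${_lto_ar}" "${_lto_ranlib}")
  if(NOT EXISTS "${_tool}")
    message(FATAL_ERROR "LTO-aware tool not found: ${_tool}")
  endif()
endforeach()

set(CMAKE_AR "${_lto_ar}" CACHE FILEPATH "LTO-aware archiver" FORCE)
set(CMAKE_RANLIB "${_lto_ranlib}" CACHE FILEPATH "LTO-aware ranlib" FORCE)
```

If both wrappers are available, the no-op CMAKE_C_ARCHIVE_FINISH and CMAKE_CXX_ARCHIVE_FINISH overrides from the earlier comment may no longer be needed, although that has not been confirmed in this thread.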

@ruslo
Collaborator

ruslo commented Feb 24, 2017

android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections${POLLY_LTO_EXT}
android-ndk-r10e-api-21-arm64-v8a-gcc-49-hid-sections${POLLY_LTO_EXT}
android-ndk-r10e-api-16-x86-hid-sections${POLLY_LTO_EXT}
android-ndk-r10e-api-21-x86-64-hid-sections${POLLY_LTO_EXT}

None of those is Android + Clang. Which toolchains have you tried?

@headupinclouds
Collaborator Author

I tried a few versions. These could be a starting point:

android-ndk-r10e-api-16-armeabi-v7a-neon-clang-35-hid-sections-lto.cmake
android-ndk-r11c-api-21-arm64-v8a-clang-hid-sections-lto.cmake

@headupinclouds
Collaborator Author

In some cases, I believe toolchains that maintain a specific clang version (i.e., "clang3.5") have been updated to newer NDKs where that clang version is not available.

@headupinclouds
Collaborator Author

headupinclouds commented Feb 28, 2017

Note: Updated experiment after cleaning the hunter cache w/ shared OpenCV + lto

Experiment 8: TOOLCHAIN=android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto (i.e., gcc) w/ MinSizeRel and opencv CMAKE_ARGS BUILD_SHARED_LIBS=ON + OpenCV LTO and -flto -Os -ffunction-sections -fdata-sections

du -sh _install/android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto/lib/*
932K	_install/android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto/lib/libdrishti.so
836K	_install/android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto/lib/libdrishti_c.so
gzip -9 *.so
du -sh *
396K	libdrishti.so.gz
352K	libdrishti_c.so.gz

🎆 👏 👍

DRISHTI_SERIALIZE_WITH_BOOST=OFF
DRISHTI_SERIALIZE_WITH_CEREAL=ON
DRISHTI_SERIALIZE_WITH_CVMATIO=OFF
"cmake" "-H." "-B/Users/dhirvonen/devel/elucideye/drishti/_builds/android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto-MinSizeRel" "-DCMAKE_BUILD_TYPE=MinSizeRel" "-GUnix Makefiles" "-DCMAKE_TOOLCHAIN_FILE=/Users/dhirvonen/devel/elucideye/drishti/src/3rdparty/polly/android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto.cmake" "-DCMAKE_VERBOSE_MAKEFILE=ON" "-DPOLLY_STATUS_DEBUG=ON" "-DHUNTER_STATUS_DEBUG=ON" "-DCMAKE_INSTALL_PREFIX=/Users/dhirvonen/devel/elucideye/drishti/_install/android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto" "-DANDROID=TRUE" "-DHUNTER_CONFIGURATION_TYPES=Release" "-DDRISHTI_BUILD_EXAMPLES=ON" "-DDRISHTI_BUILD_TESTS=OFF" "-DDRISHTI_BUILD_FACE=OFF" "-DDRISHTI_BUILD_ACF=OFF" "-DDRISHTI_BUILD_HCI=OFF" "-DDRISHTI_BUILD_OGLES_GPGPU=OFF" "-DDRISHTI_BUILD_REGRESSION_FIXED_POINT=ON" "-DDRISHTI_BUILD_REGRESSION_SIMD=ON" "-DDRISHTI_SERIALIZE_WITH_BOOST=OFF" "-DDRISHTI_SERIALIZE_WITH_CEREAL=ON" "-DDRISHTI_SERIALIZE_WITH_CVMATIO=OFF" "-DDRISHTI_DISABLE_DSYM=OFF" "-DDRISHTI_BUILD_C_INTERFACE=ON" "-DCMAKE_VISIBILITY_INLINES_HIDDEN=ON" "-DCMAKE_CXX_VISIBILITY_PRESET=hidden" "-DCMAKE_XCODE_ATTRIBUTE_GCC_INLINES_ARE_PRIVATE_EXTERN=YES" "-DCMAKE_XCODE_ATTRIBUTE_GCC_SYMBOLS_PRIVATE_EXTERN=YES" "-DDRISHTI_BUILD_MIN_SIZE=ON"

@headupinclouds
Collaborator Author

headupinclouds commented Feb 28, 2017

Experiment 9: TOOLCHAIN=android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto (i.e., gcc) w/ MinSizeRel and opencv CMAKE_ARGS BUILD_SHARED_LIBS=OFF + OpenCV LTO and -flto -Os -ffunction-sections -fdata-sections

Same as above but with static opencv:

2.4M	_install/android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto/lib/libdrishti.so
1.4M	_install/android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto/lib/libdrishti_c.so
$ gzip -9 *.so
1.1M	libdrishti.so.gz
624K	libdrishti_c.so.gz

@headupinclouds
Collaborator Author

headupinclouds commented Feb 28, 2017

For NDK lto builds I notice these warnings. I believe they are innocuous, but worth a review:

/Users/dhirvonen/pkg/ndk/android-ndk-r10e/sources/cxx-stl/gnu-libstdc++/4.9/include/iostream:62:0: warning: type of 'cerr' does not match original declaration
   extern ostream cerr;  /// Linked to standard error (unbuffered)
 ^
/Users/dhirvonen/pkg/ndk/android-ndk-r10e/sources/cxx-stl/gnu-libstdc++/4.9/include/iostream:62:18: warning: type of 'cerr' does not match original declaration
   extern ostream cerr;  /// Linked to standard error (unbuffered)
                  ^
/Users/dhirvonen/pkg/ndk/android-ndk-r10e/sources/cxx-stl/gnu-libstdc++/4.9/include/iostream:62:18: note: previously declared here
   extern ostream cerr;  /// Linked to standard error (unbuffered)
                  ^
/Users/dhirvonen/pkg/ndk/android-ndk-r10e/sources/cxx-stl/gnu-libstdc++/4.9/include/istream:58:11: warning: type 'struct basic_istream' violates one definition rule
     class basic_istream : virtual public basic_ios<_CharT, _Traits>
           ^
/Users/dhirvonen/pkg/ndk/android-ndk-r10e/sources/cxx-stl/gnu-libstdc++/4.9/include/istream:58:11: note: a type with the same name but different layout is defined in another translation unit
     class basic_istream : virtual public basic_ios<_CharT, _Traits>
           ^
/Users/dhirvonen/pkg/ndk/android-ndk-r10e/sources/cxx-stl/gnu-libstdc++/4.9/include/bits/basic_ios.h:66:11: warning: type 'struct basic_ios' violates one definition rule
     class basic_ios : public ios_base
           ^
/Users/dhirvonen/pkg/ndk/android-ndk-r10e/sources/cxx-stl/gnu-libstdc++/4.9/include/bits/basic_ios.h:66:11: note: a type with the same name but different layout is defined in another translation unit
     class basic_ios : public ios_base
           ^
/Users/dhirvonen/pkg/ndk/android-ndk-r10e/sources/cxx-stl/gnu-libstdc++/4.9/include/bits/locale_facets.h:674:11: warning: type 'struct ctype' violates one definition rule
     class ctype<char> : public locale::facet, public ctype_base
           ^
/Users/dhirvonen/pkg/ndk/android-ndk-r10e/sources/cxx-stl/gnu-libstdc++/4.9/include/bits/locale_facets.h:674:11: note: a type with the same name but different layout is defined in another translation unit
     class ctype<char> : public locale::facet, public ctype_base
           ^
In member function '_ZNKSt5ctypeIcE5widenEc.part.18.constprop':
/Users/dhirvonen/pkg/ndk/android-ndk-r10e/sources/cxx-stl/gnu-libstdc++/4.9/include/bits/locale_facets.h:674:11: warning: type 'struct ctype' violates one definition rule
     class ctype<char> : public locale::facet, public ctype_base
           ^
/Users/dhirvonen/pkg/ndk/android-ndk-r10e/sources/cxx-stl/gnu-libstdc++/4.9/include/bits/locale_facets.h:674:11: note: a type with the same name but different layout is defined in another translation unit
     class ctype<char> : public locale::facet, public ctype_base

@headupinclouds
Collaborator Author

headupinclouds commented Mar 5, 2017

Experiment 10: TOOLCHAIN=android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto (i.e., gcc) w/ MinSizeRel and opencv CMAKE_ARGS BUILD_SHARED_LIBS=ON + OpenCV LTO and -flto -Os -ffunction-sections -fdata-sections

Note: Full build w/ real-time + face api:

HUNTER_CONFIGURATION_TYPES=MinSizeRel
DRISHTI_BUILD_CONFIG=MinSizeRel

DRISHTI_SERIALIZE_WITH_BOOST=OFF
DRISHTI_SERIALIZE_WITH_CEREAL=ON
DRISHTI_SERIALIZE_WITH_CVMATIO=OFF
DRISHTI_BUILD_C_INTERFACE=ON
DRISHTI_BUILD_ACF=ON
DRISHTI_BUILD_FACE=ON
DRISHTI_BUILD_HCI=ON
DRISHTI_BUILD_OGLES_GPGPU=ON
$ du -sh _install/android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto/lib/*
1.4M	_install/android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto/lib/libdrishti-MinSizeRel.so
1.3M	_install/android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto/lib/libdrishti_c-MinSizeRel.so
$ gzip -9 *
$ du -sh *
612K	libdrishti-MinSizeRel.so.gz
568K	libdrishti_c-MinSizeRel.so.gz

@headupinclouds
Collaborator Author

Experiment 11: TOOLCHAIN=android-ndk-r10e-api-19-armeabi-v7a-neon-hid-sections-lto (i.e., gcc) w/ MinSizeRel and opencv CMAKE_ARGS BUILD_SHARED_LIBS=OFF + OpenCV LTO and -flto -Os -ffunction-sections -fdata-sections

Note: Full build w/ real-time + face api (same as 10 above but w/ static OpenCV):

HUNTER_CONFIGURATION_TYPES=MinSizeRel
DRISHTI_BUILD_CONFIG=MinSizeRel

DRISHTI_SERIALIZE_WITH_BOOST=OFF
DRISHTI_SERIALIZE_WITH_CEREAL=ON
DRISHTI_SERIALIZE_WITH_CVMATIO=OFF
DRISHTI_BUILD_C_INTERFACE=ON
DRISHTI_BUILD_ACF=ON
DRISHTI_BUILD_FACE=ON
DRISHTI_BUILD_HCI=ON
DRISHTI_BUILD_OGLES_GPGPU=ON
$ du -sh *
3.6M	libdrishti-MinSizeRel.so
2.1M	libdrishti_c-MinSizeRel.so
$ gzip -9 *.so
$ du -sh *.gz
1.6M	libdrishti-MinSizeRel.so.gz
944K	libdrishti_c-MinSizeRel.so.gz

@headupinclouds
Collaborator Author

I think this main issue can be closed w/ the addition of the polly lto flag and toolchains. This should be sufficient for current needs, and remaining issues seem to be mostly upstream. (CMake related issues will remain open.)

Caveats:

  • old NDK clang toolchains don't seem to work with -flto
  • new NDK clang works with -flto and -O[123], but -Os support is waiting on an LLVM fix and its merge into the NDK
  • old NDK gcc toolchains (deprecated) seem to work with -flto for both -Os and -O[123]
  • Apple Clang -flto seems to work fine
  • MSVC (not tested, and not required)... still curious
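
Since support varies this much by compiler, a project can also probe for LTO before enabling it. The following is a minimal sketch using CMake's CheckIPOSupported module, which requires CMake 3.9 or newer and is therefore newer than the CMake 3.7.2 used in the logs above. It is an alternative illustration, not the polly lto flag approach adopted here; the project name lto_sketch, the target demo and the source file demo.cpp are hypothetical.

```cmake
cmake_minimum_required(VERSION 3.9)
project(lto_sketch CXX)

# Ask CMake whether the active compiler/linker pair can do IPO/LTO.
include(CheckIPOSupported)
check_ipo_supported(RESULT _ipo_ok OUTPUT _ipo_log)

# Hypothetical target; demo.cpp must exist next to this CMakeLists.txt.
add_library(demo SHARED demo.cpp)

if(_ipo_ok)
  # Let CMake add the compiler-specific LTO flags for this target.
  set_property(TARGET demo PROPERTY INTERPROCEDURAL_OPTIMIZATION TRUE)
else()
  message(WARNING "IPO/LTO not supported by this toolchain: ${_ipo_log}")
endif()
```

Note that a per-target property like this only affects the targets it is set on. It would not by itself apply -flto consistently to Hunter dependencies, which is the reason given earlier in this thread for putting the flag in the toolchain.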

@ruslo
Collaborator

ruslo commented Apr 12, 2017

MSVC (not tested, and not required)... still curious

They have it as an optimization choice as far as I remember, like no-optimization, aggressive-optimization, size-optimization, lto-optimization.

@headupinclouds
Collaborator Author

Closing this main issue since the core feature is implemented.
