This repository has been archived by the owner on Aug 8, 2023. It is now read-only.

Crash if the connection is no longer available once an offline download has started #6210

Closed
zugaldia opened this issue Aug 30, 2016 · 24 comments
Labels
Android Mapbox Maps SDK for Android offline

Comments

@zugaldia
Member

Platform: Android 6.0.1 (including Wear)
Mapbox SDK version: 4.2.0-beta.2 and master

Steps to trigger behavior

  1. Create an activity with the simple offline map.
  2. Launch the app and start the download.
  3. Kill the Internet connection of the device mid-download.

Expected behavior

Download pauses, the map remains responsive.

Actual behavior

The app crashes with the following stacktrace:

08-30 11:15:05.629 22606-22684/com.mapbox.testwear W/com.mapbox.mapboxsdk.http.HTTPRequest: [HTTP] Request could not be executed: No Internet connection available.
08-30 11:15:05.636 22606-22606/com.mapbox.testwear E/MapboxOfflineActivity: onError reason: REASON_CONNECTION
08-30 11:15:05.636 22606-22606/com.mapbox.testwear E/MapboxOfflineActivity: onError message: No Internet connection available.
08-30 11:15:05.636 22606-22684/com.mapbox.testwear W/art: Throwing OutOfMemoryError "PushLocalFrame"

                                                           --------- beginning of crash
08-30 11:15:05.637 22606-22684/com.mapbox.testwear A/libc: /Volumes/Android/buildbot/src/android/ndk-r12-release/ndk/sources/cxx-stl/llvm-libc++abi/libcxxabi/src/abort_message.cpp:74: void abort_message(const char *, ...): assertion "terminating with uncaught exception of type jni::PendingJavaException" failed
08-30 11:15:05.638 22606-22684/com.mapbox.testwear A/libc: Fatal signal 6 (SIGABRT), code -6 in tid 22684 (DefaultFileSour)
08-30 11:15:09.337 22606-22606/com.mapbox.testwear D/mbgl: {apbox.testwear}[JNI]: nativeSetGestureInProgress
08-30 11:15:09.338 22606-22606/com.mapbox.testwear D/mbgl: {apbox.testwear}[Android]: NativeMapView::invalidate()

@ivovandongen I'd appreciate it if you could dig up the actual point where the crash is produced, like you did in #6123 (comment). The generic "terminating with uncaught exception of type jni::PendingJavaException" message doesn't say much.

/cc: @cammace

@zugaldia added the Android Mapbox Maps SDK for Android offline labels on Aug 30, 2016

@ivovandongen
Contributor

@zugaldia The error originates in platform/default/online_file_source.cpp:99 (master). This delegates to the Android implementation of HTTPFileSource::request(), which tries to create a new HTTPRequest. In the constructor of HTTPRequest, a request for more local JNI references fails with an OutOfMemoryError here:

08-30 11:15:05.636 22606-22684/com.mapbox.testwear W/art: Throwing OutOfMemoryError "PushLocalFrame"

The underlying issue seems to be that too many local references are being held. There is a hard limit: the JNI specification only guarantees a minimum of 16, and the limit is 512 on my Nexus 5X. Maybe the objects, and thus the local references, are not being cleared properly since the changes to detect offline situations. I see the following in the log that worries me:

08-31 12:19:01.422 22974-22974/com.mapbox.mapboxsdk.testapp E/OfflineActivity: onError reason: REASON_CONNECTION
08-31 12:19:01.422 22974-22974/com.mapbox.mapboxsdk.testapp E/OfflineActivity: onError message: Unable to resolve host "api.mapbox.com": No address associated with hostname
08-31 12:19:01.423 3609-4857/? E/Netd: netlink response contains error (No such file or directory)

... a couple

08-31 12:19:01.424 22974-22974/com.mapbox.mapboxsdk.testapp E/OfflineActivity: onError reason: REASON_CONNECTION
08-31 12:19:01.424 22974-22974/com.mapbox.mapboxsdk.testapp E/OfflineActivity: onError message: No Internet connection available.

... dozens more

The download attempt is never cancelled, so the requests keep coming. There seem to be two problems:

  • The requests/local references are not cleared properly in the offline situation
  • The download request is never cancelled/paused when the phone goes offline.

Let me know if I can do more.

@ivovandongen
Contributor

@zugaldia I was reading through some related code and it seems that there is also a more proactive way to set the connectivity state. On iOS I see the following:

- (void)reachabilityChanged:(NSNotification *)notification
{
    MGLReachability *reachability = [notification object];
    if ( ! _isWaitingForRedundantReachableNotification && [reachability isReachable])
    {
        mbgl::NetworkStatus::Reachable();
    }
    _isWaitingForRedundantReachableNotification = NO;
}

This seems to reset connectivity through mbgl::NetworkStatus. If that's the general mechanism for setting the connectivity state, then offline downloads should also adhere to it, right?

cc @1ec5

@1ec5
Contributor

1ec5 commented Aug 31, 2016

Yes, it is a general connectivity state API. See also #4234. It sounds like we should move this logic to MGLOfflineStorage so that it runs even if no map view has been initialized by the time the user comes online or starts an offline download. I don't know offhand whether OfflineRegion respects NetworkStatus, but I would expect it to.
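
To make the idea of a shared connectivity state concrete, here is a minimal sketch of how requests could be parked while offline and released when the state flips back to online. It is an illustration under stated assumptions, not mbgl code: ConnectivityState and Downloader are hypothetical names, and nothing here reflects the actual mbgl::NetworkStatus interface beyond the "notify when reachable" idea mentioned above.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a process-wide connectivity flag.
class ConnectivityState {
public:
    using Observer = std::function<void()>;

    void subscribe(Observer observer) { observers_.push_back(std::move(observer)); }

    bool online() const { return online_; }

    // Called by the platform layer whenever reachability changes.
    // Observers are only notified on an offline -> online transition.
    void set(bool online) {
        const bool cameOnline = online && !online_;
        online_ = online;
        if (cameOnline) {
            for (auto& observer : observers_) observer();
        }
    }

private:
    bool online_ = true;
    std::vector<Observer> observers_;
};

// Hypothetical downloader that parks requests while offline instead of
// retrying them in a loop.
class Downloader {
public:
    explicit Downloader(ConnectivityState& state) : state_(state) {
        state_.subscribe([this] { flush(); });
    }
    // The observer captures `this`, so the object must stay where it is.
    Downloader(const Downloader&) = delete;
    Downloader& operator=(const Downloader&) = delete;

    void request(const std::string& url) {
        if (state_.online()) {
            started_.push_back(url);
        } else {
            parked_.push_back(url);
        }
    }

    const std::vector<std::string>& started() const { return started_; }
    std::size_t parked() const { return parked_.size(); }

private:
    // Starts every parked request in the order it was received.
    void flush() {
        while (!parked_.empty()) {
            started_.push_back(parked_.front());
            parked_.pop_front();
        }
    }

    ConnectivityState& state_;
    std::deque<std::string> parked_;
    std::vector<std::string> started_;
};
```

With this shape, a platform listener only has to call set(false) and set(true); requests made in between wait in the queue rather than failing repeatedly.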

@ivovandongen
Contributor

I added a connectivity listener to set the connectivity state on core. There was a slight overflow issue that made the requests retry continuously instead of waiting for a connection. With that fixed, requests are now nicely parked until connectivity is restored.
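
The thread does not say where the overflow occurred. One common way for "wait indefinitely" to turn into "retry immediately" is adding a maximal timeout to the current time, which wraps around to a deadline in the past. The sketch below shows a saturating alternative. It is an assumption about the class of bug, not a description of the actual change, and deadlineAfter is a hypothetical helper name.

```cpp
#include <chrono>

using Clock = std::chrono::steady_clock;

// Adds `timeout` to `now`, clamping to the largest representable time point
// instead of overflowing.
Clock::time_point deadlineAfter(Clock::time_point now, Clock::duration timeout) {
    if (timeout <= Clock::duration::zero()) {
        return now;  // nothing to wait for
    }
    if (now.time_since_epoch() < Clock::duration::zero()) {
        return now + timeout;  // negative base plus positive timeout cannot overflow
    }
    // Room left between `now` and the maximum; safe because `now` is non-negative.
    const Clock::duration headroom = Clock::time_point::max() - now;
    return timeout >= headroom ? Clock::time_point::max() : now + timeout;
}
```

A timer that receives Clock::duration::max() as "no timeout" then ends up with a deadline that is never reached, so the request stays parked until something else wakes it.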

Offline still crashes though, with what still seems like memory issues:

Stack frame #07 pc 003a230f  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine abort_message at /Volumes/Android/buildbot/src/android/ndk-r12-release/ndk/sources/cxx-stl/llvm-libc++abi/libcxxabi/src/abort_message.cpp:74
Stack frame #08 pc 003a23f7  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine default_terminate_handler() at /Volumes/Android/buildbot/src/android/ndk-r12-release/ndk/sources/cxx-stl/llvm-libc++abi/libcxxabi/src/cxa_default_handlers.cpp:68
Stack frame #09 pc 003a09e1  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine std::__terminate(void (*)()) at /Volumes/Android/buildbot/src/android/ndk-r12-release/ndk/sources/cxx-stl/llvm-libc++abi/libcxxabi/src/cxa_handlers.cpp:68
Stack frame #10 pc 003a02fb  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so (__cxa_throw+122): Routine __cxxabiv1::failed_throw(__cxxabiv1::__cxa_exception*) at /Volumes/Android/buildbot/src/android/ndk-r12-release/ndk/sources/cxx-stl/llvm-libc++abi/libcxxabi/src/cxa_exception.cpp:149
Stack frame #11 pc 00144c20  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine jni::CheckJavaException(_JNIEnv&) at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../mason_packages/headers/jni.hpp/2.0.0/include/jni/errors.hpp:69 (discriminator 2)
Stack frame #12 pc 00146114  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine std::__ndk1::__unique_if<mbgl::HTTPRequest>::__unique_single std::__ndk1::make_unique<mbgl::HTTPRequest, _JNIEnv&, mbgl::Resource const&, std::__ndk1::function<void (mbgl::Response)>&>(_JNIEnv&, mbgl::Resource const&, std::__ndk1::function<void (mbgl::Response)>&) at /Users/ivo/git/mapbox-gl-native/mason_packages/osx-x86_64/android-ndk/arm-9-r12b/bin/../lib/gcc/arm-linux-androideabi/4.9.x/../../../../include/c++/4.9.x/memory:3047 (discriminator 34)
Stack frame #13 pc 0015241c  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine mbgl::OnlineFileSource::Impl::activateRequest(mbgl::OnlineFileRequest*) at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/default/online_file_source.cpp:99
Stack frame #14 pc 001520c8  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine mbgl::OnlineFileSource::Impl::activatePendingRequest() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/default/online_file_source.cpp:117
Stack frame #15 pc 001529ac  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine operator() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/default/online_file_source.cpp:101
Stack frame #16 pc 00152854  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine _ZNSt6__ndk18__invokeIRZN4mbgl16OnlineFileSource4Impl15activateRequestEPNS1_17OnlineFileRequestEEUlNS1_8ResponseEE_JS6_EEEDTclclsr3std6__ndk1E7forwardIT_Efp_Espclsr3std6__ndk1E7forwardIT0_Efp0_EEEOS9_DpOSA_ at /Users/ivo/git/mapbox-gl-native/mason_packages/osx-x86_64/android-ndk/arm-9-r12b/bin/../lib/gcc/arm-linux-androideabi/4.9.x/../../../../include/c++/4.9.x/__functional_base:413
Stack frame #17 pc 0014631c  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine mbgl::HTTPRequest::async::{lambda()#1}::operator()() const at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/android/src/http_file_source.cpp:46 (discriminator 1)
Stack frame #18 pc 00142254  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine mbgl::util::RunLoop::Impl::processRunnables() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/android/src/run_loop.cpp:180
Stack frame #19 pc 001427e0  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine mbgl::util::RunLoop::run() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/android/src/run_loop.cpp:229 (discriminator 2)
Stack frame #20 pc 0014b880  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine void mbgl::util::Thread<mbgl::DefaultFileSource::Impl>::run<std::__ndk1::tuple<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, unsigned long long&>, 0u, 1u>(std::__ndk1::tuple<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, unsigned long long&>&&, std::__ndk1::integer_sequence<unsigned int, 0u, 1u>) at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../src/mbgl/util/thread.hpp:111
Stack frame #21 pc 0014b7f0  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine operator() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../src/mbgl/util/thread.hpp:95 (discriminator 1)

@ivovandongen
Contributor

"Solved" the memory issues by making the local references in HttpRequest and associated failure handlers short-lived. Still need a cleaner solution for local object references that work with both the low and high level jni types though: local_object.hpp

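The idea of keeping local references short-lived can be sketched without a JVM. The code below is a simulation only: FakeEnv, newLocalRef, deleteLocalRef and ScopedLocalRef are hypothetical names standing in for the JNI local reference table and DeleteLocalRef. It is not the code from the fix, and it makes no claim about how local_object.hpp is written.

```cpp
#include <cstddef>
#include <stdexcept>

// Stand-in for the JVM's per-thread local reference table (hypothetical, not JNI).
class FakeEnv {
public:
    explicit FakeEnv(std::size_t capacity) : capacity_(capacity) {}

    // Mimics creating a local reference; fails once the table is full,
    // loosely mirroring the OutOfMemoryError seen in the log.
    int newLocalRef() {
        if (live_ == capacity_) {
            throw std::runtime_error("local reference table overflow");
        }
        ++live_;
        return nextHandle_++;
    }

    // Mimics DeleteLocalRef: frees the slot immediately.
    void deleteLocalRef(int /*handle*/) { --live_; }

    std::size_t live() const { return live_; }

private:
    std::size_t capacity_;
    std::size_t live_ = 0;
    int nextHandle_ = 1;
};

// RAII guard: the reference is released as soon as the guard leaves scope.
class ScopedLocalRef {
public:
    explicit ScopedLocalRef(FakeEnv& env) : env_(env), handle_(env.newLocalRef()) {}
    ~ScopedLocalRef() { env_.deleteLocalRef(handle_); }
    ScopedLocalRef(const ScopedLocalRef&) = delete;
    ScopedLocalRef& operator=(const ScopedLocalRef&) = delete;
    int get() const { return handle_; }

private:
    FakeEnv& env_;
    int handle_;
};

// Creates `count` requests one after another, each holding two short-lived
// references, and returns how many references are still live afterwards.
std::size_t createRequests(FakeEnv& env, int count) {
    for (int i = 0; i < count; ++i) {
        ScopedLocalRef url(env);   // e.g. the URL string handed to Java
        ScopedLocalRef etag(env);  // e.g. the ETag string handed to Java
        (void)url.get();
        (void)etag.get();
    }  // both references are released here, once per iteration
    return env.live();
}
```

Because each guard releases its slot at the end of the loop body, a table of 16 entries is enough for any number of sequential requests in this model.
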
A couple of previously hidden issues now pop up as well. Going offline works, coming back online crashes on one of the following:

libc    : Fatal signal 11 (SIGSEGV), code 1, fault addr 0x18 in tid 22583 (DefaultFileSour)

********** Crash dump: **********
Build fingerprint: 'google/bullhead/bullhead:7.0/NRD90M/3085278:user/release-keys'
pid: 22331, tid: 22583, name: DefaultFileSour  >>> com.mapbox.mapboxsdk.testapp <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x18
Stack frame #00 pc 00143f64  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine _ZNSt6__ndk14swapIPPNS_10shared_ptrIN4mbgl8WorkTaskEEEEENS_9enable_ifIXaasr21is_move_constructibleIT_EE5valuesr18is_move_assignableIS8_EE5valueEvE4typeERS8_SB_ at /Users/ivo/git/mapbox-gl-native/mason_packages/osx-x86_64/android-ndk/arm-9-r12b/bin/../lib/gcc/arm-linux-androideabi/4.9.x/../../../../include/c++/4.9.x/type_traits:3307
Stack frame #01 pc 00151400  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >::__zero() at /Users/ivo/git/mapbox-gl-native/mason_packages/osx-x86_64/android-ndk/arm-9-r12b/bin/../lib/gcc/arm-linux-androideabi/4.9.x/../../../../include/c++/4.9.x/string:1813
Stack frame #02 pc 00151acc  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine mbgl::OnlineFileSource::Impl::remove(mbgl::OnlineFileRequest*) at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/default/online_file_source.cpp:78
Stack frame #03 pc 00152b0c  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine std::__ndk1::unique_ptr<std::__ndk1::__hash_node<std::__ndk1::__hash_value_type<mbgl::OnlineFileRequest*, std::__ndk1::__list_iterator<mbgl::OnlineFileRequest*, void*> >, void*>, std::__ndk1::__hash_node_destructor<std::__ndk1::allocator<std::__ndk1::__hash_node<std::__ndk1::__hash_value_type<mbgl::OnlineFileRequest*, std::__ndk1::__list_iterator<mbgl::OnlineFileRequest*, void*> >, void*> > > >::reset(std::__ndk1::__hash_node<std::__ndk1::__hash_value_type<mbgl::OnlineFileRequest*, std::__ndk1::__list_iterator<mbgl::OnlineFileRequest*, void*> >, void*>*) at /Users/ivo/git/mapbox-gl-native/mason_packages/osx-x86_64/android-ndk/arm-9-r12b/bin/../lib/gcc/arm-linux-androideabi/4.9.x/../../../../include/c++/4.9.x/memory:2629
Stack frame #04 pc 00152970  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine mbgl::OnlineFileSource::Impl::activateRequest(mbgl::OnlineFileRequest*) at /Users/ivo/git/mapbox-gl-native/mason_packages/osx-x86_64/android-ndk/arm-9-r12b/bin/../lib/gcc/arm-linux-androideabi/4.9.x/../../../../include/c++/4.9.x/functional:1708
Stack frame #05 pc 00146370  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine ~basic_string at /Users/ivo/git/mapbox-gl-native/mason_packages/osx-x86_64/android-ndk/arm-9-r12b/bin/../lib/gcc/arm-linux-androideabi/4.9.x/../../../../include/c++/4.9.x/string:2324
Stack frame #06 pc 00141fdc  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine ~Impl at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/android/src/run_loop.cpp:125
Stack frame #07 pc 00142568  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine mbgl::util::RunLoop::Get() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/android/src/run_loop.cpp:199
Stack frame #08 pc 0014b99c  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >::__zero() at /Users/ivo/git/mapbox-gl-native/mason_packages/osx-x86_64/android-ndk/arm-9-r12b/bin/../lib/gcc/arm-linux-androideabi/4.9.x/../../../../include/c++/4.9.x/string:1813
Stack frame #09 pc 0014b90c  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine __optional_storage at /Users/ivo/git/mapbox-gl-native/mason_packages/osx-x86_64/android-ndk/arm-9-r12b/bin/../lib/gcc/arm-linux-androideabi/4.9.x/../../../../include/c++/4.9.x/experimental/optional:212
Stack frame #10 pc 00047003  /system/lib/libc.so (_ZL15__pthread_startPv+22)
Stack frame #11 pc 00019e1d  /system/lib/libc.so (__start_thread+6)

or

signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr

********** Crash dump: **********
Build fingerprint: 'google/bullhead/bullhead:7.0/NRD90M/3085278:user/release-keys'
pid: 3141, tid: 4293, name: DefaultFileSour  >>> com.mapbox.mapboxsdk.testapp <<<
signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
Stack frame #00 pc 00049d94  /system/lib/libc.so (tgkill+12)
Stack frame #01 pc 00047533  /system/lib/libc.so (pthread_kill+34)
Stack frame #02 pc 0001d885  /system/lib/libc.so (raise+10)
Stack frame #03 pc 000193d1  /system/lib/libc.so (__libc_android_abort+34)
Stack frame #04 pc 00017014  /system/lib/libc.so (abort+4)
Stack frame #05 pc 0001b87f  /system/lib/libc.so (__libc_fatal+22)
Stack frame #06 pc 000195cb  /system/lib/libc.so (__assert2+18)
Stack frame #07 pc 001534ac  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine mbgl::OnlineFileSource::Impl::activateOrQueueRequest(mbgl::OnlineFileRequest*) at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/default/online_file_source.cpp:82 (discriminator 2)
Stack frame #08 pc 00142384  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine mbgl::util::RunLoop::Impl::processRunnables() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/android/src/run_loop.cpp:180
Stack frame #09 pc 00142910  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine mbgl::util::RunLoop::run() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/android/src/run_loop.cpp:229 (discriminator 2)
Stack frame #10 pc 0014bd44  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine void mbgl::util::Thread<mbgl::DefaultFileSource::Impl>::run<std::__ndk1::tuple<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, unsigned long long&>, 0u, 1u>(std::__ndk1::tuple<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, unsigned long long&>&&, std::__ndk1::integer_sequence<unsigned int, 0u, 1u>) at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../src/mbgl/util/thread.hpp:111
Stack frame #11 pc 0014bcb4  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine operator() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../src/mbgl/util/thread.hpp:95 (discriminator 1)
Stack frame #12 pc 00047003  /system/lib/libc.so (_ZL15__pthread_startPv+22)
Stack frame #13 pc 00019e1d  /system/lib/libc.so (__start_thread+6)

or

A/libc: /Volumes/Android/buildbot/src/android/ndk-r12-release/ndk/sources/cxx-stl/llvm-libc++abi/libcxxabi/src/abort_message.cpp:74: void abort_message(const char *, ...): assertion "terminating with uncaught exception of type std::__ndk1::bad_function_call: std::exception" failed

********** Crash dump: **********
Build fingerprint: 'google/bullhead/bullhead:7.0/NRD90M/3085278:user/release-keys'
pid: 9949, tid: 10219, name: DefaultFileSour  >>> com.mapbox.mapboxsdk.testapp <<<
signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
Stack frame #00 pc 00049d94  /system/lib/libc.so (tgkill+12)
Stack frame #01 pc 00047533  /system/lib/libc.so (pthread_kill+34)
Stack frame #02 pc 0001d885  /system/lib/libc.so (raise+10)
Stack frame #03 pc 000193d1  /system/lib/libc.so (__libc_android_abort+34)
Stack frame #04 pc 00017014  /system/lib/libc.so (abort+4)
Stack frame #05 pc 0001b87f  /system/lib/libc.so (__libc_fatal+22)
Stack frame #06 pc 000195cb  /system/lib/libc.so (__assert2+18)
Stack frame #07 pc 003a27df  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine abort_message at /Volumes/Android/buildbot/src/android/ndk-r12-release/ndk/sources/cxx-stl/llvm-libc++abi/libcxxabi/src/abort_message.cpp:74
Stack frame #08 pc 003a28a7  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine default_terminate_handler() at /Volumes/Android/buildbot/src/android/ndk-r12-release/ndk/sources/cxx-stl/llvm-libc++abi/libcxxabi/src/cxa_default_handlers.cpp:63
Stack frame #09 pc 003a0eb1  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine std::__terminate(void (*)()) at /Volumes/Android/buildbot/src/android/ndk-r12-release/ndk/sources/cxx-stl/llvm-libc++abi/libcxxabi/src/cxa_handlers.cpp:68
Stack frame #10 pc 003a07cb  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so (__cxa_throw+122): Routine __cxxabiv1::failed_throw(__cxxabiv1::__cxa_exception*) at /Volumes/Android/buildbot/src/android/ndk-r12-release/ndk/sources/cxx-stl/llvm-libc++abi/libcxxabi/src/cxa_exception.cpp:149
Stack frame #11 pc 00146974  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine std::__ndk1::function<void (mbgl::Response)>::operator()(mbgl::Response) const at /Users/ivo/git/mapbox-gl-native/mason_packages/osx-x86_64/android-ndk/arm-9-r12b/bin/../lib/gcc/arm-linux-androideabi/4.9.x/../../../../include/c++/4.9.x/functional:1754 (discriminator 3)
Stack frame #12 pc 00151ea0  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine mbgl::OnlineFileRequest::completed(mbgl::Response) at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/default/online_file_source.cpp:335 (discriminator 1)
Stack frame #13 pc 00152eb4  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine operator() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/default/online_file_source.cpp:103
Stack frame #14 pc 00152d18  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine _ZNSt6__ndk18__invokeIRZN4mbgl16OnlineFileSource4Impl15activateRequestEPNS1_17OnlineFileRequestEEUlNS1_8ResponseEE_JS6_EEEDTclclsr3std6__ndk1E7forwardIT_Efp_Espclsr3std6__ndk1E7forwardIT0_Efp0_EEEOS9_DpOSA_ at /Users/ivo/git/mapbox-gl-native/mason_packages/osx-x86_64/android-ndk/arm-9-r12b/bin/../lib/gcc/arm-linux-androideabi/4.9.x/../../../../include/c++/4.9.x/__functional_base:413
Stack frame #15 pc 00146718  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine mbgl::HTTPRequest::async::{lambda()#1}::operator()() const at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/android/src/http_file_source.cpp:53 (discriminator 1)
Stack frame #16 pc 00142384  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine mbgl::util::RunLoop::Impl::processRunnables() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/android/src/run_loop.cpp:180
Stack frame #17 pc 00142910  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine mbgl::util::RunLoop::run() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/android/src/run_loop.cpp:229 (discriminator 2)
Stack frame #18 pc 0014bd44  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine void mbgl::util::Thread<mbgl::DefaultFileSource::Impl>::run<std::__ndk1::tuple<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, unsigned long long&>, 0u, 1u>(std::__ndk1::tuple<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, unsigned long long&>&&, std::__ndk1::integer_sequence<unsigned int, 0u, 1u>) at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../src/mbgl/util/thread.hpp:111
Stack frame #19 pc 0014bcb4  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine operator() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../src/mbgl/util/thread.hpp:95 (discriminator 1)
Stack frame #20 pc 00047003  /system/lib/libc.so (_ZL15__pthread_startPv+22)
Stack frame #21 pc 00019e1d  /system/lib/libc.so (__start_thread+6)
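
For context on the last dump: std::bad_function_call is thrown when an empty std::function is invoked. A defensive pattern is to check the callback before calling it, as in the sketch below. Request, Response and cancel are hypothetical names for illustration, and the guard only avoids the abort. It does not explain why the callback was empty in the first place.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical minimal response type.
struct Response {
    std::string data;
};

class Request {
public:
    explicit Request(std::function<void(Response)> callback)
        : callback_(std::move(callback)) {}

    // Called when the owner no longer wants results.
    void cancel() { callback_ = nullptr; }

    // Delivers a response; returns false when the callback was cleared.
    bool complete(Response response) {
        if (!callback_) {
            return false;  // calling an empty std::function would throw
        }
        // Copy first so the callback may cancel or replace itself safely.
        auto callback = callback_;
        callback(std::move(response));
        return true;
    }

private:
    std::function<void(Response)> callback_;
};
```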

@edgarmacas

I have the same problem. What would be the most feasible solution? Thanks.

A/libc: /Volumes/Android/buildbot/src/android/ndk-r12-release/ndk/sources/cxx-stl/llvm-libc++abi/libcxxabi/src/abort_message.cpp:74: void abort_message(const char *, ...)

@jfirebaugh
Contributor

jfirebaugh commented Sep 12, 2016

@ivovandongen So why would a DeleteLocalRef-based solution work, but not PushLocalFrame/PopLocalFrame?

[oops, meant to post to #6293]

@ivovandongen
Contributor

@jfirebaugh I was hoping you could tell me :)

It seems that PopLocalFrame doesn't clear references immediately, but I can't say exactly when it does. DeleteLocalRef does clear them immediately. In the case of (failing) HTTP requests, it seems that no references are cleared until at least the native constructor has returned, and maybe not even then. The failure creates even more references (going through the Java constructor -> onFailure -> etc.) and then it crashes.

I tried the same approach on the conversion for the queryRenderedFeature calls, and sadly push/pop local frame did not work as advertised there either.

I verified the creation of HTTP requests, and there are never more than the set limit (20).

@ivovandongen
Contributor

At the moment there are two remaining issues when switching back and forth between online and offline.

URL/ETag gets corrupted

Somehow the URL/ETag passed to the constructor of HTTPRequest gets corrupted:

09-14 12:23:08.038 5269-5269/? A/DEBUG: Abort message: '/Volumes/Android/buildbot/src/android/ndk-r12-release/ndk/sources/cxx-stl/llvm-libc++abi/libcxxabi/src/abort_message.cpp:74: void abort_message(const char *, ...): assertion "terminating with uncaught exception of type std::range_error: wstring_convert: from_bytes error" failed'

The URL/ETag (one or both) looks something like this at that time: �}^�, `���

Full dump:

********** Crash dump: **********
Build fingerprint: 'google/bullhead/bullhead:7.0/NRD90M/3085278:user/release-keys'
pid: 23687, tid: 25170, name: DefaultFileSour  >>> com.mapbox.mapboxsdk.testapp <<<
signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
Stack frame #00 pc 00049d94  /system/lib/libc.so (tgkill+12)
Stack frame #01 pc 00047533  /system/lib/libc.so (pthread_kill+34)
Stack frame #02 pc 0001d885  /system/lib/libc.so (raise+10)
Stack frame #03 pc 000193d1  /system/lib/libc.so (__libc_android_abort+34)
Stack frame #04 pc 00017014  /system/lib/libc.so (abort+4)
Stack frame #05 pc 0001b87f  /system/lib/libc.so (__libc_fatal+22)
Stack frame #06 pc 000195cb  /system/lib/libc.so (__assert2+18)
Stack frame #07 pc 003a6b2f  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine abort_message at /Volumes/Android/buildbot/src/android/ndk-r12-release/ndk/sources/cxx-stl/llvm-libc++abi/libcxxabi/src/abort_message.cpp:74
Stack frame #08 pc 003a6bf7  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine default_terminate_handler() at /Volumes/Android/buildbot/src/android/ndk-r12-release/ndk/sources/cxx-stl/llvm-libc++abi/libcxxabi/src/cxa_default_handlers.cpp:63
Stack frame #09 pc 003a5201  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine std::__terminate(void (*)()) at /Volumes/Android/buildbot/src/android/ndk-r12-release/ndk/sources/cxx-stl/llvm-libc++abi/libcxxabi/src/cxa_handlers.cpp:68
Stack frame #10 pc 003a4b1b  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so (__cxa_throw+122): Routine __cxxabiv1::failed_throw(__cxxabiv1::__cxa_exception*) at /Volumes/Android/buildbot/src/android/ndk-r12-release/ndk/sources/cxx-stl/llvm-libc++abi/libcxxabi/src/cxa_exception.cpp:149
Stack frame #11 pc 0004b48c  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine std::__ndk1::wstring_convert<std::__ndk1::codecvt_utf8_utf16<char16_t, 1114111ul, (std::__ndk1::codecvt_mode)0>, char16_t, std::__ndk1::allocator<char16_t>, std::__ndk1::allocator<char> >::from_bytes(char const*, char const*) at /Users/ivo/git/mapbox-gl-native/mason_packages/osx-x86_64/android-ndk/arm-9-r12b/bin/../lib/gcc/arm-linux-androideabi/4.9.x/../../../../include/c++/4.9.x/locale:3909 (discriminator 1)
Stack frame #12 pc 0004b0c4  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine std::__ndk1::wstring_convert<std::__ndk1::codecvt_utf8_utf16<char16_t, 1114111ul, (std::__ndk1::codecvt_mode)0>, char16_t, std::__ndk1::allocator<char16_t>, std::__ndk1::allocator<char> >::from_bytes(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&) at /Users/ivo/git/mapbox-gl-native/mason_packages/osx-x86_64/android-ndk/arm-9-r12b/bin/../lib/gcc/arm-linux-androideabi/4.9.x/../../../../include/c++/4.9.x/locale:3786 (discriminator 3)
Stack frame #13 pc 00146ec8  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine jni::Object<jni::StringTag> jni::Make<jni::Object<jni::StringTag>, _JNIEnv&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >&>(_JNIEnv&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >&) at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../mason_packages/headers/jni.hpp/2.0.0/include/jni/make.hpp:10 (discriminator 2)
Stack frame #14 pc 00148838  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine std::__ndk1::__unique_if<mbgl::HTTPRequest>::__unique_single std::__ndk1::make_unique<mbgl::HTTPRequest, _JNIEnv&, mbgl::Resource const&, std::__ndk1::function<void (mbgl::Response)>&>(_JNIEnv&, mbgl::Resource const&, std::__ndk1::function<void (mbgl::Response)>&) at /Users/ivo/git/mapbox-gl-native/mason_packages/osx-x86_64/android-ndk/arm-9-r12b/bin/../lib/gcc/arm-linux-androideabi/4.9.x/../../../../include/c++/4.9.x/memory:3047 (discriminator 46)
Stack frame #15 pc 00156010  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine mbgl::OnlineFileSource::Impl::activateRequest(mbgl::OnlineFileRequest*) at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/default/online_file_source.cpp:101
Stack frame #16 pc 00155cbc  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine mbgl::OnlineFileSource::Impl::activatePendingRequest() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/default/online_file_source.cpp:119
Stack frame #17 pc 001565a0  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine operator() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/default/online_file_source.cpp:103
Stack frame #18 pc 00156448  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine _ZNSt6__ndk18__invokeIRZN4mbgl16OnlineFileSource4Impl15activateRequestEPNS1_17OnlineFileRequestEEUlNS1_8ResponseEE_JS6_EEEDTclclsr3std6__ndk1E7forwardIT_Efp_Espclsr3std6__ndk1E7forwardIT0_Efp0_EEEOS9_DpOSA_ at /Users/ivo/git/mapbox-gl-native/mason_packages/osx-x86_64/android-ndk/arm-9-r12b/bin/../lib/gcc/arm-linux-androideabi/4.9.x/../../../../include/c++/4.9.x/__functional_base:413
Stack frame #19 pc 00148a40  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine mbgl::HTTPRequest::async::{lambda()#1}::operator()() const at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/android/src/http_file_source.cpp:47 (discriminator 1)
Stack frame #20 pc 001444a0  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine mbgl::util::RunLoop::Impl::processRunnables() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/android/src/run_loop.cpp:180
Stack frame #21 pc 00144a2c  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine mbgl::util::RunLoop::run() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/android/src/run_loop.cpp:229 (discriminator 2)
Stack frame #22 pc 0014e7dc  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine void mbgl::util::Thread<mbgl::DefaultFileSource::Impl>::run<std::__ndk1::tuple<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, unsigned long long&>, 0u, 1u>(std::__ndk1::tuple<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, unsigned long long&>&&, std::__ndk1::integer_sequence<unsigned int, 0u, 1u>) at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../src/mbgl/util/thread.hpp:114
Stack frame #23 pc 0014e74c  /data/app/com.mapbox.mapboxsdk.testapp-2/lib/arm/libmapbox-gl.so: Routine operator() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../src/mbgl/util/thread.hpp:98 (discriminator 1)
Stack frame #24 pc 00047003  /system/lib/libc.so (_ZL15__pthread_startPv+22)
Stack frame #25 pc 00019e1d  /system/lib/libc.so (__start_thread+6)
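
For context on the abort above: wstring_convert::from_bytes throws std::range_error when the input cannot be converted and no error string was supplied, which is what the trace shows inside jni::Make. A defensive check like the sketch below could turn the abort into a reportable error before the string reaches the converter. isValidUtf8 is a hypothetical helper written for this illustration. It does not address why the bytes were corrupted in the first place.

```cpp
#include <cstddef>
#include <string>

// Returns true when `s` is well-formed UTF-8. Rejects truncated sequences,
// overlong encodings, UTF-16 surrogates, and code points above U+10FFFF.
bool isValidUtf8(const std::string& s) {
    const auto* p = reinterpret_cast<const unsigned char*>(s.data());
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = p[i];
        if (c < 0x80) {  // plain ASCII
            ++i;
            continue;
        }
        std::size_t extra = 0;   // number of continuation bytes expected
        unsigned long min = 0;   // smallest code point allowed for this length
        unsigned long cp = 0;    // decoded code point
        if ((c & 0xE0) == 0xC0) {
            extra = 1; min = 0x80; cp = c & 0x1F;
        } else if ((c & 0xF0) == 0xE0) {
            extra = 2; min = 0x800; cp = c & 0x0F;
        } else if ((c & 0xF8) == 0xF0) {
            extra = 3; min = 0x10000; cp = c & 0x07;
        } else {
            return false;  // stray continuation byte or invalid lead byte
        }
        if (n - i <= extra) {
            return false;  // sequence is cut off at the end of the string
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = p[i + k];
            if ((cc & 0xC0) != 0x80) {
                return false;  // not a continuation byte
            }
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return false;  // overlong, out of range, or surrogate
        }
        i += extra + 1;
    }
    return true;
}
```

A caller could log and fail the single request when the check returns false, rather than letting the exception escape on the file source thread.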

Crash when stopping Timer

This one is more difficult to reproduce; it has only happened twice in the last couple of days. It will probably be easier to reproduce once we solve the other issues. The crash happens in Timer#stop:

********** Crash dump: **********
Build fingerprint: 'google/bullhead/bullhead:7.0/NRD90M/3085278:user/release-keys'
pid: 2988, tid: 4175, name: DefaultFileSour  >>> com.mapbox.mapboxsdk.testapp <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x18
Stack frame #00 pc 00146584  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine mbgl::util::Timer::Impl::stop() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/android/src/timer.cpp:32
Stack frame #01 pc 00154f58  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine mbgl::OnlineFileRequest::schedule(std::experimental::__library_fundamentals_v1::optional<std::__ndk1::chrono::time_point<std::__ndk1::chrono::system_clock, std::__ndk1::chrono::duration<long long, std::__ndk1::ratio<1ll, 1ll> > > >) at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/default/online_file_source.cpp:265
Stack frame #02 pc 00155658  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine mbgl::OnlineFileRequest::completed(mbgl::Response) at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/default/online_file_source.cpp:309 (discriminator 1)
Stack frame #03 pc 00156698  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine operator() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/default/online_file_source.cpp:105
Stack frame #04 pc 001564fc  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine _ZNSt6__ndk18__invokeIRZN4mbgl16OnlineFileSource4Impl15activateRequestEPNS1_17OnlineFileRequestEEUlNS1_8ResponseEE_JS6_EEEDTclclsr3std6__ndk1E7forwardIT_Efp_Espclsr3std6__ndk1E7forwardIT0_Efp0_EEEOS9_DpOSA_ at /Users/ivo/git/mapbox-gl-native/mason_packages/osx-x86_64/android-ndk/arm-9-r12b/bin/../lib/gcc/arm-linux-androideabi/4.9.x/../../../../include/c++/4.9.x/__functional_base:413
Stack frame #05 pc 00148af4  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine mbgl::HTTPRequest::async::{lambda()#1}::operator()() const at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/android/src/http_file_source.cpp:47 (discriminator 1)
Stack frame #06 pc 001444a0  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine mbgl::util::RunLoop::Impl::processRunnables() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/android/src/run_loop.cpp:180
Stack frame #07 pc 00144a2c  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine mbgl::util::RunLoop::run() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../platform/android/src/run_loop.cpp:229 (discriminator 2)
Stack frame #08 pc 0014e890  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine void mbgl::util::Thread<mbgl::DefaultFileSource::Impl>::run<std::__ndk1::tuple<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, unsigned long long&>, 0u, 1u>(std::__ndk1::tuple<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, unsigned long long&>&&, std::__ndk1::integer_sequence<unsigned int, 0u, 1u>) at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../src/mbgl/util/thread.hpp:114
Stack frame #09 pc 0014e800  /data/app/com.mapbox.mapboxsdk.testapp-1/lib/arm/libmapbox-gl.so: Routine operator() at /Users/ivo/git/mapbox-gl-native/build/android-arm-v7/Debug/../../../src/mbgl/util/thread.hpp:98 (discriminator 1)
Stack frame #10 pc 00047003  /system/lib/libc.so (_ZL15__pthread_startPv+22)
Stack frame #11 pc 00019e1d  /system/lib/libc.so (__start_thread+6)

@jfirebaugh
Contributor

What thread does OkHttp run Callbacks on? It looks like the current implementation of Android HTTPRequest::onResponse/HTTPRequest::onFailure assumes that they are called on the thread which called Call::enqueue. If that isn't the case and onResponse/onFailure are called on a different thread than the OnlineFileSource, they'll be racing against deletion of the HTTPRequest due to cancellation.

I'm not quite sure how that would lead to crashes with the stack traces above, but it seems suspicious.

@ivovandongen
Contributor

@jfirebaugh Thanks for the tips

What thread does OkHttp run Callbacks on?

HTTPRequest::HTTPRequest runs on the calling thread (DefaultFileSource in a lot of cases)
HTTPRequest::onFailure:

  • Runs on the original calling thread if the HTTPRequest ctor fails for any reason (io, no connection to begin with) here
  • Runs on a thread from the OkHttp pool for any errors after the ctor completes

HTTPRequest::onResponse: runs on a thread from the OkHttp pool

It looks like the current implementation of Android HTTPRequest::onResponse/HTTPRequest::onFailure assumes that they are called on the thread which called Call::enqueue. If that isn't the case and onResponse/onFailure are called on a different thread than the OnlineFileSource, they'll be racing against deletion of the HTTPRequest due to cancellation.

I thought that the callback (AsyncTask) abstracted over that: defined here, called from onResponse/onFailure. It always seems to run on the original calling thread.

Would a good next step be to check the assertions like you mentioned here? #5827 (comment)

@jfirebaugh
Contributor

I thought that the callback (AsyncTask) abstracted over that

It does, but only the part that happens after the call to async.send(). onResponse/onFailure themselves are still running on the OkHttp pool thread. When they access the member variable async, in order to call async.send(), they are depending on this still being valid.

@ivovandongen
Contributor

@jfirebaugh Thanks, yet again, for looking into this. I've looked at the threading in onResponse/onFailure and haven't been able to provoke a situation where the this pointer becomes invalid.

Going by your comments #5827 (comment) though, I found that requests indeed do get scheduled multiple times under some conditions as the pre-conditions are not checked at the proper time.

I added runtime checks to OnlineFileSource::activateOrQueueRequest (commit) to prevent double scheduling and this resolves all issues noted above. I can check if this also solves #5827 if I can reproduce that.

@jfirebaugh
Contributor

Nice. Did you determine what the circumstances are where activateOrQueueRequest gets called for a request that's already pending or active? I believe that means there's code elsewhere that isn't acting as expected.

@ivovandongen
Contributor

Did you determine what the circumstances are where activateOrQueueRequest gets called for a request that's already pending or active

No, not quite. Somehow the request gets scheduled twice after an online/offline switch. This happens regardless of setting the NetworkState proactively (first commit), but was previously masked by other crashes. So either schedule() gets called twice on some error conditions or the request is not correctly moved from pending to active state. I checked all threads involved to see if concurrent access might be an issue, but all callbacks execute on a single thread (OnlineFileSource). I could try to reproduce the exact circumstances in a unit test. Ideally I'd like to use a mock for HttpFileSource in that case to have a little more control.

Little side-question; do you think it's worth having two collections keeping track of the pending requests? It might be more efficient than a std::set, but it also adds some complexity and possible errors.

@jfirebaugh
Contributor

@ivovandongen I want to track down the root cause here. Can you keep digging?

Little side-question; do you think it's worth having two collections keeping track of the pending requests? It might be more efficient than a std::set, but it also adds some complexity and possible errors.

Good question -- the reason to use two collections is that we need a combination of things that no one C++ collection type provides:

  • Iteration by insertion order -- pending requests need to be processed in the order received. std::map (or std::unordered_map) does not give us that, but std::list does.
  • Efficient lookup by identifier (pointer value in this case), for when a request is removed. Searching a std::list is O(n), which would make the simultaneous removal of a large number of requests O(n^2). Quadratic behavior would be unacceptable in the case of pausing a large offline download, for example.

Some languages have insertion-ordered maps as a built-in collection (e.g. Ruby, JS); it's a handy datastructure. In C++ the most straightforward way to get the same behavior is to use two coordinated collections. There are alternatives such as boost multi-index but they tend to be very heavy-weight.
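The two-collection approach described above can be sketched roughly as follows. This is a minimal illustration, not the code in OnlineFileSource: the names PendingQueue and Request are hypothetical, and the sketch assumes a std::list for insertion order plus a std::unordered_map from pointer to list iterator for lookup. It relies on the fact that std::list iterators stay valid when other elements are erased.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <unordered_map>

// Hypothetical stand-in for a queued request; only its address is used as identity.
struct Request {
    int id;
};

// Insertion-ordered queue with average O(1) lookup and removal by pointer.
class PendingQueue {
public:
    // Appends the request unless it is already queued. Returns true if inserted.
    bool insert(Request* req) {
        if (index.count(req) != 0) return false;
        order.push_back(req);
        index.emplace(req, std::prev(order.end()));
        return true;
    }

    // Removes the request if present, without scanning the list.
    bool remove(Request* req) {
        auto it = index.find(req);
        if (it == index.end()) return false;
        order.erase(it->second);  // O(1): we already hold the list iterator
        index.erase(it);
        return true;
    }

    bool contains(Request* req) const { return index.count(req) != 0; }

    // Pops the oldest request, or returns nullptr when the queue is empty.
    Request* popFront() {
        if (order.empty()) return nullptr;
        Request* req = order.front();
        order.pop_front();
        index.erase(req);
        return req;
    }

    std::size_t size() const { return order.size(); }

private:
    std::list<Request*> order;  // preserves insertion order
    std::unordered_map<Request*, std::list<Request*>::iterator> index;  // pointer -> position
};
```

The cost is that every mutation has to update both containers, which is the extra complexity the question refers to; keeping them private behind a small class limits where that can go wrong.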

@ivovandongen
Contributor

@jfirebaugh Found the issue. As I suspected, the request may be scheduled multiple times. It can happen when the connection is restored, because schedule is called from OnlineFileRequest::networkIsReachableAgain while the request may already be scheduled (or even running, if it just transitioned).

I think the fix I proposed (commit) still makes sense as at the moment of connectivity restore / schedule there might be a timer already set to call activateOrQueueRequest here in schedule.

Doing the check in activateOrQueueRequest ensures that at the moment of queuing/activation the pre-conditions are valid at that actual point. It also guards well against any future regressions.


I traced the methods with and without setting the connection state pro-actively (commit)

With pro-active connection state:

OnlineFileSource::Impl::activateOrQueueRequest
OnlineFileSource::Impl::queueRequest
OnlineFileRequest::networkIsReachableAgain
OnlineFileRequest::schedule
OnlineFileSource::Impl::activateOrQueueRequest <-- already queued
OnlineFileSource::Impl::queueRequest <-- assert crashes things

With passive connection state:

OnlineFileSource::Impl::add
OnlineFileRequest::schedule
OnlineFileSource::Impl::activateOrQueueRequest
OnlineFileSource::Impl::queueRequest
OnlineFileSource::Impl::activatePendingRequest
OnlineFileSource::Impl::activateRequest
OnlineFileRequest::completed
OnlineFileRequest::schedule
OnlineFileRequest::networkIsReachableAgain
OnlineFileRequest::schedule
OnlineFileSource::Impl::activateOrQueueRequest <-- already queued
OnlineFileSource::Impl::queueRequest <-- assert crashes things

@zugaldia zugaldia added this to the android-v4.2.0 milestone Sep 20, 2016
@zugaldia
Member Author

I think the fix I proposed (commit) still makes sense as at the moment of connectivity restore / schedule there might be a timer already set to call activateOrQueueRequest here in schedule.

@jfirebaugh Does this look like 👍 to merge #6293? I'd love to have this for beta3.

@jfirebaugh
Contributor

@ivovandongen Thanks so much for tracking it down! To summarize, a request will be improperly rescheduled in the event that networkIsReachableAgain is called while the request is in the pending state. The guard at the beginning of OnlineFileRequest::schedule is intended to prevent duplicate scheduling, but it's ineffective in this case because a pending request does not have the request member set.

I think we want to make the fix at a higher level than activateOrQueueRequest however, to avoid the following timing:

  1. Request is scheduled, transitions to pending
  2. Network becomes reachable, scheduling timer is activated
  3. Prior scheduled request transitions to active, and quickly completes
  4. Scheduling timer from step 2 completes, request is rescheduled

In this scenario, we end up with more frequent requests than desired.

The bug here is really in step 2: a pending request should not be rescheduled. Let's adjust the guard condition so that this doesn't happen.
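As a rough illustration of the adjusted guard, here is a simplified sketch. It is not the actual mbgl code: Scheduler, its accessor names isPending/isActive, and the queueCalls counter are hypothetical, the scheduling timer is left out, and networkIsReachableAgain reschedules unconditionally here, which is a simplification.

```cpp
#include <cassert>
#include <set>

struct Request;

// Hypothetical bookkeeping for which requests are pending or active.
class Scheduler {
public:
    bool isPending(const Request* r) const { return pending.count(r) != 0; }
    bool isActive(const Request* r) const { return active.count(r) != 0; }

    void queue(const Request* r) {
        pending.insert(r);
        ++queueCalls;
    }
    void activate(const Request* r) {
        pending.erase(r);
        active.insert(r);
    }
    void complete(const Request* r) { active.erase(r); }

    int queueCalls = 0;  // counts how often a request was actually queued

private:
    std::set<const Request*> pending;
    std::set<const Request*> active;
};

struct Request {
    explicit Request(Scheduler& s) : scheduler(s) {}

    // Returns true if the request was scheduled, false if the guard rejected it.
    bool schedule() {
        // Guard: a request that is already pending or active must not be rescheduled.
        if (scheduler.isPending(this) || scheduler.isActive(this)) return false;
        scheduler.queue(this);
        return true;
    }

    // Called when connectivity is restored (simplified: always tries to reschedule).
    bool networkIsReachableAgain() { return schedule(); }

    Scheduler& scheduler;
};
```

With the guard checking the pending state as well as the active state, a reachability notification that arrives while the request is still queued becomes a no-op instead of a second queue entry.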

@ivovandongen
Contributor

@jfirebaugh Thanks again! Didn't consider that scenario yet.

I've changed the guard in schedule. Hope this is what you meant. I prefer the accessor methods (isScheduled / isActive) over accessing the collections in OnlineFileSource::Impl directly from OnlineFileRequest.

Also, I added a small check on NetworkStatus::Set so that Reachable is called only once. I noticed that on Android, the state transition to online/connected is sometimes called multiple times, which would mean calling reschedule for all requests multiple times.
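The idea of notifying only on an actual offline-to-online transition could look something like the sketch below. NetworkStatusSketch and its members are hypothetical names for illustration and are not the real NetworkStatus implementation.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical status tracker that fires a callback only on a real transition.
class NetworkStatusSketch {
public:
    enum class Status { Online, Offline };

    explicit NetworkStatusSketch(std::function<void()> callback)
        : onReachable(std::move(callback)) {}

    // Records the new status and notifies only when going from Offline to Online.
    void set(Status next) {
        const bool becameOnline = (status == Status::Offline && next == Status::Online);
        status = next;
        if (becameOnline && onReachable) onReachable();
    }

    Status get() const { return status; }

private:
    Status status = Status::Online;  // assumption: start out online
    std::function<void()> onReachable;
};
```

Repeated "online" notifications from the platform then collapse into a single callback, so requests are rescheduled at most once per reconnect.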

@ivovandongen ivovandongen self-assigned this Sep 20, 2016
@ivovandongen
Contributor

ivovandongen commented Sep 20, 2016

@jfirebaugh Now that I think about it some more, there is one situation we don't cover with the guards in schedule, I think:

  1. Request is scheduled, activated, hits a rate limit (or something else causing a timeout before retry)
  2. schedule is called, timer is started to call activateOrQueueRequest with a delay
  3. Connectivity is disrupted
  4. Connectivity is restored
  5. schedule is called again for the same request

Now the guard will not catch this, as the request is neither scheduled nor active. So another timer is started, resulting in two subsequent calls to activateOrQueueRequest, right?

@ivovandongen
Contributor

@jfirebaugh To prevent both the situation you described and the one just above, I've added an additional guard in the timer completion callback.

If you could review the changes in #6293, that would be great.

@jfirebaugh
Contributor

Two things:

  • timer.start(...) is supposed to cancel any existing timer, and as far as I know that works properly on all platforms. So calling schedule when the timer is already active should not result in multiple calls to activateOrQueueRequest.
  • Note that networkIsReachableAgain() only calls schedule if failedRequestReason is Reason::Connection. In that case, we get the desired behavior: resetting the exponential backoff for connection errors when the network is restored. Other errors, such as rate limiting, won't cause networkIsReachableAgain to reschedule.
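Both points can be illustrated with a small sketch. It assumes a manually driven timer instead of a real run loop; ManualTimer, RetryingRequest, the reduced Reason enum, and the activations counter are hypothetical stand-ins, not the mbgl types.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical single-shot timer, fired manually instead of by a run loop.
class ManualTimer {
public:
    // Starting the timer replaces (and thereby cancels) any pending callback.
    void start(std::function<void()> cb) { callback = std::move(cb); }

    void stop() { callback = nullptr; }

    // Simulates expiry: runs the pending callback at most once.
    void fire() {
        if (!callback) return;
        auto cb = std::move(callback);
        callback = nullptr;
        cb();
    }

private:
    std::function<void()> callback;
};

// Reduced, illustrative set of failure reasons.
enum class Reason { None, Connection, RateLimit };

struct RetryingRequest {
    ManualTimer timer;
    Reason failedRequestReason = Reason::None;
    int activations = 0;  // stands in for calls to activateOrQueueRequest

    // Restarting the timer drops the previously scheduled callback.
    void schedule() {
        timer.start([this] { ++activations; });
    }

    // Only connection failures are retried early when the network returns.
    void networkIsReachableAgain() {
        if (failedRequestReason == Reason::Connection) schedule();
    }
};
```

In this sketch, calling schedule twice before the timer fires still yields a single activation, and a request that failed because of rate limiting keeps its existing backoff when connectivity returns.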

@jfirebaugh
Contributor

PR looks good! Thanks for getting to the bottom of this. I'm confident this change will fix #5827 as well.
