Crash due to alignment exception below SharedGroup::do_open on 32-bit iOS devices #1558

bdash · 2016-03-13T21:45:54Z

With core v0.97.0, released late last week as part of Realm Cocoa v0.98.4, users are seeing crashes below SharedGroup::do_open on 32-bit iOS devices. This was reported to us in realm/realm-swift#3321.

The exception looks like so:

Exception Type:  EXC_BAD_ACCESS (SIGBUS)
Exception Subtype: EXC_ARM_DA_ALIGN at 0x0000000002e00114
Triggered by Thread:  0

The backtrace like so:

* thread #1: tid = 0x15f283, 0x340381a0 libsystem_platform.dylib`OSAtomicCompareAndSwap64Barrier + 8, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_ARM_DA_ALIGN, address=0x776c09c)
    frame #0: 0x340381a0 libsystem_platform.dylib`OSAtomicCompareAndSwap64Barrier + 8
    frame #1: 0x3403bf1e libsystem_pthread.dylib`_pthread_mutex_lock + 210
    frame #2: 0x007cb5ec Realm`realm::util::RobustMutex::is_valid() + 84
    frame #3: 0x0088144c Realm`realm::SharedGroup::do_open(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool, realm::SharedGroup::DurabilityLevel, bool, char const*, bool) + 900
  * frame #4: 0x007b2834 Realm`realm::SharedGroup::open(this=0x163b3c00, repl=0x15dbe080, durability=durability_Full, encryption_key=0x00000000, allow_file_format_upgrade=true) + 284 at group_shared.hpp:810
    frame #5: 0x007b25ba Realm`realm::SharedGroup::SharedGroup(this=0x163b3c00, repl=0x15dbe080, durability=durability_Full, encryption_key=0x00000000, allow_file_format_upgrade=true) + 514 at group_shared.hpp:779
    frame #6: 0x007b23ae Realm`realm::SharedGroup::SharedGroup(this=0x163b3c00, repl=0x15dbe080, durability=durability_Full, encryption_key=0x00000000, allow_file_format_upgrade=true) + 62 at group_shared.hpp:778
    frame #7: 0x007acdee Realm`realm::Realm::open_with_config(realm::Realm::Config const&, std::__1::unique_ptr<realm::Replication, std::__1::default_delete<realm::Replication> >&, std::__1::unique_ptr<realm::SharedGroup, std::__1::default_delete<realm::SharedGroup> >&, std::__1::unique_ptr<realm::Group, std::__1::default_delete<realm::Group> >&) [inlined] std::__1::__unique_if<realm::SharedGroup>::__unique_single std::__1::make_unique<realm::SharedGroup, realm::Replication&, realm::SharedGroup::DurabilityLevel&, char const*, bool>(__args=0x15dbe080, __args=0x003fc4ac, __args=0x003fc4a4, __args=0x003fc4a3) + 100 at memory:3075
    frame #8: 0x007acd8a Realm`realm::Realm::open_with_config(config=0x15dbdfd8, history=0x15dbe020, shared_group=0x15dbe024, read_only_group=0x15dbe028) + 4034 at shared_realm.cpp:86
    frame #9: 0x007abba2 Realm`realm::Realm::Realm(this=0x15dbdfd0) + 470 at shared_realm.cpp:66
    frame #10: 0x007ad424 Realm`realm::Realm::Realm(this=0x15dbdfd0, config=<unavailable>) + 16 at shared_realm.cpp:65
    frame #11: 0x006a8a66 Realm`std::__1::shared_ptr<realm::Realm> std::__1::shared_ptr<realm::Realm>::make_shared<realm::Realm::Config>(realm::Realm::Config&&) [inlined] std::__1::__libcpp_compressed_pair_imp<std::__1::allocator<realm::Realm>, realm::Realm, 1u>::__libcpp_compressed_pair_imp<std::__1::allocator<realm::Realm>&, realm::Realm::Config&&, 0ul, 0ul>(this=0x15dbdfd0, __pc=piecewise_construct_t @ 0x003fd080, __first_args=tuple<std::__1::allocator<realm::Realm> &> @ 0x003fd07c, __second_args=tuple<realm::Realm::Config &&> @ 0x003fd078, (null)=__tuple_indices<0> @ 0x003fd074, (null)=__tuple_indices<0> @ 0x003fd070) + 144 at memory:2128
    frame #12: 0x006a89d6 Realm`std::__1::shared_ptr<realm::Realm> std::__1::shared_ptr<realm::Realm>::make_shared<realm::Realm::Config>(realm::Realm::Config&&) [inlined] std::__1::__compressed_pair<std::__1::allocator<realm::Realm>, realm::Realm>::__compressed_pair<std::__1::allocator<realm::Realm>&, realm::Realm::Config&&>(this=0x15dbdfd0, __pc=piecewise_construct_t @ 0x003fd05c, __first_args=tuple<std::__1::allocator<realm::Realm> &> @ 0x003fd058, __second_args=tuple<realm::Realm::Config &&> @ 0x003fd054) + 48 at memory:2391
    frame #13: 0x006a89a6 Realm`std::__1::shared_ptr<realm::Realm> std::__1::shared_ptr<realm::Realm>::make_shared<realm::Realm::Config>(realm::Realm::Config&&) [inlined] std::__1::__compressed_pair<std::__1::allocator<realm::Realm>, realm::Realm>::__compressed_pair<std::__1::allocator<realm::Realm>&, realm::Realm::Config&&>(this=0x15dbdfd0, __pc=piecewise_construct_t @ 0x003fd038, __first_args=tuple<std::__1::allocator<realm::Realm> &> @ 0x003fd034, __second_args=tuple<realm::Realm::Config &&> @ 0x003fd030) + 24 at memory:2394
    frame #14: 0x006a898e Realm`std::__1::shared_ptr<realm::Realm> std::__1::shared_ptr<realm::Realm>::make_shared<realm::Realm::Config>(realm::Realm::Config&&) [inlined] std::__1::__shared_ptr_emplace<realm::Realm, std::__1::allocator<realm::Realm> >::__shared_ptr_emplace<realm::Realm::Config>(this=0x15dbdfc0, __a=allocator<realm::Realm> @ 0x003fd018, __args=0x003fd8d8) + 330 at memory:3732
    frame #15: 0x006a8844 Realm`std::__1::shared_ptr<realm::Realm> std::__1::shared_ptr<realm::Realm>::make_shared<realm::Realm::Config>(realm::Realm::Config&&) [inlined] std::__1::__shared_ptr_emplace<realm::Realm, std::__1::allocator<realm::Realm> >::__shared_ptr_emplace<realm::Realm::Config>(this=0x15dbdfc0, __a=allocator<realm::Realm> @ 0x003fcff8, __args=0x003fd8d8) + 16 at memory:3733
    frame #16: 0x006a8834 Realm`std::__1::shared_ptr<realm::Realm> std::__1::shared_ptr<realm::Realm>::make_shared<realm::Realm::Config>(__args=0x003fd8d8) + 348 at memory:4308
    frame #17: 0x0069ca58 Realm`realm::_impl::RealmCoordinator::get_realm(realm::Realm::Config) [inlined] std::__1::enable_if<!(is_array<realm::Realm>::value), std::__1::shared_ptr<realm::Realm> >::type std::__1::make_shared<realm::Realm, realm::Realm::Config>(__args=0x003fd8d8) + 22 at memory:4672
    frame #18: 0x0069ca42 Realm`realm::_impl::RealmCoordinator::get_realm(this=0x15e98df0, config=<unavailable>) + 2550 at realm_coordinator.cpp:116
    frame #19: 0x007aebe8 Realm`realm::Realm::get_shared_realm(realm::Realm::Config) + 172 at shared_realm.cpp:183
    frame #20: 0x0078ae9a Realm`+[RLMRealm openSharedRealm:error:](self=0x00956104, _cmd="openSharedRealm:error:", config=0x15e98a88, outError=domain: nil - code: 4186848) + 154 at RLMRealm.mm:279
    frame #21: 0x0078b7a8 Realm`+[RLMRealm realmWithConfiguration:error:](self=0x00956104, _cmd="realmWithConfiguration:error:", configuration=0x15e98a80, error=domain: nil - code: 4186848) + 2096 at RLMRealm.mm:350

I was able to reproduce by doing the following:

cd realm-core
git checkout v0.97.0
REALM_ENABLE_ENCRYPTION=yes REALM_ENABLE_ASSERTIONS=yes sh build.sh config
REALM_COCOA_PLATFORMS="iphone" sh build.sh build-cocoa
cd ../realm-cocoa
git checkout v0.98.4
curl "https://gist.githubusercontent.com/bdash/d7ee7bfb4624c57f689f/raw/6bd1be5f75cc42c52ed94fbe7a0bf0f635c822c4/armv7-only.patch" | patch -p1 (hacks the Xcode projects to build for armv7 only)
open examples/ios/objc/RealmExamples.xcworkspace
Build and run the Migration scheme on an iOS device.

A git bisect pinpointed b78b704 as the culprit for the breakage. Nothing obvious jumps out to me in that change, but I was able to verify this by switching between b78b704 and its parent.

/cc @kspangsege

The text was updated successfully, but these errors were encountered:

tgoyne · 2016-03-14T01:13:44Z

The change from placement-new on a mmaped buffer to copying a SharedInfo to the file is not valid because SharedInfo has pthread_mutex_t members. Because they're unlocked at the time it's unsurprising that it happens to work on many platforms, but POSIX doesn't require that memcpy on a process-shared mutex should work.

kspangsege · 2016-03-14T01:32:26Z

@tgoyne, I have a hard time making sense of that.

First of all, do we agree that the following scenario is supposed to work?

Create file with shared mutex the way core used to do it (by placement new on mapped file).
Reboot computer.
Map file and lock mutex.

If that works, why would the new scheme used by SharedGroup::do_open() not work? How can step 3 tell the difference between a mutex in a file prepared before a reboot, and one that was prepared by copying some bytes into a file? What is the significant difference?

kspangsege · 2016-03-14T01:44:00Z

@tgoyne, here is another scenario that is supposed to work as far as I know:

Run program to initialize mutex in file A.
Copy file A to file B.
Run another program that maps file B and locks the mutex.

Again, if this works, why not the scheme used in SharedGroup::do_open()?

kspangsege · 2016-03-14T01:59:44Z

As far as I can tell, POSIX only requires that a process shared mutex remains mapped at a fixed address from the point of view of process A while it is locked by process A.

In particular, when the mutex is not locked by anybody, its state is self contained, and can therefore be cloned and copied around as desired.

kspangsege · 2016-03-14T02:17:20Z

@tgoyne, I will be grateful if you can just hint at the kind of implementation of process shared mutexes that would invalidate the scheme currently used in SharedGroup::do_open().

bdash · 2016-03-14T02:25:57Z

This doesn't reproduce in the iOS simulator, only on an iOS device. Any iOS device capable of running iOS 9 should be able to reproduce this when following the steps I outlined above.

tgoyne · 2016-03-14T02:57:21Z

First of all, do we agree that the following scenario is supposed to work?

No, I do not see anything in POSIX 2013 that even hints at either of those being a supported scenario. pthread_mutex_t is explicitly an opaque type that can only be manipulated with the provided functions, and PTHREAD_PROCESS_SHARED merely makes it so that it can be used by any thread which has access to the mutex, even if the mutex is created in shared memory. Mmaping a copy of the file would not give you access to the same memory.

kspangsege · 2016-03-14T03:05:12Z

Mmaping a copy of the file would not give you access to the same memory.

Of course not. I only meant to imply that the copy was itself a valid mutex that can be mapped and then locked.

kspangsege · 2016-03-14T03:20:21Z

No, I do not see anything in POSIX 2013 that even hints at either of those being a supported scenario.

I agree that POSIX is not explicit about it, but from the man pages it is clear that the following scenario is valid:

Program A creates a file and initializes a process shared mutex in that file.
Program A terminates.
Time passes while nobody has the file open.
Program B opens the file, maps it, and locks the process shared mutex.

So what you are saying, is that this scenario is only valid if the system is not rebooted during step 3 (time passes).

I would find that very odd for several reasons:

The man page does not specifically prohibit the reboot scenario.
If it is not just a matter of the mutex state constituted by the bytes stored in the file, then it is not at all clear what I am allowed to do with the file during step 3 (time passes). It would seem to me that the man page would then have to be explicit about what you can and cannot do with the file. And it isn't.

kspangsege · 2016-03-14T03:22:20Z

And:

It is hard (if not impossible) for process B to tell whether the system was rebooted or not.

kspangsege · 2016-03-14T03:37:54Z

All I am saying, is that as far as I can see, one can deduce from the POSIX man pages that a process shared mutex has to have a self contained state (no critical information saved outside mutex object) while it is not locked by anybody (step 3, time passes).

It is of course a separate question whether iOS adheres to that rule or not.

kspangsege · 2016-03-14T03:43:06Z

All I am saying, is that as far as I can see, one can deduce from the POSIX man pages that a process shared mutex has to have a self contained state (no critical information saved outside mutex object) while it is not locked by anybody (step 3, time passes).

@tgoyne, please let me know if you know anything specific about the iOS implementation of process shared mutexes that makes it not adhere to this rule.

kspangsege · 2016-03-14T03:50:10Z

@tgoyne, please see "Process Shared Memory and Synchronization" in the man page of pthread_mutexattr_init. Note especially the semaphore example, and this remark:

In particular, these processes may exist beyond the lifetime of the initializing process.

kspangsege · 2016-03-14T04:43:46Z

Here is a little extra analysis from me:

One can imagine that iOS allocates part of the mutex state in a slot of memory somewhere outside the mutex object whenever a shared mutex is initialized. In that case the slot must be identified during any subsequent access to the mutex, and that would invalidate the reboot scenario.

However, since a file containing a process shared mutex can be created via one map, and then later accessed via another map at a a different address (even inside the same process, or thread), we can safely assume that the address of the mutex object is not stored (at least not for any critical purpose) while the mutex is not locked by anybody.

Further more, even if iOS behaves like this, the current implementation of SharedGroup::do_open() would still work, as the mutex object is still initialized before being used.

bdash · 2016-03-14T05:17:36Z

A little digging around in Apple's pthread implementation suggests that the alignment of info in the following block from SharedGroup::do_open is the relevant factor:

// Write an initialized SharedInfo structure to the file, but with
// init_complete = 0.
SharedInfo info(durability, history_type);
m_file.write(reinterpret_cast<char*>(&info), sizeof info); // Throws

Information about the alignment of a mutex is stored within the pthread_mutex structure during pthread_mutex_init. If the alignment of info differs from the alignment of the SharedInfo structure in the mapped region, the alignment information in the pthread_mutex structures will not match the alignment of the addresses the mutexes are mapped at. Changing the declaration of info to alignas(64) SharedInfo gives the structure the same alignment as when it's remapped, and we no longer crash.

finnschiermer · 2016-03-14T08:04:49Z

@bdash: Thx for digging into this, we'll make a fix.

kspangsege · 2016-03-14T12:58:44Z

@bdash, thanks a lot for pinpointing the problem. One thing, though, I am assuming that you meant to align on a 64 bit boundary, which means that alignas(8) will suffice. Note that alignas() takes a "number of bytes" argument.

Also thanks a lot for the precise repro instructions.

@finnschiermer will implement a fix and release another version of core today.

bugfix for issue #1558

bdash added the T-Bug-Crash label Mar 13, 2016

kspangsege self-assigned this Mar 14, 2016

bmunkholm added the S:In Progress label Mar 14, 2016

finnschiermer added a commit that referenced this issue Mar 14, 2016

bugfix for issue #1558

8fa507c

finnschiermer mentioned this issue Mar 14, 2016

bugfix for issue #1558 #1559

Merged

finnschiermer assigned finnschiermer and unassigned kspangsege Mar 14, 2016

finnschiermer closed this as completed in #1559 Mar 14, 2016

finnschiermer added a commit that referenced this issue Mar 14, 2016

Merge pull request #1559 from realm/fsa/fix_sharedinfo_alignment

03383fd

bugfix for issue #1558

finnschiermer removed the S:In Progress label Mar 14, 2016

kspangsege pushed a commit that referenced this issue Mar 14, 2016

bugfix for issue #1558

0167498

crrobinson14 mentioned this issue Oct 25, 2016

EXC_BAD_ACCESS when opening a Realm DB realm/realm-swift#4260

Closed

github-actions bot locked as resolved and limited conversation to collaborators Mar 22, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Crash due to alignment exception below SharedGroup::do_open on 32-bit iOS devices #1558

Crash due to alignment exception below SharedGroup::do_open on 32-bit iOS devices #1558

bdash commented Mar 13, 2016

tgoyne commented Mar 14, 2016

kspangsege commented Mar 14, 2016

kspangsege commented Mar 14, 2016

kspangsege commented Mar 14, 2016

kspangsege commented Mar 14, 2016

bdash commented Mar 14, 2016

tgoyne commented Mar 14, 2016

kspangsege commented Mar 14, 2016

kspangsege commented Mar 14, 2016

kspangsege commented Mar 14, 2016

kspangsege commented Mar 14, 2016

kspangsege commented Mar 14, 2016

kspangsege commented Mar 14, 2016

kspangsege commented Mar 14, 2016

bdash commented Mar 14, 2016

finnschiermer commented Mar 14, 2016

kspangsege commented Mar 14, 2016

Crash due to alignment exception below SharedGroup::do_open on 32-bit iOS devices #1558

Crash due to alignment exception below SharedGroup::do_open on 32-bit iOS devices #1558

Comments

bdash commented Mar 13, 2016

tgoyne commented Mar 14, 2016

kspangsege commented Mar 14, 2016

kspangsege commented Mar 14, 2016

kspangsege commented Mar 14, 2016

kspangsege commented Mar 14, 2016

bdash commented Mar 14, 2016

tgoyne commented Mar 14, 2016

kspangsege commented Mar 14, 2016

kspangsege commented Mar 14, 2016

kspangsege commented Mar 14, 2016

kspangsege commented Mar 14, 2016

kspangsege commented Mar 14, 2016

kspangsege commented Mar 14, 2016

kspangsege commented Mar 14, 2016

bdash commented Mar 14, 2016

finnschiermer commented Mar 14, 2016

kspangsege commented Mar 14, 2016