Intermittent Realm Accessed from Incorrect Thread after Realm version upgrade #6559

Closed
mahmoom opened this issue Jun 8, 2020 · 3 comments · Fixed by #6576

mahmoom commented Jun 8, 2020

Goals

We use a wrapper around Realm to abstract away some of the read/write logic throughout our application, as a way to modularize and simplify our code. We recently upgraded Realm from 3.17.3 to 5.0.1.

Expected Results

Our Realm wrappers grab a Realm instance and perform a write transaction on it inside a DispatchWorkItem wrapped in an autorelease pool, which is then dispatched onto our static utility queue to complete the transaction. This worked perfectly prior to upgrading Realm.

Actual Results

(The print statements emitted when getting the Realm and when writing confirm that both happen on the same thread. The crash happens in realm.beginWrite(), immediately after the second print statement, where we had just verified thread safety.)

~~~~~~~~~~~~~
Getting realm on thread <NSThread: 0x60000053f340>{number = 27, name = (null)}
~~~~~~~~~~~~~
=============
Performing action on thread <NSThread: 0x60000053f340>{number = 27, name = (null)}
=============
2020-06-08 09:22:39.985484-0700 Dev[77025:6312770] *** Terminating app due to uncaught exception 'RLMException', reason: 'Realm accessed from incorrect thread.'


*** First throw call stack:
(
0   CoreFoundation                      0x00007fff23e3cf0e __exceptionPreprocess + 350
1   libobjc.A.dylib                     0x00007fff50ba89b2 objc_exception_throw + 48
2   Realm                               0x000000010a0b1bbd _Z18RLMSetErrorOrThrowP7NSErrorPU15__autoreleasingS0_ + 505
3   Realm                               0x000000010a0853fd _Z26RLMRealmTranslateExceptionPU15__autoreleasingP7NSError + 488
4   Realm                               0x000000010a086d54 -[RLMRealm beginWriteTransactionWithError:] + 44
5   Core                                0x0000000107cc049b $s4Core14RealmUtilitiesC5writeyAA7RequestCyytGy0B5Swift0B0VKcFZAA11CancellableVyAA8ResponseOyytGccfU_yycfU_yyXEfU_ + 667
6   Core                                0x0000000107cc073f $ss5Error_pIgzo_ytsAA_pIegrzo_TR + 15
7   Core                                0x0000000107cc23e4 $ss5Error_pIgzo_ytsAA_pIegrzo_TRTA.28 + 20
8   libswiftObjectiveC.dylib            0x000000010bad4c7e $s10ObjectiveC15autoreleasepool8invokingxxyKXE_tKlF + 46
9   Core                                0x0000000107cc01a6 $s4Core14RealmUtilitiesC5writeyAA7RequestCyytGy0B5Swift0B0VKcFZAA11CancellableVyAA8ResponseOyytGccfU_yycfU_ + 230
10  Core                                0x0000000107c69670 $sIeg_IeyB_TR + 48
11  libdispatch.dylib                   0x000000010bb534dc _dispatch_block_async_invoke2 + 83
12  libdispatch.dylib                   0x000000010bb44e8e _dispatch_client_callout + 8
13  libdispatch.dylib                   0x000000010bb477a3 _dispatch_continuation_pop + 552
14  libdispatch.dylib                   0x000000010bb46bbb _dispatch_async_redirect_invoke + 771
15  libdispatch.dylib                   0x000000010bb56399 _dispatch_root_queue_drain + 351
16  libdispatch.dylib                   0x000000010bb56ca6 _dispatch_worker_thread2 + 135
17  libsystem_pthread.dylib             0x00007fff51c089f7 _pthread_wqthread + 220
18  libsystem_pthread.dylib             0x00007fff51c07b77 start_wqthread + 15
)
libc++abi.dylib: terminating with uncaught exception of type NSException
* thread #36, queue = 'realm_write_queue', stop reason = signal SIGABRT
* frame #0: 0x00007fff51b6133a libsystem_kernel.dylib`__pthread_kill + 10
frame #1: 0x00007fff51c0be60 libsystem_pthread.dylib`pthread_kill + 430
frame #2: 0x00007fff51af0b7c libsystem_c.dylib`abort + 120
frame #3: 0x00007fff4f9f7858 libc++abi.dylib`abort_message + 231
frame #4: 0x00007fff4f9e8cbf libc++abi.dylib`demangling_terminate_handler() + 262
frame #5: 0x00007fff50ba8c0b libobjc.A.dylib`_objc_terminate() + 96
frame #6: 0x00007fff4f9f6c87 libc++abi.dylib`std::__terminate(void (*)()) + 8
frame #7: 0x00007fff4f9f6c29 libc++abi.dylib`std::terminate() + 41
frame #8: 0x00007fff50ba8bab libobjc.A.dylib`objc_terminate + 9
frame #9: 0x000000010bb44ea2 libdispatch.dylib`_dispatch_client_callout + 28
frame #10: 0x000000010bb477a3 libdispatch.dylib`_dispatch_continuation_pop + 552
frame #11: 0x000000010bb46bbb libdispatch.dylib`_dispatch_async_redirect_invoke + 771
frame #12: 0x000000010bb56399 libdispatch.dylib`_dispatch_root_queue_drain + 351
frame #13: 0x000000010bb56ca6 libdispatch.dylib`_dispatch_worker_thread2 + 135
frame #14: 0x00007fff51c089f7 libsystem_pthread.dylib`_pthread_wqthread + 220
frame #15: 0x00007fff51c07b77 libsystem_pthread.dylib`start_wqthread + 15

## Steps to Reproduce
See code sample below, but this is basically an intermittent but fairly frequent crash with the architecture we're using for our realm write transactions (so you have to do a couple of write transactions and eventually it happens).

## Code Sample
```
//realm write function
public static func write(_ action: @escaping (_ realm: Realm) throws -> Void) -> Request<Void> {
    return Request(action: { (completion) -> Cancellable in
        let item = DispatchWorkItem {
            autoreleasepool {
                let realm = RealmUtilities.realm //SEE NEXT CODE SNIPPET FOR HOW THIS WORKS
                print("""
                    =============
                    Performing action on thread \(Thread.current)
                    =============
                    """)
                realm.beginWrite() //CRASH OCCURS HERE
                print("print statement immediately after realm.beginWrite()") //NEVER RUNS (if you set breakpoints for all OBJ-C and Swift Errors you will also validate that beginWrite is where we throw the realm thread exception)
                do {
                    try action(realm)
                    try realm.commitWrite()
                    completion(.success(()))
                } catch {
                    realm.cancelWrite()
                    completion(.error(error))
                }
            }
        }
        
        RealmUtilities.writeQueue.async(execute: item) //SEE BELOW CODE SNIPPET FOR HOW THIS WORKS
        return Cancellable { item.cancel() }
    })
}
```

//how we get realm instance
 public static var realm: Realm {
        do {
            print("""
            ~~~~~~~~~~~~~
            Getting realm on thread \(Thread.current)
            ~~~~~~~~~~~~~
            """)
            return try Realm(configuration: realmConfiguration)
        } catch {
            ErrorHandler.shared.handleError(
                "loading Realm",
                error: error,
                alertUser: false)
            DDLogError("Error loading Realm, attempting to recover by nuking the database.")
            nukeRealm()
            do {
                return try Realm(configuration: realmConfiguration)
            } catch {
                //error handling
            }
        }
    }

//utility queue for all write transactions
public class RealmUtilities {
    ...
    static private let writeQueue = DispatchQueue(label: "realm_write_queue", qos: .utility, attributes: .concurrent)
...
    }

## Explanation and observations
Even though the thread on which we get the Realm instance appears to be the same thread on which we perform the write, we still get the "Realm accessed from incorrect thread" crash. Curiously, the crash goes away (or at least I haven't been able to reproduce it yet) if we dispatch to the queue with sync instead of async. Making ALL Realm write transactions synchronous is obviously not ideal, but it may give some insight into the issue.

## Version of Realm and Tooling
Realm framework version: v3.17.3 -> v5.0.1 (even tried downgrading to v4.4.0 and still had this issue, haven't yet tried versions older than that)

Realm Object Server version: N/A

Xcode version: 11.5

iOS/OSX version: 13.5.1

Dependency manager + version: Carthage v0.33.0


OneSman7 commented Jun 9, 2020

I am also seeing these crashes.
Could it be related to queue-confined Realms/notifications?
I use notifications on a separate queue. In RLMCollection.mm I found the function responsible for subscribing, RLMAddNotificationBlock.
If a queue is used, the config from the original Realm is taken (including the original scheduler) and passed into the block on the queue to create another queue-confined Realm:

RLMRealmConfiguration *config = realm.configuration;
    dispatch_async(queue, ^{
        std::lock_guard<std::mutex> lock(token->_mutex);
        if (!token->_realm) {
            return;
        }
        NSError *error;
        RLMRealm *realm = token->_realm = [RLMRealm realmWithConfiguration:config queue:queue error:&error];
        if (!realm) {
            block(nil, nil, error);
            return;
        }
        RLMCollection *collection = [realm resolveThreadSafeReference:tsr];
        token->_token = RLMGetBackingCollection(collection).add_notification_callback(CollectionCallbackWrapper{block, collection, skipFirst});
    });

In the Realm init I see configuration = [configuration copy];, but this only creates a copy of the RLMRealmConfiguration. Then, if a queue is passed, another scheduler is created and assigned to the same Realm::Config:

    Realm::Config& config = configuration.config;

    RLMRealm *realm = [[self alloc] initPrivate];
    realm->_dynamic = dynamic;

    // protects the realm cache and accessors cache
    static std::mutex& initLock = *new std::mutex();
    std::lock_guard<std::mutex> lock(initLock);

    try {
        if (queue) {
            if (queue == dispatch_get_main_queue()) {
                config.scheduler = realm::util::Scheduler::make_runloop(CFRunLoopGetMain());
            }
            else {
                config.scheduler = realm::util::Scheduler::make_dispatch((__bridge void *)queue);
            }
            if (!config.scheduler->is_on_thread()) {
                throw RLMException(@"Realm opened from incorrect dispatch queue.");
            }
        }
        realm->_realm = Realm::get_shared_realm(config);
    }
    catch (...) {
        translateSharedGroupOpenException(error);
        return nil;
    }

So the scheduler gets replaced in the original cached Realm on the original thread too. Then I get a crash on some other operation on that thread, because the new scheduler says the Realm is queue-confined.
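To illustrate why matching `Thread.current` output does not rule out this kind of failure, here is a minimal Foundation-only sketch. It assumes that a dispatch-queue scheduler (the `make_dispatch` branch quoted above) validates the current queue and not the current thread; that is an inference from the quoted code, not something confirmed in this thread. `writeQueueKey` and `isOnWriteQueue()` are hypothetical names and are not part of Realm's API.

```swift
import Foundation

// Hypothetical key used to tag the queue; not part of Realm's API.
let writeQueueKey = DispatchSpecificKey<String>()

// Same label and attributes as the writeQueue shown in the report.
let writeQueue = DispatchQueue(label: "realm_write_queue", qos: .utility, attributes: .concurrent)
writeQueue.setSpecific(key: writeQueueKey, value: "realm_write_queue")

/// Returns true only while running inside a block submitted to `writeQueue`.
func isOnWriteQueue() -> Bool {
    return DispatchQueue.getSpecific(key: writeQueueKey) == "realm_write_queue"
}

// Checked from the calling thread, outside any block on writeQueue.
let callerThread = Thread.current
let outsideQueue = isOnWriteQueue()

// Checked from inside blocks on writeQueue. GCD may run a `sync` block on the
// calling thread, so the thread can match even though the queue context differs.
let insideQueue: Bool = writeQueue.sync { isOnWriteQueue() }
let sameThread: Bool = writeQueue.sync { Thread.current === callerThread }

print("outside block: \(outsideQueue), inside block: \(insideQueue), same thread: \(sameThread)")
```

If the scheduler check is queue-based as assumed, two accesses that print the same `NSThread` can still disagree about whether they are "on the right thread", which would be consistent with the logs in the original report.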

@bmunkholm @jsflax what do you think? I am not a great C++ programmer, so this is just a theory.


OneSman7 commented Jun 9, 2020

I rewrote our helpers to use the same queue to create the Realm, fetch results, and subscribe to notifications, and the crash is gone 🎉
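A workaround along these lines might look roughly like the sketch below. It assumes RealmSwift 5.0 or later (for `Realm(configuration:queue:)`) and a serial queue, which queue-confined Realms are expected to require. `Item`, `QueueConfinedStore`, and the queue label are hypothetical names for illustration, not the actual helpers from this comment.

```swift
import Foundation
import RealmSwift

// Hypothetical model used only for illustration.
final class Item: Object {
    @objc dynamic var name = ""
}

// Hypothetical helper: the Realm is opened, queried, observed, and written
// to on one and the same serial queue.
final class QueueConfinedStore {
    // Serial queue (no .concurrent attribute).
    private let queue = DispatchQueue(label: "realm_store_queue")
    private var realm: Realm?
    private var token: NotificationToken?

    func start(configuration: Realm.Configuration = .defaultConfiguration) {
        queue.async {
            do {
                // Open the Realm confined to `queue`, from a block running on `queue`.
                let realm = try Realm(configuration: configuration, queue: self.queue)
                self.realm = realm

                // Fetch results and subscribe from the same queue-confined Realm.
                self.token = realm.objects(Item.self).observe { changes in
                    switch changes {
                    case .initial(let items):
                        print("Loaded \(items.count) items")
                    case .update(let items, _, _, _):
                        print("Now \(items.count) items")
                    case .error(let error):
                        print("Notification error: \(error)")
                    }
                }
            } catch {
                print("Failed to open Realm: \(error)")
            }
        }
    }

    func add(name: String) {
        queue.async {
            // Reuse the Realm opened on this queue; no second Realm is created.
            guard let realm = self.realm else { return }
            do {
                try realm.write {
                    let item = Item()
                    item.name = name
                    realm.add(item)
                }
            } catch {
                print("Write failed: \(error)")
            }
        }
    }
}
```

Usage would be along the lines of `let store = QueueConfinedStore(); store.start(); store.add(name: "example")`. The point of the design is that no Realm is ever created from another Realm's configuration on a different queue or thread.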

tgoyne (Member) commented Jun 15, 2020

[configuration copy] does a deep copy of the RLMRealmConfiguration object, so the assignments to config.scheduler below should only be modifying the newly created Realm::Config. However, I think I do see what the problem is; [RLMRealm realmWithConfiguration:realm.configuration error:nil] will incorrectly reuse the first Realm's scheduler.
