Intermittent Crash when using realm on background thread (v5.3.1) #6659

Closed
apptekstudios opened this issue Jul 20, 2020 · 25 comments · Fixed by #6714

Comments

@apptekstudios

apptekstudios commented Jul 20, 2020

Goals

Write to realm from background thread without crash

Steps for others to Reproduce

Occasionally (and seemingly randomly) realm will throw an exception for "Realm accessed from incorrect thread." during beginWriteTransaction.

We are creating a new realm instance on the current thread and it is within an autorelease pool. When this exception is occurring the doWrite closure is not reached (the exception occurs at beginWriteTransaction).

We have also tried using the new queue-confined Realm init but experienced the same crash, also randomly.

After extensive debugging it seems that something must be going wrong with the thread check, or the underlying cache is returning the wrong instance intermittently.

Any ideas on a potential solution?
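
The code sample for this report was attached as a screenshot and is not reproduced here. As a rough sketch of the pattern described above (a fresh Realm opened on the background thread, inside an autorelease pool, with the write done in a closure), something like the following can be assumed. The helper name `writeInBackground`, its `configuration` parameter and the queue choice are hypothetical; only `doWrite` is a name taken from the report.

```swift
import Foundation
import RealmSwift

/// Hypothetical helper: runs `doWrite` inside a write transaction on a Realm
/// that is opened on the same background thread that uses it.
func writeInBackground(configuration: Realm.Configuration = .defaultConfiguration,
                       doWrite: @escaping (Realm) -> Void) {
    DispatchQueue.global(qos: .utility).async {
        autoreleasepool {
            do {
                // Opened here, so the instance is confined to this thread.
                let realm = try Realm(configuration: configuration)
                // `write` begins the transaction; per the report, the
                // exception is raised at this point, before `doWrite` runs.
                try realm.write {
                    doWrite(realm)
                }
            } catch {
                print("Realm write failed: \(error)")
            }
        }
    }
}
```

With this shape the Realm never leaves the thread that opened it, which is why an "incorrect thread" exception here points at the thread check or the instance cache rather than at the calling code.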

Code Sample

image

Version of Realm and Tooling

Dependency manager: SPM

Realm-cocoa 5.3.1
Realm 5.3.1
Realm-core 6.0.11

Xcode version: 11.5
iOS/OSX version: 13.5

@javruben

Hi, I'm on the same project as @apptekstudios. FWIW I reverted realm to the latest 4.4.x and there it seems to not crash.

@starchey

We had similar issues after updating to Realm 5.3. Passing the dispatchQueue into the Realm init call, let realm = try! Realm(queue: dispatchQueue), helped resolve these crashes on our end.

@apptekstudios
Author

> We had similar issues after updating to Realm 5.3. Passing the dispatchQueue into the Realm init call, let realm = try! Realm(queue: dispatchQueue), helped resolve these crashes on our end.

This initialiser seemed to avoid crashes while on our dispatch queue. However, when using it there, we then got random crashes when reading or writing (to a new Realm) on any other thread (including main), even though the queue for those Realms was set to nil (or not set at all).
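
For readers unfamiliar with the queue-confined initialiser discussed here, a minimal sketch follows. `Realm(configuration:queue:)` is the RealmSwift initialiser; the queue label, the helper name `onRealmQueue` and its parameters are hypothetical. The queue is assumed to be serial, and the Realm is only touched from blocks dispatched to it.

```swift
import Foundation
import RealmSwift

// A serial queue (the default for DispatchQueue(label:)); hypothetical label.
let realmQueue = DispatchQueue(label: "example.realm-queue")

/// Hypothetical helper: opens a Realm confined to `realmQueue` and hands it
/// to `work`, all from a block running on that queue.
func onRealmQueue(configuration: Realm.Configuration = .defaultConfiguration,
                  _ work: @escaping (Realm) throws -> Void) {
    realmQueue.async {
        autoreleasepool {
            do {
                let realm = try Realm(configuration: configuration, queue: realmQueue)
                try work(realm)
            } catch {
                print("Realm error: \(error)")
            }
        }
    }
}
```

A call site would look like `onRealmQueue { realm in try realm.write { /* mutate objects */ } }`. Realms opened elsewhere without a `queue` argument remain confined to the thread that opened them.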

@tristangrichard

Experiencing the same on main.async

@niralishaha25

Facing same crashes frequently on Realm 5.3.1 but not reproducible

@ejm01
Contributor

ejm01 commented Jul 24, 2020

Thanks for the info, @apptekstudios. Could you share the full log or stack trace when this happens?

For

> However when using it there we were then getting random crashes when reading or writing (to a new Realm) on any other thread (including main)

Does this mean any new realm, including the default realm, or only realms with a dispatch queue included in the config?

Also if anyone can share a sample project that would help immensely.

@duncangroenewald

Not sure if this is related but we are testing an upgrade from 4.4.1 to 5.2.0 and experienced some weird issues after upgrading the database (macOS 10.15.6)

Here is one example:

image

This has to be changed to call back on the main thread or we get a crash - never had any issues with 4.4.1

image

Why is that - we do lots of stuff on background threads and make extensive use of OperationQueues for running reports and complex long running queries and never had any issues with 4.4.1.

And then many realm queries are returning nil. For example:

let items = realm.objects(Item.self)

// This works fine and returns the correct list of items
for item in items {
    print("item: \(item.id), \(item.name)")
}

// Take the primary key of one of the items printed above
let id = items[0].id

// But this fails and returns nil
let item = realm.object(ofType: Item.self, forPrimaryKey: id)
print("item: \(String(describing: item?.id)), \(String(describing: item?.name))")

// And so does this
let filteredItem = realm.objects(Item.self).filter("id == %@", id).first

I just retested using the same database file and RealmSwift 4.4.1 and everything works perfectly fine.
I will try with the latest 5.3.x now to see if there is anything fixed but it seems something is badly broken here.

I am happy to give you access to the full application repository if necessary to help find and fix this issue - but for now it seems we better hold off any attempts to upgrade to using RealmSwift 5.x.x

Will keep testing on this side as it is possible we may have done something to the data that causes queries to return nil now!?

@tgoyne
Member

tgoyne commented Aug 1, 2020

asyncOpen() now requires a serial dispatch queue because the Realm it gives you is now confined to that queue rather than the thread it happens to be called on.

object(ofType:forPrimaryKey:) not working correctly on upgraded files has been reported (#6672) but I'm not sure we have a repro case yet.

@duncangroenewald

duncangroenewald commented Aug 1, 2020

Hmm, the weird thing is we have been using 5.2.0 for local testing for some time using the Package Manager (no sync component) and I don't recall ever seeing this failure of the queries occur before. We just set up a test instance of Realm Cloud and migrated a copy of the data to that instance for testing the upgrade in a cloud sync environment. The tests above were run with a backup of the newly created cloud realm, using the backup file as a local realm.

The only difference I can see is that now the migration was done using RealmSwift with the sync components - previously we have been using the Package Manager version on macOS 11 which does not have the sync component because the binaries for them are not available yet.

Let me know if you need a repro case. I am going to try opening the same file with the macOS11 version to see if the same query failures occur. Then I will migrate to 5.2 using the macOS11 version and try opening the migrated file with the 5.2 version with the sync components (the binaries you (Realm) publish) to see if the problem still occurs.

Not sure if that will be helpful at all.

We only use asyncOpen() once at startup where everything is initialised in series anyway so that's probably minimal impact.

@duncangroenewald

I just opened the 4.4.1 realm file with the macOS11 (RealmSwift 5.2.0 Package Manager version) version and the file was upgraded and realm queries seem to be fine. Not using asyncOpen() for this at all since no sync. However when closing the app I got this crash - which has been occurring infrequently in the past with this version.

image

@duncangroenewald

So given the above issues are we able to get a binary build for Xcode 12/macOS11, etc. for version 4.4.1 since it seems to be a more stable version for production use - obviously only when macOS11 goes GA.

@duncangroenewald

duncangroenewald commented Aug 1, 2020

Oh here is another slightly different crash when closing the app. There are a bunch of background tasks running when the app gets shut down - not sure if these crashes have anything to do with that or perhaps different macOS behaviour on macOS11 beta (yes on THAT version of hardware!).

image

@duncangroenewald

I can confirm that if I upgrade to 5.2.0 using the macOS11 version, then make a backup of the realm file and open that file with the macOS10.15.6 version (using the Realm-provided 5.2.0 binaries), the problem of realm queries returning nil does not occur.

So it seems that the migration with the Realm-provided 5.2.0 binaries is causing a problem. I have not yet checked whether the crashes above are limited to the macOS11 version or whether they also occur on macOS10.15 with the Realm binaries.

@leemaguire
Contributor

@duncangroenewald does updating to 5.3.3 solve your issue?

@jpstern

jpstern commented Aug 11, 2020

FWIW, I rolled back from 5.3.3 to 4.3.2 without any other code changes and the crashes completely disappeared

@duncangroenewald

> @duncangroenewald does updating to 5.3.3 solve your issue?

I will give it a try and get back to you.

@duncangroenewald

It's looking promising on the test application so far using either the Package Manager version or the 5.3.3 binary with a local realm file - I did get one crash on startup with an invalid thread issue but haven't been able to reproduce that. I will try using 5.3.3 for a while on the full application with Realm Cloud sync to see how that works out.

BTW can anyone tell me whether there is any impact of using 5.3.3 with the Realm Cloud service with one client application while other client applications are still using V4.4.1. The schema is identical for all.

I am working on the assumption that the 5.3.3 client changes nothing on the Realm Cloud service that can affect v4.4.1 clients.

@duncangroenewald

I just did a build with 5.3.3 and tested connecting to an existing 4.4.1 file and Realm Cloud - it seems that main thread queries work but all the reports are still failing with nil query results. These are running on background threads but I am not sure that is relevant - it seems more likely there is some problem with the database format upgrade process since the next test, described below, does not appear to suffer the same issue.

I did a second test with no existing client files - I deleted the entire realm-object-server directory - and forced the client to reinitialise by logging in to Realm Cloud from scratch. It now appears that the reports are working just fine - obviously we need to do quite a bit more testing to confirm there are no further issues, but this is a big step forward. I never tried this with 5.2.x btw.

Fortunately it's no big deal for us to manually reinitialise the client applications so they rebuild the local realm file, but I am sure for others that is probably not the case.

@duncangroenewald

Well I just had another exception "realm accessed from invalid thread" or something like that but haven't been able to reproduce that yet. Something strange is definitely going on if this happens one time and next time it doesn't. I'll keep trying to see if I can reliably reproduce this crash. All the above is macOS10.15/Xcode 11.6.

@duncangroenewald

Weird - after crashing almost immediately the first two times I have been unable to get another crash for the past 5 minutes despite doing multiple tasks, all of which trigger multiple background processes that are accessing the realm database. Is it possible the realm verify thread calls are getting something wrong under particular circumstances ?

@duncangroenewald

OK - finally managed to trigger another crash doing nothing different than previously:

2020-08-13 13:20:43.308789+1000 MakeSpace[9224:684438] *** Terminating app due to uncaught exception 'RLMException', reason: 'Realm accessed from incorrect thread.'
*** First throw call stack:
(
0 CoreFoundation 0x00007fff35e78b57 __exceptionPreprocess + 250
1 libobjc.A.dylib 0x00007fff6ecbf5bf objc_exception_throw + 48
2 Realm 0x0000000102391be5 -[RLMRealm verifyThread] + 70
3 Realm 0x000000010232ed1b RLMGetObjects + 80
4 RealmSwift 0x0000000102bee120 $s10RealmSwift0A0V7objectsyAA7ResultsVyxGxmAA6ObjectCRbzlF + 112
5 MakeSpace 0x000000010071e887 $s9MakeSpace10AssortmentC13assortmentFor2id5realmACSgSS_10RealmSwift0H0VtFZAGyXEfU_ + 87
6 MakeSpace 0x000000010071c753 $s9MakeSpace10AssortmentCSgs5Error_pIgozo_ADsAE_pIegrzo_TR + 19
7 MakeSpace 0x000000010071ea04 $s9MakeSpace10AssortmentCSgs5Error_pIgozo_ADsAE_pIegrzo_TRTA.31 + 20
8 libswiftObjectiveC.dylib 0x00007fff6fb1bd8e $s10ObjectiveC15autoreleasepool8invokingxxyKXE_tKlF + 46
9 MakeSpace 0x000000010071e7d4 $s9MakeSpace10AssortmentC13assortmentFor2id5realmACSgSS_10RealmSwift0H0VtFZ + 244
10 MakeSpace 0x0000000100460628 $s9MakeSpace14AssortmentNodeC11updateAsync5realmy10RealmSwift0H0V_tF + 440
11 MakeSpace 0x00000001000f96d6 $s9MakeSpace19NodeUpdateOperationC14performUpdates33_B96001338E3AF9F5624F06ED287A4FA8LL3for10completionys10ArraySliceVyAA04BaseC0CG_yyctFyycfU_ + 806
12 MakeSpace 0x00000001000fa4af $s9MakeSpace19NodeUpdateOperationC14performUpdates33_B96001338E3AF9F5624F06ED287A4FA8LL3for10completionys10ArraySliceVyAA04BaseC0CG_yyctFyycfU_TA + 47
13 MakeSpace 0x0000000100049e10 $sIeg_IeyB_TR + 48
14 libdispatch.dylib 0x0000000102d23844 _dispatch_call_block_and_release + 12
15 libdispatch.dylib 0x0000000102d24826 _dispatch_client_callout + 8
16 libdispatch.dylib 0x0000000102d26e8d _dispatch_queue_override_invoke + 1038
17 libdispatch.dylib 0x0000000102d38391 _dispatch_root_queue_drain + 334
18 libdispatch.dylib 0x0000000102d38e03 _dispatch_worker_thread2 + 127
19 libsystem_pthread.dylib 0x0000000102db231b _pthread_wqthread + 220
20 libsystem_pthread.dylib 0x0000000102db149b start_wqthread + 15
)
libc++abi.dylib: terminating with uncaught exception of type NSException
(lldb)

@duncangroenewald

That crash is inside this function

DispatchQueue.global().async {
    guard let realm = GlobalVars.shared.realm else {
        completion()
        return
    }

    for node in segment {
        node.updateAsync(realm: realm)

        self.processedCount += 1
        if self.isCancelled { break }
    }

    completion()
}
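
One thing worth ruling out in a function like this: the thread does not show how `GlobalVars.shared.realm` is implemented, so if it returns an instance that was opened on another thread (an assumption, not something confirmed here), using it on a global queue would raise this exception regardless of any Realm bug. A minimal sketch of the usual alternative is to pass a `Realm.Configuration` across the thread boundary and open the Realm inside the block. `Node` and `processSegment` are hypothetical stand-ins for the app's own types.

```swift
import Foundation
import RealmSwift

// Hypothetical stand-in for the app's node type.
protocol Node {
    func updateAsync(realm: Realm)
}

/// Hypothetical variant of the function above: the Realm is opened on the
/// worker thread that uses it instead of being fetched from shared state.
func processSegment(_ segment: [Node],
                    configuration: Realm.Configuration,
                    completion: @escaping () -> Void) {
    DispatchQueue.global().async {
        autoreleasepool {
            // Only the configuration crosses threads; the Realm instance
            // is created and used on this thread alone.
            if let realm = try? Realm(configuration: configuration) {
                for node in segment {
                    node.updateAsync(realm: realm)
                }
            }
        }
        // Called exactly once, whether or not the Realm could be opened.
        completion()
    }
}
```

If the shared accessor already opens a new Realm per call, this changes nothing and the intermittent exception would have to come from elsewhere.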

@duncangroenewald

Let me know if you want me to run some longer term testing - it is easy to setup some repeating tests that will run multiple concurrent background tasks and leave it running to see if it will eventually crash.

Currently, manually switching to different parts of the app and triggering tasks seems to result in very infrequent crashes, so it's going to be difficult to reliably reproduce this way. We must have upward of 12 concurrent background tasks when doing this and so far only three crashes. I could be mistaken, but it seems it might only be crashing when a new realm is opened and accessed for the first time. Some of the reports take 20 minutes or more to run on 8 concurrent threads, and so far I have not encountered one crashing after it has started running.

@duncangroenewald

Since my last post we have been running a series of test cases and using the application in every way imaginable and have not experienced another crash yet so this could be a hard one to reliably reproduce !

@tgoyne
Member

tgoyne commented Aug 20, 2020

I've successfully reproduced a case which would cause spurious incorrect thread exceptions and am working on a fix.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 17, 2024