Realm notification listener crash #4050

Closed
Viktorianec opened this issue Nov 9, 2020 · 26 comments

Comments

@Viktorianec

Realm version: 10.1.1
and some earlier versions (it was reproduced on 10.0.0 too).

It happens unexpectedly for 20-50 users per day (we have around 1k users per day). I can provide only part of the stack trace from Firebase:

Crashed: Realm notification listener
0  Realm                          0x101671e04 long long realm::Array::get<64ul>(unsigned long) const + 4
1  Realm                          0x1013922fc realm::ArrayKeyBase<0>::get(unsigned long) const + 36
2  Realm                          0x101392324 realm::util::FunctionRef<void (realm::BPlusTreeNode*, unsigned long)>::FunctionRef<realm::BPlusTree<realm::ObjKey>::get(unsigned long) const::'lambda'(realm::BPlusTreeNode*, unsigned long)&>(realm::ObjKey&&)::'lambda'(void*, realm::BPlusTreeNode*, unsigned long)::__invoke(void*, realm::BPlusTreeNode*, unsigned long) + 28
3  Realm                          0x10173c638 realm::ConstLstIf<realm::ObjKey>::get(unsigned long) const + 264
4  Realm                          0x1013a3b84 realm::_impl::ListNotifier::run() + 244
5  Realm                          0x1013b9f8c realm::_impl::RealmCoordinator::run_async_notifiers() + 1788
6  Realm                          0x1013b9834 realm::_impl::RealmCoordinator::on_change() + 24
7  Realm                          0x10139345c realm::_impl::ExternalCommitHelper::listen() + 204
8  Realm                          0x1013938b4 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, realm::_impl::ExternalCommitHelper::ExternalCommitHelper(realm::_impl::RealmCoordinator&)::$_0> >(void*) + 52
9  libsystem_pthread.dylib        0x1df23dca8 _pthread_start + 320
10 libsystem_pthread.dylib        0x1df246788 thread_start + 8

Steps to reproduce:
unknown, sorry

@ironage
Contributor

ironage commented Dec 7, 2020

There's a good chance this is the same issue as described in #4175.
The reason is the combination of the notifier being run and ConstLstIf<ObjKey>::get() being triggered. The known (and since fixed) issue here is that the iterator at the ConstLstIf level provides access to invalidated links which are not actually stored there, so it could be that Array::get() asserts if this link was out of bounds at the storage level.

@fealebenpae
Member

@Viktorianec can you tell if your app has been running in the background at the time the crashes happened? A possible cause might be trying to access the realm file after the device is locked and iOS revokes the access to it. Because realm files are memory-mapped we unfortunately do not get useful errors from the operating system but instead hard crashes like that.
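
The point about memory-mapped files can be illustrated with a minimal POSIX sketch. It is not Realm code, and the helper name `compare_mmap_and_read` and the temporary file path are assumptions made for illustration. It contrasts a read through a mapping, which is a plain memory load with no return code, with a `read()` call, which reports failure through its return value and `errno`.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

// Outcome of reading the same 8 bytes in two different ways.
struct ReadDemo {
    long long via_mmap;   // value loaded through the memory mapping
    ssize_t read_result;  // return value of read() on a closed descriptor
    int read_errno;       // errno set by that failed read()
};

// Hypothetical helper for illustration only; not part of Realm.
inline ReadDemo compare_mmap_and_read() {
    char path[] = "/tmp/mmap_demo_XXXXXX";
    int fd = mkstemp(path);
    assert(fd >= 0);

    const long long stored = 42;
    ssize_t written = write(fd, &stored, sizeof stored);
    assert(written == static_cast<ssize_t>(sizeof stored));
    (void)written;

    void* addr = mmap(nullptr, sizeof stored, PROT_READ, MAP_SHARED, fd, 0);
    assert(addr != MAP_FAILED);

    // Closing the descriptor does not tear down an existing mapping.
    close(fd);

    ReadDemo out{};
    // A read through the mapping is an ordinary memory load. If the OS
    // made the backing pages inaccessible, this load would fault with a
    // signal; there is no return value to inspect.
    std::memcpy(&out.via_mmap, addr, sizeof out.via_mmap);

    // A system call on the (now closed) descriptor fails in a way the
    // caller can observe: it returns -1 and sets errno.
    char buf[sizeof stored];
    errno = 0;
    out.read_result = read(fd, buf, sizeof buf);
    out.read_errno = errno;

    munmap(addr, sizeof stored);
    unlink(path);
    return out;
}
```

Calling `compare_mmap_and_read()` returns the stored value through the mapping, while the `read()` on the closed descriptor returns -1 with `EBADF`. Only the second kind of failure can be turned into an error by the caller.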

@r-rebacz

r-rebacz commented Aug 18, 2021

@fealebenpae I'm seeing the same kind of crash with an almost identical stack trace. Is there anything that can be done to avoid it? Most of the crashes happen when the app is in the background (based on Instabug reports), so most likely your assumption about iOS file rights is correct.

@jedelbo
Contributor

jedelbo commented Sep 7, 2021

@r-rebacz Could you please add the actual stack trace you are seeing and also tell us which version of Realm you are using?

@r-rebacz

Thank you @jedelbo for your interest in the topic. We're using v10.7.4.

Crashed: Realm notification listener
SIGSEGV 0x0000000116089ed0
----
Crashed: Realm notification listener
0  Realm                          0x103209124 long long realm::Array::get<64ul>(unsigned long) const + 4
1  Realm                          0x1030b2d64 realm::ArrayKeyBase<0>::get(unsigned long) const + 36
2  Realm                          0x1030b481c realm::util::FunctionRef<void (realm::BPlusTreeNode*, unsigned long)>::FunctionRef<realm::BPlusTree<realm::ObjKey>::get_uncached(unsigned long) const::'lambda'(realm::BPlusTreeNode*, unsigned long)&>(realm::ObjKey&&)::'lambda'(void*, realm::BPlusTreeNode*, unsigned long)::__invoke(void*, realm::BPlusTreeNode*, unsigned long) + 28
3  Realm                          0x1030b47f0 realm::BPlusTree<realm::ObjKey>::get_uncached(unsigned long) const + 64
4  Realm                          0x1030b5798 realm::LnkLst::get_any(unsigned long) const + 80
5  Realm                          0x10345ff30 realm::_impl::ListNotifier::run() + 260
6  Realm                          0x103468bdc realm::_impl::RealmCoordinator::run_async_notifiers() + 3124
7  Realm                          0x103467f2c realm::_impl::RealmCoordinator::on_change() + 24
8  Realm                          0x10344ec4c realm::_impl::ExternalCommitHelper::listen() + 204
9  Realm                          0x10344ede4 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, realm::_impl::ExternalCommitHelper::ExternalCommitHelper(realm::_impl::RealmCoordinator&)::$_0> >(void*) + 52
10 libsystem_pthread.dylib        0x1dc620bfc _pthread_start + 320
11 libsystem_pthread.dylib        0x1dc629758 thread_start + 8

@jedelbo
Contributor

jedelbo commented Oct 26, 2021

@r-rebacz I am sorry for the long delay in responding to this. I am a bit confused about the version of Realm you are using, perhaps because I am not sure if this is happening on Android or on iOS. Anyway, we have made some fixes in this area that are not yet released. I hope they will also fix the issues you are experiencing.

@r-rebacz

r-rebacz commented Nov 4, 2021

@jedelbo thank you for the feedback. It's an iOS app. I mentioned "iOS file rights" in one of my previous comments, but I should probably have been clearer :) Could you please point out the pull requests with the fixes, so I can track when they get released? Thank you in advance.

@jedelbo
Contributor

jedelbo commented Nov 4, 2021

@r-rebacz The confusion comes from the fact that the stack trace does not match v10.7.4. It seems to be a newer version. You can follow realm/realm-swift#7488.

@sync-by-unito

sync-by-unito bot commented Nov 15, 2021

➤ Jørgen Edelbo commented:

We are waiting to see if the new release improves the situation

@bodnar-dan

bodnar-dan commented Jan 7, 2022

We updated to 10.20.0 but we are still able to see the crash. It's an iOS app, still using the objc version of Realm.

In addition to what was already said above: this crash happens for only one RLMObject (out of a total of 88 Realm objects that we have), and only when it is added to Realm (not when it is updated or deleted). The object owns 3 other objects (2 of which are RLMArrays), each with its own children, but the structure is not complicated and the arrays hold at most 15 elements. Neither this object nor its children own any RLMEmbeddedObjects. The bottom line is that we have other objects with a much more complex structure than this one.

The creation rate is also low, usually 1 per user session. I don't see any issue with the object or its types and we're not doing anything fancy with it, but it's intriguing that it happens only with this object.

It happens randomly. We weren't able to catch the crash with the debugger; we only know what we see in Crashlytics.

This is the stack from the thread that is crashing.

0    Realm                                    0x102da1778     long long realm::Array::get<64ul>(unsigned long) const + 4
1    Realm                                    0x102c4e5c4     realm::ArrayKeyBase<0>::get(unsigned long) const (array_key.hpp:90)
2    Realm                                    0x102c507bc     realm::util::FunctionRef<void (realm::BPlusTreeNode*, unsigned long)>::FunctionRef<realm::BPlusTree<realm::ObjKey>::get_uncached(unsigned long) const::'lambda'(realm::BPlusTreeNode*, unsigned long)&>(realm::ObjKey&&)::'lambda'(void*, realm::BPlusTreeNode*, unsigned long)::__invoke(void*, realm::BPlusTreeNode*, unsigned long) (function_ref.hpp:103)
3    Realm                                    0x102c50790     realm::BPlusTree<realm::ObjKey>::get_uncached(unsigned long) const (bplustree.hpp:379)
4    Realm                                    0x102c9d7c8     realm::LnkLst::get_any(unsigned long) const (list.hpp:904)
5    Realm                                    0x102fec4fc     realm::_impl::ListNotifier::run() + 259
6    Realm                                    0x102ff5228     realm::_impl::RealmCoordinator::run_async_notifiers() + 3207
7    Realm                                    0x102ff4524     realm::_impl::RealmCoordinator::on_change() + 23
8    Realm                                    0x102fd83c4     realm::_impl::ExternalCommitHelper::listen() + 203
9    Realm                                    0x102fd84e4     void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, realm::_impl::ExternalCommitHelper::ExternalCommitHelper(realm::_impl::RealmCoordinator&)::$_0> >(void*) + 51
10   libsystem_pthread.dylib                  0x1f20bb9a4     _pthread_start + 147
11   libsystem_pthread.dylib                  0x1f20baea0     thread_start + 7

Or similar ones

0    Realm                                    0x102361700     long long realm::Array::get<16ul>(unsigned long) const + 4
1    Realm                                    0x10220e5c4     realm::ArrayKeyBase<0>::get(unsigned long) const (array_key.hpp:90)
...

And this is what's always happening on the thread that is triggering the save (not sure if this is relevant in any way).

0    libsystem_kernel.dylib                   0x1b8945f90     _psynch_cvwait + 8
1    libc++.1.dylib                           0x199ce6ddc     $std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) + 27
2    Realm                                    0x1025a5420     realm::_impl::NotifierPackage::package_and_wait(realm::util::Optional<unsigned long long>) + 235
3    Realm                                    0x1025c1094     realm::_impl::transaction::begin(std::__1::shared_ptr<realm::Transaction> const&, realm::BindingContext*, realm::_impl::NotifierPackage&) + 1363
4    Realm                                    0x1025b6cac     realm::_impl::RealmCoordinator::promote_to_write(realm::Realm&) + 291
5    Realm                                    0x102644dd0     realm::Realm::begin_transaction() + 147
6    Realm                                    0x10232ef04     -[RLMRealm beginWriteTransactionWithError:] (RLMRealm.mm:644)
7    Realm                                    0x10232f204     -[RLMRealm transactionWithoutNotifying:block:error:] (RLMRealm.mm:692)
8    Realm                                    0x10232f194     -[RLMRealm transactionWithBlock:error:] (RLMRealm.mm:684)
...

It's a bit frustrating, as it happens to quite a few users per day and it completely breaks their app experience.

@jedelbo
Contributor

jedelbo commented Jan 10, 2022

@tgoyne Based on your experience with notifiers, does this ring a bell? This seems to be happening when an object containing a list is created. Could it be that the list object is not properly transferred to the notification transaction?

@tgoyne
Member

tgoyne commented Jan 10, 2022

The crashing line here is https://github.com/realm/realm-core/blob/master/src/realm/object-store/impl/list_notifier.cpp#L100

We check if the List is valid at the start of that function (!m_list || !m_list->is_attached()), and any sort of bug in the handover process should result in it just taking the list-was-deleted code path. There's also a call to size() before this, which must have returned a non-zero value to hit this location.

@ironage
Contributor

ironage commented Jan 10, 2022

Are the apps that are crashing using Realm sync?
I'm just realizing that the indices reported to the replication for notifications are the full set including unresolved links, and this section of code is using LnkLst which would then translate an index incorrectly to something out of bounds.
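
The hypothesis above can be sketched with a toy model. This is not realm-core code: `ToyLinkList`, its members, and the convention that a negative key marks an unresolved link are all assumptions made for illustration. The model only shows how an index that counts unresolved entries, when fed into a translation that expects an index over the visible entries, can land past the end of the storage.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model (hypothetical, not realm-core): raw storage may contain
// unresolved entries that the public, "visible" view of the list hides.
struct ToyLinkList {
    std::vector<long long> storage;  // negative key = unresolved (toy convention)

    std::size_t raw_size() const { return storage.size(); }

    std::size_t visible_size() const {
        std::size_t n = 0;
        for (long long key : storage) {
            if (key >= 0) ++n;
        }
        return n;
    }

    // Translate an index over visible entries into a raw storage index by
    // skipping unresolved entries. Returns raw_size() when the visible
    // index is past the end, i.e. an out-of-bounds raw index.
    std::size_t visible_to_raw(std::size_t visible) const {
        std::size_t raw = 0;
        for (; raw < storage.size(); ++raw) {
            if (storage[raw] < 0) continue;  // hidden entry, skip it
            if (visible == 0) return raw;
            --visible;
        }
        return raw;
    }

    // Checked raw read. An unchecked read with a bad index would touch
    // memory outside the array instead of tripping this assertion.
    long long raw_get(std::size_t raw) const {
        assert(raw < storage.size());
        return storage[raw];
    }
};
```

With `ToyLinkList list{{10, -1, 20, 30}}`, the last element sits at raw index 3 but at visible index 2. A change reported at raw index 3 and then passed to `visible_to_raw` as if it were a visible index yields 4, which equals `raw_size()` and is therefore out of bounds.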

@tgoyne
Member

tgoyne commented Jan 10, 2022

Oh, notification bugs related to unresolved links would be pretty unsurprising, and also would explain why it's only happening on one object (if that object is just the only one with a list with an unresolved link). This hopefully just requires fixing the index in Replication::list_set() etc. then? I assume we need to pass the raw index to sync replication so we can't adjust it earlier.

@jedelbo
Contributor

jedelbo commented Jan 12, 2022

It should be noted that the client cannot insert unresolved links, so this is probably not the problem here. But I agree that there are problems in handling replication of unresolved links. Created #5164.

@bodnar-dan

Are the apps that are crashing using Realm sync?

No. We're using Realm just for persisting data locally.

@bodnar-dan

I see that issue #5164 was merged, but it didn't make it into 11.9.0.
Is there a workaround for this, or something we can double-check? It's getting critical for us, as some users are facing repeated crashes. Internally, we have still been unable to reproduce the crash, with or without the debugger attached.

We have simplified the object structure as much as we could and removed properties that were not important. It now has only one RLMArray property (with a max size of 15 RLMObjects) and an NSData property (which is fairly small, around 5k bytes). The crash still happens.

@jedelbo
Contributor

jedelbo commented Feb 22, 2022

If you are not using sync, #5164 is not relevant for you. We will try to find out what we can do to find the root cause of this problem.

@bodnar-dan

I saw in the Realm documentation and in this thread, realm/realm-swift#7164, that it's recommended to use GCD rather than Threads for doing background work.
We do have one Thread that performs a specific task whenever an object of the type that is causing the crash is added to Realm. We use this approach: https://academy.realm.io/posts/realm-notifications-on-background-threads-with-swift/ to add the notification block to an RLMResults on the background thread (we also lower the threadPriority to 0.2 and set qualityOfService = .utility).
Could this be a possible cause of the issue?

@jedelbo
Contributor

jedelbo commented Feb 28, 2022

@tgoyne can you comment on the above?

@tgoyne
Member

tgoyne commented Mar 1, 2022

There isn't any obvious reason why that would cause problems.

@bodnar-dan

I can confirm the background thread execution is not at fault. We disabled the background task behind a flag and the crash is still happening.
We really need some help understanding the crash better, because I think the key here is that it happens with only one object.
Could this happen because of a poorly implemented notification block? Or does the crash happen before those are even called? We added more logs in our latest build, but it doesn't seem that the notification blocks are getting called.
Could a Realm migration on the client cause this? Maybe something we didn't handle properly?
We did see an increase in the crash rate when a new feature was released. This feature added a number of new properties (RLMObjects) to the owner of the object that crashes when added. Could this be an issue? Maybe the owner has too many child objects? The owner is the main account object, so it basically owns everything the user does in the app. But that doesn't explain why the crash happens only when one particular child is added.

@bodnar-dan

One thing I saw in the latest release: in certain situations we get a different stack trace. In this scenario the crash happens inside a notification block, when updating the same object. Specifically, it happens when we iterate over the modifications and build a list of the changed objects (we are aware that the modifications refer to indices in the old results, and we do map them to the new results).
So in this case, the object was saved, but the app crashed when a field on it was updated.
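
For readers unfamiliar with the old-to-new index mapping mentioned above, here is a minimal sketch of one way to do it. It is not the reporter's code and not a Realm API: the function name `map_old_to_new` is hypothetical, and it assumes that deletions are given as sorted indices into the old version and insertions as sorted indices into the new version, with moves ignored. Check the documentation of the binding you use for the exact index semantics.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper: map an index in the old version of a collection to
// its index in the new version. Returns std::nullopt if the element was
// deleted. Assumes `deletions` holds ascending old-version indices and
// `insertions` holds ascending new-version indices. Moves are not handled.
inline std::optional<std::size_t> map_old_to_new(
    std::size_t old_index,
    const std::vector<std::size_t>& deletions,
    const std::vector<std::size_t>& insertions) {
    assert(std::is_sorted(deletions.begin(), deletions.end()));
    assert(std::is_sorted(insertions.begin(), insertions.end()));

    // The element itself was removed, so it has no new index.
    if (std::binary_search(deletions.begin(), deletions.end(), old_index)) {
        return std::nullopt;
    }

    // Shift left by the number of deleted elements that preceded it.
    auto removed_before =
        std::lower_bound(deletions.begin(), deletions.end(), old_index) -
        deletions.begin();
    std::size_t index = old_index - static_cast<std::size_t>(removed_before);

    // Shift right for every insertion that lands at or before the
    // element's current position in the new version.
    for (std::size_t inserted_at : insertions) {
        if (inserted_at > index) break;
        ++index;
    }
    return index;
}
```

For an old collection of five elements with deletions {1, 3} and insertions {0, 2}, old index 2 maps to new index 3, and old index 1 maps to no index because it was deleted. Using an unmapped old index against the new results after the collection has shrunk is one way to end up reading past the end.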

Crashed: com.apple.main-thread
EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x0000000122142ff8
0  Realm                          0x154d68 long long realm::Array::get<64ul>(unsigned long) const + 4
1  Realm                          0xdb4c realm::ArrayKeyBase<0>::get(unsigned long) const + 36
2  Realm                          0xfd44 realm::util::FunctionRef<void (realm::BPlusTreeNode*, unsigned long)>::FunctionRef<realm::BPlusTree<realm::ObjKey>::get_uncached(unsigned long) const::'lambda'(realm::BPlusTreeNode*, unsigned long)&>(realm::ObjKey&&)::'lambda'(void*, realm::BPlusTreeNode*, unsigned long)::__invoke(void*, realm::BPlusTreeNode*, unsigned long) + 28
3  Realm                          0xfd18 realm::BPlusTree<realm::ObjKey>::get_uncached(unsigned long) const + 64
4  Realm                          0x60e24 realm::LnkLst::get_object(unsigned long) const + 32
5  Realm                          0x61128 realm::ObjList::try_get_object(unsigned long) const + 100
6  Realm                          0x23fe38 realm::Query::do_find_all(realm::TableView&, unsigned long) const + 156
7  Realm                          0x2cee24 realm::TableView::do_sync() + 652
8  Realm                          0x3f0c98 realm::Results::ensure_up_to_date(realm::Results::EvaluateMode) + 456
9  Realm                          0x3f0590 realm::util::Optional<realm::Obj> realm::Results::try_get<realm::Obj>(unsigned long) + 48
10 Realm                          0x3f0498 realm::Obj realm::Results::get<realm::Obj>(unsigned long) + 80
11 Realm                          0x2ced8 RLMAccessorContext realm::Results::dispatch<auto realm::Results::get<RLMAccessorContext>(RLMAccessorContext&, unsigned long)::'lambda'(RLMAccessorContext&)>(RLMAccessorContext&) const + 380
12 Realm                          0x28cb8 auto realm::Results::get<RLMAccessorContext>(RLMAccessorContext&, unsigned long) + 36
13 Realm                          0x12b3b8 -[RLMResults objectAtIndex:] + 52

@sync-by-unito

sync-by-unito bot commented Jun 14, 2022

➤ Jørgen Edelbo commented:

We cannot find an explanation for why you see these crashes, so there is clearly something we don't know about your use case. Without a minimal reproduction case, I don't think there is a way we can proceed with this.

@nicola-cab
Member

Hello, this is quite an old thread. Do we have some way to reproduce this? Do we know if there was a migration that could have changed things? With some vague idea of how this could have happened, I could try to reproduce it in my environment. @bodnar-dan ...

@sync-by-unito sync-by-unito bot closed this as completed Sep 7, 2022
@sync-by-unito

sync-by-unito bot commented Sep 7, 2022

➤ Nicola Cabiddu commented:

Closing this issue, because it is more than 1y old, and we have no clear way to reproduce it. It seems a migration could have been responsible for it, but without any further information, there is no way for us to tackle and fix the problem.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 21, 2024