Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Starting replicator while transaction is open causes deadlock #476

Closed
blaugold opened this issue Apr 25, 2023 · 8 comments
Closed

Starting replicator while transaction is open causes deadlock #476

blaugold opened this issue Apr 25, 2023 · 8 comments
Assignees
Labels
👩‍⚕️ triaged This issue has been verified and moved to JIRA

Comments

@blaugold
Copy link
Contributor

blaugold commented Apr 25, 2023

Calling CBLReplicator_Start while the target database has an open transaction causes a deadlock.

@velicuvlad velicuvlad self-assigned this May 31, 2023
@velicuvlad velicuvlad added 🐛 bug Something isn't working 👩‍⚕️ triaged This issue has been verified and moved to JIRA labels May 31, 2023
@velicuvlad
Copy link
Contributor

Created CBL-4548

@velicuvlad velicuvlad removed the 🐛 bug Something isn't working label May 31, 2023
@jianminzhao
Copy link
Contributor

jianminzhao commented Jun 27, 2023

@blaugold May I ask what is the concrete situation you came to encounter? Your statement is correct; I just wonder whether you wanted to point it out or you have come across this situation in a tricky way.

@blaugold
Copy link
Contributor Author

@jianminzhao I came across this issue in the context of the async API of the Couchbase Lite SDK for Dart/Flutter, that I'm maintaining. The SDK has a fully synchronous API and another corresponding API with the same functionality but async return values. The async API offloads all work to another thread to prevent blocking the main thread.

In async code, it is easy for code to interleave in unexpected ways during execution. The callback of the async Database.inBatch method is asynchronous. So, if some event in an app triggers starting a replicator while the callback of inBatch is waiting for a Future to resolve, it is possible to deadlock the app. I observed this issue in a production app.

I have already implemented a mitigation in the Dart SDK, which is to asynchronously wait for any pending transaction before starting a replicator.

For the sync API, an error is thrown if the user tries to start a replicator from within the inBatch callback. I don't think this is a problem, since I cannot think of a use case where it is necessary or somehow really convenient to be able to do this.

I mostly raised this issue to make the team aware of this limitation. Just documenting it should be fine, IMHO.

@jianminzhao
Copy link
Contributor

@blaugold thanks for your explanation. I think this, "which is to asynchronously wait for any pending transaction before starting a replicator," is a good approach. Being a transaction, it is meant to be a sequestered context. By the way, have you tried to start the replicator asynchronously in a separate thread?
I have updated the doc, https://github.com/couchbase/couchbase-lite-core/blob/7c1082ef121ac96c0a7eb63123113e167942d0f5/C/include/c4Replicator.h#L80. Would it help?

@blaugold
Copy link
Contributor Author

blaugold commented Jul 10, 2023

By the way, have you tried to start the replicator asynchronously in a separate thread?

I was under the impression that it is not officially supported to access a database and related resources from multiple threads. That is why I architected the async API to create one thread per database and schedule all work on that thread. So, no, but now I'm seeing that some operations have the note "This function is thread-safe." :)

I have updated the doc, https://github.com/couchbase/couchbase-lite-core/blob/7c1082ef121ac96c0a7eb63123113e167942d0f5/C/include/c4Replicator.h#L80. Would it help?

I think that extra info would have helped!

@blaugold
Copy link
Contributor Author

So, no, but now I'm seeing that some operations have the note "This function is thread-safe." :)

Ah, that's in LiteCore. I don't think the documentation for the C SDK says anything about a function being thread-safe, no?

@blaugold
Copy link
Contributor Author

The only note about thread safety of the C SDK I could find is this:

And since the CBL API is not itself thread-safe, ...

It would probably be good to make this clear early on in Couchbase Lite docs, not just in this one sentence of the reference. 😉

@jianminzhao
Copy link
Contributor

"Thread safe" means we will not access the database you passed to the the replicator and race with your accesses. To do it, we open a new database connection through that all database accesses are done. This is the very reason you got deadlocked when start the replicator in your transaction.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
👩‍⚕️ triaged This issue has been verified and moved to JIRA
Projects
None yet
Development

No branches or pull requests

3 participants