Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support multiple iterators in read-write transactions. #1286

Merged

Conversation

elliotcourant
Copy link
Contributor

@elliotcourant elliotcourant commented Mar 31, 2020

This adds support for multiple iterators during a read-write transaction. But iterators created in a read-write transaction will only be able to see writes that were performed before the iterator was created. Any writes that occur after the iterator is created will be invisible to the iterator.

This references an issue I opened previously: #981

I had closed this issue as I was able to work with a single iterator but I'm not trying to read multiple sequences of keys and values from different areas in the database at the same time, and the most efficient way to do this seems to be with multiple iterators.

Fixes #981


This change is Reviewable

@CLAassistant
Copy link

CLAassistant commented Mar 31, 2020

CLA assistant check
All committers have signed the CLA.

@elliotcourant elliotcourant force-pushed the feature/multiple-iterators branch from 2ee9ebb to baee8a9 Compare March 31, 2020 21:06
@jarifibrahim
Copy link
Contributor

Hey @elliotcourant, Thank you so much for the PR. The change looks like a simple change. I wonder why were multiple iterators disabled in Read-Write transactions. Maybe @manishrjain would know.

Copy link
Contributor

@jarifibrahim jarifibrahim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Looks good to me but I'll wait for @manishrjain's approval.

Reviewable status: 0 of 2 files reviewed, 3 unresolved discussions (waiting on @ashish-goswami, @elliotcourant, @jarifibrahim, and @manishrjain)


db_test.go, line 768 at r1 (raw file):

}

func TestIterateParallel(t *testing.T) {

It would also be nice if you add a test for multiple iterators in a RW Transaction.


db_test.go, line 815 at r1 (raw file):

		wg.Wait()

		// Check that a RW txn can't run multiple iterators.

This comment should be updated.


db_test.go, line 829 at r1 (raw file):

		// Run multiple iterators for a RO txn.
		txn = db.NewTransaction(false)

You could change this so that all the transactions are read-write (or maybe create a new test for RW transactions)

@elliotcourant
Copy link
Contributor Author

Thank you very much for the feedback, I will make those changes!

@elliotcourant
Copy link
Contributor Author

I think change it so that all transactions would be RW would be a bit of a large change in this MR. I will add tests for creating and using multiple iterators in a RW transaction though.

@jarifibrahim
Copy link
Contributor

jarifibrahim commented Apr 13, 2020

@elliotcourant
Copy link
Contributor Author

@jarifibrahim If you are iterating in multiple go routines then you are absolutely right. I don't think adding a mutex to appending that array would be very bad. I would think that most iterators are in a single thread to begin with, or (at least as of right now) are using read only transactions for multiple iterators. So adding a mutex would not impact any existing behavior.
As for the performance implications I'm not sure. If the performance of a mutex is a concern then I could try to find another solution?

@elliotcourant elliotcourant force-pushed the feature/multiple-iterators branch 2 times, most recently from 4ec5256 to 44a8421 Compare April 24, 2020 18:26
@elliotcourant
Copy link
Contributor Author

Just pushed the updates from the code review and squashed my commits.

@jarifibrahim I added a mutex around the addKey method, which should make concurrent reads safe. However it is still currently possible that if one were to try to perform a read on thread A and try to commit on thread B at the same time that a race condition could occur? As thread B would be accessing the read array without a lock while thread A might try to append the read array.

I don't see that happening very often, and it's certainly possible that it could happen now? Since there is nothing stopping a user from having a single iterator on a different thread and trying to commit at the same time.

I added a sub test to make sure that multiple concurrent iterators are working with a read-write transaction.

Another option to support more "lockless" reads with multiple iterators could be to store the keys read on the iterator itself, and then only when the iterator is closed will it try to push them to the read array on the transaction? This way we could have the commit action close all the active iterators before checking the read array, pushing the work to be done at the commit time, rather than as the reads are being performed?

This adds support for multiple iterators during a read-write transaction. But iterators created in a read-write
transaction will only be able to see writes that were performed before the iterator was created. Any writes that occur
after the iterator is created will be invisible to the iterator.

This references an issue I opened previously: hypermodeinc#981

Added a test case for using multiple concurrent iterators when inside a
read-write transaction. Added a mutex around the reads array on the
transaction to make sure that it is thread safe when it is being
modified by potentially concurrent iterators.
@elliotcourant elliotcourant force-pushed the feature/multiple-iterators branch from 44a8421 to 7212640 Compare April 24, 2020 18:33
Copy link
Contributor Author

@elliotcourant elliotcourant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 3 files reviewed, 3 unresolved discussions (waiting on @ashish-goswami, @jarifibrahim, and @manishrjain)


db_test.go, line 768 at r1 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

It would also be nice if you add a test for multiple iterators in a RW Transaction.

Done.


db_test.go, line 815 at r1 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

This comment should be updated.

Done.


db_test.go, line 829 at r1 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

You could change this so that all the transactions are read-write (or maybe create a new test for RW transactions)

I think that might not be easy right now (to change it that is), but I did add a sub-test for making sure RW can use multiple iterators.

@elliotcourant
Copy link
Contributor Author

@jarifibrahim @manishrjain Is there any more feedback you have on this PR? Or are there any further changes that you think need to be made?

@jarifibrahim
Copy link
Contributor

jarifibrahim commented May 12, 2020

Hey @elliotcourant, I've made some minor changes to your PR and I'll get this merged today. Thanks for your contribution 👍

@jarifibrahim I added a mutex around the addKey method, which should make concurrent reads safe. However, it is still currently possible that if one were to try to perform a read on thread A and try to commit on thread B at the same time that a race condition could occur? As thread B would be accessing the read array without a lock while thread A might try to append the read array.

That won't happen because we acquire lock on oracle before checking for conflicts https://github.com/dgraph-io/badger/blob/9459a240bdf3cf1bf59a172558b37455c742d3bf/txn.go#L167-L174

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 3 files at r2, 1 of 2 files at r3, 1 of 1 files at r4.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @ashish-goswami, @elliotcourant, and @jarifibrahim)


txn.go, line 441 at r4 (raw file):

		// needs to be locked whenever we mark a key as read.
		txn.readsLock.Lock()
		defer txn.readsLock.Unlock()

don't need a defer for one liner? defer has a cost.

@jarifibrahim jarifibrahim merged commit af22dfd into hypermodeinc:master May 13, 2020
@jarifibrahim
Copy link
Contributor

Thank you for your contribution @elliotcourant . I've merged your PR 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Question: Multiple iterators on a read-write transaction.
4 participants