Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial commit for multi-part lists #3105

Merged
merged 82 commits into from
Apr 19, 2019
Merged

Initial commit for multi-part lists #3105

merged 82 commits into from
Apr 19, 2019

Conversation

martinmr
Copy link
Contributor

@martinmr martinmr commented Mar 6, 2019

This change is Reviewable

posting/list.go Outdated Show resolved Hide resolved
posting/list.go Outdated Show resolved Hide resolved
posting/list.go Outdated Show resolved Hide resolved
Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, a lot more complex than it needs to be. Have a look at the comments.

Reviewable status: 0 of 9 files reviewed, 16 unresolved discussions (waiting on @martinmr)


posting/list.go, line 55 at r1 (raw file):

	ErrStopIteration = errors.New("Stop iteration")
	emptyPosting     = &pb.Posting{}
	maxListLength    = 2000000

Let's do it based on size of the list instead of the number of UIDs.


posting/list.go, line 81 at r1 (raw file):

	pendingTxns int32 // Using atomic for this, to avoid locking in SetForDeletion operation.
	deleteMe    int32 // Using atomic for this, to avoid expensive SetForDeletion operation.
	next        *List // If a multi-part list, this is a pointer to the next list.

I think this should only contain the UIDs to identify the keys for the next parts, like a list of key UIDs or something, so they can be used to generate the keys.

If the list is empty, then we know it is one.


posting/list.go, line 85 at r1 (raw file):

}

func appendNextStartToKey(key []byte, nextPartStart uint64) []byte {

Don't need two funcs here. Just create a new key instead of trying to micro-optimize key creation.


posting/list.go, line 95 at r1 (raw file):

}

func replaceNextStartInKey(key []byte, nextPartStart uint64) []byte {

Same as above. Consolidate into one func, to create a multi-part key, using the base key and the uint64.


posting/list.go, line 104 at r1 (raw file):

}

func (l *List) getNextPartKey() []byte {

This should lie within the iterator, and not be part of the list.

Iterator can copy the uid slice from list, if needed. So, it can know which key to retrieve next.

The UID slice can also be used to do a quick binary search for afterUid option in iterate.


posting/list.go, line 130 at r1 (raw file):

	appendToKey := currPl.FirstPart || !currPl.MultiPart
	if appendToKey {
		return appendNextStartToKey(currKey, nextPartStart)

Yeah, too much arithmetic for micro-optimization.


posting/list.go, line 135 at r1 (raw file):

}

func (l *List) updateMinTs(readTs, minTs uint64) error {

Not sure what this func is for.


posting/list.go, line 711 at r1 (raw file):

}

func (l *List) partIterate(readTs uint64, f func(obj *pb.Posting) error) error {

Try not to create a copy of the iterate logic. It is easy to get wrong.

Instead, maybe do normal iteration, and do a checksum or something to figure out if that particular part has changed or not, and if needs to be re-written.


posting/list.go, line 904 at r1 (raw file):

		enc := codec.Encoder{BlockSize: blockSize}
		err := curr.partIterate(readTs, func(p *pb.Posting) error {

This shouldn't need to be changed. You can do normal iteration, but you know where the splits exist -- so all you need is to figure out if you need to write that part back or not.

Also, once the final list is created, you should do a judgement about whether it should be binary split or not.


posting/list.go, line 1232 at r1 (raw file):

	txn := pstore.NewTransactionAt(readTs, false)
	opts := badger.DefaultIteratorOptions

No need to do iteration here. All the iteration should be done when the list is first created, and multi-part keys are picked up.

This should just be a Txn.Get.


posting/list.go, line 1246 at r1 (raw file):

		return err
	}
	l.next = nextListPart

list should not be modified by iteration. Multiple iterators can be going on simultaneously on one list structure.

Replace all this linked-list logic with just a bunch of uint64s in the list, if multi-part. And then use those uint64s to generate keys and iterate in the iterator.


posting/list.go, line 1252 at r1 (raw file):

func (l *List) needsSplit(readTs uint64) bool {
	length := l.partialLength(readTs)

do it based on size. You can get proto size.


posting/list.go, line 1292 at r1 (raw file):

}

func (l *List) splitListPart(readTs uint64) []*List {

To split a protobuf list, you don't need all this logic. Just find the midpoint, and break that out into two protobufs.

martinmr added 2 commits March 6, 2019 13:04
- Load whole list at once.
- Eliminate linked list.
- Store all plists inside the List object.
posting/list.go Outdated Show resolved Hide resolved
posting/list.go Outdated Show resolved Hide resolved
posting/list.go Outdated Show resolved Hide resolved
posting/list.go Outdated Show resolved Hide resolved
posting/list.go Outdated Show resolved Hide resolved
posting/list.go Outdated Show resolved Hide resolved
Copy link
Contributor Author

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 10 files reviewed, 22 unresolved discussions (waiting on @golangcibot and @manishrjain)


posting/list.go, line 55 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Let's do it based on size of the list instead of the number of UIDs.

Done.


posting/list.go, line 81 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

I think this should only contain the UIDs to identify the keys for the next parts, like a list of key UIDs or something, so they can be used to generate the keys.

If the list is empty, then we know it is one.

Done. I did something similar but instead it's a list of posting lists. If the list is empty, then the


posting/list.go, line 85 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Don't need two funcs here. Just create a new key instead of trying to micro-optimize key creation.

Done. Function has been removed.


posting/list.go, line 95 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Same as above. Consolidate into one func, to create a multi-part key, using the base key and the uint64.

Done. Function has been removed and replaced by a simpler function.


posting/list.go, line 104 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

This should lie within the iterator, and not be part of the list.

Iterator can copy the uid slice from list, if needed. So, it can know which key to retrieve next.

The UID slice can also be used to do a quick binary search for afterUid option in iterate.

Replaced this function by a smaller function that is not part of any interface.


posting/list.go, line 130 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Yeah, too much arithmetic for micro-optimization.

Done.


posting/list.go, line 135 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Not sure what this func is for.

Done. Removed the method. It was used to update minTs in all of the linked list objects but the design has changed so this is not needed anymore.


posting/list.go, line 162 at r1 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

U1000: field firstPart is unused (from unused)

Done.


posting/list.go, line 178 at r1 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

Error return value of l.loadNextPart is not checked (from errcheck)

Done.


posting/list.go, line 711 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Try not to create a copy of the iterate logic. It is easy to get wrong.

Instead, maybe do normal iteration, and do a checksum or something to figure out if that particular part has changed or not, and if needs to be re-written.

Done. I managed to merge the two functions so the logic is not duplicate anymore.


posting/list.go, line 904 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

This shouldn't need to be changed. You can do normal iteration, but you know where the splits exist -- so all you need is to figure out if you need to write that part back or not.

Also, once the final list is created, you should do a judgement about whether it should be binary split or not.

Done. I simplified the logic a bit. The logic goes through each part (or the entire list if not a multi part list) and applies the changes to each list. I moved the split logic to happen at the end.


posting/list.go, line 956 at r1 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

Error return value of l.updateMinTs is not checked (from errcheck)

Done.


posting/list.go, line 1232 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

No need to do iteration here. All the iteration should be done when the list is first created, and multi-part keys are picked up.

This should just be a Txn.Get.

Done. All parts of a list are read when the list is created.


posting/list.go, line 1246 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

list should not be modified by iteration. Multiple iterators can be going on simultaneously on one list structure.

Replace all this linked-list logic with just a bunch of uint64s in the list, if multi-part. And then use those uint64s to generate keys and iterate in the iterator.

Done. All parts of a list are created when the list is created.


posting/list.go, line 1252 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

do it based on size. You can get proto size.

Done.


posting/list.go, line 1292 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

To split a protobuf list, you don't need all this logic. Just find the midpoint, and break that out into two protobufs.

pb.PostingList is not just a list. It has the encoded and packed list of uids as well as the list of postings. I don't think it can be done as simply as you said.


posting/list.go, line 93 at r2 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

U1000: func appendNextStartToKey is unused (from unused)

Done.


posting/list.go, line 103 at r2 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

U1000: func replaceNextStartInKey is unused (from unused)

Done.


posting/list.go, line 112 at r2 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

U1000: func (*List).getNextPartKey is unused (from unused)

Done.


posting/list.go, line 135 at r2 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

U1000: func generateNextPartKey is unused (from unused)

Done.


posting/list.go, line 162 at r2 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

U1000: field partIndex is unused (from unused)

Done.


posting/list.go, line 566 at r3 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

U1000: func (*List).pickPartPostings is unused (from unused)

Done.

posting/list.go Outdated Show resolved Hide resolved
posting/list.go Outdated Show resolved Hide resolved
posting/list.go Outdated Show resolved Hide resolved
posting/list.go Outdated Show resolved Hide resolved
posting/list.go Outdated Show resolved Hide resolved
Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 8 of 13 files reviewed, 22 unresolved discussions (waiting on @golangcibot, @manishrjain, and @martinmr)


posting/list.go, line 484 at r12 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

This is the only place where the postings are trimmed. The only place were deleteBelowTs is used is in the iterator but it's to indicate that the immutable layer should be ignored.

Are we making sure that we do create a posting list at the highest commit timestamp so far, which could be from deleteBelowTs?


posting/list.go, line 857 at r12 (raw file):

		// readTs and calculate the maxCommitTs.
		deleteBelow, mposts := l.pickPostings(readTs)
		maxCommitTs = x.Max(maxCommitTs, deleteBelow)

I'd move the comment I was mentioning earlier here.

Copy link
Contributor Author

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 6 of 13 files reviewed, 23 unresolved discussions (waiting on @golangcibot and @manishrjain)


dgraph/cmd/debug/run.go, line 509 at r16 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

File is not goimports-ed (from goimports)

Done.


posting/list.go, line 484 at r12 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Are we making sure that we do create a posting list at the highest commit timestamp so far, which could be from deleteBelowTs?

Rollup is doing the following

maxCommitTs = x.Max(maxCommitTs, deleteBelow)

so I think this case is covered.


posting/list.go, line 857 at r12 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

I'd move the comment I was mentioning earlier here.

Done. I copied it since it seemed relevant in both places.

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice work, @martinmr . Also I'm glad that now that you have authored a big change in posting/list.go, you really understand this core piece of tech well.

Reviewable status: 6 of 13 files reviewed, 21 unresolved discussions (waiting on @golangcibot, @manishrjain, and @martinmr)


posting/list.go, line 893 at r17 (raw file):

		// way to get the max commit timestamp is to pick all the relevant postings for the given
		// readTs and calculate the maxCommitTs.
		// If deleteBelowTs is greater than zero, there was a delete all marker. The list of postings

100 chars

Copy link
Contributor Author

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still running some benchmarks so I'll hold on merging this for a bit.

I am seeing some differences though, the version with split lists finishes in about an hour (with 3 million mutations and fulltext indexes) while master starts crashing my machine after 10-15 min. This seems to be a memory issue so I am changing the index to term to see if that let's the benchmark finish. However, the fact that the benchmark can reliably finish with my changes without crapping out is encouraging.

Reviewable status: 6 of 13 files reviewed, 21 unresolved discussions (waiting on @golangcibot, @manishrjain, and @martinmr)

Copy link
Contributor Author

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 6 of 13 files reviewed, 21 unresolved discussions (waiting on @golangcibot and @manishrjain)


posting/list.go, line 893 at r17 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

100 chars

Done.

@martinmr
Copy link
Contributor Author

I ran the benchmark with just a term index but I didn't observe any substantial difference. I didn't observe any issues with my changes and all the queries I tried worked fine. I am going to write a different benchmark that just creates large posting lists (without split and with split) and compares the time to create iterate over this lists.

I am merging this PR for now since all tests and manual checks are passing.

@martinmr martinmr merged commit a3136d3 into master Apr 19, 2019
@martinmr martinmr deleted the martinmr/multi-part-list branch April 19, 2019 19:03
@manishrjain
Copy link
Contributor

Congrats!

dna2github pushed a commit to dna2fork/dgraph that referenced this pull request Jul 19, 2019
Large posting lists are now being split during rollup if they become too large.

A normal list has the following format:

<key> -> <posting list with all the data for this list>

A multi-part list is stored using multiple keys. The keys for the parts will be
generated by appending the first uid in the part to the key of the main part.

The main part of a multi-part list will have the following format:

<key> -> <posting with list of each part's start uid>

The data for this list will be stored in key-value pairs with the format below:

<key, 1> -> <first part of the list with all the data for this part>
<key, next start uid> -> <second part of the list with the data for this part>
...
<key, last start uid> -> <last part of the list with all its data>

The first part of a multi-part list always has start uid 1 and will be the last
part to be deleted, at which point the entire list will be marked for deletion.
As the list grows, existing parts might be split if they become too big.
pawanrawal added a commit that referenced this pull request Oct 23, 2019
… > 0 (#4204)

Don't iterate over the immutable layer when we have a delete marker. Earlier after doing a S P * deletion, we were still returning the first uid from the immutable layer.

Fixes #4182

This bug was introduced in #3105 as part of the PR which introduced multi-part posting lists.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants