Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: avoid unsplittable range #14654

Merged

Conversation

petermattis
Copy link
Collaborator

If a range cannot be split because a valid split key cannot be found,
set the max size for the range to the range's current size. When a range
is successfully split, reset the max range size to the zone's max range
size. This avoids situations where a range with a single large row
cannot be split.

Fixes #9555

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@petermattis
Copy link
Collaborator Author

This needs a test. I'll get to it tomorrow.

@spencerkimball
Copy link
Member

I'm worried about the following situation: we have a large row because there are many versions being written in the past 24 hours for a key. Versions continue to be written second by second, each of which triggers the maybeAdd call to the split queue. Just setting replica.mu.maxBytes to the current size means that we will continue to get log spam and do the work of scanning through the range on every write to the range, all with no hope of splitting the range.

@petermattis
Copy link
Collaborator Author

I'm worried about the following situation: we have a large row because there are many versions being written in the past 24 hours for a key. Versions continue to be written second by second, each of which triggers the maybeAdd call to the split queue. Just setting replica.mu.maxBytes to the current size means that we will continue to get log spam and do the work of scanning through the range on every write to the range, all with no hope of splitting the range.

The scenario you describe is likely to create a range that is much larger than 64MB. I'm pretty sure we'll have other more pressing problems than the log spam if we have such a large range.

@petermattis petermattis force-pushed the pmattis/avoid-unsplittable-ranges branch from 2248e55 to a1ec8a3 Compare April 6, 2017 14:03
@petermattis
Copy link
Collaborator Author

Test added which revealed a problem in MVCCFindSplitKey. Consider a range with a max-size of 64KB. Adding a single 64KB row will cause the range to be narrowed to include just that row. Now add another 64KB row to the range and we'll attempt to split again. The target size we pass to MVCCFindSplitKey will be ~32KB which will cause that method to select the first key in the range as the split key but we can't split on that key because that is already the key we're split on. I adjusted MVCCFindSplitKey to never return the first key as the split key. I still need to enhance TestFindSplitKey to verify this behavior.

@petermattis petermattis force-pushed the pmattis/avoid-unsplittable-ranges branch from a1ec8a3 to 73655fd Compare April 6, 2017 14:11
@petermattis
Copy link
Collaborator Author

The enhancement to TestFindSplitKey was trivial. This is ready for a look.

@bdarnell
Copy link
Contributor

bdarnell commented Apr 6, 2017

Reviewed 3 of 6 files at r1, 3 of 3 files at r2.
Review status: all files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.


pkg/storage/client_split_test.go, line 2001 at r2 (raw file):

	})

	// Add another row to the range. Are funky key construction requires this to

s/Are/Our/

The funkiness is just in this test, right? It would be more natural if we used two actual row keys instead of writing the giant row to the table prefix with no PK values (especially since the table prefix keys are already special for splits)


pkg/storage/replica_command.go, line 2634 at r2 (raw file):

func (r *Replica) adminSplitWithDescriptor(
	ctx context.Context, args roachpb.AdminSplitRequest, desc *roachpb.RangeDescriptor,
) (_ roachpb.AdminSplitResponse, noSplitKey bool, _ *roachpb.Error) {

Negating this return value feels unnatural to me.


pkg/storage/replica_command.go, line 2652 at r2 (raw file):

				ctx, snap, desc.RangeID, desc.StartKey, desc.EndKey, targetSize)
			if err != nil {
				// Swallow the error and tell the caller that no split key could be

Swallowing all errors here seems like a bad idea; I think it would be better to either make "no valid splits" a separate error type or make it not an error at all (return nil, nil, since nil will never be a valid split key)


pkg/storage/split_queue.go, line 120 at r2 (raw file):

			// its current size to prevent future attempts to split the range until
			// it grows again.
			r.SetMaxBytes(size)

I think that adding some margin here makes sense so we don't re-enqueue the range on every subsequent write.


Comments from Reviewable

@petermattis petermattis force-pushed the pmattis/avoid-unsplittable-ranges branch from 1bd876f to 4eaa8ff Compare April 6, 2017 16:27
@petermattis
Copy link
Collaborator Author

Review status: 1 of 6 files reviewed at latest revision, 4 unresolved discussions.


pkg/storage/client_split_test.go, line 2001 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

s/Are/Our/

The funkiness is just in this test, right? It would be more natural if we used two actual row keys instead of writing the giant row to the table prefix with no PK values (especially since the table prefix keys are already special for splits)

Yes, the funkiness is just in this test. Played with this some more and was able to fix it so that we use 3 rows in the same table.


pkg/storage/replica_command.go, line 2634 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Negating this return value feels unnatural to me.

Inverted.


pkg/storage/replica_command.go, line 2652 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Swallowing all errors here seems like a bad idea; I think it would be better to either make "no valid splits" a separate error type or make it not an error at all (return nil, nil, since nil will never be a valid split key)

Done. I've made no valid splits not an error and instead return


pkg/storage/split_queue.go, line 120 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

I think that adding some margin here makes sense so we don't re-enqueue the range on every subsequent write.

I'm not against doing that, but I don't think it will matter in practice. Changed this to size * 2 which required changing the test to write 2 new rows.


Comments from Reviewable

@bdarnell
Copy link
Contributor

bdarnell commented Apr 6, 2017

:lgtm:


Reviewed 2 of 5 files at r3, 3 of 3 files at r4.
Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks pending.


pkg/storage/client_split_test.go, line 1986 at r4 (raw file):

	repl := store.LookupReplica(tableKey, nil)
	fmt.Printf("replica %s\n", repl)

Remove this.


Comments from Reviewable

If a range cannot be split because a valid split key cannot be found,
set the max size for the range to the range's current size. When a range
is successfully split, reset the max range size to the zone's max range
size. This avoids situations where a range with a single large row
cannot be split.

Fixes cockroachdb#9555
@petermattis petermattis force-pushed the pmattis/avoid-unsplittable-ranges branch from 4eaa8ff to dd5797e Compare April 6, 2017 16:44
@petermattis
Copy link
Collaborator Author

Review status: 2 of 6 files reviewed at latest revision, 5 unresolved discussions.


pkg/storage/client_split_test.go, line 1986 at r4 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Remove this.

Done.


Comments from Reviewable

@petermattis petermattis merged commit fe5c546 into cockroachdb:master Apr 6, 2017
@petermattis petermattis deleted the pmattis/avoid-unsplittable-ranges branch April 6, 2017 17:35
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request May 2, 2018
In cockroachdb#14654 we added a mechanism to double the max range size whenever a
split attempt found a range that was unsplittable. This prevented a
tight loop of split attempts. However, it didn't actually do anything to
help us find a split point to reduce the size of the range in the
future. This size doubling worked in practice, but it was a blunt
instrument that had strange effects (see cockroachdb#24215).

This change rips out this range size doubling and replaces it with a
split queue purgatory. This purgatory is used to house replicas that
are unsplittable, preventing them from getting into a tight loop.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request May 3, 2018
In cockroachdb#14654 we added a mechanism to double the max range size whenever a
split attempt found a range that was unsplittable. This prevented a
tight loop of split attempts. However, it didn't actually do anything to
help us find a split point to reduce the size of the range in the
future. This size doubling worked in practice, but it was a blunt
instrument that had strange effects (see cockroachdb#24215).

This change rips out this range size doubling and replaces it with a
split queue purgatory. This purgatory is used to house replicas that
are unsplittable, preventing them from getting into a tight loop.

Release note: None
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request May 8, 2018
In cockroachdb#14654 we added a mechanism to double the max range size whenever a
split attempt found a range that was unsplittable. This prevented a
tight loop of split attempts. However, it didn't actually do anything to
help us find a split point to reduce the size of the range in the
future. This size doubling worked in practice, but it was a blunt
instrument that had strange effects (see cockroachdb#24215).

This change rips out this range size doubling and replaces it with a
split queue purgatory. This purgatory is used to house replicas that
are unsplittable, preventing them from getting into a tight loop.

Release note: None
craig bot pushed a commit that referenced this pull request May 8, 2018
25245: storage: add unsplittable ranges to split queue purgatory r=nvanbenschoten a=nvanbenschoten

Fixes #25191.

This makes the fix for #24215 trivial.

In #14654 we added a mechanism to double the max range size whenever a
split attempt found a range that was unsplittable. This prevented a
tight loop of split attempts. However, it didn't actually do anything to
help us find a split point to reduce the size of the range in the
future. This size doubling worked in practice, but it was a blunt
instrument that had strange effects (see #24215).

This change rips out this range size doubling and replaces it with a
split queue purgatory. This purgatory is used to house replicas that
are unsplittable, preventing them from getting into a tight loop.

Release note: None

Co-authored-by: Nathan VanBenschoten <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants