Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: filter abort span during split copy #26934

Merged
merged 1 commit into from
Jun 25, 2018

Conversation

tbg
Copy link
Member

@tbg tbg commented Jun 23, 2018

We are suspecting that some workloads create a sufficient number of
abort span records to cause split commands that are too large to be
proposed to Raft, in effect rendering the range permanently
unsplittable. This is a problem since the range will at some point
start to backpressure writes (and even without that, it's a resource
hog). Most of the problematic abort span records would very likely
be removed during garbage collection; however, abort span records
aren't counted in any quantity that triggers GC.

Instead of copying the entirety of the abort span, restrict to the
entries that would not be removed by a GC operation. In practice,
this means that unless a high number of abort span records are
created every hour, very few records will actually be copied, and
in return the split command size should be small.

See #26830.

Release note (bug fix): Avoid a situation in which ranges repeatedly
fail to perform a split.

@tbg tbg requested a review from a team June 23, 2018 12:04
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@tbg tbg force-pushed the fix/split-abort-span branch from e94da24 to df93e3b Compare June 23, 2018 12:06
@tbg tbg requested a review from nvanbenschoten June 23, 2018 12:06
@tbg tbg force-pushed the fix/split-abort-span branch 2 times, most recently from 3ff8e9e to a17c7b8 Compare June 23, 2018 13:01
@nvanbenschoten
Copy link
Member

:lgtm: the approach looks good, just a few comments about re-organizing the code.


Reviewed 1 of 3 files at r1.
Review status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/storage/client_split_test.go, line 102 at r1 (raw file):

}

func TestStoreSplitAbortSpan(t *testing.T) {

Leave a comment about what this is testing.


pkg/storage/replica_command.go, line 1148 at r1 (raw file):

	}

	// Initialize the RHS range's AbortSpan by copying the LHS's. Put a little extra

Do you think this should all be moved into a AbortSpan.CopyIntoWithFilter method (with a better name, if you can think of one)?


pkg/storage/replica_command.go, line 1163 at r1 (raw file):

		var scratch [64]byte
		rec.AbortSpan().Iterate(ctx, batch, func(k []byte, entry roachpb.AbortSpanEntry) {
			if err != nil {

We should add an error return value to Iterate's closure, terminate iteration if it's not nil, and return an error from the funtion.


pkg/storage/abortspan/abortspan.go, line 110 at r1 (raw file):

// each unmarshaled entry with the MVCC key and the decoded entry.
func (sc *AbortSpan) Iterate(
	ctx context.Context, e engine.Reader, f func([]byte, roachpb.AbortSpanEntry),

Not your change, but why does f pass a []byte instead of a roachpb.Key? Does changing that break anything?


Comments from Reviewable

@tbg tbg force-pushed the fix/split-abort-span branch 2 times, most recently from f1a0432 to 8054c91 Compare June 24, 2018 16:33
@tbg
Copy link
Member Author

tbg commented Jun 24, 2018

TFTR!


Review status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/storage/client_split_test.go, line 102 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Leave a comment about what this is testing.

Done.


pkg/storage/replica_command.go, line 1148 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Do you think this should all be moved into a AbortSpan.CopyIntoWithFilter method (with a better name, if you can think of one)?

I removed the Copy methods. We're only going to copy in two places, splits and merges. Left a TODO for @benesch in the merge code (removing a copy that apparently had no test coverage) and removed the impl of CopyInto and CopyFrom which ... wasn't great code anyway.


pkg/storage/replica_command.go, line 1163 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

We should add an error return value to Iterate's closure, terminate iteration if it's not nil, and return an error from the funtion.

Done.


Comments from Reviewable

@tbg tbg force-pushed the fix/split-abort-span branch from 8054c91 to 6f3cc2f Compare June 24, 2018 16:37
@tbg
Copy link
Member Author

tbg commented Jun 24, 2018

Review status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/storage/abortspan/abortspan.go, line 110 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Not your change, but why does f pass a []byte instead of a roachpb.Key? Does changing that break anything?

No that works beautifully, done. Maybe there was an import problem at some distant point in the past? Not sure.


Comments from Reviewable

@tbg
Copy link
Member Author

tbg commented Jun 24, 2018

Review status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/storage/abortspan/abortspan.go, line 110 at r1 (raw file):

Previously, tschottdorf (Tobias Schottdorf) wrote…

No that works beautifully, done. Maybe there was an import problem at some distant point in the past? Not sure.

Done.


Comments from Reviewable

@tbg tbg force-pushed the fix/split-abort-span branch from 6f3cc2f to f764b78 Compare June 24, 2018 18:41
We are suspecting that some workloads create a sufficient number of
abort span records to cause split commands that are too large to be
proposed to Raft, in effect rendering the range permanently
unsplittable. This is a problem since the range will at some point
start to backpressure writes (and even without that, it's a resource
hog). Most of the problematic abort span records would very likely
be removed during garbage collection; however, abort span records
aren't counted in any quantity that triggers GC.

Instead of copying the entirety of the abort span, restrict to the
entries that would not be removed by a GC operation. In practice,
this means that unless a high number of abort span records are
created every hour, very few records will actually be copied, and
in return the split command size should be small.

See cockroachdb#26830.

Release note (bug fix): Avoid a situation in which ranges repeatedly
fail to perform a split.
@tbg tbg force-pushed the fix/split-abort-span branch from f764b78 to 46d6c47 Compare June 24, 2018 20:42
@tbg
Copy link
Member Author

tbg commented Jun 25, 2018

bors r=nvanbenschoten

craig bot pushed a commit that referenced this pull request Jun 25, 2018
26934: storage: filter abort span during split copy r=nvanbenschoten a=tschottdorf

We are suspecting that some workloads create a sufficient number of
abort span records to cause split commands that are too large to be
proposed to Raft, in effect rendering the range permanently
unsplittable. This is a problem since the range will at some point
start to backpressure writes (and even without that, it's a resource
hog). Most of the problematic abort span records would very likely
be removed during garbage collection; however, abort span records
aren't counted in any quantity that triggers GC.

Instead of copying the entirety of the abort span, restrict to the
entries that would not be removed by a GC operation. In practice,
this means that unless a high number of abort span records are
created every hour, very few records will actually be copied, and
in return the split command size should be small.

See #26830.

Release note (bug fix): Avoid a situation in which ranges repeatedly
fail to perform a split.

Co-authored-by: Tobias Schottdorf <[email protected]>
@craig
Copy link
Contributor

craig bot commented Jun 25, 2018

Build succeeded

@craig craig bot merged commit 46d6c47 into cockroachdb:master Jun 25, 2018
@tbg tbg deleted the fix/split-abort-span branch June 25, 2018 13:40
craig bot pushed a commit that referenced this pull request Jun 25, 2018
26944: backport-2.0: storage: filter abort span during split copy r=bdarnell a=tschottdorf

Backport 1/1 commits from #26934.

Needed slight adjustments to apply, but nothing in the split logic.

/cc @cockroachdb/release

---

We are suspecting that some workloads create a sufficient number of
abort span records to cause split commands that are too large to be
proposed to Raft, in effect rendering the range permanently
unsplittable. This is a problem since the range will at some point
start to backpressure writes (and even without that, it's a resource
hog). Most of the problematic abort span records would very likely
be removed during garbage collection; however, abort span records
aren't counted in any quantity that triggers GC.

Instead of copying the entirety of the abort span, restrict to the
entries that would not be removed by a GC operation. In practice,
this means that unless a high number of abort span records are
created every hour, very few records will actually be copied, and
in return the split command size should be small.

See #26830.

Release note (bug fix): Avoid a situation in which ranges repeatedly
fail to perform a split.


Co-authored-by: Tobias Schottdorf <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants