Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backupccl: distribute RESTORE work using DistSQL #51959

Merged
merged 1 commit into from
Aug 4, 2020

Conversation

pbardea
Copy link
Contributor

@pbardea pbardea commented Jul 28, 2020

This commit replaces restore's old method of coordinating work on a
single node to instead use DistSQL. It creates a 2 stage DistSQL flow.
The first stage assigns chunks to SplitAndScatter processors which
forward spans that they scattered to the second stage made up of the
RestoreData processors which ingest the data. The SplitAndScatter
processors attempt to send the work to the RestoreData colocated with
the leaseholder of the span after the scattering.

Release note: None.

@pbardea pbardea requested review from dt, yuzefovich and a team July 28, 2020 01:36
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@pbardea
Copy link
Contributor Author

pbardea commented Jul 28, 2020

This should be ready for a review, but shouldn't be merged until #51607 is merged and the scatter processor can actually find the destination of the span it imported.

@pbardea pbardea force-pushed the restore-planning branch from b864463 to 94e2857 Compare July 28, 2020 13:47
}{
highWaterMark: -1,
}

res := RowCount{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me a minute to figure out why this wasn't in the mutex-protected set above. I think it is OK to not be, but it might confuse future readers too without a comment (or we could just put it back under the mu, since it should be ~uncontended anyway?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - also moved back requestsCompleted for similar reasons. I also am now asserting that the dataspan we're marking as done is the one that has the same progress idx.

@pbardea pbardea force-pushed the restore-planning branch 3 times, most recently from a21635b to c634534 Compare July 28, 2020 15:17
Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Reviewed 3 of 4 files at r1, 3 of 4 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @pbardea, and @yuzefovich)


pkg/ccl/backupccl/restore_processor_planning.go, line 17 at r2 (raw file):

	"github.com/cockroachdb/cockroach/pkg/kv"
	roachpb "github.com/cockroachdb/cockroach/pkg/roachpb"

super nit: are these roachpb and hlc aliases necessary?


pkg/ccl/backupccl/restore_processor_planning.go, line 71 at r2 (raw file):

	if len(splitAndScatterSpecs) == 0 {
		close(progCh)

nit: we probably can defer this close. I'd also mention in the comment to the method that progCh is closed.

@pbardea pbardea force-pushed the restore-planning branch from c634534 to 5e92de3 Compare July 28, 2020 16:51
Copy link
Contributor Author

@pbardea pbardea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @yuzefovich)


pkg/ccl/backupccl/restore_processor_planning.go, line 17 at r2 (raw file):

Previously, yuzefovich wrote…

super nit: are these roachpb and hlc aliases necessary?

Nope, those were somehow auto-generated. Thanks for catching!


pkg/ccl/backupccl/restore_processor_planning.go, line 71 at r2 (raw file):

Previously, yuzefovich wrote…

nit: we probably can defer this close. I'd also mention in the comment to the method that progCh is closed.

Done and done.

This commit replaces restore's old method of coordinating work on a
single node to instead use DistSQL. It creates a 2 stage DistSQL flow.
The first stage assigns chunks to SplitAndScatter processors which
forward spans that they scattered to the second stage made up of the
RestoreData processors which ingest the data. The SplitAndScatter
processors attempt to send the work to the RestoreData colocated with
the leaseholder of the span after the scattering.

Release note: None.
@pbardea
Copy link
Contributor Author

pbardea commented Aug 4, 2020

TFTRs!
bors r=dt,yuzefovich

@craig
Copy link
Contributor

craig bot commented Aug 4, 2020

Build succeeded:

@craig craig bot merged commit 986e308 into cockroachdb:master Aug 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants