Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify design and make tablet moves robust #2800

Merged
merged 22 commits into from
Jan 14, 2019
Merged

Conversation

manishrjain
Copy link
Contributor

@manishrjain manishrjain commented Nov 30, 2018

This PR changes the design of the tablet moves to this:

  • Zero leader blocks any commit on the predicate to be moved.
  • Gets a new timestamp, which the source group leader must wait for.
  • Source group leader streams data to the destination group leader.
  • Once done, Zero would propose the predicate be served by destination group.
  • Zero defer unblocks the commits on the predicate.

There's no need to propose read-only among Zeros, and propose states among Alphas, etc. Instead of trying to block mutations on Alphas, we just block Commits on the Zero leader. If the leader changes, the move would be aborted, and the source group would continue as nothing happened. Destination group would then clean up the extra data during its cleanup cycle.

Tested this with increment tool with tons of tablet moves, and we don't lose any commit during this. Will test with Jepsen.


This change is Reviewable

// No new deletion/background cleanup would start after we start streaming tablet,
// so all the proposals for a particular tablet would atmost wait for deletion of
// single tablet. Only leader needs to do this.
mu := groups().blockDeletes

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assignment copies lock value to mu: sync.Mutex (from govet)

@@ -140,6 +140,12 @@ func (st *state) moveTablet(w http.ResponseWriter, r *http.Request) {
return
}

if !st.node.AmLeader() {
w.WriteHeader(http.StatusBadRequest)
x.SetStatus(w, x.ErrorInvalidRequest, "This Zero server is not the leader. Re-run command on leader.")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line is 104 characters (from lll)

g.delPred <- struct{}{}
<-g.delPred
}
func (g *groupi) cleanupTablets() {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

func (*groupi).cleanupTablets is unused (from unused)

Copy link
Contributor Author

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 0 of 18 files reviewed, 3 unresolved discussions (waiting on @golangcibot)


dgraph/cmd/zero/http.go, line 145 at r2 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

line is 104 characters (from lll)

Done.


worker/groups.go, line 757 at r2 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

func (*groupi).cleanupTablets is unused (from unused)

Done.


worker/predicate_move.go, line 145 at r1 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

assignment copies lock value to mu: sync.Mutex (from govet)

Done.

@manishrjain manishrjain merged commit 63f8174 into master Jan 14, 2019
@manishrjain manishrjain deleted the mrjn/move-tablet branch January 14, 2019 03:31
dna2github pushed a commit to dna2fork/dgraph that referenced this pull request Jul 19, 2019
This PR changes the design of the tablet moves to this:

- Zero leader blocks any commit on the predicate to be moved.
- Gets a new timestamp, which the source group leader must wait for.
- Source group leader streams data to the destination group leader.
- Once done, Zero would propose the predicate be served by destination group.
- Zero defer unblocks the commits on the predicate.

There's no need to propose read-only among Zeros, and propose states among Alphas, etc. Instead of trying to block mutations on Alphas, we just block Commits on the Zero leader. If the leader changes, the move would be aborted, and the source group would continue as nothing happened. Destination group would then clean up the extra data during its cleanup cycle.

Tested this with increment tool with tons of tablet moves, and we don't lose any commit during this.

Changes:
 * Start work on tablet move txn failures.
* Add some tracing as well.
* Start work on simplifying tablet move.
* More work on predicate movement in Zero. Need to remove moveTablet function. And run s.runRecovery at the end if got an error.
* Remove moveTablet and custom logic in favor of runRecovery.
* Some cleanup. Think we can do away with Zero proposals during move.
* Rewrote how predicate move is being handled at Zero. No need to propose anything until move is complete. The leader just blocks commits during the move.
* Working move tablets.
* Remove the simulated error for Jepsen.
* Add more spans.
* Use a pointer to mutex to avoid a mutex copy during locking.
* Use x.TxnWriter to avoid a bug which was causing sync.WaitGroup to not be done, hence causing deadlocks in applying proposal and predicate moves.
* More tracing.
* Wao. Build is fine, merging master after 1.5 months.
* Self review
* Remove TODOs by using move payload TxnTs.
* Set glog logger for Badger logs, so the log output is consistent.
* Rename tabletBlocked to isBlocked
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants