-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify design and make tablet moves robust #2800
Conversation
…nction. And run s.runRecovery at the end if got an error.
…se anything until move is complete. The leader just blocks commits during the move.
// No new deletion/background cleanup would start after we start streaming tablet, | ||
// so all the proposals for a particular tablet would atmost wait for deletion of | ||
// single tablet. Only leader needs to do this. | ||
mu := groups().blockDeletes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assignment copies lock value to mu: sync.Mutex (from govet
)
…t be done, hence causing deadlocks in applying proposal and predicate moves.
dgraph/cmd/zero/http.go
Outdated
@@ -140,6 +140,12 @@ func (st *state) moveTablet(w http.ResponseWriter, r *http.Request) { | |||
return | |||
} | |||
|
|||
if !st.node.AmLeader() { | |||
w.WriteHeader(http.StatusBadRequest) | |||
x.SetStatus(w, x.ErrorInvalidRequest, "This Zero server is not the leader. Re-run command on leader.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
line is 104 characters (from lll
)
g.delPred <- struct{}{} | ||
<-g.delPred | ||
} | ||
func (g *groupi) cleanupTablets() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func (*groupi).cleanupTablets
is unused (from unused
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 18 files reviewed, 3 unresolved discussions (waiting on @golangcibot)
dgraph/cmd/zero/http.go, line 145 at r2 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
line is 104 characters (from
lll
)
Done.
worker/groups.go, line 757 at r2 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
func
(*groupi).cleanupTablets
is unused (fromunused
)
Done.
worker/predicate_move.go, line 145 at r1 (raw file):
Previously, golangcibot (Bot from GolangCI) wrote…
assignment copies lock value to mu: sync.Mutex (from
govet
)
Done.
This PR changes the design of the tablet moves to this: - Zero leader blocks any commit on the predicate to be moved. - Gets a new timestamp, which the source group leader must wait for. - Source group leader streams data to the destination group leader. - Once done, Zero would propose the predicate be served by destination group. - Zero defer unblocks the commits on the predicate. There's no need to propose read-only among Zeros, and propose states among Alphas, etc. Instead of trying to block mutations on Alphas, we just block Commits on the Zero leader. If the leader changes, the move would be aborted, and the source group would continue as nothing happened. Destination group would then clean up the extra data during its cleanup cycle. Tested this with increment tool with tons of tablet moves, and we don't lose any commit during this. Changes: * Start work on tablet move txn failures. * Add some tracing as well. * Start work on simplifying tablet move. * More work on predicate movement in Zero. Need to remove moveTablet function. And run s.runRecovery at the end if got an error. * Remove moveTablet and custom logic in favor of runRecovery. * Some cleanup. Think we can do away with Zero proposals during move. * Rewrote how predicate move is being handled at Zero. No need to propose anything until move is complete. The leader just blocks commits during the move. * Working move tablets. * Remove the simulated error for Jepsen. * Add more spans. * Use a pointer to mutex to avoid a mutex copy during locking. * Use x.TxnWriter to avoid a bug which was causing sync.WaitGroup to not be done, hence causing deadlocks in applying proposal and predicate moves. * More tracing. * Wao. Build is fine, merging master after 1.5 months. * Self review * Remove TODOs by using move payload TxnTs. * Set glog logger for Badger logs, so the log output is consistent. * Rename tabletBlocked to isBlocked
This PR changes the design of the tablet moves to this:
There's no need to propose read-only among Zeros, and propose states among Alphas, etc. Instead of trying to block mutations on Alphas, we just block Commits on the Zero leader. If the leader changes, the move would be aborted, and the source group would continue as nothing happened. Destination group would then clean up the extra data during its cleanup cycle.
Tested this with increment tool with tons of tablet moves, and we don't lose any commit during this. Will test with Jepsen.
This change is