-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Delete tablets which don't belong #3051
Conversation
…over the data periodically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments but nothing major. Feel free to merge after looking at them.
Reviewed 5 of 5 files at r1.
Reviewable status: 4 of 5 files reviewed, 6 unresolved discussions (waiting on @manishrjain and @martinmr)
dgraph/cmd/zero/tablet.go, line 155 at r1 (raw file):
span.Annotate(nil, msg) in.DestGid = 0 // Indicates deletion for the predicate.
// Indicates deletion of predicate in the source group.
dgraph/cmd/zero/zero.go, line 644 at r1 (raw file):
pl := s.Leader(gid) if pl == nil { return x.Errorf("Unable to reach leader of group: %d.", gid)
Remove period at the end.
worker/draft.go, line 856 at r1 (raw file):
val, _ = m.LoadOrStore(pk.Attr, sz) } sz := val.(*int64)
is sz supposed to stand for size? If so, I think renaming it to size is clearer.
worker/draft.go, line 903 at r1 (raw file):
pred := key.(string) space := val.(*int64) sz := atomic.LoadInt64(space)
Same here regarding sz vs size.
worker/groups.go, line 140 at r1 (raw file):
glog.Infof("Server is ready") gr.closer = y.NewCloser(3) // Match CLOSER:1 in this file.
Is this change supposed to be here?
worker/predicate_move.go, line 182 at r1 (raw file):
} if in.DestGid == 0 { glog.Infof("Got instructed to delete tablet: %v", in.Predicate)
"Was instructed"
Tablet deletion: - With recent changes, when a predicate is moved from one group to another, the tablet was left behind in the source group. - This PR adds a way for Zero to instruct the source group to delete the predicate after successfully transferring ownership to the destination group. - This PR also removes `cleanupTablets` method, so the source group does not make the decision of deleting tablets. - Rollup operation now does the size calculation per tablet and sends it out to Zero. - We no longer do a periodic `calculateTabletSize` anymore. - When Zero receives a tablet size update, after applying the proposal, it determines which tablets need to be deleted, and sends an instruction to Alpha leader accordingly. Potential issue: Considering the job of tablet deletion happens just after leader rollup, it is possible that the followers would still be doing their rollups when the tablet deletion happens. In that case, a race condition might happen where the deleted tablet could resurface because of roll-up operation (the key is read, then clean predicate deletes it, then the key is written back). In the future, we could cancel a roll-up operation if we determine it could cause a race condition like this. Changes: * Code to delete predicates which don't belong. * Test and fix the predicate delete logic. * Remove cleanup tablets. This is now coordinated by Zero. * Use rollup to calculate the tablet sizes. That way, we don't iterate over the data periodically. * Address Martin's comments
Tablet deletion: - With recent changes, when a predicate is moved from one group to another, the tablet was left behind in the source group. - This PR adds a way for Zero to instruct the source group to delete the predicate after successfully transferring ownership to the destination group. - This PR also removes `cleanupTablets` method, so the source group does not make the decision of deleting tablets. - Rollup operation now does the size calculation per tablet and sends it out to Zero. - We no longer do a periodic `calculateTabletSize` anymore. - When Zero receives a tablet size update, after applying the proposal, it determines which tablets need to be deleted, and sends an instruction to Alpha leader accordingly. Potential issue: Considering the job of tablet deletion happens just after leader rollup, it is possible that the followers would still be doing their rollups when the tablet deletion happens. In that case, a race condition might happen where the deleted tablet could resurface because of roll-up operation (the key is read, then clean predicate deletes it, then the key is written back). In the future, we could cancel a roll-up operation if we determine it could cause a race condition like this. Changes: * Code to delete predicates which don't belong. * Test and fix the predicate delete logic. * Remove cleanup tablets. This is now coordinated by Zero. * Use rollup to calculate the tablet sizes. That way, we don't iterate over the data periodically. * Address Martin's comments
@CodeLingoBot capture error messages shouldn't end in a period |
Tablet deletion:
cleanupTablets
method, so the source group does not make the decision of deleting tablets.calculateTabletSize
anymore.Potential issue: Considering the job of tablet deletion happens just after leader rollup, it is possible that the followers would still be doing their rollups when the tablet deletion happens. In that case, a race condition might happen where the deleted tablet could resurface because of roll-up operation (the key is read, then clean predicate deletes it, then the key is written back). In the future, we could cancel a roll-up operation if we determine it could cause a race condition like this.
This change is