Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent alphas from asking zero to serve tablets during queries. #3091

Merged
merged 10 commits into from
Mar 15, 2019

Conversation

martinmr
Copy link
Contributor

@martinmr martinmr commented Mar 4, 2019

This change makes queries read-only in the sense that they won't ask
zero to serve any predicates.
Same change that was reverted due to a bug detected by running "dgraph
increment". The bug has been fixed by asking zero what group is serving
a particular tablet without requesting to serve that tablet. This
preserves both correctness and the read-only property of queries.

I verified "dgraph increment" is working as expected with this change.


This change is Reviewable

This change makes queries read-only in the sense that they won't ask
zero to serve any predicates.
Same change that was reverted due to a bug detected by running "dgraph
increment". The bug has been fixed by asking zero what group is serving
a particular tablet without requesting to serve that tablet. This
preserves both correctness and the read-only property of queries.
@martinmr martinmr requested a review from a team March 4, 2019 19:53
Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 12 of 12 files at r1.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @martinmr)


dgraph/cmd/zero/zero.go, line 578 at r1 (raw file):

// Serves returns the tablet currently serving the predicate or nil if the predicate
// is not being served by any alpha.
func (s *Server) Serves(

Instead of adding a new endpoint, just modify Tablet to add a new field in there, called bool read_only, which can indicate to Zero in the ShouldServe endpoint.


worker/groups.go, line 328 at r1 (raw file):

// BelongsToReadOnly acts like BelongsTo except it does not ask zero to serve
// the tablet for key if no group is currently serving it.
func (g *groupi) BelongsToReadOnly(key string) uint32 {

Let's make this return (uint32, error), so an error can be returned back properly by the caller. The other funcs in this file should ideally also return errros, to make our error handling better. There's a diff between not being able to communicate to zero, and considering a tablet as unserved -- and an error can help identify that.

gitlw
gitlw previously requested changes Mar 4, 2019
Copy link

@gitlw gitlw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @martinmr)


dgraph/cmd/zero/zero.go, line 579 at r1 (raw file):

// is not being served by any alpha.
func (s *Server) Serves(
	ctx context.Context, tablet *pb.Tablet) (*pb.Tablet, error) {

I'd suggest changing the 2nd parameter from tablet to "predicate string"


query/query.go, line 173 at r1 (raw file):

	ReadTs       uint64
	Attr         string
	UnknownAttr  bool

I'd suggesting having a field of type error to indicate something wrong has happened, which then can be used for handling other types of errors for early termination.

Copy link
Contributor Author

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 4 of 16 files reviewed, 4 unresolved discussions (waiting on @gitlw and @manishrjain)


dgraph/cmd/zero/zero.go, line 578 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Instead of adding a new endpoint, just modify Tablet to add a new field in there, called bool read_only, which can indicate to Zero in the ShouldServe endpoint.

Done.


dgraph/cmd/zero/zero.go, line 579 at r1 (raw file):

Previously, gitlw (Lucas Wang) wrote…

I'd suggest changing the 2nd parameter from tablet to "predicate string"

Removed this method in favor of adding a read_only parameter to tablet so this is not needed anymore.


query/query.go, line 173 at r1 (raw file):

Previously, gitlw (Lucas Wang) wrote…

I'd suggesting having a field of type error to indicate something wrong has happened, which then can be used for handling other types of errors for early termination.

I am not sure how I would use this field. I modified the methods in groups.go to return errors so any issue while trying to determining group membership should make the query terminate early.


worker/groups.go, line 328 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Let's make this return (uint32, error), so an error can be returned back properly by the caller. The other funcs in this file should ideally also return errros, to make our error handling better. There's a diff between not being able to communicate to zero, and considering a tablet as unserved -- and an error can help identify that.

Done. BelongsTo and ServesTablet (both versions of each method) now return an error.

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you verify 3 important edge cases I mentioned in the comments?

Reviewed 14 of 14 files at r2.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @gitlw and @martinmr)


worker/draft.go, line 188 at r2 (raw file):

					return err
				} else if !servesTablet {
					return fmt.Errorf("group 1 should always serve reserved predicate %s",

I'm seeing this error sometimes.

Can you check 3 things for me:

  1. Do we do something to ensure in Zero that reserved predicates are assigned group 1?
  2. That /moveTablet endpoint would fail for them?
  3. That Zero does not accidentally move them during auto-rebalancing?

Do these before merging this change.


worker/groups.go, line 163 at r2 (raw file):

			if tablet, err := g.Tablet(pred); err != nil {
				failed = true
				glog.V(1).Infof("Error while getting tablet for pred %s: %v", pred, err)

glog.Errorf

Copy link
Contributor Author

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 6 of 16 files reviewed, 4 unresolved discussions (waiting on @gitlw and @manishrjain)


worker/draft.go, line 188 at r2 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

I'm seeing this error sometimes.

Can you check 3 things for me:

  1. Do we do something to ensure in Zero that reserved predicates are assigned group 1?
  2. That /moveTablet endpoint would fail for them?
  3. That Zero does not accidentally move them during auto-rebalancing?

Do these before merging this change.

  1. This is already being done in ShouldServe.
    2 and 3 were not happening. I opened Ensure reserved predicates cannot be moved. #3137 to fix these issues.

worker/groups.go, line 163 at r2 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

glog.Errorf

Done.

Copy link
Contributor Author

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 6 of 16 files reviewed, 4 unresolved discussions (waiting on @gitlw and @manishrjain)


worker/draft.go, line 188 at r2 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…
  1. This is already being done in ShouldServe.
    2 and 3 were not happening. I opened Ensure reserved predicates cannot be moved. #3137 to fix these issues.

Done. Merged #3137 so we should be good to go now.

@martinmr martinmr requested a review from manishrjain March 14, 2019 23:40
@martinmr martinmr dismissed gitlw’s stale review March 14, 2019 23:40

Addressed comments

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 10 of 10 files at r3, 1 of 1 files at r4.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @gitlw, @manishrjain, and @martinmr)


worker/groups.go, line 163 at r4 (raw file):

			if tablet, err := g.Tablet(pred); err != nil {
				failed = true
				glog.Errorf("Error while getting tablet for pred %s: %v", pred, err)

I'd use pred %q.

Copy link
Contributor Author

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 16 of 17 files reviewed, 4 unresolved discussions (waiting on @gitlw and @manishrjain)


worker/groups.go, line 163 at r4 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

I'd use pred %q.

Done.

@martinmr martinmr merged commit ef74ed8 into master Mar 15, 2019
@martinmr martinmr deleted the martinmr/query-tablet-read-only branch March 15, 2019 17:59
dna2github pushed a commit to dna2fork/dgraph that referenced this pull request Jul 19, 2019
…aph-io#3091)

This change makes queries read-only in the sense that they won't ask
zero to serve any predicates.
Same change that was reverted due to a bug detected by running "dgraph
increment". The bug has been fixed by asking zero what group is serving
a particular tablet without requesting to serve that tablet. This
preserves both correctness and the read-only property of queries.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants