Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RegExp filter work without filter #2913

Merged
merged 17 commits into from
Jan 24, 2019

Conversation

srfrog
Copy link
Contributor

@srfrog srfrog commented Jan 17, 2019

Remove the restriction on regexp filter and lookups to depend on trigram index.

  • If the trigram index is found, it is used.
  • When used as filter, no index is required.
  • When no index is found and called from root query, return error.

No breaking changes expected.

Closes #2565


This change is Reviewable

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @srfrog)


worker/task.go, line 908 at r1 (raw file):

	} else {
		out := &pb.Result{}
		err = qs.handleHasFunction(ctx, arg.q, out)

No need to grab all the uids for the predicate, if there already is a list of UIDs coming in funcArgs.q.UidList. You could just iterate over that, grab their values and run regexp over them.

We only need to call the has function, if we don't have a UidList coming into this, in case someone used it at the root. But, they themselves could run the has function first, and use a filter on regexp. That way, a list of uids is always passed to this func. So, we never need to generate a list of uids here.

Copy link
Contributor Author

@srfrog srfrog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 3 files reviewed, 1 unresolved discussion (waiting on @manishrjain)


worker/task.go, line 908 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

No need to grab all the uids for the predicate, if there already is a list of UIDs coming in funcArgs.q.UidList. You could just iterate over that, grab their values and run regexp over them.

We only need to call the has function, if we don't have a UidList coming into this, in case someone used it at the root. But, they themselves could run the has function first, and use a filter on regexp. That way, a list of uids is always passed to this func. So, we never need to generate a list of uids here.

Done.

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @srfrog)


worker/task.go, line 908 at r1 (raw file):

Previously, srfrog (Gus) wrote…

Done.

Still see the Has func being called. So, not sure what changed.

Copy link
Contributor Author

@srfrog srfrog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @manishrjain)


worker/task.go, line 908 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Still see the Has func being called. So, not sure what changed.

Removed, added an error when called from root without index

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 2 unresolved discussions (waiting on @srfrog)


worker/task.go, line 947 at r3 (raw file):

			vals, err = pl.AllUntaggedValues(arg.q.ReadTs)

		case lang != "":

This changes the ordering of the if statement. Can you bring it back to the original order?


worker/task.go, line 964 at r3 (raw file):

			strVal, err := types.Convert(val, types.StringID)
			if err == nil && matchRegex(strVal, arg.srcFn.regex) {
				filtered.Uids = append(filtered.Uids, uid)

There was a break here. We only add the uid once.

Be careful with this code: It's easy to introduce hard to find bugs and cause regressions.

Copy link
Contributor Author

@srfrog srfrog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 2 unresolved discussions (waiting on @manishrjain)


worker/task.go, line 947 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

This changes the ordering of the if statement. Can you bring it back to the original order?

Done.


worker/task.go, line 964 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

There was a break here. We only add the uid once.

Be careful with this code: It's easy to introduce hard to find bugs and cause regressions.

Acked

Copy link
Contributor Author

@srfrog srfrog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 4 files reviewed, 2 unresolved discussions (waiting on @manishrjain)


worker/task.go, line 964 at r3 (raw file):

Previously, srfrog (Gus) wrote…

Acked

Done.

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 4 files at r3, 1 of 1 files at r4.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@srfrog srfrog merged commit a4756c2 into master Jan 24, 2019
@srfrog srfrog deleted the srfrog/issue-2565_no_trigram_for_filter_regexp branch January 24, 2019 01:09
dna2github pushed a commit to dna2fork/dgraph that referenced this pull request Jul 19, 2019
* let regexp filter not depend on index.

* added func HasTokenizer

* using HasTokenizer for regexp index

* support regexp without trigram index at root and filter

* typo

* simplified the logic a bit, lint fixes.

* added regexp trigram tests

* changed tests since they won't trigger error now.

* this should return true sometimes

* regexp index is no longer required

* when filtering use the uid list provided

* handleRegexFunction returns error when no index and regexp is called at root

* updated test to expect an error in root case

* test cases might not be mutex in the future, revert the order.

* dont try to match all posting values against regexp

* added comment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants