Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(search): Support Boolean OR Filters in Rest.li APIs #3344

Merged
merged 16 commits into from
Oct 14, 2021

Conversation

jjoyce0510
Copy link
Collaborator

@jjoyce0510 jjoyce0510 commented Oct 8, 2021

Changes
This filter accomplishes a few things:

  1. moves the "filter" model from datahub-gma into datahub repo
  2. adds support for OR filters in the Rest.li GMS apis. Previously we only supported ANDs.

Compatibility
This change is backwards incompatible for the Rest.li APIs exposed by GMS. If you are actively using them with the Filter functionality, you will need to update your code to use the new ConjunctiveCriterion model.

Old Filter structure:

"filter": {
        "criteria": [
           {
                "field": "title",
                "value": "Baz Chart 1",
                "condition": "EQUAL"
           }
        ]
 }

and new Filter structure:

"filter": {
        "or": [
           {
                "and": [ 
                  {
                       "field": "title",
                       "value": "Baz Chart 1",
                       "condition": "EQUAL"
                   }
                ]
           }
        ]
 }

As you can see, in adding this API we are going to support Disjunctive Normal Form boolean queries in our API, which makes it much more powerful!

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable)

Copy link
Contributor

@dexter-mh-lee dexter-mh-lee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is reference to filters in TimeseriesAspectService that you missed

Copy link
Contributor

@dexter-mh-lee dexter-mh-lee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also seems like we cannot do and of or operations. is this intended?

Copy link
Contributor

@dexter-mh-lee dexter-mh-lee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you delete files under metadata-ingestion/src/datahub/metadata ? Should be ignored

Copy link
Contributor

@dexter-mh-lee dexter-mh-lee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Did you confirm metadata-io:tests are passing? Since it has been disabled in workflow want to make sure they are passing at least loaclly

@jjoyce0510
Copy link
Collaborator Author

Tested locally, yes!

Copy link
Contributor

@shirshanka shirshanka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@shirshanka shirshanka merged commit 61bd7fd into datahub-project:master Oct 14, 2021
@EnricoMi
Copy link
Contributor

@jjoyce0510 @shirshanka should the classes that have been moved out of datahub-gma be marked as deprecated in a new release of datahub-gma and datahub move to that version? Otherwise it might confuse that datahub and its current datahub-gma version both have identical classes under different packages and it is not clear which to use.

@EnricoMi
Copy link
Contributor

@jjoyce0510 @shirshanka @gabe-lyons Also, Filter in com.linkedin.metadata.query.filter still contains Criteria along with ConjunctiveCriterionArray. Can a filter have set both? Does GraphService have to implement both? If criteria is considered deprecated, it should have been removed from Filter while copying it over from datahub-gma.

Does a GraphService have to implement the full set of possible filter expressions (multiple ORs, multiple ANDs, all of com.linkedin.metadata.query.filter.Condition)?

@shirshanka
Copy link
Contributor

@EnricoMi : we kept the Criteria as we didn't want to break any existing clients of the REST protocol. I think we should translate Criteria based calls to the Conjunctive form higher up the stack so that by the time the query hits the GraphService it has already been translated to the Conjunctive form.
I will let @jjoyce0510 respond to your questions on what expressions the GraphService needs to implement.

@jjoyce0510
Copy link
Collaborator Author

@EnricoMi We are looking to this on our end. Ideally we can just kill that field for good and simplify this so no one has to deal with the case. Until then, you would be expected to deal with the possibility that the criteria field is populated. Update to come soon

@jjoyce0510
Copy link
Collaborator Author

should the classes that have been moved out of datahub-gma be marked as deprecated in a new release of datahub-gma and datahub move to that version? Otherwise it might confuse that datahub and its current datahub-gma version both have identical classes under different packages and it is not clear which to use.

Yes ideally that'd be the case, however to our knowledge LinkedIn is still maintaining a version of datahub-gma in active use, so it's unclear if we'll be able to remove those files. We are currently transitioning away from depending on datahub-gma altogether and once that completes we'll be able to remove all dependencies to make things more comprehensible. But yes the ideal state is not the one we're currently operating in (yet)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants