-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend metadata filtering support in ElasticsearchDocumentStore
#2108
Conversation
# Conflicts: # haystack/document_stores/elasticsearch.py # test/test_document_store.py
…_filter # Conflicts: # docs/_src/api/api/document_store.md
# Conflicts: # docs/_src/api/api/document_store.md # haystack/document_stores/elasticsearch.py # test/conftest.py # test/test_document_store.py # test/test_eval.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. Left a few minor comments.
return cls(conditions) | ||
|
||
@abstractmethod | ||
def convert_to_elasticsearch(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably I missed this one, but where does this method get implemented? I saw it being called in the ElasticsearchDocumentStore...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method gets implemented in each of the LogicalOperation and FilterOperation classes, such that each operation is responsible for converting their part of the whole filter clause.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also add some labels to the PR before merging
This PR allows users to apply a wide variety of filters on Documents inside their
ElasticsearchDocumentStore
.Filters are defined as nested dictionaries. The keys of the dictionaries can be a logical operator (
"$and"
,"$or"
,"$not"
), a comparison operator ("$eq"
,"$in"
,"$gt"
,"$gte"
,"$lt"
,"$lte"
) or a metadata field name.Logical operator keys take a dictionary of metadata field names and/or logical operators as value. Metadata field names take a dictionary of comparison operators as value. Comparison operator keys take a single value or (in case of
"$in"
) a list of values as value.If no logical operator is provided,
"$and"
is used as default operation.If no comparison operator is provided,
"$eq"
(or"$in"
if the comparison value is a list) is used as default operation. Therefore, we don't add breaking changes.The tests pass when using an
OpenSearchDocumentStore
as well, so this filtering functionality can als be used right away with OpenSearch.Example:
Open To-DoDONEWith the current implementation, we can't use the same logical operator twice on the same level, i.e., the following filter is not valid as dictionary keys have to be unique:
To be able to handle the above mentioned case, we will allow logical operators to optionally take a list of dictionaries as value, such that the above filter would like this:
(Thanks to @MichelBartels for spotting this!)
Closes #2029