Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Refactor Weaviate tests #3541

Merged
merged 11 commits into from
Nov 14, 2022
Merged

refactor: Refactor Weaviate tests #3541

merged 11 commits into from
Nov 14, 2022

Conversation

masci
Copy link
Contributor

@masci masci commented Nov 7, 2022

Related Issues

  • fixes n/a

Proposed Changes:

Move Weaviate tests into their own class deriving from DocumentStoreBaseTestAbstract

How did you test it?

pytest -m"document_store and integration" test/document_stores/test_weaviate.py

Notes for the reviewer

  • calls to data_objects were patched to avoid incurring in the warning
    "The non-class namespaced APIs (None value for class_name) are going to be removed in the future versions of the Weaviate Server and Weaviate Python Client."
  • Weaviate version running in CI was bumped to 1.16 so we get bugfixes we need for filtering

Checklist

@masci masci force-pushed the massi/test-weaviate branch from 823305c to 507b8e4 Compare November 8, 2022 15:45
@masci masci marked this pull request as ready for review November 9, 2022 14:54
@masci masci requested a review from a team as a code owner November 9, 2022 14:54
@masci masci requested review from julian-risch and ZanSara and removed request for a team November 9, 2022 14:54
Copy link
Member

@julian-risch julian-risch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much cleaner to me than before, great! 👍 Only two small changes that I would like to see where I commented. Further, I was wondering about the weaviate_fixture in conftest.py where there is still version weaviate:1.14.1. Other than that the PR looks ready to merge to me.

haystack/document_stores/weaviate.py Show resolved Hide resolved
.github/workflows/tests.yml Show resolved Hide resolved
docs = ds.query(query_text)
assert len(docs) == 3

# BM25 retrieval WITH filters is not yet supported as of Weaviate v1.14.1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Version number in this comment should be upgraded.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do this in a different PR, opened an issue for that #3553

Copy link
Contributor

@ZanSara ZanSara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this suite is getting really evident that we have many flavors of docstores: keyword docstores, docstores supporting labels, docstores supporting query_by_embedding... Really nice to see this trait coming out so clearly. Something to keep in mind if we ever decide to review the API.

.github/workflows/tests.yml Show resolved Hide resolved

@pytest.mark.integration
def test_delete_index(self, ds, documents):
"""Contrary to other Document Stores, this doesn't raise if the index is empty"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not for this PR, but let's raise an issue on this inconsistency. Also faiss has the same issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fully agree, I planned to do this once all the document store tests are moved to their own classes (read: when I can kill test_document_store.py). This way I will have a precise picture of all the inconsistencies.

assert ds.get_document_count(index="custom_index") == 0

@pytest.mark.integration
def test_query_by_embedding(self, ds, documents):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test should go in the base suite. All docstores should support query_by_embedding

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, I would do this in a dedicated PR because doing it here might bring in unrelated changes to other document stores

assert len(docs) == 3

@pytest.mark.integration
def test_query(self, ds, documents):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically this is a test that all KeywordDocumentStore subclasses should pass, which means, all docstores supporting BM25. Worth making another test base suite and moving this one in there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the majority of the document stores support BM25 we should move this method to the base class and let the ones that doesn't support it override and skip the test.

@masci masci requested a review from julian-risch November 11, 2022 14:57
Copy link
Member

@julian-risch julian-risch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now! Thank you for addressing the change requests so quickly. 👍

@masci masci merged commit 4dfddf0 into main Nov 14, 2022
@masci masci deleted the massi/test-weaviate branch November 14, 2022 08:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants