Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancing Unit Testing by Adding sqldb Fixture #1270

Merged
merged 3 commits into from
Nov 29, 2023

Conversation

jieguangzhou
Copy link
Collaborator

@jieguangzhou jieguangzhou commented Nov 10, 2023

Description

#673

Related Issues

Checklist

  • Is this code covered by new or existing unit tests or integration tests?
  • Did you run make test successfully?
  • Do new classes, functions, methods and parameters all have docstrings?
  • Were existing docstrings updated, if necessary?
  • Was external documentation updated, if necessary?

Additional Notes or Comments

Comment on lines +74 to 81
children = relationship(
"Component",
secondary=ParentChildAssociation.__table__,
primaryjoin=id == ParentChildAssociation.parent_id,
secondaryjoin=id == ParentChildAssociation.child_id,
backref="children",
backref="parents",
cascade="all, delete",
)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got parents and children reversed

)
.all()
)
return sum([c.parents for c in components], [])
parents = [a.parent_id for a in assocations]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return unique_id, keep the same logic as MongoDBMetaDataStore

@@ -251,7 +255,7 @@ def _replace_object(self, info, identifier, type_id, version):
Component.type_id == type_id,
Component.identifier == identifier,
Component.version == version,
).update({'dict': info})
).update(info)
Copy link
Collaborator Author

@jieguangzhou jieguangzhou Nov 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix error structure

Comment on lines +287 to +308
identifiers = [c.identifier for c in session.query(Component).all()]
identifiers = sorted(set(identifiers), key=lambda x: identifiers.index(x))
return identifiers
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deduplicate identifiers, keep the same logic with MongoDBMetaDataStore

@jieguangzhou jieguangzhou force-pushed the unittests/sqllite branch 2 times, most recently from 0211178 to 0748303 Compare November 13, 2023 11:31
@jieguangzhou jieguangzhou force-pushed the unittests/sqllite branch 2 times, most recently from 6f52763 to e8b91df Compare November 27, 2023 07:33
@codecov-commenter
Copy link

codecov-commenter commented Nov 27, 2023

Codecov Report

Attention: 11 lines in your changes are missing coverage. Please review.

Comparison is base (34830a7) 80.33% compared to head (52617a4) 80.12%.
Report is 135 commits behind head on main.

Files Patch % Lines
superduperdb/backends/base/compute.py 68.96% 9 Missing ⚠️
superduperdb/backends/base/artifact.py 80.00% 1 Missing ⚠️
superduperdb/backends/base/data_backend.py 80.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1270      +/-   ##
==========================================
- Coverage   80.33%   80.12%   -0.22%     
==========================================
  Files          95      103       +8     
  Lines        6602     7341     +739     
==========================================
+ Hits         5304     5882     +578     
- Misses       1298     1459     +161     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jieguangzhou jieguangzhou self-assigned this Nov 27, 2023
@fazlulkarimweb fazlulkarimweb changed the title Add sqldb fixture in UT Enhancing Unit Testing by Adding sqldb Fixture Nov 27, 2023
@jieguangzhou jieguangzhou force-pushed the unittests/sqllite branch 2 times, most recently from 5b881d5 to fa9a02d Compare November 28, 2023 15:21
args = Column(JSON)
kwargs = Column(JSON)
method_name = Column(String)
method_name = Column(String(DEFAULT_LENGTH))
stdout = Column(JSON)
stderr = Column(JSON)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For databases such as MySQL, we need to supplement the size of this String. DEFAULT_LENGTH = 255

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

@jieguangzhou jieguangzhou linked an issue Nov 28, 2023 that may be closed by this pull request
@jieguangzhou jieguangzhou force-pushed the unittests/sqllite branch 2 times, most recently from 3aa4fe7 to ee25d60 Compare November 29, 2023 03:39
Comment on lines +37 to +38
@pytest.mark.parametrize("db", [DBConfig.mongodb, DBConfig.sqldb], indirect=True)
def test_only_uri(db):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test a use case using fixtures with MongoDB and SQL databases.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

test/conftest.py Outdated
Comment on lines 288 to 306
def db(request, monkeypatch) -> Iterator[Datalayer]:
# TODO: Use pre-defined config instead of dict here
param = request.param if hasattr(request, 'param') else DBConfig.mongodb
if isinstance(param, dict):
# e.g. @pytest.mark.parametrize("db", [DBConfig.sqldb], indirect=True)
setup_config = param.copy()
elif isinstance(param, tuple):
# e.g. @pytest.mark.parametrize("db", [(DBConfig.sqldb, {'n_data': 10})], indirect=True) # noqa: E501
setup_config = param[0].copy()
setup_config.update(param[1] or {})
else:
raise ValueError(f'Unsupported param type: {type(param)}')

monkeypatch.setattr(CFG, 'data_backend', setup_config['data_backend'])

db = create_db(CFG, **setup_config)
yield db
Copy link
Collaborator Author

@jieguangzhou jieguangzhou Nov 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Configure the db fixture in test cases using the predefined DBConfig.

Standardize fixture creation using configuration settings, with support for MongoDB and SQL databases.

Then, every test case can use the same fixture, yet they can still have different setup behaviors.

  • DBConfig.mongodb_empty
  • DBConfig.sqldb_empty
  • DBConfig.mongodb (include: encoder, data, model, vector_index)
  • DBConfig.sqldb (include: encoder, data, model, vector_index)

The above four are the most commonly used configurations. If there are more frequently used database preset configurations, new configurations can also be preset, as follows:

  • DBConfig.mongodb_no_vector_index (include: encoder, data, model)
  • DBConfig.sqldb_no_vector_index (include: encoder, data, model)
  • DBConfig.mongodb_data (include: encoder, data)
  • DBConfig.sqldb_data (include: encoder, data)

For example:

default setup behavior with using mongodb

def test_delete_many(db):
    ...

empty mongodb

 @pytest.mark.parametrize("db", [DBConfig.mongodb_empty], indirect=True)
 def test_execute_insert_and_find(db):
    ...

empty mongodb and empty sqldb

@pytest.mark.parametrize(
     "db", [DBConfig.mongodb_empty, DBConfig.sqldb_empty], indirect=True
 )
 def test_add_version(db):
    ....

do not add vector_index and add 500 data points

pytest.mark.parametrize(
     "db",
     [
         (DBConfig.mongodb_no_vector_index, {'n_data': 500}),
         (DBConfig.sqldb_no_vector_index, {'n_data': 500}),
     ],
     indirect=True,
 )
 def test_dataset(db):
    ...

or use config param to do this

pytest.mark.parametrize(
     "db",
     [
         (DBConfig.mongodb, {'add_vector_index': False, 'n_data': 500}),
         (DBConfig.sqldb, {'add_vector_index': False, 'n_data': 500}),
     ],
     indirect=True,
 )
 def test_dataset(db):
    ...

do not add vector_index and add default size (5) data points

pytest.mark.parametrize(
     "db",
     [
         DBConfig.mongodb_no_vector_index,
         DBConfig.sqldb_no_vector_index,
     ],
     indirect=True,
 )
 def test_dataset(db):
    ...

This is an optimization I found while working, which involves combining the MongoDB and SQL DB fixtures. What do you think, should we do this? Or should we keep the fixtures separate? If we separate them, it might be a bit more troublesome to use different fixtures for a single test case.

WDYT? @thejumpman2323 @blythed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jieguangzhou jieguangzhou marked this pull request as ready for review November 29, 2023 03:48
Copy link
Collaborator

@blythed blythed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one comment and then we can merge.

superduperdb/backends/mongodb/metadata.py Outdated Show resolved Hide resolved
superduperdb/backends/mongodb/metadata.py Outdated Show resolved Hide resolved
@blythed blythed merged commit b081063 into superduper-io:main Nov 29, 2023
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Modular tests: Modularize over underlying datastore in test suite
4 participants