♻️ REFACTOR: Fully abstract QueryBuilder #5093

Merged
merged 13 commits into from
Aug 25, 2021

Member

@chrisjsewell chrisjsewell commented Aug 24, 2021

Fixes #4327

There are four primary modules involved in the QueryBuilder abstraction:

  • aiida/orm/implementation/django/querybuilder.py::DjangoQueryBuilder
  • aiida/orm/implementation/sqlalchemy/querybuilder.py::SqlaQueryBuilder
  • aiida/orm/implementation/querybuilder.py::BackendQueryBuilder
  • aiida/orm/querybuilder.py::QueryBuilder

Prior to #5092, all of these modules imported objects from SQLAlchemy, i.e. there was actually no abstraction, since everything was tightly coupled to SQLAlchemy.
As well as this lack of abstraction, the spread of logic and state across these modules (as well as the lack of typing and non-uniform method naming) made it very difficult to decipher the workings of the code.

This PR continues from #5092 to fully abstract the QueryBuilder: it moves all SQLAlchemy code to SqlaQueryBuilder, makes BackendQueryBuilder a proper abstract class, and reduces the QueryBuilder's interaction with the backend to instantiating the implementation plus these 6 calls:

aiida/orm/querybuilder.py:
   131:         self._impl: BackendQueryBuilder = backend.query()
   952:         return self._impl.as_sql(data=self.as_dict(), inline=inline)
   962:         return self._impl.analyze_query(data=self.as_dict(), execute=execute, verbose=verbose)
   986:         result = self._impl.first(self.as_dict())
   999:         return self._impl.count(self.as_dict())
  1014:         for item in self._impl.iterall(self.as_dict(), batch_size):
  1034:         for item in self._impl.iterdict(self.as_dict(), batch_size):

As you can see, whenever the backend is called it is passed the (JSONable) query dict, which it uses to build the query. As was previously the case on the frontend, SqlaQueryBuilder hashes this dict, to ensure the query is only rebuilt on changes.
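
The hash-and-rebuild idea can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class names `BackendQuerySketch` and `CachingBackendSketch`, the `build_count` counter, and the use of `json.dumps` plus SHA-256 for hashing are all assumptions made for the example.

```python
import hashlib
import json
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class BackendQuerySketch(ABC):
    """Hypothetical stand-in for an abstract backend query builder."""

    @abstractmethod
    def count(self, data: Dict[str, Any]) -> int:
        """Return the number of results for the query described by ``data``."""


class CachingBackendSketch(BackendQuerySketch):
    """Toy backend that rebuilds its query only when the query dict changes."""

    def __init__(self) -> None:
        self._hash: Optional[str] = None
        self._query: str = ''
        self.build_count = 0  # number of times the query was actually (re)built

    @staticmethod
    def _hash_data(data: Dict[str, Any]) -> str:
        # sort_keys makes the hash independent of key insertion order
        payload = json.dumps(data, sort_keys=True).encode('utf-8')
        return hashlib.sha256(payload).hexdigest()

    def _update(self, data: Dict[str, Any]) -> str:
        new_hash = self._hash_data(data)
        if new_hash != self._hash:
            # Stand-in for the expensive translation of the dict into a backend query
            self._query = f"QUERY distinct={data.get('distinct')} limit={data.get('limit')}"
            self._hash = new_hash
            self.build_count += 1
        return self._query

    def count(self, data: Dict[str, Any]) -> int:
        self._update(data)
        return 0  # a real backend would execute the query here


query_dict = {'path': [], 'filters': {}, 'limit': None, 'distinct': False}
backend = CachingBackendSketch()
backend.count(query_dict)
backend.count(query_dict)  # same dict, so no rebuild
builds_without_change = backend.build_count
query_dict['distinct'] = True
backend.count(query_dict)  # dict changed, so the query is rebuilt
builds_after_change = backend.build_count
```

Because the frontend only ever hands over a JSONable dict, the backend can decide for itself when a rebuild is needed, and the frontend needs no knowledge of the query object.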

Some other changes of note:

  • aiida/orm/implementation/sqlalchemy/querybuilder.py is split into a few modules, to improve modularity
  • the QueryBuilder.get_query and QueryBuilder.inject_query methods have been removed, since they break the abstraction by relying on an sqlalchemy.Query object (I don't know of any use cases for them)
  • Made the QueryBuilder.distinct method save its state to the query dict, rather than directly calling sqlalchemy.Query.distinct, e.g.
In [7]: qb = QueryBuilder()

In [8]: qb
Out[8]: QueryBuilder(path=[], filters={}, project={}, order_by=[], limit=None, offset=None, distinct=False)

In [9]: qb.distinct()
Out[9]: QueryBuilder(path=[], filters={}, project={}, order_by=[], limit=None, offset=None, distinct=True)

In [10]: qb.distinct(False)
Out[10]: QueryBuilder(path=[], filters={}, project={}, order_by=[], limit=None, offset=None, distinct=False)
  • correctly reset limit after calling QueryBuilder.one()
  • Improve validation of joining_keyword; before it would only validate against all possible keywords, now it validates against only the keywords for the entity type:
In [2]: QueryBuilder().append(orm.Log, with_incoming='a')
---------------------------------------------------------------------------
ValueError: 'with_incoming' is not a valid keyword for 'log' joining specification
Valid keywords are: {'with_node'}
  • Improve the defaults for joining_keyword, i.e. rather than always setting with_incoming for all entities, set with_node for non-node entities:
In [5]: QueryBuilder().append(orm.Node).append(orm.Group)
Out[5]: QueryBuilder(path=[{'entity_type': '', 'orm_base': 'node', 'tag': 'node_1', 'joining_keyword': None, 'joining_value': None, 'edge_tag': None, 'outerjoin': False}, {'entity_type': 'group.core', 'orm_base': 'group', 'tag': 'core_1', 'joining_keyword': 'with_node', 'joining_value': 'node_1', 'edge_tag': 'node_1--core_1', 'outerjoin': False}], filters={'node_1': {'node_type': {'like': '%'}}, 'core_1': {'type_string': {'like': '%'}}, 'node_1--core_1': {}}, project={'node_1': [], 'core_1': [], 'node_1--core_1': []}, order_by=[], limit=None, offset=None, distinct=False)
  • sqlalchemy_utils is now only used in tests/backends/aiida_sqlalchemy/test_utils.py, so I have moved it to the tests extras, rather than install_requires
  • lots of typing!
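
The joining-keyword behaviour described in the list above (validate against the keywords of the entity type, and pick a default per entity type) can be sketched as below. The function `resolve_joining_keyword` and the tables `VALID_JOINS` and `DEFAULT_JOIN` are hypothetical names for this illustration. Only the `log` entry and the two defaults are taken from the examples above; the `node` and `group` keyword sets are assumed and heavily reduced.

```python
from typing import Dict, Optional, Set

# Illustrative keyword table (an assumption, not the real mapping).
VALID_JOINS: Dict[str, Set[str]] = {
    'node': {'with_incoming', 'with_outgoing', 'with_group'},
    'group': {'with_node'},
    'log': {'with_node'},
}

# Defaults as described above: with_incoming for nodes, with_node otherwise.
DEFAULT_JOIN: Dict[str, str] = {
    'node': 'with_incoming',
    'group': 'with_node',
    'log': 'with_node',
}


def resolve_joining_keyword(orm_base: str, keyword: Optional[str] = None) -> str:
    """Return the joining keyword to use for an entity of type ``orm_base``."""
    if keyword is None:
        # No keyword given: fall back to the default for this entity type
        return DEFAULT_JOIN[orm_base]
    valid = VALID_JOINS[orm_base]
    if keyword not in valid:
        # Validate only against the keywords allowed for this entity type
        raise ValueError(
            f'{keyword!r} is not a valid keyword for {orm_base!r} joining specification\n'
            f'Valid keywords are: {sorted(valid)}'
        )
    return keyword


default_for_group = resolve_joining_keyword('group')
default_for_node = resolve_joining_keyword('node')
try:
    resolve_joining_keyword('log', 'with_incoming')
    log_error = None
except ValueError as exc:
    log_error = str(exc)
```

Keeping the table keyed by entity type means an invalid combination fails at `append` time with a message listing only the keywords that could actually work.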

@codecov

codecov bot commented Aug 24, 2021

Codecov Report

Merging #5093 (e18b22c) into develop (eae3c50) will increase coverage by 0.08%.
The diff coverage is 84.95%.

@@             Coverage Diff             @@
##           develop    #5093      +/-   ##
===========================================
+ Coverage    80.56%   80.64%   +0.08%     
===========================================
  Files          532      534       +2     
  Lines        37010    37066      +56     
===========================================
+ Hits         29815    29887      +72     
+ Misses        7195     7179      -16     
Flag Coverage Δ
django 75.32% <76.76%> (+0.08%) ⬆️
sqlalchemy 74.33% <84.81%> (+0.09%) ⬆️

Impacted Files Coverage Δ
aiida/orm/querybuilder.py 84.98% <ø> (+1.04%) ⬆️
...orm/implementation/sqlalchemy/querybuilder/main.py 81.92% <81.92%> (ø)
...m/implementation/sqlalchemy/querybuilder/joiner.py 91.93% <91.93%> (ø)
aiida/orm/implementation/querybuilder.py 94.92% <93.48%> (+15.36%) ⬆️
aiida/orm/groups.py 93.11% <100.00%> (+0.13%) ⬆️
aiida/orm/implementation/django/querybuilder.py 100.00% <100.00%> (ø)
...implementation/sqlalchemy/querybuilder/__init__.py 100.00% <100.00%> (ø)
aiida/orm/nodes/node.py 96.31% <100.00%> (+0.03%) ⬆️
aiida/transports/plugins/local.py 81.41% <0.00%> (-0.25%) ⬇️
... and 3 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@chrisjsewell chrisjsewell linked an issue Aug 24, 2021 that may be closed by this pull request
@chrisjsewell
Member Author

This is good to go FYI

Contributor

@sphuber sphuber left a comment

Thanks @chrisjsewell. I have given the changes a global look, but have to admit that due to the amount of changes have not been able to review everything in detail, so I am relying mostly on the tests here that seem to run without a problem. I left one comment about actually deprecating queryhelp, but since that was not actually changed in this PR, it is OK to merge this as is. It would be good to include the PR message in the commit, since that contains a useful summary of the changes.


  @property
- def queryhelp(self) -> QueryDict:
+ def queryhelp(self) -> 'QueryDictType':
Contributor

Shouldn't we just deprecate this so we can remove it at some point? No reason to keep this around if there is a preferred alternative

Member Author

Yeh, I mentioned this here: #5081 (comment); let's open an issue to track it

Contributor

Ah yeah, I remember seeing this now. I am a bit surprised by you being against having deprecation warnings for a new major release? Why would that be objectionable? And why wouldn't that be a problem on a minor release?
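
For reference, deprecating a property such as queryhelp could look roughly like the sketch below. It assumes that `as_dict()` is the preferred alternative (this thread does not state that explicitly), and the class `QueryBuilderSketch` and the warning text are made up for the illustration; the sketch uses the standard-library `DeprecationWarning`, not any project-specific warning class.

```python
import warnings
from typing import Any, Dict


class QueryBuilderSketch:
    """Hypothetical, minimal stand-in for a query builder frontend."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {'path': [], 'distinct': False}

    def as_dict(self) -> Dict[str, Any]:
        """Return a copy of the query dict (assumed preferred accessor)."""
        return dict(self._data)

    @property
    def queryhelp(self) -> Dict[str, Any]:
        """Deprecated alias that forwards to ``as_dict``."""
        warnings.warn(
            '`queryhelp` is deprecated, use `as_dict()` instead',
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not at this property
        )
        return self.as_dict()


# Record the warning so the behaviour can be inspected
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    result = QueryBuilderSketch().queryhelp
```

The old name keeps working for the duration of the deprecation period, while callers are told what to switch to.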

@chrisjsewell
Member Author

I have given the changes a global look, but have to admit that due to the amount of changes have not been able to review everything in detail, so I am relying mostly on the tests here that seem to run without a problem.

Thanks 😄 Yeh, I promise I was very thorough! And the rigorous use of type checking really does help to make sure all the function/method inputs/outputs are correct

@chrisjsewell chrisjsewell merged commit 4174e5d into aiidateam:develop Aug 25, 2021
@chrisjsewell chrisjsewell deleted the querbuilder-abstract branch August 25, 2021 08:03