Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-2496] [Spike] Determine best practices for composing nodes with python Protocols #7470

Closed
Tracked by #7498
QMalcolm opened this issue Apr 26, 2023 · 1 comment
Closed
Tracked by #7498
Assignees

Comments

@QMalcolm
Copy link
Contributor

Spike

For 1.6 we'll be dropping support for python 3.7. With python >= 3.8 we can being to count on the existence of python's Protocol. We have a desire to use Protocol for defining shared interfaces between dbt-core and MetricFlow for the integration of the two. As such we need to develop a better understanding of what our best practices should be when using Protocols in the composition of python classes like dbt-core nodes.

@QMalcolm QMalcolm added the spike label Apr 26, 2023
@github-actions github-actions bot changed the title [Spike] Determine best practices for composing nodes with python Protocols [CT-2496] [Spike] Determine best practices for composing nodes with python Protocols Apr 26, 2023
@QMalcolm QMalcolm added this to the v1.6 milestone May 3, 2023
@jtcohen6 jtcohen6 removed this from the v1.6 milestone May 3, 2023
@QMalcolm QMalcolm added this to the v1.6 milestone May 8, 2023
@peterallenwebb
Copy link
Contributor

Spike Report

I reviewed the existing work Quigley has done in the dbt-semantic-interfaces repo toward extracting interfaces which will be shared between Core and MetricFlow.

Recommendations

  • We should not use the runtime_checkable decorator on protocols. Although Gerda raised valid concerns that isinstance() will not work without this decorator, it has severe limitations which could bite us. It verifies the presence of member variables and methods, but not their types or signatures. Fortunately, there are only four places we use isinstance() to check a node's type, and they would all be improved by refactoring the code to avoid its use.

  • We should not support default methods on these Protocols (effectively making them mixins) at least for now. The added complexity is not justified by any immediate use case, and functionality can be provided in the default Protocol implementations in the semantic interfaces library.

  • Since the protocols in semantic interfaces will constitute a cross-product API and because changes to them will have versioning implications, we should organize them separately from the rest of the code in that library. They should be contained entirely within the Protocols directory of the repository, and the types used by their member variables and function signatures should also be defined in the same directory. Some of these types (e.g. enums) might be concrete types, but they are part of the expected interface just the same. The work of isolating the protocols to their own directory is largely complete, but there are some exceptions which should be removed as we move forward with MetricFlow integration.

Loose Ends

  • Since the protocols will exist in a separate repository from Core, we will need to have a versioning strategy and release schedule for that repository. I don't know enough about the roadmap now to recommend a complete strategy, but we should look to establish a version 1.0.0 of dbt-semantic-interfaces by the time we release dbt 1.6.

  • Since the semantic interfaces repo is shared by Core and MetricFlow, who will own repo maintenance, releases, and how will we coordinate changes?

@jtcohen6 jtcohen6 removed this from the v1.6 milestone Jun 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants