Feature: Macros can re-write refs when split into new projects. #197

Merged (4 commits) on Mar 25, 2024

@nicholasyager nicholasyager commented Mar 10, 2024

Description and motivation

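Currently, if a macro contains an explicit ref to a model and that macro is split into a new project, the ref breaks because the model it points to stays in the original project. This PR implements ref rewrites for macros so that the reference becomes a two-argument ref: for example, `ref('orders')` becomes `ref('split_proj', 'orders')`.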

Resolves: #189
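
To make the rewrite concrete, here is a minimal sketch of turning a one-argument ref into a two-argument ref. It is an illustration only, not the implementation in this PR: the `rewrite_refs` function and the `ONE_ARG_REF` pattern are hypothetical names, and the regex assumes simple quoted refs with no version argument.

```python
import re

# Matches a single-argument ref such as ref('orders') or ref("orders").
# Hypothetical pattern for illustration; it does not handle versioned refs.
ONE_ARG_REF = re.compile(r"""ref\(\s*(['"])(?P<model>[^'"]+)\1\s*\)""")


def rewrite_refs(code: str, model_name: str, project_name: str) -> str:
    """Turn ref('<model_name>') into ref('<project_name>', '<model_name>')."""

    def _replace(match: re.Match) -> str:
        if match.group("model") != model_name:
            return match.group(0)  # leave refs to other models untouched
        return f"ref('{project_name}', '{model_name}')"

    return ONE_ARG_REF.sub(_replace, code)


# The macro body from the debug log below, before the rewrite.
macro = "{% macro redirect() %}\n    {{ ref('orders') }}\n{% endmacro %}"
rewritten = rewrite_refs(macro, "orders", "split_proj")
print(rewritten)
```

Because the pattern only matches refs with exactly one argument, running the sketch a second time on already-rewritten code leaves it unchanged.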

(dbt-meshify-7d0CGWIh-py3.11) > $ dbt-meshify --debug split customers --project-path ./temp_proj --select customers                                          [±feature/macro_ref_rewrite ●(✹)]
11:36:08 | INFO | Executing dbt parse...
11:36:09 | INFO | Generating catalog with dbt docs generate...
11:36:09 | DEBUG | Registering the relationship between customers and its resources
11:36:09 | INFO | Selected 4 resources: {'model.split_proj.customers', 'test.split_proj.not_null_customers_customer_id.5c9bf9911d', 'test.split_proj.accepted_values_customers_customer_type__new__returning.d12f0947c8', 'test.split_proj.unique_customers_customer_id.c5af1ff4b1'}
11:36:09 | INFO | Creating subproject customers...
11:36:09 | INFO | Identifying operations required to split customers from split_proj.
11:36:09 | DEBUG | Generate contract to and access for boundary node model.split_proj.customers
11:36:09 | DEBUG | Updating ref functions for children of model.split_proj.customers...
11:36:09 | DEBUG | Moving model.split_proj.customers and associated YML to subproject customers...
11:36:09 | DEBUG | Updating reference to model.split_proj.stg_customers in customers.
11:36:09 | DEBUG | Updating reference to model.split_proj.orders in customers.
11:36:09 | DEBUG | FileChange(operation=<Operation.Append: 'append'>, entity_type=<EntityType.Code: 'code'>, identifier='redirect', path=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/customers/macros/redirect.sql'), data="{% macro redirect() %}\n    {{ ref('orders') }}\n{% endmacro %}", source=None)
11:36:09 | [  1/17] STARTING | Append code `redirect` to temp_proj/customers/macros/redirect.sql
11:36:09 | [  1/17] SUCCESS  | Append code `redirect` to temp_proj/customers/macros/redirect.sql
11:36:09 | DEBUG | FileChange(operation=<Operation.Update: 'update'>, entity_type=<EntityType.Code: 'code'>, identifier='redirect', path=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/customers/macros/redirect.sql'), data="{% macro redirect() %}\n    {{ ref('split_proj', 'orders') }}\n{% endmacro %}", source=None)
11:36:09 | [  2/17] STARTING | Update code `redirect` in temp_proj/customers/macros/redirect.sql
11:36:09 | [  2/17] SUCCESS  | Update code `redirect` in temp_proj/customers/macros/redirect.sql
11:36:09 | DEBUG | FileChange(operation=<Operation.Append: 'append'>, entity_type=<EntityType.Code: 'code'>, identifier='customer_id', path=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/customers/models/docs.md'), data='{% docs customer_id %}\nThe unique key for each customer.\n{% enddocs %}', source=None)
11:36:09 | [  3/17] STARTING | Append code `customer_id` to temp_proj/customers/models/docs.md
11:36:09 | [  3/17] SUCCESS  | Append code `customer_id` to temp_proj/customers/models/docs.md
11:36:09 | DEBUG | ResourceChange(operation=<Operation.Add: 'add'>, entity_type=<EntityType.Model: 'model'>, identifier='customers', path=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/customers/models/marts/__models.yml'), data={'name': 'customers', 'config': {'contract': {'enforced': True}}}, source_name=None)
11:36:09 | [  4/17] STARTING | Add model `customers` to temp_proj/customers/models/marts/__models.yml
11:36:09 | [  4/17] SUCCESS  | Add model `customers` to temp_proj/customers/models/marts/__models.yml
11:36:09 | DEBUG | ResourceChange(operation=<Operation.Add: 'add'>, entity_type=<EntityType.Model: 'model'>, identifier='customers', path=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/customers/models/marts/__models.yml'), data={'name': 'customers', 'access': 'public'}, source_name=None)
11:36:09 | [  5/17] STARTING | Add model `customers` to temp_proj/customers/models/marts/__models.yml
11:36:09 | [  5/17] SUCCESS  | Add model `customers` to temp_proj/customers/models/marts/__models.yml
11:36:09 | DEBUG | FileChange(operation=<Operation.Move: 'move'>, entity_type=<EntityType.Code: 'code'>, identifier='customers', path=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/customers/models/marts/customers.sql'), data=None, source=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/models/marts/customers.sql'))
11:36:09 | [  6/17] STARTING | Move code `customers` to temp_proj/customers/models/marts/customers.sql
11:36:09 | [  6/17] SUCCESS  | Move code `customers` to temp_proj/customers/models/marts/customers.sql
11:36:09 | DEBUG | ResourceChange(operation=<Operation.Add: 'add'>, entity_type=<EntityType.Model: 'model'>, identifier='customers', path=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/customers/models/marts/__models.yml'), data={'name': 'customers', 'description': 'Customer overview data mart, offering key details for each unique customer. One row per customer.', 'columns': {'customer_id': {'name': 'customer_id', 'description': "{{ doc('customer_id') }}", 'tests': ['not_null', 'unique']}, 'customer_name': {'name': 'customer_name', 'description': "Customers' full name."}, 'count_lifetime_orders': {'name': 'count_lifetime_orders', 'description': 'Total number of orders a customer has ever placed.'}, 'first_ordered_at': {'name': 'first_ordered_at', 'description': 'The timestamp when a customer placed their first order.'}, 'last_ordered_at': {'name': 'last_ordered_at', 'description': "The timestamp of a customer's most recent order."}, 'lifetime_spend_pretax': {'name': 'lifetime_spend_pretax', 'description': 'The sum of all the pre-tax subtotals of every order a customer has placed.'}, 'lifetime_spend': {'name': 'lifetime_spend', 'description': 'The sum of all the order totals (including tax) that a customer has ever placed.'}, 'customer_type': {'name': 'customer_type', 'description': "Options are 'new' or 'returning', indicating if a customer has ordered more than once or has only placed their first order to date.", 'tests': [{'accepted_values': {'values': ['new', 'returning']}}]}}}, source_name=None)
11:36:09 | [  7/17] STARTING | Add model `customers` to temp_proj/customers/models/marts/__models.yml
11:36:10 | [  7/17] SUCCESS  | Add model `customers` to temp_proj/customers/models/marts/__models.yml
11:36:10 | DEBUG | ResourceChange(operation=<Operation.Remove: 'remove'>, entity_type=<EntityType.Model: 'model'>, identifier='customers', path=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/models/marts/__models.yml'), data={}, source_name=None)
11:36:10 | [  8/17] STARTING | Remove model `customers` from temp_proj/models/marts/__models.yml
11:36:10 | [  8/17] SUCCESS  | Remove model `customers` from temp_proj/models/marts/__models.yml
11:36:10 | DEBUG | FileChange(operation=<Operation.Update: 'update'>, entity_type=<EntityType.Code: 'code'>, identifier='customers', path=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/customers/models/marts/customers.sql'), data="{{\n    config(\n        materialized='table'\n    )\n}}\n\nwith\n\ncustomers as (\n\n    select * from {{ ref('split_proj', 'stg_customers') }}\n\n),\n\norders_mart as (\n\n    -- Redirect is a proxy for `ref`, used to test ref rewrites in macros.\n    select * from {{ redirect() }}\n\n),\n\norder_summary as (\n\n    select\n        customer_id,\n\n        count(*) as count_lifetime_orders,\n        count(*) > 1 as is_repeat_buyer,\n        min(ordered_at) as first_ordered_at,\n        max(ordered_at) as last_ordered_at,\n\n        sum(subtotal) as lifetime_spend_pretax,\n        sum(order_total) as lifetime_spend\n\n    from orders_mart\n    group by 1\n\n),\n\njoined as (\n\n    select\n        customers.*,\n        order_summary.count_lifetime_orders,\n        order_summary.first_ordered_at,\n        order_summary.last_ordered_at,\n        order_summary.lifetime_spend_pretax,\n        order_summary.lifetime_spend,\n\n        case\n            when order_summary.is_repeat_buyer then 'returning'\n            else 'new'\n        end as customer_type\n\n    from customers\n\n    left join order_summary\n        on customers.customer_id = order_summary.customer_id\n\n)\n\nselect * from joined", source=None)
11:36:10 | [  9/17] STARTING | Update code `customers` in temp_proj/customers/models/marts/customers.sql
11:36:10 | [  9/17] SUCCESS  | Update code `customers` in temp_proj/customers/models/marts/customers.sql
11:36:10 | DEBUG | FileChange(operation=<Operation.Update: 'update'>, entity_type=<EntityType.Code: 'code'>, identifier='customers', path=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/customers/models/marts/customers.sql'), data="{{\n    config(\n        materialized='table'\n    )\n}}\n\nwith\n\ncustomers as (\n\n    select * from {{ ref('split_proj', 'stg_customers') }}\n\n),\n\norders_mart as (\n\n    -- Redirect is a proxy for `ref`, used to test ref rewrites in macros.\n    select * from {{ redirect() }}\n\n),\n\norder_summary as (\n\n    select\n        customer_id,\n\n        count(*) as count_lifetime_orders,\n        count(*) > 1 as is_repeat_buyer,\n        min(ordered_at) as first_ordered_at,\n        max(ordered_at) as last_ordered_at,\n\n        sum(subtotal) as lifetime_spend_pretax,\n        sum(order_total) as lifetime_spend\n\n    from orders_mart\n    group by 1\n\n),\n\njoined as (\n\n    select\n        customers.*,\n        order_summary.count_lifetime_orders,\n        order_summary.first_ordered_at,\n        order_summary.last_ordered_at,\n        order_summary.lifetime_spend_pretax,\n        order_summary.lifetime_spend,\n\n        case\n            when order_summary.is_repeat_buyer then 'returning'\n            else 'new'\n        end as customer_type\n\n    from customers\n\n    left join order_summary\n        on customers.customer_id = order_summary.customer_id\n\n)\n\nselect * from joined", source=None)
11:36:10 | [ 10/17] STARTING | Update code `customers` in temp_proj/customers/models/marts/customers.sql
11:36:10 | [ 10/17] SUCCESS  | Update code `customers` in temp_proj/customers/models/marts/customers.sql
11:36:10 | DEBUG | ResourceChange(operation=<Operation.Update: 'update'>, entity_type=<EntityType.Model: 'model'>, identifier='orders', path=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/models/marts/__models.yml'), data={'name': 'orders', 'config': {'contract': {'enforced': True}}}, source_name=None)
11:36:10 | [ 11/17] STARTING | Update model `orders` in temp_proj/models/marts/__models.yml
11:36:10 | [ 11/17] SUCCESS  | Update model `orders` in temp_proj/models/marts/__models.yml
11:36:10 | DEBUG | ResourceChange(operation=<Operation.Update: 'update'>, entity_type=<EntityType.Model: 'model'>, identifier='orders', path=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/models/marts/__models.yml'), data={'name': 'orders', 'access': 'public'}, source_name=None)
11:36:10 | [ 12/17] STARTING | Update model `orders` in temp_proj/models/marts/__models.yml
11:36:10 | [ 12/17] SUCCESS  | Update model `orders` in temp_proj/models/marts/__models.yml
11:36:10 | DEBUG | ResourceChange(operation=<Operation.Update: 'update'>, entity_type=<EntityType.Model: 'model'>, identifier='stg_customers', path=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/models/staging/__models.yml'), data={'name': 'stg_customers', 'config': {'contract': {'enforced': True}}}, source_name=None)
11:36:10 | [ 13/17] STARTING | Update model `stg_customers` in temp_proj/models/staging/__models.yml
11:36:10 | [ 13/17] SUCCESS  | Update model `stg_customers` in temp_proj/models/staging/__models.yml
11:36:10 | DEBUG | ResourceChange(operation=<Operation.Update: 'update'>, entity_type=<EntityType.Model: 'model'>, identifier='stg_customers', path=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/models/staging/__models.yml'), data={'name': 'stg_customers', 'access': 'public'}, source_name=None)
11:36:10 | [ 14/17] STARTING | Update model `stg_customers` in temp_proj/models/staging/__models.yml
11:36:10 | [ 14/17] SUCCESS  | Update model `stg_customers` in temp_proj/models/staging/__models.yml
11:36:10 | DEBUG | FileChange(operation=<Operation.Add: 'add'>, entity_type=<EntityType.Code: 'code'>, identifier='dbt_project.yml', path=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/customers/dbt_project.yml'), data="name: customers\nconfig-version: 2\nmodel-paths:\n  - models\nmacro-paths:\n  - macros\nseed-paths:\n  - seeds\n  - jaffle_data\ntest-paths:\n  - tests\nanalysis-paths:\n  - analyses\nsnapshot-paths:\n  - snapshots\nclean-targets:\n  - target\n  - dbt_packages\nprofile: split_proj\nrequire-dbt-version:\n  - '>=1.6.0'\n  - <1.7.0\nmodels:\n  customers:\n    +on_schema_change: append_new_columns\n    example:\n      +materialized: view\nseeds:\n  +schema: jaffle_raw\nvars:\n  truncate_timespan_to: '{{ current_timestamp() }}'\n", source=None)
11:36:10 | [ 15/17] STARTING | Add code `dbt_project.yml` to temp_proj/customers/dbt_project.yml
11:36:10 | [ 15/17] SUCCESS  | Add code `dbt_project.yml` to temp_proj/customers/dbt_project.yml
11:36:10 | DEBUG | FileChange(operation=<Operation.Copy: 'copy'>, entity_type=<EntityType.Code: 'code'>, identifier='packages.yml', path=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/customers/packages.yml'), data=None, source=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/packages.yml'))
11:36:10 | [ 16/17] STARTING | Copy code `packages.yml` to temp_proj/customers/packages.yml
11:36:10 | [ 16/17] SUCCESS  | Copy code `packages.yml` to temp_proj/customers/packages.yml
11:36:10 | DEBUG | ResourceChange(operation=<Operation.Add: 'add'>, entity_type=<EntityType.Project: 'project'>, identifier='split_proj', path=PosixPath('/Users/nicholas/projects/nicholasyager/dbt-meshify/temp_proj/customers/dependencies.yml'), data={'name': 'split_proj'}, source_name=None)
11:36:10 | [ 17/17] STARTING | Add project `split_proj` to temp_proj/customers/dependencies.yml
11:36:10 | [ 17/17] SUCCESS  | Add project `split_proj` to temp_proj/customers/dependencies.yml
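
In the log above, the macro is first appended to the new project (step 1/17) and then updated with the rewritten ref (step 2/17). The sketch below replays that append-then-update sequence against an in-memory file map. The `Change` class and `apply_changes` function are hypothetical, simplified stand-ins rather than dbt-meshify's actual classes, and treating an update as a full replacement of the file contents is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Operation(str, Enum):
    APPEND = "append"
    UPDATE = "update"


@dataclass
class Change:
    """Hypothetical, simplified stand-in for the change objects in the log."""

    operation: Operation
    identifier: str
    path: str
    data: Optional[str] = None


def apply_changes(changes: List[Change], files: Dict[str, str]) -> Dict[str, str]:
    """Apply changes in order to an in-memory {path: contents} mapping."""
    for change in changes:
        if change.operation is Operation.APPEND:
            # Append adds the code to the end of the file, creating it if needed.
            files[change.path] = files.get(change.path, "") + (change.data or "")
        elif change.operation is Operation.UPDATE:
            # Assumption: an update replaces the whole file contents.
            files[change.path] = change.data or ""
    return files


path = "temp_proj/customers/macros/redirect.sql"
changes = [
    Change(Operation.APPEND, "redirect", path,
           "{% macro redirect() %}\n    {{ ref('orders') }}\n{% endmacro %}"),
    Change(Operation.UPDATE, "redirect", path,
           "{% macro redirect() %}\n    {{ ref('split_proj', 'orders') }}\n{% endmacro %}"),
]
files = apply_changes(changes, {})
print(files[path])
```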

@nicholasyager nicholasyager added the enhancement New feature or request label Mar 10, 2024
@nicholasyager nicholasyager self-assigned this Mar 10, 2024
# Check if the model is part of the original project.
if model_id not in self.project.parent_project.resources:
    raise ValueError(
        f"Unable to find {model_id} in the parent project. How did we get here?"
    )

this is not my beautiful house etc


@dave-connors-3 dave-connors-3 left a comment

I fear I've given you a merge conflict, but it looks good otherwise!

@nicholasyager

@dave-connors-3 The merge conflicts have been resolved 🎉

@dave-connors-3 dave-connors-3 merged commit fa53aad into dbt-labs:main Mar 25, 2024
1 check passed

Successfully merging this pull request may close these issues.

The split command does not re-write ref() methods present in macros