[Feature] Add ability to import/include YAML from other files #9695

Open
3 tasks done
b-per opened this issue Feb 28, 2024 · 9 comments · May be fixed by #10694
Labels
enhancement New feature or request paper_cut A small change that impacts lots of users in their day-to-day Refinement Maintainer input needed yaml

Comments

@b-per
Contributor

b-per commented Feb 28, 2024

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

I wonder if dbt would benefit from providing the ability to include YAML snippets from other files.

Home Assistant (a popular Python tool that uses YAML for configuration) provides a few additional YAML constructors (quick overview of constructors).

The most interesting one is !include, which lets the value of a YAML key be defined in another YAML file.
The Home Assistant implementation is in this source file.

While we can define some logic in Jinja in some YAML files, we are limited today to defining a single entry at a time (e.g. a string or a bool). With this approach we can also store config for nested fields.

Based on my first observations, this technique wouldn't mess with the current dbt code as the YAML is rendered before we start any "real" dbt parsing.
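To make the idea concrete, here is a minimal sketch of an `!include` constructor written against PyYAML. It is not the Home Assistant code and not the prototype mentioned in this issue; `IncludeLoader`, `construct_include`, and the file names are hypothetical, and paths are assumed to be relative to the including file.

```python
import os
import tempfile

import yaml


class IncludeLoader(yaml.SafeLoader):
    """SafeLoader that remembers the directory of the file being loaded."""

    def __init__(self, stream):
        # File objects expose .name; fall back to the CWD for plain strings.
        self._root = os.path.dirname(getattr(stream, "name", "")) or os.getcwd()
        super().__init__(stream)


def construct_include(loader, node):
    """Replace `!include <path>` with the parsed content of that file."""
    rel_path = loader.construct_scalar(node)
    path = os.path.join(loader._root, rel_path)
    with open(path, encoding="utf-8") as handle:
        # Reuse the same loader class so nested includes also resolve.
        return yaml.load(handle, IncludeLoader)


# Registered on the subclass only, so yaml.SafeLoader itself is unchanged.
IncludeLoader.add_constructor("!include", construct_include)

# Demo with two throwaway files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "common_columns.yml"), "w", encoding="utf-8") as f:
        f.write("- name: id\n  data_type: UInt64\n- name: name\n  data_type: String\n")
    with open(os.path.join(tmp, "schema.yml"), "w", encoding="utf-8") as f:
        f.write("models:\n  - name: my_model\n    columns: !include common_columns.yml\n")
    with open(os.path.join(tmp, "schema.yml"), encoding="utf-8") as f:
        schema = yaml.load(f, IncludeLoader)

print(schema["models"][0]["columns"])
```

Because the included file is parsed into ordinary Python lists and dicts before anything else sees it, the included value can be a nested structure rather than a single string or bool.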

Describe alternatives you've considered

Copy-pasting the same logic across YAML files, as we do today

Who will this benefit?

  • people who want to DRY their YAML files (e.g. common tests for columns, common tags, etc.)
  • people who want to define the config of dbt_project.yml from different files, with different code owners in git
  • people following this issue

Are you interested in contributing this feature?

Yes

Anything else?

I got an early prototype working locally and can share some of the code if we want to implement this feature.

@b-per b-per added enhancement New feature or request triage labels Feb 28, 2024
@graciegoheen graciegoheen added the paper_cut A small change that impacts lots of users in their day-to-day label Mar 25, 2024
@dbeatty10 dbeatty10 added Refinement Maintainer input needed and removed triage labels Apr 5, 2024
@databius

Thanks @b-per for opening this feature request. We really need it to keep our dbt projects DRY.

@b-per
Contributor Author

b-per commented Aug 29, 2024

Hi team. Is this something we'd be keen to merge if I write a PR for it?

@will-sargent-dbtlabs

+1 for this feature

@alison985

I've been talking about this under the keywords "YAML inheritance" for at least a year. We desperately need it, IMO. I wouldn't be able to contribute to a PR, but I'm 100% on board for anything else to support it.

@IL-Minh

IL-Minh commented Sep 11, 2024

+1 for this feature

@b-per b-per linked a pull request Sep 11, 2024 that will close this issue
8 tasks
@b-per
Contributor Author

b-per commented Sep 11, 2024

As a few people have shown interest, I created a draft PR with the code I have so far. It is not ready to be merged, but some people might be willing to take it over from here.

@craigneasbey

+1 for this feature

I assume this feature is only for more complex, real data transformations... otherwise everyone would want it?

@dsillman2000

P.S. I've implemented a basic library that supports much of what the OP calls out in the Home Assistant configuration docs. I use it for personal projects, and it has a few other helper constructors. At minimum, its code should help with writing custom PyYAML extensions in dbt-common / dbt-core to support similar syntax and behavior during yaml.load(...) in the core dbt code.

https://github.com/dsillman2000/yaml-extras

Note that I use !import rather than Home Assistant's !include, and the library adds these tag-based constructors:

  • !import.anchor <path> &<anchor-in-file>
  • !import-all <path-with-glob(s)>
  • !import-all.anchor <path-with-glob(s)> &<anchor-in-files>

I took great inspiration from the existing project, pyyaml-include, which I imagine Home Assistant used due to the similar behavior.
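For readers unfamiliar with tag-based constructors, the glob variant can be sketched in a few lines of PyYAML. This is not the yaml-extras implementation: `make_import_all_loader` is a hypothetical helper, and the semantics (return a list of the parsed files, in sorted path order, resolved against a fixed root directory) are assumptions made for illustration.

```python
import glob
import os
import tempfile

import yaml


def make_import_all_loader(root):
    """Build a SafeLoader subclass whose `!import-all` globs relative to root."""

    class ImportAllLoader(yaml.SafeLoader):
        pass

    def construct_import_all(loader, node):
        # Escape the root so only the user-supplied part acts as a glob.
        pattern = os.path.join(glob.escape(root), loader.construct_scalar(node))
        documents = []
        for path in sorted(glob.glob(pattern)):  # sorted for a stable order
            with open(path, encoding="utf-8") as handle:
                documents.append(yaml.load(handle, ImportAllLoader))
        return documents

    ImportAllLoader.add_constructor("!import-all", construct_import_all)
    return ImportAllLoader


# Demo: one column definition per file (hypothetical layout).
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "columns"))
    for filename, body in [
        ("a_id.yml", "name: id\ndata_type: UInt64\n"),
        ("b_name.yml", "name: name\ndata_type: String\n"),
    ]:
        with open(os.path.join(tmp, "columns", filename), "w", encoding="utf-8") as f:
            f.write(body)
    main = 'columns: !import-all "columns/*.yml"\n'
    result = yaml.load(main, make_import_all_loader(tmp))

print(result["columns"])
```

Passing the root directory through a closure avoids relying on the stream having a file name, which matters when the YAML text has already been read into a string before parsing.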

@babaMar

babaMar commented Jan 21, 2025

+1 for this feature.
We are running into this problem as well. This approach:

version: 2

models:
  - name: my_model
    config:
      contract:
        enforced: true

    columns: &common_columns
      - name: composite_id
        data_type: String

      - name: id
        data_type: UInt64
        description: Unique identifier. This ID is used to reference individual academies within the system.

      - name: name
        data_type: String
        description: The name ...

      # [many other columns here]

  - name: my_similar_model
    config:
      contract:
        enforced: true

    columns: 
      - <<: *common_columns
      - name: load_time
        data_type: DateTime
        description: The timestamp when the data was loaded, equivalent to the runtime of the dbt job.
      
      # [other columns]

seems to work at first, but when the contract is enforced, the common columns do not appear to be detected for the second model. We tried all kinds of YAML tricks, but those always result in an error from the loader.

Any other suggestion welcome!
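One possible explanation for the behavior above, assuming dbt's loader treats merge keys the way PyYAML's safe loader does: the `<<` merge key accepts either a mapping or a sequence of mappings, and in the second case it folds all of them into the single mapping that contains the key. It never splices a list into the parent list. The sketch below uses a shortened, hypothetical version of the schema to show the effect.

```python
import yaml

DOC = """
shared: &common_columns
  - name: composite_id
    data_type: String
  - name: id
    data_type: UInt64

columns:
  - <<: *common_columns
  - name: load_time
    data_type: DateTime
"""

columns = yaml.safe_load(DOC)["columns"]

# The merge key folds every aliased mapping into ONE list item instead of
# splicing the aliased list into the parent list. On key conflicts, the
# mapping that appears earlier in the aliased sequence wins.
print(len(columns))
for column in columns:
    print(column)
```

The file loads without error, which would explain why it "seems to work", yet the second model ends up with a single merged column in place of the shared ones. Plain YAML has no syntax for concatenating lists, which is the gap an include-style constructor would fill.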

10 participants