Improve Application of DRY Principle to Source Freshness Definitions in .yml files in large dbt projects #3397
Comments
@codigo-ergo-sum Thanks for an excellent write-up of the problem! I see this as one use case motivating some changes that we've long had in mind for how resources are configured, and properties defined, in dbt projects. I think you've proposed some solid solutions, which I'll take one by one:
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.
I think this issue is still relevant, particularly for larger dbt projects, so I'm commenting here to keep it open.
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.
Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.
Describe the feature
Provide a better method for unified, nonrepetitive management of source freshness definition (and potentially other source-related settings) in dbt.
dbt currently allows for the hierarchical definition of source freshness directives at the source level which will cascade down to individual tables under the source: https://docs.getdbt.com/reference/resource-properties/freshness
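As a minimal sketch of the single-file inheritance described above: the source name, table names, and loaded_at_field column below are hypothetical, and the thresholds are arbitrary. Freshness declared at the source level applies to every table listed under it unless a table overrides it or opts out.

```yaml
# Hypothetical single-file source definition (all names are illustrative).
version: 2

sources:
  - name: erp
    loaded_at_field: _loaded_at        # assumed timestamp column
    freshness:                         # source-level default
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: orders                   # inherits the source-level freshness
      - name: customers
        freshness:                     # table-level override
          warn_after: {count: 6, period: hour}
          error_after: {count: 12, period: hour}
      - name: product_skus
        freshness: null                # opts this table out of freshness checks
```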
However, this only works when the source and the tables defined underneath it are in the same .yml file. In larger dbt projects we often have tens or hundreds of tables loaded from a single source, along with a significant number of tests and other info for each table. Putting all the tests, descriptions, and source freshness directives for every table in a source into one giant multi-thousand-line .yml file tends not to work well: it causes problems with versioning, change detection and merge conflicts, and code review (it is much easier for a reviewer to see from the file name alone which source table definitions were edited).
Source freshness inheritance does not work if you define the source-level freshness directives in one .yml file containing only that source-level info and then keep a separate file for each source table. There is no notion of implicit inheritance across .yml files.
In practice, this means we end up repeating the source freshness definitions in every source .yml file. That repetition is likely to produce errors or misconfiguration somewhere along the line, and it goes against one of dbt's core principles: Don't Repeat Yourself.
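To illustrate the repetition, here is a rough sketch of the one-file-per-table layout. The file paths, source name, table names, and thresholds are all hypothetical; the two YAML documents stand in for two separate files, and each one has to restate the same source-level freshness block.

```yaml
# File 1 (hypothetical path): models/staging/erp/src_erp__orders.yml
version: 2
sources:
  - name: erp
    loaded_at_field: _loaded_at
    freshness:                         # repeated in every file
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: orders
---
# File 2 (hypothetical path): models/staging/erp/src_erp__customers.yml
version: 2
sources:
  - name: erp
    loaded_at_field: _loaded_at
    freshness:                         # same block again; easy to let drift
      warn_after: {count: 12, period: hour}
      error_after: {count: 24, period: hour}
    tables:
      - name: customers
```

With tens or hundreds of tables, every change to the default thresholds has to be made in each file, which is where misconfiguration tends to creep in.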
Describe alternatives you've considered
Defining source freshness in dbt_project.yml somehow. But the problem here is that dbt_project.yml can already get pretty big, and we don't necessarily want it to balloon even more.
Adding a separate file, such as dbt_project_source.yml, which explicitly handles global source configuration.
Additional context
All databases would be relevant.
Who will this benefit?
Anyone with a dbt project large enough, and with enough tables per source, that a single .yml file per source is impractical, and who wants to take full advantage of many smaller, well-named files for source table management.
Are you interested in contributing this feature?
I can try :).