-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make seeds ref'able, provide for seed configuration #668
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's one little comment about how we add seed configs to the project file. i think the explicit seeds:
is better than the indirect data:
as a top level config. otherwise, this is so so so so so so so awesome.
dbt/model.py
Outdated
@@ -133,7 +135,10 @@ def get_project_config(self, project): | |||
for k in SourceConfig.ExtendDictFields: | |||
config[k] = {} | |||
|
|||
model_configs = project.get('models') | |||
if self.node_type == NodeType.Seed: | |||
model_configs = project.get('data') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
data
is just the default seed data location right? should this be seeds
instead (since the other one is models
)?
@@ -125,14 +125,14 @@ def model_immediate_name(model, non_destructive): | |||
return "{}__dbt_tmp".format(model_name) | |||
|
|||
|
|||
def find_model_by_name(flat_graph, target_name, target_package): | |||
def find_refable_by_name(flat_graph, target_name, target_package): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
def project_config(self): | ||
return { | ||
"data-paths": ['test/integration/005_simple_seed_test/data-config'], | ||
"seeds": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cmcarthur check it out :)
@cmcarthur @drewbanin we're so excited about this feature. Anything we can do to help accelerate the release of 0.9.2 (and therefore this feature) into the wild? |
@alanmcruickshank awesome! At this point, all the code is written for the 0.9.2 release -- we have a couple of PRs left to merge and we're doing some stress testing internally. We'd love it if you could install dbt from GitHub Definitely let me know if you find any bugs or quirks -- your testing will help us feel really confident in this release :) |
Fixes:
This PR makes seed CSVs ref'able. Further, they can now be configured with a
seeds
dict in thedbt_project.yml
file (with the same semantics asmodels:
)The semantics here are a little funny -- seed files can be ref'd, but they are not exactly a part of the graph. As such, they are not created by
dbt run
, anddbt run --models +some_model
will not invoke any upstream seed files.We could make seed files work exactly like models, but there's probably not a ton of merit to running them with every invocation of
dbt run
.Regarding configuration, the only sensible config options are:
In practice, this looks like