Disabled Models in schema files #5868
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.
This feels a little hacky, but it does make the thrown exceptions correct. I opened #5869 to explore the possible underlying issue.
I think this needs a few changes, as described in the comments.
core/dbt/parser/schemas.py
Outdated
# If this yaml file is enabled but the project config is not, we need to move
# the node from disabled to manifest.nodes
if patch.config.get("enabled"):
    test_from = {"key": block.target.yaml_key, "name": block.target.name}
The "test_from" variation on add_node is only for tests, not regular nodes. I don't think we should need to do "add_node" plus "remove_node". I would just do something like popping it from "disabled" and adding it to "nodes". Also, we can't really do this if there are multiple disabled nodes, since we wouldn't know which one to enable, so I think we'll have to limit this to cases where there's only one disabled node with this unique_id. You also shouldn't need to do a ref_lookup to get the unique_id; the node will already have a unique_id in it. So there would be an if/else after "if patch.config.get("enabled")" that checks len(found_nodes) == 1, plus throwing an error if len is more than 1.
I think it might be possible to just add the node to ref_lookup by manifest.ref_lookup().add_node(node) too. Also need to remove the node from disabled_lookup. Could either rebuild it or add a function to remove node. Probably simpler to rebuild it.
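A minimal sketch of the pop-and-move idea described above, using plain dictionaries rather than dbt's real Manifest. The function name `enable_single_disabled`, the `ParsingError` class, and the dict-shaped nodes are all assumptions made for illustration; the actual code would also have to update ref_lookup and disabled_lookup as noted.

```python
from typing import Dict, List


class ParsingError(Exception):
    """Hypothetical stand-in for dbt's ParsingException."""


def enable_single_disabled(
    nodes: Dict[str, dict],
    disabled: Dict[str, List[dict]],
    unique_id: str,
) -> None:
    """Move a disabled node into `nodes` when exactly one candidate exists."""
    found_nodes = disabled.get(unique_id, [])
    if len(found_nodes) > 1:
        # More than one disabled node shares this unique_id, so there is
        # no way to know which one the schema file wants to enable.
        raise ParsingError(
            f"Found {len(found_nodes)} matching disabled nodes for '{unique_id}'."
        )
    if len(found_nodes) == 1:
        # Pop from "disabled" and add to "nodes"; the node already
        # carries its own unique_id, so no lookup is needed.
        node = disabled.pop(unique_id)[0]
        node["enabled"] = True
        nodes[node["unique_id"]] = node
```

With zero matches the function does nothing, which leaves the ordinary patch handling to deal with nodes that are already enabled.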
On thinking about this more, I think we're not handling some other cases. The precedence order is 1) dbt_project.yml config, 2) schema yaml config, 3) model config. So in theory the config in the model file could override the schema yaml config, and we have no code to move around the disabled/not disabled nodes for that case.
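To make the precedence concrete, here is a small sketch in which later sources override earlier ones. The function `resolve_enabled`, the use of `None` for "not set", and the default of `True` are assumptions for illustration, not dbt's actual config-merging code.

```python
from typing import Optional


def resolve_enabled(
    project: Optional[bool],
    schema_yaml: Optional[bool],
    model_file: Optional[bool],
) -> bool:
    """Resolve `enabled`, letting each later source override the previous."""
    enabled = True  # assumed default when nothing sets the value
    # Order follows the comment above: dbt_project.yml, schema yaml, model file.
    for value in (project, schema_yaml, model_file):
        if value is not None:
            enabled = value
    return enabled
```

Under this reading, a node disabled in dbt_project.yml and enabled in the schema yaml can still end up disabled if the model file sets `enabled=False`, which is the case the comment says is not handled.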
I really think we should switch to not having a separate dictionary for disabled, but that's probably for a later release.
For now I think that it might be best to always apply the patches to disabled nodes, except for the case where enabled is set to True and there is more than 1 matching disabled node (I kind of wish we never allowed that...) where we raise an error. Then when schema parsing is done, before refs are resolved, loop through nodes and disabled and make sure the nodes are in the right dictionary.
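The reconciliation pass proposed in the paragraph above could look roughly like the following. This is a sketch over plain dictionaries: `reconcile`, the dict-shaped nodes with an `"enabled"` key, and the use of `ValueError` are assumptions, and it does not handle a unique_id that is enabled in both dictionaries at once.

```python
from typing import Dict, List


def reconcile(nodes: Dict[str, dict], disabled: Dict[str, List[dict]]) -> None:
    """After schema parsing, put every node in the dictionary it belongs in."""
    # Iterate over snapshots so the dictionaries can be modified safely.
    for unique_id, node in list(nodes.items()):
        if not node["enabled"]:
            disabled.setdefault(unique_id, []).append(nodes.pop(unique_id))

    for unique_id, candidates in list(disabled.items()):
        enabled = [n for n in candidates if n["enabled"]]
        if len(enabled) > 1:
            raise ValueError(
                f"Found {len(enabled)} enabled nodes for '{unique_id}'."
            )
        if enabled:
            nodes[unique_id] = enabled[0]
            remaining = [n for n in candidates if not n["enabled"]]
            if remaining:
                disabled[unique_id] = remaining
            else:
                # Drop the key entirely once no disabled node is left.
                del disabled[unique_id]
```

Running this once, before refs are resolved, means the patch code itself never has to decide which dictionary a node lives in.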
In addition, I think there's a hole in the "add_disabled" code that we didn't handle in the ticket for disabling metrics and exposures. It needs to be updated to check for test nodes (which used to be the only nodes from a SchemaSourceFile that could be updated), because the "test_from" piece only applies to tests.
Can you think of any additional holes?
bf44806 to e464f5d
A comment from last time seems to have gotten lost. The processing in parse_patch looks okay, but I think the 'process_nodes' code still has issues.
core/dbt/parser/manifest.py
Outdated
if node.config.enabled:
    for dis_index, dis_node in enumerate(disabled):
        # Remove node from disabled and unique_id from disabled dict if necessary
        enable_nodes[dis_index] = dis_node
I commented on this last time, but it looks like github lost it somehow... Making the index the key in the 'enable_nodes' dictionary means that if you have multiple enabled nodes at the same index, the first one will be overwritten.
I'm not sure why you are saving these in separate structures. It seems like for both looping through the nodes and looping through the disabled you could just move them when encountered.
@gshank I'm saving them as a separate structure because Python doesn't like it when you modify the dict you are looping through. disabled on line 947 is the list of nodes for the unique id. Probably worth renaming if it's confusing. So the index is just the index of the list item. There can't be multiple nodes at a single index since each index represents a single node.
Now I see it. Will fix it.
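One reading of the overwrite concern, sketched with plain dictionaries: if the holding structure is keyed only by the position within each per-unique_id list, entries from different unique_ids that share a position collide. Both function names and the dict-shaped nodes are hypothetical; keying by the `(unique_id, index)` pair is one possible fix, not necessarily the one the PR adopted.

```python
from typing import Dict, List, Tuple


def collect_by_index(disabled: Dict[str, List[dict]]) -> Dict[int, dict]:
    """Keyed only by list index: later unique_ids overwrite earlier ones."""
    enable_nodes: Dict[int, dict] = {}
    for node_list in disabled.values():
        for index, node in enumerate(node_list):
            if node["enabled"]:
                enable_nodes[index] = node
    return enable_nodes


def collect_by_id_and_index(
    disabled: Dict[str, List[dict]],
) -> Dict[Tuple[str, int], dict]:
    """Keyed by (unique_id, index): every enabled node is kept."""
    enable_nodes: Dict[Tuple[str, int], dict] = {}
    for unique_id, node_list in disabled.items():
        for index, node in enumerate(node_list):
            if node["enabled"]:
                enable_nodes[(unique_id, index)] = node
    return enable_nodes
```

Collecting first and moving afterwards still respects the constraint that a dict cannot change size while it is being iterated.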
core/dbt/parser/schemas.py
Outdated
# There are multiple disabled nodes for this model and the schema file wants to enable one.
# We have no way to know which one to enable.
msg = (
    f"Found {len(found_nodes)} matching disabled nodes for '{patch.name}'. "
    "Multiple nodes for the same unique id cannot be disabled in the schema "
    "file. They must be disabled in `dbt_project.yml` or in the sql files."
)
raise ParsingException(msg)
@jtcohen6 can I get some input on the error message here, please?
Thanks for tagging me in — this looks mostly good to me!
Could we include the resource type? Is that available from the patch?
How about:
Found {len(found_nodes)} matching disabled nodes for {patch.resource_type} '{patch.name}'.
Multiple nodes with the same unique_id cannot be disabled in yaml resource properties.
If you need to have multiple disabled nodes with the same names, you should instead disable them
using in-file config, or resource-path config in `dbt_project.yml`.
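For reference, the suggested wording could be assembled as below. The `Patch` dataclass is a hypothetical stand-in; whether the real patch object exposes `resource_type` is exactly the open question in this comment.

```python
from dataclasses import dataclass


@dataclass
class Patch:
    """Hypothetical stand-in for the parsed patch; fields are assumed."""

    resource_type: str
    name: str


def multiple_disabled_message(patch: Patch, found_count: int) -> str:
    """Build the proposed error text for multiple matching disabled nodes."""
    return (
        f"Found {found_count} matching disabled nodes for "
        f"{patch.resource_type} '{patch.name}'. "
        "Multiple nodes with the same unique_id cannot be disabled in yaml "
        "resource properties. If you need to have multiple disabled nodes with "
        "the same names, you should instead disable them using in-file config, "
        "or resource-path config in `dbt_project.yml`."
    )
```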
core/dbt/parser/manifest.py
Outdated
# make sure the nodes are in the manifest.nodes or the disabled dict,
# correctly now that the schema files are also parsed
disable_node_copy = deepcopy(self.manifest.nodes)
for node in disable_node_copy.values():
I'm a little concerned by the overhead of deepcopying the entire nodes dictionary. Maybe we could have a list of disabled unique_ids, leave the 'add_disabled_nofile' in place, and just remove the disabled nodes afterward? I'm not so concerned by the disabled dictionary because it will be much smaller.
Oh, that's cleaner than creating a list of all nodes. 👍
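A sketch of the deferred-removal idea from this thread: record the unique_ids of disabled nodes while iterating, add them to the disabled dictionary in place, and delete them from nodes only after the loop ends. The function name and the dict-shaped nodes are assumptions; in the real code the in-loop step would be the existing add_disabled_nofile call.

```python
from typing import Dict, List


def move_disabled_without_copy(
    nodes: Dict[str, dict],
    disabled: Dict[str, List[dict]],
) -> List[str]:
    """Move disabled nodes out of `nodes` without copying the dictionary."""
    disabled_ids: List[str] = []
    for unique_id, node in nodes.items():
        if not node["enabled"]:
            # Only `disabled` changes inside the loop, so iterating over
            # `nodes` directly is safe and no deepcopy is needed.
            disabled.setdefault(unique_id, []).append(node)
            disabled_ids.append(unique_id)
    # Remove from `nodes` only after iteration has finished.
    for unique_id in disabled_ids:
        del nodes[unique_id]
    return disabled_ids
```

The extra memory is a list of ids rather than a full copy of every node, and the moved objects are the originals, not copies.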
* clean up debugging
* reword some comments
* changelog
* add more tests
* move around the manifest.node
* fix typos
* all tests passing
* move logic for moving around nodes
* add tests
* more cleanup
* fix failing pp test
* remove comments
* add more tests, patch all disabled nodes
* fix test for windows
* fix node processing to not overwrite enabled nodes
* add checking disabled in pp, fix error msg
* stop deepcopying all nodes when processing
* update error message
(cherry picked from commit fa4f9d3)
Co-authored-by: Emily Rockman <[email protected]>
resolves #3992
Description
Checklist
changie new to create a changelog entry