Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#3885] Partially parse when environment variables in schema files change #4162

Merged
merged 3 commits into from
Nov 8, 2021

Conversation

gshank
Copy link
Contributor

@gshank gshank commented Oct 29, 2021

resolves #3885

Description

A previous pull request handled env_vars in SQL files. This pull request handles env_vars in schema files. Prior to this yaml dictionaries were rendered by deep_map all at once. In order to flag individual entries in a schema file for partial parsing rendering, the deep_map rendering has been split so that each individual entry is rendered (except for tests and descriptions, which are rendered later). In addition env_vars are trapped by test rendering in several places. This does not address the possibility of env_vars in descriptions.

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • I have updated the CHANGELOG.md and added information about my change

@gshank gshank force-pushed the pp_schema_env_vars branch from 31c19d4 to fffa4de Compare October 29, 2021 14:10
@cla-bot cla-bot bot added the cla:yes label Oct 29, 2021
@gshank gshank force-pushed the pp_schema_env_vars branch 5 times, most recently from a24d1b3 to 7bb7234 Compare November 2, 2021 18:24
@gshank gshank force-pushed the pp_schema_env_vars branch from 7bb7234 to 6c1cf34 Compare November 3, 2021 13:31
@gshank gshank force-pushed the pp_schema_env_vars branch from 6c1cf34 to 17bd354 Compare November 3, 2021 13:55
if '.' in search_name: # source file definitions
(search_name, _) = search_name.split('.')
else:
search_name = target.search_name
Copy link
Contributor

@jtcohen6 jtcohen6 Nov 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying something pretty wacky, and managed to break this:

version: 2
models:
  - name: model_a
    columns:
      - name: id
        tests:
          - unique:
              enabled: "{{ env_var('ENABLED', True) }}"
  File "/Users/jerco/dev/product/dbt/core/dbt/parser/manifest.py", line 439, in parse_project
    parser.parse_file(block, dct=dct)
  File "/Users/jerco/dev/product/dbt/core/dbt/parser/schemas.py", line 507, in parse_file
    self.parse_tests(test_block)
  File "/Users/jerco/dev/product/dbt/core/dbt/parser/schemas.py", line 483, in parse_tests
    self.parse_column_tests(block, column)
  File "/Users/jerco/dev/product/dbt/core/dbt/parser/schemas.py", line 207, in parse_column_tests
    self.parse_test(block, test, column)
  File "/Users/jerco/dev/product/dbt/core/dbt/parser/schemas.py", line 479, in parse_test
    self.parse_node(block)
  File "/Users/jerco/dev/product/dbt/core/dbt/parser/schemas.py", line 417, in parse_node
    node = self._parse_generic_test(
  File "/Users/jerco/dev/product/dbt/core/dbt/parser/schemas.py", line 295, in _parse_generic_test
    self.store_env_vars(target, schema_file_id, self.schema_yaml_vars.env_vars)
  File "/Users/jerco/dev/product/dbt/core/dbt/parser/schemas.py", line 360, in store_env_vars
    search_name = target.search_name
AttributeError: 'UnparsedNodeUpdate' object has no attribute 'search_name'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will make a test case for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fixed. I also added throwing an exception for when there is no node to match a patch, because part of the test case was not doing what it was supposed to.

@@ -21,6 +21,38 @@
import datetime
import re

# Contexts in dbt Core
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is simply outstandingly helpful. Thank you x1000

@gshank gshank force-pushed the pp_schema_env_vars branch from f288f0e to c57fb1d Compare November 3, 2021 21:50
@gshank gshank requested a review from jtcohen6 November 3, 2021 22:17
Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my end-to-end functional testing, this is swell. It's amazing how many places we let users put env vars; this PR manages to nab just about all of them. I'll happily document the limitation about description fields—we can swing back around for those if/when there's need for it.

I'm also deeply grateful for all the comments and annotations explaining the intricacies of our many contexts and spots for rendering. Compared to where we were in January, I feel like we've finally manage to excavate a massive paleontological creature. The project for next year will be finding ways to put the pieces together in a way that's much more sensible for us, and consistent for users.

Another member of the Core eng team should give this code a look before we merge, given the number of things it touches. From my perspective, this gets the job done.

# "kwargs", or keyword args that are passed to the test macro.
# The "kwargs" are not rendered into strings until compilation time.
# The "configs" are rendered here (since they were not rendered back
# in the 'get_key_dicts' methods in the schema parsers).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comment here. This is quite confusing behavior, and we should aim to eventually refactor it. I just realized one more way it can confuse: test kwargs are included when creating the test node's name and unique_id, which are created at parse time, even though they're not really rendered until execution time.

There's some conceptual merit to this distinction. A test with different kwargs is a different test (e.g. accepted_values checking for 1,2,3 instead of a,b,c), whereas a test with different configs is just the same test being configured differently. But this is a fuzzy boundary at best, with some odd consequences. I think we should spend more time rethinking it after v1.

Copy link
Contributor

@jtcohen6 jtcohen6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We weren't able to give this a line-by-line review. At this point, it's more important that we have it in for the release candidate, where it can be thoroughly tested over the next 3-4 weeks.

Nice work on this one! I know it involved touching a lot of codepaths. Serious props for leaving them better than you found them.

@gshank gshank merged commit bda70c9 into main Nov 8, 2021
@gshank gshank deleted the pp_schema_env_vars branch November 8, 2021 16:28
Comment on lines +889 to +893
raise ParsingException(
f"Did not find matching node for patch with name '{patch.name}' "
f"in the '{patch.yaml_key}' section of "
f"file '{source_file.path.original_file_path}'"
)
Copy link
Contributor

@jtcohen6 jtcohen6 Nov 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gshank This is on me for not catching before we merged. This is going to be a breaking change for many users (and even some of our external integration tests). Historically, we've supported the ability to define patches for nodes that don't exist; they simply aren't applied.

Any chance we could switch this from an exception to a debug-level warning message, pass here, and have the parser keep going?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would be fine. We still create the tests for non-existent nodes, which seems kind of wrong, but it won't affect the env_var parsing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, that does seem sort of wrong. Those tests will exist in the manifest, but I don't think they'll actually be executed. For later 🤷

iknox-fa pushed a commit that referenced this pull request Feb 8, 2022
…ange (#4162)

* [#3885] Partially parse when environment variables in schema files
change

* Add documentation for test kwargs

* Add test and fix for schema configs with env_var

automatic commit by git-black, original commits:
  bda70c9
iknox-fa pushed a commit that referenced this pull request Feb 8, 2022
…ange (#4162)

* [#3885] Partially parse when environment variables in schema files
change

* Add documentation for test kwargs

* Add test and fix for schema configs with env_var

automatic commit by git-black, original commits:
  070e13d
  6c6649f
  bda70c9
iknox-fa pushed a commit that referenced this pull request Feb 8, 2022
…ange (#4162)

* [#3885] Partially parse when environment variables in schema files
change

* Add documentation for test kwargs

* Add test and fix for schema configs with env_var

automatic commit by git-black, original commits:
  16e055a
  bda70c9
iknox-fa pushed a commit that referenced this pull request Feb 8, 2022
…ange (#4162)

* [#3885] Partially parse when environment variables in schema files
change

* Add documentation for test kwargs

* Add test and fix for schema configs with env_var

automatic commit by git-black, original commits:
  bda70c9
  dc65118
iknox-fa pushed a commit that referenced this pull request Feb 8, 2022
…ange (#4162)

* [#3885] Partially parse when environment variables in schema files
change

* Add documentation for test kwargs

* Add test and fix for schema configs with env_var

automatic commit by git-black, original commits:
  bda70c9
  f344166
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Env vars + partial parsing
2 participants