[#3885] Partially parse when environment variables in schema files change #4162

gshank · 2021-10-29T14:10:17Z

resolves #3885

Description

A previous pull request handled env_vars in SQL files. This pull request handles env_vars in schema files. Prior to this yaml dictionaries were rendered by deep_map all at once. In order to flag individual entries in a schema file for partial parsing rendering, the deep_map rendering has been split so that each individual entry is rendered (except for tests and descriptions, which are rendered later). In addition env_vars are trapped by test rendering in several places. This does not address the possibility of env_vars in descriptions.

Checklist

I have signed the CLA
I have run this code in development and it appears to resolve the stated issue
This PR includes tests, or tests are not required/relevant for this PR
I have updated the CHANGELOG.md and added information about my change

change

jtcohen6 · 2021-11-03T18:26:45Z

core/dbt/parser/schemas.py

+                if '.' in search_name:  # source file definitions
+                    (search_name, _) = search_name.split('.')
+            else:
+                search_name = target.search_name


I was trying something pretty wacky, and managed to break this:

version: 2 models: - name: model_a columns: - name: id tests: - unique: enabled: "{{ env_var('ENABLED', True) }}"

File "/Users/jerco/dev/product/dbt/core/dbt/parser/manifest.py", line 439, in parse_project parser.parse_file(block, dct=dct) File "/Users/jerco/dev/product/dbt/core/dbt/parser/schemas.py", line 507, in parse_file self.parse_tests(test_block) File "/Users/jerco/dev/product/dbt/core/dbt/parser/schemas.py", line 483, in parse_tests self.parse_column_tests(block, column) File "/Users/jerco/dev/product/dbt/core/dbt/parser/schemas.py", line 207, in parse_column_tests self.parse_test(block, test, column) File "/Users/jerco/dev/product/dbt/core/dbt/parser/schemas.py", line 479, in parse_test self.parse_node(block) File "/Users/jerco/dev/product/dbt/core/dbt/parser/schemas.py", line 417, in parse_node node = self._parse_generic_test( File "/Users/jerco/dev/product/dbt/core/dbt/parser/schemas.py", line 295, in _parse_generic_test self.store_env_vars(target, schema_file_id, self.schema_yaml_vars.env_vars) File "/Users/jerco/dev/product/dbt/core/dbt/parser/schemas.py", line 360, in store_env_vars search_name = target.search_name AttributeError: 'UnparsedNodeUpdate' object has no attribute 'search_name'

Will make a test case for it.

This should be fixed. I also added throwing an exception for when there is no node to match a patch, because part of the test case was not doing what it was supposed to.

jtcohen6 · 2021-11-03T18:30:34Z

core/dbt/context/base.py

@@ -21,6 +21,38 @@
 import datetime
 import re

+# Contexts in dbt Core


This comment is simply outstandingly helpful. Thank you x1000

jtcohen6

From my end-to-end functional testing, this is swell. It's amazing how many places we let users put env vars; this PR manages to nab just about all of them. I'll happily document the limitation about description fields—we can swing back around for those if/when there's need for it.

I'm also deeply grateful for all the comments and annotations explaining the intricacies of our many contexts and spots for rendering. Compared to where we were in January, I feel like we've finally manage to excavate a massive paleontological creature. The project for next year will be finding ways to put the pieces together in a way that's much more sensible for us, and consistent for users.

Another member of the Core eng team should give this code a look before we merge, given the number of things it touches. From my perspective, this gets the job done.

jtcohen6 · 2021-11-04T16:01:00Z

core/dbt/parser/generic_test_builders.py

+        # "kwargs", or keyword args that are passed to the test macro.
+        # The "kwargs" are not rendered into strings until compilation time.
+        # The "configs" are rendered here (since they were not rendered back
+        # in the 'get_key_dicts' methods in the schema parsers).


Thanks for the comment here. This is quite confusing behavior, and we should aim to eventually refactor it. I just realized one more way it can confuse: test kwargs are included when creating the test node's name and unique_id, which are created at parse time, even though they're not really rendered until execution time.

There's some conceptual merit to this distinction. A test with different kwargs is a different test (e.g. accepted_values checking for 1,2,3 instead of a,b,c), whereas a test with different configs is just the same test being configured differently. But this is a fuzzy boundary at best, with some odd consequences. I think we should spend more time rethinking it after v1.

jtcohen6

We weren't able to give this a line-by-line review. At this point, it's more important that we have it in for the release candidate, where it can be thoroughly tested over the next 3-4 weeks.

Nice work on this one! I know it involved touching a lot of codepaths. Serious props for leaving them better than you found them.

jtcohen6 · 2021-11-08T18:00:32Z

core/dbt/parser/schemas.py

+                raise ParsingException(
+                    f"Did not find matching node for patch with name '{patch.name}' "
+                    f"in the '{patch.yaml_key}' section of "
+                    f"file '{source_file.path.original_file_path}'"
+                )


@gshank This is on me for not catching before we merged. This is going to be a breaking change for many users (and even some of our external integration tests). Historically, we've supported the ability to define patches for nodes that don't exist; they simply aren't applied.

Any chance we could switch this from an exception to a debug-level warning message, pass here, and have the parser keep going?

Yes, that would be fine. We still create the tests for non-existent nodes, which seems kind of wrong, but it won't affect the env_var parsing.

Agree, that does seem sort of wrong. Those tests will exist in the manifest, but I don't think they'll actually be executed. For later 🤷

…ange (#4162) * [#3885] Partially parse when environment variables in schema files change * Add documentation for test kwargs * Add test and fix for schema configs with env_var automatic commit by git-black, original commits: bda70c9

…ange (#4162) * [#3885] Partially parse when environment variables in schema files change * Add documentation for test kwargs * Add test and fix for schema configs with env_var automatic commit by git-black, original commits: 070e13d 6c6649f bda70c9

…ange (#4162) * [#3885] Partially parse when environment variables in schema files change * Add documentation for test kwargs * Add test and fix for schema configs with env_var automatic commit by git-black, original commits: 16e055a bda70c9

…ange (#4162) * [#3885] Partially parse when environment variables in schema files change * Add documentation for test kwargs * Add test and fix for schema configs with env_var automatic commit by git-black, original commits: bda70c9 dc65118

…ange (#4162) * [#3885] Partially parse when environment variables in schema files change * Add documentation for test kwargs * Add test and fix for schema configs with env_var automatic commit by git-black, original commits: bda70c9 f344166

gshank force-pushed the pp_schema_env_vars branch from 31c19d4 to fffa4de Compare October 29, 2021 14:10

cla-bot bot added the cla:yes label Oct 29, 2021

gshank force-pushed the pp_schema_env_vars branch 5 times, most recently from a24d1b3 to 7bb7234 Compare November 2, 2021 18:24

gshank requested review from leahwicz, jtcohen6, nathaniel-may and emmyoop November 2, 2021 18:30

gshank force-pushed the pp_schema_env_vars branch from 7bb7234 to 6c1cf34 Compare November 3, 2021 13:31

[#3885] Partially parse when environment variables in schema files

17bd354

change

gshank force-pushed the pp_schema_env_vars branch from 6c1cf34 to 17bd354 Compare November 3, 2021 13:55

Add documentation for test kwargs

eb0e340

jtcohen6 reviewed Nov 3, 2021

View reviewed changes

Add test and fix for schema configs with env_var

c57fb1d

gshank force-pushed the pp_schema_env_vars branch from f288f0e to c57fb1d Compare November 3, 2021 21:50

gshank requested a review from jtcohen6 November 3, 2021 22:17

jtcohen6 reviewed Nov 4, 2021

View reviewed changes

jtcohen6 approved these changes Nov 8, 2021

View reviewed changes

gshank merged commit bda70c9 into main Nov 8, 2021

gshank deleted the pp_schema_env_vars branch November 8, 2021 16:28

jtcohen6 reviewed Nov 8, 2021

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[#3885] Partially parse when environment variables in schema files change #4162

[#3885] Partially parse when environment variables in schema files change #4162

gshank commented Oct 29, 2021

jtcohen6 Nov 3, 2021 •

edited

Loading

gshank Nov 3, 2021

gshank Nov 3, 2021

jtcohen6 Nov 3, 2021

jtcohen6 left a comment

jtcohen6 Nov 4, 2021

jtcohen6 left a comment

jtcohen6 Nov 8, 2021 •

edited

Loading

gshank Nov 8, 2021

jtcohen6 Nov 8, 2021

[#3885] Partially parse when environment variables in schema files change #4162

[#3885] Partially parse when environment variables in schema files change #4162

Conversation

gshank commented Oct 29, 2021

Description

Checklist

jtcohen6 Nov 3, 2021 • edited Loading

Choose a reason for hiding this comment

gshank Nov 3, 2021

Choose a reason for hiding this comment

gshank Nov 3, 2021

Choose a reason for hiding this comment

jtcohen6 Nov 3, 2021

Choose a reason for hiding this comment

jtcohen6 left a comment

Choose a reason for hiding this comment

jtcohen6 Nov 4, 2021

Choose a reason for hiding this comment

jtcohen6 left a comment

Choose a reason for hiding this comment

jtcohen6 Nov 8, 2021 • edited Loading

Choose a reason for hiding this comment

gshank Nov 8, 2021

Choose a reason for hiding this comment

jtcohen6 Nov 8, 2021

Choose a reason for hiding this comment

jtcohen6 Nov 3, 2021 •

edited

Loading

jtcohen6 Nov 8, 2021 •

edited

Loading