Rename "schema" + "data" tests #3234
Comments
I agree with this point |
I really appreciate you raising the concern @boxysean! For what it's worth, I'm not in love with the terms
Part of the dbt v1.0 project is acknowledging that, given current growth trajectory in weekly active projects, the majority of post-v1.0 dbt users have still never used dbt. If we think there's something, anything, that could be improved for the sake of making it more intuitive and less confusing, the sooner the better.
My perspective here is that "data test" and "schema test" are themselves quite confusing names today; they're just the confusing names that we're used to. (All tests are on data; "schema" is an overloaded term, and schema tests don't need to be defined in files named
I'm not after a 1:1 rename, exactly. Within the codebase, I'd like to see us redefine these to all just be test, wherever possible, and then in our documentation we can differentiate between tests defined via one-off queries vs. tests defined via reusable/generic/parametrized queries, as different implementation mechanisms of the same functionality.
There are a few places where we need to pick specific words, however, for the sake of existing functional parity. The biggest one that comes to mind is the selection criteria:
We're trying to make this distinction less explicit, but it still exists, so I'd like to find some words that would sit comfortably in the codebase, in the selection syntax docs, and in the hearts & minds of our community members. |
Can I propose that
is actually a two-parter as follows?
I'm assuming the code and selection constraints boil down to short and unique is good, one word is better. FWIW, I'm ok with generic as a name, but don't care for bespoke. Some alternative options: targeted/focused/single-use/single-purpose Templated vs fixed? Templated vs non-templated? A bit 🤢, but maybe just crazy enough to work. Particularly inelegant around node selection unless it changed to |
I agree that the current terms can be confusing to users. If you think about it, is a unique test not testing data? New users might understand terms like |
Okay this just hit me, and perhaps it's too late for additional input, but... Splitting the two types of tests on a different axis: what if they were called |
Ok, I think it's high time to make a final call here:
I'm open to one last round of persuading, if anyone wants to get a final word in. Then it's going west, going east, gone till 2.0 at least :) |
I agree with @jtcohen6 , generic test is fine. Less of a fan of bespoke, and harder to explain to new users. I'm not a fan of How about the antonym of generic: A generic test is re-usable across multiple models. |
Good point about potential confusion around running only once. I think @MartinGuindon What do you think of |
Singular is better than one-off or single-use, but not sure I like it more than specific. But still a good option I think. |
My current order of preference, having read all the comments:
999: Bespoke
I originally had specific and singular flipped, but then thought about the verbal gymnastics I'd have to do in Slack trying to talk about a |
"Generic test" 👍. I am strongly against bespoke test. My favorite of the recent options is "singular". I'll throw in different options of a slightly different dimension: "SQL statement test", "statement test", "unparameterized test", "custom test" (throwback to @noel's earlier comment). |
Ooh I don't mind unparameterized. At that point I'd make a play for renaming generic tests to parameterized ones though. Also there are like 4 different ways to spell parametrised by the time you sneak the extra E in and get the Ss and Zs in the mix 😥 |
I think one big difference is having a macro for generic test and NOT having a macro in a bespoke test. |
Thank you all! Sounds like we can really agree about "not bespoke" 😅
@noel: It's a fair distinction, and it's definitely how I think deep-down about the difference between the two: one of these is macro-like, one of these is just my SQL in a file. I struggle with names like
@boxysean Parametrized is a really good word to use when communicating the underlying implementation. I've been using this word to explain how a generic test works: I don't love
@joellabes That's a really good point about how
models:
  - name: my_model
    columns:
      - name: id
        tests:
          - unique # <---- this one
      - name: another_col
        tests:
          - unique # <---- not this one
So I'm leaning toward
I edited the issue above, replacing |
@jtcohen6 when you say "it's desirable for users to write their own custom generic tests, as (parametrized) SQL" at this point they become a Maybe custom is not the "right" word. Could be something like
|
@jtcohen6 I'd be happy with singular + generic. |
Agreed that singular + generic is good. :-) I agree singular won't collide with other words and is reasonably related. |
Lock it in 👍 🔒 |
For v0.20.0, we want to make tests more powerful and (crucially) much more consistent. Today, there are real functional differences between "schema" tests and "data" tests. Instead, tests should just be tests. They'll still have two points of entry, but this should reflect only the dbt developer's choices to trade off clarity and reusability.
As such, we're thinking about renaming:
bespoke test singular test
This renaming has real implications for:
- ParsedDataTestNode, ParsedSchemaTestNode). This shouldn't have any implications for end users.
- global_project/macros/schema_tests/. This is cosmetic only.
- dbt ls. (Today, the unique_id is test.my_project.unique_model_a_fun, but the fqn is my_project.schema_test.unique_model_a_fun.)
Selection criteria
dbt test --data, dbt test -m test_type:data
dbt test --schema, dbt test -m test_type:schema
Should instead become:
dbt test -m test_type:singular
dbt test -m test_type:generic
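A minimal sketch of how the old test_type values could keep working as aliases for the new ones, assuming the final names are generic and singular. Every function and constant name here is hypothetical and is not taken from dbt's codebase; it only illustrates the mapping described in the commands above.

```python
# Hypothetical sketch: these names are illustrative, not dbt internals.
LEGACY_TEST_TYPE_ALIASES = {
    "schema": "generic",   # old "schema test" -> "generic test"
    "data": "singular",    # old "data test"   -> "singular test"
}
VALID_TEST_TYPES = {"generic", "singular"}


def normalize_test_type(value: str) -> str:
    """Return the new test type name, accepting the legacy names as aliases."""
    name = LEGACY_TEST_TYPE_ALIASES.get(value, value)
    if name not in VALID_TEST_TYPES:
        raise ValueError(f"unknown test_type: {value!r}")
    return name


def normalize_selector(selector: str) -> str:
    """Rewrite 'test_type:<old>' to 'test_type:<new>'; pass other selectors through."""
    method, sep, value = selector.partition(":")
    if method != "test_type" or not sep:
        return selector
    return f"test_type:{normalize_test_type(value)}"
```

With this shape, a selector such as test_type:schema resolves to test_type:generic, while selectors that do not use the test_type method are left untouched.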
Notes:
- schema + data as aliases for the names we ultimately decide
- test_type method. I'm open to whether we should continue supporting the --schema and --data flags for backwards compatibility.
Describe alternatives you've considered