Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Storing test results in the database #2593

Closed
jtcohen6 opened this issue Jun 25, 2020 · 13 comments · Fixed by #3316
Closed

Storing test results in the database #2593

jtcohen6 opened this issue Jun 25, 2020 · 13 comments · Fixed by #3316
Labels
dbt tests Issues related to built-in dbt testing functionality discussion enhancement New feature or request
Milestone

Comments

@jtcohen6
Copy link
Contributor

jtcohen6 commented Jun 25, 2020

Describe the feature

dbt testing is powerful; being notified when a test fails can be life-saving; debugging those failed tests could be easier!

There are two long-lived issues related to this subject (#517 and #903), and dozens of valuable comments on both. After discussing with several colleagues and a few community members, we've arrived at an approach I feel good about.

The goal here isn't to build a dimensional model of historical data surrounding dbt invocations. I think we can and should build out capacity for those longitudinal analyses, but it should be powered by dbt artifacts (manifest and run results) rather than adding records to database tables (a la logging package).

Rather, the goal is to add tooling that can be of immediate benefit to dbt developers, for whom debugging tests is a slog that's harder than it needs to be. We want to provide the post-test investigator with a high-level overview of dbt test statuses, and the errant records from any failing tests, within a query or two.

A lot of this work is going to build on top of v2 schema tests (#1173) and data tests (#2274), both of which are on the 1.0.0 docket. So while this feature is not coming in the next month or two, it is firmly on the roadmap.

Spec: v2 schema tests

To get where we want to go, we need to talk a little about our planned changes to schema tests.

Let's imagine a modified version of the current not_null schema test. Several small things are different, the most significant of which is that the test returns a set of records instead of just count(*), similar to how data tests work today:

{% test not_null(model, column_name) %}

    {% set name = 'not_null__' ~ model.name ~ '__' ~ column_name %}

    {% set description %} Assert that {{ column_name }} is never null in {{ model }} {% endset %}

    {% set fail_msg %} Found {{ result }} null values of {{ model }}.{{ column_name }} {% endset %}
        
    {{ config(name=name, description=description, fail_msg=fail_msg) }}
    
    select * from {{ model }}
    where {{ column_name }} is null

{% endmacro %}

Let's say we have two not_null tests defined in our project, in a file called resources.yml:

version: 2

models:
  - name: customers
    colums:
      - name: id
        tests:
          - not_null

  - name: orders
    columns:
      - name: order_status
        tests:
          - not_null

The standard way of executing these tests looks like compiling and running some queries, and storing the compiled version of those queries in the target directory:

-- compiled SQL stored in target/compiled/models/resources.yml/schema_test/not_null__customers__id.sql`

select count(*) from (
    select * from dev_jcohen.customers
    where id is null
) validation_errors
;

-- compiled SQL stored in target/compiled/models/resources.yml/schema_test/not_null__orders__order_status.sql`

select count(*) from (
    select * from dev_jcohen.orders
    where order_status is null
) validation_errors
;

After running, dbt returns any failures as log output to the CLI:

12:00:01 | 1 of 2 START test not_null__customers__id.........................[RUN]
12:00:02 | 1 of 2 PASS not_null__customers__id.............................. [PASS in 1.00s]
12:00:03 | 2 of 2 START test not_null__orders__order_status................. [RUN]
12:00:04 | 2 of 2 FAIL not_null__orders__order_status....................... [FAIL 7 in 1.00s]

Failure in test not_null__orders__order_status (models/resources.yml)
  Found 7 null values of orders.order_status

  compiled SQL at target/compiled/models/resources.yml/schema_test/not_null__orders__order_status.sql

There are many features we imagine unlocking with v2 testing:

  • In addition to warn severity, more configurable warn_after and error_after thresholds based on a scalar count or %
  • Tagging tests to customize failure notifications

For now, let's focus on storing the results of these tests in the database for easier debugging.

Spec: storing results

Now let's imagine a new test flag:

$ dbt test --store-results

It's unlikely that dbt developers would want to include this flag when iterating on net-new models, but it's quite likely that production or CI test runs would benefit from including it.

When the --store-results flag is included, dbt will instead execute tests like so:

drop schema if exists dev_jcohen__dbt_test_results;
create schema dev_jcohen__dbt_test_results;

create table dev_jcohen__dbt_test_results.not_null__customers__id as (

    -- compiled SQL stored in target/compiled/models/resources.yml/schema_test/not_null__customers__id.sql`

    select * from dev_jcohen.customers
    where id is null

);

create table dev_jcohen__dbt_test_results.not_null__orders__order_status as (

    -- compiled SQL stored in target/compiled/models/resources.yml/schema_test/not_null__orders__order_status.sql`

    select * from dev_jcohen.orders
    where order_status is null
        
);

Based on the row count in the query status returned by the database, or by running a quick count(*) on each table after creation, dbt will determine whether the test passed or failed and create a summary table of all tests in the invocation. (We'll use a mechanism similar to dbt seed to batch inserts and avoid any limits on inserts or query length.)

create table dev_jcohen__dbt_test_results.summary (
    invocation_id string,
    test_name string,
    status string,
    execution_time timestamp
);

insert into dev_jcohen__dbt_test_results.summary values (
    ('xxxxxxxx-xxxx-0000-0000-xxxxxxxxxxxx', 'not_null__customers__id', 'PASS', '2020-06-25 12:00:01'),
    ('xxxxxxxx-xxxx-0000-0000-xxxxxxxxxxxx', 'not_null__orders__order_status', 'FAIL 7', '2020-06-25 12:00:03')
);

And finally log to the CLI:

12:00:01 | 1 of 2 START test not_null__customers__id.........................[RUN]
12:00:02 | 1 of 2 PASS not_null__customers__id.............................. [PASS in 1.00s]
12:00:03 | 2 of 2 START test not_null__orders__order_status................. [RUN]
12:00:04 | 2 of 2 FAIL not_null__orders__order_status....................... [FAIL 7 in 1.00s]

Failure in test not_null__orders__order_status (models/resources.yml)
  Found 7 null values of orders.order_status
    -------------------------------------------------------------------------
    select * from dev_jcohen__dbt_test_results.not_null__orders__order_status
    -------------------------------------------------------------------------

  compiled SQL at target/compiled/models/resources.yml/schema_test/not_null__orders__order_status.sql

Questions

Test definition

dbt will depend entirely on the test block to circumscribe its failures. Each test block query will be responsible for defining:

  • Which columns to include in the table of test failures. E.g. *, an explicit column list, dbt_utils.star or * (except) (on BigQuery).
  • How many rows to include in the table of test failures, e.g. by adding limit 500.

Audit schema

dbt will store all results in one audit schema per dbt test --store-results invocation. It will name the schema {{ target.schema }}__dbt_test_results.

Can I name the schema differently? What about tests on models with custom schemas?

We're going to keep this simple for the first version by storing all tests in one schema with a set name. We envision adding configuration for this later, especially if there is user demand.

Why does dbt drop and recreate the dbt test audit schema every time?

dbt test is a stateless, idempotent operation; dbt test --store-results will be as well. If you run it several times, you will end up with the same artifacts.

How can I preserve history of past test failures?

The same way you preserve historical data from sources, models, or anything else: snapshots on top of tables in the audit schema, or your own custom run-operation macros to copy or unload that data to external storage.

Granular control

The --store-results flag flips dbt test from running a set of queries that performs only introspection to a set of queries that create assets in the database.

Will it be possible to store results for some tests, and not for others?

In the first version of this, the --store-results flag will change the behavior of all tests run. Eventually, we envision adding a test config flag (store_results: false) that takes precedence over the CLI flag, similar to changes to --full-refresh in 0.18.0.

Until then, you can manage this with multiple commands and node selection:

dbt test --store-failures --exclude my_massive_model
dbt test --models my_massive_model
@brunomurino
Copy link
Contributor

Heya, just giving my 2 cents here.

If you want to write stuff to the database, why not make it a model?

A few weeks ago people asked me this exact same thing about storing test failure results in the db, in order for them to be able to fully deprecate a much more expensive tooling they had in place. The solution I came up with was to have tests as models that output a single line, with exactly 1 row and at least the following 3 columns: test_name, test_result and test_details. On the test_details column I aggregate whatever information is relevant to the test, such that when posted on slack (by reading the target compiled file), it has like breaks and etc. After the model runs, there's also a custom schema test to check the column test_results for the value 'FAILED', pretty straightforward. Finally, I used a post-hook to, if the test failed, insert the results in a 'test_history' table, shared between all tests following this patter.

I must say it has worked pretty well for now, but of course, I don't think this would solve every use case.

Here's one of the examples:

with
{{import('base__account_management__transaction')}},
{{import('base__account_management__lender_account')}},

issues as (
    select

        account_id,
        sum(amount) as current_balance

    from base__account_management__transaction t

    join base__account_management__lender_account la
        on t.account_id = la.id

    where t.status in ('CONFIRMED', 'CONFIRMED_AWAITING_APPROVAL', 'CORRECTED')
        and la.account_status = 'CLOSED'

    group by account_id

    having current_balance >= 1

)

select

    'AllClosedAccountsHaveAZeroBalance' as test_name, -- could be {{ model.name }}
    
    case
        when count(*) != 0 then 'FAILED'
        else 'PASSED'
    end as test_result,

    'Number of accounts = ' || count(*) || '\n\n' ||
    listagg('Account: ' || account_id || '\nBalance: ' || current_balance, '\n\n') as test_details

from issues

Here's the macro I used:

{% macro insert_into_tech_rec(tablename) %}

-- you can check if `select test_results from {{ this }}` is FAILED or not to proceed

INSERT INTO {{ target.schema }}.{{tablename}} (run_date, test_name, test_result, test_details)

with
test_table as (
    select * from {{ this }}
)
SELECT

    getdate() as run_date,
    '{{ this.table }}' as test_name,

    test_table.test_result,
    test_table.test_details

FROM test_table
;
{% endmacro %}

And here's the sample slack alert I got from it.

Screenshot 2020-06-26 at 00 29 35

We're also able to surface that test_history table in our viz tool to try and detect patters in test failures etc.

Maybe this could be translated into a custom materialization, no?

Anyway, just thought it was worth sharing..

@bashyroger
Copy link

bashyroger commented Jun 30, 2020

While I like what is written in the FR, I even more agree with @brunomurino 's argument: 'If you want to write stuff to the database, why not make it a model?'

I do get that DBT is stateless now, but IMO this should not apply for tests. After all, you also don't delete the default logging metadata after each DBT run, but simply append to the existing log.

if the hassle of managing a database for storing test results is too much work:
Why not write out the results of tests like @jtcohen6 and @brunomurino are proposing here to a testing log, structured as JSON?
We could then simply use that as a data source, for a Snowflake / BigQuery external table (or other analytical database supporting JSON as source)

Taking @brunomurino 's input as a starter, I would then write the results of a test to 5 JSON keys per test:

  1. DBT run_id
  2. model name
  3. test name
  4. tested column(s), as an array
  5. test started datetime
  6. test finished datetime
  7. test status, (with vals like [OK,WARNING,FAIL]) with the status FAIL based on a treshhold that is 0 by default, but can be overwritten. Like this example: https://discourse.getdbt.com/t/creating-an-error-threshold-for-schema-tests/966
  8. test count, see below for more info
  9. test details , in a nested structure that @brunomurino mentioned, see below for more info

The important detail for me would then be what DBT will write to the test_count and test_details fields.
These are some ideas for the standard tests:

  • UNIQUE:
    -Logging the count of violations in test_count is good.
    -For better debugging and following DQ over time, logging the actual keys that fail the uniqueness check in test_details is better

  • NOT NULL:
    -Logging the count of violations in test_count is good.
    -For better debugging and following DQ over time, logging the violation count per unique key that fails the NOT NULL check in test_details is better. The latter REQUIRES an unique key to be defined or UNIQUE test to be present on a model.

  • ACCEPTED VALUES:
    -Logging the count of violations in test_count is good.
    -Logging the count of violations per OTHER than accepted in test_details better
    -Knowing the exact rows that fail this test plus OTHER than accepted value in test_details even better. The last option again REQUIRES an unique key to be defined or UNIQUE test to be present on a model.

  • RELATIONSHIPS:
    -Logging the count of violations in test_count is good.
    -Logging the parent keys not being present in test_details better.

If you want more input @jtcohen6 , I can mock up some of my proposed test logging JSON records for you...

@bashyroger
Copy link

bashyroger commented Jul 15, 2020

Hi @jtcohen6, I can see from the status of this ticket I see that implementing this is not planned yet. Can you give me any indication if / when this might happen?
Imo the most basic way DBT could start implementing this is already shared in ticket #903 , which comes down to adding in functionality for hooking into when a test starts or ends, like on-run-start / end:

on-test-start:
- "create table if not exists {{ target.schema }}.audit_dbt_test_results (node text, status text, execution_time decimal(10,2), time timestamp)"
on-test-end:
- "insert into {{ target.schema }}.audit_dbt_test_results (node, status, execution_time, time) values {{ results_values(results) }}"

@bashyroger
Copy link

bashyroger commented Jul 16, 2020

@joshtemple, you also might want to contribute to this FR as it greatly overlaps with #517

@jtcohen6
Copy link
Contributor Author

@brunomurino @bashyroger Thank you both for sharing your thoughts, and for engaging seriously with this proposal!

This feature is not currently prioritized, insofar as it depends on—and is meant to inform—work we want to do around v2 test nodes (#1173, #2274) for v1.0.0. My goal in this issue is to shape the feature we want to eventually support, so that we can scope out the required changes to tests that would support it.

You're absolutely right that there are other approaches possible today, which leverage the Results object or dbt logs/artifacts. Bruno laid out a compelling one above. With that multiplicity in mind, I believe the built-in approach needs to be:

Easy to use

This approach needs to make sense for the most basic, straightforward dbt projects. It should be accessible to users who can write and run SQL against their data warehouse. It shouldn't depend on an ability to write Jinja or parse JSON (trickier in some databases than in others). It should feel like a natural outgrowth of what dbt test does today.

My goal is not one of consolidation. It's not to create the be-all and end-all mechanism for storing test results in the database or preserving test metadata. Instead, I want a starting point that works well for most projects, while also opening the door for custom behavior as dbt users become more comfortable writing Jinja.

Idempotent

I believe that we guarantee sensible results and intuitive process by keeping this approach idempotent. I take this point:

I do get that DBT is stateless now, but IMO this should not apply for tests. After all, you also don't delete the default logging metadata after each DBT run, but simply append to the existing log.

It's true that dbt wants to be stateful when it comes to its project files (via git / version control) and debug logs. Every database operation it performs, however, is ultimately idempotent—with snapshots being the exception that prove the rule. It's important to me that dbt test remains a stateless and straightforward operation. If the user wishes to extend that operation and preserve state, they have levers to do so.

To achieve in-database statefulness, you'll be able to create snapshots that select from the stored test result tables. (Can those snapshots ref the test results? I think so—and it would establish a DAG dependency of model --> test --> snapshot.)

You could also put a post-hook on those snapshots, or a run-operation after dbt test --store-results, to offload into external file storage. As far as out-of-database statefulness, though, I think the best answer is saving dbt artifacts (especially run_results.json) to external file storage and performing separate longitudinal analysis. I think dbt Cloud could have compelling answers here, too.

Extensible

The --store-results mechanism needs to rely on the schema test definition as much as possible, since that's the piece that users can most readily copy, customize, and override. I don't believe this feature should be the place to encode special logic about desired behavior for unique that differs from not_null, accepted_values, and the universe of other custom tests.

In order to achieve what Rogier proposed:

For better debugging and following DQ over time, logging the violation count per unique key that fails the NOT NULL check in test_details is better. The latter REQUIRES an unique key to be defined or UNIQUE test to be present on a model.

I think it would reasonable to write a not_null adaptation, not_null_by_key, that looks somethng like the code below. In thinking through this, I realize that schema tests should have the option of defining a result aggregate calc other than count(*).

{% test not_null_by_key(model, column_name, unique_key) %}
    
    {% set name = 'not_null_by_key__' ~ model.name ~ '__' ~ column_name ~ '_by_' ~ unique_key  %}

    {% set description %} Assert that {{ column_name }} is never null in {{ model }} {% endset %}

    {% set result_calc = 'sum(null_values)' %}

    {% set fail_msg %} Found {{ result }} null values of {{ model }}.{{ column_name }}. See the breakdown by {{ unique_key }}: {% endset %}

    {{ config(name=name, description=description, result_calc=result_calc, fail_msg=fail_msg) }}

    select {{ unique_key }}, count(*) as null_values
    from {{ model }}
    where {{ column_name }} is null
    group by 1

{% endmacro %}

@bashyroger
Copy link

Hi @jtcohen6, I stubled on this article that shares some good ideas on how testing could be improved in DBT:
https://www.infinitelambda.com/post/dbt-testing-tools-gap

What I specifically liked about the article is the suggesting of defining tests as a Gherkin document:
https://cucumber.io/docs/gherkin/reference/

@yu-iskw
Copy link
Contributor

yu-iskw commented Nov 27, 2020

I want to store results of dbt source snapshot-freshness too in addition to results of tests.

@jtcohen6 jtcohen6 added the dbt tests Issues related to built-in dbt testing functionality label Dec 31, 2020
@ltchikindas
Copy link

Let me start by saying that I think this post, as well #2959, encapsulate the majority of what I’m about to write, but the context is split across many comments and linked posts, so I felt it may be beneficial to sum up here before adding on top.

I posit that:

  • All tests should be treated as models in the DAG
  • All (test) failures should be stored in a (test_)failures schema with all of the added context that the developer believes may be necessary to resolve the error (special case on run failures, perhaps a table that lists failed models and their syntax error as values, as previously proposed)
  • All tests should be configurable to fail fast, similar to model errors, but also, model errors should be configurable to warn only, allowing the run to proceed with the last successfully completed version of the model’s data as input to downstream models
  • Yml-instrumented tests are treated as shorthand or simple tests, but downstream persisted datasets are still created (TBD about which columns it should include -- all of the primary table, all of the referenced tables as well, etc)

Based on the following first principles:

  • Topological sorts should be informed implicitly by reading references in code, not explicitly by defining the order of tasks causing redundancy in defined dependencies and opening us up to divergence between the two definitions.

For example, if model B is defined as select * from ref(A)
then model A runs before B. You don’t also need to state A->B in a separate file.

This may seem like a core principle of dbt, however, certain features are very much outside the scope of this axiom. Specifically, seed, test, archive/snapshot, and run are not part of the same topological sort, making it challenging to snapshot a model, test a model before deciding whether to run its children, re-run a set of models based on a test that previously failed, or run models as part of a two-stage commit / transaction flow.

Specifically, if your invocation of DBT looks like this

  1. Clone git
  2. DBT Deps
  3. DBT Snapshot
  4. DBT Seed
  5. DBT Run
  6. DBT test

You are explicitly defining snapshots to happen before runs, but after seeds. What if you want a snapshot of a seed? or a snapshot of a model that’s based on a seed?

You also then get the following execution order:

  1. Run model A
  2. Run model B
  3. Test model A
  4. Test model B

When perhaps instead what you want is

  1. Run model A
  2. Test model A (decide not to proceed after this step)
  3. Run model B
  4. Test model B
  • Test failures and SQL syntax failures are two sides of the same coin and should be treated similarly in alerting, run modification, etc.. Specifically, the run failure of a model that references a field that has been removed by an integration should fail / be treated similarly to a model that references a value of an upstream field that has been removed.

Example: customer.status has 3 values [‘paying’,’canceled’,’paused’]. SWE decides to refactor ‘paused’ to ‘pause’. Data test may be instrumented to fail when an unexpected value appears or an expected value is missing. Similarly, if a dbt_utils.pivot creates a denormalized model with the counts of status by day, and is further directly referenced downstream, the run then breaks because model.paused_plans no longer exists (replaced by model.pause_plans). I believe that these two failures are closely related and dbt’s error handling support in these two cases should be equally built out.

The ideal future state of tests is such that error handling on run failures and test failures is equally mature: The result of a test can control the behavior of the remaining dbt invocation (fail fast causing downstream models not to run, roll back all or part of a transaction such that the data that passes assertions is committed, while failing data does not, complete rollback, etc.)

  • The purpose of tests is two-fold:
    • Identify a data (entry) issue that requires resolution (eg Salesforce active customers doesn’t match Stripe/Zuora active customers, numeric field has negative numbers, field name got changed)
    • Identify a model logic issue that requires resolution (eg new join causes fan out in downstream model)

It doesn’t particularly matter whether it was a “schema test” or a “data test” that issues the alert.

Though the instrumentation differs (return 0 vs return 0 rows), the advent of defining custom schema tests all but unifies these two features, leaving the distinction only in instrumentation complexity (custom schema tests require building macros for the purposes of re-using the test while data tests are faster to build but may cause duplication in testing logic over time).

Example: I have a mapping model with the following fields
company_id
customer_id (fk to payment processor)
source (where is the reference stored- in payment processor or in our app)

I expect there to be more than one entry per company_id, since the company.customer_id could carry the reference OR customer.company_id could carry the reference, however, each company should have at most 1 customer_id. To test for this, I can write the following custom data test:

select 
    *
  , count(distinct company_stripe_customer_mapping_m2m.stripe_customer_id) over(
    partition by
      company_stripe_customer_mapping_m2m.company_id
      ) as count_of_distinct_stripe_customers_per_company_id
from
  {{ref('company_stripe_customer_mapping_m2m')}}
qualify
  count_of_distinct_stripe_customers_per_company_id > 1 

OR I could include this in the model itself

      , count(distinct unioned_sources.stripe_customer_id) over(
          partition by 
            unioned_sources.company_id
        ) as count_of_distinct_stripe_customers_per_company_id 

And then write a schema test like this

  - name: company_stripe_customer_mapping_m2m
    columns:
      - name: count_of_distinct_stripe_customers_per_company_id
        tests:
          - accepted_values:
              values: [1]

(sidenote: this tells me how many distinct unexpected values there are, not how many rows are failing the test, which isn’t ideal, but a tangential discussion)

OR I can even write a custom schema test that takes 2 fields as params and builds out the count(distinct) window function for me.

@joellabes points out in #2959, when a schema test (any test, really) fails, the first thing to do is figure out why it failed (data issue vs logic issue) and that requires auditing/profiling queries with a bunch of context, which generally means:

  1. copy pasting sql
  2. adding joins
  3. running it in a workbench

If someone outside the Analytics Engineering team is the owner of the system of record (sales ops, marketing, billing etc.), then we need to convey to them that this test is failing AND for which records it’s failing, likely via a BI dashboard, so that they can resolve it without having to touch code / dbt. We should not replicate the test logic in the BI dashboard to do so.

@jtcohen6
Copy link
Contributor Author

@ltchikindas Thank you for the thought, time, and care that went into your comment! As we begin work on changes to tests for the next version of dbt (v0.20.0), including many of the ideas surfaced in this issue, it is validating to know just how much we're thinking about the same things.

Specifically, seed, test, archive/snapshot, and run are not part of the same topological sort, making it challenging to snapshot a model, test a model before deciding whether to run its children, re-run a set of models based on a test that previously failed, or run models as part of a two-stage commit / transaction flow.

I agree. In my view, the right implementation here doesn't involve a change to tests (treating them more like models) as much as it requires the advent of a new, generalized task that can execute a DAG of all resource types—to accomplish exactly that thing you're describing. My latest thoughts here are in #2743. Could you give that a read, and comment if anything there seems off the mark?

Though the instrumentation differs (return 0 vs return 0 rows), the advent of defining custom schema tests all but unifies these two features, leaving the distinction only in instrumentation complexity (custom schema tests require building macros for the purposes of re-using the test while data tests are faster to build but may cause duplication in testing logic over time).

I agree here as well. Instrumentation complexity feels like the biggest significant difference between "custom schema" tests and "data" tests, in a world when we'd love project and package maintainers to be writing both. To that end, we're thinking of potentially renaming these.

Why did we use these names in the first place? The Schema Test reflect a property of the model on which it's defined (primary key, foreign key, enum, etc); of course, for that properties to be of any value in analytics database, which lack (or worse, ignore) column constraints, it needs to be asserted and validated with queries. We didn't think about the Data Test as a property of their referenced models, but rather as a complex, emergent truth that may require information from several models—again, though, a conditional truth that must be validated at query time.

While pieces of this distinction may still be conceptually useful—or even, in a world where the dbt-docs generates an ERD based on primary (unique) + foreign keys (relationships), cosmetically/functionally useful (https://github.com/fishtown-analytics/dbt-docs/issues/84)—I think that instrumentation complexity is the much more meaningful distinction. To that end, the Schema Test is really the functional/reusable/polymorphic/generic/... test, and The Data Test is really the one-off/specific/concrete/... test, and as it's possible to test the same thing with either, it makes a lot of sense to start concrete and make something reusable when it needs to be reused.

@boxysean
Copy link
Contributor

I was talking with @kosti-hokkanen-supermetrics last week: he mentioned that this feature didn't work 100% as he anticipated and turned it off. Kosti wants to consume a single table of failures and use that as a notification trigger. Curious if anyone has thoughts on this idea?

@brunomurino
Copy link
Contributor

I haven’t yet refactored my approach since dbt released this new feature, however it may seem that my approach is still required, which is to use post-run-hooks to insert the single-line test-tables into a “historical” test results table. The single line test-tables all have same structure of course, with a column with a json to carry the details of the test and the rest is some metadata like “consecutive failures threshold for alerting” and etc…. It’s been working great, though I do use a python script to query that historical table and take the latest results and etc to get the notification alert and etc

@jtcohen6
Copy link
Contributor Author

@boxysean @brunomurino Right on! I just responded to a very similar comment from @trevor-petach: #903 (comment)

I totally get the desire for a single table of failures; I think the answer here is actually metadata artifacts. The manifest.json + run_results.json produced by dbt test—and soon also dbt build, currently in beta!—contain all of the rich information I could imagine putting into that single table, and much more to boot. I also think the dbt Cloud Metadata API could do powerful things around notifying when specific tests fail—perhaps someday, even a granular notification to that test's particular owner (#3436). I know @barryaron has ideas in this vein :)

In the meantime, I also want to give users the ability to take matters into their own hands, by writing their own macros for consuming results at the end of dbt test, and (if they desire) writing it back to the database—very similar to the approach Bruno has outlined. I think that looks like turning on on-run-end hooks for dbt test (#3463), which exposes the powerful Results context object. Note that dbt build, which includes tests, will support both on-run-end hooks and the store_failures config for test. (We'll be add support for the --store-failures flag before final release: #3596).

@pdivve
Copy link

pdivve commented Sep 6, 2021

base__account_management__transaction t

    join base__account_management__lender_account la
        on t.account_id = la.id

    where t.status in ('CONFIRMED', 'CONFIRMED_AWAITING_APPROVAL', 'CORRECTED')
        and la.account_status = 'CLOSED'

    group by account_id

    having current_balance >= 1

)

select

    'AllClosedAccountsHaveAZeroBalance' as test_name, -- could be {{ model.name }}
    
    case
        when count(*) != 0 then 'FAILED'
        else 'PASSED'
    end as test_result,

    'Number of accounts = ' || count(*) || '\n\n' ||
    listagg('Account: ' || account_id || '\nBalance: ' || current_balance, '\n\n') as test_details

from issues

Hi I have implemented this kind of test in dbt. I have also created macro with the code you gave in macros. but i don't understand how to run that macro. when i try to run dbt run operation it 's not working. what do i need to do?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dbt tests Issues related to built-in dbt testing functionality discussion enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants