Add cardinality equality schema test #35

dwallace0723 · 2018-01-04T01:14:01Z

This custom schema test macro checks whether the cardinality of one field in a model exactly matches the cardinality of a chosen field in a different model.

drewbanin

Back to you @dwallace0723!

drewbanin · 2018-01-04T01:19:59Z

macros/schema_tests/cardinality_equality.sql

@@ -0,0 +1,28 @@
+{% macro test_cardinality_equality(model, from, to, field) %}


@dwallace0723 the from, to, and field args can be named arbitrarily -- we used these for the referential integrity test, but you can pick other names if you'd like!

I'll probably just leave them as is. The current naming makes logical sense to me. LMK if you think there are more suitable arg names for this specific test.

drewbanin · 2018-01-04T01:25:15Z

macros/schema_tests/cardinality_equality.sql

+
+with table_a as (
+select
+  count(1) as num_rows,


Can you put the dimensions before the aggregates here (e.g. group by 1)? Same for the table_b CTE.

Any reason to use count(1) instead of count(*) here?

re: count(1) instead of count(*), this is strictly habitual to keep myself out of the practice of selecting *. I'm relatively positive it makes no performance difference whatsoever. Will gladly change to count(*) if it better fits the dbt style guidelines.
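A minimal sketch of the two points above, assuming SQLite (through Python's sqlite3) as a stand-in for the warehouse. The events table and its values are hypothetical. The dimension comes first so that group by 1 works, and count(1) and count(*) are selected side by side to show they agree, including for the NULL group.

```python
import sqlite3

# Hypothetical table with a repeated value and a NULL, to exercise both counts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table events (field text);
    insert into events values ('x'), ('x'), ('y'), (null);
""")

# Dimension first, aggregates after, so the grouping can be written as "group by 1".
rows = conn.execute("""
    select
      field,
      count(1) as count_one,
      count(*) as count_star
    from events
    group by 1
    order by 1
""").fetchall()

# Both forms count rows, not non-null values of field, so they match per group.
for field, count_one, count_star in rows:
    print(field, count_one, count_star)
```

The choice between the two forms is therefore a style question here, not a correctness one.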

drewbanin · 2018-01-04T01:38:13Z

macros/schema_tests/cardinality_equality.sql

+from (
+select *
+from table_a
+except


This is groovy! I think the except set operator is asymmetrical -- table_a - table_b is different than table_b - table_a. As a result, this test would "pass" if table_b contained records which were not present in table_a, i.e. if table_b is a superset of table_a.

There may be a simpler way to do this, but I've done something like this before:

-- table_a and table_b are equal if this union results in 0 rows
(
  select * from table_a
  except
  select * from table_b
)
union all
(
  select * from table_b
  except
  select * from table_a
)

This is cool (for some value of cool) b/c it's essentially the definition of set equality:
A = B if A is a subset of B, and B is a subset of A

Good catch! It does feel a little bit strange that there isn't a simpler way to do this than a union all
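A minimal sketch of the asymmetry discussed above, assuming SQLite (through Python's sqlite3) as a stand-in for the warehouse. The rows are hypothetical, with table_b set up as a strict superset of table_a. SQLite does not accept parenthesised selects inside a compound query, so each except is wrapped in a subquery here; that is an adaptation for SQLite, not the form used in the PR.

```python
import sqlite3

# Hypothetical data: table_b contains everything in table_a plus one extra row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table_a (field text, num_rows integer);
    create table table_b (field text, num_rows integer);
    insert into table_a values ('x', 2), ('y', 1);
    insert into table_b values ('x', 2), ('y', 1), ('z', 5);
""")

# One-directional check: a - b is empty, so a test built on it alone would pass.
a_minus_b = conn.execute(
    "select * from table_a except select * from table_b"
).fetchall()

# Symmetric check: a row missing from either side shows up in the result.
symmetric_diff = conn.execute("""
    select * from (select * from table_a except select * from table_b)
    union all
    select * from (select * from table_b except select * from table_a)
""").fetchall()

print(a_minus_b)       # []
print(symmetric_diff)  # [('z', 5)]
```

The extra row in table_b is invisible to the one-directional except and only appears once both directions are unioned.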

dwallace0723 · 2018-01-04T16:19:53Z

back to you @drewbanin

drewbanin

Couple more comments. Pending those, this is looking good to me

drewbanin · 2018-01-04T17:04:28Z

macros/schema_tests/cardinality_equality.sql

+
+table_b as (
+select
+  { field }},


There's a typo here

drewbanin · 2018-01-04T17:05:41Z

macros/schema_tests/cardinality_equality.sql

+  union all
+  select *
+  from except_b
+)


If you alias this subquery, then this schema test will also work on Postgres. At present, it errors out with:

subquery in FROM must have an alias
LINE 36: from (
              ^
HINT: For example, FROM (SELECT ...) [AS] foo.

So just change it to

) as sbq

and you should be good! Or, you could make another CTE for the subquery

Ah, good catch. I just created an additional CTE for the subquery.
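A rough sketch of the CTE variant, assuming SQLite (through Python's sqlite3) as a stand-in for the warehouse. SQLite does not enforce the alias rule, so this shows only the shape of the query, not the Postgres error. The except_b name appears in the diff; except_a, unioned, the final count(*), and the sample rows are assumptions made for illustration.

```python
import sqlite3

# Hypothetical data: the value 'y' has a different row count on each side.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table table_a (field text, num_rows integer);
    create table table_b (field text, num_rows integer);
    insert into table_a values ('x', 2), ('y', 1);
    insert into table_b values ('x', 2), ('y', 3);
""")

query = """
with except_a as (
  select * from table_a
  except
  select * from table_b
),

except_b as (
  select * from table_b
  except
  select * from table_a
),

-- hypothetical CTE name; it takes the place of the unaliased "from ( ... )" subquery
unioned as (
  select * from except_a
  union all
  select * from except_b
)

select count(*) from unioned
"""

# A result of 0 would mean the two sides match; here each side contributes one row.
mismatches = conn.execute(query).fetchone()[0]
print(mismatches)  # 2
```

Because every intermediate result is a named CTE, no subquery is left in a FROM clause that would need an alias.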

dwallace0723 · 2018-01-04T17:22:53Z

Cool, changes made @drewbanin 👍

drewbanin · 2018-01-04T17:22:54Z

Groovy! Thanks for contributing @dwallace0723 -- merging this now :)

dwallace0723 added 3 commits January 3, 2018 16:43

add cardinality model

e552f18

add logic with rename

925b937

update readme

f197e1a

drewbanin requested changes Jan 4, 2018

dwallace0723 added 2 commits January 4, 2018 08:02

switch position of groupby dimensions

755ef91

address asymmetrical nature of except operator

cacd6be

drewbanin requested changes Jan 4, 2018

typo and add CTE

3b822d1

drewbanin approved these changes Jan 4, 2018

drewbanin merged commit af8102f into dbt-labs:master Jan 4, 2018

dwallace0723 deleted the add-cardinality-test branch January 4, 2018 17:24
