You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
It would be valuable to have a functional_dependency generic test in dbt-utils that ensures one or more "determinant" columns uniquely determine another "dependent" column. The test should fail if any determinant combination maps to more than one distinct dependent value. An optional where clause would allow users to scope the test to a subset of records.
This is also especially helpful for hierarchical relationships. For instance:
If region → country, ensuring each region only has one corresponding country.
If (country, state) → city, ensuring each (country, state) pair only maps to one city.
Ensuring each (parent_id, child_id) pair in a hierarchy leads to exactly one “child name” or other dependent attributes.
Describe alternatives you've considered
Manual SQL tests: Users currently write custom SQL queries or macros to check for duplicate mappings, but this adds boilerplate and inconsistency across projects.
Existing uniqueness tests: While dbt-utils has unique_combination_of_columns, it doesn’t directly verify that those columns determine the value of a separate column.
Additional context
This feature is database-agnostic since it primarily involves standard SQL grouping and counting.
The test should align with existing dbt-utils generic test conventions, including adapter dispatch and lower-case SQL syntax.
Who will this benefit?
Data modelers needing to validate referential integrity, e.g. ensuring (order_id, product_id) always maps to a unique quantity.
Teams maintaining dimensional models who want to confirm a dimension’s natural key always leads to a single record.
Organizations checking hierarchical or parent-child relationships, where each parent or parent-child pair must map to only one child or child property.
Are you interested in contributing this feature?
Yes, we already have a working macro and example seeds demonstrating passing and failing scenarios. We’d be happy to raise a PR and contribute the code, along with documentation.
The text was updated successfully, but these errors were encountered:
Describe the feature
It would be valuable to have a
functional_dependency
generic test in dbt-utils that ensures one or more "determinant" columns uniquely determine another "dependent" column. The test should fail if any determinant combination maps to more than one distinct dependent value. An optionalwhere
clause would allow users to scope the test to a subset of records.This is also especially helpful for hierarchical relationships. For instance:
region → country
, ensuring eachregion
only has one correspondingcountry
.(country, state) → city
, ensuring each(country, state)
pair only maps to onecity
.(parent_id, child_id)
pair in a hierarchy leads to exactly one “child name” or other dependent attributes.Describe alternatives you've considered
unique_combination_of_columns
, it doesn’t directly verify that those columns determine the value of a separate column.Additional context
Who will this benefit?
(order_id, product_id)
always maps to a uniquequantity
.Are you interested in contributing this feature?
Yes, we already have a working macro and example seeds demonstrating passing and failing scenarios. We’d be happy to raise a PR and contribute the code, along with documentation.
The text was updated successfully, but these errors were encountered: