## What part(s) of the page would you like to see updated?
`unique_key` is only discussed in the context of snapshots, but it is also a "key" component for incremental models!

Depending on the context, it can be a column name, a SQL expression (snapshots only), or a list of column names (incremental models only). And for at least the `delete+insert` incremental strategy, it doesn't have to match only a single row in the data set.
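For reference, here's a sketch of a snapshot config using the column-name form, with the expression form noted in a comment (the snapshot, source, and column names are hypothetical):

```sql
{% snapshot orders_snapshot %}

    {{ config(
        target_schema='snapshots',
        strategy='timestamp',
        updated_at='updated_at',
        unique_key='order_id'
    ) }}
    -- For snapshots, unique_key may also be a SQL expression, e.g.:
    --   unique_key="order_id || '-' || order_line"

    select * from {{ source('shop', 'orders') }}

{% endsnapshot %}
```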
### What happens when…

**`unique_key` is a column name?**

- ✅ Snapshots
- ✅ Incremental models

**`unique_key` is an expression?**

- ✅ Snapshots
- ❌ Incremental models (error)

**`unique_key` is a list of column names?**

- ❌ Snapshots (error)
- ✅ Incremental models

**`unique_key` is not actually unique?**

- ❌ Snapshots (bad data)
- Incremental models:
  - ✅ `append`
  - ✅ `delete+insert`
  - ✅ `insert_overwrite`
  - ❌ `merge` (error)
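To make the incremental-model side concrete, here's a sketch of the list form combined with a strategy that tolerates non-unique keys (the model and column names are hypothetical):

```sql
{{ config(
    materialized='incremental',
    incremental_strategy='delete+insert',
    unique_key=['order_id', 'order_line']
) }}

select * from {{ ref('stg_orders') }}

{% if is_incremental() %}
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```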
### Error messages for `merge` when `unique_key` is not unique

#### dbt-postgres

```
22:24:21 Database Error in model merge (models/merge.sql)
MERGE command cannot affect row a second time
HINT: Ensure that not more than one source row matches any one target row.
```

#### dbt-redshift

```
22:24:38 Database Error in model merge (models/merge.sql)
Found multiple matches to update the same tuple.
```

#### dbt-snowflake

```
22:24:50 Database Error in model merge (models/merge.sql)
100090 (42P18): Duplicate row detected during DML action
Row Values: [redacted]
```

#### dbt-bigquery

```
00:21:20 Database Error in model merge (models/merge.sql)
UPDATE/MERGE must match at most one source row for each target row
```

#### dbt-databricks

```
22:25:16 Runtime Error in model merge (models/merge.sql)
Cannot perform Merge as multiple source rows matched and attempted to modify the same
target row in the Delta table in possibly conflicting ways. By SQL semantics of Merge,
when multiple source rows match on the same target row, the result may be ambiguous
as it is unclear which source row should be used to update or delete the matching
target row. You can preprocess the source table to eliminate the possibility of
multiple matches. Please refer to
https://docs.databricks.com/delta/merge.html#merge-error
```
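One way to "preprocess the source table" as the Databricks message suggests is to keep a single row per key with `row_number()` before the merge runs (the table and column names here are hypothetical):

```sql
with ranked as (
    select
        *,
        row_number() over (
            partition by order_id       -- the unique_key column(s)
            order by updated_at desc    -- keep the most recent row per key
        ) as row_num
    from {{ ref('stg_orders') }}
)

select * from ranked
where row_num = 1
```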
### Effects on snapshots when `unique_key` is not unique

#### dbt-postgres

dbt-postgres doesn't use a `merge` statement for snapshots; it uses deletes and inserts instead. So it will succeed, but the snapshot will contain bad data when `unique_key` is not unique.

#### dbt-redshift

dbt-redshift doesn't use a `merge` statement for snapshots; it uses deletes and inserts instead. So it will succeed, but the snapshot will contain bad data when `unique_key` is not unique.
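Because the snapshot still succeeds on these adapters, the bad data can go unnoticed. A query along these lines can surface duplicate keys among the currently-valid snapshot rows (the snapshot and column names are hypothetical; `dbt_valid_to` is the snapshot meta column dbt sets to null for current records):

```sql
select
    order_id,
    count(*) as row_count
from {{ ref('orders_snapshot') }}
where dbt_valid_to is null   -- only the "current" version of each record
group by order_id
having count(*) > 1
```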
#### dbt-snowflake

dbt-snowflake uses a `merge` statement for snapshots, so it fails when `unique_key` is not unique:

```
00:31:07 Database Error in snapshot snapshot (snapshots/snapshot.sql)
100090 (42P18): Duplicate row detected during DML action
Row Values: [redacted]
```

#### dbt-bigquery

dbt-bigquery uses a `merge` statement for snapshots, so it fails when `unique_key` is not unique:

```
00:31:30 Database Error in snapshot snapshot (snapshots/snapshot.sql)
UPDATE/MERGE must match at most one source row for each target row
```

#### dbt-databricks

dbt-databricks uses a `merge` statement for snapshots, so it fails when `unique_key` is not unique:

```
01:17:58 Runtime Error in snapshot snapshot (snapshots/snapshot.sql)
Cannot perform Merge as multiple source rows matched and attempted to modify the same
target row in the Delta table in possibly conflicting ways. By SQL semantics of Merge,
when multiple source rows match on the same target row, the result may be ambiguous
as it is unclear which source row should be used to update or delete the matching
target row. You can preprocess the source table to eliminate the possibility of
multiple matches. Please refer to
https://docs.databricks.com/delta/merge.html#merge-error
```
## Additional information

Here's a sample of pages that discuss the purpose and behavior of `unique_key` within either snapshots or incremental models:
## Contributions

## Link to the page on docs.getdbt.com requiring updates

https://docs.getdbt.com/reference/resource-configs/unique_key
## Instigating context
Originally posted by @cdabel in dbt-labs/dbt-core#4618 (comment)
Discovered while researching dbt-labs/dbt-adapters#150.