[merge after dec 5th] adds new hard_deletes config #6558

Merged
merged 39 commits into from
Dec 5, 2024
Changes from 2 commits
Commits
071717d
add hard deletes
mirnawong1 Nov 27, 2024
4e329ce
Merge branch 'current' into add-hard-deletes-config
mirnawong1 Nov 27, 2024
0f86a11
remove old
mirnawong1 Nov 27, 2024
ee76962
Merge branch 'current' into add-hard-deletes-config
mirnawong1 Nov 28, 2024
a81ec90
Update website/docs/docs/build/snapshots.md
mirnawong1 Nov 28, 2024
660996c
Update website/docs/docs/build/snapshots.md
mirnawong1 Nov 28, 2024
eecf31b
Update website/docs/docs/build/snapshots.md
mirnawong1 Nov 28, 2024
6dfcf94
Update website/docs/docs/build/snapshots.md
mirnawong1 Nov 28, 2024
0fc49ba
Update website/docs/docs/build/snapshots.md
mirnawong1 Nov 28, 2024
513befd
Update website/docs/reference/resource-configs/snapshot_meta_column_n…
mirnawong1 Nov 28, 2024
3a1f0fc
Update website/snippets/_hard-deletes.md
mirnawong1 Nov 28, 2024
7fdf069
Update website/docs/docs/build/snapshots.md
mirnawong1 Nov 28, 2024
cff9dca
Update website/docs/reference/resource-configs/hard-deletes.md
mirnawong1 Nov 28, 2024
4d5edef
Update website/docs/reference/resource-configs/hard-deletes.md
mirnawong1 Nov 28, 2024
a7f236b
Update website/docs/docs/build/snapshots.md
mirnawong1 Nov 28, 2024
9ec7644
Update website/docs/docs/dbt-versions/release-notes.md
mirnawong1 Nov 28, 2024
454e33d
Update website/docs/reference/resource-configs/hard-deletes.md
mirnawong1 Nov 28, 2024
a05f318
Update website/docs/reference/resource-configs/hard-deletes.md
mirnawong1 Nov 28, 2024
6785baa
Update website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1…
mirnawong1 Nov 28, 2024
bb08235
Update website/docs/docs/build/snapshots.md
mirnawong1 Nov 28, 2024
8ce6271
update
mirnawong1 Nov 28, 2024
ece882c
Merge branch 'add-hard-deletes-config' of github.com:dbt-labs/docs.ge…
mirnawong1 Nov 28, 2024
a70a5f1
grace's feedback
mirnawong1 Nov 28, 2024
95b14cb
Merge branch 'current' into add-hard-deletes-config
mirnawong1 Nov 29, 2024
fa9af39
Merge branch 'current' into add-hard-deletes-config
mirnawong1 Nov 29, 2024
39ece31
Merge branch 'current' into add-hard-deletes-config
mirnawong1 Nov 29, 2024
1237a14
Merge branch 'current' into add-hard-deletes-config
mirnawong1 Nov 29, 2024
4d77327
Merge branch 'current' into add-hard-deletes-config
mirnawong1 Dec 3, 2024
ef2b266
update table
mirnawong1 Dec 3, 2024
c3a57be
add updates
mirnawong1 Dec 3, 2024
536a346
Update website/docs/docs/build/snapshots.md
mirnawong1 Dec 3, 2024
d20b1b0
Update website/docs/docs/build/snapshots.md
mirnawong1 Dec 3, 2024
74d0465
Update website/docs/docs/build/snapshots.md
mirnawong1 Dec 3, 2024
8e22c5f
Update website/docs/reference/resource-configs/hard-deletes.md
mirnawong1 Dec 3, 2024
747049b
dougs'feedback
mirnawong1 Dec 3, 2024
8d5bf83
add check exmaple
mirnawong1 Dec 3, 2024
56ab21b
Merge branch 'current' into add-hard-deletes-config
mirnawong1 Dec 4, 2024
dcaa0cf
add link to migration
mirnawong1 Dec 4, 2024
af4dc4c
Merge branch 'current' into add-hard-deletes-config
mirnawong1 Dec 5, 2024
99 changes: 72 additions & 27 deletions website/docs/docs/build/snapshots.md
@@ -10,8 +10,7 @@
* [Snapshot properties](/reference/snapshot-properties)
* [`snapshot` command](/reference/commands/snapshot)


### What are snapshots?
## What are snapshots?
Analysts often need to "look back in time" at previous data states in their mutable tables. While some source data systems are built in a way that makes accessing historical data possible, this is not always the case. dbt provides a mechanism, **snapshots**, which records changes to a mutable <Term id="table" /> over time.

Snapshots implement [type-2 Slowly Changing Dimensions](https://en.wikipedia.org/wiki/Slowly_changing_dimension#Type_2:_add_new_row) over mutable source tables. These Slowly Changing Dimensions (or SCDs) identify how a row in a table changes over time. Imagine you have an `orders` table where the `status` field can be overwritten as the order is processed.
@@ -66,6 +65,7 @@
[invalidate_hard_deletes](/reference/resource-configs/invalidate_hard_deletes): true | false
[snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names): dictionary
[dbt_valid_to_current](/reference/resource-configs/dbt_valid_to_current): string
[hard_deletes](/reference/resource-configs/hard-deletes): string
```

</File>
@@ -84,6 +84,7 @@
| [invalidate_hard_deletes](/reference/resource-configs/invalidate_hard_deletes) | Find hard deleted records in source and set `dbt_valid_to` to current time if the record no longer exists | No | True |
| [dbt_valid_to_current](/reference/resource-configs/dbt_valid_to_current) | Set a custom indicator for the value of `dbt_valid_to` in current snapshot records (like a future date). By default, this value is `NULL`. When configured, dbt will use the specified value instead of `NULL` for `dbt_valid_to` for current records in the snapshot table.| No | string |
| [snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names) | Customize the names of the snapshot meta fields | No | dictionary |
| [hard_deletes](/reference/resource-configs/hard-deletes) | Specify how to handle rows deleted from the source. Supported options are `ignore` (default), `invalidate`, and `new_record`. | No | string |

- In versions prior to v1.9, the `target_schema` (required) and `target_database` (optional) configurations defined a single schema or database to build a snapshot across users and environments. This created problems when testing or developing a snapshot, as there was no clear separation between development and production environments. In v1.9, `target_schema` became optional, allowing snapshots to be environment-aware. By default, without `target_schema` or `target_database` defined, snapshots now use the `generate_schema_name` or `generate_database_name` macros to determine where to build. Developers can still set a custom location with [`schema`](/reference/resource-configs/schema) and [`database`](/reference/resource-configs/database) configs, consistent with other resource types.
@@ -215,10 +216,14 @@
- The `dbt_valid_to` column will be updated for any existing records that have changed.
- The updated record and any new records will be inserted into the snapshot table. These records will now have `dbt_valid_to = null` or the value configured in `dbt_valid_to_current` (available in Versionless and 1.9 and higher).

<VersionBlock firstVersion="1.9">

#### Note
- These column names can be customized to your team or organizational conventions using the [snapshot_meta_column_names](#snapshot-meta-fields) config.
- Use the `dbt_valid_to_current` config to set a custom indicator for the value of `dbt_valid_to` in current snapshot records (like a future date such as `9999-12-31`). By default, this value is `NULL`. When set, dbt will use this specified value instead of `NULL` for `dbt_valid_to` for current records in the snapshot table.

- Use the [`hard_deletes`](/reference/resource-configs/hard-deletes) config to track hard deletes by adding a new record when a row becomes "deleted" in the source. Supported options are `ignore`, `invalidate`, and `new_record`.
</VersionBlock>

Snapshots can be referenced in downstream models the same way as referencing models — by using the [ref](/reference/dbt-jinja-functions/ref) function.

## Detecting row changes
@@ -294,7 +299,7 @@

:::

**Example Usage**
**Example usage**

<VersionBlock lastVersion="1.8">

@@ -344,15 +349,64 @@

### Hard deletes (opt-in)

<VersionBlock firstVersion="1.9">

In dbt v1.9 and higher, the [`hard_deletes`](/reference/resource-configs/hard-deletes) config replaces the `invalidate_hard_deletes` config to give you more control over how to handle rows deleted from the source. The `hard_deletes` config is not a separate strategy but an additional opt-in feature that can be used with any snapshot strategy.


The `hard_deletes` config has three options:

| Option | Description |
| --------- | ----------- |
| `ignore` (default) | No action for deleted records. |
| `invalidate` | Behaves the same as the existing `invalidate_hard_deletes=true`, where deleted records are invalidated by setting `dbt_valid_to` to the current snapshot time. |
| `new_record` | Tracks deleted records by adding a new row with the `dbt_is_deleted` [meta field](#snapshot-meta-fields) set to `True`. |

import HardDeletes from '/snippets/_hard-deletes.md';

<HardDeletes />

#### Example usage

<File name='snapshots/orders_snapshot.yml'>

```yaml
snapshots:
  - name: orders_snapshot_hard_delete
    relation: source('jaffle_shop', 'orders')
    config:
      schema: snapshots
      unique_key: id
      strategy: timestamp
      updated_at: updated_at
      hard_deletes: new_record  # options are: 'ignore', 'invalidate', or 'new_record'
```

</File>

In this example, the `hard_deletes: new_record` config adds a new row for each deleted record, with the `dbt_is_deleted` column set to `True`.
Any restored records are added as new rows with the `dbt_is_deleted` field set to `False`.

The resulting table will look like this:

| id | status | updated_at | dbt_valid_from | dbt_valid_to | dbt_is_deleted |
| -- | ------ | ---------- | -------------- | ------------ | -------------- |
| 1 | pending | 2024-01-01 10:47 | 2024-01-01 10:47 | 2024-01-01 11:05 | False |
| 1 | shipped | 2024-01-01 11:05 | 2024-01-01 11:05 | 2024-01-01 11:20 | False |
| 1 | deleted | 2024-01-01 11:20 | 2024-01-01 11:20 | 2024-01-01 12:00 | True |
| 1 | restored | 2024-01-01 12:00 | 2024-01-01 12:00 | | False |
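
As a rough illustration of the three options, the following Python sketch models how a hard delete could be recorded in a snapshot table. It is a simplified model for building intuition, not dbt's implementation: `SnapshotRow` and `apply_hard_delete` are hypothetical names, timestamps are plain strings, and `None` stands in for an empty `dbt_valid_to`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SnapshotRow:
    """Hypothetical in-memory stand-in for one snapshot table row."""
    id: int
    dbt_valid_from: str
    dbt_valid_to: Optional[str] = None  # None marks the current row
    dbt_is_deleted: bool = False


def apply_hard_delete(rows: List[SnapshotRow], deleted_id: int,
                      snapshot_time: str, mode: str = "ignore") -> List[SnapshotRow]:
    """Model what each hard_deletes option does when `deleted_id` leaves the source."""
    if mode not in ("ignore", "invalidate", "new_record"):
        raise ValueError(f"unsupported hard_deletes value: {mode}")
    if mode == "ignore":
        return rows  # the snapshot is left untouched
    current = [r for r in rows if r.id == deleted_id and r.dbt_valid_to is None]
    for row in current:
        row.dbt_valid_to = snapshot_time  # close the row that was current
    if mode == "new_record" and current:
        # Assumption: the deletion is recorded as one extra row flagged as deleted.
        rows.append(SnapshotRow(deleted_id, snapshot_time, None, True))
    return rows


rows = [
    SnapshotRow(1, "2024-01-01 10:47", "2024-01-01 11:05"),
    SnapshotRow(1, "2024-01-01 11:05"),
]
apply_hard_delete(rows, deleted_id=1, snapshot_time="2024-01-01 11:20", mode="new_record")
for row in rows:
    print(row)
```

With `mode="invalidate"`, the same call would only close the current row, and with `mode="ignore"` it would change nothing.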

</VersionBlock>

<VersionBlock lastVersion="1.8">

Rows that are deleted from the source query are not invalidated by default. With the config option `invalidate_hard_deletes`, dbt can track rows that no longer exist. This is done by left joining the snapshot table with the source table, and filtering the rows that are still valid at that point, but no longer can be found in the source table. `dbt_valid_to` will be set to the current snapshot time.

This configuration is not a different strategy as described above, but is an additional opt-in feature. It is not enabled by default since it alters the previous behavior.

For this configuration to work with the `timestamp` strategy, the configured `updated_at` column must be of timestamp type. Otherwise, queries will fail due to mixing data types.

**Example Usage**
Note, in v1.9 and higher, the [`hard_deletes`](/reference/resource-configs/hard-deletes) config replaces the `invalidate_hard_deletes` config for better control over how to handle deleted rows from the source.


<VersionBlock lastVersion="1.8">
#### Example usage

<File name='snapshots/orders_snapshot_hard_delete.sql'>

@@ -378,40 +432,22 @@

</VersionBlock>

<VersionBlock firstVersion="1.9">

<File name='snapshots/orders_snapshot.yml'>

```yaml
snapshots:
  - name: orders_snapshot_hard_delete
    relation: source('jaffle_shop', 'orders')
    config:
      schema: snapshots
      unique_key: id
      strategy: timestamp
      updated_at: updated_at
      invalidate_hard_deletes: true
```

</File>

</VersionBlock>

## Snapshot meta-fields

Snapshot <Term id="table">tables</Term> will be created as a clone of your source dataset, plus some additional meta-fields*.

Starting in 1.9 or with [dbt Cloud Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless):
- These column names can be customized to your team or organizational conventions using the [`snapshot_meta_column_names`](/reference/resource-configs/snapshot_meta_column_names) config.
- Use the [`dbt_valid_to_current` config](/reference/resource-configs/dbt_valid_to_current) to set a custom indicator for the value of `dbt_valid_to` in current snapshot records (like a future date such as `9999-12-31`). By default, this value is `NULL`. When set, dbt will use this specified value instead of `NULL` for `dbt_valid_to` for current records in the snapshot table.
- Set the [`hard_deletes`](/reference/resource-configs/hard-deletes) config to `new_record` to track deleted records as new rows, flagged with the `dbt_is_deleted` meta field.

| Field | Meaning | Usage |
| -------------- | ------- | ----- |
| dbt_valid_from | The timestamp when this snapshot row was first inserted | This column can be used to order the different "versions" of a record. |
| dbt_valid_to | The timestamp when this row became invalidated. <br /> For current records, this is `NULL` by default <VersionBlock firstVersion="1.9"> or the value specified in `dbt_valid_to_current`.</VersionBlock> | The most recent snapshot record will have `dbt_valid_to` set to `NULL` <VersionBlock firstVersion="1.9"> or the specified value. </VersionBlock> |
| dbt_scd_id | A unique key generated for each snapshotted record. | This is used internally by dbt |
| dbt_updated_at | The updated_at timestamp of the source record when this snapshot row was inserted. | This is used internally by dbt |
| dbt_is_deleted | A boolean value indicating if the record has been deleted. `True` if deleted, `False` otherwise. | Added when `hard_deletes='new_record'` is configured. This is used internally by dbt |

*The timestamps used for each column are subtly different depending on the strategy you use:

@@ -445,6 +481,15 @@
| 1 | pending | 2024-01-01 10:47 | 2024-01-01 10:47 | 2024-01-01 11:05 | 2024-01-01 10:47 |
| 1 | shipped | 2024-01-01 11:05 | 2024-01-01 11:05 | | 2024-01-01 11:05 |

Snapshot results with `hard_deletes='new_record'`:

| id | status | updated_at | dbt_valid_from | dbt_valid_to | dbt_updated_at | dbt_is_deleted |
|----|---------|------------------|------------------|------------------|------------------|----------------|
| 1 | pending | 2024-01-01 10:47 | 2024-01-01 10:47 | 2024-01-01 11:05 | 2024-01-01 10:47 | False |
| 1 | shipped | 2024-01-01 11:05 | 2024-01-01 11:05 | 2024-01-01 11:20 | 2024-01-01 11:05 | False |
| 1 | deleted | 2024-01-01 11:20 | 2024-01-01 11:20 | | 2024-01-01 11:20 | True |


</details>

<br/>
@@ -65,6 +65,7 @@ Beginning in dbt Core 1.9, we've streamlined snapshot configuration and added a
- Standard `schema` and `database` configs supported: Snapshots will now be consistent with other dbt resource types. You can specify where environment-aware snapshots should be stored.
- Warning for incorrect `updated_at` data type: To ensure data integrity, you'll see a warning if the `updated_at` field specified in the snapshot configuration is not the proper data type or timestamp.
- Set a custom current indicator for the value of `dbt_valid_to`: Use the [`dbt_valid_to_current` config](/reference/resource-configs/dbt_valid_to_current) to set a custom indicator for the value of `dbt_valid_to` in current snapshot records (like a future date). By default, this value is `NULL`. When configured, dbt will use the specified value instead of `NULL` for `dbt_valid_to` for current records in the snapshot table.
- Use the [`hard_deletes`](/reference/resource-configs/hard-deletes) configuration to control how rows deleted from the source are handled, for example by adding a new record when a row becomes "deleted" in the source. This config replaces `invalidate_hard_deletes` to give you more control over how to handle deleted rows. Supported options are `ignore`, `invalidate`, and `new_record`.

Read more about [Snapshots meta fields](/docs/build/snapshots#snapshot-meta-fields).

4 changes: 4 additions & 0 deletions website/docs/docs/dbt-versions/release-notes.md
@@ -18,6 +18,10 @@ Release notes are grouped by month for both multi-tenant and virtual private clo

\* The official release date for this new format of release notes is May 15th, 2024. Historical release notes for prior dates may not reflect all available features released earlier this year or their tenancy availability.

## December 2024

- **New**: The [`hard_deletes`](/reference/resource-configs/hard-deletes) config replaces the `invalidate_hard_deletes` config to give you more control over how to handle rows deleted from the source. Supported options are `ignore`, `invalidate`, and `new_record`.

## November 2024
- **Fix**: Job environment variable overrides in credentials are now respected for Exports. Previously, they were ignored.
- **Behavior change**: If you use a custom microbatch macro, set a [`require_batched_execution_for_custom_microbatch_strategy` behavior flag](/reference/global-configs/behavior-changes#custom-microbatch-strategy) in your `dbt_project.yml` to enable batched execution. If you don't have a custom microbatch macro, you don't need to set this flag as dbt will handle microbatching automatically for any model using the [microbatch strategy](/docs/build/incremental-microbatch#how-microbatch-compares-to-other-incremental-strategies).
113 changes: 113 additions & 0 deletions website/docs/reference/resource-configs/hard-deletes.md
@@ -0,0 +1,113 @@
---
title: hard_deletes
resource_types: [snapshots]
description: "Use the `hard_deletes` config to control how deleted rows are tracked in your snapshot table."
datatype: "string"
default_value: {ignore}
id: "hard-deletes"
sidebar_label: "hard_deletes"
---

Available from dbt v1.9 or with [Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) dbt Cloud.

<File name='snapshots/schema.yml'>

```yaml
snapshots:
  - name: <snapshot_name>
    config:
      hard_deletes: 'ignore' | 'invalidate' | 'new_record'
```
</File>

<File name='dbt_project.yml'>

```yml
snapshots:
  [<resource-path>](/reference/resource-configs/resource-path):
    +hard_deletes: "ignore" | "invalidate" | "new_record"
```

</File>

<File name='snapshots/<filename>.sql'>

```sql
{{
    config(
        unique_key='id',
        strategy='timestamp',
        updated_at='updated_at',
        hard_deletes='ignore' | 'invalidate' | 'new_record'
    )
}}
```

</File>


## Description

Use the `hard_deletes` configuration to control how rows deleted from the source are tracked, for example by adding a new record when a row becomes "deleted" in the source.
It replaces the `invalidate_hard_deletes` config to give you more control over how to handle deleted rows.

import HardDeletes from '/snippets/_hard-deletes.md';

<HardDeletes />

:::warning

If you're updating an existing snapshot to use the `hard_deletes` config, dbt _will not_ handle migrations automatically. We recommend either only using these settings for net-new snapshots, or arranging an update of pre-existing tables before enabling this setting.
:::

## Default

If you don't specify `hard_deletes`, it defaults to `ignore`: deleted rows are not tracked and their `dbt_valid_to` column remains `NULL`.

The `hard_deletes` config has three options:

| Option | Description |
| --------- | ----------- |
| `ignore` (default) | No action for deleted records. |
| `invalidate` | Behaves the same as the existing `invalidate_hard_deletes=true`, where deleted records are invalidated by setting `dbt_valid_to` to the current snapshot time. |
| `new_record` | Tracks deleted records by adding a new row with the `dbt_is_deleted` meta field set to `True`. |

## Impact on snapshot records

- **Backward compatibility**: The `invalidate_hard_deletes` config is still supported for existing snapshots but can't be used alongside `hard_deletes`.
- **New snapshots**: For new snapshots, we recommend using `hard_deletes` instead of `invalidate_hard_deletes`.
- **Migration**: If you switch an existing snapshot to use `hard_deletes` without migrating your data, you may encounter inconsistent or incorrect results, such as a mix of old and new data formats.
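
The compatibility rules above can be expressed as a small pre-flight check. The Python sketch below is only an illustration of those rules and is not part of dbt: `effective_hard_deletes` is a hypothetical helper, and the config is assumed to be available as a plain dictionary.

```python
VALID_HARD_DELETES = {"ignore", "invalidate", "new_record"}


def effective_hard_deletes(config: dict) -> str:
    """Return the hard-delete behavior implied by a snapshot config dict.

    Hypothetical helper: it mirrors the documented rules, not dbt's own validation.
    """
    if "hard_deletes" in config and "invalidate_hard_deletes" in config:
        # The two configs can't be used alongside each other.
        raise ValueError("use either hard_deletes or invalidate_hard_deletes, not both")
    if "hard_deletes" in config:
        value = config["hard_deletes"]
        if value not in VALID_HARD_DELETES:
            raise ValueError(f"unsupported hard_deletes value: {value!r}")
        return value
    # The legacy flag maps onto 'invalidate'; with neither config set, the default is 'ignore'.
    return "invalidate" if config.get("invalidate_hard_deletes") else "ignore"


print(effective_hard_deletes({"strategy": "timestamp"}))
print(effective_hard_deletes({"invalidate_hard_deletes": True}))
print(effective_hard_deletes({"hard_deletes": "new_record"}))
```

A check like this could run in CI before a snapshot config is changed, so that a mixed or misspelled setting is caught before it reaches an existing snapshot table.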

## Example

<File name='snapshots/schema.yml'>

```yaml
snapshots:
  - name: my_snapshot
    config:
      hard_deletes: new_record  # options are: 'ignore', 'invalidate', or 'new_record'
      strategy: timestamp
      updated_at: updated_at
    columns:
      - name: dbt_valid_from
        description: Timestamp when the record became valid.
      - name: dbt_valid_to
        description: Timestamp when the record stopped being valid.
      - name: dbt_is_deleted
        description: Indicates whether the record was deleted.
```

</File>

With `hard_deletes: new_record` configured, if a record is deleted and later restored, the resulting snapshot table might look like this:

| id | dbt_scd_id | status | dbt_updated_at | dbt_valid_from | dbt_valid_to | dbt_is_deleted |
| -- | -------------------- | ----- | -------------------- | --------------------| -------------------- | ----------- |
| 1 | 60a1f1dbdf899a4dd... | pending | 2024-10-02 ... | 2024-05-19... | 2024-05-20 ... | False |
| 1 | b1885d098f8bcff51... | cancelled| 2024-10-02 ... | 2024-05-20 ... | 2024-06-03 ... | True |
| 1 | b1885d098f8bcff53... | shipped | 2024-10-02 ... | 2024-06-03 ... | | False |
| 2 | b1885d098f8bcff55... | active | 2024-10-02 ... | 2024-05-19 ... | | False |

In this example, the `dbt_is_deleted` column is set to `True` when the record is deleted. When the record is restored, the `dbt_is_deleted` column is set to `False`.
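
To make the use of `dbt_is_deleted` concrete, here is a short Python sketch that filters rows shaped like the example table. It is an illustration under stated assumptions rather than dbt behavior: the rows are hand-copied into dictionaries, `None` stands in for an empty `dbt_valid_to`, the flag is modeled as a Python boolean (the stored type in your warehouse may differ), and `live_records` and `deletion_events` are hypothetical helper names.

```python
# Rows mirror the example table; only the columns needed for filtering are kept.
snapshot_rows = [
    {"id": 1, "status": "pending", "dbt_valid_to": "2024-05-20", "dbt_is_deleted": False},
    {"id": 1, "status": "cancelled", "dbt_valid_to": "2024-06-03", "dbt_is_deleted": True},
    {"id": 1, "status": "shipped", "dbt_valid_to": None, "dbt_is_deleted": False},
    {"id": 2, "status": "active", "dbt_valid_to": None, "dbt_is_deleted": False},
]


def live_records(rows):
    """Current rows (empty dbt_valid_to) that are not deletion markers."""
    return [r for r in rows if r["dbt_valid_to"] is None and not r["dbt_is_deleted"]]


def deletion_events(rows):
    """Rows that were inserted to record a hard delete."""
    return [r for r in rows if r["dbt_is_deleted"]]


print([(r["id"], r["status"]) for r in live_records(snapshot_rows)])
# prints [(1, 'shipped'), (2, 'active')]
print(len(deletion_events(snapshot_rows)))
# prints 1
```

If `dbt_valid_to_current` is configured, the check for an empty `dbt_valid_to` would need to compare against that configured value instead.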
Contributor Author

would love a gut check if this is right bc I've been reading this over and over again so might have mixed things up.

@@ -4,6 +4,10 @@
datatype: column_name
---

:::tip Use the hard_deletes config instead

Note, in Versionless and dbt Core 1.9 and higher, the [`hard_deletes`](/reference/resource-configs/hard-deletes) config replaces the `invalidate_hard_deletes` config for better control over how to handle deleted rows from the source.
:::

<VersionBlock firstVersion="1.9">
