Add an archival strategy that operates by checking for column value diffs by PK #706
Comments
Possibly separate issues, but I was thinking along two lines:
1. A column exclude list. Our ETL brings in dozens of denormalized columns per order that would bloat the archive to no purpose.
2. Column data transforms. One of our ETL jobs brings in datetimes as varchars, and it would be nice to massage them into timestamps first. I already do this with the
@mplovepop our current thinking involves 1) a new "archive" block and 2) archiving a query, instead of a table. This might look like:
Still some more thinking required here about the exact interface, but do you buy the general approach? I think it might work for the use cases you mentioned here.
Where would that go? I like the general approach and maybe require certain archive column names,
…trategy Add the check archive strategy (#706)
Edit: See the `'check'` strategy defined in #1175.

dbt's implementation of archive could be scoped to specific columns. If these columns haven't changed between invocations of `dbt archive`, then new rows would not need to be inserted.

Most of this logic can be implemented in the materialization, though we'll also need to add a field to the `archive:` block in the `dbt_project.yml` file.
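
To make the proposal concrete, here is a minimal Python sketch of the column-diff idea, not dbt's actual materialization (which would be SQL). The function name `check_archive`, the parameter names `pk` and `check_cols`, and the `valid_from` / `valid_to` bookkeeping columns are all assumptions made for illustration. A new archive row is inserted only when a primary key is new or when one of the checked columns differs from the currently valid archived row.

```python
from datetime import datetime, timezone


def check_archive(archive, source, pk, check_cols, now):
    """Return an updated archive (hypothetical helper, not a dbt API).

    archive    -- list of dicts with illustrative 'valid_from'/'valid_to' keys
    source     -- list of dicts holding the current source rows
    pk         -- name of the primary key column
    check_cols -- columns whose values are compared between runs
    now        -- timestamp to stamp on closed and newly inserted rows
    """
    # Copy the rows so the caller's archive is left untouched.
    result = [dict(row) for row in archive]
    # Index the currently valid archived row for each primary key.
    current = {row[pk]: row for row in result if row["valid_to"] is None}

    for src in source:
        old = current.get(src[pk])
        if old is not None and all(old[c] == src[c] for c in check_cols):
            continue  # no checked column changed: nothing to insert
        if old is not None:
            old["valid_to"] = now  # close out the superseded version
        result.append({**src, "valid_from": now, "valid_to": None})
    return result


if __name__ == "__main__":
    t0 = datetime(2018, 1, 1, tzinfo=timezone.utc)
    t1 = datetime(2018, 1, 2, tzinfo=timezone.utc)
    archive = [{"id": 1, "status": "open", "note": "a",
                "valid_from": t0, "valid_to": None}]
    # 'note' changed but is not a checked column, so no new row is added.
    source = [{"id": 1, "status": "open", "note": "b"}]
    print(len(check_archive(archive, source, "id", ["status"], t1)))
```

Restricting the comparison to `check_cols` is what keeps unrelated column churn from adding rows to the archive.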