When on_schema_change is set, pass common columns as dest_columns in incremental merge macros #4144
Comments
I was going to report the same issue; I think it's a typical use case.
where […] then would be updated to:
In my test, both […]. Indeed, in my case, debug mode leads to the query:
Obviously the […]. The resulting when condition would become
I'm not 100% sure it covers every use case, but since we're not caring about […]. In the meantime, I see 2 workarounds:
Regarding that second option, even if you went with it through using […]
where we can see it's the comma that shouldn't be here. @jtcohen6, that's the issue I was reporting to you. Obviously it's quite a blocker for my further production use of incremental materializations. Let me know if you find something better. Thanks!
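For intuition on how a dangling comma like that can end up in generated SQL, here is a generic template-rendering illustration in Python. This is not dbt's actual macro code; it only shows how a loop that appends a separator after every item, with no last-item guard, produces a stray trailing comma:

```python
def render_insert_columns(columns):
    """Buggy template-style rendering: appends ", " after every column,
    including the last one, leaving a dangling comma before ")"."""
    body = "".join(f"`{c}`, " for c in columns)
    return f"insert ({body})"

def render_insert_columns_fixed(columns):
    """Joining with ", " between items avoids the stray trailing comma."""
    return "insert (" + ", ".join(f"`{c}`" for c in columns) + ")"

print(render_insert_columns(["date", "field2"]))        # insert (`date`, `field2`, )
print(render_insert_columns_fixed(["date", "field2"]))  # insert (`date`, `field2`)
```

The first output is the kind of syntactically invalid fragment a warehouse rejects; the second is well-formed.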
@ccharlesgb Thanks for opening, and @github-christophe-oudar, thanks for finding :) I do think the move here is to use the […]. As you saw, there's a syntax error with […]
So, we can try that again as:

```sql
{{
  config(
    alias = 'incr_test',
    materialized = 'incremental',
    incremental_strategy = 'insert_overwrite',
    partition_by = {'field': 'event_time', 'data_type': 'date'},
    on_schema_change = 'sync_all_columns'
  )
}}

SELECT
  *
from (SELECT DATE('2021-01-01') as `event_time`,
             'a' as `col_a`,
             'b' as `col_b`)
```

Change that to:

```sql
-- ... same config ...
SELECT
  *
from (SELECT DATE('2021-01-01') as `event_time`,
             'a' as `col_a`)
```

And, with the fix from #4147, e.g. by copy-pasting the fixed version of […]
@jtcohen6 Thanks for the quick fix!
@github-christophe-oudar Ah, okay, I hear you! To get this working with […]:

```sql
when not matched then insert
  (`date`, `field2`)
values
  (`date`, `field2`)
```

I confirmed that BigQuery and Snowflake are both smart enough to null out any unspecified columns, both for […]. To that end, I think the fix here would be to adjust the […]. I think the right place for this code change is in the BigQuery + Snowflake incremental materializations, in cases where we know a temp table exists, because we just used it to process […]
Some of these changes will need to happen in this repo, some in other repos (dbt-bigquery, dbt-snowflake). We can keep this issue open for now as a central spot for discussion. Is that something you might be interested in giving a go? :)
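The common-columns fix under discussion (per the issue title: pass only the columns shared by the destination table and the temp table as `dest_columns`) can be sketched in a few lines of Python. This is a simplified stand-in for the macro change, not dbt's actual implementation:

```python
def common_dest_columns(dest_cols, source_cols):
    """Columns present in both destination and source, kept in destination
    order, so the merge insert/values lists reference only shared columns."""
    source_set = {c.lower() for c in source_cols}
    return [c for c in dest_cols if c.lower() in source_set]

def insert_clause(columns):
    """Render the insert portion of a merge (BigQuery-style backquoting)."""
    col_list = ", ".join(f"`{c}`" for c in columns)
    return f"when not matched then insert\n  ({col_list})\nvalues\n  ({col_list})"

# The destination still has a column the temp table no longer produces.
cols = common_dest_columns(["date", "field2", "dropped_col"], ["date", "field2"])
print(insert_clause(cols))
```

Because `dropped_col` never appears in the rendered insert list, the warehouse nulls it out for new rows instead of raising an unknown-column error.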
@jtcohen6 Thanks for the explanation here; when we looked into this further, we came to the same conclusion about […]. Would the change to use the intersection of columns also be applied to […]?
@jtcohen6 Alright, that solution looks great! @ccharlesgb I think that would be the behavior for […]
The only limitation for […]
@jtcohen6 all tests added and PRs are green ✅
Happy to have that issue solved!
Is there an existing issue for this?
Current Behavior
We have noticed that removing a column from an incremental model that uses `insert_overwrite` on BigQuery causes dbt to fail.

Expected Behavior
We had initially assumed this was intended behaviour on dbt's part, because schema changes with incremental models were not supported until 0.21. However, after reading the documentation, we thought maybe this was not intentional and that the missing column should just be ignored.
Steps To Reproduce
[…] `col_b`:

Relevant log output
What database are you using dbt with?
bigquery
Additional Context
No response