Use QUALIFY clause in deduplicate macro for Redshift #811
resolves #713
This is a:
All pull requests from community contributors should target the main branch (default).
Description & motivation
In Redshift, the deduplicate macro discards rows that have a NULL value in any column. This comes from the natural join: it matches rows by equality on every shared column, and NULL never compares equal to NULL, so those rows find no match. Since Redshift has added support for the QUALIFY clause (https://aws.amazon.com/about-aws/whats-new/2023/07/amazon-redshift-qualify-clause-select-sql-statement/), we can drop the natural join from the macro and fix the problem cleanly.
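The NULL behaviour described above can be reproduced locally. The following is a minimal sketch, not the macro's actual SQL: it uses Python's bundled SQLite (which also supports natural joins and window functions, assuming SQLite 3.25 or newer) and a hypothetical `events` table with made-up columns. SQLite has no QUALIFY clause, so the window filter is written as a subquery; on Redshift the same filter would be expressed with QUALIFY, as noted in the comment.

```python
import sqlite3

# Hypothetical sample data: user 2 has a NULL in the "event" column.
conn = sqlite3.connect(":memory:")
conn.execute("create table events (user_id integer, event text, version integer)")
conn.executemany(
    "insert into events values (?, ?, ?)",
    [(1, "login", 1), (1, "login", 2), (2, None, 1), (2, None, 2)],
)

# Natural-join pattern: number the rows, then join back on ALL shared columns.
# NULL = NULL is not true, so rows containing a NULL never match and are lost.
natural_join_sql = """
with row_numbered as (
    select _inner.*,
           row_number() over (partition by user_id order by version desc) as rn
    from events as _inner
)
select distinct data.*
from events as data
natural join row_numbered
where row_numbered.rn = 1
order by data.user_id
"""

# Window-filter pattern: keep the first row per partition without any join.
# Assumed Redshift equivalent using QUALIFY:
#   select * from events as tt
#   qualify row_number() over (partition by user_id order by version desc) = 1
window_filter_sql = """
select user_id, event, version
from (
    select *,
           row_number() over (partition by user_id order by version desc) as rn
    from events
)
where rn = 1
order by user_id
"""

old_rows = conn.execute(natural_join_sql).fetchall()
new_rows = conn.execute(window_filter_sql).fetchall()

print("natural join: ", old_rows)
print("window filter:", new_rows)
```

With this sample data, the natural-join query returns a row only for user 1, while the window filter returns the latest row for both users, including the one whose `event` is NULL.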
Compare inputs and outputs:
Old version
Expected results
Actual results
New version
Actual result
Checklist
star() source)
limit_zero() macro in place of the literal string: limit 0
dbt.type_* macros instead of explicit datatypes (e.g. dbt.type_timestamp() instead of TIMESTAMP)