[Feature] Should dbt_metrics support null-safe joins? In other words, is null = null? #164
Comments
Hi @callum-mcdata, I think it would make sense to support null-safe joins. A sensible case is a derived metric that divides one metric by another. If even one dimension has a null value (which is not unusual), the metric cannot be calculated for those rows. I really think that having to worry about nulls creates friction for the user.
Hey @josepfranquetf. Do you mind laying out an example with data for the use case that you mentioned above? I'm not sure I fully understand the implications for derived metrics. I've got some opinions in this area after some testing of my own, but I want to make sure I hear from users before settling on a path.
Hey @callum-mcdata. Imagine that you are running a …
Hey @callum-mcdata, I struggled with this issue before I managed to understand what was happening. From my point of view, since metrics are designed to have multiple dimensions, it is problematic that you lose rows if one of the dimensions is null. It doesn't seem reasonable to me to have to replace all nulls in my columns just to be able to use metrics with dimensions.
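For illustration, here is a minimal sketch of the row loss described above. It uses Python's built-in `sqlite3` module and two hypothetical tables, `revenue` and `orders`, each grouped by a `region` dimension that contains one null. The table names, values, and the plain inner join are assumptions made for the example; this is not the SQL that dbt_metrics generates.

```python
import sqlite3

# In-memory database with two hypothetical per-dimension metric tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE revenue (region TEXT, revenue REAL);
    CREATE TABLE orders  (region TEXT, orders  INTEGER);
    INSERT INTO revenue VALUES ('emea', 100.0), (NULL, 60.0);
    INSERT INTO orders  VALUES ('emea', 10),    (NULL, 5);
""")

# Derived metric: revenue / orders, joined on the dimension with plain "=".
# NULL = NULL evaluates to NULL (not true), so the null-region row never matches.
rows = conn.execute("""
    SELECT r.region, r.revenue / o.orders AS revenue_per_order
    FROM revenue AS r
    JOIN orders AS o ON r.region = o.region
    ORDER BY r.region
""").fetchall()

print(rows)  # [('emea', 10.0)]
```

Both inputs contain a row for the null region, yet the derived metric only comes back for `emea`.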
No commitment that we'll be implementing this, but I wanted to dig into the method that we would use to resolve this issue.
Using A IS [ NOT ] DISTINCT FROM B
All righty, let's do some digging here for all our supported adapters:
I suspect that this is the best syntax to use that is consistent across all the supported adapters. We might need to figure out an alternative solution for Redshift, but that is fine.
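As a rough sketch of what a null-safe join condition changes, the snippet below compares a plain `=` join with an expanded null-safe condition, `(a = b OR (a IS NULL AND b IS NULL))`, which has the same truth table as `a IS NOT DISTINCT FROM b`. The expanded form is used here only because it runs on any SQLite version bundled with Python; treating it as a possible fallback for an adapter without `IS NOT DISTINCT FROM` is an assumption, not something decided in this thread. The `revenue` and `orders` tables and the `joined` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE revenue (region TEXT, revenue REAL);
    CREATE TABLE orders  (region TEXT, orders  INTEGER);
    INSERT INTO revenue VALUES ('emea', 100.0), (NULL, 60.0);
    INSERT INTO orders  VALUES ('emea', 10),    (NULL, 5);
""")

# Expanded equivalent of "r.region IS NOT DISTINCT FROM o.region".
NULL_SAFE_EQ = "(r.region = o.region OR (r.region IS NULL AND o.region IS NULL))"


def joined(condition):
    """Join the two hypothetical metric tables using the given ON condition."""
    return conn.execute(f"""
        SELECT r.region, r.revenue / o.orders AS revenue_per_order
        FROM revenue AS r
        JOIN orders AS o ON {condition}
    """).fetchall()


plain = joined("r.region = o.region")  # drops the null-region row
null_safe = joined(NULL_SAFE_EQ)       # keeps it

print(len(plain), len(null_safe))  # 1 2
```

With the null-safe condition, the null-region rows match each other and the derived value (60.0 / 5) is returned alongside the `emea` row.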
Is this your first time submitting a feature request?
Describe the feature
As shown in some Slack threads and some GitHub issues, some customers are running into issues where they expect their NULL values to be correctly represented in the output. This is not currently supported in dbt_metrics because we don't use null-safe operators for joins. Should we support this as a config option or macro parameter?
Describe alternatives you've considered
The easy alternative to this is to just state that null values need to be overwritten for metric calculations.
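A minimal sketch of that alternative, assuming hypothetical `revenue` and `orders` tables and an arbitrary sentinel value `'unknown'`: nulls in the dimension are overwritten with `COALESCE` before the join, so a plain `=` comparison matches them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE revenue (region TEXT, revenue REAL);
    CREATE TABLE orders  (region TEXT, orders  INTEGER);
    INSERT INTO revenue VALUES ('emea', 100.0), (NULL, 60.0);
    INSERT INTO orders  VALUES ('emea', 10),    (NULL, 5);
""")

# Overwrite null dimension values with a sentinel on both sides of the join.
# 'unknown' is an illustrative choice; it must not collide with a real value.
rows = conn.execute("""
    SELECT r.region, r.revenue / o.orders AS revenue_per_order
    FROM (SELECT COALESCE(region, 'unknown') AS region, revenue FROM revenue) AS r
    JOIN (SELECT COALESCE(region, 'unknown') AS region, orders  FROM orders)  AS o
      ON r.region = o.region
    ORDER BY r.region
""").fetchall()

print(rows)  # [('emea', 10.0), ('unknown', 12.0)]
```

The trade-off is that the output no longer shows a null dimension value, and every model feeding a metric has to apply the same replacement.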
Who will this benefit?
Anyone who has null values in their dimensions
Are you interested in contributing this feature?
Maybe
Anything else?
No response