Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-1110] [Feature] Cross-database macro for type_boolean() #5739

Closed
3 tasks done
jaypeedevlin opened this issue Aug 31, 2022 · 10 comments
Closed
3 tasks done

[CT-1110] [Feature] Cross-database macro for type_boolean() #5739

jaypeedevlin opened this issue Aug 31, 2022 · 10 comments
Labels
enhancement New feature or request help_wanted Trickier changes, with a clear starting point, good for previous/experienced contributors

Comments

@jaypeedevlin
Copy link
Contributor

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

There are some great type helpers that have migrated from utils to core, but type_boolean() is a notable exception. Speaking to @joellabes he indicated that y'all would likely be supportive of adding this one in. I've got a basic implementation in brooklyn-data/dbt_artifacts#188 that I'd be happy to contribute.

Describe alternatives you've considered

Not doing it :)

Who will this benefit?

Mostly folks maintaining packages.

Are you interested in contributing this feature?

Certainly!

Anything else?

Not officially included in the scope of this request, but something I'd love input on is a type_json and type_array, both of which I implemented in the above PR. If y'all were open to this I might expand the scope since I have versions ready to go.

@jaypeedevlin jaypeedevlin added enhancement New feature or request triage labels Aug 31, 2022
@github-actions github-actions bot changed the title [Feature] Cross-database macro for type_boolean() [CT-1110] [Feature] Cross-database macro for type_boolean() Aug 31, 2022
@joellabes
Copy link
Contributor

Additional context: another community member opened a PR for type_boolean in utils (dbt-labs/dbt-utils#508) which I closed because I didn't want to add it just to remove it a couple of weeks later. I had intended for it to come along in the initial migration, but it turns out it missed the boat.

@dbeatty10 dbeatty10 self-assigned this Aug 31, 2022
@dbeatty10
Copy link
Contributor

@jaypeedevlin thanks for opening!

Reaction

Any (or all) of the following are reasonable to include:

  • type_boolean
  • type_array
  • type_json

@Fleid please feel free to weigh in or contradict me case you have both interest and insight on this!

Side note: it would be awesome to consult the most recent SQL standard to see which types we already have covered in some fashion and which we don't. Maybe this list is an accurate reflection of SQL standard data types? Not that dbt-core has to also define and support these, but would serve as a helpful guide.

And maybe we'd create a pretty picture like this too:
image

Implementation hints

We'd expect any adapter to override these methods as-needed (including raising a not supported exception when appropriate). Not all databases will support all of these types, but they might be able to be "faked" in some fashion (see SQL Server and booleans).

Each high-level default implementation will look like this:

{%- macro type_string() -%}
{{ return(adapter.dispatch('type_string', 'dbt')()) }}
{%- endmacro -%}
{% macro default__type_string() %}
{{ return(api.Column.translate_type("string")) }}
{% endmacro %}

Then the low-level default implementation will use translate_type and be like this:

"STRING": "TEXT",

For database adapters that need to override the default, it will look like this:
https://github.com/dbt-labs/dbt-bigquery/blob/main/dbt/adapters/bigquery/column.py#L14

        "STRING": "STRING",

Testing hints

  1. Implement testing in dbt-core:
  2. Also implement the testing in every adapter (dbt-biquery, dbt-snowflake, dbt-redshift, and dbt-spark):

@dbeatty10 dbeatty10 added help_wanted Trickier changes, with a clear starting point, good for previous/experienced contributors and removed triage labels Aug 31, 2022
@dbeatty10 dbeatty10 removed their assignment Aug 31, 2022
@jpmmcneill
Copy link
Contributor

jpmmcneill commented Sep 10, 2022

@jaypeedevlin related to discussion over in #5778, I'm happy to take the type_array that @dbeatty10 suggested. I have a few cross db macros that I've spun up for https://github.com/jpmmcneill/dbt-graph-theory that I might lift and shift over.

This is one that's grinding on my gears for a lack of support.

@jaypeedevlin
Copy link
Contributor Author

You're welcome to any/all of it, I have some (temporary) code here you're welcome to borrow, although as Doug points out there will need to be changes in each adapter.

@dbeatty10
Copy link
Contributor

@jpmmcneill very timely of you: #5520 is related to cross db macros for arrays.

Background:
dbt-labs/dbt-utils#595 added some macros related to arrays to dbt-utils:

  • array_append
  • array_concat
  • array_construct

I'm in progress of moving those into dbt-core (plus each adapter, like dbt-redshift, etc)

Would be glad for any feedback you're able to provide on those implementations!

@dbeatty10
Copy link
Contributor

@jpmmcneill or @jaypeedevlin do either of you still want to submit a PR for type_boolean()?

To have a chance to be included in dbt Core 1.3, we'd probably want to have it ready within the next 7 days.

General outline for implementation is here. The testing part is always the most time-consuming for me.

@jpmmcneill
Copy link
Contributor

jpmmcneill commented Sep 19, 2022

@dbeatty10 👋 I've spun up two PRs for type_boolean. I think the bigquery one is required because of how the inheritence is set up, but am not 100%.

dbt-core: #5875
dbt-bigquery: dbt-labs/dbt-bigquery#313

I need to figure out how to get my dbt-core local dev setup for running the data type tests (they're complaining about a module error when I try to pytest them).

Depending on how fast local dev will get setup, type array / json might be possible before 1.3. type_boolean is definitely the most simple version of the three!

@dbeatty10
Copy link
Contributor

Awesome @jpmmcneill !

Continuing the conversation with next steps over at #5875
and dbt-labs/dbt-bigquery#313.

@jaypeedevlin
Copy link
Contributor Author

Thanks @jpmmcneill for picking this up and apologies both for my absence, I’ve been dealing with a family emergency. Glad to see this looks like it will sneak in for 1.3

@dbeatty10
Copy link
Contributor

Resolved by #5875, et al.

Thank you to @jpmmcneill, @jaypeedevlin, @Maayan-s and everyone else who participated in making this feature happen 🙌

ueshin added a commit to databricks/dbt-databricks that referenced this issue Sep 23, 2022
### Description

Adds a test for `type_boolean` macro introduced at dbt-labs/dbt-core#5739, following dbt-labs/dbt-spark#471.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help_wanted Trickier changes, with a clear starting point, good for previous/experienced contributors
Projects
None yet
Development

No branches or pull requests

4 participants