Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Macro does_table_exist causes ambiguity issue in get_relation macro #60

Closed
ggmblr opened this issue Sep 1, 2022 · 3 comments
Closed

Comments

@ggmblr
Copy link

ggmblr commented Sep 1, 2022

We recently upgraded the fivetran_log version from 0.5.2 to 0.6.2, this introduced a new macro does_table_exist which looks like this:

{%- macro does_table_exist(table_name) -%}
    {%- if execute -%} -- returns true when dbt is in execute mode
    {%- set ns = namespace(has_table=false) -%} -- declare boolean namespace and default value
        {%- for node in graph.sources.values() -%} -- grab sources from the dictionary of nodes
        -- call the database for the matching table
            {%- set source_relation = adapter.get_relation(
                    database=node.database,
                    schema=node.schema,
                    identifier=node.name) -%}
            {%- if source_relation == None and node.name | lower == table_name | lower -%}
                {{ return(False) }} -- return false if relation identified by the database.schema.identifier does not exist for the given table name
            {%- elif source_relation != None and node.name | lower == table_name | lower -%}
                {{ return(True) }} -- otherwise return True
            {% endif %}
        {%- endfor -%}
    {%- endif -%}
{%- endmacro -%}

We are currently running our dbt project on Snowflake and our typical src.yml looks like this:

version: 2

sources:
  - name: source_name
    database: '{{ env_var("SNOWFLAKE_LOAD_DATABASE") }}'
    schema: SCHEMA_NAME
    loader: fivetran
    loaded_at_field: _FIVETRAN_SYNCED

    quoting:
      database: true
      schema: true
      identifier: false

    tables:
      - name: table_name

This raises an error on compilation about ambiguity. This is most of the time a quoting issue. Here is a sample error message we got:

dbt.exceptions.RuntimeException: Runtime Error
  Compilation Error in model stg_fivetran_log__credits_used (models/staging/stg_fivetran_log__credits_used.sql)
    When searching for a relation, dbt found an approximate match. Instead of guessing 
    which relation to use, dbt will move on. Please delete "RAW"."FACEBOOK_ADS_FIVETRAN"."ACCOUNT_HISTORY", or rename it to be less ambiguous.
    Searched for: RAW.FACEBOOK_ADS_FIVETRAN.account_history
    Found: "RAW"."FACEBOOK_ADS_FIVETRAN"."ACCOUNT_HISTORY"
    
    > in macro does_table_exist (macros/does_table_exist.sql)
    > called by model stg_fivetran_log__credits_used (models/staging/stg_fivetran_log__credits_used.sql)

We found that the issue resides in the get_relation macro, to be more precise the fact that the name gets passed as the argument for the identifier. We solved this issue locally in two different ways, either adding an upper to the name, or directly changing the name to identifier.

{%- macro does_table_exist(table_name) -%}
    {%- if execute -%} -- returns true when dbt is in execute mode
    {%- set ns = namespace(has_table=false) -%} -- declare boolean namespace and default value
        {%- for node in graph.sources.values() -%} -- grab sources from the dictionary of nodes
        -- call the database for the matching table
            {%- set source_relation = adapter.get_relation(
                    database=node.database,
                    schema=node.schema,
                    identifier=node.identifier) -%}
            {%- if source_relation == None and node.name | lower == table_name | lower -%}
                {{ return(False) }} -- return false if relation identified by the database.schema.identifier does not exist for the given table name
            {%- elif source_relation != None and node.name | lower == table_name | lower -%}
                {{ return(True) }} -- otherwise return True
            {% endif %}
        {%- endfor -%}
    {%- endif -%}
{%- endmacro -%}
{%- macro does_table_exist(table_name) -%}
    {%- if execute -%} -- returns true when dbt is in execute mode
    {%- set ns = namespace(has_table=false) -%} -- declare boolean namespace and default value
        {%- for node in graph.sources.values() -%} -- grab sources from the dictionary of nodes
        -- call the database for the matching table
            {%- set source_relation = adapter.get_relation(
                    database=node.database,
                    schema=node.schema,
                    identifier=node.name | upper) -%}
            {%- if source_relation == None and node.name | lower == table_name | lower -%}
                {{ return(False) }} -- return false if relation identified by the database.schema.identifier does not exist for the given table name
            {%- elif source_relation != None and node.name | lower == table_name | lower -%}
                {{ return(True) }} -- otherwise return True
            {% endif %}
        {%- endfor -%}
    {%- endif -%}
{%- endmacro -%}

We would like to bring this to your attention and see if it would make sense to modify this macro to avoid this issue since we also saw that the dbt team used the identifier in one of the packages we use, see here.

Many thanks in advance and let us know if more information is needed.

@fivetran-joemarkiewicz
Copy link
Contributor

Hi @ggmblr thanks so much for raising this to our team!

We have actually encountered a similar issue in other implementations that leverage the get_relations macro and have solved it by using your suggested identifier in place of name edit. This is definitely something we will want to account for within our package to ensure other users do not encounter this compilation error.

If you would like to open a PR to account for this change I would be happy for myself or someone on my team to review it and work to merge it into our next release. We really appreciate you raising this with our team and contributing your insight into the fix! Let me know if you are open to contributing a PR. Otherwise, my team can address this issue in our next sprint.

@ggmblr
Copy link
Author

ggmblr commented Sep 1, 2022

Hi @fivetran-joemarkiewicz
I just now openend the PR. Ithink its my first open source PR, so if you guys need anything else for it to be approved feel free to let me know.

@fivetran-jamie
Copy link
Contributor

PR has been merged and v0.6.3 should be live on the hub shortly!! thanks @ggmblr

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants