Skip to content
This repository has been archived by the owner on Dec 4, 2024. It is now read-only.

dbt 0.12 and custom schemas #7

Closed
dapearce opened this issue Jan 10, 2019 · 5 comments
Closed

dbt 0.12 and custom schemas #7

dapearce opened this issue Jan 10, 2019 · 5 comments
Assignees

Comments

@dapearce
Copy link

dapearce commented Jan 10, 2019

This package depends quite a bit on the default schema naming format (eg {{ schema }}_meta). dbt 0.12 applies custom schema naming overrides to packages (dbt-labs/dbt-core#801), which causes this package to break.

The primary issue I believe is the line below, which defines the schema name of audit log table, overriding any custom schema naming that the other models will now use:
https://github.com/fishtown-analytics/dbt-event-logging/blob/master/macros/audit.sql#L5

@dapearce
Copy link
Author

cc @jthandy

FYI I reconfigured our schema naming to get around this but may be impacting others

@jthandy
Copy link
Member

jthandy commented Jan 11, 2019

oh christ. yeah, I actually did it this way originally because that feature didn't exist. I will try to get around to fixing this. Thanks for the heads up.

@clrcrl clrcrl self-assigned this Jan 24, 2019
@clrcrl
Copy link
Contributor

clrcrl commented Jan 24, 2019

I had a look at this today as it's relevant to a piece of client work that I'm doing.

The tricky part about using the custom schema behavior is that we have to know which schema the audit table lives in.
Currently:

  • dbt inserts into the relation {{ logging.get_audit_relation() }} here, e.g. jaffle_shop_dev_meta.dbt_audit_log
  • dbt figures out what object that is by using schema=target.schema~'_meta' here, in our case jaffle_shop_dev_meta.

If we instead use: schema=schema, the dbt_audit_log table gets created in the schema that is responsive to the generate_schema_name macro.

But then, when we try to insert into this table from a model that is being created in another schema, it will use the schema for the current model, leading to errors like this:

...
10:24:16 | 4 of 5 START table model jaffle_shop_dev.customers................... [RUN]
10:24:16 | 4 of 5 ERROR creating table model jaffle_shop_dev.customers.......... [ERROR in 0.36s]
...
Completed with 2 errors:

Database Error in model customers (models/customers.sql)
  relation "jaffle_shop_dev.dbt_audit_log" does not exist
  compiled SQL at target/compiled/jaffle_shop/customers.sql

I can't find a way to easily tell dbt to always the schema for the dbt_audit_log table instead. I tried something like:

    {% set audit_schema =
        ref('stg_dbt_audit_log').schema
    %}

but that resulted in:

Compilation Error in model customers (models/customers.sql)
  dbt was unable to infer all dependencies for the model "customers".
  This typically happens when ref() is placed within a conditional block.

  To fix this, add the following hint to the top of the model "customers":

  -- depends_on: {{ ref('stg_dbt_audit_log') }}

  > in macro get_audit_relation (macros/audit.sql)
  > called by macro log_audit_event (macros/audit.sql)
  > called by macro log_model_start_event (macros/audit.sql)
  > called by macro statement (macros/core.sql)
  > called by macro run_hooks (macros/materializations/helpers.sql)
  > called by macro materialization_table_default (macros/materializations/table/table.sql)
  > called by model customers (models/customers.sql)

So, is there a way to access {{ logging.schema }} in some way without introducing a whole bunch of dependencies?

@drewbanin
Copy link
Contributor

@clrcrl what is the desired behavior here? Should all logs go into the same table? Or should a _meta schema with an audit table be created for every custom schema in the project.

If we want everything to go into a single table, I think we might need to hack it like this:

{% set schema = graph.nodes['model.logging.stg_dbt_audit_log']['schema'] %}

This is a pretty big hack, but it's no worse than the existing hack :p

@clrcrl
Copy link
Contributor

clrcrl commented Dec 10, 2019

Closing as I'm not actually sure what the bug is here. I tried to dig in a tiny bit, but didn't get anywhere:

Steps to reproduce:

  1. Add event-logging package to project, including the model pre and post hooks
  2. Configure some models to be materialized in a separate schema
  3. dbt run

Expected results:

  1. All audit records end up in the same table, <target.schema>_meta.dbt_audit_log, regardless of schema they are configured in.
  2. The event_schema is correct for each model

Actual results:

  1. As above — no bug detected

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants