Improve insert method used #381

Merged
merged 18 commits into from
Sep 18, 2023
246 changes: 246 additions & 0 deletions macros/_macros.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,246 @@
version: 2

macros:
## DATABASE SPECIFIC HELPERS ##
- name: column_identifier
description: |
Dependent on the adapter type, return the identifier for a column using a numerical index.
arguments:
- name: column_index
type: integer
description: |
The index of the column to return the identifier for

- name: generate_surrogate_key
description: |
Since folks commonly install dbt_artifacts alongside a myriad of other packages,
we copy the dbt_utils implementation of the surrogate_key macro so we don't have
any dependencies to make conflicts worse!

This version is:
URL: https://github.com/dbt-labs/dbt-utils/blob/main/macros/sql/generate_surrogate_key.sql
Commit SHA: eaa0e41b033bdf252eff0ae014ec11888f37ebff
Date: 2023-04-28
arguments:
- name: field_list
type: list
description: |
A list of fields to concatenate together to form the surrogate key

- name: get_relation
description: |
Identify a relation in the graph from a relation name
arguments:
- name: relation_name
type: string
description: |
The name of the relation to return from the graph

- name: parse_json
description: |
Dependent on the adapter type, return a column which parses the JSON field.
arguments:
- name: field
type: string
description: |
The name of the field to parse

- name: type_array
description: |
Dependent on the adapter type, returns the native type for storing an array.

- name: type_boolean
description: |
Dependent on the adapter type, returns the native boolean type.

- name: type_json
description: |
Dependent on the adapter type, returns the native type for storing JSON.

## MIGRATION ##
- name: migrate_from_v0_to_v1
description: |
A macro to assist with migrating from v0 to v1 of dbt_artifacts. See
https://github.com/brooklyn-data/dbt_artifacts/blob/main/README.md#migrating-from-100-to-100
for details on the usage.
arguments:
- name: old_database
type: string
description: |
The database of the <1.0.0 output (fct_/dim_) models - does not have to be different to `new_database`
- name: old_schema
type: string
description: |
The schema of the <1.0.0 output (fct_/dim_) models - does not have to be different to `new_schema`
- name: new_database
type: string
description: |
The target database that the v1 artifact sources are in - does not have to be different to `old_database`
- name: new_schema
type: string
description: |
The target schema that the v1 artifact sources are in - does not have to be different to `old_schema`

## UPLOAD INDIVIDUAL DATASETS ##
- name: upload_exposures
description: |
The macro to support upload of the data to the exposures table.
arguments:
- name: exposures
type: list
description: |
A list of exposure objects extracted from the dbt graph

- name: upload_invocations
description: |
The macro to support upload of the data to the invocations table.

- name: upload_model_executions
description: |
The macro to support upload of the data to the model_executions table.
arguments:
- name: models
type: list
description: |
A list of model execution result objects extracted from the dbt results object

- name: upload_models
description: |
The macro to support upload of the data to the models table.
arguments:
- name: models
type: list
description: |
A list of model objects extracted from the dbt graph

- name: upload_seed_executions
description: |
The macro to support upload of the data to the seed_executions table.
arguments:
- name: seeds
type: list
description: |
A list of seed execution result objects extracted from the dbt results object

- name: upload_seeds
description: |
The macro to support upload of the data to the seeds table.
arguments:
- name: seeds
type: list
description: |
A list of seed objects extracted from the dbt graph

- name: upload_snapshot_executions
description: |
The macro to support upload of the data to the snapshot_executions table.
arguments:
- name: snapshots
type: list
description: |
A list of snapshot execution result objects extracted from the dbt results object

- name: upload_snapshots
description: |
The macro to support upload of the data to the snapshots table.
arguments:
- name: snapshots
type: list
description: |
A list of snapshot objects extracted from the dbt graph

- name: upload_sources
description: |
The macro to support upload of the data to the sources table.
arguments:
- name: sources
type: list
description: |
A list of source objects extracted from the dbt graph

- name: upload_test_executions
description: |
The macro to support upload of the data to the test_executions table.
arguments:
- name: tests
type: list
description: |
A list of test execution result objects extracted from the dbt results object

- name: upload_tests
description: |
The macro to support upload of the data to the tests table.
arguments:
- name: tests
type: list
description: |
A list of test objects extracted from the dbt graph

## UPLOAD RESULTS ##
- name: get_column_name_list
description: |
A macro to return the list of column names for a particular dataset. Returns a comment if the dataset is not
valid.
arguments:
- name: dataset
type: string
description: |
The name of the dataset to return the column names for e.g. `models`

- name: get_dataset_content
description: |
A macro to extract the data to be uploaded from either the results or the graph object.
arguments:
- name: dataset
type: string
description: |
The name of the dataset to return the data for e.g. `models`

- name: get_table_content_values
description: |
A macro to create the insert statement values required to be uploaded to the table.
arguments:
- name: dataset
type: string
description: |
The name of the dataset to generate the insert statement values for e.g. `models`
- name: objects_to_upload
type: list
description: |
The objects to be used to generate the insert statement values - extracted from `get_dataset_content`

- name: insert_into_metadata_table
description: |
Dependent on the adapter type, the wrapper to insert the data into a table from a list of values. Used in the
`upload_results` macro, alongside the `get_column_name_list` macro to generate the column names and the
`upload_<dataset>` macros (e.g. `upload_models`) to generate the data to be inserted.
arguments:
- name: database_name
type: string
description: |
The database name for the relation that the data is to be inserted into
- name: schema_name
type: string
description: |
The schema name for the relation that the data is to be inserted into
- name: table_name
type: string
description: |
The table name for the relation that the data is to be inserted into
- name: fields
type: string
description: |
The list of fields for the relation that the data is to be inserted into
- name: content
type: string
description: |
The data content to insert into the relation

- name: upload_results
description: |
The main macro called to upload the metadata into each of the source tables.
arguments:
- name: results
type: list
description: |
The results object from dbt.
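For orientation, `upload_results` is the entry point documented above and takes dbt's `results` object. A minimal sketch of how it is typically wired up, assuming the package is installed under the name `dbt_artifacts` and the hook lives in the consuming project's `dbt_project.yml`:

```yaml
# dbt_project.yml (fragment of the consuming project, not part of this PR)
# `results` is the context variable dbt exposes to on-run-end hooks.
on-run-end:
  - "{{ dbt_artifacts.upload_results(results) }}"
```

With a hook like this, the dataset-specific macros listed under "UPLOAD INDIVIDUAL DATASETS" are not called directly by the user.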
14 changes: 14 additions & 0 deletions macros/database_specific_helpers/get_relation.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
{% macro get_relation(relation_name) %}
    {% if execute %}
        {% set model_get_relation_node = graph.nodes.values() | selectattr('name', 'equalto', relation_name) | first %}
        {% set relation = api.Relation.create(
            database = model_get_relation_node.database,
            schema = model_get_relation_node.schema,
            identifier = model_get_relation_node.alias
        )
        %}
        {% do return(relation) %}
    {% else %}
        {% do return(api.Relation.create()) %}
    {% endif %}
{% endmacro %}
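A short usage sketch for `get_relation`, assuming the package namespace is `dbt_artifacts` and that a node named `models` exists in the graph. The macro name `count_rows_in_dataset` is hypothetical and only illustrates how the returned relation might be rendered into SQL:

```sql
{# Hypothetical helper; only get_relation comes from the diff above. #}
{% macro count_rows_in_dataset(dataset) %}
    {# Outside execution, get_relation returns an empty relation, so guard on `execute`. #}
    {% if execute %}
        {# Resolve the node named `dataset` (e.g. 'models') to a Relation. #}
        {% set dataset_relation = dbt_artifacts.get_relation(dataset) %}

        {% set count_sql %}
            select count(*) as row_count from {{ dataset_relation }}
        {% endset %}

        {# run_query returns an agate table; read the first value of the first column. #}
        {% set row_count = run_query(count_sql).columns[0].values()[0] %}
        {% do log(dataset ~ ' has ' ~ row_count ~ ' rows', info=true) %}
    {% endif %}
{% endmacro %}
```

Because `get_relation` matches on `name` alone and takes the first hit, this sketch relies on the node name being unique across the graph.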
File renamed without changes.
Original file line number Diff line number Diff line change
@@ -19,11 +19,11 @@
 {% endmacro %}

 {% macro snowflake__type_json() %}
-    OBJECT
+    object
 {% endmacro %}

 {% macro bigquery__type_json() %}
-    JSON
+    json
 {% endmacro %}

 {#- ARRAY -#}
@@ -37,9 +37,9 @@
 {% endmacro %}

 {% macro snowflake__type_array() %}
-    ARRAY
+    array
 {% endmacro %}

 {% macro bigquery__type_array() %}
-    ARRAY<string>
+    array<string>
 {% endmacro %}
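The hunks above only change the casing of the rendered type names. A minimal sketch of how the dispatched helpers might be used in SQL, assuming the wrappers `type_json` and `type_array` documented in `_macros.yml` are called through the `dbt_artifacts` namespace. The column aliases are hypothetical:

```sql
{# Hypothetical snippet: cast placeholders to the adapter's native types. #}
select
    {# renders `object` on Snowflake and `json` on BigQuery, per the hunks above #}
    cast(null as {{ dbt_artifacts.type_json() }}) as example_json_column,
    {# renders `array` on Snowflake and `array<string>` on BigQuery #}
    cast(null as {{ dbt_artifacts.type_array() }}) as example_array_column
```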
38 changes: 0 additions & 38 deletions macros/insert_into_metadata_table.sql

This file was deleted.
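The deleted file's contents are not shown here, but `_macros.yml` still documents `insert_into_metadata_table` with the arguments `database_name`, `schema_name`, `table_name`, `fields` and `content`. A minimal sketch of what an insert wrapper with those arguments could look like follows. It assumes `fields` is a parenthesised column list and `content` is a comma-separated run of row tuples. Both shapes, the macro name and the body are assumptions for illustration, not the package's implementation:

```sql
{# Hypothetical sketch; name and body are illustrative only. #}
{% macro insert_rows_sketch(database_name, schema_name, table_name, fields, content) %}
    {# Assumes fields looks like "(col_a, col_b)" and content like "(1, 'x'), (2, 'y')". #}
    {% set insert_sql %}
        insert into {{ database_name }}.{{ schema_name }}.{{ table_name }} {{ fields }}
        values {{ content }}
    {% endset %}

    {# Only issue the statement when there is something to insert. #}
    {% if content | trim != '' %}
        {% do run_query(insert_sql) %}
    {% endif %}
{% endmacro %}
```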

Original file line number Diff line number Diff line change
@@ -1,8 +1,4 @@
-{% macro upload_exposures(graph) -%}
-    {% set exposures = [] %}
-    {% for node in graph.exposures.values() %}
-        {% do exposures.append(node) %}
-    {% endfor %}
+{% macro upload_exposures(exposures) -%}
     {{ return(adapter.dispatch('get_exposures_dml_sql', 'dbt_artifacts')(exposures)) }}
 {%- endmacro %}

Original file line number Diff line number Diff line change
@@ -1,10 +1,4 @@
-{% macro upload_model_executions(results) -%}
-    {% set models = [] %}
-    {% for result in results %}
-        {% if result.node.resource_type == "model" %}
-            {% do models.append(result) %}
-        {% endif %}
-    {% endfor %}
+{% macro upload_model_executions(models) -%}
     {{ return(adapter.dispatch('get_model_executions_dml_sql', 'dbt_artifacts')(models)) }}
 {%- endmacro %}

File renamed without changes.
Original file line number Diff line number Diff line change
@@ -1,10 +1,4 @@
-{% macro upload_seed_executions(results) -%}
-    {% set seeds = [] %}
-    {% for result in results %}
-        {% if result.node.resource_type == "seed" %}
-            {% do seeds.append(result) %}
-        {% endif %}
-    {% endfor %}
+{% macro upload_seed_executions(seeds) -%}
     {{ return(adapter.dispatch('get_seed_executions_dml_sql', 'dbt_artifacts')(seeds)) }}
 {%- endmacro %}

Original file line number Diff line number Diff line change
@@ -1,8 +1,4 @@
-{% macro upload_seeds(graph) -%}
-    {% set seeds = [] %}
-    {% for node in graph.nodes.values() | selectattr("resource_type", "equalto", "seed") %}
-        {% do seeds.append(node) %}
-    {% endfor %}
+{% macro upload_seeds(seeds) -%}
     {{ return(adapter.dispatch('get_seeds_dml_sql', 'dbt_artifacts')(seeds)) }}
 {%- endmacro %}

Original file line number Diff line number Diff line change
@@ -1,10 +1,4 @@
-{% macro upload_snapshot_executions(results) -%}
-    {% set snapshots = [] %}
-    {% for result in results %}
-        {% if result.node.resource_type == "snapshot" %}
-            {% do snapshots.append(result) %}
-        {% endif %}
-    {% endfor %}
+{% macro upload_snapshot_executions(snapshots) -%}
     {{ return(adapter.dispatch('get_snapshot_executions_dml_sql', 'dbt_artifacts')(snapshots)) }}
 {%- endmacro %}

Original file line number Diff line number Diff line change
@@ -1,8 +1,5 @@
-{% macro upload_snapshots(graph) -%}
-    {% set snapshots = [] %}
-    {% for node in graph.nodes.values() | selectattr("resource_type", "equalto", "snapshot") %}
-        {% do snapshots.append(node) %}
-    {% endfor %}
+{% macro upload_snapshots(snapshots) -%}
+
     {{ return(adapter.dispatch('get_snapshots_dml_sql', 'dbt_artifacts')(snapshots)) }}

 {%- endmacro %}
File renamed without changes.
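The `upload_*` hunks above drop the per-macro filtering loops, so each macro now receives a ready-made list, and `_macros.yml` attributes that extraction to `get_dataset_content`. Its body is not shown in this section. The following is a minimal sketch of how the removed filters could be centralised in one place. The macro name `get_dataset_content_sketch`, the explicit `results` argument and the name-to-resource-type mapping are assumptions for illustration, not the package's actual implementation:

```sql
{# Hypothetical sketch; not the package's actual get_dataset_content. #}
{% macro get_dataset_content_sketch(dataset, results) %}
    {% set objects = [] %}
    {% if dataset in ['model_executions', 'seed_executions', 'snapshot_executions'] %}
        {# 'model_executions' -> 'model', mirroring the removed resource_type checks #}
        {% set resource_type = dataset.split('_')[0] %}
        {% set objects = results | selectattr('node.resource_type', 'equalto', resource_type) | list %}
    {% elif dataset == 'exposures' %}
        {# exposures live in their own section of the graph #}
        {% set objects = graph.exposures.values() | list %}
    {% elif dataset in ['seeds', 'snapshots'] %}
        {# 'seeds' -> 'seed': strip the trailing "s" to get the resource type #}
        {% set resource_type = dataset[:-1] %}
        {% set objects = graph.nodes.values() | selectattr('resource_type', 'equalto', resource_type) | list %}
    {% endif %}
    {{ return(objects) }}
{% endmacro %}
```

A caller could then pass the result straight through, for example `dbt_artifacts.upload_seeds(get_dataset_content_sketch('seeds', results))`, which matches the new single-argument signatures in the hunks above.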