Feature/add union data #123

Open
wants to merge 16 commits into
base: main

@fivetran-jamie fivetran-jamie commented Jan 8, 2024

PR Overview

This PR will address the following Issue/Feature:
fivetran/dbt_hubspot#130

This PR will result in the following new package version:

v0.15.0

Please detail what change(s) this PR introduces and any additional information that should be known during the review of this PR:
introduces support for running the package on a union of connectors

  • adds union_data macro to each _tmp model
  • includes source_relation in each staging model
  • adds yml documentation for source_relation
  • includes source_relation in all uniqueness tests
  • documents unioning ability in README
  • ensures _dbt_source_relation does not slip in when hubspot__pass_through_all_columns is true
  • disables default source if unioning is occurring
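
As a rough illustration of the uniqueness-test item above, a composite test that includes source_relation might look like the sketch below. The model and column names (`stg_hubspot__company`, `company_id`) are assumptions for illustration and are not taken from this PR's diff; `dbt_utils.unique_combination_of_columns` is the generic test shipped with dbt_utils.

```yaml
# Hypothetical excerpt from a staging models .yml file (names are assumptions).
version: 2

models:
  - name: stg_hubspot__company
    tests:
      # A bare `unique` test on company_id could fail if two unioned
      # connectors reuse the same ID, so the key is scoped by source_relation.
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - source_relation
            - company_id
```

Scoping the key this way keeps the test meaningful for single-connector users as well, since source_relation is then a constant and the test reduces to uniqueness on the ID.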

dbt_hubspot_source v0.15.0

PR #123 includes the following updates:

🎉 Feature Update 🎉

  • This release supports running the package on multiple HubSpot sources at once! See the README for details on how to leverage this feature.
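
For orientation, enabling the union feature would probably come down to a variable in the root project's `dbt_project.yml`, roughly as sketched here. The variable names follow the `<package>_union_schemas` / `<package>_union_databases` convention and are an assumption to be confirmed against the README; the schema and database names are hypothetical.

```yaml
# dbt_project.yml (root project) -- a sketch, not the authoritative configuration
vars:
  # Assumed variable names; set one of the two, not both.
  # Connectors landing in different schemas of the same database:
  hubspot_union_schemas: ['hubspot_usa', 'hubspot_canada']
  # Connectors landing in different databases with the same schema name:
  # hubspot_union_databases: ['hubspot_usa', 'hubspot_canada']
```

Each staging model would then carry a source_relation column identifying which connector a row came from.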

🛠️ Under the Hood 🛠️

  • Included auto-releaser GitHub Actions workflow to automate future releases.
  • Included GitHub Actions workflow to check for docs updates.
  • Updated the maintainer PR template to resemble the most up to date format.

PR Checklist

Basic Validation

Please acknowledge that you have successfully performed the following commands locally:

  • dbt compile
  • dbt run --full-refresh
  • dbt run
  • dbt test
  • dbt run --vars (if applicable) -- ran with different hubspot__pass_through_all_columns values and specific passthrough columns with add_property_label: true

Before marking this PR as "ready for review" the following have been applied:

  • The appropriate issue has been linked and tagged
  • You are assigned to the corresponding issue and this PR
  • BuildKite integration tests are passing

Detailed Validation

Please acknowledge that the following validation checks have been performed prior to marking this PR as "ready for review":

  • You have validated these changes and assure this PR will address the respective Issue/Feature.
  • You are reasonably confident these changes will not impact any other components of this package or any dependent packages.
  • You have provided details below around the validation steps performed to gain confidence in these changes.

See Hex notebook validating transform models

Standard Updates

Please acknowledge that your PR contains the following standard updates:

  • Package versioning has been appropriately indexed in the following locations:
    • indexed within dbt_project.yml
    • indexed within integration_tests/dbt_project.yml
  • CHANGELOG has individual entries for each respective change in this PR
  • README updates have been applied (if applicable)
  • DECISIONLOG updates have been applied (if applicable)
  • Appropriate yml documentation has been added (if applicable)

dbt Docs

Please acknowledge that after the above were all completed the below were applied to your branch:

  • docs were regenerated (unless this PR does not include any code or yml updates) -- will do post-approval

If you had to summarize this PR in an emoji, which would it be?

🇨🇱

@fivetran-jamie fivetran-jamie self-assigned this Jan 8, 2024
@fivetran-jamie fivetran-jamie marked this pull request as ready for review January 9, 2024 18:35
Contributor

@fivetran-joemarkiewicz fivetran-joemarkiewicz left a comment

@fivetran-jamie the results of this PR look great! I just have a few questions and suggestions in the comments below that I would like your eyes on before we approve this. Let me know if you have any questions or want to chat about these further.

Comment on lines 1 to 35
name: 'check docs'
on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

jobs:
  changed-files:
    runs-on: ubuntu-latest
    name: test changed-files
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0

      - name: Get changed files
        id: changed-files
        uses: tj-actions/[email protected]
        with:
          files: docs/**

      - name: Check to see if docs folder hasn't changed
        if: steps.changed-files.outputs.any_changed == 'false'
        run: |
          echo "Docs have not been regenerated."
          exit 1

      - name: Check if docs folder has changed
        if: steps.changed-files.outputs.any_changed == 'true'
        run: |
          echo "Docs have been regenerated!"
          exit 0

Contributor

Let's actually remove this. We are planning on taking a new approach this year on docs checking and generation.

Contributor Author

removed

CHANGELOG.md Outdated

## 🛠️ Under the Hood 🛠️
- Included auto-releaser GitHub Actions workflow to automate future releases.
- Included Github Actions workflow to check for docs updates.

Contributor

After removing the file we should also remove this entry.

Contributor Author

removed

README.md Outdated
To properly incorporate all of your Hubspot connectors into your project's DAG:
1. Define each of your sources in a `.yml` file in your project. Utilize the following template to leverage our table and column documentation.

<details><summary><i>Expand for source configuration template</i></summary><p>

Contributor

I like this idea, but I worry it is going to be a pain to maintain going forward. Is there any other way we can document this without explicitly showing the yml example like you have below?

Contributor Author

updated

  {% if all_passthrough_column_check('stg_hubspot__company_tmp',get_company_columns()) > 0 %}
  -- just pass everything through if extra columns are present, but ensure required columns are present.
  ,{{
      fivetran_utils.remove_prefix_from_columns(
          columns=adapter.get_columns_in_relation(ref('stg_hubspot__company_tmp')),
-         prefix='property_', exclude=get_macro_columns(get_company_columns()))
+         prefix='property_', exclude=(get_macro_columns(get_company_columns()) + ['_dbt_source_relation']))

Contributor

Can you explain the need for this addition?

Contributor

This comment applies to the other models with the similar code update.

Contributor Author

Yeah, essentially we don't want _dbt_source_relation (which is created by union_data/union_relations) to come through the remove_prefix_from_columns macro call.

Without adding it to the exclude list, users passing through all columns would end up with both a source_relation and a _dbt_source_relation column, which is redundant and a little confusing. This change makes sure those users only get the more nicely named source_relation field.

Comment on lines +2 to +7
# - package: fivetran/fivetran_utils
#   version: [">=0.4.0", "<0.5.0"]
- git: https://github.com/fivetran/dbt_fivetran_utils.git
  revision: feature/enhance-union-data
  warn-unpinned: false

Contributor

Reminder to swap before release
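
A sketch of what the swapped-back `packages.yml` might look like, assuming the union_data enhancement is released within the version range already shown in the commented-out lines. If the enhancement ships in a later fivetran_utils version, the range would need adjusting.

```yaml
# packages.yml after the swap (assumes the existing range still applies)
packages:
  - package: fivetran/fivetran_utils
    version: [">=0.4.0", "<0.5.0"]
  # Development-only reference, dropped before release:
  # - git: https://github.com/fivetran/dbt_fivetran_utils.git
  #   revision: feature/enhance-union-data
  #   warn-unpinned: false
```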
