MVP of dbt-based data publishing framework #1505

Merged
merged 35 commits into from
Jun 3, 2022

@atvaccaro atvaccaro commented May 13, 2022

Description

This PR implements a minimum-viable data publishing framework based on dbt exposures. Initially, the primary goal is to support uploads to CKAN; that process is also documented in this PR.

Once deployed, #284 and #427 can be closed.
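
As a rough illustration of what "based on dbt exposures" can look like, here is a minimal sketch that lists the exposures found in a parsed dbt `manifest.json`. It is not the code from this PR. The top-level `exposures` mapping and the `depends_on.nodes` field are part of dbt's manifest artifact, but the `destinations` key under `meta`, the function name, and the example project and model names are all hypothetical.

```python
from typing import Any, Dict, List


def list_exposures(manifest: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one summary dict per exposure in a parsed dbt manifest."""
    summaries = []
    for unique_id, exposure in sorted(manifest.get("exposures", {}).items()):
        summaries.append(
            {
                "unique_id": unique_id,
                "name": exposure.get("name"),
                # Nodes (models, sources) that the exposure depends on.
                "depends_on": exposure.get("depends_on", {}).get("nodes", []),
                # ASSUMPTION: "destinations" is a made-up meta key used here
                # only to show where publishing targets could be declared.
                "destinations": exposure.get("meta", {}).get("destinations", []),
            }
        )
    return summaries


# Hand-written manifest fragment, for illustration only.
example_manifest = {
    "exposures": {
        "exposure.my_project.example_open_data": {
            "name": "example_open_data",
            "depends_on": {"nodes": ["model.my_project.example_table"]},
            "meta": {"destinations": [{"type": "ckan"}]},
        }
    }
}

for summary in list_exposures(example_manifest):
    print(summary["name"], "->", summary["depends_on"])
```

In practice the manifest would be loaded from dbt's `target/manifest.json` after a run; the fragment above only stands in for that file.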

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation
  • agencies.yml

How has this been tested?

I've tested metadata generation and GCS writing locally. However, we can't test publishing to CKAN until we actually go through the process, which we will be doing once this PR is merged.

Screenshots (optional)

@atvaccaro atvaccaro changed the title from "Dbt publishing" to "Draft of dbt-based data publishing framework" May 16, 2022
@atvaccaro atvaccaro self-assigned this May 18, 2022
@atvaccaro atvaccaro mentioned this pull request May 25, 2022
@atvaccaro atvaccaro changed the title from "Draft of dbt-based data publishing framework" to "MVP of dbt-based data publishing framework" Jun 1, 2022
@atvaccaro atvaccaro marked this pull request as ready for review June 1, 2022 14:55

@lauriemerrell lauriemerrell left a comment

  1. thank you for writing all these docs, I know it is not your favorite 🙏
  2. this is cool and I am excited!
  3. mostly nitpicky comments; I just want the docs to be as detailed as possible in case this sits for a bit and we have to come back to it later. I think my implicit standard is: could someone who's not Andrew, and can't talk to Andrew, successfully publish based on the documentation here alone? (if you don't think that level is possible yet, I guess that will need to wait)
  4. ETA: can you update the description? seems like the linked Google doc is obsolete, so it can probably be removed, and info on any testing would be good. also, I don't think this PR closes all those tickets? we still need to actually implement, right? this gives the framework/infra to publish but does not actually publish

warehouse/scripts/dbt_artifacts.py (resolved)
warehouse/scripts/dbt_artifacts.py (outdated, resolved)
warehouse/scripts/json_to_docblocks.py (resolved)
warehouse/scripts/publish.py (outdated, resolved)
warehouse/scripts/publish.py (resolved)
docs/publishing/sections/7_ckan.md (resolved)
docs/publishing/sections/7_ckan.md (resolved)
docs/publishing/sections/7_ckan.md (resolved)
@lauriemerrell

Another request per meeting with Eric: can we add an option (maybe a modification of --dry-run) that writes to somewhere in GCS (maybe a test- version of the bucket) but doesn't push to CKAN or the final endpoint, so people can see exactly the files that would be published?

@atvaccaro

done! the bucket is still a CLI option, but there are now separate dry-run and deploy concepts.
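
To make the distinction concrete, here is a minimal sketch of how a bucket option plus separate dry-run and deploy flags could fit together. It is not the code in warehouse/scripts/publish.py. The flag names `--bucket` and `--deploy`, the use of `argparse`, and the exact semantics (dry run writes nothing, a plain run writes to the bucket only, deploy also pushes to the final endpoint) are all assumptions made for illustration.

```python
import argparse
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PublishPlan:
    bucket: str
    write_to_gcs: bool
    push_to_ckan: bool


def build_parser() -> argparse.ArgumentParser:
    # ASSUMPTION: these flag names are hypothetical, not the PR's actual CLI.
    parser = argparse.ArgumentParser(description="Hypothetical publish CLI sketch")
    parser.add_argument("--bucket", required=True, help="GCS bucket to write files to")
    parser.add_argument("--dry-run", action="store_true", help="report only, write nothing")
    parser.add_argument("--deploy", action="store_true", help="also push to the final endpoint")
    return parser


def plan_publish(argv: Optional[List[str]] = None) -> PublishPlan:
    """Translate CLI flags into the side effects a run would perform."""
    args = build_parser().parse_args(argv)
    if args.dry_run and args.deploy:
        raise ValueError("--dry-run and --deploy are mutually exclusive in this sketch")
    return PublishPlan(
        bucket=args.bucket,
        # A dry run touches nothing; any other run writes files to the bucket.
        write_to_gcs=not args.dry_run,
        # Only an explicit deploy pushes to CKAN (assumed semantics).
        push_to_ckan=args.deploy,
    )
```

Under these assumptions, `plan_publish(["--bucket", "gs://test-example-bucket"])` describes a run that writes the files to a test bucket for inspection without pushing anything to CKAN, which is the behavior the request above asks for.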

@atvaccaro atvaccaro requested a review from lauriemerrell June 2, 2022 20:09

@lauriemerrell lauriemerrell left a comment

ok just one q

warehouse/scripts/publish.py (outdated, resolved)
@atvaccaro atvaccaro requested a review from lauriemerrell June 2, 2022 20:59

@lauriemerrell lauriemerrell left a comment

ok sorry, just wondering if we can update the docs to reflect the changed defaults

docs/publishing/sections/7_ckan.md (outdated, resolved)

@lauriemerrell lauriemerrell left a comment

yay 🎉

@atvaccaro atvaccaro merged commit 76dda61 into main Jun 3, 2022
@atvaccaro atvaccaro deleted the dbt-publishing branch June 3, 2022 15:06