Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert epacems_to_parquet command to run dagster asset #2300

Merged
merged 4 commits into from
Feb 17, 2023

Conversation

bendnorman
Copy link
Member

@bendnorman bendnorman commented Feb 10, 2023

PR Overview

The epacems_to_parquet command calls pudl.etl.etl_epacems which is no longer a thing in the dagster-asset-etl branch. This PR replaces the call to pudl.etl.etl_epacems with the execution of a dagster job that just runs the hourly_emissions_epacems asset. This command still expects the epacamd_eia and plants_entity_eia tables to be in pudl.sqlite. The pudl_sqlite_io_manager will throw errors if the tables don't exist in the database.

I'd like to hold on on merge this into dagster-asset-etl until we can get the CI passing in #2299 and merged into this branch.

PR Checklist

  • Merge the most recent version of the branch you are merging into (probably dev).
  • All CI checks are passing. Run tests locally to debug failures
  • Make sure you've included good docstrings.
  • For major data coverage & analysis changes, run data validation tests
  • Include unit tests for new functions and classes.
  • Defensive data quality/sanity checks in analyses & data processing functions.
  • Update the release notes and reference reference the PR and related issues.
  • Do your own explanatory review of the PR to help the reviewer understand what's going on and identify issues preemptively.

@bendnorman bendnorman linked an issue Feb 10, 2023 that may be closed by this pull request
1 task
@bendnorman bendnorman requested a review from zschira February 10, 2023 22:24
@bendnorman bendnorman added this to the Port ETL to Dagster milestone Feb 13, 2023
@bendnorman bendnorman added inframundo dagster Issues related to our use of the Dagster orchestrator labels Feb 13, 2023
Copy link
Member

@jdangerx jdangerx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know we're not merging until the tests are in, but figured I would approve this now so we can just merge when the time comes.

)
if df.empty:
raise AssertionError(
f"The {table_name} table is empty. Materialize "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! I found the message you were talking about @bendnorman ! How helpful :)

@codecov
Copy link

codecov bot commented Feb 17, 2023

Codecov Report

Base: 85.7% // Head: 85.7% // Increases project coverage by +0.0% 🎉

Coverage data is based on head (4a29245) compared to base (8dfef94).
Patch coverage: 66.6% of modified lines in pull request are covered.

Additional details and impacted files
@@                Coverage Diff                @@
##           dagster-asset-etl   #2300   +/-   ##
=================================================
  Coverage               85.7%   85.7%           
=================================================
  Files                     79      79           
  Lines                   8906    8906           
=================================================
+ Hits                    7633    7637    +4     
+ Misses                  1273    1269    -4     
Impacted Files Coverage Δ
src/pudl/convert/epacems_to_parquet.py 84.0% <66.6%> (+17.3%) ⬆️
src/pudl/io_managers.py 93.9% <66.6%> (-1.0%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@bendnorman bendnorman merged commit 79709d5 into dagster-asset-etl Feb 17, 2023
@bendnorman bendnorman deleted the convert-epacems-dagster branch February 17, 2023 18:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dagster Issues related to our use of the Dagster orchestrator inframundo
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Convert EPA CEMS CLI command to dagster
3 participants