Integrate income_statement_ferc1 table #2147

Merged
cmgosnell merged 16 commits into dev from income
Dec 23, 2022

cmgosnell (Member) commented Dec 19, 2022

Overview

Closes #1813
This table is weird because it has two dbf tables flowing into one pudl table.

  • bespoke extract step that concatenates the two dbf tables together
  • pudl metadata resource and field definitions
  • row maps for the dbf row numbers (which is 348 of the line changes!)
  • both the dbf and xbrl tables need a reshape (the dbf table has a column for each utility_type, while the xbrl table has multiple columns for each income_type)
  • I enabled align_row_numbers_dbf to take a list of dbf_table_names instead of a single dbf_table_name. This felt a little silly because this may be the only table that ever needs it, but it was very easy to implement and felt simpler than having two versions of align_row_numbers_dbf
  • I had to make an overridden version of source_table_id because the Ferc1AbstractTableTransformer assumes there is only one source table. The new version uses the table name that was added during the extract step. We cooooould do this for all of the DBF tables, but we would need to always add the XBRL table name directly from the extract step. That all felt too.. much for one table. The main quirk here is that source_table_id in the main assign_record_id method takes a source_ferc1 and a df. For all of the other tables the df does nada, but the income statement table needs it. The default method has kwargs, so this works okay but feels a little weird.
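
For readers unfamiliar with the pattern, here is a rough sketch of the extract-time tagging and the source_table_id override described above. It is not the code in this PR: the class names, the sched_table_name column, the placeholder single-table names, and the record_id layout are all assumptions made for illustration. Only the two DBF table names come from the linked issue.

```python
from __future__ import annotations

import pandas as pd

# The two DBF tables named in the linked issue.
DBF_TABLE_NAMES = ["f1_income_stmnt", "f1_incm_stmnt_2"]


def concat_dbf_tables(raw_dfs: dict[str, pd.DataFrame]) -> pd.DataFrame:
    """Stack the raw DBF tables, recording which table each row came from.

    The ``sched_table_name`` column name is an assumption for this sketch.
    """
    tagged = [
        raw_dfs[name].assign(sched_table_name=name) for name in DBF_TABLE_NAMES
    ]
    return pd.concat(tagged, ignore_index=True)


class SingleSourceTransformer:
    """Hypothetical stand-in for a transformer with one table per source."""

    # Placeholder table names, not real FERC tables.
    source_tables = {"dbf": "f1_some_table", "xbrl": "some_xbrl_table"}

    def source_table_id(self, source_ferc1: str, **kwargs) -> str:
        # Any extra keyword arguments (such as df) are accepted and ignored.
        return self.source_tables[source_ferc1]

    def assign_record_id(self, df: pd.DataFrame, source_ferc1: str) -> pd.DataFrame:
        # df is always passed along; only some subclasses actually use it.
        table_id = self.source_table_id(source_ferc1, df=df)
        record_id = (
            table_id
            + "_"
            + df["report_year"].astype(str)
            + "_"
            + df["row_number"].astype(str)
        )
        return df.assign(record_id=record_id)


class IncomeStatementTransformer(SingleSourceTransformer):
    """Hypothetical override: the DBF table name varies row by row."""

    def source_table_id(self, source_ferc1: str, df: pd.DataFrame, **kwargs):
        if source_ferc1 == "dbf":
            # Use the per-row table name added during the extract step.
            return df["sched_table_name"]
        return super().source_table_id(source_ferc1)
```

Because the base method swallows extra keyword arguments, assign_record_id can pass df unconditionally, and only the income statement override has to care about it.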

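As a rough illustration of the reshape mentioned in the overview, the sketch below melts a DBF-style frame (one column per utility type) and an XBRL-style frame (one column per income type) into the same long layout. The column names, the values, and the exact XBRL layout are assumptions; this is one plausible reading of the description, not the transform implemented in this PR.

```python
import pandas as pd

# Assumed DBF-style layout: one row per income type, one column per utility type.
dbf_wide = pd.DataFrame(
    {
        "income_type": ["operating_revenues", "operation_expense"],
        "electric": [100.0, 60.0],
        "gas": [40.0, 25.0],
    }
)

# Assumed XBRL-style layout: one row per utility type, one column per income type.
xbrl_wide = pd.DataFrame(
    {
        "utility_type": ["electric", "gas"],
        "operating_revenues": [100.0, 40.0],
        "operation_expense": [60.0, 25.0],
    }
)

# Melt each frame so both end up with utility_type, income_type and income columns.
dbf_long = dbf_wide.melt(
    id_vars=["income_type"], var_name="utility_type", value_name="income"
)
xbrl_long = xbrl_wide.melt(
    id_vars=["utility_type"], var_name="income_type", value_name="income"
)

KEYS = ["utility_type", "income_type"]


def tidy(df: pd.DataFrame) -> pd.DataFrame:
    """Put columns and rows in a consistent order so the two frames can be compared."""
    return df[KEYS + ["income"]].sort_values(KEYS).reset_index(drop=True)
```

After tidy() both frames hold one row per (utility_type, income_type) pair, which is the shape a shared downstream transform could work from.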
PR Checklist

Before requesting a review of your pull request, please make sure you've done the
following:

  • Merge the most recent version of dev (or the appropriate upstream branch) into
    your branch and resolve any merge conflicts. You may need to do this several
    times over the course of a PR as dev changes frequently.
  • Verify that all of the CI checks on your PR are passing. See
    Running Tests with Tox
    for details on how to run the full test suite locally if you need to debug a
    particular failure.
  • Ensure that the docstrings for any new modules, classes, functions, or methods are
    descriptive enough for developers and users to understand your code.
  • If you expanded data coverage or changed the outputs, ensure that the full
    data validation tests
    pass locally on a fresh DB.
  • If you've added new functions or classes, ensure that they have at least basic
    unit tests.
  • If you've added new analyses, make sure they include defensive sanity checks that
    will catch unexpected data issues.
  • Update the
    release notes
    to reflect your changes. Make sure to reference the PR and any related issues.
  • Do your own review of the PR. Add comments highlighting areas where you have
    questions you'd like reviewers to answer, known issues, solutions you're
    unsatisfied with, or other things that deserve special attention from the
    reviewer.

@cmgosnell added the ferc1 (Anything having to do with FERC Form 1), rmi, xbrl (Related to the FERC XBRL transition), and dbf (Data coming from FERC's old Visual FoxPro DBF database file format) labels Dec 19, 2022
@cmgosnell cmgosnell self-assigned this Dec 19, 2022
@cmgosnell cmgosnell linked an issue Dec 19, 2022 that may be closed by this pull request
codecov bot commented Dec 19, 2022

Codecov Report

Base: 85.3% // Head: 85.4% // Increases project coverage by +0.0% 🎉

Coverage data is based on head (cbff8fa) compared to base (c959d4b).
Patch coverage: 89.5% of modified lines in pull request are covered.

Additional details and impacted files
@@          Coverage Diff          @@
##             dev   #2147   +/-   ##
=====================================
  Coverage   85.3%   85.4%           
=====================================
  Files         73      73           
  Lines       8746    8777   +31     
=====================================
+ Hits        7469    7496   +27     
- Misses      1277    1281    +4     
Impacted Files                          Coverage Δ
src/pudl/metadata/fields.py             100.0% <ø> (ø)
src/pudl/metadata/resources/ferc1.py    100.0% <ø> (ø)
src/pudl/transform/params/ferc1.py      100.0% <ø> (ø)
src/pudl/transform/ferc1.py             94.7% <88.0%> (-0.5%) ⬇️
src/pudl/extract/ferc1.py               86.0% <100.0%> (+0.2%) ⬆️


zaneselvans (Member) commented

I guess we never figured out why you were having weird Numpy failures in the CI. If the builds pass would you bump the max Numpy version back up to <1.25?

zaneselvans (Member) left a comment

Hey these changes look good. Hopefully the numpy thing isn't an issue!

@cmgosnell cmgosnell merged commit 3094677 into dev Dec 23, 2022
@cmgosnell cmgosnell deleted the income branch December 23, 2022 03:46
Development

Successfully merging this pull request may close these issues.

Transform f1_income_stmnt & f1_incm_stmnt_2 xbrl + dbf
3 participants