create spot fixer as generic transformer & apply to missing plant names #1980

Closed
Tracked by #2180 ...
cmgosnell opened this issue Oct 12, 2022 · 1 comment · Fixed by #2254
Assignees
Labels
data-repair Interpolating or extrapolating data that we don't actually have. ferc1 Anything having to do with FERC Form 1 rmi xbrl Related to the FERC XBRL transition

Comments

@cmgosnell
Member

cmgosnell commented Oct 12, 2022

With our new ferc1 transform process we are less aggressive about dropping records. As a result, I came across a fair number of records in the steam table that have "" or 0 as the plant name. Most of these records are truly trash (they contain no other data points) and can be dropped, but some of them do carry information. I investigated a fair number of those bad-name-but-data-full records and was able to identify their plant names pretty easily.
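To make the trash vs. fixable distinction concrete, here is a rough pandas sketch. The helper name `split_bad_name_records`, the `capacity_mw` column, and the toy record IDs are all made up for illustration, and it assumes "no other data points" means every non-ID, non-name column is null.

```python
import pandas as pd


def split_bad_name_records(df, name_col="plant_name_ferc1", id_cols=("record_id_ferc1",)):
    """Split records with a blank or 0 plant name into droppable and fixable sets.

    Hypothetical helper: assumes a record is "trash" when every column other
    than the name and ID columns is null.
    """
    # A name is "bad" if it is null, an empty string, or 0.
    bad_name = df[name_col].isna() | df[name_col].astype(str).str.strip().isin(["", "0"])
    data_cols = [c for c in df.columns if c != name_col and c not in id_cols]
    has_data = df[data_cols].notna().any(axis=1)
    droppable = df[bad_name & ~has_data]
    fixable = df[bad_name & has_data]
    return droppable, fixable


# Toy data, invented for this example only.
example = pd.DataFrame(
    {
        "record_id_ferc1": ["a", "b", "c"],
        "plant_name_ferc1": ["", 0, "real plant"],
        "capacity_mw": [None, 50.0, 75.0],
    }
)
droppable, fixable = split_bad_name_records(example)
```

Only the records in `fixable` would need a manual look to work out the real plant name.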

a generic-ish spot-fixer

the params would need to include:

  • some way to identify the specific record to fix
  • the column(s) and fix(es) required

the func/meth would be a table-wide transformer because it needs access to multiple columns. We'd probably find the records and use map or something like that to replace the values in the specific columns.
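A minimal sketch of what such a table-wide spot fixer could look like, assuming the table is a pandas DataFrame and each fix is a dict holding a record ID plus a `fixes` mapping of column to new value. The function name `apply_spot_fixes`, the `capacity_mw` column, and the second record ID are hypothetical and not taken from the PUDL codebase.

```python
import pandas as pd


def apply_spot_fixes(df, spot_fixes, id_col="record_id_ferc1"):
    """Return a copy of df with hand-specified values overwritten.

    Hypothetical helper: each item in spot_fixes looks like
    {id_col: <record id>, "fixes": {<column>: <new value>, ...}}.
    """
    out = df.copy()
    for fix in spot_fixes:
        mask = out[id_col] == fix[id_col]
        if not mask.any():
            # The record is not in this table, so there is nothing to fix.
            continue
        for col, value in fix["fixes"].items():
            out.loc[mask, col] = value
    return out


# Toy table: the second row and the capacity values are invented.
example = pd.DataFrame(
    {
        "record_id_ferc1": ["f1_steam_1999_12_72_0_1", "made_up_record_id"],
        "plant_name_ferc1": ["", "some plant"],
        "capacity_mw": [100.0, 10.0],
    }
)
fixed = apply_spot_fixes(
    example,
    [
        {
            "record_id_ferc1": "f1_steam_1999_12_72_0_1",
            "fixes": {"plant_name_ferc1": "clifty creek"},
        }
    ],
)
```

Looping over the fixes keeps the parameters generic: a single fix can touch several columns, and rows that are not listed pass through unchanged.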

a steam specific application

Using something like the dictionary below:

{
    "spot_fix": [
            {
                "record_id_ferc1": "f1_steam_1999_12_72_0_1",
                "fixes": {"plant_name_ferc1": "clifty creek"},
            },
            {
                "record_id_ferc1": "f1_steam_2010_12_306_0_1",
                "fixes": {"plant_name_ferc1": "harrison county"},
            },
            {
                "record_id_ferc1": "f1_steam_1997_12_230_0_1",
                "fixes": {"plant_name_ferc1": "hermiston generating"},
            },
            {
                "record_id_ferc1": "f1_steam_1998_12_64_0_1",
                "fixes": {"plant_name_ferc1": "hardee power station"},
            },
            {
                "record_id_ferc1": "f1_steam_2015_12_276_0_1",
                "fixes": {"plant_name_ferc1": "state line"},
            },
            {
                "record_id_ferc1": "f1_steam_2014_12_276_0_1",
                "fixes": {"plant_name_ferc1": "state line"},
            },
            {
                "record_id_ferc1": "f1_steam_2003_12_62_2_3",
                "fixes": {"plant_name_ferc1": "pea ridge"},
            },
            {
                "record_id_ferc1": "f1_steam_2003_12_62_2_2",
                "fixes": {"plant_name_ferc1": "smith"},
            },
            {
                "record_id_ferc1": "f1_steam_2000_12_204_0_1",
                "fixes": {"plant_name_ferc1": "seabrook"},
            },
            {
                "record_id_ferc1": "f1_steam_2001_12_204_0_1",
                "fixes": {"plant_name_ferc1": "seabrook"},
            },
        ]
}
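Since every steam fix here touches only `plant_name_ferc1`, the "use map" idea could be sketched roughly as follows. The toy `steam` frame and the `made_up_record_id` row are invented for illustration, and only two of the fixes above are repeated to keep the example short.

```python
import pandas as pd

# Two entries copied from the dictionary above.
spot_fix = [
    {
        "record_id_ferc1": "f1_steam_1999_12_72_0_1",
        "fixes": {"plant_name_ferc1": "clifty creek"},
    },
    {
        "record_id_ferc1": "f1_steam_2010_12_306_0_1",
        "fixes": {"plant_name_ferc1": "harrison county"},
    },
]

# Flatten the fixes into a record ID -> plant name lookup.
name_by_record = {
    fix["record_id_ferc1"]: fix["fixes"]["plant_name_ferc1"] for fix in spot_fix
}

# Toy steam table: the third row is made up and should pass through untouched.
steam = pd.DataFrame(
    {
        "record_id_ferc1": [
            "f1_steam_1999_12_72_0_1",
            "f1_steam_2010_12_306_0_1",
            "made_up_record_id",
        ],
        "plant_name_ferc1": ["", "0", "already named"],
    }
)

# map() yields NaN for records without a fix; fall back to the existing name.
steam["plant_name_ferc1"] = (
    steam["record_id_ferc1"].map(name_by_record).fillna(steam["plant_name_ferc1"])
)
```

This shortcut only works while each fix targets the same single column; fixes that span several columns would need a per-record loop instead.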

a hydro specific application

{
    "spot_fix": [
            {
                "record_id_ferc1": "'hydroelectric_generating_plant_statistics_large_plants_406_2021_C000617_2583, 0'",
                "fixes": {"plant_name_ferc1": "station 5"},
            },
        ]
}
@cmgosnell cmgosnell added ferc1 Anything having to do with FERC Form 1 data-repair Interpolating or extrapolating data that we don't actually have. rmi xbrl Related to the FERC XBRL transition labels Oct 12, 2022
@e-belfer e-belfer self-assigned this Jan 26, 2023
@e-belfer e-belfer linked a pull request Feb 1, 2023 that will close this issue
8 tasks
@jdangerx jdangerx moved this to 👀 In review in Catalyst Megaproject Feb 7, 2023
@jdangerx jdangerx moved this from 👀 In review to 🏗 In progress in Catalyst Megaproject Feb 7, 2023
@e-belfer e-belfer moved this from 🚧 In progress to 👀 In review in Catalyst Megaproject Feb 16, 2023
@e-belfer e-belfer moved this from 👀 In review to ✅ Done in Catalyst Megaproject Feb 20, 2023
@e-belfer
Member

Closed by PR #2254.
