target_sda outputs NAs when supplied with scenario containing sectors not in data or ald #390

Closed
jdhoffa opened this issue Apr 5, 2022 · 1 comment · Fixed by #403
Labels
bug an unexpected problem or unintended behavior

jdhoffa commented Apr 5, 2022

In the function target_sda(), if the input co2_intensity_scenario contains sectors (e.g. "cement" and "steel") that the other input datasets, data and ald, do not contain (e.g. they only contain "cement"), target_sda() outputs NAs for the missing sector.

I would expect the function to output no rows at all for this sector.

suppressPackageStartupMessages(library(dplyr))
library(r2dii.data)
library(r2dii.analysis)

matched <- tibble::tribble(
  ~id_loan, ~loan_size_outstanding, ~loan_size_outstanding_currency, ~loan_size_credit_limit, ~loan_size_credit_limit_currency, ~id_2dii,            ~level, ~score,  ~sector,         ~name_ald, ~sector_ald,
  "L162",                      1,                           "EUR",                       2,                            "EUR",    "UP1", "ultimate_parent",      1, "cement", "american cement",    "cement"
)

ald <- tibble::tribble(
  ~company_id,     ~name_company, ~lei,  ~sector,           ~technology,  ~production_unit, ~year, ~production, ~emission_factor, ~country_of_domicile, ~plant_location, ~is_ultimate_owner, ~ald_timestamp,               ~emission_factor_unit,
  "18", "american cement",   NA, "cement", "integrated facility", "tonnes per year", 2020L, 6196233.305,      0.683651289,                 "GE",            "SM",              FALSE,       "1994Q1", "tonnes of CO2 per tonne of cement"
)

scenario <- tibble::tribble(
  ~scenario_source, ~scenario,  ~sector,  ~region, ~year,               ~emission_factor_unit, ~emission_factor,
  "demo_2020",    "demo", "cement", "global",  2020, "tonnes of CO2 per tonne of Cement",              0.7,
  "demo_2020",    "demo", "cement", "global",  2021, "tonnes of CO2 per tonne of Cement",              0.4,
  "demo_2020",    "demo",  "steel", "global",  2020,  "tonnes of CO2 per tonne of Steel",                2,
  "demo_2020",    "demo",  "steel", "global",  2021,  "tonnes of CO2 per tonne of Steel",              1.5
)

out <- matched %>% 
  target_sda(
    ald_demo,
    scenario
  )
#> Warning: Removing ald rows where `emission_factor` is NA

filter(out, emission_factor_metric == "target_demo")
#> # A tibble: 4 × 4
#>   sector  year emission_factor_metric emission_factor_value
#>   <chr>  <dbl> <chr>                                  <dbl>
#> 1 cement  2020 target_demo                            0.684
#> 2 steel   2020 target_demo                           NA    
#> 3 cement  2021 target_demo                            0.378
#> 4 steel   2021 target_demo                           NA

Created on 2022-04-05 by the reprex package (v2.0.1)
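
For comparison, here is a minimal sketch of the expected result. It does not call target_sda(); `out` is a hand-built stand-in whose values are copied from the printed output above, and `expected` is a hypothetical name. Dropping the NA rows after the fact is only an illustration of the desired output, not the fix itself.

```r
library(dplyr)

# Stand-in for the `out` printed above (values copied from the reprex output).
out <- tibble::tribble(
  ~sector,  ~year, ~emission_factor_metric, ~emission_factor_value,
  "cement",  2020,           "target_demo",                  0.684,
  "steel",   2020,           "target_demo",               NA_real_,
  "cement",  2021,           "target_demo",                  0.378,
  "steel",   2021,           "target_demo",               NA_real_
)

# Expected behaviour: no rows for the sector that has no matched data.
expected <- out %>%
  filter(!is.na(emission_factor_value))

expected
```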

@jdhoffa jdhoffa added the bug an unexpected problem or unintended behavior label Apr 5, 2022
@jdhoffa jdhoffa self-assigned this Apr 5, 2022
jdhoffa commented Apr 25, 2022

The reprex above is not that useful. I have isolated the issue more precisely.
It occurs when both ald and scenario contain multiple sectors, but data only matches a company within one of those sectors:

suppressPackageStartupMessages(library(dplyr))
library(r2dii.data)
library(r2dii.analysis)

matched <- tibble::tribble(
  ~id_loan, ~loan_size_outstanding, ~loan_size_outstanding_currency, ~loan_size_credit_limit, ~loan_size_credit_limit_currency, ~id_2dii,            ~level, ~score,      ~sector,      ~name_ald, ~sector_ald,
  "L162",                      1,                           "EUR",                       2,                            "EUR",    "UP1", "ultimate_parent",      1, "automotive", "shaanxi auto",    "cement"
)

ald <- tibble::tribble(
  ~name_company,  ~sector, ~technology, ~year, ~production, ~emission_factor, ~plant_location, ~is_ultimate_owner,
  "shaanxi auto", "cement",       "ice",  2025,           1,                1,            "BF",               TRUE,
  "shaanxi auto",  "steel",       "ice",  2025,           1,                1,            "BF",               TRUE
)

scenario <- tibble::tribble(
  ~scenario,  ~sector,  ~region, ~year, ~emission_factor,           ~emission_factor_unit, ~scenario_source,
  "b2ds", "cement", "global",  2025,                1, "tons of CO2 per ton of cement",      "demo_2020",
  "b2ds", "cement", "global",  2026,                2, "tons of CO2 per ton of cement",      "demo_2020",
  "b2ds",  "steel", "global",  2025,                1, "tons of CO2 per ton of cement",      "demo_2020",
  "b2ds",  "steel", "global",  2026,                2, "tons of CO2 per ton of cement",      "demo_2020"
)

out <- target_sda(
  matched,
  ald,
  scenario,
  region_isos = region_isos_demo
)

out %>% filter(emission_factor_metric == "target_b2ds")
#> # A tibble: 4 × 6
#>   sector  year region scenario_source emission_factor_metric emission_factor_va…
#>   <chr>  <dbl> <chr>  <chr>           <chr>                                <dbl>
#> 1 cement  2025 global demo_2020       target_b2ds                              1
#> 2 steel   2025 global demo_2020       target_b2ds                             NA
#> 3 cement  2026 global demo_2020       target_b2ds                              2
#> 4 steel   2026 global demo_2020       target_b2ds                             NA

Created on 2022-04-25 by the reprex package (v2.0.1)
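
A possible workaround until this is fixed, sketched under the assumption that the sectors of interest are exactly those in the sector_ald column of the matched data: restrict scenario to those sectors before calling target_sda(). The tibbles below are trimmed-down stand-ins for the reprex inputs, and `scenario_matched` is a hypothetical name. This is not necessarily how the fix in #403 works.

```r
library(dplyr)

# Trimmed-down stand-ins for the reprex inputs above.
matched <- tibble::tibble(
  name_ald   = "shaanxi auto",
  sector_ald = "cement"
)

scenario <- tibble::tribble(
  ~scenario,  ~sector, ~year, ~emission_factor,
     "b2ds", "cement",  2025,                1,
     "b2ds", "cement",  2026,                2,
     "b2ds",  "steel",  2025,                1,
     "b2ds",  "steel",  2026,                2
)

# Keep only scenario rows whose sector appears in the matched data.
scenario_matched <- scenario %>%
  filter(sector %in% unique(matched$sector_ald))

scenario_matched
```

Passing the filtered scenario in place of the full one should mean target_sda() never sees the "steel" rows, so it has no missing sector to report as NA.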
