-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
technology_share
for target_*
is now properly calculated and weighted
#294
technology_share
for target_*
is now properly calculated and weighted
#294
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I have a few questions you may consider and merge.
R/summarize_weighted_production.R
Outdated
add_technology_share_target <- function(data) { | ||
crucial <- c("production_target", "sector_ald", "year", "technology") | ||
|
||
check_crucial_names(data, crucial) | ||
walk_(crucial, ~ check_no_value_is_missing(data, .x)) | ||
|
||
data %>% | ||
group_by(.data$sector_ald, .data$year, .data$scenario, .data$name_ald) %>% | ||
mutate(technology_share_target = .data$production_target / sum(.data$production_target)) %>% | ||
group_by(!!!dplyr::groups(data)) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this more or less duplicates add_technology_share
. Good candidate for refactoring
if (weight_production) { | ||
data <- summarize_weighted_production( | ||
data, | ||
!!!rlang::syms(summary_groups), | ||
use_credit_limit = use_credit_limit | ||
) | ||
} else { | ||
data <- summarize_unweighted_production( | ||
data, | ||
!!!rlang::syms(summary_groups) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This now happens AFTER adding the targets. (See line 231). I also had to expand the summarize_weighted_production
function, as it now needs to implement something different.
A lot of WET code here unfortunately.
R/target_market_share.R
Outdated
data <- data %>% | ||
ungroup() %>% | ||
add_loan_weight(use_credit_limit = use_credit_limit) %>% | ||
add_technology_share() %>% | ||
add_technology_share_target() %>% | ||
calculate_weighted_loan_metric("production") %>% | ||
calculate_weighted_loan_metric("technology_share") %>% | ||
calculate_weighted_loan_metric("production_target") %>% | ||
calculate_weighted_loan_metric("technology_share_target") %>% | ||
group_by( | ||
.data$sector_ald, | ||
.data$technology, | ||
.data$year, | ||
!!!rlang::syms(summary_groups) | ||
) %>% | ||
summarize( | ||
weighted_production = sum(.data$weighted_loan_production), | ||
weighted_technology_share = sum(.data$weighted_loan_technology_share), | ||
weighted_production_target = sum(.data$weighted_loan_production_target), | ||
weighted_technology_share_target = sum(.data$weighted_loan_technology_share_target) | ||
) %>% | ||
# Restore old groups | ||
group_by(!!!dplyr::groups(data)) | ||
} else { | ||
data <- data %>% | ||
select(-c( | ||
.data$id_loan, | ||
.data$loan_size_credit_limit, | ||
.data$loan_size_outstanding | ||
)) %>% | ||
distinct() %>% | ||
group_by(.data$sector_ald, .data$technology, .data$year, !!!rlang::syms(summary_groups)) %>% | ||
# FIXME: Confusing: `weighted_production` holds unweighted_production? | ||
summarize( | ||
weighted_production = .data$production, | ||
weighted_production_target = .data$production_target, | ||
.groups = "keep" | ||
) %>% | ||
ungroup(.data$technology) %>% | ||
mutate( | ||
weighted_technology_share = .data$weighted_production / sum(.data$weighted_production), | ||
weighted_technology_share_target = .data$weighted_production_target / sum(.data$weighted_production_target) | ||
) %>% | ||
group_by(!!!dplyr::groups(data)) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These dplyr chains are essentially summarize_weighted_production
and summarize_unweighted_production
, but expanded. The different is that they now need to summarize both the technology_share
and technology_share_target
. Unfortunately, I couldn't change summarize_weighted_production
itself that much, as it is an exported function.
One possibility is to add an argument to summarize_weighted_production
that allows you to include targets or not. @maurolepore happy to explore options.
technology = c("electric", "ice", "electric", "ice", "electric", "ice", "electric", "ice"), | ||
year = c(2020, 2020, 2021, 2021, 2020, 2020, 2021, 2021), | ||
tmsr = c(1, 1, 1.85, 0.6, 1, 1, 1.85, 0.6), | ||
smsp = c(0, 0, 0.34, -0.2, 0, 0, 0.34, -0.2) | ||
technology = c("electric", "ice", "electric", "ice"), | ||
year = c(2020, 2020, 2021, 2021), | ||
tmsr = c(1, 1, 1.85, 0.6), | ||
smsp = c(0, 0, 0.34, -0.2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a poorly written test, since the scenario was effectively duplicated.
That said, perhaps it's good to write a test to see what happens if the scenario inputs duplicated scenarios?
I think the function should pass an error if this is the case, as I wouldn't know how to handle that otherwise.
Thanks @jdhoffa for your comments. I see a bunch of things we should follow up on and I worry that left only here they will fall through the cracks. I suggest we do either of these:
The order reflects my preference. You still have the context in your head and getting it done now seems the cheapest. This issue has already waited for a long time, and the number of impacted users I believe was low. |
Happy to do them now, except unfortunately I am in calls all afternoon, and tomorrow is a holiday, so I will have to get to it on Friday. |
R/summarize_weighted_production.R
Outdated
summarize_weighted_production_(data, ..., use_credit_limit = use_credit_limit, add_targets = FALSE) | ||
} | ||
|
||
summarize_weighted_production_ <- function(data, ..., use_credit_limit = FALSE, add_targets = FALSE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added this new internal function summarize_weighted_production_
which contains the argument add_targets
. I then call summarize_weighted_production
with default values for that internal function (this is to preserve the existing arguments of summarize_weighted_production
).
@maurolepore let me know what you think of this solution as a way to avoid WET code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
R/summarize_weighted_production.R
Outdated
# FIXME: Confusing: `weighted_production` holds unweighted_production? | ||
summarize(weighted_production = .data$production, .groups = "keep") %>% | ||
ungroup(.data$technology, .data$tmsr, .data$smsp) %>% | ||
mutate(weighted_technology_share = .data$weighted_production / sum(.data$weighted_production)) %>% | ||
group_by(!!!dplyr::groups(data)) | ||
|
||
if (add_targets) { | ||
data %>% | ||
summarize( | ||
weighted_production = .data$production, | ||
weighted_production_target = .data$production_target, | ||
.groups = "keep" | ||
) %>% | ||
ungroup(.data$technology) %>% | ||
mutate( | ||
weighted_technology_share = .data$weighted_production / sum(.data$weighted_production), | ||
weighted_technology_share_target = .data$weighted_production_target / sum(.data$weighted_production_target) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The #FIXME comment is now misplaced. Maybe we can remove it and instead write a comment explaining why we use the "weighted_" prefix in this function? Or change the prefix for a temporary one?
More accurate name, since this flag doesn't actually add anything just checks to see if targets are already there or not.
Exploring the failing CI checks right now. It seems they are on older R releases only? |
@@ -300,7 +300,7 @@ target_market_share <- function(data, | |||
ald_with_benchmark <- calculate_ald_benchmark(ald, region_isos, by_company) | |||
|
|||
data %>% | |||
rbind(ald_with_benchmark) %>% | |||
dplyr::bind_rows(ald_with_benchmark) %>% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well hot diggity!
The
target_*
values oftechnology_share
should be calculated as follows (in line with the methodology):1 - Calculate the
production_target
for each company (without weighting)2 - Calculate the
technology_share_target
for each company (without weighting)3 - Apply the portfolio_weight to each of
production
,technology_share
,production_target
andtechnology_share_target
This has been achieved.
@maurolepore this is an MVP solution, as this has been an open bug for some time. Please review (and note that I recognize that some of my solutions are not so elegant).
Also @georgeharris2deg if you could please have a look at this it would be much appreciated.
I would rather we get the solution out the door, and we can refactor later.
Closes #277