Weights are treated as targets when using area budget #323

Open
edwardsmarc opened this issue May 31, 2023 · 15 comments
Labels
bug Something isn't working Weights

Comments

@edwardsmarc
Contributor

Problem

When using an area budget, if weights are selected they get added as additional targets. The weight layer is rescaled between 0-100 and the target is calculated as the sum of the weight values * the weight proportion from the UI slider. This is repeated for each selected positive weight (negative weights are dealt with using linear_constraints).
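As a rough illustration of the target calculation described above, here is a minimal sketch in base R. The min-max rescaling, the toy values, and the names `rescale_0_100`, `weight_raw`, and `weight_factor` are assumptions for illustration, not the actual wheretowork code.

```r
# Hypothetical weight layer: one value per planning unit (PU).
weight_raw <- c(0, 2, 5, 10, 3)

# Rescale to 0-100 (min-max rescaling is an assumption here).
rescale_0_100 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (diff(rng) == 0) return(rep(0, length(x)))
  (x - rng[1]) / diff(rng) * 100
}
weight_scaled <- rescale_0_100(weight_raw)

# Slider proportion from the UI (hypothetical value).
weight_factor <- 0.5

# Absolute target handed to the solver for this weight:
# the sum of the rescaled values multiplied by the slider proportion.
weight_target <- sum(weight_scaled) * weight_factor
```

With these toy values the rescaled layer sums to 200, so a slider value of 0.5 asks the solver to capture 100 units of the weight, exactly as it would for a Theme.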

The solver is then treating Themes and Weights as if they are equal during the optimization. This results in solutions where PUs are selected to meet the Weight target even when the Theme target is still not met.

Example

Here's a very simple example where a binary species target is set to 100% (yellow square) and the KBA layer (green areas) is set as a Weight with value 100:
image

When running with no area budget, the result is as expected. The full species range is selected in the solution, weights are ignored because they don't overlap the species range:
image

When running with an area budget of 1%, the expected result would be that the entire 1% budget should overlap the species range. But as shown below, it gets divided between the species and the weight (note that the striations inside the species range are expected because WTW has no way to prioritize PUs within that area):
image

Solution

It seems like penalties (e.g. add_linear_penalties()) would be preferable, because they add a secondary objective to the prioritization instead of extra targets. But maybe this would make the problem too large and slow down processing?

edwardsmarc added the bug and help wanted labels May 31, 2023
@ricschuster
Contributor

@edwardsmarc I really like the way you put together these problem statements!

@jeffreyhanson could you please have a look at this and share your thoughts here?

@jeffreyhanson
Contributor

Hi,

Thank you very much for raising this point @edwardsmarc, and for providing such a detailed deep dive on the behavior!

Yeah, you're exactly right: treating the weights as targets causes the solution to select areas that wouldn't normally be selected to improve representation of the themes/features. Reformulating the weights using the linear penalties approach could be one way to achieve this. The trick is that linear penalties need a scaling factor (the penalty argument) that says how important the primary objective (i.e., minimizing target shortfalls) is compared to the penalty (e.g., minimizing the sum of the inverse weight values?). This is really difficult to calibrate (i.e., to find a value that provides a good compromise) because the value can be anywhere between 0 and infinity (though it tends to be a really small number like 1e-12). The reason I originally used a target-based approach for the weights was that it let us use percentage-based targets, which makes calibrating the trade-off values a lot easier.
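To make the calibration problem concrete, here is a toy sketch in base R. It does not call prioritizr; it only assumes a combined objective of the form shortfall + penalty * penalty_sum, and all numbers and names (`candidates`, `best_candidate`) are hypothetical.

```r
# Toy comparison of two candidate solutions (all numbers are hypothetical).
# shortfall:   total target shortfall (primary objective, lower is better)
# penalty_sum: sum of the penalty data over selected PUs (lower is better)
candidates <- data.frame(
  name = c("A", "B"),
  shortfall = c(0.10, 0.30),
  penalty_sum = c(5e10, 1e10),
  stringsAsFactors = FALSE
)

# Combined objective: shortfall + penalty * penalty_sum.
best_candidate <- function(penalty) {
  score <- candidates$shortfall + penalty * candidates$penalty_sum
  candidates$name[which.min(score)]
}

best_candidate(1e-12)  # penalty term is small, so A wins on shortfall
best_candidate(1e-10)  # penalty term dominates, so B wins
```

The preferred solution flips somewhere between two very small penalty values, and where it flips depends on the units of the penalty data. A percentage-based target avoids having to find that value.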

How does that sound?

@edwardsmarc
Contributor Author

Thanks for the explanation @jeffreyhanson!

That decision makes sense and I agree that the penalty argument would be very challenging to implement. I think in most of our prioritizations there are going to be a large number of themes and a small number of weights which will dilute this issue.

One concern I still have is that weights are treated differently when using an area budget compared to the min-set problem. I think this will be quite confusing to users. With no area budget, the solution will focus on targets but gravitate towards different PUs based on their cost (which is calculated from the weights), but with an area budget the weights become targets and the solution behavior is different.

I don't really have a solution to this but maybe it's fine to just document it and communicate the behavior to users.

@jeffreyhanson
Contributor

Yeah, sorry, I can't think of an easy fix to this issue. I suppose it might be possible to use the weights to update the feature data (i.e. the rij data) somehow, so that greater emphasis is placed on planning units with a greater weight value? But that might cause issues with the goal setting.

@edwardsmarc
Contributor Author

One more Q for you @jeffreyhanson.

What would happen if we used the min_set approach with the weights layer as the cost (i.e. the no area budget method), but then added a linear constraint for the area budget (by providing the PU areas to add_linear_constraints())? How does the min_set_objective manage the target shortfall when it can't meet all targets?

@jeffreyhanson
Contributor

Yeah, the min set approach treats the targets as "hard" constraints, meaning they must be met by the solution. So if you formulated a problem where it was impossible to meet the targets, the solver would produce an error saying that the problem is infeasible (i.e., there's no valid solution).
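A small brute-force sketch in base R shows why hard targets plus an area budget can become infeasible. It enumerates every selection of a toy 4-PU problem and does not call prioritizr; the values and the helper `is_feasible` are hypothetical.

```r
# Toy problem (hypothetical values): 4 PUs with one theme.
theme_amount <- c(4, 3, 2, 1)  # amount of the theme in each PU
pu_area      <- c(1, 1, 1, 1)  # area of each PU
theme_target <- 8              # absolute target, treated as a hard constraint

# Enumerate every possible selection of PUs (2^4 = 16 combinations).
selections <- as.matrix(expand.grid(rep(list(c(0, 1)), length(pu_area))))

# A selection is feasible if it meets the target and stays within the budget.
is_feasible <- function(area_budget) {
  meets_target  <- selections %*% theme_amount >= theme_target
  within_budget <- selections %*% pu_area <= area_budget
  any(meets_target & within_budget)
}

is_feasible(area_budget = 3)  # TRUE: PUs 1 to 3 hold 9, which is >= 8
is_feasible(area_budget = 2)  # FALSE: the best two PUs hold only 7
```

In the second case there is no selection the solver could return, which is the situation where min set would report an infeasible problem instead of a partial solution.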

@edwardsmarc
Contributor Author

We could consider clipping the Weights to the set of planning units with non-zero Theme values. That would prevent any planning units being selected that do not contribute to at least one Theme, and it would remove the confusing behavior in my example at the top of this issue.

Under this proposal, Weights would only influence the final solution if they overlap with Themes, which is closer to the min set behavior.

Realistically, with large problems there are unlikely to be many PUs without any Theme values. But at least it will produce more logical solutions when using fewer Themes.

@edwardsmarc
Contributor Author

edwardsmarc commented Jun 12, 2023

Solution here would be to change the following code in fct_min_set and fct_min_shortfall from:

initial_pu_idx <- which(
  Matrix::colSums(theme_data) > 0 |
    Matrix::colSums(weight_data) > 0 |
    Matrix::colSums(include_data) > 0 |
    Matrix::colSums(exclude_data) > 0
)

to

initial_pu_idx <- which(
  Matrix::colSums(theme_data) > 0 |
    Matrix::colSums(include_data) > 0 |
    Matrix::colSums(exclude_data) > 0
)

so only PUs with themes, includes or excludes are considered in the solution.

Would need to check this does not have any negative impact on the second prioritizr call when clustering is requested.

edwardsmarc added the Weights label and removed the help wanted label Jul 10, 2023
@edwardsmarc
Contributor Author

This is more complicated than I suggested with the above fix. Currently initial_pu_idx is set to all PUs with data, regardless of whether the themes are selected and have goals > 0. So my above fix fails because, while it limits initial_pu_idx to the PUs with Themes, it includes all Themes and not just the Themes with Goals. So the solution can still include PUs that are not contributing to any Goals.

Fix needs to limit initial_pu_idx to only include the PUs that have values > 0 and a Goal > 0.
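The goal-aware filter described above could look roughly like this. It is a sketch on toy data, assuming layers are stored as rows and PUs as columns (as the colSums() calls imply); `theme_goals` and `active_theme_data` are hypothetical names, not the variables used in the branch.

```r
library(Matrix)

# Hypothetical data: rows are layers, columns are 6 PUs.
theme_data <- Matrix(rbind(
  theme_a = c(1, 1, 0, 0, 0, 0),
  theme_b = c(0, 0, 1, 0, 0, 0)
), sparse = TRUE)
include_data <- Matrix(rbind(include_1 = c(0, 0, 0, 0, 1, 0)), sparse = TRUE)
exclude_data <- Matrix(rbind(exclude_1 = c(0, 0, 0, 0, 0, 0)), sparse = TRUE)

# Hypothetical goals: theme_b has no goal, so it is inactive.
theme_goals <- c(theme_a = 0.5, theme_b = 0)

# Keep only the themes with a Goal > 0 before looking for PUs with data.
active_theme_data <- theme_data[as.vector(theme_goals > 0), , drop = FALSE]

# PUs with an active Theme value, an Include, or an Exclude.
initial_pu_idx <- which(
  Matrix::colSums(active_theme_data) > 0 |
    Matrix::colSums(include_data) > 0 |
    Matrix::colSums(exclude_data) > 0
)
```

Here PU 3 is dropped because its only Theme has no Goal, and PUs 4 and 6 are dropped because they carry no Theme, Include, or Exclude data at all.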

@edwardsmarc
Contributor Author

Solution also needs to adjust Weight targets so the sum of the weights is calculated for the initial_pu_idx PUs, not for the full Weight layer as it currently does.

Otherwise the Goal for the Weight will be greater than the amount possible in the solution (which is limited to the initial_pu_idx PUs).
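A toy calculation in base R illustrates the mismatch. The weight values, the filtered index, and the slider value are all hypothetical.

```r
# Hypothetical rescaled weight values (0-100) for 6 PUs.
weight_values <- c(50, 20, 100, 80, 10, 40)

# PUs passed to the solver (hypothetical result of the filtering step).
initial_pu_idx <- c(1, 2, 5)

# Slider proportion from the UI (hypothetical value).
weight_factor <- 0.5

# Maximum amount of the weight the solver can possibly select.
max_available <- sum(weight_values[initial_pu_idx])

# Target computed from the full layer: exceeds what is available.
target_full <- sum(weight_values) * weight_factor

# Target computed from the filtered PUs only: always attainable.
target_filtered <- sum(weight_values[initial_pu_idx]) * weight_factor
```

With these numbers the full-layer target is 150 but only 80 units are available to the solver, so the Weight would always carry a shortfall. The filtered target of 40 stays within reach.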

@edwardsmarc
Contributor Author

edwardsmarc commented Aug 4, 2023

Fix has been implemented in https://github.com/NCC-CNC/wheretowork/tree/issue-%23323-Weights and anecdotally works. For a simple problem with one theme and one weight, the min_set and min_shortfall solutions are now exactly the same.

  • Solutions using area budgets are now limited to the PUs that have active Theme Goals and positive values, as well as any Includes or Excludes. This prevents solutions containing PUs that only have Weight values.

  • The Goals for Weights (remember that Weights are treated as additional features with Goals when using an area budget) are now set using the filtered PUs, not the full set of Weight values. So if only half the PUs are being passed to prioritizr, the Weight Goal is calculated as the sum of Weight values for the filtered PUs multiplied by the user provided weight factor.

More testing is needed though and we'll want to add some unit tests.

Users have expressed concern about using Weights as Goals when area budget is applied. I'd like to do some comparisons of how different Weight factors influence a solution when using no area-budget vs. area budget. Any findings should be added to #342.

@edwardsmarc
Contributor Author

After discussing this Weights implementation with some users, they expressed that it would be useful if the targets for the Weights were given lower priority than the Themes.

I.e., can the Theme goals be given a higher priority in the solution than the Weight goals, such that the min_shortfall calculation will attempt to meet the Theme goals first and then address the Weight goals?

@jeffreyhanson I'm wondering if this can be achieved with add_feature_weights()? How are the weights applied in add_feature_weights()? (I traced the code into the Rcpp function but couldn't find a mathematical explanation for how the weights affect the objective function.)

@jeffreyhanson
Contributor

Yeah that's exactly correct - the add_feature_weights() function could be used to achieve this. If using the min shortfall objective, the objective function is to minimize the target shortfalls expressed as a proportion and, when using feature weights, these proportionate-based target shortfalls are multiplied by the weight values. This means that larger shortfalls for features with higher weights are considered much worse. Rather than strictly hard-coding the weights to have lower priority than themes, I wonder if it's worth thinking about some way/method such that users can input the relative importance of weights compared to themes?
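The weighted objective described above can be sketched with a little arithmetic in base R. This only mirrors the description in this comment (proportional shortfalls multiplied by feature weights); it does not call prioritizr, and the targets, held amounts, and weight values are hypothetical.

```r
# Hypothetical features: one Theme and one Weight treated as a feature.
targets <- c(theme = 100, weight = 200)  # absolute targets
held    <- c(theme = 60,  weight = 150)  # amount held by a candidate solution

# Shortfall as a proportion of each target (0 when the target is met).
prop_shortfall <- pmax(targets - held, 0) / targets

# Feature weights: Themes matter more than Weights (values are assumptions).
feature_weights <- c(theme = 1, weight = 0.01)

# Weighted objective value that the solver would try to minimize.
objective <- sum(feature_weights * prop_shortfall)
```

The Theme shortfall of 0.4 contributes 0.4 to the objective, while the Weight shortfall of 0.25 contributes only 0.0025, so closing the Theme gap is worth far more to the solver than closing the Weight gap.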

One approach could be to try updating the app so users could specify weight values for each Theme and Weight? This might be a bit complicated to implement because you'd probably need to add new sliders for each and every Theme/Weight, and possibly update the parameter file format to store this information.

Another approach could be to add a new parameter to the app that expresses the relative weight value for Themes and Weights? E.g., this parameter ranges between 0 and 1. If 0, then all Weights have a 0.01 weight value and all Themes have a 1 weight value; if 1, then all Weights have a 1 weight value and all Themes have a 0.01 weight value; and if 0.5 (the default), then all Weights and Themes have a 0.5 weight value. Note I suggest a weight of 0.01 so that the lower-priority Weights/Themes still have some tiny influence, which is useful in case the budget is so high that the targets for the higher weighted Themes/Weights can all be met.
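One possible mapping from such a slider to feature weights is sketched below in base R. The three anchor points come from the suggestion above, taking the 0.01 floor from the note at both ends; the linear interpolation between them and the function name `relative_importance` are assumptions.

```r
# Map a slider value in [0, 1] to feature weights for Themes and Weights.
# Linear interpolation between the three anchor points is an assumption.
relative_importance <- function(s) {
  stopifnot(is.numeric(s), length(s) == 1, s >= 0, s <= 1)
  anchors <- c(0, 0.5, 1)
  c(
    theme  = approx(anchors, c(1, 0.5, 0.01), xout = s)$y,
    weight = approx(anchors, c(0.01, 0.5, 1), xout = s)$y
  )
}

relative_importance(0)    # theme = 1,    weight = 0.01
relative_importance(0.5)  # theme = 0.5,  weight = 0.5
relative_importance(1)    # theme = 0.01, weight = 1
```

The two values could then be repeated for every Theme and every Weight to build the vector passed to add_feature_weights().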

Or maybe there's a better way to implement this? What do you think?

@edwardsmarc
Contributor Author

Thanks @jeffreyhanson for the confirmation and explanation of how the weights are applied!

I really like the idea of providing a slider for the relative weight values. I could see it being useful in 2 cases:

  1. Where users want the weights to be a secondary objective that only starts influencing the solution once their primary theme goals have been addressed.
  2. In the flip case, where users want to start building their solution from the most valuable weight PUs. This could be especially useful at low area budgets.

The obvious concern here is that adding more complexity will confuse users.

I think the best approach is to gather further feedback from users once the fix on my branch is implemented. That fix (masking out PUs that don't have themes, includes or excludes) might be enough to get reasonable Weights behavior. If not, or if users want more control in the future, then we have add_feature_weights() in the back pocket as a future change.

@jeffreyhanson
Contributor

No worries! Yeah that all sounds great. I definitely agree about avoiding unneeded complexity and getting feedback from users.
