Weights - should area be represented in cost under min set #329

Open
edwardsmarc opened this issue Jul 10, 2023 · 5 comments
Labels
enhancement New feature or request Weights

Comments

@edwardsmarc
Contributor

The current behavior in WTW when no area budget is used (i.e. the min set objective) is as follows:

  • Each weight dataset (with a weighting factor set) is rescaled from 1-0 (for positive weighting factors) or 0-1 (for negative weighting factors)
  • Each rescaled weight dataset is then multiplied by its (absolute) weighting factor (between 0-100)
  • All weights are summed to get one cost value per PU
  • Costs are rescaled to the range 0 to 1000
  • If no weights are active, all costs are set to 1
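The steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the WTW source code: the names `rescale` and `min_set_costs` are hypothetical, "1-0" is assumed to mean that the smallest value maps to 1 and the largest to 0, and the handling of a constant layer (all values equal) is an assumption.

```python
def rescale(values, lo=0.0, hi=1.0):
    """Linearly rescale values so the minimum maps to lo and the maximum to hi."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # Assumption: a constant layer carries no information, so map it to lo.
        return [lo for _ in values]
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]


def min_set_costs(weights, factors, n_units):
    """Combine weight layers into one cost per planning unit (PU).

    weights: dict mapping layer name -> list of per-PU values
    factors: dict mapping layer name -> weighting factor in [-100, 100]
    n_units: number of PUs
    """
    active = {name: f for name, f in factors.items() if f != 0}
    if not active:
        # No active weights: every PU gets a constant cost of 1.
        return [1.0] * n_units

    total = [0.0] * n_units
    for name, factor in active.items():
        if factor > 0:
            scaled = rescale(weights[name], 1.0, 0.0)  # "1-0" for positive factors
        else:
            scaled = rescale(weights[name], 0.0, 1.0)  # "0-1" for negative factors
        for i, s in enumerate(scaled):
            total[i] += s * abs(factor)

    # Final rescale of the summed values to the range 0 to 1000.
    return rescale(total, 0.0, 1000.0)


costs = min_set_costs({"carbon": [0.0, 5.0, 10.0]}, {"carbon": 50}, 3)
print(costs)  # [1000.0, 500.0, 0.0]
```

Note that PU area does not appear anywhere in this calculation, which is the point raised in the question that follows.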

This works fine when PUs are raster cells that all have the same area.

If PUs are polygons with different areas, is it appropriate that cost is constant? Or should cost in this case default to be the area of the planning unit? Using area would match the area-budget behavior (under min_shortfall where PU area is used to satisfy the budget).

If we switch to defaulting to area, we would also need to incorporate area into the cost values when Weights are present. Maybe area becomes an additional Weight layer that gets summed into the final Weight values.

@edwardsmarc edwardsmarc added enhancement New feature or request Weights labels Jul 10, 2023
@jeffreyhanson
Contributor

jeffreyhanson commented Jul 10, 2023

Hi,

I guess it depends on how the vector (polygon) data are prepared. If the person preparing the vector data calculated the weight values by overlaying the polygons with a raster and calculating the sum of the overlapping values, then I don't think it would be correct to try and account for area in the WTW tool (e.g., multiply weight/cost values by area) because the sum calculations would already take care of this. However, if the person preparing the vector data calculated the weight values by calculating the mean of the overlapping values, then I agree it would be important to try and take into account the area of the polygons in the WTW tool (e.g., multiplying the weight/cost values by area).
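A small made-up example of the sum versus mean distinction, assuming two polygons that overlap raster cells with the same per-cell value but cover different numbers of cells. The `summarize` helper and the numbers are hypothetical and only meant to show which summary already reflects area.

```python
def summarize(cell_values):
    """Return (sum, mean) of the raster cell values overlapping one polygon."""
    total = sum(cell_values)
    return total, total / len(cell_values)


small_pu = [2.0] * 4    # polygon overlapping 4 cells
large_pu = [2.0] * 16   # polygon overlapping 16 cells, same per-cell value

small_sum, small_mean = summarize(small_pu)
large_sum, large_mean = summarize(large_pu)

# The sum grows with the number of overlapping cells (a proxy for area).
print(small_sum, large_sum)    # 8.0 32.0
# The mean is identical, so area would have to be reintroduced separately.
print(small_mean, large_mean)  # 2.0 2.0
# Multiplying the mean by the cell count recovers the sum.
print(large_mean * len(large_pu) == large_sum)  # True
```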

So, I would recommend that people prepare the weight data using the sum of overlapping values, so that the WTW tool doesn't risk making incorrect assumptions about how the data are generated. How does that sound?

Also, regarding the constant costs: yeah, that sounds like a good idea to scale the constant costs according to the area of each polygon. But since the weight data should already take this into account (by calculating the sum of overlapping cells), I don't think this is needed when using weights.

@edwardsmarc
Contributor Author

Thanks @jeffreyhanson, great point about the sum vs avg. I'm pretty sure we're using sums but maybe @DanWismer can check and confirm since he's done some data prep for projects using polygons.

@DanWismer
Contributor

DanWismer commented Jul 11, 2023

For all data prep, we make sure that vector datasets are dissolved before extracting the area sum that intersects the grid. A good sanity check is to make sure extracted values are not greater than the grid size (when it comes to area).
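The sanity check could look something like the sketch below, assuming the extracted area and the PU (grid cell) area are available as parallel lists in the same units. The function name `find_oversized` and the tolerance are hypothetical. An extracted area larger than the PU itself would suggest double counting, for example from overlapping polygons that were not dissolved.

```python
def find_oversized(extracted_area, pu_area, tolerance=1e-6):
    """Return indices of PUs whose extracted area exceeds the PU's own area.

    The relative tolerance allows for small floating-point differences.
    """
    return [
        i
        for i, (area, cap) in enumerate(zip(extracted_area, pu_area))
        if area > cap * (1 + tolerance)
    ]


extracted = [0.4, 1.0, 1.3]   # area of the dataset intersecting each PU
pu_sizes = [1.0, 1.0, 1.0]    # area of each PU

print(find_oversized(extracted, pu_sizes))  # [2]
```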

@edwardsmarc
Contributor Author

Thanks @DanWismer - so if I'm understanding correctly, given that weights are summed for each PU, if we have PUs of different sizes, the final summed values already take into account the area of the PUs.

Our decision is therefore to change the default cost values of PUs (when no Weights are selected) to be the PU areas, but not to change the existing behavior when Weights are selected, since the data-prep workflow already accounts for PU area.

Actions

  1. Edit default behavior here
  2. Document this in the vignette (Vignette - wtw theory #326)
  3. Document this in wtw-data-prep - specifically that Weights should account for PU area when being summarized into PUs.

@edwardsmarc
Contributor Author

Updating the required actions here.

Let's just record the behavior in the WTW theory vignette: polygon PUs are assigned a cost of 1 by default so each PU has an equal chance of being selected. This will probably result in larger PUs being preferred because they're more likely to have higher feature values.
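One rough way to see why larger PUs may be preferred under a constant cost is to compare the feature amount obtained per unit of cost. The numbers below are invented, and amount per cost is only a heuristic for what an optimizer tends to favor, not the solver's actual logic.

```python
def amount_per_cost(amounts, costs):
    """Feature amount gained per unit of cost, for each PU."""
    return [a / c for a, c in zip(amounts, costs)]


areas = [1.0, 4.0]             # a small PU and a large PU
feature_amount = [3.0, 12.0]   # same density (3 per unit area), summed per PU

# Constant cost of 1: the large PU looks four times more efficient.
constant_cost = amount_per_cost(feature_amount, [1.0, 1.0])
print(constant_cost)  # [3.0, 12.0]

# Cost equal to area: both PUs look equally efficient.
area_cost = amount_per_cost(feature_amount, areas)
print(area_cost)  # [3.0, 3.0]
```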
