Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/refactor impact calc #436

Merged
merged 136 commits into from
Jul 15, 2022
Merged

Conversation

chahank
Copy link
Member

@chahank chahank commented May 3, 2022

This is a draft pull request to refactor the impact class. It now consists of two classes: Impact and ImpactCalc. The Impact class has a proper __init__. Back-compatibility is 100% kept so far.

# since the original exposures object will be "spoiled" through assign_centroids
# (a column cent_XY is added), overwrite=True makes sure the exposures can be reused
# for ImpactCalc with another Hazard object of the same hazard type.
self.exposures.assign_centroids(self.hazard, overwrite=True)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would you change that to True? This is a bad idea as this results in a lot of extra computation time, often for nothing. A typical workflow implies that the Exposures object given to ImpactCalc already has assigned centroids. This line here is just in case that none are defined (which is a sub-optimal workflow, but needs to stay due to backwards compatibility).

Thus, this should be reversed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Then we have the following problem:

agg1 = ImpactCalc(exp, impf, haz1).impact().aai_agg
agg2 = ImpactCalc(exp, impf, haz2).impact().aai_agg
print(agg1 - agg2)

does i.g. not print the same as

agg2 = ImpactCalc(exp, impf, haz2).impact().aai_agg
agg1 = ImpactCalc(exp, impf, haz1).impact().aai_agg
print(agg1 - agg2)

Simply, because in the first example agg2 is likely wrong and in the second agg1.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point Emanuel, I remember that I stumbled upon this before!

I would be in favor of moving the whole centroids assignment procedure to the ImpactCalc object because semantically it's not a property of the Exposures object, but of the combination of an exposure with a hazard. However, I see the problem that it might be desirable to be able to reuse the same assignment with several different Hazard objects that share the same Centroids... And of course, there is the problem of backwards compatibility when people are explicitly using the Exposures.assign_centroids API.

Maybe the easiest solution is to leave the API as it is, but have a boolean keyword argument reassign_centroids=True for ImpactCalc.impact() so the user can opt out of the reassignment if that's desired?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me too actually, I'd favor moving the centroids assignments out of Exposures. Also I'm not yet ready to go for the easiest solution, as this looks like it may be crucial.

Two questions:

  • What is a typical workflow?
  • When two Hazard objects share the same Centroids, are they "aware" of it, i.e., do they share the same Centroids object or do their Centroids just have the same content and only the user knows that they are the same?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a rather complex discussion that will be very hard to follow clearly on a chat forum like here. But, a few remarks:

  • the current API and workflow is clear: the user is responsible for making sensible hazard-exposures-impact functions assignments. This includes assigning for a given hazard type the centroids to the exposures. The same goes true for assigning the correct impact functions to the exposures points. We can modify this of course, but this would be a major change in API and workflow
  • we 100% do NOT want to assign centroids for each impact computation as this is a step that needs to taken once only per exposures-hazard pair. We do not want to repeat this rather expensive operation (e.g. in the uncertainty computations).
  • two hazards are not aware that they have the same centroids. It is unclear to me how that would work, as the idea is that any user can define their hazard (and thus their centroids) from whatever source they want and bring it into CLIMADA for computation.
  • My opinion would be that actually the assign_centroids is entirely removed from the Impact.calc as it is an under the hood assignment which we would like the user to have control over (for instance, does one always want to use nearest-neighbors?). This way, the user must think and prepare consistent data.

I also agree that a quick solution is not optimal. The current solution is the one that keeps the same convention as in the Impact.calc() method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion would be to decouple the two discussion. This PR is about making the impact class more readable, more modular, better tested and more efficient. It is not about changing the inner workings of the API fundamentally as is here suggested with the change with the assign_centroids.

@emanuel-schmid emanuel-schmid merged commit 3eefe1e into develop Jul 15, 2022
@emanuel-schmid emanuel-schmid deleted the feature/refactor_impact_calc branch July 15, 2022 12:06
@emanuel-schmid emanuel-schmid mentioned this pull request Aug 4, 2022
6 tasks
@emanuel-schmid emanuel-schmid mentioned this pull request Sep 8, 2022
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants