-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/refactor impact calc #436
Conversation
turn n_exp_pnt and n_events into properties set cover and deductible to None if not set (instead of an empty array)
climada/engine/impact_calc.py
Outdated
# since the original exposures object will be "spoiled" through assign_centroids | ||
# (a column cent_XY is added), overwrite=True makes sure the exposures can be reused | ||
# for ImpactCalc with another Hazard object of the same hazard type. | ||
self.exposures.assign_centroids(self.hazard, overwrite=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would you change that to True? This is a bad idea as this results in a lot of extra computation time, often for nothing. A typical workflow implies that the Exposures object given to ImpactCalc already has assigned centroids. This line here is just in case that none are defined (which is a sub-optimal workflow, but needs to stay due to backwards compatibility).
Thus, this should be reversed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Then we have the following problem:
agg1 = ImpactCalc(exp, impf, haz1).impact().aai_agg
agg2 = ImpactCalc(exp, impf, haz2).impact().aai_agg
print(agg1 - agg2)
does i.g. not print the same as
agg2 = ImpactCalc(exp, impf, haz2).impact().aai_agg
agg1 = ImpactCalc(exp, impf, haz1).impact().aai_agg
print(agg1 - agg2)
Simply, because in the first example agg2
is likely wrong and in the second agg1
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point Emanuel, I remember that I stumbled upon this before!
I would be in favor of moving the whole centroids assignment procedure to the ImpactCalc object because semantically it's not a property of the Exposures object, but of the combination of an exposure with a hazard. However, I see the problem that it might be desirable to be able to reuse the same assignment with several different Hazard objects that share the same Centroids... And of course, there is the problem of backwards compatibility when people are explicitly using the Exposures.assign_centroids
API.
Maybe the easiest solution is to leave the API as it is, but have a boolean keyword argument reassign_centroids=True
for ImpactCalc.impact()
so the user can opt out of the reassignment if that's desired?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Me too actually, I'd favor moving the centroids assignments out of Exposures. Also I'm not yet ready to go for the easiest solution, as this looks like it may be crucial.
Two questions:
- What is a typical workflow?
- When two Hazard objects share the same Centroids, are they "aware" of it, i.e., do they share the same Centroids object or do their Centroids just have the same content and only the user knows that they are the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a rather complex discussion that will be very hard to follow clearly on a chat forum like here. But, a few remarks:
- the current API and workflow is clear: the user is responsible for making sensible hazard-exposures-impact functions assignments. This includes assigning for a given hazard type the centroids to the exposures. The same goes true for assigning the correct impact functions to the exposures points. We can modify this of course, but this would be a major change in API and workflow
- we 100% do NOT want to assign centroids for each impact computation as this is a step that needs to taken once only per exposures-hazard pair. We do not want to repeat this rather expensive operation (e.g. in the uncertainty computations).
- two hazards are not aware that they have the same centroids. It is unclear to me how that would work, as the idea is that any user can define their hazard (and thus their centroids) from whatever source they want and bring it into CLIMADA for computation.
- My opinion would be that actually the
assign_centroids
is entirely removed from theImpact.calc
as it is an under the hood assignment which we would like the user to have control over (for instance, does one always want to use nearest-neighbors?). This way, the user must think and prepare consistent data.
I also agree that a quick solution is not optimal. The current solution is the one that keeps the same convention as in the Impact.calc()
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suggestion would be to decouple the two discussion. This PR is about making the impact class more readable, more modular, better tested and more efficient. It is not about changing the inner workings of the API fundamentally as is here suggested with the change with the assign_centroids
.
The exposures index from the impact matrix generator was indexed for the reduced exposures only, instead of the total exposures. This leads to wrong indexing.
# Conflicts: # climada/engine/impact.py # climada/entity/impact_funcs/base.py
This is a draft pull request to refactor the impact class. It now consists of two classes:
Impact
andImpactCalc
. TheImpact
class has a proper__init__
. Back-compatibility is 100% kept so far.