Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/refactor impact calc #436

Merged
merged 136 commits into from
Jul 15, 2022
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
136 commits
Select commit Hold shift + click to select a range
aa90de3
Add methods to set imp vars from imp mat
Jan 18, 2022
06f9403
Cosmetics
Jan 18, 2022
41ad79b
Change set to calc
Jan 19, 2022
2984b94
Merge remote-tracking branch 'origin/develop' into feature/set_imp_ma…
Mar 14, 2022
e9d0fd7
Ordering the code
Mar 16, 2022
a15b2df
Make init
Mar 18, 2022
7f18c4d
Add haz_type as property
Apr 11, 2022
c6070de
Add centroid columns name in exposures as property
Apr 11, 2022
d4820f3
Add overwrite variable to assign_centroids
Apr 11, 2022
f49d492
Add method affect_values_gdf
Apr 11, 2022
f8bf6b0
Remove warning impf at int=0
Apr 11, 2022
48ca4d7
First commit - overhaul impact
Apr 11, 2022
51fdbb6
Add insurance layers
Apr 12, 2022
ad093af
Add methods get mdr, paa, fraction
Apr 12, 2022
9985414
Change sum to np.sum
Apr 12, 2022
3f5f402
Add affect_total_value
Apr 12, 2022
a2aa1e7
Reduce gdf to minimum necessary
Apr 12, 2022
b172eed
Correct exposure gdf column selection
Apr 12, 2022
8567c5d
Update docstrings
Apr 13, 2022
53a2bc1
Fix bug at_event / eai_exp
Apr 13, 2022
69ac247
Refactoring of methods to simplify code
Apr 13, 2022
1a028ff
Use generator instead of list
Apr 14, 2022
0bac4fc
Merge branch 'develop' into feature/refactor_impact_calc
Apr 14, 2022
7fc364d
Impact: improve performance by avoiding pandas operations
Apr 21, 2022
d694e5e
Add notimplementederror for mdf(0)
Apr 21, 2022
05ccc9a
Toggle logger warning for speed test
Apr 21, 2022
5a777af
Fetch only assigned impf
Apr 21, 2022
337e24e
Return the reduced exp indices only
Apr 21, 2022
bb8b457
Remove 0 from insured impacts
Apr 21, 2022
54df06e
Stich imp mat from generator
Apr 21, 2022
738108b
Rename impat_mat_list to _gen
Apr 21, 2022
d22ce45
Avoid generator to list conversion in insured risk
Apr 21, 2022
a6703c9
Rename exp_iimp
Apr 21, 2022
4cb3460
Add impact_calc class
May 3, 2022
7457e3a
Correct function signature
May 3, 2022
075836a
Adjust class name
May 3, 2022
31845f9
Correct indent
May 3, 2022
e2384d4
Typo: stich -> stitch
May 9, 2022
af831fd
Merge branch 'develop' into feature/refactor_impact_calc
emanuel-schmid May 9, 2022
2e36267
PEP8
emanuel-schmid May 9, 2022
07d12bc
impact: pydoc cosmetics
emanuel-schmid May 18, 2022
ed9b013
Include all helper methods in ImpactCalc
May 18, 2022
0cc03bb
Merge remote-tracking branch 'origin/feature/refactor_impact_calc' in…
May 18, 2022
a9458e3
Remove unused module
May 19, 2022
6bec23a
Improve cosmetics
May 19, 2022
6abd902
Improve docstring
May 19, 2022
6c69ce6
Import new class in init
Jun 2, 2022
414b369
Compute insured impact from existing imp_mat if available
Jun 2, 2022
8790eb4
Make two test files
Jun 8, 2022
82460c4
Add _return_impact method
Jun 9, 2022
057e719
Return impact object
Jun 9, 2022
e271925
Make risk_metrics a class method
Jun 9, 2022
a84d2fe
Separate Impact and ImpactCalc
Jun 9, 2022
bcf1590
Add deprecation warnings
Jun 9, 2022
c462290
Return min exp gdf instead of df
Jun 9, 2022
425cebf
Improve cover and deductible
Jun 9, 2022
15b7d57
Rename imp_mat to mat
Jun 9, 2022
bd7306e
Add tests for most exposed methods
Jun 9, 2022
41eb14e
Remove calc tests
Jun 9, 2022
c408414
Add test overwrite centroids flag
Jun 9, 2022
4598f13
Remove warning for impf nonzero at zero
Jun 9, 2022
3b34df9
Add docstrings
Jun 9, 2022
a32de31
Update test precision to account for different floating number
Jun 10, 2022
7f81052
Fix bug in insure mat
Jun 10, 2022
7c0e86f
Merge branch 'develop' into feature/refactor_impact_calc
emanuel-schmid Jun 10, 2022
e0a6a0a
Make 'imp_mat' an empty matrix by default
peanutfun Jun 10, 2022
18609c6
Merge branch 'feature/refactor_impact_calc' of https://github.com/CLI…
peanutfun Jun 10, 2022
90536e4
Avoid mutables in Impact and ImpactCalc signatures
peanutfun Jun 10, 2022
a54eb33
Add test case for ImpactCalc.impact_matrix
peanutfun Jun 10, 2022
84f8245
Add input attribute consistency checks to init
Jun 11, 2022
0d17a90
Add unfinished impact init test
Jun 11, 2022
bd8cb31
Merge remote-tracking branch 'origin/feature/refactor_impact_calc' in…
Jun 11, 2022
660dbc2
Set default imp_mat to None
Jun 11, 2022
5cec3fd
Merge remote-tracking branch 'origin/feature/refactor_impact_calc' in…
Jun 11, 2022
01e4e27
Update init consistency tests
Jun 11, 2022
963a65f
Add from_eih test
Jun 11, 2022
76cbf9a
Add test risk transfer and residual risk
Jun 11, 2022
eb5eba1
Add test impact_per_year
Jun 11, 2022
e984543
Cosmetics
Jun 11, 2022
1238cdb
Remove unsued method
Jun 13, 2022
8e25d09
Add test for affected value
Jun 13, 2022
f8b5376
Add property tests
Jun 13, 2022
597bdd7
Add test get_mdr
Jun 14, 2022
3d3774a
Add test get paa
Jun 14, 2022
7792e7c
Add test get fraction
Jun 14, 2022
87cda77
Add docstring
Jun 14, 2022
cc353e6
Add chunk method
Jun 15, 2022
b33ba54
Update name cent to centr
Jun 15, 2022
10f68e9
Update docstring
Jun 15, 2022
1183d6f
Use np.array_split for chunks
Jun 15, 2022
246267b
Update naming
Jun 15, 2022
a60c0b6
Simplify argument chunker
Jun 15, 2022
670fcb1
Add test case for ImpactCalc.imp_mat_gen
peanutfun Jun 15, 2022
3b35a3e
Add test for ImpactCalc.stitch_impact_matrix
peanutfun Jun 15, 2022
ffd0804
Add test for ImpactCalc.apply_deductible_to_mat
peanutfun Jun 15, 2022
6fe8611
Fix bug and formatting in ImpactCalc.imp_mat_gen
peanutfun Jun 15, 2022
4a2dac9
Add test with fraction not equal 1
Jun 15, 2022
c263c50
Improve error message in impact matrix chunking
peanutfun Jun 16, 2022
f3fe9d8
Merge branch 'feature/refactor_impact_calc' of https://github.com/CLI…
peanutfun Jun 16, 2022
78c9be4
test_impact_calc: persistently revert the CONFIG changes in tearDown
emanuel-schmid Jun 16, 2022
2a1122f
Replace instances in CONFIG instead of mocking
peanutfun Jun 16, 2022
8a699f8
Add test for ImpactCalc.stitch_risk_metrics and update docstring
peanutfun Jun 16, 2022
d43c36b
Add deepcopy import to test_impact_calc
peanutfun Jun 16, 2022
6ed55b0
Add test for ImpactCalc.insured_mat_gen and update docstring
peanutfun Jun 16, 2022
96924ea
Add test for ImpactCalc.impact_matrix
peanutfun Jun 16, 2022
b8a39fd
Add TestInsuredImpactMatrixGenerator to test loader
peanutfun Jun 16, 2022
2e21f9c
Add test for ImpactCalc._return_matrix
peanutfun Jun 17, 2022
d5341f3
Update docstring
Jun 20, 2022
763af8b
Merge branch 'develop' into feature/refactor_impact_calc
emanuel-schmid Jun 20, 2022
d386198
Merge branch 'develop' into feature/refactor_impact_calc
emanuel-schmid Jun 20, 2022
4eead86
pydoc cosmetics
emanuel-schmid Jun 20, 2022
a2c773d
pydoc cosmetics
emanuel-schmid Jun 20, 2022
1c318a4
pydoc consolidation
emanuel-schmid Jun 20, 2022
bb728ff
Merge branch 'develop' into feature/refactor_impact_calc
emanuel-schmid Jun 21, 2022
3ee149c
Merge branch 'develop' into feature/refactor_impact_calc
emanuel-schmid Jun 23, 2022
e4daf2b
Improve attribute error message from_eih
Jun 24, 2022
1f5322e
Update from_eih tests
Jun 24, 2022
8c393c9
impact_calc.ImpactCalc: remove imp_mat attribute
emanuel-schmid Jun 24, 2022
58449fd
ImpactCalc.impact: re-establish save_mat argument
emanuel-schmid Jun 24, 2022
55e5ae7
Set assigned_centroids overwrite=False
Jun 24, 2022
7d4e5f6
(Re)-Set ignore zero to False
Jun 27, 2022
1aa5581
Update deprecation warning message
Jun 27, 2022
aaf2f93
Correct bug wrong exp_idx
Jun 27, 2022
f59e04d
Make new index in case the index is not ordered
Jun 27, 2022
bf3a813
Slice numpy instead of gdf
Jun 27, 2022
7c910d0
Add _orig_exp_idx to tests
Jun 28, 2022
24a77ad
Add tests with Flood
Jun 28, 2022
3aea175
Return empty impact if no exposures matching hazard
Jun 28, 2022
0a6c9e1
Repalce squeeze by ravel for atevent single event
Jun 28, 2022
5bdfd02
Add _orig_exp_idx as argument from beginning on
Jun 28, 2022
60c25ca
Update comment cosmetics
Jun 28, 2022
2014af8
Reduce lines to max length
Jun 28, 2022
cdc536d
Replace generator by list due to deprecation in np.hstack
Jun 28, 2022
ed18913
Merge branch 'develop' into feature/refactor_impact_calc
emanuel-schmid Jul 1, 2022
d678fa2
Merge branch 'develop' into feature/refactor_impact_calc
emanuel-schmid Jul 14, 2022
74e3a8e
test_impact_calc: organize test data
emanuel-schmid Jul 15, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
59 changes: 29 additions & 30 deletions climada/engine/impact_calc.py
Original file line number Diff line number Diff line change
Expand Up @@ -40,44 +40,43 @@ class ImpactCalc():
def __init__(self,
exposures,
impfset,
hazard,
imp_mat=None):
hazard):
"""
Initialize an ImpactCalc object.
ImpactCalc constructor

The dimension of the imp_mat variable must be compatible with the
exposures and hazard objects.

Parameters
----------
exposures : climada.entity.Exposures
exposure used to compute imp_mat
exposures used to compute imp_mat, the object is subject to change when ImpactCalc
methods are executed.
impf_set: climada.entity.ImpactFuncSet
impact functions set used to compute imp_mat
hazard : climada.Hazard
hazard used to compute imp_mat
imp_mat : sparse.csr_matrix, optional
matrix num_events x num_exp with impacts.
Default is an empty matrix.

Returns
-------
None.

"""

self.exposures = exposures
self.impfset = impfset
self.hazard = hazard
self.imp_mat = imp_mat if imp_mat is not None else sparse.csr_matrix((0, 0))
self.n_exp_pnt = self.exposures.gdf.shape[0]
self.n_events = self.hazard.size

@property
def n_exp_pnt(self):
"""Number of exposure points (rows in gdf)"""
return self.exposures.gdf.shape[0]

@property
def n_events(self):
"""Number of hazard events (size of event_id array)"""
return self.hazard.size

@property
def deductible(self):
"""
Deductibles from the exposures. Returns empty array
if no deductibles defined.
Deductibles from the exposures. Returns None if no deductibles defined.

Returns
-------
Expand All @@ -87,12 +86,11 @@ def deductible(self):
"""
if 'deductible' in self.exposures.gdf.columns:
return self.exposures.gdf['deductible'].to_numpy()
return np.array([])

@property
def cover(self):
"""
Covers from the exposures. Returns empty array if no covers defined.
Covers from the exposures. Returns None if no covers defined.

Returns
-------
Expand All @@ -102,9 +100,8 @@ def cover(self):
"""
if 'cover' in self.exposures.gdf.columns:
return self.exposures.gdf['cover'].to_numpy()
return np.array([])

def impact(self, save_mat=True):
def impact(self):
"""Compute the impact of a hazard on exposures.

Parameters
Expand All @@ -130,7 +127,7 @@ def impact(self, save_mat=True):
LOGGER.info('Calculating impact for %s assets (>0) and %s events.',
self.n_events, self.n_events)
imp_mat_gen = self.imp_mat_gen(exp_gdf, impf_col)
return self._return_impact(imp_mat_gen, save_mat)
return self._return_impact(imp_mat_gen, True)

#TODO: make a better impact matrix generator for insured impacts when
# the impact matrix is already present
Expand Down Expand Up @@ -162,7 +159,7 @@ def insured_impact(self, save_mat=False):
apply_cover_to_mat:
apply cover to impact matrix
"""
if self.cover.size == 0 and self.deductible.size == 0:
if self.cover is None and self.deductible is None:
raise AttributeError("Neither cover nor deductible defined."
"Please set exposures.gdf.cover"
"and/or exposures.gdf.deductible")
Expand All @@ -171,10 +168,8 @@ def insured_impact(self, save_mat=False):
LOGGER.info('Calculating impact for %s assets (>0) and %s events.',
exp_gdf.size, self.hazard.size)

if self.imp_mat.size == 0:
imp_mat_gen = self.imp_mat_gen(exp_gdf, impf_col)
else:
imp_mat_gen = ((self.imp_mat, np.arange(1, len(exp_gdf))) for n in range(1))
imp_mat_gen = self.imp_mat_gen(exp_gdf, impf_col)

ins_mat_gen = self.insured_mat_gen(imp_mat_gen, exp_gdf, impf_col)
return self._return_impact(ins_mat_gen, save_mat)

Expand All @@ -200,13 +195,14 @@ def _return_impact(self, imp_mat_gen, save_mat):

"""
if save_mat:
self.imp_mat = self.stitch_impact_matrix(imp_mat_gen)
at_event, eai_exp, aai_agg = self.risk_metrics(self.imp_mat, self.hazard.frequency)
imp_mat = self.stitch_impact_matrix(imp_mat_gen)
at_event, eai_exp, aai_agg = self.risk_metrics(imp_mat, self.hazard.frequency)
else:
imp_mat = None
at_event, eai_exp, aai_agg = self.stitch_risk_metrics(imp_mat_gen)
return Impact.from_eih(
self.exposures, self.impfset, self.hazard,
at_event, eai_exp, aai_agg, self.imp_mat
at_event, eai_exp, aai_agg, imp_mat
)

def minimal_exp_gdf(self, impf_col):
Expand All @@ -220,7 +216,10 @@ def minimal_exp_gdf(self, impf_col):
name of the impact function column in exposures.gdf

"""
self.exposures.assign_centroids(self.hazard, overwrite=False)
# since the original exposures object will be "spoiled" through assign_centroids
# (a column cent_XY is added), overwrite=True makes sure the exposures can be reused
# for ImpactCalc with another Hazard object of the same hazard type.
self.exposures.assign_centroids(self.hazard, overwrite=True)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would you change that to True? This is a bad idea as this results in a lot of extra computation time, often for nothing. A typical workflow implies that the Exposures object given to ImpactCalc already has assigned centroids. This line here is just in case that none are defined (which is a sub-optimal workflow, but needs to stay due to backwards compatibility).

Thus, this should be reversed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Then we have the following problem:

agg1 = ImpactCalc(exp, impf, haz1).impact().aai_agg
agg2 = ImpactCalc(exp, impf, haz2).impact().aai_agg
print(agg1 - agg2)

does i.g. not print the same as

agg2 = ImpactCalc(exp, impf, haz2).impact().aai_agg
agg1 = ImpactCalc(exp, impf, haz1).impact().aai_agg
print(agg1 - agg2)

Simply, because in the first example agg2 is likely wrong and in the second agg1.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point Emanuel, I remember that I stumbled upon this before!

I would be in favor of moving the whole centroids assignment procedure to the ImpactCalc object because semantically it's not a property of the Exposures object, but of the combination of an exposure with a hazard. However, I see the problem that it might be desirable to be able to reuse the same assignment with several different Hazard objects that share the same Centroids... And of course, there is the problem of backwards compatibility when people are explicitly using the Exposures.assign_centroids API.

Maybe the easiest solution is to leave the API as it is, but have a boolean keyword argument reassign_centroids=True for ImpactCalc.impact() so the user can opt out of the reassignment if that's desired?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me too actually, I'd favor moving the centroids assignments out of Exposures. Also I'm not yet ready to go for the easiest solution, as this looks like it may be crucial.

Two questions:

  • What is a typical workflow?
  • When two Hazard objects share the same Centroids, are they "aware" of it, i.e., do they share the same Centroids object or do their Centroids just have the same content and only the user knows that they are the same?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a rather complex discussion that will be very hard to follow clearly on a chat forum like here. But, a few remarks:

  • the current API and workflow is clear: the user is responsible for making sensible hazard-exposures-impact functions assignments. This includes assigning for a given hazard type the centroids to the exposures. The same goes true for assigning the correct impact functions to the exposures points. We can modify this of course, but this would be a major change in API and workflow
  • we 100% do NOT want to assign centroids for each impact computation as this is a step that needs to taken once only per exposures-hazard pair. We do not want to repeat this rather expensive operation (e.g. in the uncertainty computations).
  • two hazards are not aware that they have the same centroids. It is unclear to me how that would work, as the idea is that any user can define their hazard (and thus their centroids) from whatever source they want and bring it into CLIMADA for computation.
  • My opinion would be that actually the assign_centroids is entirely removed from the Impact.calc as it is an under the hood assignment which we would like the user to have control over (for instance, does one always want to use nearest-neighbors?). This way, the user must think and prepare consistent data.

I also agree that a quick solution is not optimal. The current solution is the one that keeps the same convention as in the Impact.calc() method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion would be to decouple the two discussion. This PR is about making the impact class more readable, more modular, better tested and more efficient. It is not about changing the inner workings of the API fundamentally as is here suggested with the change with the assign_centroids.


mask = (
(self.exposures.gdf.value.values != 0)
Expand Down
13 changes: 5 additions & 8 deletions climada/engine/test/test_impact_calc.py
Original file line number Diff line number Diff line change
Expand Up @@ -60,7 +60,6 @@ def test_init(self):
icalc = ImpactCalc(ENT.exposures, ENT.impact_funcs, HAZ)
self.assertEqual(icalc.n_exp_pnt, ENT.exposures.gdf.shape[0])
self.assertEqual(icalc.n_events, HAZ.size)
self.assertEqual(icalc.imp_mat.size, 0)
self.assertTrue(ENT.exposures.gdf.equals(icalc.exposures.gdf))
np.testing.assert_array_equal(HAZ.event_id, icalc.hazard.event_id)
np.testing.assert_array_equal(HAZ.event_name, icalc.hazard.event_name)
Expand Down Expand Up @@ -137,7 +136,7 @@ def test_calc_impact_pass(self):
def test_calc_impact_save_mat_pass(self):
"""Test compute impact with impact matrix"""
icalc = ImpactCalc(ENT.exposures, ENT.impact_funcs, HAZ)
impact = icalc.impact(save_mat=True)
impact = icalc.impact()

self.assertIsInstance(impact.imp_mat, sparse.csr_matrix)
self.assertEqual(impact.imp_mat.shape, (HAZ.event_id.size,
Expand Down Expand Up @@ -234,9 +233,8 @@ def test_minimal_exp_gdf(self):

def test_stitch_impact_matrix(self):
"""Check how sparse matrices from a generator are stitched together"""
icalc = ImpactCalc(Exposures(), ImpactFuncSet(), Hazard())
icalc.n_events = 3
icalc.n_exp_pnt = 4
icalc = ImpactCalc(Exposures({'blank': [1, 2, 3, 4]}), ImpactFuncSet(), Hazard())
icalc.hazard.event_id = np.array([1, 2, 3])

imp_mat_gen = [
(sparse.csr_matrix([[1.0, 1.0], [0.0, 1.0]]), np.array([0, 1])),
Expand Down Expand Up @@ -265,9 +263,8 @@ def test_apply_deductible_to_mat(self):

def test_stitch_risk_metrics(self):
"""Test computing risk metrics from an impact matrix generator"""
icalc = ImpactCalc(Exposures(), ImpactFuncSet(), Hazard())
icalc.n_events = 2
icalc.n_exp_pnt = 3
icalc = ImpactCalc(Exposures({'blank': [1, 2, 3]}), ImpactFuncSet(), Hazard())
icalc.hazard.event_id = np.array([1, 2])
icalc.hazard.frequency = np.array([2, 0.5])

# Matrices overlap at central exposure point
Expand Down