Commit

Merge branch 'develop' into feature/set_geometry

emanuel-schmid committed Jul 9, 2024
2 parents c9d7c95 + f825ca5 commit 2cdcdb7
Showing 20 changed files with 1,074 additions and 415 deletions.
9 changes: 9 additions & 0 deletions CHANGELOG.md
@@ -16,15 +16,23 @@ Code freeze date: YYYY-MM-DD

### Changed

- Use GeoPandas `GeoDataFrame.plot()` for the centroids plotting function [#896](https://github.com/CLIMADA-project/climada_python/pull/896)
- Update SALib sensitivity and sampling methods to the newest version (SALib 1.4.7) [#828](https://github.com/CLIMADA-project/climada_python/issues/828)
- Allow for computation of relative and absolute delta impacts in `CalcDeltaClimate`
- Remove content tables and make minor improvements (fix typos and readability) in
CLIMADA tutorials. [#872](https://github.com/CLIMADA-project/climada_python/pull/872)
- Complete overhaul of `Centroids`. Most functions should be backward compatible. Internal data is stored in a GeoDataFrame attribute. Rasters are now stored as points, and the `meta` attribute is removed. Several methods were deprecated or removed. [#787](https://github.com/CLIMADA-project/climada_python/pull/787)
- Improved error messages produced by `ImpactCalc.impact()` in case an impact function referenced in the exposures is not found in the `impf_set` [#863](https://github.com/CLIMADA-project/climada_python/pull/863)
- Update the Holland et al. 2010 TC windfield model and introduce `model_kwargs` parameter to adjust model parameters [#846](https://github.com/CLIMADA-project/climada_python/pull/846)
- Changed module structure: `climada.hazard.Hazard` has been split into the modules `base`, `io` and `plot` [#871](https://github.com/CLIMADA-project/climada_python/pull/871)
- `Impact.from_hdf5` now calls `str` on `event_name` data that is not strings, and issues a warning when doing so [#894](https://github.com/CLIMADA-project/climada_python/pull/894)
- `Impact.write_hdf5` now throws an error if `event_name` does not contain strings exclusively [#894](https://github.com/CLIMADA-project/climada_python/pull/894)

### Fixed

- Throw an error instead of returning a `Hazard` subselection whose fraction matrix contains only zeros [#866](https://github.com/CLIMADA-project/climada_python/pull/866)
- Allow downgrading the Python bugfix version to improve environment compatibility [#900](https://github.com/CLIMADA-project/climada_python/pull/900)
- Fix broken links in `CONTRIBUTING.md` [#900](https://github.com/CLIMADA-project/climada_python/pull/900)

### Added

@@ -158,6 +166,7 @@ Changed:

- `geopandas` >=0.13 → >=0.14
- `pandas` >=1.5,<2.0 → >=2.1
- `salib` >=1.3.0 → >=1.4.7

Removed:

13 changes: 7 additions & 6 deletions CONTRIBUTING.md
@@ -22,7 +22,7 @@ Please contact the [lead developers](https://wcr.ethz.ch/research/climada.html)

## Minimal Steps to Contribute

Before you start, please have a look at our [Developer Guide][devguide].
Before you start, please have a look at our Developer Guide section in the [CLIMADA Docs][docs].

To contribute follow these steps:

@@ -65,21 +65,22 @@ To contribute follow these steps:
## Resources
The CLIMADA documentation provides a [Developer Guide][devguide].
The [CLIMADA documentation][docs] provides several Developer Guides.
Here's a selection of the commonly required information:

* How to use Git and GitHub for CLIMADA development: [Development and Git and CLIMADA](https://climada-python.readthedocs.io/en/latest/guide/Guide_Git_Development.html)
* Coding instructions for CLIMADA: [Python Dos and Don'ts](https://climada-python.readthedocs.io/en/latest/guide/Guide_PythonDos-n-Donts.html), [Performance Tips](https://climada-python.readthedocs.io/en/latest/guide/Guide_Py_Performance.html), [CLIMADA Conventions](https://climada-python.readthedocs.io/en/latest/guide/Guide_Miscellaneous.html)
* How to execute tests in CLIMADA: [Testing and Continuous Integration][testing]
* Coding instructions for CLIMADA: [Python Dos and Don'ts](https://climada-python.readthedocs.io/en/latest/guide/Guide_PythonDos-n-Donts.html), [Performance Tips](https://climada-python.readthedocs.io/en/latest/guide/Guide_Py_Performance.html), [CLIMADA Conventions](https://climada-python.readthedocs.io/en/latest/guide/Guide_CLIMADA_conventions.html)
* How to execute tests in CLIMADA: [Testing][testing] and [Continuous Integration](https://climada-python.readthedocs.io/en/latest/guide/Guide_continuous_integration_GitHub_actions.html)
## Pull Requests
After developing a new feature, fixing a bug, or updating the tutorials, you can create a [pull request](https://docs.github.com/en/pull-requests) to have your changes reviewed and then merged into the CLIMADA code base.
To ensure that your pull request can be reviewed quickly and easily, please have a look at the _Resources_ above before opening a pull request.
In particular, please check out the [Pull Request instructions](https://climada-python.readthedocs.io/en/latest/guide/Guide_Git_Development.html#Pull-requests).
In particular, please check out the [Pull Request instructions](https://climada-python.readthedocs.io/en/latest/guide/Guide_Git_Development.html#pull-requests).
We provide a description template for pull requests that helps you provide the essential information for reviewers.
It also contains a checklist for both pull request authors and reviewers to guide the review process.
[docs]: https://climada-python.readthedocs.io/en/latest/
[devguide]: https://climada-python.readthedocs.io/en/latest/#developer-guide
[testing]: https://climada-python.readthedocs.io/en/latest/guide/Guide_Continuous_Integration_and_Testing.html
[testing]: https://climada-python.readthedocs.io/en/latest/guide/Guide_Testing.html
26 changes: 18 additions & 8 deletions climada/engine/impact.py
@@ -937,11 +937,6 @@ def write_hdf5(self, file_path: Union[str, Path], dense_imp_mat: bool=False):
The impact matrix can be stored in a sparse or dense format.
Notes
-----
This writer does not support attributes with variable types. Please make sure
that ``event_name`` is a list of equally-typed values, e.g., all ``str``.
Parameters
----------
file_path : str or Path
@@ -950,6 +945,11 @@ def write_hdf5(self, file_path: Union[str, Path], dense_imp_mat: bool=False):
If ``True``, write the impact matrix as a dense matrix that can be more easily
interpreted by common H5 file readers but takes up (vastly) more space.
Defaults to ``False``.
Raises
------
TypeError
If :py:attr:`event_name` does not contain strings exclusively.
"""
# Define writers for all types (will be filled later)
type_writers = dict()
@@ -983,7 +983,7 @@ def write(group: h5py.Group, name: str, value: Any):

def _str_type_helper(values: Collection):
"""Return string datatype if we assume 'values' contains strings"""
if isinstance(next(iter(values)), str):
if all((isinstance(val, str) for val in values)):
return h5py.string_dtype()
return None
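For illustration (not part of the commit), a minimal sketch of why the one-element check was replaced; the helper names below are hypothetical:

```python
import h5py

def str_type_first_only(values):
    # Old check: only the first element is inspected, so a mixed list
    # like ["a", 1] is wrongly classified as all-string.
    if isinstance(next(iter(values)), str):
        return h5py.string_dtype()
    return None

def str_type_all(values):
    # New check: every element must be a string.
    if all(isinstance(val, str) for val in values):
        return h5py.string_dtype()
    return None

print(str_type_first_only(["a", 1]))  # string dtype -- false positive
print(str_type_all(["a", 1]))         # None -- mixed types detected
```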

@@ -1037,6 +1037,8 @@ def write_csr(group, name, value):
# Now write all attributes
# NOTE: Remove leading underscore to write '_tot_value' as regular attribute
for name, value in self.__dict__.items():
if name == "event_name" and _str_type_helper(value) is None:
raise TypeError("'event_name' must be a list of strings")
write(file, name.lstrip("_"), value)
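As a hedged usage sketch (file name and attribute values are made up, assuming an `Impact` can be constructed with only `event_name`), the new guard turns a previously opaque h5py dtype error into an explicit `TypeError`:

```python
from climada.engine import Impact

imp = Impact(event_name=["storm_a", 2, 3.5])  # mixed types: invalid

try:
    imp.write_hdf5("impact_demo.h5")  # hypothetical output path
except TypeError as err:
    print(err)  # 'event_name' must be a list of strings
```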

def write_sparse_csr(self, file_name):
@@ -1240,10 +1242,18 @@ def from_hdf5(cls, file_path: Union[str, Path]):
).intersection(file.keys())
kwargs.update({attr: file[attr][:] for attr in array_attrs})

# Special handling for 'event_name' because it's a list of strings
# Special handling for 'event_name' because it should be a list of strings
if "event_name" in file:
# pylint: disable=no-member
kwargs["event_name"] = list(file["event_name"].asstr()[:])
try:
event_name = file["event_name"].asstr()[:]
except TypeError:
LOGGER.warning(
"'event_name' is not stored as strings. Trying to decode "
"values with 'str()' instead."
)
event_name = map(str, file["event_name"][:])
kwargs["event_name"] = list(event_name)

# Create the impact object
return cls(**kwargs)
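A minimal sketch of the read-side fallback, mirroring the test further down (the file name is hypothetical): a numeric `event_name` dataset is coerced with `str()` and a warning is logged:

```python
import h5py

# Overwrite 'event_name' with numeric data to trigger the fallback
with h5py.File("impact_demo.h5", "r+") as file:
    del file["event_name"]
    file.create_dataset("event_name", data=[1.2, 2])

# Impact.from_hdf5("impact_demo.h5") now logs
#   "'event_name' is not stored as strings. ..."
# and yields event_name == ["1.2", "2.0"]
```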
19 changes: 15 additions & 4 deletions climada/engine/test/test_impact.py
@@ -779,7 +779,8 @@ def test_select_event_identity_pass(self):
ent.exposures.assign_centroids(hazard)

# Compute the impact over the whole exposures
imp = ImpactCalc(ent.exposures, ent.impact_funcs, hazard).impact(save_mat=True, assign_centroids=False)
imp = ImpactCalc(ent.exposures, ent.impact_funcs, hazard).impact(
save_mat=True, assign_centroids=False)

sel_imp = imp.select(event_ids=imp.event_id,
event_names=imp.event_name,
@@ -1019,10 +1020,11 @@ def test_write_hdf5_without_imp_mat(self):

def test_write_hdf5_type_fail(self):
"""Test that writing attributes with varying types results in an error"""
self.impact.event_name = [1, "a", 1.0, "b", "c", "d"]
with self.assertRaises(TypeError) as cm:
self.impact.event_name = ["a", 1, 1.0, "b", "c", "d"]
with self.assertRaisesRegex(
TypeError, "'event_name' must be a list of strings"
):
self.impact.write_hdf5(self.filepath)
self.assertIn("No conversion path for dtype", str(cm.exception))

def test_cycle_hdf5(self):
"""Test writing and reading the same object"""
@@ -1120,6 +1122,15 @@ def test_read_hdf5_full(self):
impact = Impact.from_hdf5(self.filepath)
npt.assert_array_equal(impact.imp_mat.toarray(), [[0, 1, 2], [3, 0, 0]])

# Check with non-string event_name
event_name = [1.2, 2]
with h5py.File(self.filepath, "r+") as file:
del file["event_name"]
file.create_dataset("event_name", data=event_name)
with self.assertLogs("climada.engine.impact", "WARNING") as cm:
impact = Impact.from_hdf5(self.filepath)
self.assertIn("'event_name' is not stored as strings", cm.output[0])
self.assertListEqual(impact.event_name, ["1.2", "2.0"])

# Execute Tests
if __name__ == "__main__":
99 changes: 85 additions & 14 deletions climada/engine/unsequa/calc_base.py
@@ -203,8 +203,8 @@ def make_sample(self, N, sampling_method='saltelli',
Number of samples as used in the sampling method from SALib
sampling_method : str, optional
The sampling method as defined in SALib. Possible choices:
'saltelli', 'fast_sampler', 'latin', 'morris', 'dgsm', 'ff'
https://salib.readthedocs.io/en/latest/api.html
'saltelli', 'latin', 'morris', 'dgsm', 'fast_sampler', 'ff', 'finite_diff',
https://salib.readthedocs.io/en/latest/api.html
The default is 'saltelli'.
sampling_kwargs : kwargs, optional
Optional keyword arguments passed on to the SALib sampling_method.
@@ -215,6 +215,17 @@
unc_output : climada.engine.uncertainty.unc_output.UncOutput()
Uncertainty data object with the samples
Notes
-----
The 'ff' sampling method does not require a value for the N parameter;
the input N value is therefore ignored when this method is used.
The 'ff' sampling method requires the number of uncertainty parameters to be
a power of 2. Users can generate dummy variables to meet this
requirement (see the sketch below). Please refer to
https://salib.readthedocs.io/en/latest/api.html for more details.
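A sketch of the dummy-variable workaround mentioned above; this is not a CLIMADA helper, and all names are illustrative:

```python
import math

param_labels = ["x1", "x2", "x3"]  # 3 parameters -> pad to 4
n_target = 2 ** math.ceil(math.log2(len(param_labels)))
dummies = [f"dummy_{i}" for i in range(n_target - len(param_labels))]

problem_sa = {
    "num_vars": n_target,
    "names": param_labels + dummies,
    "bounds": [[0, 1]] * n_target,
}
print(problem_sa["names"])  # ['x1', 'x2', 'x3', 'dummy_0']
```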
See Also
--------
SALib.sample : sampling methods from SALib
@@ -231,11 +242,17 @@
'names' : param_labels,
'bounds' : [[0, 1]]*len(param_labels)
}

#for the ff sampler, no value of N is needed. For API consistency the user
#must input a value that is ignored and a warning is given.
if sampling_method == 'ff':
LOGGER.warning("You are using the 'ff' sampler which does not require "
"a value for N. The entered N value will be ignored"
"in the sampling process.")
uniform_base_sample = self._make_uniform_base_sample(N, problem_sa,
sampling_method,
sampling_kwargs)
df_samples = pd.DataFrame(uniform_base_sample, columns=param_labels)

for param in list(df_samples):
df_samples[param] = df_samples[param].apply(
self.distr_dict[param].ppf
@@ -271,7 +288,7 @@ def _make_uniform_base_sample(self, N, problem_sa, sampling_method,
SALib sampling method.
sampling_method: string
The sampling method as defined in SALib. Possible choices:
'saltelli', 'fast_sampler', 'latin', 'morris', 'dgsm', 'ff'
'saltelli', 'latin', 'morris', 'dgsm', 'fast_sampler', 'ff', 'finite_diff',
https://salib.readthedocs.io/en/latest/api.html
sampling_kwargs: dict()
Optional keyword arguments passed on to the SALib sampling method.
@@ -292,8 +309,20 @@
#c.f. https://stackoverflow.com/questions/2724260/why-does-pythons-import-require-fromlist
import importlib # pylint: disable=import-outside-toplevel
salib_sampling_method = importlib.import_module(f'SALib.sample.{sampling_method}')
sample_uniform = salib_sampling_method.sample(
problem = problem_sa, N = N, **sampling_kwargs)

if sampling_method == 'ff': #the ff sampling has a fixed sample size and
#does not require the N parameter
if problem_sa['num_vars'] & (problem_sa['num_vars'] - 1) != 0:
raise ValueError("The number of parameters must be a power of 2. "
"To use the ff sampling method, you can generate "
"dummy parameters to overcome this limitation."
" See https://salib.readthedocs.io/en/latest/api.html")

sample_uniform = salib_sampling_method.sample(
problem = problem_sa, **sampling_kwargs)
else:
sample_uniform = salib_sampling_method.sample(
problem = problem_sa, N = N, **sampling_kwargs)
return sample_uniform
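The power-of-two check above relies on a standard bit trick; a standalone sketch:

```python
def is_power_of_two(n: int) -> bool:
    # n & (n - 1) clears the lowest set bit, so the result is zero
    # exactly when n has a single set bit, i.e. n is a power of two.
    return n >= 1 and n & (n - 1) == 0

print([n for n in range(1, 17) if is_power_of_two(n)])  # [1, 2, 4, 8, 16]
```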

def sensitivity(self, unc_output, sensitivity_method = 'sobol',
@@ -323,17 +352,21 @@ def sensitivity(self, unc_output, sensitivity_method = 'sobol',
unc_output : climada.engine.unsequa.UncOutput
Uncertainty data object in which to store the sensitivity indices
sensitivity_method : str, optional
sensitivity analysis method from SALib.analyse
Possible choices:
'fast', 'rbd_fact', 'morris', 'sobol', 'delta', 'ff'
The default is 'sobol'.
Note that in Salib, sampling methods and sensitivity analysis
methods should be used in specific pairs.
Sensitivity analysis method from SALib.analyse. Possible choices: 'sobol', 'fast',
'rbd_fast', 'morris', 'dgsm', 'ff', 'pawn', 'rhdm', 'rsa', 'discrepancy', 'hdmr'.
Note that in Salib, sampling methods and sensitivity
analysis methods should be used in specific pairs:
https://salib.readthedocs.io/en/latest/api.html
sensitivity_kwargs: dict, optional
Keyword arguments of the chosen SALib analyse method.
The default is to use SALib's default arguments.
Notes
-----
The variables 'Em','Term','X','Y' are removed from the output of the
'hdmr' method to ensure compatibility with unsequa.
The 'Delta' method is currently not supported.
Returns
-------
sens_output : climada.engine.unsequa.UncOutput
Expand All @@ -360,7 +393,7 @@ def sensitivity(self, unc_output, sensitivity_method = 'sobol',

sens_output = copy.deepcopy(unc_output)

#Certaint Salib method required model input (X) and output (Y), others
#Certain Salib methods require model input (X) and output (Y), others
#need only output (Y)
salib_kwargs = method.analyze.__code__.co_varnames # obtain all kwargs of the salib method
X = unc_output.samples_df.to_numpy() if 'X' in salib_kwargs else None
@@ -500,10 +533,47 @@ def _calc_sens_df(method, problem_sa, sensitivity_kwargs, param_labels, X, unc_d
else:
sens_indices = method.analyze(problem_sa, Y,
**sensitivity_kwargs)
#refactor incoherent SALib output
nparams = len(param_labels)
if method.__name__[-3:] == '.ff': #ff method
if sensitivity_kwargs['second_order']:
#parse interaction terms of sens_indices to a square matrix
#to ensure consistency with unsequa
interaction_names = sens_indices.pop('interaction_names')
interactions = np.full((nparams, nparams), np.nan)
#loop over interaction names and extract each param pair,
#then match to the corresponding param from param_labels
for i,interaction_name in enumerate(interaction_names):
interactions[param_labels.index(interaction_name[0]),
param_labels.index(interaction_name[1])] = sens_indices['IE'][i]
sens_indices['IE'] = interactions
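To see what this reshaping does, a self-contained sketch with made-up values (SALib's 'ff' analyser reports pairwise interactions as a flat list plus name pairs):

```python
import numpy as np

param_labels = ["a", "b", "c", "d"]
interaction_names = [("a", "b"), ("c", "d")]  # hypothetical SALib output
ie_values = [0.12, 0.05]

nparams = len(param_labels)
interactions = np.full((nparams, nparams), np.nan)
for (p1, p2), val in zip(interaction_names, ie_values):
    interactions[param_labels.index(p1), param_labels.index(p2)] = val

print(interactions)  # 0.12 at [0, 1], 0.05 at [2, 3], NaN elsewhere
```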

if method.__name__[-5:] == '.hdmr': #hdmr method
#first, remove variables that are incompatible with unsequa output
keys_to_remove = ['Em','Term','select', 'RT', 'Y_em', 'idx', 'X', 'Y']
sens_indices = {k: v for k, v in sens_indices.items()
if k not in keys_to_remove}
names = sens_indices.pop('names') #names of terms

#second, refactor to 2D
for si, si_val_array in sens_indices.items():
if (np.array(si_val_array).ndim == 1 and #for everything that is 1d and has
np.array(si_val_array).size > nparams): #length > n params, refactor to 2D
si_new_array = np.full((nparams, nparams), np.nan)
np.fill_diagonal(si_new_array, si_val_array[0:nparams]) #simple terms go on diag
for i,interaction_name in enumerate(names[nparams:]):
t1, t2 = interaction_name.split('/') #interaction terms
si_new_array[param_labels.index(t1),
param_labels.index(t2)] = si_val_array[nparams+i]
sens_indices[si] = si_new_array
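Analogously for 'hdmr', a sketch with made-up numbers: SALib returns one flat vector of first-order terms followed by 'p1/p2' interaction terms, which the loop above moves onto the diagonal and off-diagonal cells of a square matrix:

```python
import numpy as np

param_labels = ["a", "b"]
names = ["a", "b", "a/b"]        # first-order terms, then interactions
si_val_array = [0.6, 0.3, 0.05]  # hypothetical index values

nparams = len(param_labels)
si_new_array = np.full((nparams, nparams), np.nan)
np.fill_diagonal(si_new_array, si_val_array[:nparams])
for i, interaction_name in enumerate(names[nparams:]):
    t1, t2 = interaction_name.split("/")
    si_new_array[param_labels.index(t1),
                 param_labels.index(t2)] = si_val_array[nparams + i]

print(si_new_array)  # [[0.6, 0.05], [nan, 0.3]]
```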


sens_first_order = np.array([
np.array(si_val_array)
for si, si_val_array in sens_indices.items()
if (np.array(si_val_array).ndim == 1 and si!='names') # dirty trick due to Salib incoherent output
if (np.array(si_val_array).ndim == 1 # dirty trick due to Salib incoherent output
and si!='names'
and np.array(si_val_array).size == len(param_labels))
]).ravel()
sens_first_order_dict[submetric_name] = sens_first_order

Expand All @@ -515,6 +585,7 @@ def _calc_sens_df(method, problem_sa, sensitivity_kwargs, param_labels, X, unc_d
sens_second_order_dict[submetric_name] = sens_second_order

sens_first_order_df = pd.DataFrame(sens_first_order_dict, dtype=np.number)

if not sens_first_order_df.empty:
si_names_first_order, param_names_first_order = _si_param_first(param_labels, sens_indices)
sens_first_order_df.insert(0, 'si', si_names_first_order)