Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize concatenation of centroids #989

Open
wants to merge 13 commits into
base: develop
Choose a base branch
from
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,7 @@ Code freeze date: YYYY-MM-DD

### Changed

- `Centroids.append` now takes multiple arguments and provides a performance boost when doing so [#989](https://github.com/CLIMADA-project/climada_python/pull/989)
- `climada.util.coordinates.get_country_geometries` function: Now throwing a ValueError if unregognized ISO country code is given (before, the invalid ISO code was ignored) [#980](https://github.com/CLIMADA-project/climada_python/pull/980)
- Improved scaling factors implemented in `climada.hazard.trop_cyclone.apply_climate_scenario_knu` to model the impact of climate changes to tropical cyclones [#734](https://github.com/CLIMADA-project/climada_python/pull/734)
- In `climada.util.plot.geo_im_from_array`, NaNs are plotted in gray while cells with no centroid are not plotted [#929](https://github.com/CLIMADA-project/climada_python/pull/929)
Expand All @@ -47,6 +48,7 @@ Code freeze date: YYYY-MM-DD

### Fixed

- Resolved an issue where windspeed computation was much slower than in Climada v3 [#989](https://github.com/CLIMADA-project/climada_python/pull/989)
- File handles are being closed after reading netcdf files with `climada.hazard` modules [#953](https://github.com/CLIMADA-project/climada_python/pull/953)
- Avoids a ValueError in the impact calculation for cases with a single exposure point and MDR values of 0, by explicitly removing zeros in `climada.hazard.Hazard.get_mdr` [#933](https://github.com/CLIMADA-project/climada_python/pull/948)

Expand Down
40 changes: 24 additions & 16 deletions climada/hazard/centroids/centr.py
Original file line number Diff line number Diff line change
Expand Up @@ -331,16 +331,21 @@ def from_pnt_bounds(cls, points_bounds, res, crs=DEF_CRS):
}
)

def append(self, centr):
"""Append Centroids
def append(self, *centr):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a dedicated test for append with multiple arguments

"""Append Centroids to the current centroid object for concatenation.

This method check that all centroids use the same CRS, append the list of centroids to
NicolasColombi marked this conversation as resolved.
Show resolved Hide resolved
the initial Centroid object and eventually concatenate them to create a single centroid
NicolasColombi marked this conversation as resolved.
Show resolved Hide resolved
object with the union of all centroids.

Note that the result might contain duplicate points if the object to append has an overlap
with the current object.
with the current object. Remove duplicates by either using :py:meth:`union`
or calling :py:meth:`remove_duplicate_points` after appending.

Parameters
----------
centr : Centroids
Centroids to append. The centroids need to have the same CRS.
centr : list
List of Centroids to append. The centroids need to have the same CRS.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since centr is given as a *args, it does not have to be a list but can be several centroids, no? Formulated differently, can this method be called these two ways with equivalent results or does only one way work?

centr.append(new_centrs1, new_centrs2)
centr.append([new_centrs1, new_centrs2])

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To my understanding, the code as currently written would work if you call either centr.append(new_centrs1, new_centrs2) or centr.append(*[new_centrs1, new_centrs2]) but would not work calling centr.append([new_centrs1, new_centrs2]), since it wouldn't unpack the list into single centroid objects before the loop for other in centr which checks the CRS.

@NicolasColombi and @peanutfun, should we maybe consider adapting the method so that it would also work when passing a list without *?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Benoit, I think you are right, I will update the docs strings, it shouldn't be a list.


Raises
------
Expand All @@ -351,32 +356,35 @@ def append(self, centr):
union : Union of Centroid objects.
remove_duplicate_points : Remove duplicate points in a Centroids object.
"""
if not u_coord.equal_crs(self.crs, centr.crs):
raise ValueError(
f"The given centroids use different CRS: {self.crs}, {centr.crs}. "
"The centroids are incompatible and cannot be concatenated."
)
self.gdf = pd.concat([self.gdf, centr.gdf])
for other in centr:
if not u_coord.equal_crs(self.crs, other.crs):
raise ValueError(
f"The given centroids use different CRS: {self.crs}, {other.crs}. "
"The centroids are incompatible and cannot be concatenated."
)
self.gdf = pd.concat([self.gdf] + [other.gdf for other in centr])

def union(self, *others):
"""Create the union of Centroids objects
"""Create the union of the current Centroids object with one or more other centroids
objects by passing the list of centroids to :py:meth:`append` for concatenation and then
removes duplicates.

All centroids must have the same CRS. Points that are contained in more than one of the
Centroids objects will only be contained once (i.e. duplicates are removed).

Parameters
----------
others : list of Centroids
Centroids contributing to the union.
others : list
List of Centroids contributing to the union.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comments as above: if it's an *args then it's not a list, no?


Returns
-------
centroids : Centroids
Centroids object containing the union of all Centroids.
"""
centroids = copy.deepcopy(self)
for cent in others:
centroids.append(cent)
centroids.append(*others)

return centroids.remove_duplicate_points()

def remove_duplicate_points(self):
Expand Down
20 changes: 20 additions & 0 deletions climada/hazard/centroids/test/test_centr.py
Original file line number Diff line number Diff line change
Expand Up @@ -816,6 +816,26 @@ def test_append_dif_crs(self):
with self.assertRaises(ValueError):
self.centr.append(centr2)

def test_append_multiple_arguments(self):
"""Test passing append multiple arguments in the form of a list of Centroids."""
NicolasColombi marked this conversation as resolved.
Show resolved Hide resolved
# create a single centroid
lat, lon = np.array([1, 2]), np.array([1, 2])
centr = Centroids(lat=lat, lon=lon)
# create a list of centroids
coords = [(np.array([3, 4]), np.array([3, 4]))]
centroids_list = [Centroids(lat=lat, lon=lon) for lat, lon in coords]

centr.append(*centroids_list)

self.assertEqual(centr.lat[0], 1)
self.assertEqual(centr.lat[1], 2)
self.assertEqual(centr.lat[2], 3)
self.assertEqual(centr.lat[3], 4)
self.assertEqual(centr.lon[0], 1)
self.assertEqual(centr.lon[1], 2)
self.assertEqual(centr.lon[2], 3)
self.assertEqual(centr.lon[3], 4)
NicolasColombi marked this conversation as resolved.
Show resolved Hide resolved

def test_remove_duplicate_pass(self):
"""Test remove_duplicate_points"""
centr = Centroids(
Expand Down
Loading