
Demo Numpy v2 #6035

Merged
merged 35 commits on Oct 24, 2024
Commits
a82bea0
Demo Numpy v2.
trexfeathers Jul 2, 2024
2305966
Merge branch 'main' into numpy2
trexfeathers Jul 2, 2024
959da64
Merge remote-tracking branch 'upstream/main' into numpy2
trexfeathers Oct 7, 2024
27d4b48
Force NumPy v2.
trexfeathers Oct 7, 2024
ca4e4fe
Store PP LB headers as signed ints (as specified in UMDP F3).
trexfeathers Oct 8, 2024
887784e
Correct PP tuple printing.
trexfeathers Oct 8, 2024
51a7ecf
NumPy 2 compliant dtype casting in regridding.
trexfeathers Oct 8, 2024
bf54926
Todo note in What's New.
trexfeathers Oct 8, 2024
770d180
Accept NumPy 2 avoiding up-casting dtype in integration NetCDF _get_s…
trexfeathers Oct 8, 2024
521b962
Adapt NetCDF _ensure_valid_dtype() for NumPy v2.
trexfeathers Oct 8, 2024
041901a
Adapt maths handling of rhs scalar dtype - for NumPy v2.
trexfeathers Oct 8, 2024
feeb7e2
Accept regrid KGO changes from changes to NumPy scalar arithmetic dty…
trexfeathers Oct 15, 2024
eba1b20
Correct FF tuple printing.
trexfeathers Oct 15, 2024
b39cfd1
More changed NumPy scalar representations.
trexfeathers Oct 15, 2024
c7217eb
Accept more changes for NumPy scalar arithmetic dtypes.
trexfeathers Oct 15, 2024
664f523
Accept NIMROD changes from NumPy scalar arithmetic dtypes.
trexfeathers Oct 15, 2024
0e03672
Revert "Accept NIMROD changes from NumPy scalar arithmetic dtypes."
trexfeathers Oct 15, 2024
e976857
More consistent dtype casting in NIMROD load, plus minor change to kn…
trexfeathers Oct 15, 2024
038f14e
Accept more changes for NumPy scalar arithmetic dtypes.
trexfeathers Oct 15, 2024
47ed377
Accept more changes for NumPy scalar arithmetic dtypes.
trexfeathers Oct 15, 2024
ecae712
Demonstrate new mo_pack version.
trexfeathers Oct 15, 2024
f3ec8ae
More thorough use of float32 for mask testing.
trexfeathers Oct 16, 2024
98dd05e
Explicitly use the values of the pandas object in as_cube().
trexfeathers Oct 16, 2024
d3954f0
Cube summary: convert any np.str_ instances to plain strings.
trexfeathers Oct 16, 2024
70af012
Merge remote-tracking branch 'upstream/main' into numpy2
trexfeathers Oct 16, 2024
6455bfd
Unpin NumPy.
trexfeathers Oct 16, 2024
aa5cdff
Fix doctests.
trexfeathers Oct 16, 2024
99061d8
Fix dtype handling in iris.util.regular_points for NumPy v2.
trexfeathers Oct 23, 2024
48c73fc
Merge remote-tracking branch 'upstream/main' into numpy2
trexfeathers Oct 23, 2024
3a809f8
Update lock files.
trexfeathers Oct 23, 2024
8207e05
Undo over-zealous doctest changes.
trexfeathers Oct 23, 2024
f547662
Remove dask/dask#11433 workaround.
trexfeathers Oct 23, 2024
85410ba
What's New entry.
trexfeathers Oct 23, 2024
fdbbd37
Missed doctest changes from f54766298.
trexfeathers Oct 23, 2024
587f8ca
Revert "Undo over-zealous doctest changes." - valid since f54766298.
trexfeathers Oct 23, 2024
12 changes: 6 additions & 6 deletions docs/src/further_topics/metadata.rst
@@ -403,10 +403,10 @@ instances. Normally, this would cause issues. For example,

>>> simply = {"one": np.int32(1), "two": np.array([1.0, 2.0])}
>>> simply
-{'one': 1, 'two': array([1., 2.])}
+{'one': np.int32(1), 'two': array([1., 2.])}
>>> fruity = {"one": np.int32(1), "two": np.array([1.0, 2.0])}
>>> fruity
-{'one': 1, 'two': array([1., 2.])}
+{'one': np.int32(1), 'two': array([1., 2.])}
>>> simply == fruity
Traceback (most recent call last):
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
@@ -418,9 +418,9 @@ However, metadata class equality is rich enough to handle this eventuality,
>>> metadata1 = cube.metadata._replace(attributes=simply)
>>> metadata2 = cube.metadata._replace(attributes=fruity)
>>> metadata1
-CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'one': 1, 'two': array([1., 2.])}, cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),))
+CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'one': np.int32(1), 'two': array([1., 2.])}, cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),))
>>> metadata2
-CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'one': 1, 'two': array([1., 2.])}, cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),))
+CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'one': np.int32(1), 'two': array([1., 2.])}, cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),))

.. doctest:: richer-metadata

@@ -430,10 +430,10 @@ However, metadata class equality is rich enough to handle this eventuality,
.. doctest:: richer-metadata

>>> metadata1
-CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'one': 1, 'two': array([1., 2.])}, cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),))
+CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'one': np.int32(1), 'two': array([1., 2.])}, cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),))
>>> metadata2 = cube.metadata._replace(attributes={"one": np.int32(1), "two": np.array([1000.0, 2000.0])})
>>> metadata2
-CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'one': 1, 'two': array([1000., 2000.])}, cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),))
+CubeMetadata(standard_name='air_temperature', long_name=None, var_name='air_temperature', units=Unit('K'), attributes={'one': np.int32(1), 'two': array([1000., 2000.])}, cell_methods=(CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),))
>>> metadata1 == metadata2
False

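The doctest updates above track NumPy 2's new scalar `repr`, which now includes the type. A small illustrative sketch (not Iris code) of the behaviour, including NumPy 2's legacy printing mode that can restore the old style during a migration:

```python
import numpy as np

# str() of a NumPy scalar is unchanged; only repr() gained the type wrapper.
print(str(np.int32(1)))   # "1" on every NumPy version
print(repr(np.int32(1)))  # NumPy 2: "np.int32(1)"; NumPy 1: "1"

# NumPy 2 offers a legacy printing mode that restores the old scalar repr,
# which can help while migrating doctests incrementally:
if int(np.__version__.split(".")[0]) >= 2:
    np.set_printoptions(legacy="1.25")
    print(repr(np.int32(1)))  # back to "1"
    np.set_printoptions(legacy=False)
```

Note that `legacy="1.25"` affects printing only; dtypes and arithmetic are untouched.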
6 changes: 3 additions & 3 deletions docs/src/further_topics/ugrid/operations.rst
@@ -500,7 +500,7 @@ GeoVista :external+geovista:doc:`generated/gallery/index`.
Attributes:
NCO 'netCDF Operators version 4.7.5 (Homepage = http://nco.sf.net, Code = h ...'
history 'Mon Apr 12 01:44:41 2021: ncap2 -s synthetic=float(synthetic) mesh_C4_synthetic.nc ...'
-nco_openmp_thread_number 1
+nco_openmp_thread_number np.int32(1)
# Convert our mesh+data to a PolyData object.
>>> face_polydata = cube_to_polydata(sample_mesh_cube)
@@ -600,7 +600,7 @@ below:
Attributes:
NCO 'netCDF Operators version 4.7.5 (Homepage = http://nco.sf.net, Code = h ...'
history 'Mon Apr 12 01:44:41 2021: ncap2 -s synthetic=float(synthetic) mesh_C4_synthetic.nc ...'
-nco_openmp_thread_number 1
+nco_openmp_thread_number np.int32(1)
>>> regional_cube = extract_unstructured_region(
... cube=sample_mesh_cube,
@@ -619,7 +619,7 @@ below:
Attributes:
NCO 'netCDF Operators version 4.7.5 (Homepage = http://nco.sf.net, Code = h ...'
history 'Mon Apr 12 01:44:41 2021: ncap2 -s synthetic=float(synthetic) mesh_C4_synthetic.nc ...'
-nco_openmp_thread_number 1
+nco_openmp_thread_number np.int32(1)
Regridding
16 changes: 14 additions & 2 deletions docs/src/whatsnew/latest.rst
@@ -24,7 +24,8 @@ This document explains the changes made to Iris for this release
📢 Announcements
================

-#. N/A
+#. Iris is now compliant with NumPy v2. This may affect your scripts.
+   :ref:`See the full What's New entry for more details <numpy2>`.


✨ Features
@@ -92,7 +93,15 @@ This document explains the changes made to Iris for this release
🔗 Dependencies
===============

-#. N/A
+.. _numpy2:
+
+#. `@trexfeathers`_ adapted the Iris codebase to work with NumPy v2. The
+   `NumPy v2 full release notes`_ have the exhaustive details. Notable
+   changes that may affect your Iris scripts are below. (:pull:`6035`)
+
+   * `NumPy v2 changed data type promotion`_
+
+   * `NumPy v2 changed scalar printing`_


📚 Documentation
Expand Down Expand Up @@ -122,3 +131,6 @@ This document explains the changes made to Iris for this release
Whatsnew resources in alphabetical order:
.. _cartopy#2390: https://github.com/SciTools/cartopy/issues/2390
.. _NumPy v2 changed data type promotion: https://numpy.org/doc/stable/numpy_2_0_migration_guide.html#changes-to-numpy-data-type-promotion
.. _NumPy v2 changed scalar printing: https://numpy.org/doc/stable/release/2.0.0-notes.html#representation-of-numpy-scalars-changed
.. _NumPy v2 full release notes: https://numpy.org/doc/stable/release/2.0.0-notes.html
2 changes: 2 additions & 0 deletions lib/iris/_representation/cube_summary.py
@@ -47,6 +47,8 @@ def __init__(self, cube, name_padding=35):

def string_repr(text, quote_strings=False, clip_strings=False):
"""Produce a one-line printable form of a text string."""
# Convert any np.str_ instances to plain strings.
text = str(text)
force_quoted = re.findall("[\n\t]", text) or quote_strings
if force_quoted:
# Replace the string with its repr (including quotes).
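The one-line fix works because `np.str_` subclasses `str`, so `str()` yields an exact plain string. A sketch with an illustrative value:

```python
import numpy as np

# np.str_ subclasses str, so equality and hashing behave as usual, but
# NumPy 2 gives it a distinct repr (np.str_('...')) that leaked into
# cube summaries.
text = np.str_("air_temperature")
assert isinstance(text, str)
assert text == "air_temperature"

# Converting up front, as string_repr() now does, guarantees a plain repr:
plain = str(text)
assert type(plain) is str
assert repr(plain) == "'air_temperature'"
```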
22 changes: 15 additions & 7 deletions lib/iris/analysis/_regrid.py
@@ -156,22 +156,30 @@ def _src_align_and_flatten(coord):
#

# Wrap modular values (e.g. longitudes) if required.
-modulus = sx.units.modulus
+_modulus = sx.units.modulus
+# Convert to NumPy scalar to enable cast checking.
+modulus = np.min_scalar_type(_modulus).type(_modulus)

def _cast_sx_points(sx_points_: np.ndarray):
"""Ensure modulus arithmetic will not raise a TypeError."""
if not np.can_cast(modulus, sx_points_.dtype):
new_type = np.promote_types(sx_points_.dtype, modulus.dtype)
result = sx_points_.astype(new_type, casting="safe")
else:
result = sx_points_
return result

if modulus is not None:
# Match the source cube x coordinate range to the target grid
# cube x coordinate range.
min_sx, min_tx = np.min(sx.points), np.min(tx.points)
if min_sx < 0 and min_tx >= 0:
indices = np.where(sx_points < 0)
-# Ensure += doesn't raise a TypeError
-if not np.can_cast(modulus, sx_points.dtype):
-    sx_points = sx_points.astype(type(modulus), casting="safe")
+sx_points = _cast_sx_points(sx_points)
sx_points[indices] += modulus
elif min_sx >= 0 and min_tx < 0:
indices = np.where(sx_points > (modulus / 2))
-# Ensure -= doesn't raise a TypeError
-if not np.can_cast(modulus, sx_points.dtype):
-    sx_points = sx_points.astype(type(modulus), casting="safe")
+sx_points = _cast_sx_points(sx_points)
sx_points[indices] -= modulus

# Create target grid cube x and y cell boundaries.
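The `_cast_sx_points` helper exists because NumPy 2's `np.can_cast` no longer does value-based checks on Python scalars, so the modulus is first turned into a NumPy scalar via `np.min_scalar_type`. A standalone sketch of the pattern with illustrative values (the real code operates on coordinate points):

```python
import numpy as np

_modulus = 360.0  # e.g. degrees of longitude
# Convert to a NumPy scalar so can_cast() can inspect a real dtype.
modulus = np.min_scalar_type(_modulus).type(_modulus)  # np.float16(360.0)

points = np.array([10, 20, 350], dtype=np.int32)

# Promote only when the modulus cannot be cast to the points' dtype:
if not np.can_cast(modulus, points.dtype):
    new_type = np.promote_types(points.dtype, modulus.dtype)
    points = points.astype(new_type, casting="safe")

# Modular arithmetic can no longer raise a TypeError:
points[points > modulus / 2] -= modulus
```

Here `np.promote_types(int32, float16)` yields float64, since neither float16 nor float32 can represent every int32 exactly.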
2 changes: 1 addition & 1 deletion lib/iris/analysis/maths.py
@@ -867,7 +867,7 @@ def _binary_op_common(
if iris._lazy_data.is_lazy_data(other):
rhs = other
else:
-rhs = np.asanyarray(other)
+rhs = np.asanyarray(other, dtype=new_dtype)

def unary_func(lhs):
data = operation_function(lhs, rhs)
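Pinning the rhs dtype matters because NumPy 2 adopts NEP 50: promotion ignores scalar values, so a NumPy float64 scalar now drags a float32 operand up to float64. A sketch with illustrative arrays, not Iris objects:

```python
import numpy as np

arr = np.array([1.0, 2.0], dtype=np.float32)

# Python scalars are "weak" under NEP 50, so float32 is preserved
# on every NumPy version:
assert (arr + 3.0).dtype == np.float32

# A NumPy float64 scalar wins under NumPy 2, where NumPy 1's
# value-based casting used to keep float32:
result = arr + np.float64(3.0)
expected = np.float64 if int(np.__version__.split(".")[0]) >= 2 else np.float32
assert result.dtype == expected

# Pinning the dtype up front, as the maths change does for the rhs:
rhs = np.asanyarray(np.float64(3.0), dtype=np.float32)
assert (arr + rhs).dtype == np.float32
```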
6 changes: 3 additions & 3 deletions lib/iris/common/resolve.py
@@ -2358,16 +2358,16 @@ def cube(self, data, in_place=False):
>>> resolver.map_rhs_to_lhs
True
>>> cube1.data.sum()
-124652160.0
+np.float32(124652160.0)
>>> zeros.shape
(240, 37, 49)
>>> zeros.sum()
-0.0
+np.float32(0.0)
>>> result = resolver.cube(zeros, in_place=True)
>>> result is cube1
True
>>> cube1.data.sum()
-0.0
+np.float32(0.0)
"""
from iris.cube import Cube
4 changes: 2 additions & 2 deletions lib/iris/coords.py
@@ -2431,9 +2431,9 @@ def nearest_neighbour_index(self, point):
>>> cube = iris.load_cube(iris.sample_data_path('ostia_monthly.nc'))
>>> cube.coord('latitude').nearest_neighbour_index(0)
-9
+np.int64(9)
>>> cube.coord('longitude').nearest_neighbour_index(10)
-12
+np.int64(12)
.. note:: If the coordinate contains bounds, these will be used to
determine the nearest neighbour instead of the point values.
16 changes: 15 additions & 1 deletion lib/iris/fileformats/_ff.py
@@ -5,6 +5,7 @@
"""Provides UK Met Office Fields File (FF) format specific capabilities."""

import os
from typing import Any
import warnings

import numpy as np
@@ -370,9 +371,22 @@ def __init__(self, filename, word_depth=DEFAULT_FF_WORD_DEPTH):
setattr(self, elem, res)

def __str__(self):
def _str_tuple(to_print: Any):
"""Print NumPy scalars within tuples as numbers, not np objects.

E.g. ``lookup_table`` is a tuple of NumPy scalars.

NumPy v2 by default prints ``np.int32(1)`` instead of ``1`` when
printing an iterable of scalars.
"""
if isinstance(to_print, tuple):
result = "(" + ", ".join([str(i) for i in to_print]) + ")"
else:
result = str(to_print)
return result

attributes = []
for name, _ in FF_HEADER:
-attributes.append(" {}: {}".format(name, getattr(self, name)))
+attributes.append(f" {name}: {_str_tuple(getattr(self, name))}")
return "FF Header:\n" + "\n".join(attributes)

def __repr__(self):
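The `_str_tuple` helper is needed because tuples format their elements with `repr()`, and only `repr()` of NumPy scalars changed in v2; `str()` is unchanged. A sketch with illustrative values:

```python
import numpy as np

lookup_table = (np.int32(1), np.int32(2), np.int32(3))

# Under NumPy 2, str(lookup_table) reads
# "(np.int32(1), np.int32(2), np.int32(3))" because tuple formatting
# uses repr() of the elements. Joining their str() forms restores the
# plain numbers, which is exactly what _str_tuple() does:
plain = "(" + ", ".join(str(i) for i in lookup_table) + ")"
assert plain == "(1, 2, 3)"
```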
8 changes: 6 additions & 2 deletions lib/iris/fileformats/netcdf/saver.py
@@ -1387,9 +1387,13 @@ def _ensure_valid_dtype(self, values, src_name, src_object):
val_min, val_max = (values.min(), values.max())
if is_lazy_data(values):
val_min, val_max = _co_realise_lazy_arrays([val_min, val_max])
# NumPy will inherit values.dtype even if the scalar numbers work
# with a smaller type.
min_dtype = np.promote_types(
*[np.min_scalar_type(m) for m in (val_min, val_max)]
)
# Cast to an integer type supported by netCDF3.
-can_cast = all([np.can_cast(m, np.int32) for m in (val_min, val_max)])
-if not can_cast:
+if not np.can_cast(min_dtype, np.int32):
msg = (
"The data type of {} {!r} is not supported by {} and"
" its values cannot be safely cast to a supported"
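The switch to `np.min_scalar_type` is needed because NumPy 2's `np.can_cast` compares dtypes only, so extremes drawn from an int64 array would be rejected even when the values fit int32. Shrinking each extreme to its smallest holding dtype restores a value-aware check. A sketch with illustrative values:

```python
import numpy as np

values = np.array([0, 30000], dtype=np.int64)
val_min, val_max = values.min(), values.max()

# The extremes inherit int64 from the array, so a plain
# can_cast(int64, int32) check would wrongly fail under NumPy 2.
# Shrink to the smallest dtype that holds each extreme, then promote:
min_dtype = np.promote_types(*[np.min_scalar_type(m) for m in (val_min, val_max)])
assert min_dtype == np.uint16
assert np.can_cast(min_dtype, np.int32)

# A genuinely oversized value still fails the check, as it should:
assert not np.can_cast(np.min_scalar_type(2**40), np.int32)
```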
27 changes: 17 additions & 10 deletions lib/iris/fileformats/nimrod_load_rules.py
@@ -126,42 +126,41 @@ def units(cube, field):
"n/a": "1",
}

dtype_original = cube.dtype
field_units = remove_unprintable_chars(field.units)
if field_units == "m/2-25k":
# Handle strange visibility units
-cube.data = (cube.data.astype(np.float32) + 25000.0) * 2
+cube.data = (cube.data + 25000.0) * 2
field_units = "m"
if "*" in field_units:
# Split into unit string and integer
unit_list = field_units.split("*")
if "^" in unit_list[1]:
# Split out magnitude
unit_sublist = unit_list[1].split("^")
-cube.data = cube.data.astype(np.float32) / float(unit_sublist[0]) ** float(
-    unit_sublist[1]
-)
+cube.data = cube.data / float(unit_sublist[0]) ** float(unit_sublist[1])
else:
-cube.data = cube.data.astype(np.float32) / float(unit_list[1])
+cube.data = cube.data / float(unit_list[1])
field_units = unit_list[0]
if "ug/m3E1" in field_units:
# Split into unit string and integer
unit_list = field_units.split("E")
-cube.data = cube.data.astype(np.float32) / 10.0
+cube.data = cube.data / 10.0
field_units = unit_list[0]
if field_units == "%":
# Convert any percentages into fraction
field_units = "1"
-cube.data = cube.data.astype(np.float32) / 100.0
+cube.data = cube.data / 100.0
if field_units == "oktas":
field_units = "1"
-cube.data = cube.data.astype(np.float32) / 8.0
+cube.data = cube.data / 8.0
if field_units == "dBZ":
# cf_units doesn't recognise decibels (dBZ), but does know BZ
field_units = "BZ"
-cube.data = cube.data.astype(np.float32) / 10.0
+cube.data = cube.data / 10.0
if field_units == "g/Kg":
field_units = "kg/kg"
-cube.data = cube.data.astype(np.float32) / 1000.0
+cube.data = cube.data / 1000.0
if not field_units:
if field.field_code == 8:
# Relative Humidity data are unitless, but not "unknown"
@@ -175,6 +174,14 @@ def units(cube, field):
# Deal with the case where the units are of the form '/unit' eg
# '/second' in the Nimrod file. This converts to the form unit^-1
field_units = field_units[1:] + "^-1"

if cube.dtype != dtype_original:
# Original development logic: if any arithmetic takes place, ensure
# the data type is float32 (starts as an int). Unknown why.
# Automatic casting is returning inconsistent types when masks are
# involved, so the new logic is to do the casting as the final step.
cube.data = cube.data.astype(np.float32)

try:
cube.units = field_units
except ValueError:
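The "cast as the final step" approach above can be pictured with a toy masked "oktas" field (illustrative values, not real NIMROD data):

```python
import numpy as np

# Synthetic oktas field: integer data with a mask, as loaded from NIMROD.
data = np.ma.masked_array([0, 4, 8], mask=[False, False, True])
dtype_original = data.dtype

data = data / 8.0  # convert oktas to a fraction; promotes to float64

# Final-step cast: only if arithmetic changed the dtype, pin it to
# float32, mirroring the new logic in units().
if data.dtype != dtype_original:
    data = data.astype(np.float32)

assert data.dtype == np.float32
```

Doing a single cast at the end avoids the inconsistent intermediate types that the scattered `astype(np.float32)` calls produced once masks were involved.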
23 changes: 19 additions & 4 deletions lib/iris/fileformats/pp.py
@@ -11,6 +11,7 @@
import os
import re
import struct
from typing import Any
import warnings

import cf_units
@@ -950,6 +951,21 @@ def t2(self):

def __repr__(self):
"""Return a string representation of the PP field."""

def _str_tuple(to_print: Any):
"""Print NumPy scalars within tuples as numbers, not np objects.

E.g. ``lbuser`` is a tuple of NumPy scalars.

NumPy v2 by default prints ``np.int32(1)`` instead of ``1`` when
printing an iterable of scalars.
"""
if isinstance(to_print, tuple):
result = "(" + ", ".join([str(i) for i in to_print]) + ")"
else:
result = str(to_print)
return result

# Define an ordering on the basic header names
attribute_priority_lookup = {name: loc[0] for name, loc in self.HEADER_DEFN}

@@ -975,9 +991,8 @@ def __repr__(self):
),
)

-return (
-    "PP Field" + "".join(["\n %s: %s" % (k, v) for k, v in attributes]) + "\n"
-)
+contents = "".join([f"\n {k}: {_str_tuple(v)}" for k, v in attributes])
+return f"PP Field{contents}\n"

@property
def stash(self):
@@ -1178,7 +1193,7 @@ def save(self, file_handle):
data.dtype = data.dtype.newbyteorder(">")

# Create the arrays which will hold the header information
-lb = np.empty(shape=NUM_LONG_HEADERS, dtype=np.dtype(">u%d" % PP_WORD_DEPTH))
+lb = np.empty(shape=NUM_LONG_HEADERS, dtype=np.dtype(">i%d" % PP_WORD_DEPTH))
b = np.empty(shape=NUM_FLOAT_HEADERS, dtype=np.dtype(">f%d" % PP_WORD_DEPTH))

# Fill in the header elements from the PPField
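The `>u8` to `>i8` change matters because LB headers may legitimately hold negative values (UMDP F3 specifies signed integers), and NumPy 2 rejects assigning a negative Python int to an unsigned array rather than wrapping it. A minimal sketch; the -99 is an illustrative value, not a real header entry:

```python
import numpy as np

PP_WORD_DEPTH = 8  # bytes per PP word

# Signed big-endian ints can hold negative header values, which the
# previous ">u8" dtype could not represent.
lb = np.empty(shape=3, dtype=np.dtype(">i%d" % PP_WORD_DEPTH))
lb[:] = (1, -99, 4)

assert lb.dtype == np.dtype(">i8")
assert lb[1] == -99
```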
4 changes: 2 additions & 2 deletions lib/iris/pandas.py
@@ -173,7 +173,7 @@ def as_cube(
# 1.6 doesn't. Since we don't care about preserving the order we can
# just force it back to C-order.)
order = "C" if copy else "A"
-data = np.array(pandas_array, copy=copy, order=order)
+data = np.array(pandas_array.values, copy=copy, order=order)
cube = Cube(np.ma.masked_invalid(data, copy=False))
_add_iris_coord(cube, "index", pandas_array.index, 0, calendars.get(0, None))
if pandas_array.ndim == 2:
@@ -308,7 +308,7 @@ def as_cubes(
Pandas uses ``NaN`` rather than masking data. Converted
:class:`~iris.cube.Cube` can be masked in downstream user code :

->>> my_series = Series([300, np.NaN, 302], name="air_temperature")
+>>> my_series = Series([300, np.nan, 302], name="air_temperature")
>>> converted_cube = as_cubes(my_series)[0]
>>> print(converted_cube.data)
[300. nan 302.]
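Two small NumPy 2 compatibilities appear in this file: `np.NaN` is a removed alias (only `np.nan` survives), and masking of NaN-based missing values then proceeds as before. A sketch without pandas, using illustrative values:

```python
import numpy as np

# np.NaN was an alias that NumPy 2 removed; np.nan works everywhere.
assert np.isnan(np.nan)

# NaN-based missing values convert to a mask, mirroring what
# as_cube()/as_cubes() do via np.ma.masked_invalid:
data = np.array([300.0, np.nan, 302.0])
masked = np.ma.masked_invalid(data, copy=False)
assert masked.mask.tolist() == [False, True, False]
assert masked.sum() == 602.0
```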
@@ -319,7 +319,7 @@ def test_regrid_reorder_axis(self):
dest = _resampled_grid(self.realistic_cube[0, 0, :3, :2], 3, 3)
res = regrid_area_weighted(src, dest)
self.assertArrayShapeStats(src, (4, 3, 2), 288.08868, 0.008262919)
-self.assertArrayShapeStats(res, (4, 9, 6), 288.08865, 0.00826281)
+self.assertArrayShapeStats(res, (4, 9, 6), 288.0886, 0.008271061)
# Reshape src so that the coords are ordered [x, z, y],
# the mean and std statistics should be the same
data = np.moveaxis(src.data.copy(), 2, 0)
Expand All @@ -329,7 +329,7 @@ def test_regrid_reorder_axis(self):
src.add_dim_coord(lon, 0)
res = regrid_area_weighted(src, dest)
self.assertArrayShapeStats(src, (2, 4, 3), 288.08868, 0.008262919)
-self.assertArrayShapeStats(res, (6, 4, 9), 288.08865, 0.00826281)
+self.assertArrayShapeStats(res, (6, 4, 9), 288.0886, 0.008271061)
# Reshape src so that the coords are ordered [y, x, z],
# the mean and std statistics should be the same
data = np.moveaxis(src.data.copy(), 2, 0)
@@ -340,7 +340,7 @@
dest = _resampled_grid(self.realistic_cube[0, 0, :3, :2], 3, 3)
res = regrid_area_weighted(src, dest)
self.assertArrayShapeStats(src, (3, 2, 4), 288.08868, 0.008262919)
-self.assertArrayShapeStats(res, (9, 6, 4), 288.08865, 0.00826281)
+self.assertArrayShapeStats(res, (9, 6, 4), 288.0886, 0.008271061)

def test_regrid_lon_to_half_res(self):
src = self.simple_cube