Fix DataTree.coords.__setitem__ by adding DataTreeCoordinates class #9451

Merged on Sep 11, 2024 (50 commits)
Showing changes from 1 commit
704db79
add a DataTreeCoordinates class
TomNicholas Sep 8, 2024
417e3e9
passing read-only properties tests
TomNicholas Sep 8, 2024
9562e92
tests for modifying in-place
TomNicholas Sep 8, 2024
0e7de82
WIP making the modification test pass
TomNicholas Sep 8, 2024
839858f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 8, 2024
9370b9b
get to the delete tests
TomNicholas Sep 8, 2024
9b50567
test
TomNicholas Sep 8, 2024
c466f8d
improve error message
TomNicholas Sep 8, 2024
0397eca
implement delitem
TomNicholas Sep 8, 2024
85bb221
test KeyError
TomNicholas Sep 8, 2024
7802c63
Merge branch 'delitem' into datatree_coords_setitem
TomNicholas Sep 8, 2024
1bf5082
subclass Coordinates instead of DatasetCoordinates
TomNicholas Sep 8, 2024
e8620cf
use Frozen(self._data._coord_variables)
TomNicholas Sep 8, 2024
1108504
Simplify when to raise KeyError
TomNicholas Sep 8, 2024
0a7201b
correct bug in suggestion
TomNicholas Sep 8, 2024
51e11bc
Update xarray/core/coordinates.py
TomNicholas Sep 8, 2024
7ecdd16
simplify _update_coords by creating new node data first
TomNicholas Sep 8, 2024
dfcdb6d
Merge branch 'main' into datatree_coords_setitem
TomNicholas Sep 9, 2024
3278153
Merge branch 'main' into datatree_coords_setitem
TomNicholas Sep 9, 2024
f672c5e
update indexes correctly
TomNicholas Sep 9, 2024
7fb1622
passes test
TomNicholas Sep 9, 2024
897b7c4
update ._drop_indexed_coords
TomNicholas Sep 9, 2024
b5a56f4
Merge branch 'main' into datatree_coords_setitem
TomNicholas Sep 9, 2024
fdae5bc
some mypy fixes
TomNicholas Sep 10, 2024
9dc845a
remove the apparently-unused _drop_indexed_coords method
TomNicholas Sep 10, 2024
6595fe9
Merge branch 'datatree_coords_setitem' of https://github.com/TomNicho…
TomNicholas Sep 10, 2024
ed87554
fix import error
TomNicholas Sep 10, 2024
c155bc1
test that Dataset and DataArray constructors can handle being passed …
TomNicholas Sep 10, 2024
217cb84
test dt.coords can be passed to DataTree constructor
TomNicholas Sep 10, 2024
540bb0f
improve readability of inline comment
TomNicholas Sep 10, 2024
7126efa
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 10, 2024
8486227
initial tests with inherited coords
TomNicholas Sep 10, 2024
12f24df
Merge branch 'datatree_coords_setitem' of https://github.com/TomNicho…
TomNicholas Sep 10, 2024
8f09c93
ignore typeerror indicating dodgy inheritance
TomNicholas Sep 11, 2024
d23105f
try to avoid Unbound type error
TomNicholas Sep 11, 2024
978e05e
cast return value correctly
TomNicholas Sep 11, 2024
bd47575
cehck that .coords works with inherited coords
TomNicholas Sep 11, 2024
8ef94df
Merge branch 'main' into datatree_coords_setitem
TomNicholas Sep 11, 2024
10b8a78
Merge branch 'main' into datatree_coords_setitem
TomNicholas Sep 11, 2024
b9ede22
fix data->dataset
TomNicholas Sep 11, 2024
540a825
fix return type of __getitem__
TomNicholas Sep 11, 2024
b30d5e0
Use .dataset instead of .to_dataset()
TomNicholas Sep 11, 2024
639ad07
_check_alignment -> check_alignment
TomNicholas Sep 11, 2024
0a9a328
remove dict comprehension
TomNicholas Sep 11, 2024
80bc0bd
KeyError message formatting
TomNicholas Sep 11, 2024
a366bf6
keep generic types for .dims and .sizes
TomNicholas Sep 11, 2024
4d352bd
test verifying you cant delete inherited coord
TomNicholas Sep 11, 2024
4626fa8
fix mypy complaint
TomNicholas Sep 11, 2024
ea8a195
type hint as accepting objects
TomNicholas Sep 11, 2024
af94af4
update note about .dims returning all dims
TomNicholas Sep 11, 2024
simplify _update_coords by creating new node data first
TomNicholas committed Sep 8, 2024
commit 7ecdd16018b8b7b75f94ad531112b9572d165ba4
36 changes: 17 additions & 19 deletions xarray/core/coordinates.py
@@ -860,29 +860,27 @@ def _update_coords(
         # TODO I don't know how to update coordinates that live in parent nodes
Member
I think it is OK for this to be an error. The user can replace those coordinates on the parent nodes.

         # TODO We would have to find the correct node and update `._node_coord_variables`

-        coord_variables = self._data._coord_variables.copy()
-        coord_variables.update(coords)
+        # create updated node (`.to_dataset` makes a copy so this doesn't modify in-place)
+        node_ds = self._data.to_dataset(inherited=False)
+        node_ds.coords._update_coords(coords, indexes)

-        # check for inconsistent state *before* modifying anything in-place
-        variables = coord_variables | self._data._data_variables.copy()
-        # TODO is there a subtlety here with rebuild_dims?
-        dims = calculate_dimensions(variables)
-        new_coord_names = set(coords)
-        for dim, size in dims.items():
-            if dim in variables:
-                new_coord_names.add(dim)
+        from xarray.core.datatree import _check_alignment

-        # TODO we need to upgrade these variables to coord variables somehow
-        # coord_variables.update(new_coord_names)
+        # check consistency *before* modifying anything in-place
+        # TODO can we clean up the signature of _check_alignment to make this less awkward?
+        if self._data.parent is not None:
+            parent_ds = self._data.parent._to_dataset_view(rebuild_dims=False)
+        else:
+            parent_ds = None
+        _check_alignment(self._data.path, node_ds, parent_ds, self._data.children)

         # assign updated attributes
+        coord_variables = {
+            k: v for k, v in node_ds.variables.items() if k in node_ds._coord_names
+        }
         self._data._node_coord_variables = coord_variables
-        self._data._node_dims = dims
-
-        # TODO(shoyer): once ._indexes is always populated by a dict, modify
-        # it to update inplace instead.
-        original_indexes = dict(self._data.xindexes)
-        original_indexes.update(indexes)
-        self._data._node_indexes = original_indexes
+        self._data._node_dims = node_ds._dims
+        self._data._indexes = node_ds._indexes

     def _drop_coords(self, coord_names):
         # should drop indexed coordinates only
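
The commit's approach (build the updated node data first, check consistency, and only then assign attributes in place) can be sketched in plain Python. This is a toy model under stated assumptions, not xarray's implementation: `ToyNode`, `check_alignment`, and `update_coords` are hypothetical names, and the alignment rule used here (a coordinate shared with an ancestor must have equal values) is a simplification chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Hypothetical stand-in for a tree node; not xarray's DataTree."""

    coords: dict[str, list] = field(default_factory=dict)
    parent: ToyNode | None = None

    def inherited_coords(self) -> dict[str, list]:
        # Collect coordinates from all ancestors; nearer ancestors take precedence.
        merged: dict[str, list] = {}
        node = self.parent
        while node is not None:
            for name, values in node.coords.items():
                merged.setdefault(name, values)
            node = node.parent
        return merged


def check_alignment(candidate: dict[str, list], inherited: dict[str, list]) -> None:
    # Toy rule (an assumption): a coordinate that also exists on an ancestor
    # must hold exactly the same values.
    for name, values in candidate.items():
        if name in inherited and inherited[name] != values:
            raise ValueError(f"coordinate {name!r} is not aligned with its parents")


def update_coords(node: ToyNode, new_coords: dict[str, list]) -> None:
    # 1. Build the updated state on a copy, leaving the node untouched.
    candidate = {**node.coords, **new_coords}
    # 2. Validate the copy; an exception here leaves the node unchanged.
    check_alignment(candidate, node.inherited_coords())
    # 3. Only now assign the validated state back onto the node.
    node.coords = candidate
```

Because validation runs against the copy, a failed update raises before any attribute of the node is reassigned, so the node never ends up in a half-updated state.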

Unchanged files with check annotations

dt["child"] = DataTree()
actual = dt.copy(deep=True)
actual.coords["x"] = ("x", ["a", "b"])

Check failure on line 611 in xarray/tests/test_datatree.py

Reported with the same message by these GitHub Actions jobs:
- ubuntu-latest py3.10 bare-minimum
- macos-latest py3.10
- ubuntu-latest py3.11 all-but-dask
- ubuntu-latest py3.10 min-all-deps
- macos-latest py3.12
- ubuntu-latest py3.10
- ubuntu-latest py3.12

TestCoordsInterface.test_modify
ValueError: group '/child' is not aligned with its parents:
Group:
    Dimensions:  (x: 2, y: 3)
    Coordinates:
      * x        (x) int64 16B -1 -2
      * y        (y) int64 24B 0 1 2
        a        (x) int64 16B 4 5
        b        int64 8B -10
    Data variables:
        *empty*
From parents:
    Dimensions:  (x: 2, y: 3)
    Coordinates:
      * y        (y) int64 24B 0 1 2
        a        (x) int64 16B 4 5
        b        int64 8B -10
      * x        (x) <U1 8B 'a' 'b'
assert_array_equal(actual["x"], ["a", "b"])
actual = dt.copy(deep=True)
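
The failure message reports that group '/child' holds `x = [-1, -2]` while its parents hold `x = ['a', 'b']`. The kind of comparison that produces such a message can be sketched in plain Python. This is only an illustration: `find_misaligned_groups` is a hypothetical helper, plain dicts stand in for the groups, and equality of shared coordinate values is assumed as the alignment rule. It does not reproduce xarray's actual check.

```python
def find_misaligned_groups(
    parent_coords: dict[str, list],
    children: dict[str, dict[str, list]],
) -> list[str]:
    """Return paths of child groups whose shared coordinates differ from the parent's."""
    misaligned = []
    for name, child_coords in children.items():
        # Only coordinates present on both sides can conflict.
        shared = parent_coords.keys() & child_coords.keys()
        if any(parent_coords[key] != child_coords[key] for key in shared):
            misaligned.append(f"/{name}")
    return misaligned


# Values taken from the failure message: the child group holds x = [-1, -2]
# while the updated parent holds x = ['a', 'b']; y agrees on both sides.
parent = {"x": ["a", "b"], "y": [0, 1, 2]}
children = {"child": {"x": [-1, -2], "y": [0, 1, 2]}}

for path in find_misaligned_groups(parent, children):
    print(f"group {path!r} is not aligned with its parents")
```

In this toy setup the conflict comes entirely from `x`: the `y` values match, so replacing the child's `x` with the parent's values would make the check pass.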