This repository has been archived by the owner on Oct 24, 2024. It is now read-only.

Indexing tree should create new tree #77

Closed
TomNicholas opened this issue Apr 22, 2022 · 2 comments
Labels: bug (Something isn't working), enhancement (New feature or request)

Comments

@TomNicholas

Inspired by this example in the stackstac documentation

lowcloud = stack[stack["eo:cloud_cover"] < 20]

we should ensure that you can index a datatree with another (isomorphic) datatree, so that the above operation would work even if stack is a DataTree instance.

This is another map_over_subtree-type operation, but it needs careful testing because the __getitem__ function in xarray objects already does so many different things. It won't work with the code as-is, because at the moment DataTree naively dispatches the __getitem__ call down to the wrapped dataset:

return self.ds[key]
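
To make the problem concrete, here is a minimal sketch of that naive dispatch. It uses a hypothetical ToyNode class with a plain dict standing in for the wrapped dataset, so it is not the datatree implementation, and a real xarray Dataset would likely reject a tree key with a different error. It only illustrates that a variable name is resolved on one node alone, and that a tree passed as the key is handed straight to the wrapped object.

```python
class ToyNode:
    """Hypothetical stand-in for a tree node wrapping one dataset-like mapping."""

    def __init__(self, ds, children=None):
        self.ds = ds                    # wrapped "dataset": variable name -> values
        self.children = children or {}  # child name -> ToyNode

    def __getitem__(self, key):
        # Naive dispatch: every key goes straight to the wrapped dataset.
        return self.ds[key]


root = ToyNode(
    {"eo:cloud_cover": [5, 40]},
    children={"a": ToyNode({"eo:cloud_cover": [10]})},
)

# A variable name works, but only for this node; the child "a" is ignored.
print(root["eo:cloud_cover"])

# A tree used as the key is passed through unchanged, and the plain dict
# used here (an assumption of this sketch) rejects it with a KeyError.
try:
    root[root]
    failed = False
except KeyError:
    failed = True
print(failed)
```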

@TomNicholas added the bug and enhancement labels on Apr 22, 2022
@TomNicholas

TomNicholas commented Apr 22, 2022

To clarify, in order for this to work several things need to happen:

  1. stack["eo:cloud_cover"] needs to realise that "eo:cloud_cover" is neither a tree nor a group in the tree, but a variable name. Then it needs to select the "eo:cloud_cover" variable from all nodes in the subtree, and return a tree containing only those variables. That in itself requires something like Ignore missing dims when mapping over tree #67, but ignoring nodes for which that variable is not present, at least for deeply-nested trees...
  2. stack["eo:cloud_cover"] < 20 needs to perform this comparison node-wise, returning a tree of results (hopefully this should already work...).
  3. stack[stack["eo:cloud_cover"] < 20] needs to use the tree passed in to perform a node-wise indexing operation, returning a new tree. (Or we could just use .where.)
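
The three steps above can be sketched on a toy tree. This is only an illustration under stated assumptions: the tree is modelled as a plain dict mapping node paths to dicts of variable name to list of values, and the helpers select_variable, less_than and index_by_mask are hypothetical names, not part of datatree or xarray. Dropping nodes that have no mask is also an assumption of this sketch, not a decision made in this issue.

```python
# Toy stand-in for a tree: node path -> {variable name -> list of values}.
# This is NOT the datatree API; all helper names here are hypothetical.


def select_variable(tree, name):
    """Step 1: keep only `name` in each node, skipping nodes that lack it."""
    return {path: {name: node[name]} for path, node in tree.items() if name in node}


def less_than(tree, threshold):
    """Step 2: node-wise comparison, returning a tree of boolean masks."""
    return {
        path: {var: [v < threshold for v in values] for var, values in node.items()}
        for path, node in tree.items()
    }


def index_by_mask(tree, mask_tree, mask_var):
    """Step 3: node-wise boolean indexing, returning a new tree."""
    result = {}
    for path, node in tree.items():
        if path not in mask_tree:
            continue  # assumption: nodes without a mask are dropped
        mask = mask_tree[path][mask_var]
        result[path] = {
            var: [v for v, keep in zip(values, mask) if keep]
            for var, values in node.items()
        }
    return result


stack = {
    "/a": {"eo:cloud_cover": [5, 40, 12], "red": [0.1, 0.2, 0.3]},
    "/b": {"eo:cloud_cover": [25, 10], "red": [0.4, 0.5]},
    "/meta": {"note": ["no cloud cover here"]},
}

cover = select_variable(stack, "eo:cloud_cover")          # step 1
mask = less_than(cover, 20)                               # step 2
lowcloud = index_by_mask(stack, mask, "eo:cloud_cover")   # step 3
print(lowcloud)
```

Each helper returns a new dict rather than mutating its input, which mirrors the issue title: indexing should create a new tree.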

Basically this is a really complicated usage example because it uses multiple different code-paths within __getitem__ sequentially within one line of user code 😅

@TomNicholas

Closing in favour of pydata/xarray#9341
