Slightly improve DataTree repr #9064
This is ready for another review, after switching the repr of each node from There are two alternatives that are worth considering:
I think this is a little less readable, but maybe is a good idea if we think people will mostly be using very heavily nested trees.
I like the look of |
With this change there is now no information about the
This seems okay.
I tried to avoid using the word "Group" because now we have 3 words for the same thing: "node", "group", and arguably Displaying the full path in each node might quickly get unwieldy - e.g. |
This is one reason why I added the full path -- it shows the parent as an integrated part of the repr. But if we go for printing only group names, we could also add a line showing the full path for the root only.
I think "DataTree" and "Group" are separate concepts. When you access a coordinate from a |
I like the idea of having the full path to the root visible somewhere. It's more useful than just
Okay I'm convinced! 😁 I do wonder then whether we should try to replace occurrences of "node" with "group" in the docs... |
Let's ask in the new DataTree meeting for opinions on only showing the name vs. the full path on each group. |
Regarding displaying the group name vs full group name, here is an example: relative
absolute
Regarding my specific case, I don't really have an opinion on the default. Users with long paths would surely be annoyed by absolute paths, but having the complete, copy-pastable path probably has more value in everyday use. |
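To make the relative vs. absolute trade-off concrete, here is a minimal sketch in plain Python. The `Node` class and `render` function are hypothetical stand-ins written for this illustration; they are not xarray's `DataTree` or its repr code, and the exact layout of the real repr may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a tree node; not xarray's DataTree."""

    name: str
    children: list[Node] = field(default_factory=list)


def render(node: Node, absolute: bool, parent_path: str = "", depth: int = 0) -> list[str]:
    """Return one 'Group: ...' line per node, visiting the tree depth-first."""
    # The root is labelled "/" in both modes.
    if depth == 0:
        path = "/"
    else:
        path = parent_path.rstrip("/") + "/" + node.name
    # Relative mode prints only the node's own name below the root.
    label = path if absolute or depth == 0 else node.name
    lines = ["    " * depth + f"Group: {label}"]
    for child in node.children:
        lines.extend(render(child, absolute, path, depth + 1))
    return lines


tree = Node("root", [Node("coarse", [Node("temperature")]), Node("fine")])
relative_lines = render(tree, absolute=False)
absolute_lines = render(tree, absolute=True)
print("\n".join(relative_lines))
print("\n".join(absolute_lines))
```

In the absolute variant every label can be copied and used as a path on its own, at the cost of longer lines for deeply nested trees.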
Looks great. Quick question, probably also to @TomNicholas: is the term "Group" the word we are likely to settle on terminology-wise? (There was discussion of settling on one term for particular concepts in our call this week) I'm not sure if we wanted to decide that now, or if it can wait?
Also noting that we agreed in the same call on the full path being listed for each group/node/subtree, as this PR does.
I think "group" is the right terminology for the data in a node and its children. It's the same name already used by netCDF, HDF5 and Zarr. @TomNicholas can you give this a formal review in GitHub? I'd love to get this merged. |
Looks great @shoyer. Only thing you could add is a test that checks that the group paths work properly in the case that the node you're printing is not the root of the tree (also a whatsnew entry 😉).
There is a `.groups` method which returns the whole path, which is consistent with the usage of the term in the repr here.
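As a rough illustration of what "the whole path" means here, the sketch below collects the absolute path of every group in a nested dict. The `group_paths` helper and the dict layout are assumptions made up for this example; it mimics the idea only and is not the xarray implementation of `.groups`.

```python
from __future__ import annotations


def group_paths(tree: dict, prefix: str = "/") -> tuple[str, ...]:
    """Collect the absolute path of every group in a nested dict (hypothetical helper)."""
    paths = [prefix]
    for name, subtree in tree.items():
        # Avoid a double slash when the parent is the root "/".
        child = prefix.rstrip("/") + "/" + name
        paths.extend(group_paths(subtree, child))
    return tuple(paths)


# Each key is a group name; each value holds that group's child groups.
nested = {"coarse": {"temperature": {}}, "fine": {}}
paths = group_paths(nested)
print(paths)
```

Each entry is a full path from the root, matching the labels an absolute-path repr would print for the same groups.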
There's a test for this already inside
I'm going to skip this. DataTree is not a released feature so I don't think users need to know about updates to it. |
* main:
  promote floating-point numeric datetimes to 64-bit before decoding (pydata#9182)
  also pin `numpy` in the all-but-dask CI (pydata#9184)
  temporarily remove `pydap` from CI (pydata#9183)
  temporarily pin `numpy<2` (pydata#9181)
  Change np.core.defchararray to np.char (pydata#9165) (pydata#9166)
  Fix example code formatting for CachingFileManager (pydata#9178)
  Slightly improve DataTree repr (pydata#9064)
  switch to unit `"D"` (pydata#9170)
  Docs: Add page with figure for navigating help resources (pydata#9147)
  Add test for pydata#9155 (pydata#9161)
  Remove mypy exclusions for a couple more libraries (pydata#9160)
  Include numbagg in type checks (pydata#9159)
  Improve zarr chunks docs (pydata#9140)
The DataTree header is now `<xarray.DataTree>` or `<xarray.DataTree 'name'>` for consistency with the Dataset and DataArray reprs.

The full path is now included for each DataTree node, rather than printing `name` and `parent` separately.

Before:
After: