Consistency between __contains__ and __getitem__ #240
Comments
Hi @djhoese - this is a neat idea that seems reasonable to me! Implementing this for checking for child nodes might be as simple as changing this line to read something like return key in self.variables or child.path for child in self.descendants but I think a slightly more complicated iteration would be needed to also return variables within child nodes. Another gotcha would be ensuring this works for relative paths (i.e. if the node you were searching from in your example was not the root node of the tree). Probably we want some internal convenience property like
Otherwise, I don't see why there would be a performance issue; it's the same kind of traversal as asking to list the descendants. If you are interested in submitting a PR then I will happily review it!
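To make the idea in the comment above concrete, here is a minimal sketch in plain Python. It is not datatree's implementation: the `Node` class, its `variables` and `children` attributes, and the rule that child nodes shadow same-named variables are all assumptions made for illustration. The point it tries to show is that `__contains__` can reuse the `__getitem__` traversal, so the two stay consistent and relative paths work from any node.

```python
class Node:
    """Toy stand-in for a tree node; NOT datatree's actual implementation."""

    def __init__(self, variables=None, children=None):
        self.variables = dict(variables or {})  # name -> value stored on this node
        self.children = dict(children or {})    # name -> child Node

    def __getitem__(self, path):
        # Resolve a "/"-separated path relative to this node.
        # Assumption: empty segments (e.g. a leading "/") are simply ignored.
        parts = [p for p in path.split("/") if p]
        if not parts:
            raise KeyError(path)
        node = self
        for i, part in enumerate(parts):
            if part in node.children:
                node = node.children[part]
            elif i == len(parts) - 1 and part in node.variables:
                # Only the final segment may name a variable.
                return node.variables[part]
            else:
                raise KeyError(path)
        return node

    def __contains__(self, path):
        # Membership reuses the lookup, so "path in node" is True
        # exactly when node[path] succeeds.
        if not isinstance(path, str):
            return False
        try:
            self[path]
        except KeyError:
            return False
        return True


root = Node(children={"a": Node(children={"b": Node(variables={"temp": 1.5})})})
```

With this toy tree, `"a/b" in root` and `"a/b/temp" in root` both hold, and the same check works from a non-root node, e.g. `"b/temp" in root["a"]`. Delegating to `__getitem__` costs one traversal per membership test, which matches the observation that it is the same kind of walk as listing descendants.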
Thinking about it a bit more, this is a case where it's not immediately clear whether to make a datatree object behave more like a one-level dict of children and variables, or something more complex that allows access to multiple levels. At the moment I've made this distinction by basically trying to make most methods that would be defined on
I had a similar thought. I was torn on whether or not I even wanted to file this issue for discussion. I think the thing that sells this idea for me is that The sub-node/child handling of I just did a test and it looks like This is just brainstorming. I'm not trying to enforce or suggest I know what the best design for DataTree is.
No, thank you @djhoese! I appreciate these questions - this is one of those things that I internally debated about with datatree's design but ultimately just picked something without really having to justify it to anyone 😅 At the very least it's useful to write out the logic here in case someone else has a similar question later.
I agree. It's one of the main things that elevates I also tried hard to ensure that even though
This is pretty much exactly what I decided too. I think when someone calls
So far my traversal properties are mostly named things like
Yes, but we could just as easily argue that (Also there is precedent for
I think I agree here. You mean like

    if path in dt:
        node = dt[path]

I wonder if there is an argument that checking for existence can safely be more "permissive" than the values returned by iteration... i.e. that succeeding in accessing multi-level paths is unlikely to surprise the user compared to returning the entire contents of a tree from
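One way to picture "permissive membership, shallow iteration" is with `collections.abc.Mapping`, whose default `__contains__` is defined in terms of `__getitem__` (it returns False when the lookup raises `KeyError`). The sketch below is only an illustration under that assumption: `ShallowTree` is a hypothetical class built on nested dicts, not DataTree, and it does not claim to reflect how datatree or xarray implement this.

```python
from collections.abc import Mapping


class ShallowTree(Mapping):
    """Hypothetical nested mapping: iteration is one level deep, lookup is not."""

    def __init__(self, items):
        self._items = dict(items)

    def __getitem__(self, path):
        # Follow a "/"-separated path through nested dicts.
        node = self._items
        for part in str(path).strip("/").split("/"):
            if not isinstance(node, dict) or part not in node:
                raise KeyError(path)
            node = node[part]
        return node

    def __iter__(self):
        # Only immediate keys are yielded, never the whole tree.
        return iter(self._items)

    def __len__(self):
        return len(self._items)


dt = ShallowTree({"a": {"b": {"temp": 1.5}}, "x": 2})

path = "a/b/temp"
if path in dt:          # Mapping.__contains__ calls dt[path] under the hood
    value = dt[path]
```

Here `list(dt)` still yields only the top-level keys, while `path in dt` succeeds for the multi-level path, so the membership check and the lookup agree without iteration exposing the full tree.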
I'm not sure I understand what you're saying here. I think you're restating your agreement that Just as an added example, what I meant when I said contains and getitem should "work together" is that they are complementary methods in my opinion. In my original case I was writing a test:
Now some might argue that any
Sorry for forgetting to reply to this for almost a year 😅 I was reminded of this by a different discussion and now I think I do agree with you.
This especially seems like a good reason for I also think that it's convenient that we can separate the "is this key in the local node's variables" question from the "is this path in the tree anywhere" question. In the former case the key will never have a
No problem. I'm on paternity leave now so it was lucky I even saw your message (plus I subscribe to every event in this repository so sometimes I miss conversations I'm a part of). Good points and I think I agree.
Closing in favour of pydata/xarray#9354
I'm working on adding conversion to a DataTree object in the Satpy library and while writing some tests I've come across a quality-of-life change I'm hoping is possible/allowed to add. Basically, I want to be able to check if a node is in a tree in the same way I can access that node using the / syntax. For example:
I expected (with very limited prior usage/knowledge of DataTree) that the __contains__ method would do the same or similar traversal as the __getitem__ method. In the above case the in line would return True.
Is there a functional reason why this can't be implemented this way? Any issues or gotchas you can foresee if I were to implement it (performance maybe?)?
Edit: Oops forgot to include the tree creation line.