Implement PythonCall extension to convert xarray objects #882
(I noticed that CI wasn't caching anything anyway so I took the liberty of adding it in d3c1730)
There is also https://github.com/meggart/PyYAXArrays.jl which might give some inspiration.
Can you show where you mean? NoLookup is always just a Base.OneTo(length) underneath because a Lookup is an AbstractArray and it needs length, so may as well be the indices. That will sometimes display as a 1:5 vector
But this looks really nice to have overall. Should we also handle/test more complicated lookups like DateTime? And is there information from xarray if a lookup is points or intervals, categorical etc?
Ah I'm dumb... the test was checking if the lookup
Agreed that it'd be useful, but I do not currently have the courage to tackle that hornets nest 😅
No I don't think that kind of information is supported yet according to the docs: https://docs.xarray.dev/en/stable/user-guide/data-structures.html#coordinates When reading that page I realized that these conversion methods won't work in two situations:
Does DimensionalData support those at all? From the docs I can't figure out how to construct a DimArray for either situation.
I'm not sure how common the first would be, or whether I fully understand it. What's the use case? The second I have just never implemented, but it isn't too hard in theory: all the machinery is in place with Transformed lookups, we "just" need to swap the functions for matrices. (But it's also very rare in the wild.) My guess is that merging this without DateTime handling working will lead to a bunch of bug reports. But won't PythonCall just handle that for us? Did you try it?
Let's just merge and fix details later
https://github.com/meggart/DateTimes64.jl was written for Python datetime -> Julia conversion.
My apologies for not getting back to this, I didn't mean to ghost you into merging it 😬 Things got a bit busy leading up to Christmas (happy new year BTW 🥳) and this slipped my mind. I'll try to tackle datetime conversion later in the week.
I actually use both semi-regularly. For the first case I often work with arrays of events that have a time and 'quality', i.e. a DataArray with a time coordinate and a boolean coordinate indicating whether the event is trustworthy or not. For the second, I also often work with multi-dimensional arrays that have two different kinds of times: one is a coarse time and the other is orders of magnitude more precise but it's an offset relative to the coarse time. So to get the time of an event I have to combine the coarse time and offset. Concretely, the DataArray has dimensions In both cases having a separate array for the lookups would work, but it's really nice to keep them together. I'm not that familiar with the DimensionalData internals, but if you give me some pointers I'm happy to take a shot at implementing them?
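For illustration, the two situations described in this comment might look roughly like this on the xarray side. This is only a sketch: the dimension names (`time`, `coarse`, `fine`), the coordinate names (`quality`, `abs_time`) and all values are hypothetical and not taken from the PR or from the commenter's actual data.

```python
import numpy as np
import xarray as xr

# Case 1: a non-dimension coordinate ("quality") that shares the "time"
# dimension with the data. Names and values are made up for illustration.
times = np.array(
    ["2024-01-01T00:00:00", "2024-01-01T00:00:01", "2024-01-01T00:00:02"],
    dtype="datetime64[ns]",
)
events = xr.DataArray(
    np.array([1.5, 2.5, 3.5]),
    dims=("time",),
    coords={
        "time": times,
        "quality": ("time", np.array([True, False, True])),
    },
)

# Keep only the events flagged as trustworthy.
trusted = events.where(events["quality"], drop=True)

# Case 2: a multi-dimensional coordinate. The absolute time of each element
# is a coarse timestamp plus a much finer offset, so it spans two dimensions.
coarse = np.array(
    ["2024-01-01T00:00:00", "2024-01-01T00:00:10"], dtype="datetime64[ns]"
)
offset = np.array([0, 250, 500], dtype="timedelta64[ns]")
absolute = coarse[:, None] + offset[None, :]  # broadcast to shape (2, 3)

samples = xr.DataArray(
    np.zeros((2, 3)),
    dims=("coarse", "fine"),
    coords={"abs_time": (("coarse", "fine"), absolute)},
)
```

In the first case `quality` is a second coordinate along `time` rather than a dimension of its own. In the second, `abs_time` is attached to two dimensions at once.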
Is there interest in also having a conversion from Edit: there's a very barebones implementation of this here: https://github.com/arviz-devs/ArviZPythonPlots.jl/blob/main/src/xarray.jl
That would be great, there's another implementation for YAXArrays at https://www.github.com/meggart/PyYAXArrays.jl
This is quite handy for interop with Python. There's a couple of things I might need a hand with:
`DataArray` with only one dimension having a coordinate. When I run the test code in the REPL, `lookup(y, :length)` returns `NoLookup`, which is what I expect. But in the tests it seems to automatically create a lookup of `1:5` and I'm not sure where that's coming from because it's not being created by the `pyconvert()` methods.
Also note that the tests create a shared conda environment named `@dimensionaldata-tests`. I went for a shared environment instead of a local one because this way it gets stored in the depot and will get cached by `julia-actions/cache`.