numpy datetime conversion with DataArray is not working #6412

Open
jules-ch opened this issue Mar 25, 2022 · 3 comments

Comments

@jules-ch

What happened?

I have a simple DataArray of datetime64[ns] values, but when I try to convert it to another supported numpy datetime dtype, the result is still datetime64[ns].

Taking the values attribute and converting that returns the correct result.

What did you expect to happen?

I expected the DataArray to be converted to the requested dtype.

Minimal Complete Verifiable Example

>>> import pandas as pd
>>> import xarray as xr
>>> time = pd.date_range('2015-01-01 00:00:00', '2015-12-31 23:59:00', inclusive='left', freq='15min')
>>> da = xr.DataArray(time)
>>> da
<xarray.DataArray (dim_0: 35040)>
array(['2015-01-01T00:00:00.000000000', '2015-01-01T00:15:00.000000000',
       '2015-01-01T00:30:00.000000000', ..., '2015-12-31T23:15:00.000000000',
       '2015-12-31T23:30:00.000000000', '2015-12-31T23:45:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * dim_0    (dim_0) datetime64[ns] 2015-01-01 ... 2015-12-31T23:45:00

>>> da.astype("datetime64[Y]")
<xarray.DataArray (dim_0: 35040)>
array(['2015-01-01T00:00:00.000000000', '2015-01-01T00:00:00.000000000',
       '2015-01-01T00:00:00.000000000', ...,
       '2015-01-01T00:00:00.000000000', '2015-01-01T00:00:00.000000000',
       '2015-01-01T00:00:00.000000000'], dtype='datetime64[ns]')
Coordinates:
  * dim_0    (dim_0) datetime64[ns] 2015-01-01 ... 2015-12-31T23:45:00
>>> da.values.astype("datetime64[Y]")
array(['2015', '2015', '2015', ..., '2015', '2015', '2015'],
      dtype='datetime64[Y]')

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.8.10 (default, Nov 26 2021, 20:14:08)
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.13.0-37-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: None

xarray: 0.20.2
pandas: 1.4.1
numpy: 1.22.3
scipy: 1.8.0
netCDF4: None
pydap: None
h5netcdf: None
h5py: 3.6.0
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2022.03.0
distributed: 2022.3.0
matplotlib: 3.5.1
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.02.0
cupy: None
pint: 0.18
sparse: None
setuptools: 60.6.0
pip: 22.0.3
conda: None
pytest: 6.2.5
IPython: 8.1.1
sphinx: None

jules-ch added the "bug" and "needs triage" labels on Mar 25, 2022
@spencerkclark
Member

This is a common point of confusion, but is in fact expected. Xarray intentionally converts any np.datetime64 type to np.datetime64[ns]. The primary motivation is compatibility with pandas, which xarray relies on for time indexing and other time-related operations through things like pandas.DatetimeIndex or the pandas.Series.dt accessor (see, e.g., discussion in #789).
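A minimal sketch of how to check this behaviour on your own installation. The variable names are illustrative, and the dtype that xarray reports after the cast may depend on the installed xarray and pandas versions; with the versions listed in this issue it is datetime64[ns].

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical small example; names are illustrative only.
time = pd.date_range("2015-01-01", periods=4, freq="15min")
da = xr.DataArray(time, dims="time")

# Cast through xarray: xarray may store the result at a different
# resolution than the one requested.
converted = da.astype("datetime64[Y]")
print("requested datetime64[Y], xarray reports:", converted.dtype)

# Cast the underlying NumPy array directly: NumPy keeps the requested unit.
raw = da.values.astype("datetime64[Y]")
print("NumPy reports:", raw.dtype)
```

Comparing the two printed dtypes shows whether the normalization described above applies to your setup.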

May I ask what your reason is for requiring a lower-precision datetime type? In xarray we have tried to provide alternatives (e.g. through cftime) for use-cases like longer date ranges.

keewis removed the "needs triage" and "bug" labels on Mar 25, 2022
@jules-ch
Author

jules-ch commented Mar 29, 2022

I'm using xarray on a Dataset, and it's convenient for me to do calculations with DataArrays. When I want to retrieve the year of a datetime, instead of casting back to an array of objects and using datetime.year, it's handy to use numpy's built-in datetime64 conversion.

It's really confusing that astype does not work the way numpy's does. If you want to keep this behaviour, maybe add a warning to the docs and log a warning as well.
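For readers who want to keep the NumPy cast, one possible workaround (a sketch, not a recommendation from the xarray maintainers) is to cast the raw values and store the result as plain integers, which are not datetime values and so are not subject to the datetime conversion. The variable names below are illustrative.

```python
import numpy as np
import pandas as pd
import xarray as xr

times = pd.to_datetime(
    ["2015-01-01 00:00", "2015-12-31 23:45", "2016-06-01 12:00"]
)
da = xr.DataArray(times, dims="time")

# datetime64[Y] counts whole years since 1970, so converting to an integer
# and adding 1970 gives the calendar year.
years = da.values.astype("datetime64[Y]").astype("int64") + 1970

# Wrap the integer years back into a DataArray with the original
# dimensions and coordinates.
year_da = xr.DataArray(years, dims=da.dims, coords=da.coords)
print(year_da.values)
```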

@spencerkclark
Member

I agree this should be better documented.

When I want to retrieve the year of a datetime, instead of casting back to an array of objects and using datetime.year, it's handy to use numpy's built-in datetime64 conversion.

The recommended way to extract datetime components, like the year, is to use the DatetimeAccessor. For example, if you have a DataArray of datetime-like values (whether they are of type np.datetime64[ns] or cftime.datetime), you can do something like this:

da.dt.year

and it will return a DataArray containing the year of each datetime (for more information see the "Datetime components" section of the documentation).
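A short, self-contained sketch of the accessor in use. The sample timestamps and variable names are made up for illustration and straddle a year boundary so the effect is visible.

```python
import pandas as pd
import xarray as xr

# Four timestamps, 15 minutes apart, crossing from 2015 into 2016.
time = pd.date_range("2015-12-31 23:30", periods=4, freq="15min")
da = xr.DataArray(time, dims="time")

# The .dt accessor returns DataArrays of integer components that keep
# the original dimension and coordinate.
years = da.dt.year
months = da.dt.month

print(years.values)   # [2015 2015 2016 2016]
print(months.values)  # [12 12  1  1]
```

Because the result is an ordinary DataArray, it can be used directly in further calculations or as a grouping key.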
