CFTimeIndex #1252

Merged on May 13, 2018 (75 commits)
Changes from 42 commits
Commits
e1e8223
Start on implementing and testing NetCDFTimeIndex
spencerkclark Feb 5, 2017
6496458
TST Move to using pytest fixtures to structure tests
spencerkclark Feb 6, 2017
675b2f7
Address initial review comments
spencerkclark Feb 10, 2017
7beddc1
Address second round of review comments
spencerkclark Feb 11, 2017
3cf03bc
Fix failing python3 tests
spencerkclark Feb 11, 2017
53b085c
Match test method name to method name
spencerkclark Feb 11, 2017
738979b
Merge branch 'master' of https://github.com/pydata/xarray into NetCDF…
spencerkclark Apr 16, 2017
a177f89
First attempts at integrating NetCDFTimeIndex into xarray
spencerkclark May 10, 2017
48ec519
Cleanup
spencerkclark May 11, 2017
9e76df6
Merge branch 'master' into NetCDFTimeIndex
spencerkclark May 11, 2017
2a7b439
Fix DataFrame and Series test failures for NetCDFTimeIndex
spencerkclark May 11, 2017
b942724
First pass at making NetCDFTimeIndex compatible with #1356
spencerkclark May 11, 2017
7845e6d
Merge branch 'master' into NetCDFTimeIndex
spencerkclark Jun 20, 2017
a9ed3c8
Address initial review comments
spencerkclark Jun 26, 2017
3e23ed5
Merge branch 'master' into NetCDFTimeIndex
spencerkclark Aug 25, 2017
a9f3548
Merge branch 'master' into NetCDFTimeIndex
spencerkclark Jan 22, 2018
f00f59a
Restore test_conventions.py
spencerkclark Jan 22, 2018
b34879d
Fix failing test in test_utils.py
spencerkclark Jan 22, 2018
e93b62d
flake8
spencerkclark Jan 22, 2018
61e8bc6
Merge branch 'master' into NetCDFTimeIndex
spencerkclark Feb 20, 2018
0244f58
Merge branch 'master' into NetCDFTimeIndex
spencerkclark Mar 1, 2018
32d7986
Update for standalone netcdftime
spencerkclark Mar 1, 2018
9855176
Address stickler-ci comments
spencerkclark Mar 1, 2018
8d61fdb
Skip test_format_netcdftime_datetime if netcdftime not installed
spencerkclark Mar 1, 2018
6b87da7
A start on documentation
spencerkclark Mar 9, 2018
812710c
Merge branch 'master' into NetCDFTimeIndex
spencerkclark Mar 9, 2018
3610e6e
Fix failing zarr tests related to netcdftime encoding
spencerkclark Mar 9, 2018
8f69a90
Simplify test_decode_standard_calendar_single_element_non_ns_range
spencerkclark Mar 9, 2018
cec909c
Address a couple review comments
spencerkclark Mar 10, 2018
422792b
Use else clause in _maybe_cast_to_netcdftimeindex
spencerkclark Mar 10, 2018
de74037
Start on adding enable_netcdftimeindex option
spencerkclark Mar 10, 2018
2993e3c
Continue parametrizing tests in test_coding_times.py
spencerkclark Mar 10, 2018
f3438fd
Update time-series.rst for enable_netcdftimeindex option
spencerkclark Mar 10, 2018
c35364e
Use :py:func: in rst for xarray.set_options
spencerkclark Mar 10, 2018
08f72dc
Merge branch 'master' into NetCDFTimeIndex
spencerkclark Mar 10, 2018
62ce0ae
Add a what's new entry and test that resample raises a TypeError
spencerkclark Mar 11, 2018
ff05005
Merge branch 'master' of https://github.com/pydata/xarray into NetCDF…
spencerkclark Mar 12, 2018
20fea63
Merge branch 'master' into NetCDFTimeIndex
spencerkclark Mar 16, 2018
d5a3cef
Move what's new entry to the version 0.10.3 section
spencerkclark Mar 16, 2018
e721d26
Add version-dependent pathway for importing netcdftime.datetime
spencerkclark Mar 17, 2018
5e1c4a8
Make NetCDFTimeIndex and date decoding/encoding compatible with datet…
spencerkclark Mar 20, 2018
257f086
Merge branch 'master' into NetCDFTimeIndex
spencerkclark Mar 20, 2018
00e8ada
Merge branch 'master' into NetCDFTimeIndex
spencerkclark Apr 12, 2018
c9d0454
Remove logic to make NetCDFTimeIndex compatible with datetime.datetime
spencerkclark Apr 12, 2018
f678714
Documentation edits
spencerkclark Apr 12, 2018
b03e38e
Ensure proper enable_netcdftimeindex option is used under lazy decoding
spencerkclark Apr 13, 2018
890dde0
Add fix and test for concatenating variables with a NetCDFTimeIndex
spencerkclark Apr 13, 2018
80e05ba
Merge branch 'master' into NetCDFTimeIndex
spencerkclark Apr 16, 2018
13c8358
Further namespace changes due to netcdftime/cftime renaming
spencerkclark Apr 16, 2018
ab46798
NetCDFTimeIndex -> CFTimeIndex
spencerkclark Apr 16, 2018
67fd335
Documentation updates
spencerkclark Apr 16, 2018
7041a8d
Only allow use of CFTimeIndex when using the standalone cftime
spencerkclark Apr 16, 2018
9df4e11
Fix errant what's new changes
spencerkclark Apr 16, 2018
9391463
flake8
spencerkclark Apr 16, 2018
da12ecd
Fix skip logic in test_cftimeindex.py
spencerkclark Apr 16, 2018
a6997ec
Use only_use_cftime_datetimes option in num2date
spencerkclark Apr 26, 2018
7302d7e
Merge branch 'master' into NetCDFTimeIndex
spencerkclark Apr 26, 2018
9dc5539
Require standalone cftime library for all new functionality
spencerkclark Apr 28, 2018
1aa8d86
Improve skipping logic in test_cftimeindex.py
spencerkclark Apr 28, 2018
ef3f2b1
Fix skipping logic in test_cftimeindex.py for when cftime or netcdftime
spencerkclark Apr 28, 2018
4fb5a90
Fix skip logic in Python 3.4 build for test_cftimeindex.py
spencerkclark Apr 28, 2018
1fd205a
Improve error messages when for when the standalone cftime is not ins…
spencerkclark Apr 28, 2018
58a0715
Tweak skip logic in test_accessors.py
spencerkclark Apr 28, 2018
ca4d7dd
flake8
spencerkclark Apr 28, 2018
3947aac
Address review comments
spencerkclark Apr 30, 2018
a395db0
Temporarily remove cftime from py27 build environment on windows
spencerkclark Apr 30, 2018
1b00bde
flake8
spencerkclark Apr 30, 2018
5fdcd20
Install cftime via pip for Python 2.7 on Windows
spencerkclark Apr 30, 2018
459211c
Merge branch 'master' into NetCDFTimeIndex
spencerkclark Apr 30, 2018
7e9bb20
flake8
spencerkclark Apr 30, 2018
247c9eb
Remove unnecessary new lines; simplify _maybe_cast_to_cftimeindex
spencerkclark May 1, 2018
e66abe9
Restore test case for #2002 in test_coding_times.py
spencerkclark May 1, 2018
f25b0b6
Tweak dates out of range warning logic slightly to preserve current d…
spencerkclark May 2, 2018
b10cc73
Merge branch 'master' into NetCDFTimeIndex
spencerkclark May 2, 2018
c318755
Address review comments
spencerkclark May 12, 2018
92 changes: 91 additions & 1 deletion doc/time-series.rst
@@ -70,7 +70,10 @@ You can manual decode arrays in this form by passing a dataset to
One unfortunate limitation of using ``datetime64[ns]`` is that it limits the
native representation of dates to those that fall between the years 1678 and
2262. When a netCDF file contains dates outside of these bounds, dates will be
-returned as arrays of ``netcdftime.datetime`` objects.
+returned as arrays of ``netcdftime.datetime`` objects and a ``NetCDFTimeIndex``
+can be used for indexing. The ``NetCDFTimeIndex`` enables only a subset of
+the indexing functionality of a ``pandas.DatetimeIndex``. See
+:ref:`NetCDFTimeIndex` for more information.
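
Editorial note: the year bounds quoted in this paragraph follow from storing times as a signed 64-bit count of nanoseconds since 1970-01-01. The sketch below reproduces that arithmetic with the standard library only. It truncates to microseconds because `datetime` has no nanosecond field, and the variable names are illustrative rather than taken from xarray or pandas.

```python
from datetime import datetime, timedelta

# Assumption: datetime64[ns] counts nanoseconds from this epoch in an int64.
EPOCH = datetime(1970, 1, 1)
MAX_NANOSECONDS = 2**63 - 1

# datetime only resolves microseconds, so drop the last three digits.
span = timedelta(microseconds=MAX_NANOSECONDS // 1000)

lower = EPOCH - span  # earliest representable instant (truncated)
upper = EPOCH + span  # latest representable instant (truncated)

print(lower.isoformat(), upper.isoformat())
```

The lower bound lands late in 1677, which is why 1678 is the first year that is fully representable.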

Datetime indexing
-----------------
@@ -207,3 +210,90 @@ Dataset and DataArray objects with an arbitrary number of dimensions.

For more examples of using grouped operations on a time dimension, see
:ref:`toy weather data`.


.. _NetCDFTimeIndex:

Non-standard calendars and dates outside the Timestamp-valid range
------------------------------------------------------------------

Through the optional ``netcdftime`` library and a custom subclass of
``pandas.Index``, xarray supports a subset of the indexing functionality enabled
through the standard ``pandas.DatetimeIndex`` for dates from non-standard
calendars or dates using a standard calendar, but outside the
`Timestamp-valid range`_ (approximately between years 1678 and 2262). This
behavior has not yet been turned on by default; to take advantage of this
Review comment (Member): You might add that it will be enabled in v0.11.

functionality, you must have the ``enable_netcdftimeindex`` option set to
``True`` within your context (see :py:func:`~xarray.set_options` for more
information).
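
Editorial note: the phrase "within your context" refers to using the option setter as a context manager, so the option is active only inside the `with` block. The following is a simplified stand-in for that pattern, not xarray's actual code; the `OPTIONS` dictionary and this `set_options` class are hypothetical and only mirror the option name used in this diff.

```python
# Hypothetical global registry; the real library keeps its own.
OPTIONS = {"enable_netcdftimeindex": False}


class set_options:
    """Set options on construction and restore the old values on exit."""

    def __init__(self, **kwargs):
        unknown = set(kwargs) - set(OPTIONS)
        if unknown:
            raise ValueError("unknown option(s): %s" % sorted(unknown))
        # Remember the previous values so they can be restored later.
        self._previous = {name: OPTIONS[name] for name in kwargs}
        OPTIONS.update(kwargs)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        OPTIONS.update(self._previous)
        return False  # do not swallow exceptions


with set_options(enable_netcdftimeindex=True):
    inside = OPTIONS["enable_netcdftimeindex"]
after = OPTIONS["enable_netcdftimeindex"]
```

Here `inside` is `True` and `after` is back to `False`, which is the behavior the documentation relies on when it wraps the `DataArray` construction in a `with` block.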

For instance, you can create a DataArray indexed by a time
coordinate with a no-leap calendar within a context manager setting the
``enable_netcdftimeindex`` option, and the time index will be cast to a
``NetCDFTimeIndex``:

.. ipython:: python

    from itertools import product
    from netcdftime import DatetimeNoLeap

    dates = [DatetimeNoLeap(year, month, 1) for year, month in
             product(range(1, 3), range(1, 13))]
    with xr.set_options(enable_netcdftimeindex=True):
        da = xr.DataArray(np.arange(24), coords=[dates], dims=['time'],
                          name='foo')

.. note::

   With the ``enable_netcdftimeindex`` option activated, a ``NetCDFTimeIndex``
   will be used for time indexing if any of the following are true:

   - The dates are from a non-standard calendar
   - Any dates are outside the Timestamp-valid range

   Otherwise a ``pandas.DatetimeIndex`` will be used. In addition, if any
   variable (not just an index variable) is encoded using a non-standard
   calendar, its times will be decoded into ``netcdftime.datetime`` objects,
   regardless of whether or not they can be represented using
   ``np.datetime64[ns]`` objects.
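
Editorial note: the rule in this note can be read as a small decision function. The sketch below is a hypothetical illustration, not the code in this PR. It assumes that the standard calendars are named `standard`, `gregorian`, and `proleptic_gregorian`, and it approximates the Timestamp-valid range by whole years (1678 through 2261).

```python
# Assumption: calendar names treated as "standard".
STANDARD_CALENDARS = {"standard", "gregorian", "proleptic_gregorian"}

# Whole-year approximation of the Timestamp-valid range.
FIRST_SAFE_YEAR = 1678
LAST_SAFE_YEAR = 2261


def choose_index(calendar, years):
    """Return the name of the index type the note describes (hypothetical)."""
    non_standard = calendar.lower() not in STANDARD_CALENDARS
    out_of_range = any(
        year < FIRST_SAFE_YEAR or year > LAST_SAFE_YEAR for year in years
    )
    if non_standard or out_of_range:
        return "NetCDFTimeIndex"
    return "pandas.DatetimeIndex"


print(choose_index("noleap", [1, 2]))
print(choose_index("standard", [2000, 2001]))
print(choose_index("gregorian", [1500]))
```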

For data indexed by a ``NetCDFTimeIndex``, xarray currently supports `partial
Review comment (Contributor): Maybe change this to bullet points rather than one long sentence? I.e. "...currently supports: " followed by one bullet each for partial datetime string indexing, dt accessor, groupby, and serialization.

datetime string indexing`_ using strictly `ISO 8601-format`_ partial datetime
strings:

.. ipython:: python

    da.sel(time='0001')
    da.sel(time=slice('0001-05', '0002-02'))

access of basic datetime components via the ``dt`` accessor (in this case just
"year", "month", "day", "hour", "minute", "second", "microsecond", and "season"):

.. ipython:: python

    da.time.dt.year
    da.time.dt.month
    da.time.dt.season

group-by operations based on datetime accessor attributes (e.g. by month of the
year):

.. ipython:: python

    da.groupby('time.month').sum()

and serialization:

.. ipython:: python

    da.to_netcdf('example.nc')
    xr.open_dataset('example.nc')
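
Editorial note: partial datetime string indexing, as used in the `da.sel` examples above, amounts to turning a truncated ISO 8601 string into the first and last dates it covers. The sketch below shows that idea for a no-leap calendar at day resolution. It is an illustration under stated assumptions, not the parser in this PR: the function name, the regular expression, and the tuple return type are all hypothetical, and day values are not validated.

```python
import re

# Accepts "YYYY", "YYYY-MM" or "YYYY-MM-DD" only.
PATTERN = re.compile(
    r"(?P<year>\d{4})(?:-(?P<month>\d{2})(?:-(?P<day>\d{2}))?)?"
)

# Month lengths in a no-leap calendar (February always has 28 days).
NOLEAP_DAYS = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def partial_string_bounds(text):
    """Return ((y, m, d), (y, m, d)): first and last day covered by text."""
    match = PATTERN.fullmatch(text)
    if match is None:
        raise ValueError("not a supported ISO 8601 partial date: %r" % text)
    year = int(match.group("year"))
    if match.group("month") is None:
        return (year, 1, 1), (year, 12, 31)
    month = int(match.group("month"))
    if not 1 <= month <= 12:
        raise ValueError("month out of range in %r" % text)
    if match.group("day") is None:
        return (year, month, 1), (year, month, NOLEAP_DAYS[month - 1])
    day = int(match.group("day"))
    return (year, month, day), (year, month, day)


start, _ = partial_string_bounds("0001-05")
_, stop = partial_string_bounds("0002-02")
print(start, stop)
```

Under these assumptions, `slice('0001-05', '0002-02')` covers every date from the first bound of the left string to the last bound of the right string, inclusive.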

.. note::

   Currently resampling along the time dimension for data indexed by a
   ``NetCDFTimeIndex`` is not supported.
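
Editorial note: for readers checking the `da.groupby('time.month').sum()` example by hand, the same arithmetic can be written in plain Python. This is only an illustration of what the grouped sum computes for the 24-element example array; it does not use xarray, and the `(year, month)` tuples stand in for the `DatetimeNoLeap` objects.

```python
from collections import defaultdict
from itertools import product

# Same ordering as the documentation example: years 1-2, months 1-12.
dates = list(product(range(1, 3), range(1, 13)))
values = range(24)

sums = defaultdict(int)
for (year, month), value in zip(dates, values):
    sums[month] += value  # group by month of the year, then sum

monthly = [sums[month] for month in range(1, 13)]
print(monthly)
```

Each month appears once per year, so month `m` collects the values `m - 1` and `m + 11`.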

.. _Timestamp-valid range: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#timestamp-limitations
.. _ISO 8601-format: https://en.wikipedia.org/wiki/ISO_8601
.. _partial datetime string indexing: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#partial-string-indexing
9 changes: 9 additions & 0 deletions doc/whats-new.rst
@@ -37,6 +37,15 @@ Documentation
Enhancements
~~~~~~~~~~~~

- Add an option for using a ``NetCDFTimeIndex`` for indexing times with
  non-standard calendars and/or outside the Timestamp-valid range; this index
  enables a subset of the functionality of a standard
  ``pandas.DatetimeIndex`` (:issue:`789`, :issue:`1084`, :issue:`1252`).
  By `Spencer Clark <https://github.com/spencerkclark>`_ with help from
  `Stephan Hoyer <https://github.com/shoyer>`_.
- Allow for serialization of ``netcdftime.datetime`` objects (:issue:`789`,
  :issue:`1084`, :issue:`1252`). By `Spencer Clark
  <https://github.com/spencerkclark>`_.
- Some speed improvement to construct :py:class:`~xarray.DataArrayRolling`
  object (:issue:`1993`)
  By `Keisuke Fujii <https://github.com/fujiisoup>`_.
1 change: 0 additions & 1 deletion xarray/__init__.py
@@ -19,7 +19,6 @@
save_mfdataset)
from .backends.rasterio_ import open_rasterio
from .backends.zarr import open_zarr

from .conventions import decode_cf, SerializationWarning

try:
5 changes: 4 additions & 1 deletion xarray/backends/zarr.py
@@ -8,6 +8,8 @@

from .. import Variable, coding, conventions
from ..core import indexing
+from ..core.common import (contains_netcdftime_datetimes,
+                           contains_datetime_datetimes)
from ..core.pycompat import OrderedDict, integer_types, iteritems
from ..core.utils import FrozenOrderedDict, HiddenKeyDict
from .common import AbstractWritableDataStore, ArrayWriter, BackendArray
@@ -221,7 +223,8 @@ def encode_zarr_variable(var, needs_copy=True, name=None):
A variable which has been encoded as described above.
"""

-    if var.dtype.kind == 'O':
+    if var.dtype.kind == 'O' and not (contains_netcdftime_datetimes(var) or
+                                      contains_datetime_datetimes(var)):
         raise NotImplementedError("Variable `%s` is an object. Zarr "
                                   "store can't yet encode objects." % name)
