CFTimeIndex #1252
Merged
Commits
75 commits
e1e8223 Start on implementing and testing NetCDFTimeIndex (spencerkclark)
6496458 TST Move to using pytest fixtures to structure tests (spencerkclark)
675b2f7 Address initial review comments (spencerkclark)
7beddc1 Address second round of review comments (spencerkclark)
3cf03bc Fix failing python3 tests (spencerkclark)
53b085c Match test method name to method name (spencerkclark)
738979b Merge branch 'master' of https://github.com/pydata/xarray into NetCDF… (spencerkclark)
a177f89 First attempts at integrating NetCDFTimeIndex into xarray (spencerkclark)
48ec519 Cleanup (spencerkclark)
9e76df6 Merge branch 'master' into NetCDFTimeIndex (spencerkclark)
2a7b439 Fix DataFrame and Series test failures for NetCDFTimeIndex (spencerkclark)
b942724 First pass at making NetCDFTimeIndex compatible with #1356 (spencerkclark)
7845e6d Merge branch 'master' into NetCDFTimeIndex (spencerkclark)
a9ed3c8 Address initial review comments (spencerkclark)
3e23ed5 Merge branch 'master' into NetCDFTimeIndex (spencerkclark)
a9f3548 Merge branch 'master' into NetCDFTimeIndex (spencerkclark)
f00f59a Restore test_conventions.py (spencerkclark)
b34879d Fix failing test in test_utils.py (spencerkclark)
e93b62d flake8 (spencerkclark)
61e8bc6 Merge branch 'master' into NetCDFTimeIndex (spencerkclark)
0244f58 Merge branch 'master' into NetCDFTimeIndex (spencerkclark)
32d7986 Update for standalone netcdftime (spencerkclark)
9855176 Address stickler-ci comments (spencerkclark)
8d61fdb Skip test_format_netcdftime_datetime if netcdftime not installed (spencerkclark)
6b87da7 A start on documentation (spencerkclark)
812710c Merge branch 'master' into NetCDFTimeIndex (spencerkclark)
3610e6e Fix failing zarr tests related to netcdftime encoding (spencerkclark)
8f69a90 Simplify test_decode_standard_calendar_single_element_non_ns_range (spencerkclark)
cec909c Address a couple review comments (spencerkclark)
422792b Use else clause in _maybe_cast_to_netcdftimeindex (spencerkclark)
de74037 Start on adding enable_netcdftimeindex option (spencerkclark)
2993e3c Continue parametrizing tests in test_coding_times.py (spencerkclark)
f3438fd Update time-series.rst for enable_netcdftimeindex option (spencerkclark)
c35364e Use :py:func: in rst for xarray.set_options (spencerkclark)
08f72dc Merge branch 'master' into NetCDFTimeIndex (spencerkclark)
62ce0ae Add a what's new entry and test that resample raises a TypeError (spencerkclark)
ff05005 Merge branch 'master' of https://github.com/pydata/xarray into NetCDF… (spencerkclark)
20fea63 Merge branch 'master' into NetCDFTimeIndex (spencerkclark)
d5a3cef Move what's new entry to the version 0.10.3 section (spencerkclark)
e721d26 Add version-dependent pathway for importing netcdftime.datetime (spencerkclark)
5e1c4a8 Make NetCDFTimeIndex and date decoding/encoding compatible with datet… (spencerkclark)
257f086 Merge branch 'master' into NetCDFTimeIndex (spencerkclark)
00e8ada Merge branch 'master' into NetCDFTimeIndex (spencerkclark)
c9d0454 Remove logic to make NetCDFTimeIndex compatible with datetime.datetime (spencerkclark)
f678714 Documentation edits (spencerkclark)
b03e38e Ensure proper enable_netcdftimeindex option is used under lazy decoding (spencerkclark)
890dde0 Add fix and test for concatenating variables with a NetCDFTimeIndex (spencerkclark)
80e05ba Merge branch 'master' into NetCDFTimeIndex (spencerkclark)
13c8358 Further namespace changes due to netcdftime/cftime renaming (spencerkclark)
ab46798 NetCDFTimeIndex -> CFTimeIndex (spencerkclark)
67fd335 Documentation updates (spencerkclark)
7041a8d Only allow use of CFTimeIndex when using the standalone cftime (spencerkclark)
9df4e11 Fix errant what's new changes (spencerkclark)
9391463 flake8 (spencerkclark)
da12ecd Fix skip logic in test_cftimeindex.py (spencerkclark)
a6997ec Use only_use_cftime_datetimes option in num2date (spencerkclark)
7302d7e Merge branch 'master' into NetCDFTimeIndex (spencerkclark)
9dc5539 Require standalone cftime library for all new functionality (spencerkclark)
1aa8d86 Improve skipping logic in test_cftimeindex.py (spencerkclark)
ef3f2b1 Fix skipping logic in test_cftimeindex.py for when cftime or netcdftime (spencerkclark)
4fb5a90 Fix skip logic in Python 3.4 build for test_cftimeindex.py (spencerkclark)
1fd205a Improve error messages when for when the standalone cftime is not ins… (spencerkclark)
58a0715 Tweak skip logic in test_accessors.py (spencerkclark)
ca4d7dd flake8 (spencerkclark)
3947aac Address review comments (spencerkclark)
a395db0 Temporarily remove cftime from py27 build environment on windows (spencerkclark)
1b00bde flake8 (spencerkclark)
5fdcd20 Install cftime via pip for Python 2.7 on Windows (spencerkclark)
459211c Merge branch 'master' into NetCDFTimeIndex (spencerkclark)
7e9bb20 flake8 (spencerkclark)
247c9eb Remove unnecessary new lines; simplify _maybe_cast_to_cftimeindex (spencerkclark)
e66abe9 Restore test case for #2002 in test_coding_times.py (spencerkclark)
f25b0b6 Tweak dates out of range warning logic slightly to preserve current d… (spencerkclark)
b10cc73 Merge branch 'master' into NetCDFTimeIndex (spencerkclark)
c318755 Address review comments (spencerkclark)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,252 @@ | ||
from __future__ import absolute_import | ||
import re | ||
from datetime import timedelta | ||
|
||
import numpy as np | ||
import pandas as pd | ||
|
||
from xarray.core import pycompat | ||
from xarray.core.utils import is_scalar | ||
|
||
|
||
def named(name, pattern): | ||
return '(?P<' + name + '>' + pattern + ')' | ||
|
||
|
||
def optional(x): | ||
return '(?:' + x + ')?' | ||
|
||
|
||
def trailing_optional(xs): | ||
if not xs: | ||
return '' | ||
return xs[0] + optional(trailing_optional(xs[1:])) | ||
|
||
|
||
def build_pattern(date_sep=r'\-', datetime_sep='T', time_sep=r'\:'): | |
pieces = [(None, 'year', r'\d{4}'), | |
(date_sep, 'month', r'\d{2}'), | |
(date_sep, 'day', r'\d{2}'), | |
(datetime_sep, 'hour', r'\d{2}'), | |
(time_sep, 'minute', r'\d{2}'), | |
(time_sep, 'second', r'\d{2}')] | |
pattern_list = [] | ||
for sep, name, sub_pattern in pieces: | ||
pattern_list.append((sep if sep else '') + named(name, sub_pattern)) | ||
# TODO: allow timezone offsets? | ||
return '^' + trailing_optional(pattern_list) + '$' | ||
|
||
|
||
_BASIC_PATTERN = build_pattern(date_sep='', time_sep='') | ||
_EXTENDED_PATTERN = build_pattern() | ||
_PATTERNS = [_BASIC_PATTERN, _EXTENDED_PATTERN] | ||
|
||
|
||
def parse_iso8601(datetime_string): | ||
for pattern in _PATTERNS: | ||
match = re.match(pattern, datetime_string) | ||
if match: | ||
return match.groupdict() | ||
raise ValueError('no ISO-8601 match for string: %s' % datetime_string) | ||
|
||
|
||
def _parse_iso8601_with_reso(date_type, timestr): | ||
default = date_type(1, 1, 1) | ||
result = parse_iso8601(timestr) | ||
replace = {} | ||
|
||
for attr in ['year', 'month', 'day', 'hour', 'minute', 'second']: | ||
value = result.get(attr, None) | ||
if value is not None: | ||
# Note ISO8601 conventions allow for fractional seconds. | ||
# TODO: Consider adding support for sub-second resolution? | ||
replace[attr] = int(value) | ||
resolution = attr | ||
|
||
return default.replace(**replace), resolution | ||
|
||
|
||
def _parsed_string_to_bounds(date_type, resolution, parsed): | ||
"""Generalization of | ||
pandas.tseries.index.DatetimeIndex._parsed_string_to_bounds | ||
for use with non-standard calendars and cftime.datetime | ||
objects. | ||
""" | ||
if resolution == 'year': | ||
return (date_type(parsed.year, 1, 1), | ||
date_type(parsed.year + 1, 1, 1) - timedelta(microseconds=1)) | ||
elif resolution == 'month': | ||
if parsed.month == 12: | ||
end = date_type(parsed.year + 1, 1, 1) - timedelta(microseconds=1) | ||
else: | ||
end = (date_type(parsed.year, parsed.month + 1, 1) - | ||
timedelta(microseconds=1)) | ||
return date_type(parsed.year, parsed.month, 1), end | ||
elif resolution == 'day': | ||
start = date_type(parsed.year, parsed.month, parsed.day) | ||
return start, start + timedelta(days=1, microseconds=-1) | ||
elif resolution == 'hour': | ||
start = date_type(parsed.year, parsed.month, parsed.day, parsed.hour) | ||
return start, start + timedelta(hours=1, microseconds=-1) | ||
elif resolution == 'minute': | ||
start = date_type(parsed.year, parsed.month, parsed.day, parsed.hour, | ||
parsed.minute) | ||
return start, start + timedelta(minutes=1, microseconds=-1) | ||
elif resolution == 'second': | ||
start = date_type(parsed.year, parsed.month, parsed.day, parsed.hour, | ||
parsed.minute, parsed.second) | ||
return start, start + timedelta(seconds=1, microseconds=-1) | ||
else: | ||
raise KeyError | ||
|
||
|
||
def get_date_field(datetimes, field): | ||
"""Adapted from pandas.tslib.get_date_field""" | ||
return np.array([getattr(date, field) for date in datetimes]) | ||
|
||
|
||
def _field_accessor(name, docstring=None): | ||
"""Adapted from pandas.tseries.index._field_accessor""" | ||
def f(self): | ||
return get_date_field(self._data, name) | ||
|
||
f.__name__ = name | ||
f.__doc__ = docstring | ||
return property(f) | ||
|
||
|
||
def get_date_type(self): | ||
return type(self._data[0]) | ||
|
||
|
||
def assert_all_valid_date_type(data): | ||
import cftime | ||
|
||
sample = data[0] | ||
date_type = type(sample) | ||
if not isinstance(sample, cftime.datetime): | ||
raise TypeError( | ||
'CFTimeIndex requires cftime.datetime ' | ||
'objects. Got object of {}.'.format(date_type)) | ||
if not all(isinstance(value, date_type) for value in data): | ||
raise TypeError( | ||
'CFTimeIndex requires using datetime ' | ||
'objects of all the same type. Got\n{}.'.format(data)) | ||
|
||
|
||
class CFTimeIndex(pd.Index): | ||
year = _field_accessor('year', 'The year of the datetime') | ||
month = _field_accessor('month', 'The month of the datetime') | ||
day = _field_accessor('day', 'The days of the datetime') | ||
hour = _field_accessor('hour', 'The hours of the datetime') | ||
minute = _field_accessor('minute', 'The minutes of the datetime') | ||
second = _field_accessor('second', 'The seconds of the datetime') | ||
microsecond = _field_accessor('microsecond', | ||
'The microseconds of the datetime') | ||
date_type = property(get_date_type) | ||
|
||
def __new__(cls, data): | ||
result = object.__new__(cls) | ||
assert_all_valid_date_type(data) | ||
result._data = np.array(data) | ||
return result | ||
|
||
def _partial_date_slice(self, resolution, parsed): | ||
"""Adapted from | ||
pandas.tseries.index.DatetimeIndex._partial_date_slice | ||
|
||
Note that when using a CFTimeIndex, if a partial-date selection | ||
returns a single element, it will never be converted to a scalar | ||
coordinate; this is in slight contrast to the behavior when using | ||
a DatetimeIndex, which sometimes will return a DataArray with a scalar | ||
coordinate depending on the resolution of the datetimes used in | ||
defining the index. For example: | ||
|
||
>>> from cftime import DatetimeNoLeap | ||
>>> import pandas as pd | ||
>>> import xarray as xr | ||
>>> da = xr.DataArray([1, 2], | ||
coords=[[DatetimeNoLeap(2001, 1, 1), | ||
DatetimeNoLeap(2001, 2, 1)]], | ||
dims=['time']) | ||
>>> da.sel(time='2001-01-01') | ||
<xarray.DataArray (time: 1)> | ||
array([1]) | ||
Coordinates: | ||
* time (time) object 2001-01-01 00:00:00 | ||
>>> da = xr.DataArray([1, 2], | ||
coords=[[pd.Timestamp(2001, 1, 1), | ||
pd.Timestamp(2001, 2, 1)]], | ||
dims=['time']) | ||
>>> da.sel(time='2001-01-01') | ||
<xarray.DataArray ()> | ||
array(1) | ||
Coordinates: | ||
time datetime64[ns] 2001-01-01 | ||
>>> da = xr.DataArray([1, 2], | ||
coords=[[pd.Timestamp(2001, 1, 1, 1), | ||
pd.Timestamp(2001, 2, 1)]], | ||
dims=['time']) | ||
>>> da.sel(time='2001-01-01') | ||
<xarray.DataArray (time: 1)> | ||
array([1]) | ||
Coordinates: | ||
* time (time) datetime64[ns] 2001-01-01T01:00:00 | ||
""" | ||
start, end = _parsed_string_to_bounds(self.date_type, resolution, | ||
parsed) | ||
lhs_mask = (self._data >= start) | ||
rhs_mask = (self._data <= end) | ||
return (lhs_mask & rhs_mask).nonzero()[0] | ||
|
||
def _get_string_slice(self, key): | ||
"""Adapted from pandas.tseries.index.DatetimeIndex._get_string_slice""" | ||
parsed, resolution = _parse_iso8601_with_reso(self.date_type, key) | ||
loc = self._partial_date_slice(resolution, parsed) | ||
return loc | ||
|
||
def get_loc(self, key, method=None, tolerance=None): | ||
"""Adapted from pandas.tseries.index.DatetimeIndex.get_loc""" | ||
if isinstance(key, pycompat.basestring): | ||
return self._get_string_slice(key) | ||
else: | ||
return pd.Index.get_loc(self, key, method=method, | ||
tolerance=tolerance) | ||
|
||
def _maybe_cast_slice_bound(self, label, side, kind): | ||
"""Adapted from | ||
pandas.tseries.index.DatetimeIndex._maybe_cast_slice_bound""" | ||
if isinstance(label, pycompat.basestring): | ||
parsed, resolution = _parse_iso8601_with_reso(self.date_type, | ||
label) | ||
start, end = _parsed_string_to_bounds(self.date_type, resolution, | ||
parsed) | ||
if self.is_monotonic_decreasing and len(self): | ||
return end if side == 'left' else start | ||
return start if side == 'left' else end | ||
else: | ||
return label | ||
|
||
# TODO: Add ability to use integer range outside of iloc? | ||
# e.g. series[1:5]. | ||
def get_value(self, series, key): | ||
"""Adapted from pandas.tseries.index.DatetimeIndex.get_value""" | ||
if not isinstance(key, slice): | ||
return series.iloc[self.get_loc(key)] | ||
else: | ||
return series.iloc[self.slice_indexer( | ||
key.start, key.stop, key.step)] | ||
|
||
def __contains__(self, key): | ||
"""Adapted from | ||
pandas.tseries.base.DatetimeIndexOpsMixin.__contains__""" | ||
try: | ||
result = self.get_loc(key) | ||
return (is_scalar(result) or type(result) == slice or | ||
(isinstance(result, np.ndarray) and result.size)) | ||
except (KeyError, TypeError, ValueError): | ||
return False | ||
|
||
def contains(self, key): | ||
"""Needed for .loc based partial-string indexing""" | ||
return self.__contains__(key) |
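The partial-string lookup in this diff works in two steps: an ISO-8601 string is parsed into a date plus the finest field it specifies (its "resolution"), and that pair is then widened into inclusive start and end bounds. The following is a minimal standalone sketch of that idea, not the code from the PR. It assumes the standard-library `datetime.datetime` as a stand-in for a `cftime.datetime` subclass so that it runs without cftime installed, and the names `parse_with_resolution` and `string_to_bounds` are hypothetical.

```python
import re
from datetime import datetime, timedelta

FIELDS = ['year', 'month', 'day', 'hour', 'minute', 'second']


def named(name, pattern):
    return '(?P<' + name + '>' + pattern + ')'


def optional(x):
    return '(?:' + x + ')?'


def trailing_optional(xs):
    # Nest every later piece inside an optional group, so a string may
    # stop after any field but may not skip one.
    if not xs:
        return ''
    return xs[0] + optional(trailing_optional(xs[1:]))


def build_pattern(date_sep=r'\-', datetime_sep='T', time_sep=r'\:'):
    pieces = [(None, 'year', r'\d{4}'),
              (date_sep, 'month', r'\d{2}'),
              (date_sep, 'day', r'\d{2}'),
              (datetime_sep, 'hour', r'\d{2}'),
              (time_sep, 'minute', r'\d{2}'),
              (time_sep, 'second', r'\d{2}')]
    parts = [(sep or '') + named(name, sub) for sep, name, sub in pieces]
    return '^' + trailing_optional(parts) + '$'


# Basic form (no separators) first, then the extended form.
PATTERNS = [build_pattern(date_sep='', time_sep=''), build_pattern()]


def parse_with_resolution(timestr, date_type=datetime):
    # date_type is assumed to behave like datetime.datetime (a stand-in
    # for a cftime.datetime subclass in this sketch).
    for pattern in PATTERNS:
        match = re.match(pattern, timestr)
        if match:
            break
    else:
        raise ValueError('no ISO-8601 match for string: %s' % timestr)
    replace = {}
    resolution = None
    for attr in FIELDS:
        value = match.group(attr)
        if value is not None:
            replace[attr] = int(value)
            resolution = attr  # the last field present is the finest one
    return date_type(1, 1, 1).replace(**replace), resolution


def string_to_bounds(parsed, resolution, date_type=datetime):
    one_us = timedelta(microseconds=1)
    if resolution == 'year':
        return (date_type(parsed.year, 1, 1),
                date_type(parsed.year + 1, 1, 1) - one_us)
    if resolution == 'month':
        if parsed.month == 12:
            following = date_type(parsed.year + 1, 1, 1)
        else:
            following = date_type(parsed.year, parsed.month + 1, 1)
        return date_type(parsed.year, parsed.month, 1), following - one_us
    # Fixed-length resolutions; an unknown resolution raises KeyError.
    step = {'day': timedelta(days=1), 'hour': timedelta(hours=1),
            'minute': timedelta(minutes=1),
            'second': timedelta(seconds=1)}[resolution]
    return parsed, parsed + step - one_us
```

For example, `string_to_bounds(*parse_with_resolution('2001-02'))` spans all of February 2001. A mask such as `(data >= start) & (data <= end)` over an array of dates then selects every element inside that window, which is the role `_partial_date_slice` plays in the diff.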
You might add that it will be enabled in v0.11.