Add support for CFTimeIndex in get_clean_interp_index #3631
Conversation
…ype conversion. add overflow tests
…d added support for CFTimeIndex.
Thanks @huard for all your work on this. I think we're getting closer now to a better situation regarding interpolation with cftime coordinates.
xarray/core/duck_array_ops.py (Outdated)

    array = array.astype(np.float64)  # [ns]
    res = "ns"
else:
    array = np.asarray(array).astype("timedelta64[us]").astype(np.float64)  # [us]
So the potential for loss of precision is here, but only when array comes in with dtype np.timedelta64[ns] or finer. I feel like we can guard against this; we could do this in one of two ways:
1. Only pass dtype object arrays to this function and write a separate function to handle np.timedelta64 arrays.
2. Add some logic to handle np.timedelta64 arrays more robustly in this function, and name it something that reflects that both standard library timedeltas and NumPy timedeltas can be safely passed to it.

I probably prefer (1) as a solution.
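One way to sketch option (1) -- separate code paths for object arrays and np.timedelta64 arrays -- might look like the following. The function names and the unit handling here are hypothetical illustrations, not the PR's actual implementation:

```python
import datetime
import numpy as np


def py_timedelta_to_float(array, datetime_unit="ns"):
    # Hypothetical sketch: object arrays of datetime.timedelta are
    # converted to exact integer microseconds first (floor division by a
    # 1 us timedelta is exact), then scaled to the requested unit.
    us = np.array(
        [td // datetime.timedelta(microseconds=1) for td in array], dtype=np.int64
    )
    scale = {"ns": 1e-3, "us": 1.0, "ms": 1e3, "s": 1e6}[datetime_unit]
    return us / scale


def np_timedelta_to_float(array, datetime_unit="ns"):
    # np.timedelta64 input is cast directly to the requested unit,
    # avoiding an intermediate lossy ns -> us round trip.
    return array.astype(f"timedelta64[{datetime_unit}]").astype(np.float64)
```

Keeping the two paths separate means the object-array path never has to worry about NumPy's resolution semantics, and vice versa.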
I think there are two cases where we're losing precision:
1. Conversion from ns to us when the time steps are < 1 us.
2. Conversion to float of large timedelta[us] values. For example:

In [154]: np.float(13510798882111486) - np.float(13510798882111485)
Out[154]: 2.0

When we discussed loss of precision, I was more worried about (2) than (1), hence maybe some confusion earlier. I'm concerned (2) will lead to hard-to-track bugs.
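Both failure modes are easy to reproduce with plain NumPy:

```python
import numpy as np

# Case 1: casting ns-resolution timedeltas to us truncates
# sub-microsecond steps entirely.
ns = np.array([500], dtype="timedelta64[ns]")
truncated = ns.astype("timedelta64[us]")  # 500 ns becomes 0 us

# Case 2: float64 has a 53-bit mantissa, so integers above 2**53
# (~104 days of nanoseconds, ~285 years of microseconds) are no longer
# exactly representable and neighbouring values collapse or spread.
a, b = 13510798882111486, 13510798882111485
diff = float(a) - float(b)  # 2.0, not 1.0
```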
> When we discussed loss of precision, I was more worried about (2) than (1), hence maybe some confusion earlier. I'm concerned (2) will lead to hard-to-track bugs.
Ah, that makes sense. Indeed I had mainly been thinking about (1). Thanks for the clear example. My apologies for being dense earlier. At least in the short term, I guess I'm still not so worried about (2), primarily because it is already an issue in the existing code (i.e. even for np.datetime64[ns] dates). So this update is not changing the behavior materially; it's just extending the approach to allow for long time ranges between cftime dates.[1]
Earlier you raised a good question regarding whether there would be a benefit to keeping these values as integers as long as possible to avoid these floating point problems -- it seems like there could be in the case of differentiate and integrate -- however, that would probably require some refactoring to do the unit conversion after the operation involving the numeric times, rather than before, to avoid overflow. I'd be open to discussing this more, but I don't feel like it's necessarily a blocker here, unless you have an example that clearly shows otherwise.
Regardless, I really appreciate your careful thoughts and help with this. This has been useful discussion.
[1] I guess it's also important to note that we are currently naively differencing cftime dates to produce timedeltas, which even before float conversion is not microsecond-exact:
In [1]: import cftime
In [2]: cftime.DatetimeNoLeap(2000, 1, 1, 0, 0, 0, 123456) - cftime.DatetimeNoLeap(2000, 1, 1, 0, 0, 0, 123455)
Out[2]: datetime.timedelta(microseconds=8)
We have worked around this before in xarray; see exact_cftime_datetime_difference.
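For illustration, the component-wise idea behind that workaround can be shown with standard library datetimes (whose own subtraction is already exact -- the rounding problem above is specific to cftime's date arithmetic). The helper below is a hypothetical sketch of the approach for the proleptic Gregorian calendar, not xarray's actual exact_cftime_datetime_difference:

```python
import datetime


def exact_datetime_difference(a, b):
    # Rebuild each date as an exact integer count of microseconds from
    # its components and subtract with integer arithmetic, rather than
    # trusting the date library's subtraction (which, in cftime's case,
    # may round through floating point).
    def to_microseconds(d):
        days = datetime.date(d.year, d.month, d.day).toordinal()
        seconds = ((days * 24 + d.hour) * 60 + d.minute) * 60 + d.second
        return seconds * 10**6 + d.microsecond

    return datetime.timedelta(microseconds=to_microseconds(a) - to_microseconds(b))
```

A version for cftime would compute the day count with the calendar in question (noleap, 360_day, ...) instead of datetime.date.toordinal.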
Thanks @huard for these latest updates. I think this is pretty close.
This looks great @huard -- a couple more minor things:
- Could you update the docstrings for DataArray.interpolate_na and Dataset.interpolate_na to reflect that max_gap can now be a datetime.timedelta object too?
- Upstream failures are unrelated, though these Windows failures look real. I think it's likely due to imprecise cftime arithmetic. We should probably use assert_allclose instead of assert_equal for these tests.
Windows failures
================================== FAILURES ===================================
____ test_interpolate_na_max_gap_time_specifier[3H-<lambda>0-cftime_range] ____

da_time = <xarray.DataArray (t: 11)>
array([nan,  1.,  2., nan, nan,  5., nan, nan, nan, nan, 10.])
Coordinates:
  * t        (t) object 2001-01-01 00:00:00 ... 2001-01-01 10:00:00
max_gap = '3H', transform = <function <lambda> at 0x000001EB4C167E58>
time_range_func = <function cftime_range at 0x000001EB453B0B88>

    @requires_bottleneck
    @pytest.mark.parametrize("time_range_func", [pd.date_range, xr.cftime_range])
    @pytest.mark.parametrize("transform", [lambda x: x, lambda x: x.to_dataset(name="a")])
    @pytest.mark.parametrize(
        "max_gap", ["3H", np.timedelta64(3, "h"), pd.to_timedelta("3H")]
    )
    def test_interpolate_na_max_gap_time_specifier(
        da_time, max_gap, transform, time_range_func
    ):
        da_time["t"] = time_range_func("2001-01-01", freq="H", periods=11)
        expected = transform(
            da_time.copy(data=[np.nan, 1, 2, 3, 4, 5, np.nan, np.nan, np.nan, np.nan, 10])
        )
        actual = transform(da_time).interpolate_na("t", max_gap=max_gap)
>       assert_equal(actual, expected)
E       AssertionError: Left and right DataArray objects are not equal
E
E       Differing values:
E       L
E           array([nan,  1.,  2.,  3.,  4.,  5., nan, nan, nan, nan, 10.])
E       R
E           array([nan,  1.,  2.,  3.,  4.,  5., nan, nan, nan, nan, 10.])

xarray\tests\test_missing.py:568: AssertionError

The five remaining cftime_range parametrizations fail with the same assertion
(the printed arrays agree to display precision but differ bitwise):

_ test_interpolate_na_max_gap_time_specifier[3H-<lambda>1-cftime_range] _
_ test_interpolate_na_max_gap_time_specifier[max_gap1-<lambda>0-cftime_range] _
_ test_interpolate_na_max_gap_time_specifier[max_gap1-<lambda>1-cftime_range] _
_ test_interpolate_na_max_gap_time_specifier[max_gap2-<lambda>0-cftime_range] _
_ test_interpolate_na_max_gap_time_specifier[max_gap2-<lambda>1-cftime_range] _
Co-Authored-By: Spencer Clark <[email protected]>
Is the switch to …

I think so. If there's a branch fixing the assert_allclose failures, I can merge it here.

I think it's just this function: …

It seems like …

Hmm...I think it should treat NaNs as equal by default. Are you using the version defined here? (xarray/tests/__init__.py, line 170 at ecd67f4)

Ah! No, I tried numpy.testing's version. Works now. Thanks!

Thanks @huard and @spencerkclark. This took a solid amount of work. @spencerkclark merge if you think it's good to go?

Indeed thanks @huard -- I'll give this another once-over tomorrow. If everything checks out, which I anticipate it will, I'll merge it then.
@huard in looking things over again, I noticed a few things related to the implementation of timedelta_to_numeric that I think need attention. I think it would be helpful to have some explicit tests for it too. Sorry for not catching these earlier.
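For reference, one plausible shape for timedelta_to_numeric -- a hypothetical sketch, not necessarily the signature or dispatch order that was merged -- normalizing the accepted timedelta flavors to a float count of a given unit:

```python
import datetime

import numpy as np
import pandas as pd


def timedelta_to_numeric(value, datetime_unit="ns"):
    # Dividing two np.timedelta64 values returns a float and handles the
    # unit conversion exactly in integer nanoseconds first.
    if isinstance(value, np.timedelta64):
        return value / np.timedelta64(1, datetime_unit)
    if isinstance(value, datetime.timedelta):
        return np.timedelta64(value) / np.timedelta64(1, datetime_unit)
    if isinstance(value, str):
        # pandas parses offset strings like "3h" into a Timedelta.
        return pd.to_timedelta(value).to_timedelta64() / np.timedelta64(1, datetime_unit)
    raise TypeError(f"Expected a timedelta-like value, got {type(value)}")
```

Explicit tests would then cover each branch (np.timedelta64, datetime.timedelta, string) plus the TypeError path.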
Great, thanks @huard, in particular for the new test. It would be great if you could take care of some last few nits; then I think things should be ready to merge.
> I think it's clearer if both functions datetime_to_numeric and timedelta_to_numeric are side by side.

I see what you mean, particularly given the similarity in the handling of the datetime_unit argument. I think it's fine to leave them side by side for now. We can always move them later, given they are private functions.
…named array to value for pd_timedelta_to_float. removed pd_timedeltaindex_to_float.
Thanks @huard!

Thanks @spencerkclark for shepherding this to completion.
* master:
  Add support for CFTimeIndex in get_clean_interp_index (pydata#3631)
  sel with categorical index (pydata#3670)
  bump min deps for 0.15 (pydata#3713)
  setuptools-scm and isort tweaks (pydata#3720)
  Allow binned coordinates on 1D plots y-axis. (pydata#3685)
  apply_ufunc: Add meta kwarg + bump dask to 2.2 (pydata#3660)
  setuptools-scm and one-liner setup.py (pydata#3714)
  Feature/align in dot (pydata#3699)
  ENH: enable `H5NetCDFStore` to work with already open h5netcdf.File a… (pydata#3618)
  One-off isort run (pydata#3705)
  hardcoded xarray.__all__ (pydata#3703)
  Bump mypy to v0.761 (pydata#3704)
  remove DataArray and Dataset constructor deprecations for 0.15 (pydata#3560)
  Tests for variables with units (pydata#3654)
  Add an example notebook using apply_ufunc to vectorize 1D functions (pydata#3629)
  Use encoding['dtype'] over data.dtype when possible within CFMaskCoder.encode (pydata#3652)
- black . && mypy . && flake8
- whats-new.rst for all changes and api.rst for new API

Related to #3349
As suggested by @spencerkclark, index values are computed as a delta with respect to 1970-01-01.
At the moment, this fails if dates fall outside of the range for nanosecond timedeltas [1678 AD, 2262 AD]. Is this something we can fix?
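The [1678 AD, 2262 AD] bounds follow directly from storing nanoseconds in a signed 64-bit integer; a quick back-of-the-envelope check:

```python
import numpy as np

# int64 nanoseconds overflow at +/- 2**63 ns, i.e. roughly +/- 292
# years; centred on the 1970-01-01 epoch that gives approximately the
# [1678 AD, 2262 AD] window. A coarser unit such as timedelta64[us]
# widens the representable span by a factor of 1000.
span_years = 2**63 / (1e9 * 86400 * 365.2425)
print(round(1970 - span_years), round(1970 + span_years))  # 1678 2262
```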