test decoding num_dates in float types #1863

j08lue · 2018-01-28T19:34:52Z

- Closes Time decoding has round-off error 0.10.0. Gone now. #1859
- Tests added (for all bug fixes or enhancements)
- Tests passed (for all non-documentation changes)
- Try to find the origin of the difference in behaviour between v0.10.0 and current base

j08lue · 2018-01-28T19:43:21Z

For comparison, I added the same test to the 0.10.0 version in https://github.com/j08lue/xarray/tree/0p10-f4-time-decode, where it fails:

pytest -v xarray\tests\test_conventions.py

...

>               self.assertArrayEqual(expected, actual_cmp)
E               AssertionError:
E               Arrays are not equal
E
E               (mismatch 90.0%)
E                x: array([datetime.datetime(2000, 1, 1, 0, 0),
E                      datetime.datetime(2000, 1, 2, 0, 0),
E                      datetime.datetime(2000, 1, 3, 0, 0),...
E                y: array(['2000-01-01T00:00:00.000000', '2000-01-02T00:00:00.003211',
E                      '2000-01-03T00:00:00.006422', '2000-01-04T00:00:00.001245',
E                      '2000-01-05T00:00:00.012845', '2000-01-06T00:00:00.024444',...
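
For readers without the branch checked out, here is a minimal sketch of the kind of check the test performs. It is not the test from this PR: `decode_days` is a hypothetical stand-in for xarray's decoding function, written with plain NumPy and assuming "days since 2000-01-01" as units.

```python
import numpy as np


def decode_days(num_dates, origin="2000-01-01"):
    """Hypothetical helper: decode 'days since origin' to datetime64[ns]."""
    ns_per_day = 86_400 * 10**9
    # Upcast to float64 before scaling so the product is not rounded to float32.
    num_ns = (np.asarray(num_dates).astype(np.float64) * ns_per_day).astype(np.int64)
    return np.datetime64(origin, "ns") + num_ns.astype("timedelta64[ns]")


# Ten consecutive days starting at the origin, at nanosecond resolution.
expected = np.arange(
    "2000-01-01", "2000-01-11", dtype="datetime64[D]"
).astype("datetime64[ns]")

# The decoded result should be identical regardless of the input dtype.
for dtype in (np.int32, np.int64, np.float32, np.float64):
    actual = decode_days(np.arange(10, dtype=dtype))
    np.testing.assert_array_equal(actual, expected)
```

The loop over dtypes is the point: integer and float64 inputs already decode exactly, so a float32 case is what exposes the round-off shown in the failure above.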

j08lue · 2018-01-28T19:45:49Z

I believe the issue originates in these lines:

xarray/xarray/conventions.py, lines 174 to 175 in ac854f0:

    flat_num_dates_ns_int = (flat_num_dates *
                             _NS_PER_TIME_DELTA[delta]).astype(np.int64)

where we multiply the num_dates with some float value and then cast to int64.

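If num_dates is float32, numpy keeps the product in float32 when multiplying with a Python float such as 1e9. float32 has only a 24-bit significand, so a product of that size is rounded before the cast to int64, and that introduces the error. Here is a stripped version of the above, where the final assert fails: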

import numpy as np

flat_num_dates = np.arange(100).astype('float32')
n = 1e9  # Python float factor: the product stays float32
roundtripped = (flat_num_dates * n).astype(np.int64) / n
assert np.all(flat_num_dates == roundtripped)  # fails

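By the way, the factor has to be large, like 1e9. With a smaller factor such as 1e6 ('ms since ...'), the products for these values are still exactly representable in float32, so the error does not appear.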
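
To see which values are affected, the float32 product can be compared against a float64 reference. This is only a diagnostic sketch built on the stripped example; the names `exact`, `product` and `results` are mine and do not appear in xarray.

```python
import numpy as np

flat_num_dates = np.arange(100).astype("float32")

results = {}
for n in (1e6, 1e9):
    # Reference: every product here is below 2**53, so float64 holds it exactly.
    exact = flat_num_dates.astype(np.float64) * n
    # Same expression as in the stripped example: the result stays float32.
    product = flat_num_dates * n
    # Indices where the float32 product was rounded away from the exact value.
    bad = np.flatnonzero(product.astype(np.float64) != exact)
    results[n] = bad
    print(f"n={n:g}: dtype={product.dtype}, inexact products: {bad.size}")

# Spacing between neighbouring float32 values near 9e9, in nanoseconds.
print(np.spacing(np.float32(9e9)))
```

With a 24-bit significand, float32 cannot represent every integer above 2**24, so whether a given product survives depends on how many significant bits it needs, not only on its magnitude.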

The weird thing is that the corresponding code in the current master is identical:

xarray/xarray/coding/times.py, lines 151 to 152 in 50b0a69:

    flat_num_dates_ns_int = (flat_num_dates *
                             _NS_PER_TIME_DELTA[delta]).astype(np.int64)

I will look into why the result is still different from v0.10.0.

Also, if this really is the origin of the error, there are two easy ways to avoid this:

- Cast flat_num_dates to float64: (flat_num_dates.astype(np.float64) * n).astype(np.int64)
- Store _NS_PER_TIME_DELTA values as int, then numpy will do the casting.
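
A minimal sketch of both options applied to the stripped example. The second part is a variation rather than the code in xarray: it uses an explicit `np.int64` factor (my assumption), because how a plain Python int is promoted against a float32 array can depend on the NumPy version's scalar promotion rules.

```python
import numpy as np

flat_num_dates = np.arange(100).astype("float32")
n = 1e9

# Option 1: upcast explicitly, so the multiplication happens in float64.
ns_int = (flat_num_dates.astype(np.float64) * n).astype(np.int64)
assert np.all(ns_int / n == flat_num_dates)

# Variation on option 2 (assumption): an np.int64 factor instead of a float.
# float32 combined with int64 promotes to float64, so the product is exact here.
n_int = np.int64(10**9)
product = flat_num_dates * n_int
print(product.dtype)
assert np.all(product.astype(np.int64) / n == flat_num_dates)
```

The explicit `astype(np.float64)` does not rely on promotion rules at all, which makes it the more predictable of the two.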

j08lue · 2018-01-28T19:59:23Z

> Store _NS_PER_TIME_DELTA values as int, then numpy will do the casting.

Haha, OK, this is actually what you implemented in 50b0a69:

xarray/xarray/coding/times.py, lines 32 to 37 in 50b0a69:

    _NS_PER_TIME_DELTA = {'us': int(1e3),
                          'ms': int(1e6),
                          's': int(1e9),
                          'm': int(1e9) * 60,
                          'h': int(1e9) * 60 * 60,
                          'D': int(1e9) * 60 * 60 * 24}

So that is why it works now.

Case closed, I guess.

shoyer · 2018-02-02T02:01:43Z

Nice, thank you @j08lue for tracking this down and adding tests!

test decoding num_dates in float types

d1659b6

j08lue mentioned this pull request Jan 28, 2018

Time decoding has round-off error 0.10.0. Gone now. #1859

Closed

shoyer merged commit becd77c into pydata:master Feb 2, 2018

j08lue deleted the test-coding-times-float branch February 10, 2018 12:16

j08lue mentioned this pull request Mar 5, 2018

avoid integer overflow when decoding large time numbers #1965

Closed
