Starter property-based test suite #1972

Zac-HD · 2018-03-07T13:45:07Z

Closes Add a suite of property-based tests with Hypothesis #1846
Tests added - you bet
Tests passed - well, the code under test hasn't changed...

This is a small property-based test suite, to give two examples of the kinds of tests that we could write for Xarray using Hypothesis.

For any array, encoding and decoding it with a CF coder outputs an identical array. As you would hope, these tests pass.
For any 2D array, you can call the 2D plotting methods without raising an exception. Alas, this is not the case, and Hypothesis will show you the failing inputs (and matplotlib-related tracebacks) to prove it.
(Contributing a very small feature to matplotlib was shockingly painful, so I'm not planning to take a similar suite upstream myself unless something changes)
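
For readers who have not seen a round-trip property before, here is a minimal sketch of the first kind of test. It assumes `CFMaskCoder` from `xarray.coding.variables` as the coder under test; the strategy name, the dimension names, and the restriction to small float64 arrays are illustrative choices, not necessarily what the PR itself uses.

```python
import numpy as np
import xarray as xr
import hypothesis.extra.numpy as npst
from hypothesis import given, settings
from xarray.coding.variables import CFMaskCoder

# Illustrative strategy: small 1D or 2D float64 arrays (may contain NaN/inf).
small_float_arrays = npst.arrays(
    dtype=np.float64,
    shape=npst.array_shapes(min_dims=1, max_dims=2, max_side=5),
)


@settings(deadline=None, max_examples=50)
@given(small_float_arrays)
def test_cf_mask_coder_roundtrip(arr):
    # Hypothetical dimension names, one per axis.
    dims = tuple("dim_%d" % i for i in range(arr.ndim))
    original = xr.Variable(dims, arr)
    coder = CFMaskCoder()
    # The property: decoding an encoded variable gives back the original.
    roundtripped = coder.decode(coder.encode(original))
    xr.testing.assert_identical(original, roundtripped)
```

Calling `test_cf_mask_coder_roundtrip()` directly, or collecting it with pytest, runs the property against the generated arrays.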

Things that I would like to know:

Have I build-wrangled something reasonable here?
Will anyone else contribute property-based tests? I'm happy to help people debug or work out how to test something, but I simply don't have the time to write another test suite for free.
Is this something you want?

Zac-HD · 2018-03-08T07:29:18Z

Ping @fmaussion / @shoyer - would love your opinions on this, including high-value targets to test.

(btw Appveyor had an internal error; build is otherwise green)

shoyer · 2018-03-08T19:17:48Z

This looks like a great start to me -- thank you!

It's impressive that it's possible to break every plotting type with matplotlib :).

Zac-HD · 2018-03-09T00:26:13Z

> This looks like a great start to me -- thank you!

You're welcome! Same questions as above though, plus "is there anything else you need in this initial PR?". If not, can we merge it?

> It's impressive that it's possible to break every plotting type with matplotlib :).

As much as I love matplotlib, it's a steaming pile of hacks and I want to avoid it more than I want it cleaned up 😥 (entirely because the process is dysfunctional, not the code)

shoyer · 2018-03-09T00:31:07Z

One thing that comes to mind is organization... would it make sense to put this alongside the current xarray tests, e.g., have xarray/tests/unit and xarray/tests/property?

I guess one downside of this would be that it could change how we need to invoke py.test by default, if we don't want to trigger all the property-based tests.

tacaswell · 2018-03-09T01:20:44Z

What do the failing data sets look like? Does it get easier or harder to find failures if you go up to 10x10? What sort of exceptions are you getting?

You can shove a fair amount of configuration into pytest.ini to make these opt-in.

Zac-HD · 2018-03-09T02:05:18Z

@shoyer - that depends mostly on whether you want to run these tests as part of a standard run. A test-time dependency on Hypothesis is very cheap compared to the other dev dependencies, so I'd think more about the impact on eg CI resources than contributors.

Upside is more power and coverage of edge-cases with odd data; downside is that they take a lot longer by virtue of trying hundreds of examples, and in this case also having to generate arrays takes a while (~log(elements) average via sparse filling).
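
One way to manage that cost is with Hypothesis settings profiles, so CI can run fewer examples than a local deep run. The sketch below is only an illustration: the profile names, the example counts, and the `XARRAY_HYPOTHESIS_PROFILE` environment variable are all made up for this example.

```python
import os

from hypothesis import settings

# Hypothetical profiles: a cheap one for CI and an expensive one for local runs.
settings.register_profile("ci", max_examples=25, deadline=None)
settings.register_profile("thorough", max_examples=1000, deadline=None)


def load_profile_from_env(default="ci"):
    """Activate the profile named by a (hypothetical) environment variable."""
    name = os.environ.get("XARRAY_HYPOTHESIS_PROFILE", default)
    settings.load_profile(name)
    return name
```

Calling `load_profile_from_env()` from a `conftest.py` would apply the chosen profile to every test that does not override its own settings.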

@tacaswell - I would be delighted to write a test suite like this for matplotlib! The only reason I haven't is because I thought it would be rude to report so many bugs that I don't have time to help fix. If we can get a group project going though I'd be very enthusiastic 😄

For this suite I was passing in 2D arrays of unsigned ints, signed ints, or floats (in all sizes), with edge sizes between 2 and 10.
The failing outputs were reported as 2x2 arrays (~30 times), or 3x2 (once - there's a cap on number of shrinks that I think I hit there).
Values tended to be either uniform small integers - 0 or 1, dtype int8 or uint8; or for some an array of floats with one corner zero and the others very large.

I didn't keep the exact tracebacks, but I remember seeing many come from overflow in tick spacing calculations. Again, happy to write a test suite and make more detailed reports upstream if people want to fix this - in which case let's open an issue there!
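
To make the input description above concrete, a strategy along these lines would generate such arrays. This is a sketch written for this thread rather than the code in the PR; the variable and test names are illustrative.

```python
import hypothesis.strategies as st
import hypothesis.extra.numpy as npst
from hypothesis import given, settings

# Unsigned ints, signed ints, or floats, in all the sizes Hypothesis offers.
dtypes = (
    npst.unsigned_integer_dtypes()
    | npst.integer_dtypes()
    | npst.floating_dtypes()
)
# Two dimensions, each edge between 2 and 10 elements long.
shapes = st.tuples(st.integers(2, 10), st.integers(2, 10))
two_dimensional_arrays = npst.arrays(dtype=dtypes, shape=shapes)


@settings(deadline=None, max_examples=50)
@given(two_dimensional_arrays)
def test_generated_arrays_match_description(arr):
    # Sanity-check the strategy itself before pointing it at plotting code.
    assert arr.ndim == 2
    assert all(2 <= edge <= 10 for edge in arr.shape)
    assert arr.dtype.kind in "uif"
```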

max-sixty · 2018-03-09T02:18:13Z

If we don't want to trigger by default, we can do something like this and require passing this to run them:

pytest --property-tests
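
pytest has no built-in `--property-tests` flag, so it would have to be registered in a `conftest.py`. A minimal sketch, following the usual pytest pattern for a custom option that controls skipping, and assuming the property tests carry a hypothetical `property` marker:

```python
import pytest


def pytest_addoption(parser):
    # Register the custom flag; it is off unless passed explicitly.
    parser.addoption(
        "--property-tests",
        action="store_true",
        default=False,
        help="run the Hypothesis property-based tests",
    )


def pytest_configure(config):
    # Declare the (hypothetical) marker so pytest does not warn about it.
    config.addinivalue_line("markers", "property: slow property-based test")


def pytest_collection_modifyitems(config, items):
    if config.getoption("--property-tests"):
        return  # flag given: run everything
    skip = pytest.mark.skip(reason="need --property-tests option to run")
    for item in items:
        if "property" in item.keywords:
            item.add_marker(skip)
```

Tests decorated with `@pytest.mark.property` would then be skipped on a plain `pytest` run and executed with `pytest --property-tests`.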

jklymak · 2018-03-09T05:38:23Z

As pointed out on the matplotlib gitter:

If you run

import numpy as np
import xarray as xr
import matplotlib.pyplot as plt

for i in range(200):
    xr.DataArray(np.array([[0, 0], [0, 0]], dtype=np.uint8)).plot.pcolormesh()

at step 165 you will get:

File "/Users/jklymak/matplotlib/lib/matplotlib/figure.py", line 236, in update
    raise ValueError('left cannot be >= right')
ValueError: left cannot be >= right

Why? Because you have made a plot that if it displays looks like:

Are you sure your test isn't doing something similar? At some point there just isn't room for more colorbars! Adding a plt.clf() can cure the problem.

It's also possible you are hitting floating point overflows with your test. At some point Matplotlib needs to be able to manipulate the data that comes in, and if you operate near the maximum number your data type can handle, you'll have problems. Just like you would if you just did

a = 2*xr.DataArray(np.array([[0, 0], [0, 1e308]]))

you will get:

/Users/jklymak/anaconda3/envs/matplotlibdev/lib/python3.6/site-packages/xarray/core/variable.py:1165: RuntimeWarning: overflow encountered in multiply

So maybe your hypothesis tester could be constrained to stay away from floating point overflows?

Matplotlib indeed has flaws and quirks, but if you are finding bugs it would be good to isolate them.
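
As a sketch of what "staying away from overflows" could look like on the Hypothesis side: bound the element strategy so that simple arithmetic on the generated data stays finite. The bound of 2**100 and the names below are illustrative assumptions.

```python
import numpy as np
import hypothesis.strategies as st
import hypothesis.extra.numpy as npst
from hypothesis import given, settings

# Bounded floats: with both bounds set, NaN and infinity are excluded by default.
bounded_floats = st.floats(min_value=-2.0 ** 100, max_value=2.0 ** 100)
bounded_arrays = npst.arrays(
    dtype=np.float64, shape=(2, 2), elements=bounded_floats
)


@settings(deadline=None, max_examples=50)
@given(bounded_arrays)
def test_doubling_stays_finite(arr):
    # Turn overflow into an error instead of a RuntimeWarning.
    with np.errstate(over="raise"):
        doubled = 2 * arr
    assert np.isfinite(doubled).all()
```

With unbounded float64 elements the same test would fail on values near 1e308, which is the situation described above.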

Zac-HD · 2018-03-09T06:05:46Z

...that also explains why I was having trouble reproducing the error, whoops. I'll see how it goes with those problems excluded later tonight!

Zac-HD · 2018-03-09T12:11:02Z

@jklymak - I'm getting RuntimeError: Invalid DISPLAY variable in the Qt backend now that I've added plt.clf(). It works on my machine now (:tada:, thanks!) - any suggestions for Travis?

(I'm also getting Zarr errors, but I assume those will go away soon as I didn't cause them)

tacaswell · 2018-03-09T13:58:15Z

Set the backend to Agg on Travis, as you don't have an X server running. You probably want to manually force a draw as well.
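
A minimal sketch of that setup, using plain matplotlib rather than the xarray plotting methods. The helper name is hypothetical, and it closes each figure instead of calling `plt.clf()`, which is a swapped-in way of keeping colorbars from piling up on one figure.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; select it before importing pyplot
import matplotlib.pyplot as plt  # noqa: E402  (import deliberately after use())
import numpy as np  # noqa: E402


def draw_and_discard(arr):
    """Plot a 2D array with a colorbar, force a draw, then close the figure."""
    fig, ax = plt.subplots()
    try:
        mesh = ax.pcolormesh(arr)
        fig.colorbar(mesh, ax=ax)
        fig.canvas.draw()  # rendering is where many plotting errors surface
    finally:
        plt.close(fig)  # a fresh figure per call, so nothing accumulates


for _ in range(5):
    draw_and_discard(np.arange(6.0).reshape(2, 3))
```

The `# noqa: E402` comments are one way to quiet the "module level import not at top of file" warning that this import order triggers in flake8.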

stickler-ci · 2018-03-09T23:37:07Z

properties/test_plotting.py

+
+import matplotlib
+matplotlib.use('Agg')
+import matplotlib.pyplot as plt


E402 module level import not at top of file

stickler-ci · 2018-03-09T23:37:07Z

properties/test_plotting.py

+matplotlib.use('Agg')
+import matplotlib.pyplot as plt
+
+from hypothesis import given, settings


E402 module level import not at top of file

stickler-ci · 2018-03-09T23:37:07Z

properties/test_plotting.py

+import matplotlib.pyplot as plt
+
+from hypothesis import given, settings
+import hypothesis.strategies as st


E402 module level import not at top of file

stickler-ci · 2018-03-09T23:37:07Z

properties/test_plotting.py

+
+from hypothesis import given, settings
+import hypothesis.strategies as st
+import hypothesis.extra.numpy as npst


E402 module level import not at top of file

stickler-ci · 2018-03-09T23:37:07Z

properties/test_plotting.py

+import hypothesis.strategies as st
+import hypothesis.extra.numpy as npst
+
+import xarray as xr


E402 module level import not at top of file

stickler-ci · 2018-03-09T23:37:07Z

properties/test_plotting.py

+two_dimensional_array = st.sampled_from([
+    dict(dtype=npst.unsigned_integer_dtypes() | npst.integer_dtypes()),
+    dict(dtype=npst.floating_dtypes(), elements=st.floats(-2.**100, 2.**100)),
+# Then, we "flatmap" this into an arrays strategy - ie create a strategy using


E122 continuation line missing indentation or outdented

stickler-ci · 2018-03-09T23:37:07Z

properties/test_plotting.py

+    dict(dtype=npst.unsigned_integer_dtypes() | npst.integer_dtypes()),
+    dict(dtype=npst.floating_dtypes(), elements=st.floats(-2.**100, 2.**100)),
+# Then, we "flatmap" this into an arrays strategy - ie create a strategy using
+# the kwargs above and a shape, then draw a value from that.


E122 continuation line missing indentation or outdented

stickler-ci · 2018-03-09T23:37:08Z

properties/test_plotting.py

+# Then, we "flatmap" this into an arrays strategy - ie create a strategy using
+# the kwargs above and a shape, then draw a value from that.
+]).flatmap(lambda kwargs: npst.arrays(
+    **kwargs, shape=st.tuples(st.integers(2, 5), st.integers(2, 5)),


E999 SyntaxError: invalid syntax

stickler-ci · 2018-03-10T00:19:41Z

properties/test_plotting.py

+    dict(dtype=npst.unsigned_integer_dtypes() | npst.integer_dtypes()),
+    dict(dtype=st.sampled_from(['float32', 'float64']),
+         elements=st.floats(-2.**50, 2.**50)),
+# Then, we "flatmap" this into an arrays strategy - ie create a strategy using


E122 continuation line missing indentation or outdented

stickler-ci · 2018-03-10T00:19:41Z

properties/test_plotting.py

+    dict(dtype=st.sampled_from(['float32', 'float64']),
+         elements=st.floats(-2.**50, 2.**50)),
+# Then, we "flatmap" this into an arrays strategy - ie create a strategy using
+# the kwargs above and a shape, then draw a value from that.


E122 continuation line missing indentation or outdented

stickler-ci · 2018-03-10T00:19:41Z

properties/test_plotting.py

+# Then, we "flatmap" this into an arrays strategy - ie create a strategy using
+# the kwargs above and a shape, then draw a value from that.
+]).flatmap(lambda kwargs: npst.arrays(
+    **kwargs, shape=st.tuples(st.integers(2, 5), st.integers(2, 5)),


E999 SyntaxError: invalid syntax

stickler-ci · 2018-03-10T00:47:46Z

properties/test_plotting.py

+
+import matplotlib
+matplotlib.use('Agg')
+import matplotlib.pyplot as plt


E402 module level import not at top of file

stickler-ci · 2018-03-11T02:31:05Z

properties/test_plotting.py

+    dict(dtype=st.sampled_from(['float32', 'float64']),
+         elements=st.floats(-2.**50, 2.**50)),
+]).flatmap(lambda kwargs: npst.arrays(
+    shape=st.tuples(st.integers(2, 5), st.integers(2, 5)), **kwargs,


E999 SyntaxError: invalid syntax

Zac-HD · 2018-03-15T22:41:20Z

@shoyer & @fmaussion - I've just given up on the plotting tests as being more effort than they're worth. Are there any:

blockers to merging this as-is?
other APIs you think it would be reasonably easy to write property tests for? Here's a nice list of properties to test 😄

shoyer · 2018-03-17T21:35:52Z

> blockers to merging this as-is?

This looks pretty good to me in its current state. I would say we should merge it now and iterate in future PRs.

> other APIs you think it would be reasonably easy to write property tests for? Here's a nice list of properties to test 😄

Almost anywhere where we currently make heavy use of parametrize would be a good candidate. Some other possibilities:

Consistency with pandas for groupby/rolling aggregations.
Roundtrip writing/reading data to netCDF. There are a couple of known exceptions (e.g., dtypes not supported by netCDF and MultiIndex) but otherwise every xarray object should be serializable to netCDF and back without data loss.
Roundtrip to/from pandas Series/DataFrame with to_series()/to_dataframe()/to_xarray().
Indexing consistency tests for backends: all indexing operations should be supported consistently on data accessed from any backend.
NumPy vs Dask: any operation on dask arrays should be consistent with the operation on numpy arrays (e.g., f(xarray_obj.chunk()).compute() == f(xarray_obj)).
Indexing followed by xarray.concat: should get back the same result.
Binary arithmetic on xarray objects with Python operators (+, -, etc) and NumPy ufuncs (np.add, np.subtract, etc).
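
To illustrate the last item in that list, a property test could check that Python operators and the matching NumPy ufuncs agree on `DataArray` inputs. This is only a sketch: the strategy, the value bounds, and the dimension names are illustrative assumptions.

```python
import numpy as np
import xarray as xr
import hypothesis.strategies as st
import hypothesis.extra.numpy as npst
from hypothesis import given, settings

# Modest finite values, so overflow and NaN do not distract from the property.
finite = st.floats(min_value=-1e6, max_value=1e6)
shapes = st.tuples(st.integers(1, 4), st.integers(1, 4))


def same_shape_pair(shape):
    """Strategy for two independent float64 arrays sharing one shape."""
    arr = npst.arrays(dtype=np.float64, shape=shape, elements=finite)
    return st.tuples(arr, arr)


array_pairs = shapes.flatmap(same_shape_pair)


@settings(deadline=None, max_examples=50)
@given(array_pairs)
def test_operator_matches_ufunc(pair):
    left = xr.DataArray(pair[0], dims=("x", "y"))
    right = xr.DataArray(pair[1], dims=("x", "y"))
    # The operator and the ufunc should produce identical DataArrays.
    xr.testing.assert_identical(left + right, np.add(left, right))
    xr.testing.assert_identical(left - right, np.subtract(left, right))
```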

Zac-HD · 2018-03-20T12:03:13Z

> I would say we should merge it now and iterate in future PRs.

Merge away then!

fmaussion · 2018-03-20T12:42:37Z

Thanks @Zac-HD !

Zac-HD force-pushed the hypothesis-tests branch 2 times, most recently from 6c02f79 to 6df5cd3 on March 8, 2018 03:03

This was referenced Mar 9, 2018

RFC: A timeline for the next two major Hypothesis versions HypothesisWorks/hypothesis#1134

Closed

Deprecate use of APIs that violate our style guide HypothesisWorks/hypothesis#1155

Closed

stickler-ci reviewed Mar 9, 2018

Zac-HD force-pushed the hypothesis-tests branch from a7a7876 to 7c89212 on March 10, 2018 00:19

stickler-ci reviewed Mar 10, 2018

Zac-HD force-pushed the hypothesis-tests branch from a18d973 to 47e0258 on March 11, 2018 02:30

stickler-ci reviewed Mar 11, 2018

Starter property-based test suite

1db77e6

Zac-HD force-pushed the hypothesis-tests branch from 47e0258 to 1db77e6 on March 15, 2018 11:10

fmaussion merged commit 6456df4 into pydata:master Mar 20, 2018

fmaussion mentioned this pull request Mar 20, 2018

Add a suite of property-based tests with Hypothesis #1846

Open

Zac-HD deleted the hypothesis-tests branch March 20, 2018 12:51

spencerahill mentioned this pull request Mar 22, 2018

Property-based testing with Hypothesis spencerahill/aospy#260

Open
