Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Starter property-based test suite #1972

Merged
merged 1 commit into from
Mar 20, 2018
Merged

Conversation

Zac-HD
Copy link
Contributor

@Zac-HD Zac-HD commented Mar 7, 2018

This is a small property-based test suite, to give two examples of the kinds of tests that we could write for Xarray using Hypothesis.

  1. For any array, encoding and decoding it with a CF coder outputs an identical array. As you would hope, these tests pass.
  2. For any 2D array, you can call the 2D plotting methods without raising an exception. Alas, this is not the case, and Hypothesis will show you the failing inputs (and matplotlib-related tracebacks) to prove it.
    (Contributing a very small feature to matplotlib was shockingly painful, so I'm not planning to take a similar suite upstream myself unless something changes)

Things that I would like to know:

  • Have I build-wrangled something reasonable here?
  • Will anyone else contribute property-based tests? I'm happy to help people debug or work out how to test something, but I simply don't have the time to write another test suite for free.
  • Is this something you want?

@Zac-HD Zac-HD force-pushed the hypothesis-tests branch 2 times, most recently from 6c02f79 to 6df5cd3 Compare March 8, 2018 03:03
@Zac-HD
Copy link
Contributor Author

Zac-HD commented Mar 8, 2018

Ping @fmaussion / @shoyer - would love your opinions on this, including high-value targets to test.

(btw Appveyor had an internal error; build is otherwise green)

@shoyer
Copy link
Member

shoyer commented Mar 8, 2018

This looks like a great start to me -- thank you!

It's impressive that it's possible to break every plotting type with matplotlib :).

@Zac-HD
Copy link
Contributor Author

Zac-HD commented Mar 9, 2018

This looks like a great start to me -- thank you!

You're welcome! Same questions as above though, plus "is there anything else you need in this initial PR?". If not, can we merge it

It's impressive that it's possible to break every plotting type with matplotlib :).

As much as I love matplotlib, it's a steaming pile of hacks and I want to avoid it more than I want it cleaned up 😥 (entirely because the process is dysfunctional, not the code)

@shoyer
Copy link
Member

shoyer commented Mar 9, 2018

One thing that comes to mind is organization... would it make sense to put this alongside the current xarray tests, e.g., have xarray/tests/unit and xarray/tests/property?

I guess one downside of this would be that it could change how we need to invoke py.test by default, if we don't want to trigger all the property based tests.

@tacaswell
Copy link

What do the failing data sets look like? Does it get easier or harder to find failures if you go up to 10x10? What sort of exceptions are you getting?

You can shove a fair amount of configuration in to pytest.ini to make these opt-in.

@Zac-HD
Copy link
Contributor Author

Zac-HD commented Mar 9, 2018

@shoyer - that depends mostly on whether you want to run these tests as part of a standard run. A test-time dependency on Hypothesis is very cheap compared to the other dev dependencies, so I'd think more about the impact on eg CI resources than contributors.

Upside is more power and coverage of edge-cases with odd data; downside is that they take a lot longer by virtue of trying hundreds of examples, and in this case also having to generate arrays takes a while (~log(elements) average via sparse filling).


@tacaswell - I would be delighted to write a test suite like this for matplotlib! The only reason I haven't is because I thought it would be rude to report so many bugs that I don't have time to help fix. If we can get a group project going though I'd be very enthusiastic 😄

  • For this suite I was passing in 2D arrays of unsigned ints, signed ints, or floats (in all sizes), with edge sizes between 2 and 10.
  • The failing outputs were reported as 2x2 arrays (~30 times), or 3x2 (once - there's a cap on number of shrinks that I think I hit there).
  • Values tended to be either uniform small integers - 0 or 1, dtype int8 or uint8; or for some an array of floats with one corner zero and the others very large.

I didn't keep the exact tracebacks, but I remember seeing many come from overflow in tick spacing calculations. Again, happy to write a test suite and make more detailed reports upstream if people want to fix this - in which case let's open an issue there!

@max-sixty
Copy link
Collaborator

If we don't want to trigger by default, we can do something like this and require passing this to run them:

pytest --property-tests

@jklymak
Copy link
Contributor

jklymak commented Mar 9, 2018

As pointed out on the matplotlib gitter:

If you run

import numpy as np
import xarray as xr
import matplotlib.pyplot as plt

for i in range(200):
    xr.DataArray(np.array([[0, 0], [0, 0]], dtype=np.uint8)).plot.pcolormesh()

at step 165 you will get:

File "/Users/jklymak/matplotlib/lib/matplotlib/figure.py", line 236, in update
    raise ValueError('left cannot be >= right')
ValueError: left cannot be >= right

Why? Because you have made a plot that if it displays looks like:

fig1

Are you sure your test isn't doing something similar? At some point there just isn't room for more colorbars! Adding a plt.clf() can cure the problem.

Its also is possible you are hitting floating point overflows with your test. At some point Matplotlib needs to be able to manipulate the data that comes in, and if you operate near the maximum number your data type can handle, you'll have problems. Just like you would if you just did

a = 2*xr.DataArray(np.array([[0, 0], [0, 1e308]]))

you will get:

/Users/jklymak/anaconda3/envs/matplotlibdev/lib/python3.6/site-packages/xarray/core/variable.py:1165: RuntimeWarning: overflow encountered in multiply

So maybe your hypothesis tester could be constrained to stay away from floating point overflows?

Matplotlib indeed has flaws and quirks, but if you are finding bugs it would be good to isolate them.

@Zac-HD
Copy link
Contributor Author

Zac-HD commented Mar 9, 2018

...that also explains why I was having trouble reproducing the error, whoops. I'll see how it goes with those problems excluded later tonight!

@Zac-HD
Copy link
Contributor Author

Zac-HD commented Mar 9, 2018

@jklymak - I'm getting RuntimeError: Invalid DISPLAY variable in the Qt backend now that I've added plt.clf(). It works on my machine now (:tada:, thanks!) - any suggestions for Travis?

(I'm also getting Zarr errors, but I assume those will go away soon as I didn't cause them)

@tacaswell
Copy link

Set the backend to Agg on travis as you don't have a xserever running. You probably want to manually force a draw as well.


import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E402 module level import not at top of file

matplotlib.use('Agg')
import matplotlib.pyplot as plt

from hypothesis import given, settings
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E402 module level import not at top of file

import matplotlib.pyplot as plt

from hypothesis import given, settings
import hypothesis.strategies as st
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E402 module level import not at top of file


from hypothesis import given, settings
import hypothesis.strategies as st
import hypothesis.extra.numpy as npst
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E402 module level import not at top of file

import hypothesis.strategies as st
import hypothesis.extra.numpy as npst

import xarray as xr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E402 module level import not at top of file

two_dimensional_array = st.sampled_from([
dict(dtype=npst.unsigned_integer_dtypes() | npst.integer_dtypes()),
dict(dtype=npst.floating_dtypes(), elements=st.floats(-2.**100, 2.**100)),
# Then, we "flatmap" this into an arrays strategy - ie create a strategy using
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E122 continuation line missing indentation or outdented

dict(dtype=npst.unsigned_integer_dtypes() | npst.integer_dtypes()),
dict(dtype=npst.floating_dtypes(), elements=st.floats(-2.**100, 2.**100)),
# Then, we "flatmap" this into an arrays strategy - ie create a strategy using
# the kwargs above and a shape, then draw a value from that.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E122 continuation line missing indentation or outdented

# Then, we "flatmap" this into an arrays strategy - ie create a strategy using
# the kwargs above and a shape, then draw a value from that.
]).flatmap(lambda kwargs: npst.arrays(
**kwargs, shape=st.tuples(st.integers(2, 5), st.integers(2, 5)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E999 SyntaxError: invalid syntax

@Zac-HD Zac-HD force-pushed the hypothesis-tests branch from a7a7876 to 7c89212 Compare March 10, 2018 00:19
dict(dtype=npst.unsigned_integer_dtypes() | npst.integer_dtypes()),
dict(dtype=st.sampled_from(['float32', 'float64']),
elements=st.floats(-2.**50, 2.**50)),
# Then, we "flatmap" this into an arrays strategy - ie create a strategy using
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E122 continuation line missing indentation or outdented

dict(dtype=st.sampled_from(['float32', 'float64']),
elements=st.floats(-2.**50, 2.**50)),
# Then, we "flatmap" this into an arrays strategy - ie create a strategy using
# the kwargs above and a shape, then draw a value from that.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E122 continuation line missing indentation or outdented

# Then, we "flatmap" this into an arrays strategy - ie create a strategy using
# the kwargs above and a shape, then draw a value from that.
]).flatmap(lambda kwargs: npst.arrays(
**kwargs, shape=st.tuples(st.integers(2, 5), st.integers(2, 5)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E999 SyntaxError: invalid syntax


import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E402 module level import not at top of file

@Zac-HD Zac-HD force-pushed the hypothesis-tests branch from a18d973 to 47e0258 Compare March 11, 2018 02:30
dict(dtype=st.sampled_from(['float32', 'float64']),
elements=st.floats(-2.**50, 2.**50)),
]).flatmap(lambda kwargs: npst.arrays(
shape=st.tuples(st.integers(2, 5), st.integers(2, 5)), **kwargs,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E999 SyntaxError: invalid syntax

@Zac-HD Zac-HD force-pushed the hypothesis-tests branch from 47e0258 to 1db77e6 Compare March 15, 2018 11:10
@Zac-HD
Copy link
Contributor Author

Zac-HD commented Mar 15, 2018

@shoyer & @fmaussion - I've just given up on the plotting tests as being more effort than they're worth. Are there any:

@shoyer
Copy link
Member

shoyer commented Mar 17, 2018

blockers to merging this as-is?

This looks pretty good to me in its current state. I would say we should merge it now and iterate in future PRs.

other APIs you think it would be reasonably easy to write property tests for? Here's a nice list of properties to test 😄

Almost anywhere where we currently make heavy use of parametrize would be a good candidate. Some other possibilities:

  • Consistency with pandas for groupby/rolling aggregations.
  • Roundtrip writing/reading data to netCDF. There are a couple of known exceptions (e.g., dtypes not supported by netCDF and MultiIndex) but otherwise every xarray object should be serializable to netCDF and back without data loss.
  • Roundtrip to/from pandas Series/DataFrame with to_series()/to_dataframe()/to_xarray().
  • Indexing consistency tests for backends: all indexing operations should be supported consistently on data accessed from any backend.
  • NumPy vs Dask: any operation on dask arrays should be consistent with the operation on numpy arrays (e.g., f(xarray_obj.chunk()).compute() == f(xarray_obj)).
  • Indexing followed by xarray.concat: should get back the same result.
  • Binary arithmetic on xarray objects with Python operators (+, -, etc) and NumPy ufuncs (np.add, np.subtract, etc).

@Zac-HD
Copy link
Contributor Author

Zac-HD commented Mar 20, 2018

I would say we should merge it now and iterate in future PRs.

Merge away then!

@fmaussion
Copy link
Member

Thanks @Zac-HD !

@Zac-HD Zac-HD deleted the hypothesis-tests branch March 20, 2018 12:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants