Marks slow, flaky, and failing tests #1336
Conversation
Force-pushed from a554dd3 to 014a928.
@shoyer, this PR should also fix the issue with Travis CI failing due to tests that open too many files. Note, when a user runs
Force-pushed from 014a928 to 82a8601.
cc @MaximilianR
@shoyer, all tests pass and I've spot-checked that this works as expected, e.g.,
xarray/tests/test_backends.py::OpenMFDatasetManyFilesTest::test_1_autoclose_netcdf4 PASSED
xarray/tests/test_backends.py::OpenMFDatasetManyFilesTest::test_1_open_large_num_files_netcdf4 SKIPPED
xarray/tests/test_backends.py::OpenMFDatasetManyFilesTest::test_2_autoclose_scipy PASSED
xarray/tests/test_backends.py::OpenMFDatasetManyFilesTest::test_2_open_large_num_files_scipy SKIPPED
xarray/tests/test_backends.py::OpenMFDatasetManyFilesTest::test_3_autoclose_pynio PASSED
xarray/tests/test_backends.py::OpenMFDatasetManyFilesTest::test_3_open_large_num_files_pynio SKIPPED
@shoyer, is there anything else you would like done on this before merging?
@pwolfram Thanks for putting this together. "Slow" tests on Travis aren't a huge problem -- it takes at least a minute for Travis to complete even in the best case, so having to wait a minute more is not so bad. It's really an issue for interactive, local development, where it's really valuable for tests to complete in less than 10 seconds. Flaky tests, which fail sometimes, are the problem. So I think we want a different category for these many-file tests, maybe "Flaky". Ideally we would have another Travis-CI job set up to run these that isn't required to always pass, like our existing jobs for testing development versions of other libraries.
Force-pushed from 82a8601 to 732e042.
Force-pushed from 732e042 to 62b0c9e.
@shoyer, as we discussed, here is a more robust version of the testing, as needed for 0.9.2.
This also fixes the issue noted in #1038 where flakey tests cause Travis CI failures.
conftest.py (Outdated)

def pytest_addoption(parser):
    parser.addoption("--skip-flakey", action="store_true",
        help="skips flakey tests")
nit: please indent even with the ( on the previous line.
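For context, a common way to wire up a flag like this is to register the option in `conftest.py` and then skip marked tests during collection. The sketch below is illustrative only and is not the PR's exact code (the PR defines `skipif`-based markers in `xarray/tests/__init__.py`); the `flakey` marker name is taken from this thread.

```python
# conftest.py -- illustrative sketch only, not the code from this PR.
import pytest


def pytest_addoption(parser):
    # Register the command-line flag discussed above.
    parser.addoption("--skip-flakey", action="store_true",
                     help="skips flakey tests")


def pytest_collection_modifyitems(config, items):
    # If --skip-flakey was given, attach a skip marker to every test
    # carrying the (assumed) "flakey" marker.
    if not config.getoption("--skip-flakey"):
        return
    skip_flakey = pytest.mark.skip(reason="--skip-flakey was given")
    for item in items:
        if "flakey" in item.keywords:
            item.add_marker(skip_flakey)
```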
.travis.yml (Outdated)

@@ -76,7 +76,7 @@ install:
   - python setup.py install

 script:
-  - py.test xarray --cov=xarray --cov-report term-missing --verbose
+  - py.test xarray --cov=xarray --cov-report term-missing --verbose --skip-flakey
Can you make `--skip-flakey` an environment variable so we can make one of our "allowed failure" matrix builds not skip flakey tests?
xarray/tests/__init__.py (Outdated)

slow = pytest.mark.skipif(
    not pytest.config.getoption("--run-slow"),
    reason="set --run-slow option to run slow tests")
Just to be clear, it looks like the current default behavior is:
- Don't skip flakey tests
- Skip slow tests

I think we actually want to flip these. Consider someone running our test suite for the first time. They probably want to know whether their xarray installation worked. It's not a big deal if this is a little slow, because they probably aren't doing it many times in a row, but not running part of the test suite would be unfortunate. Once they notice the tests are slow, they will probably quickly look for a `--skip-slow` option.
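A minimal sketch of what the flipped defaults could look like, reusing the `pytest.config.getoption` pattern from the diff above (that was the idiom at the time). The option names follow this discussion rather than the final code, and the sketch assumes both options are registered in `conftest.py`:

```python
import pytest

# Slow tests run by default and are skipped only on request.
slow = pytest.mark.skipif(
    pytest.config.getoption("--skip-slow"),
    reason="--skip-slow was set, skipping slow tests")

# Flakey tests are skipped by default and run only on request.
flakey = pytest.mark.skipif(
    not pytest.config.getoption("--run-flakey"),
    reason="set --run-flakey option to run flakey tests")
```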
I've set up three categories:
- slow (runs by default): `@slow`
- CI-specific (runs by default, can be made optional for CI): `@optionalci`
- flakey (not run by default): `@flakey`

Plotting routines are marked as slow, and the long tests from the too-many-open-files issue are marked `@optionalci`. The only flakey tests are the h5netcdf ones, which fail because of the bug I found; they are no longer commented out but are instead flagged via `@flakey`.
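To illustrate how these categories might be applied in a test module: the test names and bodies below are placeholders, and the stand-in markers are defined inline only to keep the example self-contained (in the PR they would live in `xarray/tests/__init__.py` and consult the command-line options).

```python
import pytest

# Stand-in markers for illustration only.
slow = pytest.mark.slow
optionalci = pytest.mark.optionalci
flakey = pytest.mark.flakey


@slow
def test_plot_rendering():
    assert True  # placeholder: plotting routines are marked slow


@optionalci
def test_open_large_num_files():
    assert True  # placeholder: many-open-files tests, environment-dependent on CI


@flakey
def test_h5netcdf_autoclose():
    assert True  # placeholder: unreliable until the upstream h5netcdf bug is fixed
```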
The h5netcdf many-file tests are not flaky: they always fail. So let's mark them as "expected failures" with `pytest.mark.xfail` instead.
I would argue that the "CI specific" tests should actually be considered flaky: if they fail on our CI, they are likely to also fail for users and redistributors (e.g., consider someone packaging xarray for conda-forge or Ubuntu). Since these failures do not indicate a failed installation, we shouldn't run them by default.
I'll push a commit shortly with these changes...
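For reference, marking an always-failing test as an expected failure, as suggested above, could look roughly like this; the test name, reason string, and body are made up for illustration:

```python
import pytest


@pytest.mark.xfail(reason="h5netcdf cannot handle this many open files yet")
def test_h5netcdf_many_open_files():
    # Placeholder standing in for the real many-files test; pytest reports
    # the resulting error as "xfail" rather than as a test failure.
    raise OSError("too many open files")
```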
Force-pushed from 8edff08 to 58878db.
Force-pushed from d8ceca9 to 28d825f.
.travis.yml (Outdated)

@@ -76,7 +158,7 @@ install:
   - python setup.py install

 script:
-  - py.test xarray --cov=xarray --cov-report term-missing --verbose
+  - py.test xarray --cov=xarray --cov-report term-missing --verbose $SKIP_CI $RUN_FLAKEY
Let's use a single `$EXTRA_FLAGS` env variable.
.travis.yml (Outdated)

@@ -13,28 +13,63 @@ matrix:
   - python: 2.7
     env: CONDA_ENV=py27-min
   - python: 2.7
     env: CONDA_ENV=py27-min SKIP_CI=--skip-optional-ci
We have a limited number of concurrent builds on Travis-CI: only five. Extra optional builds clog up the rest of our CI infrastructure and make it slower to run new tests, so I'd strongly prefer to have only one extra optional build if possible. The 15 or so you add here are too many :)
@shoyer, this is the intent of my new changes: we really only need to test the h5netcdf tests that are now marked as flakey. I added this extra test and made the rest of the builds skip the optional-CI tests if they aren't marked as allowed failures.
Actually, we can just run the flakey tests in all builds that are allowed to fail and skip the optional-CI tests in all builds that must pass. This avoids adding new builds and should be more economical.
Force-pushed from 6af7c22 to b4e2a76.
Tests are marked using the following py.test options:

* `@slow`, which is run by default, not run if `--skip-slow`.
* `@optionalci`, which is run by default, not run if `--skip-optional-ci`.
* `@flakey`, which is not run by default, run if `--run-flakey`.

Note, the `optionalci` category is needed in order to mark tests that are environment-dependent when run on continuous integration (CI) hardware, e.g., tests that fail due to variable loads on Travis CI.
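A sketch of how the three options listed in this commit message could be registered in `conftest.py`; the help strings are illustrative, not taken from the PR:

```python
# conftest.py -- sketch of registering the three options listed above.
def pytest_addoption(parser):
    parser.addoption("--skip-slow", action="store_true",
                     help="skip tests marked @slow (which run by default)")
    parser.addoption("--skip-optional-ci", action="store_true",
                     help="skip tests marked @optionalci (which run by default)")
    parser.addoption("--run-flakey", action="store_true",
                     help="run tests marked @flakey (skipped by default)")
```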
Force-pushed from b4e2a76 to 717c216.
@shoyer, changes have been made as requested. Thanks!
@shoyer, all checks pass and this is ready for a review / merge when you have time.
```
WindowsError: [Error 32] The process cannot access the file because it is being used by another process
```
Force-pushed from a99e1f0 to 57d3324.
Thanks @pwolfram!
If slow tests are being run by default, I'm not sure they really need their own special option. You can mark them (with a marker). Are flaky tests actually flaky, or do they just not work? If flaky, and a re-run will help, then maybe try the pytest-rerunfailures plugin.
Yes, this is a good idea.
Thanks for pointing this out! I'm not entirely sure pytest-rerunfailures is a fit here, but it's a good option to know about. The tests are flaky on Travis-CI, but they are reliable when run locally, and we still want to keep them around as integration tests. These are slow tests that open and close quite a lot of files (2000 each).
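For anyone revisiting this later, the pytest-rerunfailures approach mentioned above works via a `flaky` marker (or a `--reruns` command-line flag) once that plugin is installed. A minimal sketch, with an illustrative test name and rerun count:

```python
import pytest


# Requires the pytest-rerunfailures plugin; with it installed, a failing
# run of this test is retried up to 3 times before being reported as a
# failure.
@pytest.mark.flaky(reruns=3)
def test_open_many_files_on_ci():
    assert True  # placeholder body
```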
Closes #1309