Marks slow, flaky, and failing tests #1336
Conversation
Force-pushed from a554dd3 to 014a928.
@shoyer, this PR should also fix the issue with Travis CI failing due to tests that open too many files. Note, when a user runs
Force-pushed from 014a928 to 82a8601.
cc @MaximilianR
@shoyer, all tests pass and I've spot-checked that this works as expected, e.g.,
xarray/tests/test_backends.py::OpenMFDatasetManyFilesTest::test_1_autoclose_netcdf4 PASSED
xarray/tests/test_backends.py::OpenMFDatasetManyFilesTest::test_1_open_large_num_files_netcdf4 SKIPPED
xarray/tests/test_backends.py::OpenMFDatasetManyFilesTest::test_2_autoclose_scipy PASSED
xarray/tests/test_backends.py::OpenMFDatasetManyFilesTest::test_2_open_large_num_files_scipy SKIPPED
xarray/tests/test_backends.py::OpenMFDatasetManyFilesTest::test_3_autoclose_pynio PASSED
xarray/tests/test_backends.py::OpenMFDatasetManyFilesTest::test_3_open_large_num_files_pynio SKIPPED
@shoyer, is there anything else you would like done on this before merging?
@pwolfram Thanks for putting this together. "Slow" tests on Travis aren't a huge problem -- it takes at least a minute for Travis to complete even in the best case, so having to wait a minute more is not so bad. It's really an issue for interactive, local development, where it's really valuable for tests to complete in less than 10 seconds. Flaky tests, which fail sometimes, are the problem. So I think we want a different category for these many-file tests, maybe "Flaky". Ideally we would have another Travis-CI job set up to run these that isn't required to always pass, like our existing jobs for testing development versions of other libraries.
Force-pushed from 82a8601 to 732e042.
Force-pushed from 732e042 to 62b0c9e.
@shoyer, as we discussed, here is a more robust version of the testing, as needed for 0.9.2.
This also fixes the issue noted in #1038 where flakey tests cause Travis CI failures.
conftest.py (Outdated)

def pytest_addoption(parser):
    parser.addoption("--skip-flakey", action="store_true",
        help="skips flakey tests")
nit: please indent even with the ( on the previous line.
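For context, a common way to wire up a flag like this is to register the option in `conftest.py` and then skip marked tests during collection. The sketch below is illustrative only and is not the PR's exact code (the PR defines `skipif`-based markers in `xarray/tests/__init__.py`); the `flakey` marker name is taken from this thread.

```python
# conftest.py -- illustrative sketch only, not the code from this PR.
import pytest


def pytest_addoption(parser):
    # Register the command-line flag discussed above.
    parser.addoption("--skip-flakey", action="store_true",
                     help="skips flakey tests")


def pytest_collection_modifyitems(config, items):
    # If --skip-flakey was given, attach a skip marker to every test
    # carrying the (assumed) "flakey" marker.
    if not config.getoption("--skip-flakey"):
        return
    skip_flakey = pytest.mark.skip(reason="--skip-flakey was given")
    for item in items:
        if "flakey" in item.keywords:
            item.add_marker(skip_flakey)
```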
.travis.yml (Outdated)

@@ -76,7 +76,7 @@ install:
   - python setup.py install

 script:
-  - py.test xarray --cov=xarray --cov-report term-missing --verbose
+  - py.test xarray --cov=xarray --cov-report term-missing --verbose --skip-flakey
Can you make `--skip-flakey` an environment variable so we can make one of our "allowed failure" matrix builds not skip flakey tests?
xarray/tests/__init__.py (Outdated)

slow = pytest.mark.skipif(
    not pytest.config.getoption("--run-slow"),
    reason="set --run-slow option to run slow tests")
Just to be clear, it looks like the current default behavior is:
- Don't skip flakey tests
- Skip slow tests

I think we actually want to flip these. Consider someone running our test suite for the first time. They probably want to know whether their xarray installation worked. It's not a big deal if this is a little slow, because they probably aren't doing it many times in a row, but not running part of the test suite would be unfortunate. Once they notice the tests are slow, they will probably quickly look for a `--skip-slow` option.
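A minimal sketch of what the flipped defaults could look like, reusing the `pytest.config.getoption` pattern from the diff above (that was the idiom at the time). The option names follow this discussion rather than the final code, and the sketch assumes both options are registered in `conftest.py`:

```python
import pytest

# Slow tests run by default and are skipped only on request.
slow = pytest.mark.skipif(
    pytest.config.getoption("--skip-slow"),
    reason="--skip-slow was set, skipping slow tests")

# Flakey tests are skipped by default and run only on request.
flakey = pytest.mark.skipif(
    not pytest.config.getoption("--run-flakey"),
    reason="set --run-flakey option to run flakey tests")
```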
I've set up three categories:
- slow (runs by default): `@slow`
- CI-specific (runs by default, can be made optional for CI): `@optionalci`
- flakey (not run by default): `@flakey`

Plotting routines are marked as slow, and the long tests from the too-many-open-files issue are marked `@optionalci`. The only flakey tests are the h5netcdf ones, which fail because of the bug I found; they are no longer commented out but are instead flagged via `@flakey`.
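To illustrate how these categories might be applied in a test module: the test names and bodies below are placeholders, and the stand-in markers are defined inline only to keep the example self-contained (in the PR they would live in `xarray/tests/__init__.py` and consult the command-line options).

```python
import pytest

# Stand-in markers for illustration only.
slow = pytest.mark.slow
optionalci = pytest.mark.optionalci
flakey = pytest.mark.flakey


@slow
def test_plot_rendering():
    assert True  # placeholder: plotting routines are marked slow


@optionalci
def test_open_large_num_files():
    assert True  # placeholder: many-open-files tests, environment-dependent on CI


@flakey
def test_h5netcdf_autoclose():
    assert True  # placeholder: unreliable until the upstream h5netcdf bug is fixed
```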
The h5netcdf many-file tests are not flaky: they always fail. So let's mark them as "expected failures" with `pytest.mark.xfail` instead.
I would argue that the "CI specific" tests should actually be considered flaky: if they fail on our CI, they are likely to also fail for users and redistributors (e.g., consider someone packaging xarray for conda-forge or Ubuntu). Since these failures do not indicate a failed installation, we shouldn't run them by default.
I'll push a commit shortly with these changes...
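For reference, marking an always-failing test as an expected failure, as suggested above, could look roughly like this; the test name, reason string, and body are made up for illustration:

```python
import pytest


@pytest.mark.xfail(reason="h5netcdf cannot handle this many open files yet")
def test_h5netcdf_many_open_files():
    # Placeholder standing in for the real many-files test; pytest reports
    # the resulting error as "xfail" rather than as a test failure.
    raise OSError("too many open files")
```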
Force-pushed from 8edff08 to 58878db.
Force-pushed from d8ceca9 to 28d825f.
.travis.yml (Outdated)

@@ -76,7 +158,7 @@ install:
   - python setup.py install

 script:
-  - py.test xarray --cov=xarray --cov-report term-missing --verbose
+  - py.test xarray --cov=xarray --cov-report term-missing --verbose $SKIP_CI $RUN_FLAKEY
Let's use a single `$EXTRA_FLAGS` env variable.
.travis.yml (Outdated)

@@ -13,28 +13,63 @@ matrix:
   - python: 2.7
     env: CONDA_ENV=py27-min
   - python: 2.7
     env: CONDA_ENV=py27-min SKIP_CI=--skip-optional-ci
We have a limited number of concurrent builds on Travis-CI: only five. Extra optional builds clog up the rest of our CI infrastructure and make it slower to run new tests, so I'd strongly prefer to have only one extra optional build if possible. The 15 or so you add here are too many :)
@shoyer, this is the intent of my new changes: we really only need to test the h5netcdf tests that are now marked as flakey. I added this extra test and made the rest of the builds skip the optional-CI tests if they aren't marked as allowed failures.
Actually, we can just run the flakey tests in all builds that are allowed to fail and skip the optional-CI tests in all builds that must pass. This avoids adding new builds and should be more economical.
Force-pushed from 6af7c22 to b4e2a76.
Tests are marked using the following py.test options:

* `@slow`, which is run by default, not run if `--skip-slow`.
* `@optionalci`, which is run by default, not run if `--skip-optional-ci`.
* `@flakey`, which is not run by default, run if `--run-flakey`.

Note, the `optionalci` category is needed in order to mark tests that are environment-dependent when run on continuous integration (CI) hardware, e.g., tests that fail due to variable loads on Travis CI.
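A sketch of how the three options listed in this commit message could be registered in `conftest.py`; the help strings are illustrative, not taken from the PR:

```python
# conftest.py -- sketch of registering the three options listed above.
def pytest_addoption(parser):
    parser.addoption("--skip-slow", action="store_true",
                     help="skip tests marked @slow (which run by default)")
    parser.addoption("--skip-optional-ci", action="store_true",
                     help="skip tests marked @optionalci (which run by default)")
    parser.addoption("--run-flakey", action="store_true",
                     help="run tests marked @flakey (skipped by default)")
```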
Force-pushed from b4e2a76 to 717c216.
@shoyer, changes have been made as requested. Thanks!
@shoyer, all checks pass and this is ready for a review / merge when you have time.
```
WindowsError: [Error 32] The process cannot access the file because it is being used by another process
```
Force-pushed from a99e1f0 to 57d3324.
Thanks @pwolfram!
If slow tests are being run by default, I'm not sure they really need their own special option. You can mark them (with a marker). Are flaky tests actually flaky, or do they just not work? If flaky, and a re-run will help, then maybe try the pytest-rerunfailures plugin.
Yes, this is a good idea.
Thanks for pointing this out! I'm not entirely sure pytest-rerunfailures is a fit here, but it's a good option to know about. The tests are flaky on Travis-CI, but they are reliable when run locally, and we still want to keep them around as integration tests. These are slow tests that open and close quite a lot of files (2000 each).
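For anyone revisiting this later, the pytest-rerunfailures approach mentioned above works via a `flaky` marker (or a `--reruns` command-line flag) once that plugin is installed. A minimal sketch, with an illustrative test name and rerun count:

```python
import pytest


# Requires the pytest-rerunfailures plugin; with it installed, a failing
# run of this test is retried up to 3 times before being reported as a
# failure.
@pytest.mark.flaky(reruns=3)
def test_open_many_files_on_ci():
    assert True  # placeholder body
```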
Closes #1309