Excel Tests Continued Fixture Cleanup #26662

Merged
merged 7 commits into from
Jun 8, 2019

Conversation

WillAyd
Member

@WillAyd WillAyd commented Jun 5, 2019

Follow up to #26579

Removing the frame2 fixture, which is sparsely used. I've also removed the inheritance from ReadingTestsBase and made it its own discoverable class. Previously these tests were only generated when TestXlrdReader was run, which tightly couples reading with xlrd.

The goal in subsequent PRs would be to better parametrize TestReaders for new engines as they come in, and to make TestXlrdReader deal only with xlrd-specific tests (some tests will have to move around to make that happen).
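As a rough illustration of that direction, the engine-agnostic tests could take a parametrized `engine` fixture while xlrd-only checks stay in their own class. This is only a sketch under assumptions: the `ENGINE_MODULES` registry, the `module_available` helper, and the `openpyxl` entry are hypothetical placeholders, not what this PR or pandas actually contains.

```python
import importlib.util

import pytest

# Hypothetical registry: reader engine name -> module it needs to import.
ENGINE_MODULES = {"xlrd": "xlrd", "openpyxl": "openpyxl"}


def module_available(name):
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


@pytest.fixture(params=sorted(ENGINE_MODULES))
def engine(request):
    """Provide each engine name in turn, skipping engines not installed."""
    module = ENGINE_MODULES[request.param]
    if not module_available(module):
        pytest.skip("{} is not installed".format(module))
    return request.param


class TestReaders:
    # Engine-agnostic tests request the `engine` fixture and so run
    # once per registered engine.
    def test_engine_is_registered(self, engine):
        assert engine in ENGINE_MODULES


class TestXlrdReader:
    # Engine-specific behaviour lives in its own class.
    def test_xlrd_only_behaviour(self):
        pytest.importorskip("xlrd")
```

With this layout, adding an engine would mean adding one registry entry rather than another subclass.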

@WillAyd WillAyd added the IO Excel (read_excel, to_excel) and Testing (pandas testing functions or related to the test suite) labels Jun 5, 2019
@@ -58,8 +51,9 @@ def ignore_xlrd_time_clock_warning():
yield


@td.skip_if_no('xlrd', '1.0.0')
class ReadingTestsBase:
@td.skip_if_no('xlrd')
Member Author

Note that our project-wide minimum is 1.0.0, so this decorator was being duplicative.

This decorator also won't be there in the long term once we completely decouple TestReaders from xlrd, but I'm keeping it here for now.

@WillAyd
Member Author

WillAyd commented Jun 5, 2019

This was causing failures for me locally per this convo:

#26657 (comment)

Let's see what happens here...

@codecov

codecov bot commented Jun 5, 2019

Codecov Report

Merging #26662 into master will decrease coverage by 50.09%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #26662      +/-   ##
==========================================
- Coverage   91.87%   41.78%   -50.1%     
==========================================
  Files         174      174              
  Lines       50663    50663              
==========================================
- Hits        46548    21168   -25380     
- Misses       4115    29495   +25380
Flag Coverage Δ
#multiple ?
#single 41.78% <ø> (-0.06%) ⬇️
Impacted Files Coverage Δ
pandas/io/formats/latex.py 0% <0%> (-100%) ⬇️
pandas/io/sas/sas_constants.py 0% <0%> (-100%) ⬇️
pandas/core/groupby/categorical.py 0% <0%> (-100%) ⬇️
pandas/tseries/plotting.py 0% <0%> (-100%) ⬇️
pandas/tseries/converter.py 0% <0%> (-100%) ⬇️
pandas/io/formats/html.py 0% <0%> (-99.37%) ⬇️
pandas/io/sas/sas7bdat.py 0% <0%> (-91.16%) ⬇️
pandas/io/sas/sas_xport.py 0% <0%> (-90.1%) ⬇️
pandas/core/sparse/scipy_sparse.py 10.14% <0%> (-89.86%) ⬇️
pandas/core/tools/numeric.py 10.14% <0%> (-89.86%) ⬇️
... and 128 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e0c41f7...417fc30. Read the comment docs.

@codecov

codecov bot commented Jun 5, 2019

Codecov Report

Merging #26662 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #26662      +/-   ##
==========================================
- Coverage   91.78%   91.77%   -0.01%     
==========================================
  Files         174      174              
  Lines       50703    50703              
==========================================
- Hits        46538    46534       -4     
- Misses       4165     4169       +4
Flag Coverage Δ
#multiple 90.37% <ø> (ø) ⬆️
#single 41.81% <ø> (-0.09%) ⬇️
Impacted Files Coverage Δ
pandas/io/gbq.py 78.94% <0%> (-10.53%) ⬇️
pandas/core/frame.py 96.88% <0%> (-0.12%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cf25c5c...fd697ea. Read the comment docs.

@WillAyd WillAyd added this to the 0.25.0 milestone Jun 5, 2019
Member Author

@WillAyd WillAyd left a comment

OK think I'm stopping here. This should be a pretty good cleanup to remove unnecessary fixturation.

I think after this I'm going to align the engine / ext usage in the Writer classes and can start splitting into sub-modules thereafter.

@@ -80,7 +83,6 @@ def df_ref(self):
parse_dates=True, engine='python')
return df_ref

@td.skip_if_no("xlrd", "1.0.1") # see gh-22682
Member Author

These didn't really solve the issue they were supposed to, so I removed them to make things cleaner. Again, I think the real issue occurs when defusedxml is installed in the right environment, but I will have to delve further into which environments that is exactly (orthogonal to this PR).

@@ -844,7 +841,7 @@ def test_read_excel_squeeze(self, ext):
tm.assert_series_equal(actual, expected)


@td.skip_if_no('xlrd', '1.0.0')
@td.skip_if_no('xlrd')
Member Author

Comment might have been lost on a prior PR but the min xlrd version for pandas is 1.0.0 anyway, so this was duplicative

@jreback
Contributor

jreback commented Jun 6, 2019

lgtm. @simonjayhawkins

Member

@simonjayhawkins simonjayhawkins left a comment

@WillAyd generally lgtm. a few suggestions.

@simonjayhawkins
Member

@WillAyd changes look good.

test failure is a mystery to me at the moment. I would say it was unrelated if it wasn't for the fact that no other builds are seeing this.

the failing test has an engine parameter. coincidence? no reason that changes to test_excel should affect this test. test output shows engine = 'numexpr' so implies no clash of fixtures.

the parametrisation of the engine fixture in pandas\tests\computation\test_eval.py does look iffy though.

@WillAyd
Member Author

WillAyd commented Jun 7, 2019 via email

@simonjayhawkins
Member

i think the engine bit may be a coincidence.

it appears that a few tests are using set_use_numexpr which is messing with global variables instead of monkeypatching.

so if the order of tests changes, which any unrelated change could trigger, this could cause a problem.
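A small stdlib-only sketch of why writing to a global leaks across tests while a scoped patch does not. The `settings` namespace and both helper functions are hypothetical stand-ins for the module-level switch; pytest's `monkeypatch.setattr` gives the same restore-on-teardown behaviour that `mock.patch.object` shows here.

```python
from types import SimpleNamespace
from unittest import mock

# Hypothetical stand-in for a module that holds a global switch.
settings = SimpleNamespace(use_numexpr=True)


def run_with_patch():
    """Flip the switch only for the duration of the with-block."""
    with mock.patch.object(settings, "use_numexpr", False):
        return settings.use_numexpr


def run_with_global_write():
    """Flip the switch by plain assignment and never restore it."""
    settings.use_numexpr = False
    return settings.use_numexpr


seen_in_patch = run_with_patch()
after_patch = settings.use_numexpr   # original value is back

run_with_global_write()
after_write = settings.use_numexpr   # change is still visible to later code
```

Anything that runs after `run_with_global_write` sees the modified switch, so whether a later test passes depends on execution order.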

as an experiment, you could try changing the engine fixture in pandas\tests\computation\test_eval.py to

@pytest.fixture(params=_engines)
def engine(request):
    engine = request.param
    if engine == 'numexpr' and not _USE_NUMEXPR:
        pytest.skip('numexpr enabled->{enabled}, '
                    'installed->{installed}'.format(
                        enabled=_USE_NUMEXPR,
                        installed=_NUMEXPR_INSTALLED))
    return engine

so that the skip is evaluated at test execution instead of during test collection since skipping depends on the global variable.
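The timing difference can be shown without pytest at all. In this analogy (all names are hypothetical), the "eager" condition is computed once when the module is loaded, much like a skip condition attached during collection, while the "lazy" one is read each time it is needed, like a check inside the fixture body.

```python
# Hypothetical stand-in for a module-level switch such as _USE_NUMEXPR.
FLAGS = {"use_numexpr": True}

# Eager: decided once, when this line runs.
SKIP_DECIDED_AT_IMPORT = not FLAGS["use_numexpr"]


def should_skip_eager():
    """Report the decision that was frozen at import time."""
    return SKIP_DECIDED_AT_IMPORT


def should_skip_lazy():
    """Re-read the switch every time the question is asked."""
    return not FLAGS["use_numexpr"]


# Some other test flips the switch after "collection" has finished.
FLAGS["use_numexpr"] = False

eager_result = should_skip_eager()  # still reflects the old value
lazy_result = should_skip_lazy()    # reflects the current value
```

The lazy check narrows the window in which the switch and the skip decision can disagree, but it does not remove it if the switch changes again while the test body is running.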

@WillAyd
Member Author

WillAyd commented Jun 7, 2019

Makes sense and I think what you have is more readable anyway. Just pushed

@simonjayhawkins
Member

i have a suspicion that'll probably not work because the test is quite involved so there will still be lag between checking the global and performing the failing op.

not sure how much time you want to spend on this. could just revert the last couple of commits for now.

i've had a go at parametrising the tests to split them up. but it may be easier to just debug the test on ci by adding print statements.

    def _test_basic_series_frame_alignment(
            self, engine, parser, r_idx_type, c_idx_type, index_name):
        with warnings.catch_warnings(record=True):
            # avoid warning about comparing strings and ints
            warnings.simplefilter("ignore", RuntimeWarning)
            df = mkdf(10, 7, data_gen_f=f, r_idx_type=r_idx_type,
                      c_idx_type=c_idx_type)
            index = getattr(df, index_name)
            s = Series(np.random.randn(5), index[:5])
            if should_warn(s.index, df.index):
                with tm.assert_produces_warning(RuntimeWarning):
                    res = pd.eval('s + df', engine=engine, parser=parser)
            else:
                res = pd.eval('s + df', engine=engine, parser=parser)

            if r_idx_type == 'dt' or c_idx_type == 'dt':
                expected = df.add(s) if engine == 'numexpr' else s + df
            else:
                expected = s + df
            assert_frame_equal(res, expected)

    @pytest.mark.parametrize('r_idx_type', ['i', 'u', 's'])
    @pytest.mark.parametrize('c_idx_type', ['i', 'u', 's'])
    @pytest.mark.parametrize('index_name', ['index', 'columns'])
    def test_basic_series_frame_alignment(
            self, engine, parser, r_idx_type, c_idx_type, index_name):
        self._test_basic_series_frame_alignment(
            engine, parser, r_idx_type, c_idx_type, index_name)

    @pytest.mark.parametrize('index_name', ['index', 'columns'])
    def test_basic_series_frame_alignment_dt(self, engine, parser, index_name):
        # only test dt with dt, otherwise weird joins result
        self._test_basic_series_frame_alignment(
            engine, parser, 'dt', 'dt', index_name)

@WillAyd
Member Author

WillAyd commented Jun 8, 2019

OK yea let me revert the last few changes then. Even if it passes CI, it's somewhat fragile and orthogonal anyway. The PR is big enough as is, so I'm happy to address this separately and in coordination with whatever you are looking at.

@WillAyd WillAyd force-pushed the excel-fixture-cleanup branch from 950b695 to 7fc3f76 Compare June 8, 2019 00:11
@WillAyd WillAyd merged commit c759897 into pandas-dev:master Jun 8, 2019
@WillAyd WillAyd deleted the excel-fixture-cleanup branch June 8, 2019 16:16
@WillAyd
Member Author

WillAyd commented Jun 8, 2019

Thanks for the review @simonjayhawkins

Labels
IO Excel (read_excel, to_excel), Testing (pandas testing functions or related to the test suite)
3 participants