Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TST: Split test_excel.py into sub test files #24472 #25334

Closed
wants to merge 1 commit into from

Conversation

stevenbw
Copy link

@stevenbw stevenbw commented Feb 15, 2019

In brief, I have tried to split the quite large test_excel.py test file into logical chunks. I have done this by first splitting by reading and writing, and then by engine.

@stevenbw
Copy link
Author

Most of the failures appear to because of the use of the depreciated parse_cols instead of usecols.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a direct split w/o actually changing anything? we much prefer this, its impossible to test what you actually change vs just moved.

@jreback jreback added Testing pandas testing functions or related to the test suite IO Excel read_excel, to_excel labels Feb 16, 2019
@stevenbw
Copy link
Author

Yes I have only moved code into separate files and not made any changes. This was intentional since, as you said, it cannot be tested and also I'd rather push small changes, rather than rearranging and changing code in one PR.

I checked the original test_excel.py file, this also uses the parse_cols argument. I presume this functionality has depreciated since the development of these tests.

@jreback
Copy link
Contributor

jreback commented Feb 16, 2019

Most of the failures appear to because of the use of the depreciated parse_cols instead of usecols

this was quite a while ago, are you sure you started with current master?

@WillAyd
Copy link
Member

WillAyd commented Feb 20, 2019

If you can merge master there were some changes made since you opened this PR would need to be split.

Generally I'd comment that I think the sub-directories for reader and writer are a little overkill; we had a similar conversation on the actual splitting of code and the only engine that can really do both in theory is openpyxl, but that's not even implemented at this point

If I could suggest an alternate structure it would probably look something like:

  • common.py (for SharedItems class)
  • test_readers.py
  • test_writers.py
  • test_{engine}.py for tests specific to the various engines
  • test_style.py for styling tests (maybe also in test_writers, not sure)

This is a rather large document to split up and it doesn't need to all be done at once. In fact it may be preferable to do multiple PRs where you split reading tests first, then writing, etc...

@stevenbw
Copy link
Author

Thanks @WillAyd , @jreback ! I will ensure I pull the latest changes and take a look over the weekend.

@stevenbw
Copy link
Author

Hi @WillAyd , I have fetched the latest changes from master and then split test_excel.py into three for now as you suggested into common.py, test_reader.py and test_writer.py. However when I run the test_reader.py test module it results in several failures due to the depreciation of 'parse_cols'. I checked test_excel.py at master, that too contains the 'parse_cols' argument. Is this performing correctly? Is it testing to see if a depreciated argument is used? Thanks

@WillAyd
Copy link
Member

WillAyd commented Feb 22, 2019 via email

@stevenbw
Copy link
Author

There is no error before the split. Could it be realated to the stack_level? The function is executed in ReadingTestsBase in test_reader.py , using arguments from SharedItems , defined in common.py.

The error is

AssertionError: Warning not set with correct stacklevel. File where warning is raised: pandas\tests\io\excel\common.py != pandas\tests\io\excel\test_reader.py. Warning message: the 'parse_cols' keyword is deprecated, use 'usecols' instead

@WillAyd
Copy link
Member

WillAyd commented Feb 26, 2019

Hmm I see there isn't consistency in existing codebase around stacklevel handling so we should maybe as a precursor here make that consistent with stacklevel=False before moving or alternately just use pytest.warns

@gfyoung I think you've been looking a lot at leveraging pytest methods instead of internal testing ones - any thoughts to doing that here?

@gfyoung
Copy link
Member

gfyoung commented Feb 26, 2019

@WillAyd : The general consensus is continue using tm.assert_produces_warning because of the check for stacklevel. That being said, if you're having issues getting the right level, pass in check_stacklevel=False.

@WillAyd WillAyd mentioned this pull request Mar 9, 2019
4 tasks
@jreback
Copy link
Contributor

jreback commented Mar 25, 2019

would love to have this. probably better to start over with a clean copy of master.

@jreback jreback closed this Mar 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO Excel read_excel, to_excel Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TST: Split test_excel into sub test files
4 participants