Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLN: Clean up of locale testing #29883

Merged
merged 21 commits into from
Jan 1, 2020
Merged

CLN: Clean up of locale testing #29883

merged 21 commits into from
Jan 1, 2020

Conversation

datapythonista
Copy link
Member

There are couple of things I don't understand from the locale:

  • For what I see, we test with different locales (e.g. it_IT.UTF-8) but pandas never uses the language (it_IT) only the encoding (UTF-8). So, not sure if testing with different locales is being useful.
  • The variable LOCALE_OVERRIDE seems to add more complexity than value. Unless LC_ALL/LANG are being set afterwards, but I don't think so.
  • There is a test that is using LOCALE_OVERRIDE but doesn't seem to depend on the actual locale of the system. It probably makes more sense to parametrize and test all the locales we want, than use the variable value (when testing locally this test will be more useful).
  • There is a "test" being skipped in run_tests.sh that is probably worth converting to an actual test.

Addressing these things here, but I may be missing something here. Please let me know if that's the case.

@datapythonista datapythonista added Clean Needs Discussion Requires discussion from core team before further action labels Nov 27, 2019
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these builds were added a few years ago to ensure that we run correctly in other locales. does this PR change this?

@datapythonista
Copy link
Member Author

these builds were added a few years ago to ensure that we run correctly in other locales. does this PR change this?

With this PR we still run with the same exact locales as before. Nothing less is being tested, I'm making sure the same works, but also that pandas works when not only the locale is different but also the encoding.

@jreback
Copy link
Contributor

jreback commented Nov 29, 2019

ok looks fine, ping on green.

@datapythonista
Copy link
Member Author

I couldn't find a way to change the terminal encoding in Ubuntu to one different than UTF-8. Leaving this out out this PR. But I think this means that there are no tests in the CI to know if the option display.encoding is working or not.

This PR seems to properly change the locale (I think the locale stuff currently in master is not actually doing anything). The failures in the CI are tests that only work with an English locale (the error messages are localized, and don't match the ones in tests):

AssertionError: Pattern "(File (b')?.+does_not_exist\\.xlsx'? does not exist|\\[Errno 2\\] No such file or directory: '.+does_not_exist\\.xlsx'|Expected object or value|path_or_buf needs to be a string file path or file-like|\\[Errno 2\\] File .+does_not_exist\\.xlsx does not exist: '.+does_not_exist\\.xlsx')" not found in "[Errno 2] 没有那个文件或目录: '/home/vsts/work/1/s/pandas/tests/io/data/does_not_exist.xlsx'"
pandas/tests/io/test_common.py:166: AssertionError

If I'm not wrong, the exceptions come from other libraries (xlrd for example), that localize them. Not sure what's the best solution.

@jreback any preference?

@datapythonista datapythonista removed the Needs Discussion Requires discussion from core team before further action label Dec 30, 2019
@datapythonista
Copy link
Member Author

@jreback got this working. This makes the locale builds actually test with different locales. Currently the locale builds don't really change the system locale.

@jreback jreback added this to the 1.0 milestone Jan 1, 2020
@jreback jreback merged commit a9423e3 into pandas-dev:master Jan 1, 2020
@jreback
Copy link
Contributor

jreback commented Jan 1, 2020

very nice @datapythonista

hweecat pushed a commit to hweecat/pandas that referenced this pull request Jan 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants