cygwin doesn't support Unicode in PATH/HOME #7188

Open
jdpipe opened this issue Oct 27, 2020 · 4 comments

jdpipe commented Oct 27, 2020

I have an issue that arose from the use of my code by an MSYS2 user who has a non-ASCII username. I've done testing of my own with a new Windows user called "1414°", to confirm the problem, as shown below.

If I have some executable code installed in /home/1414°/.local/bin/omc.exe, then it is correctly located in bash, via which omc.

However, if I attempt to use Python to do the same job, shutil.which('omc') fails to locate the program. Furthermore, if I output the contents of os.environ['PATH'] as understood by Python, a lone Unicode surrogate character, \udcb0, appears in place of the °. The correct Unicode escape for the ° in /home/1414° is \u00b0. So this looks like an issue with encodings and locales, for which I can't see any clear fix. I presume that this problem with os.environ is a precursor to the problem with shutil.which.

I suspect that on MSYS2 the locale is set incorrectly in Python, which mangles Unicode characters in file paths.
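For reference, a minimal sketch of one way a lone surrogate such as \udcb0 can arise. It assumes the ° reached the decoder as the single byte 0xB0 (its CP1252 encoding) and was then decoded as UTF-8 with Python's surrogateescape error handler. Whether the surrogate in this report is produced by Python or earlier, during the environment conversion, is not established here.

```python
# The degree sign as raw bytes in two encodings.
degree_utf8 = "°".encode("utf-8")      # two bytes: 0xC2 0xB0
degree_cp1252 = "°".encode("cp1252")   # one byte: 0xB0

# Decoding the lone CP1252 byte as UTF-8 fails strictly, but the
# surrogateescape handler maps the undecodable byte 0xB0 to U+DCB0.
escaped = degree_cp1252.decode("utf-8", errors="surrogateescape")
print(ascii(escaped))                  # '\udcb0'

# Decoding the proper UTF-8 bytes yields the expected U+00B0.
print(ascii(degree_utf8.decode("utf-8")))  # '\xb0'

# The escape is reversible: encoding with the same handler restores the byte.
assert escaped.encode("utf-8", errors="surrogateescape") == degree_cp1252
```

The pattern U+DC00 plus the byte value is why 0xB0 shows up as \udcb0 rather than \u00b0.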

Full session output below:

1414°@DESKTOP-6ADQVP0 MINGW64 ~
$ export PATH=$PATH:~/.local/bin

1414°@DESKTOP-6ADQVP0 MINGW64 ~
$ echo $PATH
/mingw64/bin:/usr/local/bin:/usr/bin:/bin:/c/Windows/System32:/c/Windows:/c/Windows/System32/Wbem:/c/Windows/System32/WindowsPowerShell/v1.0/:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/home/1414°/.local/bin

1414°@DESKTOP-6ADQVP0 MINGW64 ~
$ which omc
/home/1414°/.local/bin/omc

1414°@DESKTOP-6ADQVP0 MINGW64 ~
$ ipython
Python 3.8.6 (default, Oct  1 2020, 13:01:33)  [GCC 10.2.0 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.13.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import shutil

In [2]: shutil.which('omc')

In [3]: shutil.which('ls')
Out[3]: 'C:\\msys64_2\\usr\\bin/ls.EXE'

In [4]: import os

In [5]: os.environ['PATH'].split(";")
Out[5]:
['C:\\msys64_2\\mingw64\\bin',
 'C:\\msys64_2\\usr\\local\\bin',
 'C:\\msys64_2\\usr\\bin',
 'C:\\msys64_2\\usr\\bin',
 'C:\\Windows\\System32',
 'C:\\Windows',
 'C:\\Windows\\System32\\Wbem',
 'C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\',
 'C:\\msys64_2\\usr\\bin\\site_perl',
 'C:\\msys64_2\\usr\\bin\\vendor_perl',
 'C:\\msys64_2\\usr\\bin\\core_perl',
 'C:\\msys64_2\\home\\1414\udcb0\\.local\\bin',
 'C:\\msys64_2\\mingw64\\bin\\']

In [7]:

lazka (Member) commented Oct 27, 2020

Thanks, yeah, something's not right here.

jdpipe (Author) commented Oct 27, 2020

Here is some further weirdness:

1414°@DESKTOP-6ADQVP0 MINGW64 ~
$ export PATH=/mingw64/bin:/usr/local/bin:/usr/bin:/bin:/home/1414°/.local/bin

1414°@DESKTOP-6ADQVP0 MINGW64 ~
$ echo $PATH
/mingw64/bin:/usr/local/bin:/usr/bin:/bin:/home/1414°/.local/bin

1414°@DESKTOP-6ADQVP0 MINGW64 ~
$ which omc
/home/1414°/.local/bin/omc

1414°@DESKTOP-6ADQVP0 MINGW64 ~
$ python -X utf8 -c 'import os; print(os.environ["PATH"].split(";"))'
['C:\\msys64_2\\mingw64\\bin', 'C:\\msys64_2\\usr\\local\\bin', 'C:\\msys64_2\\usr\\bin', 'C:\\msys64_2\\usr\\bin', 'C:\\msys64_2\\home\\1414\udcb0\\.local\\bin', 'C:\\msys64_2\\mingw64\\bin\\']

1414°@DESKTOP-6ADQVP0 MINGW64 ~
$ python -X utf8 -c 'import os; print(os.environ["PATH"].split(";")[-2])'
C:\msys64_2\home\1414▒\.local\bin

1414°@DESKTOP-6ADQVP0 MINGW64 ~
$ export TEST1=1414°

1414°@DESKTOP-6ADQVP0 MINGW64 ~
$ python -X utf8 -c 'import os; print(os.environ["TEST1"])'
1414°

1414°@DESKTOP-6ADQVP0 MINGW64 ~
$ python -c 'import os; print(os.environ["TEST1"])'
1414▒

Firstly, it's conspicuous that MSYS seems to be quietly appending an extra PATH component (C:\msys64_2\mingw64\bin\, the last entry in the list above) before invoking Python, even though it's not necessary.

Secondly, you can see that arbitrary environment variables (TEST1 above) come through correctly, although only when I use this -X utf8 option (whatever that is). Even with that option, the PATH is still mangled.
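A small diagnostic along these lines can flag which PATH entries were mangled. This is only a sketch: the helper names has_lone_surrogate and flag_mangled_entries are hypothetical, and the sample value is copied from the session output above rather than read from a live environment.

```python
def has_lone_surrogate(text):
    """Return True if text contains any code point in the surrogate range."""
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in text)


def flag_mangled_entries(path_value, sep=";"):
    """Return the PATH entries that contain a surrogate code point."""
    return [entry for entry in path_value.split(sep) if has_lone_surrogate(entry)]


# Sample entries taken from the output shown in this comment.
sample_path = ";".join([
    "C:\\msys64_2\\mingw64\\bin",
    "C:\\msys64_2\\usr\\bin",
    "C:\\msys64_2\\home\\1414\udcb0\\.local\\bin",
    "C:\\msys64_2\\mingw64\\bin\\",
])

# ascii() avoids an encoding error when printing the surrogate.
for entry in flag_mangled_entries(sample_path):
    print(ascii(entry))
```

Against a real environment the call would be flag_mangled_entries(os.environ["PATH"], os.pathsep) after import os.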

lazka (Member) commented Oct 27, 2020

Yeah, I doubt that this is Python-specific. It's more likely in the PATH translation that happens in cygwin.

lazka (Member) commented Oct 27, 2020

It doesn't look like cygwin implements any kind of Unicode support for environment conversion (see environ.cc, env_plist_to_win32, and CCP_POSIX_TO_WIN_A vs CCP_POSIX_TO_WIN_W).

So it's unlikely this is going to be fixed soon.
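Until that changes, a caller-side repair might be possible. The following is an untested sketch, not a confirmed workaround: the function name repair_escaped_bytes is hypothetical, and it assumes each surrogate in the range U+DC80 to U+DCFF stands for one byte of a CP1252-encoded path, which this thread does not verify.

```python
def repair_escaped_bytes(text, encoding="cp1252"):
    """Replace each escaped byte (U+DC80..U+DCFF) with its decoded character.

    The encoding is an assumption; bytes with no mapping in it
    become U+FFFD instead of raising.
    """
    repaired = []
    for ch in text:
        code = ord(ch)
        if 0xDC80 <= code <= 0xDCFF:
            raw = bytes([code - 0xDC00])  # recover the original byte
            repaired.append(raw.decode(encoding, errors="replace"))
        else:
            repaired.append(ch)  # leave ordinary characters untouched
    return "".join(repaired)


# Entry copied from the os.environ["PATH"] output earlier in the thread.
mangled = "C:\\msys64_2\\home\\1414\udcb0\\.local\\bin"
print(repair_escaped_bytes(mangled))
```

The repaired string could then be tried as the path argument of shutil.which, for example shutil.which("omc", path=repaired_path). Whether that locates the executable on an affected MSYS2 setup has not been checked here.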

lazka changed the title from "Unicode issue with Python, MinGW64/MSYS2" to "cygwin doesn't support Unicode in PATH" on Oct 27, 2020
lazka changed the title from "cygwin doesn't support Unicode in PATH" to "cygwin doesn't support Unicode in PATH/HOME" on Oct 30, 2020