Build fails on Windows when one of the bundled data files has non-ASCII characters #535

Closed
ankith26 opened this issue Nov 16, 2023 · 3 comments
Labels
bug Something isn't working

@ankith26

This error happens only on Windows, and only when I include a file whose path contains Korean characters. Perhaps the fix is to force UTF-8? Apologies in advance if this is not actually a meson-python issue; I'm assuming it is based on the traceback.

Traceback (most recent call last):
 File "C:\Users\runneradmin\AppData\Local\Temp\cibw-run-a3rpdcsp\cp310-win_amd64\build\venv\lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 353, in <module>
   main()
 File "C:\Users\runneradmin\AppData\Local\Temp\cibw-run-a3rpdcsp\cp310-win_amd64\build\venv\lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 335, in main
   json_out['return_val'] = hook(**hook_input['kwargs'])
 File "C:\Users\runneradmin\AppData\Local\Temp\cibw-run-a3rpdcsp\cp310-win_amd64\build\venv\lib\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 152, in prepare_metadata_for_build_wheel
   whl_basename = backend.build_wheel(metadata_directory, config_settings)
 File "C:\Users\runneradmin\AppData\Local\Temp\pip-build-env-1c4y2iz0\overlay\Lib\site-packages\mesonpy\__init__.py", line 985, in wrapper
   return func(*args, **kwargs)
 File "C:\Users\runneradmin\AppData\Local\Temp\pip-build-env-1c4y2iz0\overlay\Lib\site-packages\mesonpy\__init__.py", line 1039, in build_wheel
   return project.wheel(out).name
 File "C:\Users\runneradmin\AppData\Local\Temp\pip-build-env-1c4y2iz0\overlay\Lib\site-packages\mesonpy\__init__.py", line 890, in wheel
   return builder.build(directory)
 File "C:\Users\runneradmin\AppData\Local\Temp\pip-build-env-1c4y2iz0\overlay\Lib\site-packages\mesonpy\__init__.py", line 445, in build
   counter.update(src)
 File "C:\Users\runneradmin\AppData\Local\Temp\pip-build-env-1c4y2iz0\overlay\Lib\site-packages\mesonpy\_util.py", line 69, in update
   print(line)
 File "C:\Users\runneradmin\AppData\Local\pypa\cibuildwheel\Cache\nuget-cpython\python.3.10.11\tools\lib\encodings\cp1252.py", line 19, in encode
   return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 72-75: character maps to <undefined>
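
The "force UTF-8" idea from the report can be sketched as follows. This is only an illustration, not what meson-python does: the `force_utf8` helper and the file name are hypothetical, and an in-memory stream stands in for the cp1252 standard output seen in the traceback.

```python
import io


def force_utf8(stream):
    """Switch a text stream to UTF-8 if it supports reconfiguration."""
    # TextIOWrapper.reconfigure() is available on Python 3.7 and later.
    if hasattr(stream, "reconfigure"):
        stream.reconfigure(encoding="utf-8")
    return stream


# In-memory stand-in for a stdout that was opened with cp1252.
stream = io.TextIOWrapper(io.BytesIO(), encoding="cp1252", newline="\n")
force_utf8(stream)

# Hypothetical file name containing Hangul characters.
print("data/한국어.txt", file=stream)
stream.flush()
written = stream.buffer.getvalue()
```

Applied to the real `sys.stdout`, this avoids the exception, but whatever consumes the output must then also expect UTF-8. Setting the `PYTHONIOENCODING` environment variable is another way to choose the encoding without changing code.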
@dnicolodi
Member

The exception is raised while logging to standard output the relative path of a file being added to the wheel archive. Python assumes that standard output uses the cp1252 encoding, and cp1252 cannot represent Korean script, so the path cannot be encoded. What surprises me a bit is that files with Korean names can exist on a system that apparently uses the cp1252 encoding.
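
The failure mode described here can be reproduced on any platform with a small sketch. The file name is made up for illustration, and an in-memory text stream encoded as cp1252 stands in for the runner's standard output.

```python
import io

# Hypothetical relative path containing Hangul characters.
path = "data/한국어.txt"


def can_encode(text, encoding):
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Stand-in for a stdout that Python believes is cp1252.
stream = io.TextIOWrapper(io.BytesIO(), encoding="cp1252")
try:
    print(path, file=stream)
    stream.flush()
    failed = False
except UnicodeEncodeError:
    # Same exception type as in the traceback above.
    failed = True
```

The same string encodes without error as UTF-8, which is why the problem only shows up when the output stream uses a legacy code page.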

I thought that systems using the cp1252 encoding were extinct. On which system are you getting this error?

@ankith26
Author

This is a standard GitHub Actions Windows VM, so I guess it's not something exotic or ancient. I can also reproduce the same failure locally on a fresh Windows 11 instance running under VirtualBox.

dnicolodi added a commit to dnicolodi/meson-python that referenced this issue Nov 16, 2023
We print log messages and error messages that may contain file names
containing characters that cannot be represented in the stdout
encoding. Use replacement markers for those instead of raising
UnicodeEncodeError.

Fixes mesonbuild#535.
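
A minimal sketch of the replacement-marker approach described in this commit message is shown below. It is not the code from the referenced commit: `safe_print` is a hypothetical helper, and the file name is invented for the example.

```python
import io
import sys


def safe_print(text, stream=None):
    """Print text, substituting characters the stream encoding cannot represent."""
    stream = sys.stdout if stream is None else stream
    encoding = getattr(stream, "encoding", None) or "utf-8"
    # Round-trip through the target encoding. errors="replace" turns each
    # unencodable character into "?" instead of raising UnicodeEncodeError.
    printable = text.encode(encoding, errors="replace").decode(encoding)
    print(printable, file=stream)


# In-memory stand-in for a cp1252 standard output.
buf = io.TextIOWrapper(io.BytesIO(), encoding="cp1252", newline="\n")
safe_print("data/한국어.txt", buf)
buf.flush()
output = buf.buffer.getvalue()
```

The log line loses the original characters but the build no longer aborts. On Python 3.7 and later, `sys.stdout.reconfigure(errors="replace")` gives a similar effect for every subsequent `print` call.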
@dnicolodi
Member

Proposed fix in #536

@rgommers rgommers added the bug Something isn't working label Nov 22, 2023
@rgommers rgommers added this to the v0.16.0 milestone Nov 22, 2023
dnicolodi added a commit to dnicolodi/meson-python that referenced this issue Nov 25, 2023