Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pretty-format-json says the JSON file is not valid when it is (Windows-specific?) #570

Closed
Jackenmen opened this issue Mar 14, 2021 · 6 comments · Fixed by #571
Closed

pretty-format-json says the JSON file is not valid when it is (Windows-specific?) #570

Jackenmen opened this issue Mar 14, 2021 · 6 comments · Fixed by #571

Comments

@Jackenmen
Copy link
Contributor

Jackenmen commented Mar 14, 2021

Here's a full log from GitHub runner presenting this issue:
https://github.com/jack1142/Red-DiscordBot/runs/2107108269?check_suite_focus=true

I tried to make a simple repro on GH Actions using Linux environment but I didn't get this error so I tried using the same environment I have this problem with locally - Windows - and then I was indeed able to reproduce it on the GH Actions runner.

check-json says the JSON file is valid and as far as I can tell, it is.

@asottile
Copy link
Member

here's the underlying error -- it's unhappy about printing non-ascii due to (for some reason?) the streams being considered as cp1252 instead of utf8

Traceback (most recent call last):
  File "C:\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\asott\.cache\pre-commit\repo0mj285sw\py_env-python3.8\Scripts\pretty-format-json.EXE\__main__.py", line 7, in <module>
  File "c:\users\asott\.cache\pre-commit\repo0mj285sw\py_env-python3.8\lib\site-packages\pre_commit_hooks\pretty_format_json.py", line 120, in main
    print(
  File "C:\Python38\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0421' in position 287: character maps to <undefined>

@asottile
Copy link
Member

@jack1142 you might be interested in https://pre-commit.ci by the way

@Jackenmen
Copy link
Contributor Author

it's unhappy about printing non-ascii due to (for some reason?) the streams being considered as cp1252 instead of utf8

This is related to how Python's subprocess works - the stdout/stderr file objects by default seems to use the "preferred encoding" which on Windows is not UTF-8 (unless you use PYTHONIOENCODING env var which is usually not the case). Could it be that the change is needed in pre-commit rather than here?

@jack1142 you might be interested in pre-commit.ci by the way

Oh ye, I'm aware of its existence but for now, I want to stick to GH Actions as that's what all other things in our CI use. Thanks for this information though, I might look into it one day :)

@asottile
Copy link
Member

no, setting PYTHONIOENCODING is not a correct solution. you can still produce stdout as UTF-8 independent of what io encoding is set

@Jackenmen
Copy link
Contributor Author

no, setting PYTHONIOENCODING is not a correct solution. you can still produce stdout as UTF-8 independent of what io encoding is set

Sorry, I didn't mean to say that PYTHONIOENCODING is a solution here.
I was thinking that maybe pre-commit should be calling subprocess.Popen() with encoding kwarg so that the called process can detect that stdout is a UTF-8 stream. After testing, I see that this kwarg does not actually have any effect on the called process so it seems it wouldn't work anyway.

Anyway, I'll go for sys.stdout.buffer.write() and make a PR in a moment.

@asottile
Copy link
Member

@jack1142 the encoding arg has nothing to do with this, that's for the consumer to automatically decode the output

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

2 participants