Reproduce unicode issue in VSCode.

I believe the underlying issue is unicode characters in test function names, i.e. the circles in:

def test_ᐤᐤᐤᐤᐤᐤᐤᐤᐤᐤᐤᐤᐤᐤᐤᐤᐤᐤᐤᐤᐤᐤ952():

When running a large set of unit tests each test name is provided to the pytest invocation on the client machine as command line args. If there are enough tests to cross a 1024 byte boundary (or maybe its 3000), then there is a chance that PipeManager will do a read that does not terminate on a unicode code point boundary.

For example, say the character ᐤ is encoded as two bytes, A and B, then it's possible by reading 1024 bytes that the final character of the bunch is A, and that B will be available in a subsequent read. Decoding a string under this condition will result in the following error:

Traceback (most recent call last):
  File "/home/ubuntu/.vscode-server/extensions/ms-python.python-2024.6.0/python_files/vscode_pytest/run_pytest_script.py", line 39, in <module>
    data = sock.read(3000)
  File "/home/ubuntu/.vscode-server/extensions/ms-python.python-2024.6.0/python_files/testing_tools/socket_manager.py", line 75, in read
    data = part.decode("utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe1 in position 2999: unexpected end of data
Finished running tests!

There's an easy fix, upon the above exception read an extra byte and try again decoding utf again.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
.vscode		.vscode
tools		tools
.gitignore		.gitignore
README.md		README.md
pytest.ini		pytest.ini
test_1.py		test_1.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Reproduce unicode issue in VSCode.

About

Releases

Packages

Languages

tongfa/repro-unicode-error

Folders and files

Latest commit

History

Repository files navigation

Reproduce unicode issue in VSCode.

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages