Issue with non-ascii characters in the JVM path on Windows #1111
Comments
I am going to have to depend on you to find a way to address the issue, as my local system likely won't reproduce the problem. Given that it is a SystemError rather than a Java error of some kind, I am not sure it is actually getting to the Java code. Most likely some encoding converter, when passing the string from Python to Java or in this case to the C library, forgot to specify the encoding. The specific code path is Raw char* gets passed to Windows
Which was called from
Which was called from
Which came through
So we can see where the party responsible for encoding that string was
So if it is bytes it would not be converted, but if it is unicode it would be encoded as UTF-8 with strict. As you can see, the encoder that did the work was Python. So either we need to change the encoding or push the error upstream, as the encoding came from a Python call, not a JPype one. My guess is that the Looking for similar issues: https://forums.sketchup.com/t/help-with-win32api-loadlibrary-and-utf-8-paths/95376/4 So one solution would be to convert to wide characters from UTF-8 then call Of course, just fixing startJVM may not get you working, as you will also need to see whether the classpath works. Given most of the Java API requires UTF-8, it likely will work. Hopefully that helps you identify the source of the issue. |
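To make the suspected failure mode concrete, here is a minimal sketch in plain Python (not JPype code). It assumes a hypothetical JDK path and assumes the machine's ANSI code page is cp1252; neither comes from the report. It shows what an ANSI Windows call would see if it were handed UTF-8 bytes, and that a UTF-16 round trip, which is what the wide-character calls work with, preserves the path.

```python
# Hypothetical JDK location with a non-ASCII character (not taken from the report).
path = "C:\\Users\\José\\jdk-17\\bin\\server\\jvm.dll"

# What a UTF-8 char* carries: "é" becomes the two bytes 0xC3 0xA9.
utf8_bytes = path.encode("utf-8", "strict")

# An ANSI ("A") Windows call interprets those bytes with the active code page.
# cp1252 is an assumption here; the real code page depends on the system locale.
seen_by_ansi_api = utf8_bytes.decode("cp1252")

# A wide ("W") call works with UTF-16 code units, which round-trip the path.
utf16_bytes = path.encode("utf-16-le")
seen_by_wide_api = utf16_bytes.decode("utf-16-le")

# True when the ANSI interpretation no longer names the same file.
path_was_garbled = seen_by_ansi_api != path
```

Under these assumptions `seen_by_ansi_api` contains "JosÃ©" in place of "José", so the library lookup would target a directory that does not exist, while `seen_by_wide_api` equals the original path.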
Cheers @Thrameos, thank you for looking into it so promptly! I'm afraid this is well beyond my skills and knowledge. I'd like to ask some help from @vepo and @py5coding ... let's see what happens. |
Okay. I will try to see if I can replicate when I get a chance. If you do want to mess with it, there are instructions for building JPype from source on the web, and at least for the encoding options it should be as little as changing the string value from "strict" to "ignore" or one of the other options and testing. Not sure if that would be fruitful, but if it is then all I need is that info so I can start working on a production patch. Assuming that changing the encoder doesn't work, the LoadLibraryW route would be harder, as you would need the pattern to convert a UTF-8 char* string into a wide string in C++. I don't recall it either and would have to research, but perhaps others can help you there. Perhaps I can make a proposed patch and you can test it? Either way, once we can replicate it, it should be fixable. |
Hi @Thrameos , I'd like to work on this issue so it can be fixed for our non-english speaking Windows users. To make progress here I need to be able to fiddle with the code, run the jpype build, and test. Building jpype on Linux is easy and worked on the first try. I can't run the build on Windows though...can you tell me what I am doing wrong? I have installed Microsoft Visual Studio 2022 Community Edition on my machine and java 17. This is the output when I run the build:
The directory Although on Windows I can install JPype with the tar.gz source distribution and I can create new source distributions on my Linux machine. If that's a sufficient workflow for what needs to be done here, I can go with that instead. As to testing, if I can't reproduce it on my machine I'll get help from @villares . Between the two of us, I think we can get to the bottom of this. |
Ooops, I just needed to add Next I'll try to reproduce @villares 's issue on my Windows machine and will try making that |
Sounds good. This was one I wasn't able to replicate and slipped through the cracks. Thanks for taking it up. |
OK, I have created several JPype builds that change the I emailed a folder with the compiled wheels to @villares for testing. The baseline build should reproduce the issue. Hopefully one of the other builds lets @villares start the JVM, which will be an important clue about what is going on and what we can do to fix it. |
Hi, I tested all the wheels built by @hx2A on Windows 11 today, and got the same results as the baseline build. They work with ASCII paths, and break with non-ASCII paths: |
It seems that changing the encoding didn't work, but the fact that it didn't work is useful information for planning next steps. @Thrameos , what would you like us to test next? |
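One possible reading of this result, offered as a sketch rather than a diagnosis: if the test builds changed only the error handler ("strict", "ignore", and so on) while still encoding to UTF-8, the bytes handed to Windows would be identical in every build, because UTF-8 can represent "é" and the error handler is never consulted. The path below is hypothetical.

```python
# Hypothetical path; only the "é" matters for the comparison.
path = "C:\\jdk-é\\bin\\server\\jvm.dll"

# Encode the same string to UTF-8 under several error handlers.
handlers = ["strict", "ignore", "replace"]
encoded = {name: path.encode("utf-8", name) for name in handlers}

# UTF-8 can represent "é", so no handler is ever triggered and
# every variant yields exactly the same byte string.
all_identical = len(set(encoded.values())) == 1

# For contrast, a codec that cannot represent "é" does depend on the handler:
# with "ignore" the character is silently dropped from the path.
ascii_ignore = path.encode("ascii", "ignore")
```

Here `all_identical` is `True`, which would be consistent with every wheel behaving like the baseline, whereas `ascii_ignore` shows the kind of change an error handler makes only when the codec itself cannot encode the character.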
I can give it another shot. My plan was to call the Python char to wchar converter and then LoadLibraryW. After all, this is a Python string we are passing to a system call, and it has nothing really to do with Java's wacky encoding. The problem is that LoadLibraryW is a system call and I have no idea whether it wants the wide char for the displayed character or perhaps something from the region character set. Thus all I can do is hope Python knows how to get the right encoding. Not having a region-encoded Windows version means I may not get the same behavior. See
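The plan described above (let Python produce the wide string, then call LoadLibraryW) can be tried from pure Python with ctypes before touching the native code. This is only an illustrative sketch: the helper names `to_wide` and `load_library_wide` are hypothetical, the actual fix lives in JPype's C++ layer, and at the C level the analogous conversion would presumably go through something like `PyUnicode_AsWideCharString`.

```python
import ctypes
import sys


def to_wide(path: str):
    """Copy a Python str into a NUL-terminated wchar_t buffer (hypothetical helper)."""
    return ctypes.create_unicode_buffer(path)


def load_library_wide(path: str):
    """Call LoadLibraryW on Windows; return None on other platforms."""
    if sys.platform != "win32":
        return None  # LoadLibraryW only exists on Windows
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.LoadLibraryW.argtypes = [ctypes.c_wchar_p]  # wide string in
    kernel32.LoadLibraryW.restype = ctypes.c_void_p      # module handle out
    handle = kernel32.LoadLibraryW(path)
    if not handle:
        # Surface the Windows error code instead of a bare failure.
        raise ctypes.WinError(ctypes.get_last_error())
    return handle
```

On a Windows machine with the JDK under a non-ASCII folder, calling `load_library_wide` with the full path to `jvm.dll` would show whether the wide call alone is enough to load the library, independent of JPype.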
I will have to work on it this weekend when I have time off.
|
@Thrameos , thank you for working on this and creating a potential fix. Unfortunately I overlooked the issue notification and didn't see this until now. @villares , can you test @Thrameos 's fix? I believe you can install it with the following command: pip install git+https://github.com/Thrameos/jpype@windows-locale If for some reason that doesn't work, use the wheel I created on Windows in the folder link I just DM'd you. |
Wow, excellent! Building the wheel requires compilation, which I know on Windows is a chore. @Thrameos , thank you so much for fixing this! |
Great I will push it out in the next release. |
fixed in 1.5.1 |
that's terrible news. Do you have any output? Please open a new issue. |
I tested 1.5.1 on Windows and didn't have any issues, at least for my PyGhidra use case. |
Context:
In Brazil and around the world, many users have names, usernames and various directories containing non-ASCII characters. I found this issue because some students couldn't run the same project/tools that other students (and myself) could run. A minimal way to reproduce is to put the JDK in a non-ASCII path location and then try the REPL steps below:
Removing the "é" character from the folder name allows the JVM to start. This seems to be a Windows issue, tested on Windows 10 21H1, and it seems not to be present on Linux.