[mono][aot] Fix support for files with non-ascii characters on windows #92279
cc @fanyang-mono. Shamelessly stole some code from #90436.
src/mono/mono/eglib/gfile.c
FILE *fp;

#ifdef HOST_WIN32
gunichar2 *wPath = g_utf8_to_utf16 (path, -1, 0, 0, 0);
One minor optimization that Johan suggested, and that I wasn't able to get working, is that if path contains only ASCII characters, it could simply use fopen. I tried adding the optimization here, but it failed the tests on CI. I haven't figured out how to test it locally, so I took it out. If you could get it right, that would be great!
I don't see this optimization having any impact in the real world, especially since we are dealing with file access, which is bound to be quite slow. Unless the encoding conversion is crazy slow, which I doubt.
I don't really have any real-world experience with the relevant APIs. Looking at the code, we could avoid two calls to g_utf8_to_utf16.
@lateralusX Do you have any insights here?
I have not measured the speed of encoding conversions on any platform, and relative to file IO it's probably a small amount of time. The idea was to avoid the heap allocation and encoding conversion when they are not really needed (when all characters are ASCII, the standard function can be called directly). Since the logic lives in a couple of g_ functions, the optimization will be isolated, simple and straightforward. I still think avoiding string encoding conversions and heap allocations when not needed is worthwhile, but I agree that in this case the frequency of calls coming into these functions will be low.
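For illustration only, the ASCII check discussed above could look roughly like the following. This is a minimal sketch, not the code from this PR: `path_is_ascii` is a hypothetical name, and the assumption is that a path counts as ASCII when every byte before the terminating NUL is at most 0x7F.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not from the PR): returns true when every byte of
 * the NUL-terminated string is 7-bit ASCII, so the narrow-char API could
 * be called directly without a UTF-8 to UTF-16 conversion. */
static bool
path_is_ascii (const char *path)
{
	if (path == NULL)
		return false;

	/* Scan as unsigned bytes so values above 0x7F are not seen as negative. */
	for (const unsigned char *p = (const unsigned char *)path; *p != 0; ++p) {
		if (*p > 0x7F)
			return false;
	}
	return true;
}
```

A caller would take the plain `fopen` path when this returns true and fall back to the conversion otherwise. The scan is linear in the path length and does no allocation, which is the point of the fast path.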
@lateralusX I added the ASCII string check to avoid the UTF-16 conversion. Could you do a final review of this PR?
/azp run runtime-extra-platforms
Azure Pipelines successfully started running 1 pipeline(s).
Add g_fopen, g_unlink and g_rename which on windows do a utf8 to utf16 conversion and then call the corresponding wide char api.
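As a rough sketch of the convert-then-call-wide-API pattern this commit describes, the following shows one way an `fopen` wrapper could be written. It is not the eglib implementation: `sketch_fopen` and `utf8_to_wide` are hypothetical names, it uses the Win32 `MultiByteToWideChar` call in place of `g_utf8_to_utf16` to stay self-contained, and it tests the compiler's `_WIN32` macro rather than mono's `HOST_WIN32`.

```c
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: convert a NUL-terminated UTF-8 string to a
 * malloc'ed UTF-16 string. Returns NULL on invalid input or allocation
 * failure; the caller frees the result. */
static wchar_t *
utf8_to_wide (const char *s)
{
	/* First call asks for the required length, including the NUL. */
	int len = MultiByteToWideChar (CP_UTF8, MB_ERR_INVALID_CHARS, s, -1, NULL, 0);
	if (len <= 0)
		return NULL;

	wchar_t *w = malloc ((size_t)len * sizeof (wchar_t));
	if (w != NULL && MultiByteToWideChar (CP_UTF8, MB_ERR_INVALID_CHARS, s, -1, w, len) <= 0) {
		free (w);
		w = NULL;
	}
	return w;
}
#endif

/* Hypothetical wrapper: takes a UTF-8 path on every platform. On Windows
 * it converts path and mode to UTF-16 and calls _wfopen; elsewhere it
 * forwards to fopen unchanged. */
static FILE *
sketch_fopen (const char *path, const char *mode)
{
#ifdef _WIN32
	wchar_t *wpath = utf8_to_wide (path);
	wchar_t *wmode = utf8_to_wide (mode);
	FILE *fp = (wpath != NULL && wmode != NULL) ? _wfopen (wpath, wmode) : NULL;
	free (wpath); /* free (NULL) is a no-op */
	free (wmode);
	return fp;
#else
	return fopen (path, mode);
#endif
}
```

The same shape would presumably apply to the unlink and rename wrappers, with `_wunlink` and `_wrename` as the wide-char counterparts on Windows.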
63f0b38 to 6c59108
6c59108 to 6192682
Contributes to #83203