-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improving the standard library implementation on Windows #17
Comments
cc @rust-lang/libs-api |
Thanks a lot for the detailed write up! We discussed this in our meeting last week, and while we can't make a decision on every single one of these issues at once, we do generally feel positive about fixing these, even if they might cause some theoretical breakage. A key point to keep in mind is that in Rust we care more about actual breakage than about theoretical breakage. So some change being observable isn't much of a problem when the change doesn't break (or maybe even improves) existing code. Of course, we can't see or test all Rust code in existence, so such changes would need some reasoning why the change (most likely) won't cause actual breakage before we can merge it. We figured that in some of these cases, that reasoning might be based on input from Microsoft itself. For example, Microsoft already changed the CRT command line argument parsing, so there must be some data/experience available about how much was broken (or fixed) by that. (There's a group of people who signed up specifically to provide input on Windows-specific issues. On the
I don't see anything like that happen any time soon. So if there's any way to fix such issues by changing 'std 1' without breaking a lot, that's absolutely worth discussing.
We do sometimes change documented promises if we think the risk of breakage/disruption is very low. (Here's an example: rust-lang/rust#76932) Many changes to the standard library are technically breaking changes in subtle ways, but as long as we're careful, that usually works out okay. |
Thanks for the thorough response! My worry was partly that knowing the difference between theoretical and actual breakage can sometimes be challenging. But I feel you've addressed that sufficiently.
This is very useful to know. I do plan on being cautious about breaking documented behaviour but it's good to know that there's some leeway there. In some places it may be enough to add documentation on how Windows and *nix differ in important ways. |
Recently I've been looking at ways I can improve the standard library implementation on Windows.
There are many issues which have been open for a long time which I believe I can help to fix but I was worried about the disruption doing so may cause. @m-ou-se suggested I create an issue here documenting my current plans in case there were anything the libs team felt needed discussing. Or at least to give everyone a heads up.
A big chunk of the open Windows issues fall into one of two categories:
std::env
andstd::process
. SpecificallyCommand
but also argument parsing (and #85270 fixes an issue with environment variables). I recently made two other PRs (#86406 and #86467) to address a couple of smaller issues withCommand
.There are other issues particularly within
std::fs
(e.g. #81357 and #29497) and I've suggested a workaround (#86447) for a driver problem that is the root cause of a number of issues.But anyway, back to the two categories mentioned above.
std::{env, process}
Argument parsing
In Windows, arguments are passed to a program as a string. This string is commonly parsed by the C runtime (CRT) in order to recover the arguments. In 2008 Microsoft subtly changed the way the CRT parses arguments. However, Rust still uses the old method. The change is small, it's centred around how consecutive double-quotes (e.g.
""
) are handled when encountered in a quoted block. See #44650. Fixing this should be fairly easy but is it technically a breaking a change?Command
uses the CRT rules to build an argument string to pass to an application. However, not every application parses commands according to these rules. Currently there's no good way to bypass Rust's handling. #85832 (by kornelski) seeks to address this. It might also be good to be able to override the zeroth argument, which by convention is set to the name of the binary being run but it needn't be. But I think both of these would be purely additions to the library and not change existing behaviour.Command
executable searchThe way
Command
finds executables is weird. By default it uses theCreateProcessW
search sequence. However if you useCommad::env
at all then it first attempts to emulate Linux's behaviour, which is rather surprising. See #37519 and #50870. The specific implementation of that could be improved but I question if it's a good idea in the first place.Also I'm not sure that the default search sequence is good. On internals, a CVE was discussed about
Command
looking for executables in the current directory. The problem is thatCreateProcessW
itself is, as with most of the Windows API, concerned with backwards compatibility rather than doing the "best" thing out of the box. Fortunately it provides for using a path to an executable instead of relying on its search.It was brought up in that thread that the way of resolving the executable name could be made configurable, which is a good idea. However, I still think changing the defaults might be good. But it might be disruptive.
There are also a number of other issues, mostly around child processes inheriting handles from the parent (#70719, #67616, #54760 and #38227. But I have not looked too deeply into these yet.
File paths
Most of the open issues here centre around the legacy
MAX_PATH
limits and verbatim paths (aka\\?\
prefixed paths) which allow bypassing the limit.\\?\
where needed on Windows\\?\
when displayed"\??\C:\..."
)r"\"
depends on whether set_current_dir has been calledIdeally verbatim paths would be purely an implementation detail that's not normally visible unless necessary. The std would use whatever path type makes most sense in the context, unless explicitly asked to do otherwise. I'm not sure if Rust can reach that ideal or not (at least not unless there's a
std 2.0
). What I think it can do is handle verbatim paths a little bit better.In Win32, resolving
.
and..
paths is done purely lexically. That is, the path is made absolute without ever touching the filesystem. The Win32 APIs will usually do this for you. However, verbatim paths are passed to the kernel (almost) verbatim. So.
and..
are never resolved and are instead treated as if they were actual files or directories (which they're not, according to Windows filesystem drivers). This has consequences for Rust. What should happen when usingPath::push
on such paths? Should it automatically resolve the path as you push? Should it panic? Should it resolve them at the point of use (e.g.fs::File::open
)? What if the user really really did want to pass a..
component to the kernel? (note: I can't think why anyone would but it is hypothetically possible that some device could use such components in some way.)There is also a problem with
/
. The Win32 API will automatically convert it to\
before passing the path to the kernel... unless verbatim paths are used. In that case it's again passed to the kernel verbatim, where it will be treated like any other character in a file name instead of being a path separator.I'm saying all this not to find answers here (internals or the main repo would be a better venue for discussing the particulars) but to give a flavour of the changes that may be required to handle Windows paths better. But I worry that such improvements may mean breaking the documented behaviour of
Path
, if only in small ways. That said, I am looking for ways to improve the implementation that have much less risk of being disruptive.The text was updated successfully, but these errors were encountered: