stdlib Path has inconsistent normalisation behaviour #29008
Comments
Result:
This one contradicts the docs for pop https://doc.rust-lang.org/std/path/struct.PathBuf.html#method.pop
|
cc @aturon |
This one surprised me a lot (and wasted a fair amount of time)!
http://is.gd/yli8ls
|
I'm pretty sure it's a bug if the implementation of |
I confess, I've not found the edge-case handling of the I'm secretly a bit pleased about the |
triage: I-nominated |
triage: P-high |
seems like we should fix the Hash/Eq situation at the very least |
Thanks @alexcrichton for taking care of the hash issue -- it's an interesting footgun with derive. Regarding the semantic/normalization issues: there's definitely disagreement with the docs here, which is bad (and indicates insufficient unit testing). The question about trailing In any case, I agree that the behavior here is counterintuitive and there are cases where you want trailing If the APIs weren't stable, I'd suggest treating The behavior of All that said, I'm not sure what's best to do given the stable status of these APIs. We should definitely discuss this at the next libs team meeting. |
Almost all operations on Path are based on the components iterator in one form or another to handle equivalent paths. The `Hash` implementations, however, mistakenly just went straight to the underlying `OsStr`, causing these equivalent paths to not get merged together. This commit updates the `Hash` implementation to also be based on the iterator which should ensure that if two paths are equal they hash to the same thing. cc rust-lang#29008, but doesn't close it
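The invariant this commit message describes can be checked with a small sketch. The `hash_of` helper is a hypothetical name introduced here for illustration, and the example assumes a toolchain that includes the `Hash` fix:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::path::Path;

/// Hypothetical helper: hash a path with the standard library's default hasher.
fn hash_of(path: &Path) -> u64 {
    let mut hasher = DefaultHasher::new();
    path.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let plain = Path::new("foo/bar");
    let trailing = Path::new("foo/bar/");

    // Equality is component-based, so the trailing separator is ignored.
    assert_eq!(plain, trailing);

    // With a component-based Hash, equal paths must hash identically.
    assert_eq!(hash_of(plain), hash_of(trailing));

    // As a consequence, both spellings occupy a single slot in a HashSet.
    let mut set = HashSet::new();
    set.insert(plain);
    set.insert(trailing);
    assert_eq!(set.len(), 1);
}
```

Before the fix, the two spellings compared equal but could land in different hash buckets, which is the footgun mentioned earlier in the thread.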
Just focusing on the trailing slash aspect of this issue. Some notes on using
As much as I hate to admit it, I think treating trailing slashes as meaningful in terms of components is probably outvoted (especially since this behaviour is stable). So: document trailing slashes being ignored as part of
Python 2:
C++ (boost 1.59):
Tcl 8.5
Ruby 2.2
Nodejs 5
Java 1.8
PHP 5.6
Go
|
@aturon is going to send a PR about docs to clarify trailing slash behavior and some of the other issues going on here. |
.NET handles it like Python and C++, which does seem like more useful behavior: https://msdn.microsoft.com/en-us/library/system.io.path.getfilename%28v=vs.110%29.aspx
|
The stdlib Python 3
|
Just needs to update docs per @aturon's last comment. |
make note of one more normalization that Paths do
Fixes rust-lang#29008
Given:
You get:
http://is.gd/XbyUdj
First, the documentation is misleading (wrong?). `file_name` (https://doc.rust-lang.org/std/path/struct.PathBuf.html#method.file_name) is documented as "The final component of the path, if it is a normal file.", but it's returning the directory when there's a trailing slash, which is not (by my book) a normal file?
Looking at the implementation of `components`, it will return the final component of the path unless it is the root directory or a relative component. Maybe it just needs clarifying.
There's a stranger issue with normalisation of trailing slashes. Initially I thought it should be documented (https://doc.rust-lang.org/std/path/index.html#normalization), because trailing slashes are ignored by `components()`. However, unlike any other normalisation, it is possible to reconstruct the original path (by doing `.push("")`)!
I'm unsure what the intended behaviour is here and what needs fixing (implementation or docs).
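The behaviour described above can be reproduced with a short sketch. `push_empty` is a hypothetical helper named here for illustration, and the literal paths are examples rather than the ones from the original report:

```rust
use std::ffi::OsStr;
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: append an empty component, which is how the
/// issue reconstructs a trailing separator.
fn push_empty(path: &str) -> PathBuf {
    let mut buf = PathBuf::from(path);
    buf.push("");
    buf
}

fn main() {
    let with_slash = Path::new("/foo/bar/");

    // file_name() returns the last normal component, trailing slash or not.
    assert_eq!(with_slash.file_name(), Some(OsStr::new("bar")));

    // components() does not yield anything for the trailing separator.
    let parts: Vec<Component> = with_slash.components().collect();
    assert_eq!(
        parts,
        vec![
            Component::RootDir,
            Component::Normal(OsStr::new("foo")),
            Component::Normal(OsStr::new("bar")),
        ]
    );

    // push("") makes the underlying string one separator longer...
    let rebuilt = push_empty("/foo/bar");
    assert_eq!(rebuilt.as_os_str().len(), "/foo/bar".len() + 1);

    // ...yet the result still compares equal to the path without it.
    assert_eq!(rebuilt, PathBuf::from("/foo/bar"));
}
```

The last two assertions show the oddity: the trailing separator is present in the stored string but invisible to comparison and to `components()`.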
Given stabilisation, I'm going to assume that implemented behaviour wins at this point? It's slightly irritating because being able to detect a trailing slash can be useful (as rsync does). Failing that I might be tempted to make `.push("")` do nothing - this uncertain state with trailing slashes is strange. Is there an RFC I can look at maybe?
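For callers who need rsync-style detection today, one possible workaround is to inspect the underlying string instead of the components. This is only a sketch: `has_trailing_separator` is a hypothetical helper, not a std API, and it goes through a lossy string conversion for simplicity:

```rust
use std::path::{is_separator, Path};

/// Hypothetical helper (not part of std): reports whether the textual
/// form of the path ends with a path separator. Note that a bare root
/// such as "/" also counts as ending with a separator.
fn has_trailing_separator(path: &Path) -> bool {
    path.to_string_lossy()
        .chars()
        .last()
        .map_or(false, is_separator)
}

fn main() {
    let dir_style = Path::new("src/");
    let file_style = Path::new("src");

    // The two spellings are equal as paths...
    assert_eq!(dir_style, file_style);

    // ...but the raw string still distinguishes them.
    assert!(has_trailing_separator(dir_style));
    assert!(!has_trailing_separator(file_style));
}
```

Because the check looks only at the last character, it treats the separator as textual information and does not depend on how `components()` normalises the path.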