Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add shim for realpath on unix #2294

Closed
wants to merge 4 commits into from

Conversation

LegNeato
Copy link
Contributor

No description provided.

@LegNeato LegNeato mentioned this pull request Jun 30, 2022
@LegNeato
Copy link
Contributor Author

LegNeato commented Jun 30, 2022

This will likely fail on windows as I haven't tested it there yet (I don't have access to a windows box).

src/shims/unix/fs.rs Outdated Show resolved Hide resolved
src/shims/unix/fs.rs Outdated Show resolved Hide resolved
src/shims/unix/fs.rs Outdated Show resolved Hide resolved
tests/pass/fs.rs Outdated Show resolved Hide resolved
tests/pass/libc.rs Outdated Show resolved Hide resolved
tests/pass/libc.rs Outdated Show resolved Hide resolved
@LegNeato LegNeato force-pushed the shim-realpath branch 7 times, most recently from 086c8ee to 55da284 Compare July 2, 2022 07:56
@LegNeato
Copy link
Contributor Author

LegNeato commented Jul 2, 2022

I'm not sure why this test ran on windows, the file has // ignore-windows: No libc on Windows at the top.

@RalfJung
Copy link
Member

RalfJung commented Jul 2, 2022

As I said, everything in the test file refers to the target architecture. The failing test is running a Linux target on a Windows host.

// Make sure we get an error for long paths.
use std::convert::TryInto;
let too_long = "x/".repeat(libc::PATH_MAX.try_into().unwrap());
assert!(canonicalize(too_long).is_err());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we can also test the error kind, can't we?


// Test that a long path returns an error.
//
// Linux first checks if the path to exists and macos does not.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is that the behavior of running the code for real, or in Miri?
If the answer is "both", doesn't that mean that using a Linux target on a macOS host will behave differently than on a Linux host? Not sure if that is a problem though.

Copy link
Contributor Author

@LegNeato LegNeato Jul 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without miri, the test with a non-existent path fails on Linux and the same code succeeds on macos.

Yes, I am not sure it matters in practice though. I guess to be extra thorough I can catch the error in std::fs::canonicalize() and adjust if target != host

Copy link
Contributor Author

@LegNeato LegNeato Jul 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an existing function that checks if target != host? I can't seem to find one though I feel like it probably exists in the codebase.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No there is not. We don't want to ever actually check the host if we can help it, since behavior should be host-independent.

I guess POSIX does not specify whether a non-existing path returns an error or not? In that case this seems fine but there should be a comment in the implementation.

tests/pass/libc.rs Outdated Show resolved Hide resolved
tests/pass/libc.rs Outdated Show resolved Hide resolved
@bors
Copy link
Contributor

bors commented Jul 2, 2022

☔ The latest upstream changes (presumably #2306) made this pull request unmergeable. Please resolve the merge conflicts.

@LegNeato
Copy link
Contributor Author

LegNeato commented Jul 2, 2022

I need to think about this some more but I am not sure the Windows path issue is workable based on the current fs code.

Paths are part of the host environment leaking into the target environment. The target doesn't know how to operate on a host path. This hasn't been a problem because other syscalls just treat a path as an opaque pointer/handle to the filesystem, but the test I wrote for realpath() cares about the semantics of the path itself.

Because realpath() takes a target Path and hits the filesystem--which returns a host Path--roundtripping through it exposes the latent issue. When unix std code is given a windows path, it of course can't coherently perform operations on it that don't hit the filesystem...the string parsing logic is wrong! This is likely why the Path in the failure has "//" in it even though https://doc.rust-lang.org/stable/std/path/struct.Path.html#method.components says double separators are normalized. It seems we were not running tests with any std path-manipulation functions and the existing tests always reference target-generated paths for syscalls and assertions.

I think we need our own path abstraction so we can differentiate / translate between different path formats in miri code, or we need an entire filesystem abstraction?

@RalfJung
Copy link
Member

RalfJung commented Jul 2, 2022

Paths are part of the host environment leaking into the target environment.

Yes. We had that problem before, in fs.rs, and there the stuff we are doing in fn tmp was enough. It seems like unfortunately this is not the case here.

Note that functions like read_path_from_c_str also call convert_path_separator to further help with this. I think it's "just" a matter of figuring out the right way to assemble these functions.

// the resolved pathname, and returns a pointer to this buffer. The
// caller should deallocate this buffer using free(3)."
// <https://man7.org/linux/man-pages/man3/realpath.3.html>
this.alloc_os_str_as_c_str(resolved.as_os_str(), MiriMemoryKind::C.into())?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I just realized this is a general os_str function, but it should be a path function to do the directory separator conversion. We need alloc_path_as_c_str but such a function does not exist yet. Can you add it?

Comment on lines +17 to +23
// before constructing a `PathBuf`

#[cfg(windows)]
return PathBuf::from(tmp.replace("/", "\\"));

#[cfg(not(windows))]
return PathBuf::from(tmp.replace("\\", "/"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// before constructing a `PathBuf`
#[cfg(windows)]
return PathBuf::from(tmp.replace("/", "\\"));
#[cfg(not(windows))]
return PathBuf::from(tmp.replace("\\", "/"));
// before constructing a `PathBuf`.
return PathBuf::from(tmp.replace("\\", "/"));

This is a non-windows test.

@LegNeato
Copy link
Contributor Author

LegNeato commented Jul 3, 2022

Paths are part of the host environment leaking into the target environment.

Yes. We had that problem before, in fs.rs, and there the stuff we are doing in fn tmp was enough. It seems like unfortunately this is not the case here.

Note that functions like read_path_from_c_str also call convert_path_separator to further help with this. I think it's "just" a matter of figuring out the right way to assemble these functions.

I'm still not sure this is enough. For example, unix target code calling https://doc.rust-lang.org/stable/std/path/struct.Path.html#method.has_root will use the unix logic described there but the underlying string will be in windows/host format so the result is not correct. Do we have a way to say "when running under miri, even though the target OS is unix, use the windows implementation of std::path::Path"?`` But even with that, non-std code can still go from a Path to an OSString and code under test may expect a certain format of the OSString.

@RalfJung
Copy link
Member

RalfJung commented Jul 3, 2022

Yeah it's not perfect. But it might work for the test suite?

Do we have a way to say "when running under miri, even though the target OS is unix, use the windows implementation of std::path::Path"?``

No, when the target is Unix, then we only have MIR for the Unix part of std available.

@RalfJung RalfJung added the S-waiting-on-author Status: Waiting for the PR author to address review comments label Jul 11, 2022
@bors
Copy link
Contributor

bors commented Jul 18, 2022

☔ The latest upstream changes (presumably #2349) made this pull request unmergeable. Please resolve the merge conflicts.

@LegNeato
Copy link
Contributor Author

Going to close this and retool it a bit.

@LegNeato LegNeato closed this Jul 27, 2022
@RalfJung
Copy link
Member

RalfJung commented Aug 2, 2022

What do you mean by "retool"? This was pretty closed to finished, I think.

//
// Rather than creating a bunch of directories, we create two directories containing symlinks.
// Sadly we can't avoid creating directories and instead use a path like "./././././" or "./../../" as linux
// appears to collapse "." and ".." before checking path length.
Copy link
Member

@RalfJung RalfJung Aug 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how this test makes any sense... the actual real path after resolving all those symlinks to /tmp/.../x, i.e., it is very short. The test creates a very long input path but the interesting case is a very long output.

On my system this test produces a FilesystemLoop error.

bors added a commit that referenced this pull request Aug 2, 2022
Add shim for realpath on unix

Salvaged from #2294 by `@LegNeato`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: Waiting for the PR author to address review comments
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants