-
Notifications
You must be signed in to change notification settings - Fork 893
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fallback to copy when rename causes cross-link error #2410
Conversation
Thank you for this patch. As background, rustup has a transactional layer here which attempts to make sure that installation of components is atomic. To make that work we only unpack things in the same directory structure and explicitly do not support having multiple filesystems sharing RUSTUP_HOME. The issue for your patch is by downgrading to copy, we no longer have that transactional protection and capability: if an IO error, or a memory error or an unexpected signal occurs during the middle of the copy (which will typically be of a couple hundreds of MB), then the installed toolchain location will have a partial write present (perhaps of a partial file, or perhaps of a subset of files of a directory). And the issue with that is that we have no UI for detecting or recovering from those partial writes in rustup today, so if we introduce this class of error that cannot exist today, we'll have to deal with the inevitable occurrences of this, which will be a support burden for a small team. Longer term @kinnison and I are discussing a different model for rustup where we can be eventually consistent, recover from partial writes and never write to a staging directory at all, avoiding this complication. Finally, it makes very little sense to write a file to disk, attempt to rename it, then copy-and-delete it: this is strictly slower than unpacking and renaming as we do today, or unpacking directly to the desired location, which would address the issue with docker. As a way forward, in the interim, I suggest either:
By unsupported I mean:
This will mean that the current status quo will remain, where the default behaviour is very robust and it doesn't work with additional docker layers, but folk that search for the error can find docs on how to "fix" it. For a more polished approach, we could do something along these lines - https://github.com/sindresorhus/is-docker/blob/master/index.js. I'll leave some thoughts on the code too. |
src/utils/utils.rs
Outdated
@@ -615,6 +634,14 @@ where | |||
notify_handler(Notification::RenameInUse(&src, &dest).into()); | |||
OperationResult::Retry(e) | |||
} | |||
io::ErrorKind::Other | |||
if cfg!(target_os = "linux") && Some(18i32) == e.raw_os_error() => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
written like this, this code path is compiled into every binary, linux or not; I think a #[cfg(linux)] block would be better.
Rather than having the magic number 18 here, please use https://docs.rs/libc/0.2.72/libc/constant.EXDEV.html
src/utils/utils.rs
Outdated
@@ -593,6 +593,25 @@ pub fn toolchain_sort<T: AsRef<str>>(v: &mut Vec<T>) { | |||
}); | |||
} | |||
|
|||
fn slow_rename<'a, N>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest copy_and_delete as a name.
@rbtcollins Thanks for the detailed feedback. I came across this issue and it seemed like a quick fix, but I can now see why it's a bit more complicated. Still I think this is a common enough problem that the fallback could be useful, so I extended the patch.
I'm a little weary of this because there are now multiple container runtimes besides docker and I assume they might have the same issue. It's also possible that docker won't even be a problem with rename depending on how redirct_dir is tuned. |
@rbtcollins Can you please take a look when you have a chance. |
src/utils/utils.rs
Outdated
// https://github.com/rust-lang/rustup/issues/1239 | ||
// This uses std::fs::copy() instead of the faster std::fs::rename() to | ||
// avoid cross-device link errors. | ||
if src.is_file() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than checking is_file, check !is_dir() (or invert the order and check is_dir); that way you'll correctly fall back to a simple copy for symlinks
src/utils/utils.rs
Outdated
if src.is_file() { | ||
copy_file(src, dest).and(remove_file(name, src)) | ||
} else { | ||
copy_dir(src, dest, notify_handler).and(remove_dir(name, src, notify_handler)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove_dir sends a notification that a directory is being removed; thats sensible for e.g. removing an installed toolchain, but here I'm not sure we want to do that, since we're emulating rename; I think calling raw::remove_dir, or with the modification I suggest above, remove_dir_all::remove_dir_all(path) directly, would be better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking pretty good, one last change please.
This feature is enabled by RUSTUP_PERMIT_COPY_RENAME and is Linux only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing this will be awkward since it adds a dependency on nested docker :/. I think as it is an unsupported thing that that is probably ok.
This is the issue #1239. There is further discussion here. We came across it in our dockerized CI environment.
The fix for the bug is to fall back to copying when rename fails due to a cross-link error on Linux.
I'm not sure how a test could be created for this, but it can be reproduce in docker so long as dockerhub doesn't delete the images I used.
Please let me know what you think.