-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix freshness when linking is interrupted. #8087
Conversation
(rust_highfive has picked a reviewer for you, use r? to override) |
Truncating the file seems a little hacky, but it seemed like a reasonable way to invalidate the fingerprint. I don't want to delete it because that would break the |
☔ The latest upstream changes (presumably #8073) made this pull request unmergeable. Please resolve the merge conflicts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the docs! I've got a question about the scenario this is fixing, but in general looks great to me
// Interrupt the child. | ||
child.kill().unwrap(); | ||
// Note: rustc and the linker are still running, let them exit here. | ||
conn.write(b"X").unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically it should be sufficient to drop the conn
here rather than writing to it, but either way is fine.
// up-to-date until after a successful build. | ||
if loc.exists() { | ||
paths::write(&loc, b"")?; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to make sure I fully understand this. So today we consider step 4 fresh, but it should actually rerun because the linker didn't actually finish successfully. When linking is interrupted the compiler has already emitted the dep-info file, so when we finish the unit and wrap up the fingerprint stuff it looks like it succeeded because the output file exists and the dep-info exists? Is that right?
This feels like it may be best handled by we truncate the file if the process failed for whatever reason. I'm a little hazy on this though. Is the idea that this truncation happens in step 3, which when interrupted, means step 4 picks up the truncated file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and the dep-info exists? Is that right?
I don't think it matters whether or not rustc emitted the .d
file. Cargo doesn't translate it into the fingerprint directory unless the compilation finished successfully.
The issue is that there is a valid fingerprint from the previous run (step 1). During step 3, the new unit integration test executable is created but not completed. In step 4, all components of the unit integration test fingerprint are fresh from step 1 (nothing changed), and the mtime comparison of the incomplete executable is newer than the library's rlib (and all the unit integration test source files), so it thinks it is fresh.
truncate the file if the process failed
I don't think we can only truncate on failure, because Cargo is forcefully terminated. (I guess we could catch signals but that doesn't help with SIGKILL, and I'd prefer not to deal with signals.)
Is the idea that this truncation happens in step 3, which when interrupted, means step 4 picks up the truncated file?
Yes, step 4 compares an empty string with the expected hash string, and determines it is dirty because they are not equal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Man see this is when I have mega-doubts about everything. Doing anything with mtimes is so tricky that it seems like a folly almost. In any case agreed we shouldn't catch signals (was misinterpreting this for a second as failed rustc vs interrupted cargo).
I think the way we should think about this is that when we start a compilation we delete the expected outputs. That way if something bad happens, the next time we pick up the outputs won't exist and we'll see we need to recompile. This only works if the outputs are Cargo-created, not well-known files (like the linked artifact). The dep-info here, consequently, is a Cargo-output which only Cargo creates, so it makes sense to delete it before a build and create it after a successful build.
Would it perhaps make more sense to delete the file here rather than truncating it to zero bytes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason it truncates instead of deleting is that the fingerprint debug logging doesn't work in the scenario:
- Successful build.
- Change something.
- Build with a compile error. Logs fingerprint reason. Delete fingerprint.
- Build again. Fingerprint log just says "file not found" with no other detail.
I wanted to make sure step 4 continued to log the real reason for the fingerprint error (like "current filesystem status shows we're outdated") rather than "file not found".
I don't think it is too important (the first build failure logs the real reason), so I can switch it to delete if you prefer. I'm curious what concerns you have about truncating, though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah that makes sense to me, so I think truncation here is fine. Could you update the comment with some of this discussion though? (about what files are flowing into what steps and why we're truncating instead of deleting).
I'm slightly confused again though. I would have thought that step 4 can't read the fingerprint file of step 1 (without this patch) because the mtime of the input files (from step 2) shows that the fingerprint file is stale. The mtime of the file from step 2 would be later than that of the fingerprint file from step 1, right?
(or do we write out the fingerprint file again in step 3?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking mtime is done at the same time as checking the fingerprint hash here. So in a sense it checks both the fingerprint hash and the fs_status at the same time. If either of them fail, it falls through to the code below to log the failure.
If it deleted the file, the first line in compare_old_fingerprint
would fail, circumventing the logging.
(Fingerprints are only written after a successful build.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Er sorry I was roundaboutedly asking a question on your response above where you said
The issue is that there is a valid fingerprint from the previous run (step 1). During step 3, the new unit test executable is created but not completed. In step 4, all components of the unit test fingerprint are fresh from step 1 (nothing changed), and the mtime comparison of the incomplete executable is newer than the library's rlib (and all the unit test source files), so it thinks it is fresh.
You say "all components of the unit test fingerprint are fresh from step 1" but I thought that the final fingerprint file that Cargo writes for the executable woul dnot be fresh, right? The executable is there (and corrupt) but a source file is still newer than the final fingerprint file because we don't write that until rustc finishes I thought?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I made a typo in my reply. Where I said "unit test" I meant "integration test". Going back to the original example, "lib.rs" is changed, and compiles successfully. Then Cargo goes to build the integration test executable, and is interrupted. The next time around (step 4) "lib.rs" is up-to-date (and correct on-disk). And none of the integration test sources have changed, so that check appears up-to-date as well.
That was a bit of a subtle point, I'll emphasize that in the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a commit with extra details. Hopefully it's not too confusing, I tried to be more precise with the wording.
8c54e58
to
cd396f3
Compare
@bors: r+ Ah ok that was a key part I was missing. This workaround isn't necessary for a one-unit compilation, but it's required for multi-unit compilations where you don't change the source of previous compilations. This happens because we don't hash the contents of each output, we just rely on mtimes. In any case looks good to me :) |
📌 Commit 14e86cc has been approved by |
☀️ Test successful - checks-azure |
Update cargo 12 commits in 390e8f245ef2cd7ac698b8a76abf029f9abcab0d..74e3a7d5b756d7c0e94399fc29fcd154e792c22a 2020-04-07 17:46:45 +0000 to 2020-04-13 20:41:52 +0000 - Update dependencies to support illumos target (rust-lang/cargo#8093) - Whitelist another known spurious curl error (rust-lang/cargo#8102) - Fix nightly test matching rustc "warning" output. (rust-lang/cargo#8098) - Update default for codegen-units. (rust-lang/cargo#8096) - Fix freshness when linking is interrupted. (rust-lang/cargo#8087) - Add `cargo tree` command. (rust-lang/cargo#8062) - Add "build-finished" JSON message. (rust-lang/cargo#8069) - Extend -Zpackage-features with more capabilities. (rust-lang/cargo#8074) - Disallow invalid dependency names through crate renaming (rust-lang/cargo#8090) - Use the same filename hash for pre-release channels. (rust-lang/cargo#8073) - Index the commands section (rust-lang/cargo#8081) - Upgrade to mdBook v0.3.7 (rust-lang/cargo#8083)
Fixes a scenario where hitting Ctrl-C while linking would leave a corrupted executable, but Cargo would think it is "fresh" and fail to rebuild it.
This also includes a separate commit which adds more documentation on fingerprinting.
Fixes #7767