Build on Github Action with macos-latest missing dependencies #8603

Closed
zonyitoo opened this issue Aug 8, 2020 · 10 comments
Labels
A-rebuild-detection Area: rebuild detection and fingerprinting C-bug Category: bug

Comments

@zonyitoo

zonyitoo commented Aug 8, 2020

Problem

https://github.com/shadowsocks/shadowsocks-rust/runs/962064929?check_suite_focus=true

A build running on GitHub Actions with macos-latest always reports that serde_derive is missing.

@zonyitoo zonyitoo added the C-bug Category: bug label Aug 8, 2020
zonyitoo added a commit to shadowsocks/shadowsocks-rust that referenced this issue Aug 9, 2020
@zonyitoo
Author

zonyitoo commented Aug 9, 2020

It works if the cache is disabled.

@ehuss
Contributor

ehuss commented Aug 19, 2020

Without the cache, this will be very difficult to diagnose. The most likely cause that I can think of is that the libserde_derive.dylib file was truncated or otherwise corrupted in the cache. In that circumstance, cargo doesn't know it needs to be rebuilt, and rustc will silently ignore the dylib. Unfortunately I can't think of much that can be done in that situation, other than clearing the cache.

If you're able to reproduce on CI, please let me know!

@ehuss ehuss added the A-rebuild-detection Area: rebuild detection and fingerprinting label Aug 19, 2020
@zonyitoo
Author

It is reproducible on CI nearly 100% of the time just by enabling the cache.

@ehuss
Contributor

ehuss commented Aug 19, 2020

So I've been digging into this a little bit, and I've been able to reproduce on GHA. Unfortunately I cannot reproduce it locally. As far as I can see, it looks to be an issue with the GitHub cache action. After clearing the cache and running cargo build in your project, I added some debugging lines and discovered that the cache is overwriting the first 8MB (0x800000 bytes) of the target/debug/deps/libserde_derive-797b01cb80d42716.dylib file with zeros.

I tried reproducing locally using the tar commands displayed in the build logs to reproduce their caching behavior, but I could not reproduce the problem.

The problem also seems to be very sensitive to the nature of your project. Removing the RUSTFLAGS: "-Ctarget-feature=+aes,+ssse3" makes the problem go away, as does removing --features "aes-pmac-siv". I have no idea why this would affect things.

I would recommend you follow up with GitHub on their issue tracker (https://github.com/actions/cache) to figure out why the cache is corrupting that file.

Here's a basic example from one of my tests:

  1. From a cleared cache: https://github.com/ehuss/shadowsocks-rust/runs/1004677582?check_suite_focus=true
    The last step shows the shasum of the file: 0aeb971a8830e77ee92843e42e424887a448862e target/debug/deps/libserde_derive-797b01cb80d42716.dylib
  2. The next run, which makes a small modification to src/lib.rs: https://github.com/ehuss/shadowsocks-rust/runs/1004692842?check_suite_focus=true
    Immediately after the cache has been extracted, the shasum of the file is: 4e2ef1b28a57bd1c72f5df110b800dcbf8d472d1 target/debug/deps/libserde_derive-797b01cb80d42716.dylib

I think this pretty clearly shows the cache is modifying it somehow. I did some other tests where I downloaded the file, and confirmed that the first 8MB of the file has been zeroed out.
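The two checks described above (comparing the shasum before and after the cache is restored, and confirming that the start of the file is zeroed) can be sketched in Python. This is only an illustration, not the debugging code actually used in the linked runs; the helper names `sha1_of` and `leading_zero_bytes` are hypothetical.

```python
import hashlib

CHUNK = 1 << 20  # read 1 MiB at a time


def sha1_of(path):
    """Return the SHA-1 hex digest of a file (the default algorithm of `shasum`)."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            digest.update(block)
    return digest.hexdigest()


def leading_zero_bytes(path):
    """Count how many bytes at the very start of the file are 0x00."""
    count = 0
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            stripped = block.lstrip(b"\x00")
            count += len(block) - len(stripped)
            if stripped:  # reached the first non-zero byte
                break
    return count
```

A healthy dylib starts with a non-zero magic number, so `leading_zero_bytes` should return 0 for it. A result of 0x800000 or more for libserde_derive-797b01cb80d42716.dylib would match the corruption described here.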

@dhadka

dhadka commented Aug 21, 2020

@ehuss This is a really weird one. I put some more details over in actions/cache#403, but the summary is that:

  1. I can reproduce the issue by just running tar -cf ... followed by tar -xf ..., so it seems unrelated to the cache and instead happens during tar
  2. Depending on what other commands I'm running before creating the tar, the issue magically disappears. For example, if I add sleep 10 or just add verbose logging to tar (tar -cvf ...) I can't repro.
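A rough sketch of that tar round trip, for anyone who wants to check their own target directory. It assumes a `tar` binary on PATH that accepts `-cf`, `-xf`, and `-C` (both bsdtar and GNU tar do). The function name `tar_round_trip_changes` is hypothetical, and the exact tar invocations from the build logs are not reproduced here. Given how timing-sensitive the issue is, an empty result does not prove the bug is absent.

```python
import hashlib
import subprocess
import tempfile
from pathlib import Path


def sha1_of(path):
    """SHA-1 hex digest of a file, read in one go (fine for a quick check)."""
    return hashlib.sha1(Path(path).read_bytes()).hexdigest()


def tar_round_trip_changes(src_dir, tar_bin="tar"):
    """Archive src_dir with `tar -cf`, extract it with `tar -xf`, and return
    the relative paths whose contents differ after the round trip."""
    src = Path(src_dir).resolve()
    changed = []
    with tempfile.TemporaryDirectory() as work:
        archive = Path(work) / "cache.tar"
        out = Path(work) / "out"
        out.mkdir()
        subprocess.run([tar_bin, "-cf", str(archive), "-C", str(src), "."], check=True)
        subprocess.run([tar_bin, "-xf", str(archive), "-C", str(out)], check=True)
        for original in sorted(p for p in src.rglob("*") if p.is_file()):
            rel = original.relative_to(src)
            copy = out / rel
            # A missing or different extracted file counts as a change.
            if not copy.is_file() or sha1_of(original) != sha1_of(copy):
                changed.append(str(rel))
    return changed
```

Calling `tar_round_trip_changes("target/debug/deps")` right after `cargo build` would list any files that came back different.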

zonyitoo added a commit to shadowsocks/shadowsocks-rust that referenced this issue Aug 22, 2020
@bk2204
Contributor

bk2204 commented Aug 31, 2020

If this is Catalina, I think this may be the same bug as Homebrew/brew#6539. (I'm familiar with this because I maintain Git LFS and it broke that Homebrew package.) That bug mentions a workaround, so it may be worth trying to figure out (or ask) what the Homebrew folks did and see if that might work for you.

@ehuss
Contributor

ehuss commented Aug 31, 2020

I think the workaround they are referencing is calling /usr/sbin/purge (as root), which flushes the disk cache.

Another workaround is to use GNU tar as described here: actions/cache#403 (comment)

I wonder if there is some strange interaction with APFS sparse files.
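For illustration, here is a minimal Python sketch of how a CI helper script might apply those two workarounds. It rests on assumptions: `gtar` is the name GNU tar is commonly installed under on macOS (for example by Homebrew's gnu-tar package), and `sudo` is available without a password prompt. The function names `pick_tar` and `flush_disk_cache` are hypothetical, and this is not what actions/cache itself does.

```python
import shutil
import subprocess
import sys


def pick_tar(candidates=("gtar", "tar"), which=shutil.which):
    """Return the path of the first tar binary found, preferring GNU tar
    under its assumed macOS name `gtar`. Returns None if none is found."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


def flush_disk_cache():
    """On macOS, run /usr/sbin/purge as root to flush the disk cache before
    archiving. Does nothing elsewhere. Returns True if purge was run."""
    if sys.platform != "darwin":
        return False
    subprocess.run(["sudo", "/usr/sbin/purge"], check=True)
    return True
```

The `which` parameter exists only so the lookup can be replaced in tests. In normal use, `pick_tar()` with no arguments is enough.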

rrbutani added a commit to rrbutani/lc3tools-sys that referenced this issue Sep 1, 2020
actions/cache#403 tracks this issue.
rust-lang/cargo#8603 has the exact issue that we run into.
mkantor added a commit to mkantor/operator that referenced this issue Oct 6, 2020
Frederick888 added a commit to Frederick888/git-credential-keepassxc that referenced this issue Dec 9, 2020
xu-cheng added a commit to xu-cheng/howlong that referenced this issue Dec 24, 2020
xu-cheng added a commit to xu-cheng/pandoc-katex that referenced this issue Dec 24, 2020
Frederick888 added a commit to Frederick888/git-credential-keepassxc that referenced this issue Jan 21, 2021
fanninpm added a commit to fanninpm/RustPython that referenced this issue Apr 24, 2021
This will probably not work due to some currently unresolved issues:
 - rust-lang/cargo#8603
 - actions/cache#403
coolreader18 pushed a commit to RustPython/RustPython that referenced this issue May 2, 2021
* Unskip test_argparse test_json test_bytes test_bytearray test_long test_unicode test_array test_asyncgen test_list test_complex test_set test_dis test_calendar in macOS CI

* Re-enable cache on macOS

This will probably not work due to some currently unresolved issues:
 - rust-lang/cargo#8603
 - actions/cache#403

* Use Swatinem's rust-cache action
drgora added a commit to HorizenOfficial/zen that referenced this issue Apr 12, 2023
This is needed to fix the CI on recent OSX systems. For reference, see rust-lang/cargo#8603.
@overlookmotel

I believe actions/cache#403 was resolved about 2 years ago, and according to rust-lang/miri#2565 no workaround is required any more.

Can this issue be closed now?

@weihanglo
Member

Sounds good. If it still happens, we can reopen or create a new one.

@tgross35

tgross35 commented Dec 7, 2024

Just for posterity: GHA changed to GNU tar which doesn't hit this problem as often. However, there is a lot more discussion at https://trac.macports.org/ticket/67336?cversion=0&cnum_hist=15 and the conclusion seems to be that this is a still-unfixed filesystem bug rather than a tar bug (come on Apple...). So it is probably still possible to hit this bug, even if various workarounds may have made it less likely.

This is not actionable on Cargo's end so leaving the issue closed makes sense.
