Rust nightlies published to the same archive date two nights in a row, broke everything #33063

Closed
brson opened this issue Apr 17, 2016 · 8 comments
Labels
E-help-wanted (Call for participation: Help is requested to fix this issue.), P-low (Low priority)

Comments

@brson
Contributor

brson commented Apr 17, 2016

These two nightlies were published to the same archive date (04-17), two nights in a row:

Then the packaging build failed.

This seems to have broken rustup, which is getting checksum failures, and rustup.sh (which is downloading the wrong files).

cc @alexcrichton

I've started rebuilding the rust-packaging build, but it seems like we should have some more validation that we don't reuse archive dates.

@brson
Contributor Author

brson commented Apr 17, 2016

rust-packaging is failing because arm-unknown-linux-gnueabi doesn't exist in channel-rustc-nightly, and indeed you can see that in the recent rustc build there was no rustc for that architecture.

@brson
Contributor Author

brson commented Apr 17, 2016

@brson
Contributor Author

brson commented Apr 17, 2016

I think there are at least two fixes that need to happen here:

1. rust-buildbot shouldn't overwrite files in the archives. Unfortunately it's not obvious how to make s3cmd do this.
2. rust-buildbot should allow fetch-inputs/package-rust to fail for nogate platforms.
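To illustrate the second fix, here is a minimal sketch of a fetch loop that tolerates missing artifacts only for non-gating platforms. Everything in it is an assumption made for illustration: the `fetch_all` helper, the use of `LookupError` to signal a missing artifact, the `x86_64-unknown-linux-gnu` entry, and the idea that `arm-unknown-linux-gnueabi` is in the nogate set. It is not how rust-buildbot is actually structured.

```python
from typing import Callable, Iterable, List, Set, Tuple


def fetch_all(
    platforms: Iterable[str],
    nogate: Set[str],
    fetch: Callable[[str], None],
) -> Tuple[List[str], List[str]]:
    """Fetch every platform; a missing artifact is fatal only for gating platforms."""
    fetched: List[str] = []
    skipped: List[str] = []
    for platform in platforms:
        try:
            fetch(platform)
        except LookupError:
            if platform not in nogate:
                raise  # gating platform: the whole packaging step should fail
            skipped.append(platform)  # nogate platform: note it and carry on
        else:
            fetched.append(platform)
    return fetched, skipped


# Hypothetical stand-in for the real fetch step: only one platform was built.
available = {"x86_64-unknown-linux-gnu"}


def fake_fetch(platform: str) -> None:
    if platform not in available:
        raise LookupError(platform + " missing from channel manifest")


fetched, skipped = fetch_all(
    ["x86_64-unknown-linux-gnu", "arm-unknown-linux-gnueabi"],
    nogate={"arm-unknown-linux-gnueabi"},  # assumed to be nogate for this sketch
    fetch=fake_fetch,
)
print(fetched, skipped)
```

With an empty `nogate` set the same call raises, which matches the failure described earlier in the thread.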

@brson
Contributor Author

brson commented Apr 18, 2016

Hopefully this fixes itself with the nightly that starts in an hour.

@alexcrichton
Member

Hm I think what may have happened here was:

  • The bots were offline at night for 4/15 (for some unknown reason), but the builds were scheduled.
  • Noticing this, I restarted buildbot the next day, 4/16.
  • The builders, taking forever to finish, started a build on 4/16 UTC and finished 4/17 UTC (as midnight UTC is 5pm PST). That is, I restarted the bots in the morning of 4/16 PST (probably ~10am) and ~8 hrs later (the length of the build) it was 6pm 4/16 PST but 4/17 UTC
  • The builders then uploaded the 4/16 nightly to the date of 4/17
  • The next nightly, on 4/17, was then rightfully uploaded to 4/17 as well.
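The date arithmetic in this timeline can be checked with a short sketch. It assumes a fixed UTC-7 offset (taken from "midnight UTC is 5pm"), and it assumes the archive date is read from the UTC clock when the upload happens. The `archive_date_at_upload` helper and the 02:00 UTC start time for the next nightly are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "midnight UTC is 5pm" means the local clock is UTC-7.
PACIFIC = timezone(timedelta(hours=-7))


def archive_date_at_upload(start: datetime, build_length: timedelta) -> str:
    """UTC date a build lands under if the date is taken at upload time."""
    finished_utc = (start + build_length).astimezone(timezone.utc)
    return finished_utc.date().isoformat()


# Delayed build: restarted ~10am local on 4/16, takes ~8 hours.
delayed = archive_date_at_upload(
    datetime(2016, 4, 16, 10, 0, tzinfo=PACIFIC), timedelta(hours=8)
)

# Next regular nightly: hypothetical 02:00 UTC start on 4/17, same build length.
regular = archive_date_at_upload(
    datetime(2016, 4, 17, 2, 0, tzinfo=timezone.utc), timedelta(hours=8)
)

print(delayed, regular)  # both 2016-04-17, so the second upload overwrites the first
```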

So... I guess we could do a few things. One is to actually prevent s3cmd from doing this (boy would that be nice). Another would be to always cancel nightly builds whenever buildbot restarts, assuming that the nightly build only works if triggered and completed normally. Finally we could somehow give as input to the build step the nightly date it's building for (e.g. written out by the "trigger") which hopefully only ever runs once a day. I guess we could even extend that to have the trigger detect when it was last run and if it was < 24 hours it bails.
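The last two ideas could look roughly like the sketch below: the trigger fixes the nightly date once, when it fires, and bails if it already fired less than 24 hours ago. The function name, the way the last run time is passed in, and the use of `RuntimeError` are all assumptions for illustration. A real trigger would need to persist the last run time somewhere and hand the returned date to the build and upload steps.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MIN_INTERVAL = timedelta(hours=24)


def nightly_date_for_trigger(now: datetime, last_run: Optional[datetime]) -> str:
    """Return the archive date for this trigger, or bail if it fired too recently."""
    if last_run is not None and now - last_run < MIN_INTERVAL:
        raise RuntimeError(
            "nightly trigger already ran at " + last_run.isoformat() + "; bailing"
        )
    # The date is decided here, at trigger time, not when the upload happens.
    return now.astimezone(timezone.utc).date().isoformat()


first = datetime(2016, 4, 16, 2, 0, tzinfo=timezone.utc)
print(nightly_date_for_trigger(first, last_run=None))  # 2016-04-16

try:
    # A second trigger 23 hours later is rejected.
    nightly_date_for_trigger(first + timedelta(hours=23), last_run=first)
except RuntimeError as err:
    print(err)
```

Because the date travels with the build, a build that finishes after midnight UTC would still upload to the date it was triggered for.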

Now as for the ARM build, I have no idea what's going on there.

@alexcrichton
Member

We could try to use s3cmd's sync feature with --skip-overwrite, but that doesn't look to be the same as "upload but error if we overwrite". We'd probably have to script it ourselves: list what's already there, see if any of the inputs are in the output, and bail if so.
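A minimal sketch of that "list, compare, bail" check follows. It deliberately leaves out the listing step itself: `existing_keys` is assumed to come from whatever listing command the upload script already has available, and the key layout and file names are made up for the example.

```python
from typing import Iterable, List


def find_collisions(
    existing_keys: Iterable[str], local_files: Iterable[str], prefix: str
) -> List[str]:
    """Return the destination keys that are already present in the archive."""
    existing = set(existing_keys)
    return sorted(prefix + name for name in local_files if prefix + name in existing)


def check_before_upload(
    existing_keys: Iterable[str], local_files: Iterable[str], prefix: str
) -> None:
    """Bail out before uploading if any file would overwrite an archived one."""
    collisions = find_collisions(existing_keys, local_files, prefix)
    if collisions:
        raise SystemExit("refusing to overwrite: " + ", ".join(collisions))


# Hypothetical listing of what is already in the archive for 2016-04-17.
already_there = ["dist/2016-04-17/rustc-nightly-src.tar.gz"]
to_upload = ["rustc-nightly-src.tar.gz", "rustc-nightly-docs.tar.gz"]

print(find_collisions(already_there, to_upload, "dist/2016-04-17/"))
```

The check and the upload are separate steps, so two uploads racing each other could still both pass. It only guards against the sequential case described in this issue.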

@edunham
Member

edunham commented Aug 22, 2016

Help wanted: Add build step to test whether anything exists at the URL it's about to upload to. https://github.com/rust-lang/rust-buildbot/blob/master/master/master.cfg
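One way such a step might test the URL is with an HTTP HEAD request, sketched below using only the Python standard library. The `already_published` name and the choice to treat only a 404 as "nothing there" are assumptions. Depending on how the archive is served, a missing object may be reported with a different status, in which case this sketch raises instead of guessing.

```python
import urllib.error
import urllib.request


def already_published(url, opener=urllib.request.urlopen, timeout=30):
    """Return True if a HEAD request to `url` succeeds, False on HTTP 404."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # any other status is ambiguous, so let the build step fail
```

A build step would call this for each destination URL and stop before uploading if any call returns True. The `opener` parameter exists only so the check can be exercised without network access.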

edunham added the E-help-wanted label Aug 22, 2016
@alexcrichton
Member

No longer a problem now that buildbot is gone!
