
How many requests are being used to resolve packages‽ (429 Too Many Requests) #1229

Closed
CAD97 opened this issue Jul 23, 2023 · 8 comments · Fixed by #1231

@CAD97
Contributor

CAD97 commented Jul 23, 2023

I just got 2500 (... now 4500, just CTRL-C'd it) lines dumped into my terminal reporting that cargo-binstall was receiving HTTP 429 during a cargo install-update -a (while resolving the package wasm-bindgen-cli). It really feels like something has to have gone wrong for it to generate so many 429 responses in a relatively short period of time.

Whatever API is getting hit (GitHub, presumably?), I seem to have gotten my cargo-binstall throttled for the moment.

cargo install-update wasm-pack gave me a single 429 message before I aborted it. cargo install-update wasm-bindgen-cli seems to spit out around 500 at a time -- I'm not completely sure because my terminal emulator can't really keep up with them getting spewed out.

I attempted setting RUST_LOG but didn't get any output other than the large number of INFO about the 429 response.

I guess it's somewhat reasonable that resolution can take a bit to talk to GitHub, but it seems odd at a minimum that package resolution would involve 500+ requests in flight at the same time (and that getting throttled dumps 500+ log messages simultaneously).

Originally reported against cargo-install-update, where it was suggested that providing a GITHUB_TOKEN can potentially speed up resolution (and perhaps raise usage quotas); if this is the case it'd probably be a good idea to mention this (and what permissions are needed) in the README.

Version info:

cargo-binstall: 1.1.2
build-date: 2023-07-21
build-target: x86_64-pc-windows-msvc
build-features: cross_lang_fat_lto,default,fancy_no_backtrace,git,log_release_max_level_debug,rustls,static,trust_dns,zlib_ng,zstd_thin
build-commit-hash: 38d37c69b24972477426c65f7a5adce24b27b20a
build-commit-date: 2023-07-21
rustc-version: 1.71.0
rustc-commit-hash: 8ede3aae28fe6e4d52b38157d7bfe0d3bceef225
rustc-llvm-version: 16.0

If nothing else, this issue can track coalescing concurrent requests onto a shared retry/backoff timer/report, to avoid generating the log spam when running into 429.

@NobodyXu
Member

It really feels like something has to have gone wrong for it to generate so many 429 responses in a relatively short period of time.

Can you try providing GITHUB_TOKEN to cargo-binstall via an environment variable?

NOTE that it tries to scrape a token from your .git-credentials or .config/gh by default, so if you already have one set up, it shouldn't give such a massive number of 429 errors unless your GITHUB_TOKEN has hit its API rate limit, which is unlikely.
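Purely for illustration (a hypothetical sketch, not binstall's actual code), scraping a github.com token out of a .git-credentials file, whose entries look like https://username:TOKEN@github.com, could be done roughly like this:

    // Hypothetical sketch, not binstall's actual code: pull a token for
    // github.com out of a .git-credentials file, whose lines look like
    //     https://username:TOKEN@github.com
    use std::fs;

    fn token_from_git_credentials(path: &str) -> Option<String> {
        let contents = fs::read_to_string(path).ok()?;
        contents
            .lines()
            .filter(|line| line.contains("github.com"))
            .find_map(|line| {
                // Strip the scheme, then split "user:token@host".
                let rest = line.strip_prefix("https://")?;
                let (creds, _host) = rest.split_once('@')?;
                let (_user, token) = creds.split_once(':')?;
                Some(token.to_string())
            })
    }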

Whatever API is getting hit (GitHub, presumably?), I seem to have gotten my cargo-binstall throttled for the moment.

Yes, it's GitHub, since wasm-bindgen-cli's source code is hosted at https://github.com/rustwasm/wasm-bindgen

I attempted setting RUST_LOG but didn't get any output other than the large number of INFO about the 429 response.

cargo-binstall doesn't accept RUST_LOG; you can set the log level using --log-level, or via the environment variable documented for --log-level in "cargo binstall --help".

I guess it's somewhat reasonable that resolution can take a bit to talk to GitHub, but it seems odd at a minimum that package resolution would involve 500+ requests in flight at the same time (and that getting throttled dumps 500+ log messages simultaneously).

500+ requests does seem strange, because from memory binstall will fire at most 25-30 requests for a single crate.

I will double check this.

where it was suggested that providing a GITHUB_TOKEN can potentially speed up resolution (and perhaps raise usage quotas); if this is the case it'd probably be a good idea to mention this (and what permissions are needed) in the README.

Yes, that's right. I will open a PR to update the README.

If nothing else, this issue can track coalescing concurrent requests onto a shared retry/backoff timer/report, to avoid generating the log spam when running into 429.

Well, we already use a single backoff timer, but I guess we log too much?

Would also have to double check this one.

@CAD97
Contributor Author

CAD97 commented Jul 23, 2023

I use SSH to authenticate the git CLI with GitHub, so I don't have an ambient GITHUB_TOKEN available; cargo-binstall is probably using whatever the default fallback path is.

I've got to log off for right now (it's just after midnight here and I have something in the morning), but I'll check and see how it behaves sometime tomorrow, along with --log-level. I'll probably be dethrottled by then, though; I guess we'll find out. (I can always just try reinstalling it a few times in a row to hit the quota, I suppose.)

(I don't know why I didn't check --help instead of just trying RUST_LOG, tbh; probably because I was initially hitting this behind cargo-install-update.)

The 500 number comes from seeing the log lines stop being printed to the terminal for an instant, hitting CTRL-C, then counting the lines generated. It took my terminal roughly a minute to print all those lines actually, I think, and a few more logs had been generated by the time I actually stopped the process. If it's supposed to cap around 30 for a single crate, though, something's gone wrong, because I unambiguously had over 100 log entries for a 429 result and 60s backoff in a span of less than 60s.

Random guess, not having looked: make sure the requests are still holding the semaphore tokens (or whatever equivalent) while waiting on the backoff timer, so you don't end up just spinning up more requests instead of actually backing off. Not that I particularly think that's what's happening, just that it's a bug I've written before 😅
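To illustrate the shape I mean, here is a hypothetical tokio/reqwest sketch (not binstall's actual code) where the permit stays alive across the backoff sleep:

    use std::time::Duration;
    use tokio::sync::Semaphore;

    // Hypothetical sketch: the concurrency permit is held for the whole
    // request *including* the backoff sleep, so backing off doesn't just
    // free a slot for yet another request to start.
    async fn fetch_with_backoff(sem: &Semaphore, url: &str) -> Result<(), reqwest::Error> {
        let _permit = sem.acquire().await.expect("semaphore closed");
        loop {
            let resp = reqwest::get(url).await?;
            if resp.status() != reqwest::StatusCode::TOO_MANY_REQUESTS {
                return Ok(());
            }
            // Sleep while still holding `_permit`.
            tokio::time::sleep(Duration::from_secs(60)).await;
        }
    }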

Well, we already use a single backoff timer, but I guess we log too much?

Yeah, I'd suggest a rough flow: when receiving a 429, log the retry iff the retry timer isn't already waiting, then join the wait, resetting the timer (but not incrementing the backoff except for the first hit).
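Roughly, as a hypothetical sketch (assuming a tokio-style runtime; not a patch against binstall's real retry code):

    use std::sync::Mutex;
    use tokio::time::{sleep_until, Instant};

    // Hypothetical sketch: one shared deadline for all in-flight requests.
    // Only the task that sets the deadline logs; later 429s extend it if
    // needed and then wait silently.
    struct SharedBackoff {
        until: Mutex<Option<Instant>>,
    }

    impl SharedBackoff {
        async fn hit_429(&self, retry_at: Instant) {
            let should_log = {
                let mut until = self.until.lock().unwrap();
                match *until {
                    // No backoff in progress: record the deadline, log once.
                    None => {
                        *until = Some(retry_at);
                        true
                    }
                    // Backoff already in progress: extend quietly if needed.
                    Some(current) => {
                        if retry_at > current {
                            *until = Some(retry_at);
                        }
                        false
                    }
                }
            };
            if should_log {
                eprintln!("hit 429, holding off all requests until {retry_at:?}");
            }
            sleep_until(retry_at).await;
            // Clear the deadline once it has passed so the next 429 logs again.
            let mut until = self.until.lock().unwrap();
            if matches!(*until, Some(d) if d <= Instant::now()) {
                *until = None;
            }
        }
    }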

@NobodyXu
Member

I use SSH to authenticate the git CLI with GitHub, so I don't have an ambient GITHUB_TOKEN available; cargo-binstall is probably using whatever the default fallback path is.

You can set GITHUB_TOKEN via an environment variable, or set it in your .bashrc/.zshrc.

I'll probably be dethrottled by then, though; I guess we'll find out.

Yeah, GitHub applies its rate limit on a per-hour basis.

(I can always just try reinstalling it a few times in a row to hit the quota, I suppose.)

You can simply wait for binstall to sort it out itself; when it encounters a 429 it will simply wait and retry.

The 500 number comes from seeing the log lines stop being printed to the terminal for an instant, hitting CTRL-C, then counting the lines generated. It took my terminal roughly a minute to print all those lines actually, I think, and a few more logs had been generated by the time I actually stopped the process.

That's definitely something we can improve.

If it's supposed to cap around 30 for a single crate, though, something's gone wrong, because I unambiguously had over 100 log entries for a 429 result and 60s backoff in a span of less than 60s.

Yeah, it could be that:

  • one HTTP request hits 429
  • all HTTP requests are held off until the time point specified in the 429 response header, each logging a message
  • once the timer expires, all HTTP requests can retry at a rate of 1 request per 10ms (you can override this with --rate-limit; see the sketch after this list)
  • the limit is hit again, producing more logging
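
For illustration, the pacing step could look roughly like this (a hypothetical sketch, not the actual --rate-limit implementation):

    use std::time::Duration;
    use tokio::sync::{mpsc, oneshot};
    use tokio::time::interval;

    // Hypothetical sketch of the 1-request-per-10ms pacing: a single ticker
    // task hands out "go" signals, so after a backoff the queued requests
    // trickle out instead of all firing at once.
    async fn run_pacer(mut waiters: mpsc::Receiver<oneshot::Sender<()>>) {
        let mut tick = interval(Duration::from_millis(10));
        while let Some(waiter) = waiters.recv().await {
            tick.tick().await;       // wait for the next 10ms slot
            let _ = waiter.send(()); // release exactly one pending request
        }
    }

Each request task would send a oneshot::Sender into the channel and await the matching receiver before issuing its HTTP call.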

Yeah, I'd suggest a rough flow: when receiving a 429, log the retry iff the retry timer isn't already waiting, then join the wait, resetting the timer (but not incrementing the backoff except for the first hit).

The 429 response header gives binstall an absolute time point that we should hold off until.

It doesn't give us an interval, though.
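
Concretely, GitHub's primary rate-limit responses put the reset point in the x-ratelimit-reset header as a Unix timestamp in seconds, so the wait has to be computed against the current clock. A minimal sketch (not binstall's actual code):

    use std::time::{Duration, SystemTime, UNIX_EPOCH};

    // Hypothetical sketch: `x-ratelimit-reset` is an absolute Unix timestamp
    // (in seconds), so the wait duration has to be derived from the current
    // clock; if the reset point is already in the past, don't wait at all.
    fn wait_until_reset(reset_header: &str) -> Duration {
        let reset_epoch: u64 = reset_header.trim().parse().unwrap_or(0);
        let now = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .unwrap_or(Duration::ZERO)
            .as_secs();
        Duration::from_secs(reset_epoch.saturating_sub(now))
    }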

@halostatue

I’m also seeing unconstrained 429s. Like @CAD97, I first saw this via cargo install-update -a, but I can reproduce it with cargo-binstall directly. Using cargo-binstall 1.1.2.

I’ve attached the terminal output of

$ GITHUB_TOKEN=[REDACTED] cargo-binstall --log-level debug  --force cargo-info

cargo-binstall.log

NobodyXu added a commit that referenced this issue Jul 24, 2023
Fixed #1229

Signed-off-by: Jiahao XU <[email protected]>
@0atman

0atman commented Jul 31, 2023

Can confirm this on a fresh reinstall today of Arch Linux, rustup, and binstall; this bug seems to make binstall unusable:

deck@steamdeck ~> cargo binstall exa
 INFO resolve: Resolving package: 'exa'
 WARN Your GitHub API token (if any) has reached its rate limit and cannot be used again until Instant { tv_sec: 1690802118, tv_nsec: 143434284 }, so we will fallback to HEAD/GET on the url.
 WARN If you did not supply a github token, consider doing so: GitHub limits unauthorized users to 60 requests per hour per origin IP address.
 WARN Your GitHub API token (if any) has reached its rate limit and cannot be used again until Instant { tv_sec: 1690802118, tv_nsec: 143434284 }, so we will fallback to HEAD/GET on the url.
 WARN If you did not supply a github token, consider doing so: GitHub limits unauthorized users to 60 requests per hour per origin IP address.
^C WARN Installation cancelled

The rate limiting warnings appear SO fast that it was hard to not exceed my terminal's history!

Thanks so much for your work here!

@NobodyXu
Member

NobodyXu commented Jul 31, 2023

@0atman Can you please test PR #1231 to see if it fixed the bug?

@0atman

0atman commented Jul 31, 2023

@NobodyXu I'll test now

@0atman

0atman commented Jul 31, 2023

Sadly not; I've given an update on #1231.

NobodyXu added further commits that referenced this issue between Aug 1 and Aug 7, 2023
Fixed #1229
github-merge-queue bot pushed a commit that referenced this issue Aug 8, 2023