Too many open files #260

Open
KevinWuWon opened this issue Feb 8, 2022 · 8 comments
Labels
bug Something isn't working no-planned-fix I don't plan to work on this. Feel free to ask if there's an update, or try fixing it yourself.

Comments

@KevinWuWon

Description of the bug

I got the following while trying to do a move. Tried it a few times and got the same error again. Ended up using git rebase instead.

❯ g move -s 3a1b4f32 -d green
The application panicked (crashed).
Message:  A fatal error occurred:
   0: Git error GenericError: could not open '/Users/kevinwuwon/work/.git/objects/ec/e53826228db6bfdbe45285a18fad18f8d48c99': Too many open files

Location:
   src/git/repo.rs:45

  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ SPANTRACE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

   0: branchless::git::repo::find_tree with self=<Git repository at: "/Users/kevinwuwon/work/.git/"> oid=NonZeroOid(ece53826228db6bfdbe45285a18fad18f8d48c99)
      at src/git/repo.rs:1083
   1: branchless::git::tree::get_changed_paths_between_trees with repo=<Git repository at: "/Users/kevinwuwon/work/.git/"> lhs=Some(Tree { id: f356c54deafa9098cd7df4b8022a589e13824b73 }) rhs=Some(Tree { id: 0dd521a0fdf599f49516098abe7fa40b60071d69 })
      at src/git/tree.rs:270
   2: branchless::git::repo::get_paths_touched_by_commit with self=<Git repository at: "/Users/kevinwuwon/work/.git/"> commit=Commit { inner: Commit { id: a17ae10bd61939cffe5f7403dec915af6597260e, summary: "[PAY-2762] Separate CHARGEBACK from REFUNDED status in PurchaseHistory (#240240)" } }
      at src/git/repo.rs:590

Backtrace omitted.
Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.
Location: src/commands/mod.rs:262

Backtrace omitted.
Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.

Expected behavior

No response

Actual behavior

No response

Version of git-branchless

02bc9b5

Version of git

git version 2.34.0

Version of rustc

rustc 1.56.1 (59eed8a2a 2021-11-01)

@KevinWuWon KevinWuWon added the bug Something isn't working label Feb 8, 2022
@arxanas
Owner

arxanas commented Feb 8, 2022

Thanks for the report @KevinWuWon. I haven't seen an issue like this before.

  • What OS are you using?
  • What's the maximum number of open files for your system? (You might be able to find this information with ulimit.)
  • How many commits are rooted at 3a1b4f32/how many commits are you moving?
  • How big is the diff for the commit(s) which you're applying?
  • Is there anything interesting about the properties of the moved commit(s)?
    • Are there any directories which have very many files in them? Even if few of those files have changed, we may allocate TreeEntrys for all of them. We might expect a problem at 32k-64k files in a directory (according to your OS's maximum file limit).
    • Are any changed paths very deep into the directory hierarchy?
  • Does your repository have a lot of commits/files? (Say, more than 500k commits or files?) I've stress-tested on various commits for https://github.com/mozilla/gecko-dev, but it's certainly possible that I didn't test a commit with a big enough diff.

This comment rust-lang/git2-rs#626 (comment) suggests that resources aren't freed until the owning Repository is freed, which could be a problem for this use-case. In get_changed_paths_between_trees_internal, we do a dual depth-first search of the two trees, and I expected the allocated git2::TreeEntrys to be freed after returning from each function call, but that might not be the case.

@KevinWuWon
Author

KevinWuWon commented Feb 8, 2022

What OS are you using?

macOS 11.6.2

What's the maximum number of open files for your system? (You might be able to find this information with ulimit.)

ulimit says unlimited

How big is the diff for the commit(s) which you're applying?

The git diff of the 3 commits I'm moving is 483 lines with 4 changed files.

Does your repository have a lot of commits/files?

200k commits, 385k files

Are there any directories which have very many files in them?

No, there are 110 files under the directory it touches.

Are any changed paths very deep into the directory hierarchy?

No, 6 levels deep.

I restarted my computer and the problem went away so I suspect it's a resource leak. The computer had been on for a few weeks.

@KevinWuWon
Author

KevinWuWon commented Feb 9, 2022

I'm still getting "Too many open files" even after a system restart. This time on git amend:

❯ g amend
The application panicked (crashed).
Message:  A fatal error occurred:
   0: could not open '/Users/kevinwuwon/work/.git/objects/92/bfc18cfa594ae75d0839df956707bdab5fd6a2': Too many open files; class=Os (2)

Location:
   /rustc/59eed8a2aac0230a8b53e89d4e99d55912ba6b35/library/core/src/result.rs:1915

Backtrace omitted.
Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.
Location: src/commands/mod.rs:262

Backtrace omitted.
Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.

This file is not particularly big, nor is it in a directory with many files. git commit -a --amend --no-edit worked fine as an alternative.

@arxanas
Owner

arxanas commented Feb 9, 2022

How many branches and references do you have in your repository? You can try running this and report the results:

$ find .git/refs | wc -l
$ find .git/refs/heads | wc -l
$ find .git/refs/remotes | wc -l
$ find .git/refs/branchless | wc -l

After that, can you try running git branchless gc? That just cleans up dangling references under .git/refs/branchless/. Maybe we're somehow holding onto the objects pointed to by those references.

I don't have a hypothesis as to what's opening all the files. I think it's 50/50 whether the commit-rewrite process itself opens too many files at once, or some earlier operation opens too many files and holds onto them.

@KevinWuWon
Author

❯ find .git/refs | wc -l
    2881

❯  find .git/refs/heads | wc -l
      22

❯ find .git/refs/remotes | wc -l
     445

❯ find .git/refs/branchless | wc -l
    2408

Running git branchless gc didn't help.

But someone suggested I type ulimit -n 10240 (even though ulimit returns "unlimited") and that remedied it successfully.

@arxanas
Owner

arxanas commented Feb 11, 2022

I think plain ulimit, with no flag, reports the file-size limit (-f) rather than the open-files limit, which would explain the "unlimited". The open-files limit is a separate value: ulimit -n indicates that the soft limit on my machine (macOS 11.6.3) is 65535. I don't really have a good idea of where we can avoid opening as many files, and in any case, 1024 seems a little low.
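
The soft and hard limits on open file descriptors can be checked separately. A minimal sketch, assuming a bash- or zsh-like shell whose ulimit builtin accepts -S, -H and -n (the function name show_nofile_limits is made up for illustration):

```shell
# Print the soft and hard limits on open file descriptors for this shell.
# The soft limit is the one a process actually runs into; an unprivileged
# user can raise it, but only up to the hard limit.
show_nofile_limits() {
  printf 'soft: %s\n' "$(ulimit -Sn)"
  printf 'hard: %s\n' "$(ulimit -Hn)"
}

show_nofile_limits
```

A change such as ulimit -n 10240 applies only to the current shell and the processes it starts, so it has to be repeated (or added to a shell startup file) for new sessions.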

@martinvonz
Collaborator

Did you figure out which files are opened? Is it refs or loose objects or something else? I suppose strace could help you at least figure that part out.
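
strace is a Linux tool, so on macOS something else is needed. One alternative is lsof, which ships with macOS and lists the files a running process has open. A rough sketch, assuming lsof is installed and the process is still alive when it is inspected (a command that crashes quickly is hard to catch this way); open_files_of is a hypothetical helper name:

```shell
# open_files_of PID: list the files that process PID currently has open.
# Assumes lsof is available; prints nothing if it is not installed.
open_files_of() {
  command -v lsof >/dev/null 2>&1 || return 0
  lsof -p "$1" 2>/dev/null || true
}

# Example: files held open by the current shell.
open_files_of "$$"
```

Filtering the output for paths under .git/objects or .git/refs would show whether loose objects or refs account for most of the open descriptors.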

@martinvonz
Collaborator

Oh, regular git gc should help if the problem is with either refs or objects -- the command packs both refs and objects.
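
To see whether git gc changes anything, the loose objects and loose refs can be counted before and after. A small sketch, assuming a non-bare repository whose .git is a directory; the helper names loose_object_count and loose_ref_count are made up for illustration:

```shell
# loose_object_count REPO: number of loose (unpacked) objects in REPO,
# taken from the "count:" line of git count-objects -v.
loose_object_count() {
  git -C "$1" count-objects -v | awk '/^count:/ { print $2 }'
}

# loose_ref_count REPO: number of loose ref files under REPO/.git/refs.
# -type f skips directories, which a bare "find | wc -l" also counts.
loose_ref_count() {
  find "$1/.git/refs" -type f | wc -l | tr -d ' '
}
```

Running loose_object_count . and loose_ref_count . in the repository, then git gc, then both helpers again should show both numbers dropping if packing took effect.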

@arxanas arxanas added the no-planned-fix I don't plan to work on this. Feel free to ask if there's an update, or try fixing it yourself. label Dec 7, 2022
3 participants