Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Downloader: Sparse checkout is (no longer) working #8195

Closed
wkl3nk opened this issue Jan 30, 2024 · 6 comments · Fixed by #8199
Closed

Downloader: Sparse checkout is (no longer) working #8195

wkl3nk opened this issue Jan 30, 2024 · 6 comments · Fixed by #8199
Labels
bug Issues that are considered to be bugs downloader About the downloader tool

Comments

@wkl3nk
Copy link
Contributor

wkl3nk commented Jan 30, 2024

Possibly some code changes in #8170 led to the effect that git sparse checkouts are no longer working.

To reproduce:
Let's say, from a gradle managed repository with many modules, I only want to check out an Angular web application subdirectory

ort --debug download --project-url=https://github.com/wkl3nk/fun-with-notes.git --vcs-path=frontend-angular16 --vcs-revision=main -o .

Looking into .git/info/sparse-checkout it looks good, everything is prepared, also .git/config is set to sparseCheckout = true
But the final step, a "git.reset() ..." in the code has no effect.

If I manually do a git reset --hard FETCH_HEAD in the target directory, then everything looks good.
As I said, this line here in the code of Git.kt just seems to have no effect.

git.reset().setMode(ResetCommand.ResetType.HARD).setRef(resolvedRevision).call()

The OSS Review Toolkit, version 15.0.0-051.sha.64fd9db.
Running 'download' under Java 17.0.8.1 on Mac OS X

@sschuberth sschuberth added bug Issues that are considered to be bugs downloader About the downloader tool labels Jan 30, 2024
@sschuberth
Copy link
Member

sschuberth commented Jan 30, 2024

JGit does not support sparse checkouts currently, that's a known limitation:

// TODO: Migrate this to JGit once sparse checkout (https://bugs.eclipse.org/bugs/show_bug.cgi?id=383772) is
// implemented.
run("checkout", revision, workingDir = workingTree.workingDir)

I'm not sure what we can do about it without breaking #7725 again other than contributing sparse checkout support to upstream JGit...

@mnonnenmacher
Copy link
Member

mnonnenmacher commented Jan 30, 2024

JGit does not support sparse checkouts currently

I forgot about that part, that's unfortunate. I have put this on the agenda for the next community meeting so that we can discuss the options:

  • Remove support for sparse checkout completely. That would be bad when you need to scan only part of a large repo.
  • Revert the change and find another solution for Analyzer: Git downloader does not use latest commit when revision is not fixed #7725.
  • Go back to Git CLI. What benefits does JGit bring as long as we still depend on the CLI as well?
  • Provide two alternative Git implementations, one using JGit and one using the CLI. The former for users who do not want to depend on the CLI, the latter for users who need sparse checkout.
  • Contribute sparse-checkout support to JGit. What is the effort and are the maintainers even interested in the feature?

@sschuberth
Copy link
Member

  • Go back to Git CLI. What benefits does JGit bring as long as we still depend on the CLI as well?

That's a valid question. My initial motivation to replace Git CLI with JGit was to reduce dependencies on external tools and make ORT as self-contained as possible. However, we've been struggling to get rid of Git CLI completely for a long time, mostly due to limitations in JGit.

Also, I had to accept that when trying to target Kotlin Multiplatform, it would actually be easier to call Git CLI on all platforms rather than finding native Git implementations on all platforms.

  • Provide two alternative Git implementations

I would not want to maintain those two implementations. We already are in the state of using Git CLI and JGit, but that should have been only a temporary solution.

  • Contribute sparse-checkout support to JGit. What is the effort and are the maintainers even interested in the feature?

That would be very nice from a community perspective. I believe the maintains would happily accept such a feature, but they're not actively looking into it, mostly because JGIt seems to be more and more about the Git server side (as it's used in Gerrit) than the Git client side.

@sschuberth
Copy link
Member

I'm actually a bit surprised that the code path with

git.reset().setMode(ResetCommand.ResetType.HARD).setRef(resolvedRevision).call()

is even triggered for the given example. IIUC this code should only be necessary when updating a previously checked out non-fixed revision. That is, it should not be necessary when doing an initial sparse checkout in a freshly initialized working tree. So, avoid to call this on the first "update" call which actually populates the working tree initially should also fix the problem, or?

@sschuberth
Copy link
Member

But the final step, a "git.reset() ..." in the code has no effect.

Debugging this actually shows that git.reset()... does have an effect, but an unwanted one: It turn the sparse checkout that's correctly done by run("checkout", ...) into a full checkout. A simple work-around is to only reset if in non-sparse mode. That should be rather easy to add and cover most of the cases. I'll make a proposal.

sschuberth added a commit that referenced this issue Jan 31, 2024
JGit still does not support sparse checkouts [1], and its `reset` call
turns the sparse checkout done by `git checkout` into a full checkout.

So again use the Git CLI to perform the actual reset like done before
8472931, but continue to determine the resolved revision to reset to via
JGit.

Fixes #8195.

[1]: https://bugs.eclipse.org/bugs/show_bug.cgi?id=383772

Signed-off-by: Sebastian Schuberth <[email protected]>
@sschuberth
Copy link
Member

A simple work-around is to only reset if in non-sparse mode. That should be rather easy to add and cover most of the cases. I'll make a proposal.

I've found a better very simple solution, see #8199.

sschuberth added a commit that referenced this issue Jan 31, 2024
JGit still does not support sparse checkouts [1], and its `reset` call
turns the sparse checkout done by `git checkout` into a full checkout.

So again use the Git CLI to perform the actual reset like done before
8472931, but continue to determine the resolved revision to reset to via
JGit.

Fixes #8195.

[1]: https://bugs.eclipse.org/bugs/show_bug.cgi?id=383772

Signed-off-by: Sebastian Schuberth <[email protected]>
sschuberth added a commit that referenced this issue Jan 31, 2024
JGit still does not support sparse checkouts [1], and its `reset` call
turns the sparse checkout done by `git checkout` into a full checkout.

So again use the Git CLI to perform the actual reset like done before
8472931, but continue to determine the resolved revision to reset to via
JGit.

Fixes #8195.

[1]: https://bugs.eclipse.org/bugs/show_bug.cgi?id=383772

Signed-off-by: Sebastian Schuberth <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Issues that are considered to be bugs downloader About the downloader tool
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants