[JENKINS-49757] Remove redundant fetch #904

rishabhBudhouliya · 2020-06-10T09:52:02Z

Pull Request-845

The concern with removing the second fetch call is a possible loss of repository data or
a misconfiguration of the extra behaviors applied during checkout.

If the second fetch call is skipped, the CleanBeforeCheckout option is also skipped,
because it doesn't implement decorateCloneCommand, which is used by the git fetch call.
To ensure that it is not skipped, decorateCloneCommand has been implemented for this
particular extension. The unit test GitSCMTest#testCleanBeforeCheckout no longer fails
after the removal of the second fetch call.
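
To illustrate why an extension can be silently skipped, here is a minimal, self-contained sketch. It does not use the real git plugin API: the `Extension` base class, the simplified hook signatures, and the `checkout` helper are all hypothetical stand-ins, and the "log" is just a list of strings. It only shows that a behavior wired to the fetch hook disappears when the second fetch is not made, unless the clone hook requests it too.

```java
import java.util.ArrayList;
import java.util.List;

class CloneHookSketch {
    /** Hypothetical extension base class: every hook is a no-op unless overridden. */
    static class Extension {
        void decorateCloneCommand(List<String> log) { }
        void decorateFetchCommand(List<String> log) { }
    }

    /** Models the situation before the change: cleaning is wired to the fetch hook only. */
    static class CleanOnFetchOnly extends Extension {
        @Override
        void decorateFetchCommand(List<String> log) {
            log.add("clean");
        }
    }

    /** Models the situation after the change: the clone hook requests a clean as well. */
    static class CleanOnCloneToo extends CleanOnFetchOnly {
        @Override
        void decorateCloneCommand(List<String> log) {
            log.add("clean");
        }
    }

    /** Simulates a checkout; the fetch hook runs only when the second fetch is made. */
    static List<String> checkout(Extension ext, boolean secondFetch) {
        List<String> log = new ArrayList<>();
        ext.decorateCloneCommand(log);
        log.add("clone");
        if (secondFetch) {
            ext.decorateFetchCommand(log);
            log.add("fetch");
        }
        return log;
    }

    public static void main(String[] args) {
        // Without the second fetch, only the variant that overrides the clone hook cleans.
        System.out.println(checkout(new CleanOnFetchOnly(), false));
        System.out.println(checkout(new CleanOnCloneToo(), false));
    }
}
```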

Checklist

I have read the CONTRIBUTING doc
I have referenced the Jira issue related to my changes in one or more commit messages
I have added tests that verify my changes
Unit tests pass locally with my changes
I have added documentation as necessary
No Javadoc warnings were introduced with my changes
No spotbugs warnings were introduced with my changes
I have interactively tested my changes
Any dependent changes have been merged and published in upstream modules (like git-client-plugin)

Types of changes

Breaking change (fix or feature that would cause existing functionality to not work as expected)

rishabhBudhouliya · 2020-06-10T10:20:13Z

@MarkEWaite @fcojfernandez I created a separate PR for this as I think this addition will require separate discussion.

Also, I was thinking of creating separate branches cut from the fix branch (JENKINS-49757) which will represent the changes related to fix in order to achieve a non-breaking solution.

rishabhBudhouliya · 2020-06-10T10:32:03Z

Also, I have to add unit tests confirming the before and after behavior because of this change.

rishabhBudhouliya · 2020-06-11T05:40:33Z

The build is failing on a Windows instance while it passes on my local machine and on the Linux instances of ci.jenkins.io.

Currently, I can't figure out why 4 test cases are failing. Any pointers on this would greatly help me.

MarkEWaite · 2020-06-13T15:14:43Z

@rishabhBudhouliya thanks for your work on this one!

While staring at the code in the debugger, I realized that the problem is in the existing testCleanBeforeCheckout, not in your implementation of redundant fetch removal.

The test is incorrectly asserting that there should be a clean on the first build. A clean on the first build is not needed because the first build is assured of an empty workspace. The git plugin even has code which will delete files in the workspace before the first build.

The test should assert that there should be a clean on the second build.

The test failed on Windows for a particularly disconcerting reason. The CleanBeforeCheckout extension was being called before the first clone. Before the first clone, there is no git repository in the workspace. The call to git clean -xffd . was being made on a directory which did not contain a git repository. The build method was swallowing and hiding that exception and allowing the job to continue.

It was even more complicated because the tests run in the git-plugin/target directory. That means there is a git repository in one of the parent directories, and command line git happily searches up the parent directories to find that git repository.

We may want to consider setting GIT_CEILING_DIRECTORIES in our tests to avoid this type of surprise in the future.
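
As a rough illustration of that idea, the sketch below prepares a command line git invocation with `GIT_CEILING_DIRECTORIES` set to the parent of the working directory, so git would stop searching for a repository before it reaches any enclosing checkout. The helper name `gitIn` and the `target/workspace` path are assumptions made for the example; this is not how the plugin tests currently launch git. The process is only configured here, not started.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CeilingDirectoriesSketch {
    /** Prepares "git <args>" in workDir without letting git search above workDir's parent. */
    static ProcessBuilder gitIn(File workDir, String... gitArgs) {
        List<String> command = new ArrayList<>();
        command.add("git");
        command.addAll(Arrays.asList(gitArgs));
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.directory(workDir);
        // The variable takes absolute paths; git will not walk up into these directories.
        File parent = workDir.getAbsoluteFile().getParentFile();
        if (parent != null) {
            pb.environment().put("GIT_CEILING_DIRECTORIES", parent.getPath());
        }
        return pb;
    }

    public static void main(String[] args) {
        // Hypothetical workspace path; the command is printed, not executed.
        ProcessBuilder pb = gitIn(new File("target/workspace"), "clean", "-xffd", ".");
        System.out.println(pb.command() + " with GIT_CEILING_DIRECTORIES="
                + pb.environment().get("GIT_CEILING_DIRECTORIES"));
    }
}
```

With the ceiling in place, a `git clean` run in a workspace that has no repository of its own should fail to find one instead of operating on a parent checkout.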

See my proposed change in https://github.com/MarkEWaite/git-plugin/commits/CleanBeforeCheckout

rishabhBudhouliya · 2020-06-13T18:10:18Z

I'm surprised at how I totally missed the very behavior of this extension!

There's still one thing I don't understand. Why is this happening to Windows alone? If git clean -xffd is being called before the first clone, it is bound to create an error on any platform?

I understand that it searches the parent directories for a git repository and finds one. Does it clean the working tree of that repository? If it does, why not on Windows?

MarkEWaite · 2020-06-13T18:52:08Z

> There's still one thing I don't understand. Why is this happening to Windows alone? If git clean -xffd is being called before the first clone, it is bound to create an error on any platform?

The `git clean -xffd .` fails on Windows because it is not allowed to remove a busy directory. Linux and macOS allow removing a busy directory. If we had been running the tests in a directory like `/tmp` where there is no parent directory with a git repository, then the `git clean -xffd .` would have failed differently. The failure would still have been silently hidden by the `build()` step in the test, but it would have been a different failure mode.

rishabhBudhouliya · 2020-06-15T04:07:11Z

Oh, I see. Adding GIT_CEILING_DIRECTORIES should be the right thing to do. I wonder if we would need additional tests to confirm this behavior.

Would you be willing to merge this commit to master, or would you like me to add it to this PR? Either works fine for me.

MarkEWaite · 2020-06-15T04:11:53Z

This PR still needs the fixes that I proposed in https://github.com/MarkEWaite/git-plugin/commits/CleanBeforeCheckout . Even with those fixes, I'm not ready to merge this until after git client plugin 3.3.0 has been released with JGit 5.8.0 and the other changes that are in the master branch.

I hope to release git client plugin 3.3.0 within the next week or 10 days so that it is available near the release of Jenkins 2.235.1 LTS.

The reason to add a "clean" before checkout behaviour is to clean the working directory of a repository before checking out a branch. In this case, the remote repository is being cloned for the first time which means there is no local git repository.

First build is known to have a clean workspace so there is no need to clean it before checkout. Second build should clean because the workspace might be cluttered with extra files from the first build. Also removes several useless operations in the test (related to buildLog)

rishabhBudhouliya · 2020-06-24T09:36:39Z

@MarkEWaite @fcojfernandez As discussed on the Gitter channel, I have added a check on refspec and CloneOption to ensure that a user-provided refspec is not missed by the new fix.

I have also interactively tested the new commit. I will share my findings during the GSoC weekly sync.

fcojfernandez · 2020-06-24T11:06:49Z

src/main/java/hudson/plugins/git/GitSCM.java

+            for (RefSpec ref:initialFetchRefSpecs) {
+                if (!ref.toString().contains("refs/heads")) {
+                    isDefaultRefspec = false; // if refspec is not of default type, preserve second fetch
+                }
+            }


If even one of the elements does not contain refs/heads, then you mark the refspec as not being a default refspec. Could you explain your intention here when there is more than one element? Sorry, but it's a bit unclear to me.

If honor refspec for initial clone is disabled, the first clone is going to fetch all the branches and the references related to them. All of those refspecs contain "refs/heads/xx" in the `<src>:<dst>` format of a refspec.

My assumption is that if the given refspec contains the "refs/heads" pattern, there is no need to call the second fetch in any case.

In the case of a narrow refspec which contains any references other than branches, we will allow the second fetch call to take place, i.e., we will not skip the second fetch call.
For example, if honorRefSpec == false and refspec = "+refs/heads/master:refs/remotes/origin/master +refs/pull/553/head:refs/remotes/origin/pull/553", we can safely assume that this is not the default refspec because we have a "refs/pull" here.
When there are multiple refspecs, if any refspec does not follow the default pattern, the code assumes that it is worth fetching with the second fetch call because it may have been missed by the first/initial clone.
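
The rule described above can be sketched in isolation as follows. This is a simplified stand-in that works on plain strings rather than JGit `RefSpec` objects, and the class and method names are made up for the example; it mirrors the loop in the diff, where a single refspec without "refs/heads" is enough to mark the whole set as non-default.

```java
import java.util.Arrays;
import java.util.List;

class DefaultRefspecSketch {
    /** Returns true only if every refspec mentions refs/heads, i.e. only branches are requested. */
    static boolean isDefaultRefspec(List<String> refspecs) {
        for (String ref : refspecs) {
            if (!ref.contains("refs/heads")) {
                // A non-branch ref may have been missed by the initial clone.
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> branchOnly = Arrays.asList(
                "+refs/heads/master:refs/remotes/origin/master");
        List<String> withPullRef = Arrays.asList(
                "+refs/heads/master:refs/remotes/origin/master",
                "+refs/pull/553/head:refs/remotes/origin/pull/553");
        System.out.println("branch only is default: " + isDefaultRefspec(branchOnly));
        System.out.println("with pull ref is default: " + isDefaultRefspec(withPullRef));
    }
}
```

Note that an empty list counts as default in this sketch, which is an assumption of the example rather than something confirmed by the diff.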

fcojfernandez

Thanks for the example! Now I understand your code well :)

fcojfernandez · 2020-06-26T07:34:28Z

src/main/java/hudson/plugins/git/GitSCM.java

+            }
+        }
+        // if initial fetch refspec contains "refs/heads/*" (default refspec), ignore the second fetch call
+        return removeSecondFetch;


if initialFetchRefSpecs is null, then you're removing the second fetch. At this moment, I cannot think about a situation when it will be null, but ...

I added a test that passed in a null refspec but the UserRemoteConfig constructor converts nulls to empty strings. I think that is unreachable code due to the UserRemoteConfig use of fixEmpty().

According to my assumptions, the null check was needed to handle empty refspecs. As Mark has correctly pointed out, fixEmpty() converts any empty refspec into a null refspec.

If UserRemoteConfig were not fixing empty refspecs, the logic I presented would treat an empty refspec as a narrow refspec and would not avoid the second fetch. But since the fix is there, it should not be a concern for us.

fcojfernandez · 2020-06-26T07:44:26Z

src/main/java/hudson/plugins/git/GitSCM.java

+                if (!option.isHonorRefspec()) {
+                    removeSecondFetch = isDefaultRefspec;
+                } else {
+                    removeSecondFetch = true; // avoid second fetch call if honor refspec is enabled
+                }


This if clause didn't help me see more clearly what you meant. While having a "!" in an if clause is perfectly fine, the "!" in an if-else might cause confusion. I would have expected

if something is true sentence1 else sentence2

while this piece of code is

if something is not true sentence2 else sentence1

My personal preference is the first approach, for readability. But it's just a matter of personal taste, so take this advice as it is, just advice.

This is a great suggestion @fcojfernandez. I have added this in c2e097d.
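
For readers following along, the suggestion amounts to testing the positive condition first. The sketch below is a standalone illustration with hypothetical names, not the code from c2e097d; it only shows that the reordered branch makes the same decision as the original one for all four input combinations.

```java
class HonorRefspecBranchSketch {
    /** Original shape from the diff: negated condition first. */
    static boolean negatedFirst(boolean honorRefspec, boolean isDefaultRefspec) {
        boolean removeSecondFetch;
        if (!honorRefspec) {
            removeSecondFetch = isDefaultRefspec;
        } else {
            removeSecondFetch = true; // avoid second fetch call if honor refspec is enabled
        }
        return removeSecondFetch;
    }

    /** Suggested shape: positive condition first, same result. */
    static boolean positiveFirst(boolean honorRefspec, boolean isDefaultRefspec) {
        boolean removeSecondFetch;
        if (honorRefspec) {
            removeSecondFetch = true; // avoid second fetch call if honor refspec is enabled
        } else {
            removeSecondFetch = isDefaultRefspec;
        }
        return removeSecondFetch;
    }

    public static void main(String[] args) {
        boolean[] values = {false, true};
        for (boolean honor : values) {
            for (boolean isDefault : values) {
                System.out.println("honor=" + honor + " default=" + isDefault
                        + " -> " + positiveFirst(honor, isDefault));
            }
        }
    }
}
```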

MarkEWaite · 2020-06-28T11:34:08Z

@rishabhBudhouliya I was able to spend more time working through the code in a debugger to ensure that the tests covered the existing branches in the new code. I added two new tests to increase the coverage of the cases we had detected during interactive testing.

I've confirmed that the UserRemoteConfig constructor will convert a null refspec into a non-null value. However, I did not feel confident enough in all paths to justify removing the null handling in the determine... method. I think it is great as is.

I consider this code ready to merge once the CI jobs confirm it passes tests. I'll plan to merge it within the next 24 hours unless someone else raises an objection.

rishabhBudhouliya · 2020-06-28T11:51:10Z

src/test/java/hudson/plugins/git/GitSCMTest.java

+
+        /* Create a ref for the fake pull in the source repository */
+        String[] expectedResult = {""};
+        CliGitCommand gitCmd = new CliGitCommand(testRepo.git, "update-ref", "refs/pull/553/head", "HEAD");


While writing the code, I was thinking about how I would write a unit test for it. The solution I had in mind was to actually clone the remote git client repository and fetch a pull request reference.

I didn't think of updating a reference to point to a pull request. Thanks for showing me!

👍 I hadn't discovered the technique until just recently myself. Plenty to learn for all of us.
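
A small sketch of the technique, kept independent of the plugin's test helpers: `git update-ref refs/pull/<n>/head HEAD` creates a ref in the source repository that looks like a hosted pull request head, and a matching narrow refspec can then fetch it. The class and method names below are invented for the example, and the commands are only assembled, not executed.

```java
import java.util.Arrays;
import java.util.List;

class FakePullRefSketch {
    /** Command that creates refs/pull/<n>/head pointing at the current HEAD. */
    static List<String> updateRefCommand(int pullNumber) {
        return Arrays.asList("git", "update-ref", "refs/pull/" + pullNumber + "/head", "HEAD");
    }

    /** Refspec that fetches the fake pull ref into a remote-tracking ref. */
    static String fetchRefspec(int pullNumber) {
        return "+refs/pull/" + pullNumber + "/head:refs/remotes/origin/pull/" + pullNumber;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", updateRefCommand(553)));
        System.out.println(fetchRefspec(553));
    }
}
```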

rishabhBudhouliya · 2020-06-28T12:45:20Z

@MarkEWaite Thank you for adding two very important test cases. I have reviewed the changes and applied a small change in the if-else condition for the purpose of better readability.

While I couldn't write a test case like testRetainRedundantFetch(), I did in fact interactively test these cases:

Scenario 1: CloneOption with honor refspec is false

Test#1 -> refspec → +refs/heads/master:refs/remotes/origin/master +refs/pull/553/head:refs/remotes/origin/pull/553
Results -> Second fetch is not skipped.

Test#2 -> refspec → +refs/heads/master:refs/remotes/origin/master
Results -> Second fetch is skipped

Test#3 -> refspec is null
Results -> second fetch is skipped

Scenario 2: CloneOption is not added by the user, it is null
Same tests repeated with the same expected results.

rishabhBudhouliya · 2020-06-28T12:51:47Z

src/test/java/hudson/plugins/git/GitSCMTest.java

+        if (random.nextBoolean()) {
+            /* Randomly enable shallow clone, should not alter test assertions */
+            CloneOption cloneOptionMaster = new CloneOption(false, null, null);
+            cloneOptionMaster.setDepth(1);


I fail to understand: why would we need a shallow clone to improve coverage when we already have a full normal clone?

Good question. One branch was not being reached when a cloneOption extension is detected but does not have honor refspec enabled.

Okay. I didn't consider other CloneOptions. If honor refspec is false, then by avoiding the second fetch we will also avoid the other clone options, like Shallow Clone, Disable tags, Reference a repo, and timeout.

The good news is that the first fetch is capable of applying all of these options if we skip the second fetch. It should not make any difference to what the user expects of the git repository information after the checkout.

rishabhBudhouliya · 2020-06-28T14:28:47Z

@MarkEWaite https://ci.jenkins.io/blue/organizations/jenkins/Plugins%2Fgit-plugin/detail/PR-904/25/pipeline this build is failing for a linux instance for some reason I can't figure out. The maven build has succeeded.

MarkEWaite · 2020-06-28T16:56:23Z

> @MarkEWaite https://ci.jenkins.io/blue/organizations/jenkins/Plugins%2Fgit-plugin/detail/PR-904/25/pipeline this build is failing for a linux instance for some reason I can't figure out. The maven build has succeeded.

The ci.jenkins.io agents are provisioned by the EC2 plugin. Either the EC2 plugin or the agents have a reliability issue. They seem to randomly disconnect. We have a reconnect script that runs once a minute, but that is just a temporary technique until we can spend more time to identify and resolve the root problem.

MarkEWaite · 2020-06-29T17:31:43Z

> @MarkEWaite I have added the fixes you have proposed and merged the master branch as well.
> I promised to help you with the new release but I failed in doing so.
>
> Is there something I can do currently to help and to speed up the process?

That's great. I've built a version of the git plugin and placed it into my test infrastructure. The publicly visible portion of the test infrastructure is in my docker-lfs repository on the lts-with-plugins branch.

I had an unexpected set of pauses and stops from about 12:00 to 8:00 AM my time. GitHub was having some issues during that time, but I can't tell if that was related. I'll run additional tests in that environment over the next day or two.

I think you're doing exactly the right thing with your focus on preparing for your presentation this week. I hope to have completed my exploratory testing by the end of my working day tomorrow.

MarkEWaite

We'll need to add an "opt-out" global switch before release, but this has passed all my testing and multiple code reviews. Thanks @rishabhBudhouliya

rishabhBudhouliya · 2020-07-02T09:37:50Z

> We'll need to add an "opt-out" global switch before release, but this has passed all my testing and multiple code reviews. Thanks @rishabhBudhouliya

Thanks @MarkEWaite for merging the fix, I will work on the opt-out switch on priority and hopefully raise a pull request soon.

MarkEWaite · 2020-07-02T11:54:13Z

> Thanks @MarkEWaite for merging the fix, I will work on the opt-out switch on priority and hopefully raise a pull request soon.

In case it helps, refer to #924 as a good example

rishabhBudhouliya and others added 14 commits February 25, 2020 02:52

Add flag to avoid redundant fetch in GitSCM checkout

8d58567

Automated test to check redundance fetch call: testRedundantFetchCall…

23158ea

…FromLogs

Added tests that confirm no data loss with avoiding second fetch

9b25935

Merge pull request #1 from jenkinsci/master

2cf858e

My fork is 91 commits behind master

Merge branch 'master' of https://github.com/jenkinsci/git-plugin

11a87d4

Merged because way behind in commits

Merge branch 'master' into JENKINS-49757

5d94f2e

Use equals rather than ==

1cfb539

Co-authored-by: Francisco Javier Fernandez <[email protected]>

Use assertThat and is for better msgs

e6ac2ed

Co-authored-by: Francisco Javier Fernandez <[email protected]>

Clarify assertion

eaded93

Co-authored-by: Francisco Javier Fernandez <[email protected]>

Simpler assertThat

21ead1e

Co-authored-by: Francisco Javier Fernandez <[email protected]>

Merge branch 'master' into JENKINS-49757

67ced0e

Fix compilation error

995d803

Fix redundant fetch test failure

ae036cc

The build is supposed to fail as the narrow refspec only fetches "foo" branch (honor initial refspec) and we are commiting in a branch which doesnt exist for the project, the master branch.

Fix assertRedundantFetchIsTrue by reducing the scope of fetch argumen…

3bc0059

…t to be searched in the build logs

rishabhBudhouliya added 2 commits June 12, 2020 09:10

Merge branch 'master' into CleanBeforeCheckout

57ab0d0

Update assertion of git fetch arg from build logs with pattern matching

05daa5e

MarkEWaite added the enhancement Improvement or new feature label Jun 12, 2020

rishabhBudhouliya added 4 commits June 17, 2020 08:48

Merge remote-tracking branch 'upstream/master'

d3a1e22

Merge branch 'master' into CleanBeforeCheckout

36e3e7d

Do not aggregate GitSCMExtension imports

b54faeb

fcojfernandez reviewed Jun 24, 2020

rishabhBudhouliya added 4 commits June 24, 2020 18:21

Merge branch 'master' of https://github.com/jenkinsci/git-plugin

79cf010

Merge branch 'master' into CleanBeforeCheckout

a38f1e2

Correct spacing in foreach loop

1be3697

Remove unused imports

15e2f52

fcojfernandez reviewed Jun 26, 2020

MarkEWaite added 8 commits June 27, 2020 18:03

Merge branch 'master' into CleanBeforeCheckout

7957017

Add missing import

f2fecc8

Fix imports

a7685db

Reduce differences to master branch

Run the git command to configure test repo

1623253

Use shallow clone randomly to improve coverage

09726f1

One of the branches was not being reached by tests prior to this change.

Place CloneOption import in its sorted location

c7b7ec6

Add test to check second fetch is used when needed

f7b77e6

Also reuses the existing assertion to provide a new assertion.

Mark the rc arg as NonNull, caller checks for null

534818a

Code was already assuming that argument is non-null. The annotation makes it explicit that the value must be non-null and can be checked by spotbugs that it is non-null.

rishabhBudhouliya commented Jun 28, 2020

Non breaking change: simplification of if-else clause

c2e097d

rishabhBudhouliya commented Jun 28, 2020

MarkEWaite approved these changes Jul 2, 2020

MarkEWaite mentioned this pull request Jul 2, 2020

[JENKINS-49757] Add flag to avoid redundant fetch in GitSCM checkout #845

Closed

MarkEWaite merged commit 3481beb into jenkinsci:master Jul 2, 2020

rishabhBudhouliya mentioned this pull request Jul 2, 2020

Add an "opt-out" global switch to retain second fetch #927

Merged
