-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[JENKINS-49757] Remove redundant fetch #904
[JENKINS-49757] Remove redundant fetch #904
Conversation
My fork is 91 commits behind master
Merged because way behind in commits
Co-authored-by: Francisco Javier Fernandez <[email protected]>
Co-authored-by: Francisco Javier Fernandez <[email protected]>
Co-authored-by: Francisco Javier Fernandez <[email protected]>
Co-authored-by: Francisco Javier Fernandez <[email protected]>
The build is supposed to fail as the narrow refspec only fetches "foo" branch (honor initial refspec) and we are commiting in a branch which doesnt exist for the project, the master branch.
The concern of removing the second fetch call is a possible repository data loss or misconfiguration in terms of the extra behaviors applied during checkout. If the second fetch call is ignored, CleanBeforeCheckout option will also be ignored as it doesn't implement the decorateCloneCommand which is used by the git fetch call. To ensure that it is not ignored, decorateCloneCommand has been implemented for this particular extension. The unit test GitSCMTest#testCleanBeforeCheckout doesn't fail after the removal of second fetch call now.
@MarkEWaite @fcojfernandez I created a separate PR for this as I think this addition will require separate discussion. Also, I was thinking of creating separate branches cut from the fix branch (JENKINS-49757) which will represent the changes related to fix in order to achieve a non-breaking solution. |
Also, I have to add unit tests confirming the before and after behavior because of this change. |
…t to be searched in the build logs
The build is failing for a windows instance while it is passing on my local machine and linux instances of ci.jenkins.io. Currently, I can't figure out the reason 4 test cases are failing. Any pointers on this would greatly help me. |
@rishabhBudhouliya thanks for your work on this one! While staring at the code in the debugger, I realized that the problem is in the existing The test is incorrectly asserting that there should be a clean on the first build. A clean on the first build is not needed because the first build is assured of an empty workspace. The git plugin even has code which will delete files in the workspace before the first build. The test should assert that there should be a clean on the second build. The test failed on Windows for a particularly disconcerting reason. The CleanBeforeCheckout extension was being called before the first clone. Before the first clone, there is no git repository in the workspace. The call to It was even more complicated because the tests run in the We may want to consider setting See my proposed change in https://github.com/MarkEWaite/git-plugin/commits/CleanBeforeCheckout |
I'm surprised on how I totally missed the very behavior of this extension! There's still one thing I don't understand. Why is this happening to Windows alone? If I understand that it searches for a git repository in one of the parent directories and finds them, does it clean the working tree of that particular repository, if it does, why not in the case of Windows? |
There's still one thing I don't understand. Why is this happening to Windows alone? If git clean -xffd is being called before the first clone, it is bound to create an error on any platform?
The `git clean -xffd .` fails on windows because it is not allowed to remove a busy directory. Linux and mac allow removing a busy directory.
If we had been running the tests in a directory like `/tmp` where there is no parent directory with a git repository, then the `git clean -xffd .` would have failed differently. The failure would still have been silently hidden by the `build()` step in the test, but it would have been a different failure mode.
|
Oh, I see. Adding Would you be willing to merge this commit to master or would you like me to add it to this PR? Both works fine for me. |
This PR still needs the fixes that I proposed in https://github.com/MarkEWaite/git-plugin/commits/CleanBeforeCheckout . Even with those fixes, I'm not ready to merge this until after git client plugin 3.3.0 has been released with JGit 5.8.0 and the other changes that are in the master branch. I hope to release git client plugin 3.3.0 within the next week or 10 days so that it is available near the release of Jenkins 2.235.1 LTS. |
The reason to add a "clean" before checkout behaviour is to clean the working directory of a repository before checking out a branch. In this case, the remote repository is being cloned for the first time which means there is no local git repository.
First build is known to have a clean workspace so there is no need to clean it before checkout. Second build should clean because the workspace might be cluttered with extra files from the first build. Also removes several useless operations in the test (related to buildLog)
@MarkEWaite @fcojfernandez As discussed on the Gitter channel, I have added a check on refspec and CloneOption to ensure user provided refspec is not missed by the new fix. I have also interactively tested the new commit, I will share my findings during the GSoC weekly sync. |
for (RefSpec ref:initialFetchRefSpecs) { | ||
if (!ref.toString().contains("refs/heads")) { | ||
isDefaultRefspec = false; // if refspec is not of default type, preserve second fetch | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If only one of the elements does not contain refs/heads
then you are marking that is not a defaultRefspec. Could you explain what is your intention here when there is more than one element? Sorry but it's a bit unclear for me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If honor refspec for initial clone is disabled, the first clone is going to fetch all the branches and references related to them. All of those refspecs contain "refs/heads/xx" in the : format of refspec.
My assumption is that the given refspec contains ""refs/heads" this pattern, there is no need to call the second fetch in any case.
In the case of a narrow refspec which contains any references other than of branches, we will allow the second fetch call to take place, i.e, we will not avoid the redundant fetch call.
For an example, if honorRefSpec == false and refspec = "+refs/heads/master:refs/remotes/origin/master +refs/pull/553/head:refs/remotes/origin/pull/553"
, we can safely assume that this is not the default refspec because we have a "refs/pull" here.
In the case of finding multiple refspecs, if any refspec is other than the default format pattern, the code assumes that it is worth fetching with the second fetch call because it may have been missed by the first/initial clone.
src/main/java/hudson/plugins/git/extensions/impl/CleanBeforeCheckout.java
Outdated
Show resolved
Hide resolved
src/main/java/hudson/plugins/git/extensions/impl/CleanBeforeCheckout.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If honor refspec for initial clone is disabled, the first clone is going to fetch all the branches and references related to them. All of those refspecs contain "refs/heads/xx" in the : format of refspec.
My assumption is that the given refspec contains ""refs/heads" this pattern, there is no need to call the second fetch in any case.
In the case of a narrow refspec which contains any references other than of branches, we will allow the second fetch call to take place, i.e, we will not avoid the redundant fetch call.
For an example, if honorRefSpec == false and refspec = "+refs/heads/master:refs/remotes/origin/master +refs/pull/553/head:refs/remotes/origin/pull/553", we can safely assume that this is not the default refspec because we have a "refs/pull" here.
In the case of finding multiple refspecs, if any refspec is other than the default format pattern, the code assumes that it is worth fetching with the second fetch call because it may have been missed by the first/initial clone.
Thanks for the example! Now I understand well your code :)
} | ||
} | ||
// if initial fetch refspec contains "refs/heads/*" (default refspec), ignore the second fetch call | ||
return removeSecondFetch; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if initialFetchRefSpecs
is null, then you're removing the second fetch. At this moment, I cannot think about a situation when it will be null, but ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a test that passed in a null refspec but the UserRemoteConfig constructor converts nulls to empty strings. I think that is unreachable code due to the UserRemoteConfig use of fixEmpty()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to my assumptions, the need to add the null check was to check for empty refspecs, as Mark as correctly pointed out, the fixEmpty()
converts any empty refspec into a null refspec.
If the UserRemoteConfig was not fixing empty refspecs, the logic presented by me would gladly pass empty
refspecs as a narrow refspec and will not avoid the second fetch. But since it is there, it should not be a concern for us.
if (!option.isHonorRefspec()) { | ||
removeSecondFetch = isDefaultRefspec; | ||
} else { | ||
removeSecondFetch = true; // avoid second fetch call if honor refspec is enabled | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This if clause didn't help me to see clearer what you meant. While having a "!" in an if
clause is perfectly fine, the "!" in the if-else
might cause confusion. I would have expected
if something is true
sentence1
else
sentence2
while this piece of code is
if something is not false
sentence2
else
sentence1
My personal preference is to have the first approach in terms of readability. But it's just a matter of personal taste, so take this advice as it is, just an advice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great suggestion @fcojfernandez. I have added this in c2e097d.
Reduce differences to master branch
One of the branches was not being reached by tests prior to this change.
Also reuses the existing assertion to provide a new assertion.
Code was already assuming that argument is non-null. The annotation makes it explicit that the value must be non-null and can be checked by spotbugs that it is non-null.
@rishabhBudhouliya I was able to spend more time working through the code in a debugger to assure that the tests covered the exisitng branches in the new code. I added two new tests to increase the coverage of the cases we had detected during interactive testing. I've confirmed that the UserRemoteConfig constructor will convert a null refspec into a non-null value. However, I did not feel confident enough in all paths to justify removing the null handling in the I consider this code ready to merge once the CI jobs confirm it passes tests. I'll plan to merge it within the next 24 hours unless someone else raises an objection. |
|
||
/* Create a ref for the fake pull in the source repository */ | ||
String[] expectedResult = {""}; | ||
CliGitCommand gitCmd = new CliGitCommand(testRepo.git, "update-ref", "refs/pull/553/head", "HEAD"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While writing the code, I was thinking how would I proceed with writing a unit test for the same. The solution I had in my mind was to actually clone a remote the git client repository and fetch a pull request reference.
I didn't imagine to update the reference to point to a pull request. Thanks for showing me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I hadn't discovered the technique until just recently myself. Plenty to learn for all of us.
@MarkEWaite Thank you for adding two very important test cases. I have reviewed the changes and applied a small change in the While I couldn't write a test case like
Test#1 -> Test#2 -> Test#3 ->
|
if (random.nextBoolean()) { | ||
/* Randomly enable shallow clone, should not alter test assertions */ | ||
CloneOption cloneOptionMaster = new CloneOption(false, null, null); | ||
cloneOptionMaster.setDepth(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fail to understand, why would need a shallow clone to improve coverage when we already have a full normal clone?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. One branch was not being reached when a cloneOption extension is detected but does not have honor refspec enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. I didn't consider other CloneOptions. If honor refspec is false, with avoiding the second fetch, we will also avoid other clone options like Shallow Clone, Disable tags, Reference a repo and timeout.
The good news is that the first fetch is capable to execute all of these options if we miss the second fetch. It should not make any difference to the user's expectation on git repository information after the checkout.
@MarkEWaite https://ci.jenkins.io/blue/organizations/jenkins/Plugins%2Fgit-plugin/detail/PR-904/25/pipeline this build is failing for a linux instance for some reason I can't figure out. The maven build has succeeded. |
The ci.jenkins.io agents are provisioned by the EC2 plugin. Either the EC2 plugin or the agents have a reliability issue. They seem to randomly disconnect. We have a reconnect script that runs once a minute, but that is just a temporary technique until we can spend more time to identify and resolve the root problem. |
That's great. I've built a version of the git plugin and placed it into my test infrastructure. The publicly visible portion of the test infrastructure is in my docker-lfs repository on the lts-with-plugins branch. I had an unexpected set of pauses and stops from about 12:00 to 8:00 AM my time. GitHub was having some issues during that time, but I can't tell if that was related. I'll run additional tests in that environment over the next day or two. I think you're doing exactly the right thing with your focus on preparing for your presentation this week. I hope to have completed my exploratory testing by the end of my working day tomorrow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll need to add an "opt-out" global switch before release, but this has passed all my testing and multiple code reviews. Thanks @rishabhBudhouliya
Thanks @MarkEWaite for merging the fix, I will work on the opt-out switch on priority and hopefully raise a pull request soon. |
In case it helps, refer to #924 as a good example |
Pull Request-845
The concern of removing the second fetch call is a possible repository data loss or
misconfiguration in terms of the extra behaviors applied during checkout.
If the second fetch call is ignored, CleanBeforeCheckout option will also be ignored
as it doesn't implement the decorateCloneCommand which is used by the git fetch call.
To ensure that it is not ignored, decorateCloneCommand has been implemented for this
particular extension. The unit test GitSCMTest#testCleanBeforeCheckout doesn't fail
after the removal of second fetch call now.
Checklist
Types of changes