-
Notifications
You must be signed in to change notification settings - Fork 191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GitRepositoryReconciler no-op clone improvements #724
Conversation
4ee6088
to
e0a121c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@darkowlzz great stuff again! Just found a few small nits here and there.
e0a121c
to
1b929a5
Compare
tested this in cluster while playing with |
Generic error is an attempt to avoid creating new error type for every new unique scenario. It can be used to configure and build custom error handling behavior, logging and event recording at present. Contextual errors, Stalling and Waiting error, have special meaning for the reconciliation results. But the Event error type can be replaced with Generic error with some specific configurations. The Event error is kept for a gradual migation to Generic error. Similarly, the Generic error can be used to easily create new error handling behaviors. The error Config can be used to configure any of the errors, including contextual errors, without altering their contextual meaning, to modify how they are handled. The error constructors configure the errors with common default configurations. These configurations can be modified to alter the behavior. Signed-off-by: Sunny <[email protected]>
ErrorActionHandler processes the reconciliation error results based on their configurations. It performs actions like logging and event recording based on the error configuration. More actions can be accommodated in the future with more error configurations. It can be a replacement for RecordContextualError() which does the same operations but can't be configured much. Signed-off-by: Sunny <[email protected]>
Add Generic error in RuntimeResultBuilder and ComputeReconcileResult implementation with consideration to the error configurations. Safeguards are added in the runtime result builder to ensure default requeue after interval is set when is's set to zero or unset. Signed-off-by: Sunny <[email protected]>
For gradual migration to Generic error, update only the GitRepo reconciler to use Generic error. Replace the Waiting error for git no change scenario with a Generic error with proper no-op, early return, error configurations. This ensures that the no-op only results in log and K8s native events at normal level. Fixes a reconciliation issue when recovering from a failure state (with previous success state and artifact in the storage) and optimized git clone feature is on, which results in failure to persist as the git optimization prevented full reconciliation due to already existing artifact and removal of failure negative conditions on the object status. In order to allow failure recovery, the git clone optimizations are now only applied when the object is already in a ready state. Signed-off-by: Sunny <[email protected]>
Introduce a new field in the GitRepositoryReconciler to set the enabled features. This makes it test friendly compared to using global flags for setting and checking flags in the tests. Enable default feature gates in all the GitRepo reconciler tests. Add test cases for reconcileSource() to test the behavior of optimized git clone when the Repo is ready and not ready. This ensures that the full reconciliation is not skipped when GitRepo is not ready. Signed-off-by: Sunny <[email protected]>
Introduce concrete and partial commits. Concrete commits have all the information from remote including the hash and commit content. Partial commits are based on locally available copy of a repo, they may only contain the commit hash and reference. IsConcreteCommit() can be used to find out if a given commit is based on local information or full remote repo information. Update go-git and libgit2 branch/tag clone optimization to return a partial commit and no error. Update and simplify the go-git and libgit2 tests for the same. Signed-off-by: Sunny <[email protected]>
Introduce contentConfigChecksum in the GitRepository.Status to track the configurations that affect the content of the artifact. It is used to detect a change in the configuration that requires rebuilding the whole artifact. This helps skip the reconciliation early when we find out that the remote repository has not changed. Moves fetching the included repositories in reconcileSource() to collect enough information in reconcileSource() to be able to decide if the full reconciliation can be skipped. This results in reconcileInclude() to just copy artifact to the source build directory. Introduce a gitCheckout() method to perform construction of all the git checkout options and perform the checkout operation. This helps to easily perform checkout multiple times when we need it in reconcileSource(). When we check with the remote repository if there's an update, and find out that there's no update, we check if any other configurations that affect the source content has changed, like includes, ignore rules, etc. If there's a change, we need to perform a full checkout of the remote repository in order to fetch the complete source. The git checkout no-op optimization is enabled in this method based on the presence of an artifact in the storage. The failure notification handler is modifed to handle the recovery of a no-op reconcile failure and create a notification message accordingly with the partial commit. Signed-off-by: Sunny <[email protected]>
1b929a5
to
581695b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@darkowlzz great work! thanks for making no-op clones feasible! 🙇 🙇♂️ 🙇
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks solid to me, thanks @darkowlzz 🙇
The no-op git clone implementation resulted in some bugs in the GitRepository reconciler due to the way it's currently integrated in the reconciliation loop. Skipping the reconciliation in the middle based on no update in the remote repository ignores the other factors that contribute to the source artifact like recursive submodules, ignore rules and included repositories, ignore rules, included repositories, etc. This causes inconsistent behavior and sometimes persistent reconciliation failures.
This change restructures the GitRepository reconciler to be able to correctly determine if the reconciliation can be skipped and updates all the other components affected by this change in behavior like the reconciliation result builder and processor, recovery notifications, etc.
Errors
It introduces a new error configurations for the reconciliation error abstractions to add action information in those errors. There are both contextual and actionable errors in reconciliation errors. Certain scenarios require an error to be both contextual and actionable. The error config helps achieve that by embedding action related configurations in the errors. Using this, a user of an error can configure if the error should result in logging or event notification or just ignored. For ease of use, the reconciliation errors now have constructors with default configurations for their expected behavior in most of the cases. Some of the error configurations may not always be effectual for contextual errors due to the contextual meaning. For example, the Stalling contextual error with ignore option should not result in the error being ignored. For ignorable no-op error result, a non-contextual error is more appropriate.
It also introduces a
Generic
error type which can be used to replace the existing non-contextual errors like the Event error. Generic error can be configured to replace Event error and also add more custom behaviors based on the requirement of the caller.Error action handler
Since the error configurations now allow define action based configurations in the errors, a new reconciliation result processor
ErrorActionHancler
is introduced to handle the error configurations accordingly. It implements the result processor and can replace the existing error processors.Runtime results
The runtime result builder has been updated to identify Generic errors which may have an ignore option for a no-op result.
Generic ignorable errors do result in status observed generation to be updated.
Git packages
The Git implementations have been updated to return a partial commit when it's a no-op clone. The partial commit contains only the hash and reference of a commit, since we don't have full commit information without cloning again. A concrete commit contains all the information related to a commit.
GitRepository Reconciler
The GitRepositoryReconciler's
reconcileSource
is updated to fetch all the information required to determine if the reconciliation can be skipped. This includes fetching the metadata of all the included repositories and comparing them with the previously seen included repositories.A new status field
contentConfigChecksum
is introduced to calculate a checksum of all the configuration values that contribute to a change in the source artifact.When no-op git clone is enabled, the remote repo is checked for an update and a partial commit is returned if there's no update. If the
contentConfigChecksum
hasn't changed, the reconciliation is skipped. But if thecontentConfigChecksum
has changed, another git checkout is performed for a concrete commit to perform a full reconciliation to build a new artifact.All the Event error in the reconciler are replaced with Generic error. Other reconcilers can adopt Generic error gradually, both the errors work for now.
All the GitRepositoryReconciler tests have been updated to use the feature gates.
NOTE: This change updates the GitRepository CRD with a new field which should be updated before any testing.