-
Notifications
You must be signed in to change notification settings - Fork 191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gitrepo: Refactor feature usage and fix tests #746
Conversation
7108fc8
to
d740f91
Compare
d740f91
to
cfc4daa
Compare
9d4bf67
to
48b74ec
Compare
@aryan9600 please rebase |
main.go
Outdated
) | ||
if err = managed.InitManagedTransport(ctrl.Log.WithName("managed-transport")); err != nil { | ||
setupLog.Error(err, "could not register managed transports") | ||
os.Exit(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This introduces a hard stop when managed transport initialization fails. This is a new behavior compared to what we've done in the previous releases.
Highlighting so that we can discuss and decide if we really want to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but IMO it makes sense, since managed transports are the default now. So unless, users explicitly disable it, they would want to see the managed transports being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change removes the enabled
package variable from pkg/git/libgit2/managed/init.go which was set at the end of InitManagedTransport()
to indicate that managed transport was enabled successfully. But after removing that, we won't have any way to know if the managed transport is enabled or not, other than just trusting the feature flag values, which may not be accurate. This could be the reason to exit instead of continuing. But I don't know if we should fail or just continue running.
If we continue running, all the code that depends on managed transport need to not depend on the feature flag but have something like the Enabled()
function to help them know the reality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but IMO it makes sense, since managed transports are the default now. So unless, users explicitly disable it, they would want to see the managed transports being used.
@aryan9600 okay. Let's also have others' opinion about this since it's critical. @fluxcd/core-maintainers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say that only the reconciler should persistently error, instead of it disabling the processing of all source kinds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind this as well, but then I think that raises the argument that is this error much different from an error returned while starting a reconciler through SetupWithManager()
? Asking because, we do error out if any of one the reconcilers fail to start. Now that managed transports are supposed to be the default and the intended way to checkout git repos, imo failing to register them, is equivalent to not being able to start the reconciler itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's another discussion to have. Given the recent insights on multi-tenancy (DoS) attack patterns, I think attempting to reconcile whatever we can (and possibly even making use of previous observed state backed by persistent storage) is more failure resistant than having the controller fail hard. It might be that we need to revise this in other areas as well, but within the context of this PR, I think this approach would be best.
controllers/suite_test.go
Outdated
fg := feathelper.FeatureGates{} | ||
fg.SupportedFeatures(map[string]bool{ | ||
features.GitManagedTransport: true, | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm missing something here. The reconciler below uses the global features set in internal/features/features.go. And this is just passing a dynamically constructed map of features to the feature gates helper. Is this influencing those features in any way? Both the features we'd be interested in testing by default are already enabled in the default feature gates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SetupWithManagerAndOptions
calls features.Enabled()
to check if a feature is enabled, which is just a wrapper around the Enabled()
in fluxcd/pkg/features
, the map with the features and the default values are just for reference in SC, they don't actually enable or disable any feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this would be better if aligned with what we do at main.go
and simply rely on the existing default values, so as they change our tests would automatically capture/validate such changes.
fg := feathelper.FeatureGates{}
fg.SupportedFeatures(features.FeatureGates())
https://github.com/fluxcd/source-controller/blob/main/main.go#L153-L154
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revisiting this, looking at the code again, I see that this does affect the default reconcilers that we run in the test environment. So, it affects how TestGitRepositoryReconciler_Reconcile()
behaves.
I'd support @pjbgf 's recommendation to do the same that's done in main.go to ensure we don't have different configurations in tests and main code.
// authOpts can't be nil | ||
err := registerManagedTransportOptions(context.TODO(), "", nil) | ||
g.Expect(err).To(HaveOccurred()) | ||
g.Expect(err.Error()).To(ContainSubstring("can't use managed transport with an empty set of auth options")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is okay, but just checking if error happened or not should also have been fine.
11608c6
to
cd25362
Compare
ControllerName string | ||
Storage *Storage | ||
ControllerName string | ||
ManagedTransportsRegistered bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name ManagedTransportsRegistered
may mislead folks that don't fully understand how managed transport works.
Given that transport registration happens at git2go
, and any registered transport can be overwritten, we can never confirm at this abstraction level that our Managed Transport is registered. What we can do is to define the intention that Managed Transport is to be enabled/used.
Maybe?
ManagedTransportsRegistered bool | |
UseManagedTransport bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though transport registration at git2go
, I think we can be confident about it since, if the transports fail to register, a non-nil error would be returned. I think it's safe to assume that if the error is nil, our transports have been registered. UseManagedTransport
implies the same thing as GitManagedTransport=true
, which could cause confusion, in terms of code readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this field reflects the state of the world, not a configuration but what happened when we tried to register. If it succeeded, it's true, else it's false. It's a way for main.go (almost a controller manager) to communicate with the reconciler that the transport registration was successful or failed. What we want is expressed in the feature gate, but the effective state of the system is reflected in this field.
It'd be good to have test coverage for the scenario this introduces. The checkout related tests that will behave differently should have a case for when the managed transport registration failed.
I'm observing discrepancy in the usage of the terms. The feature gate has it in singular form GitManagedTransport
, but this field makes it plural and it's also reflected in the associated error message ("they"). Based on my observations, we present this as one thing and don't provide granular options to register multiple managed transports. Another observation is that a lot of the code and comments @aryan9600 introduced has it in plural and the ones by @pjbgf use it in singular.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, based on this discussion, it may be better to have a description of this field, what it means. Unlike other public fields, the intention of this field may not be clear.
// feature is enabled, since the controller can't recover from this. | ||
if !r.ManagedTransportsRegistered { | ||
e := &serror.Stalling{ | ||
Err: errors.New("can't use managed transports because they are not registered"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the other point above, ideally we would keep terminology based on the level of abstraction we are operating at. In this case, if the feature gate is enabled, the controller should use managed transport or have it enabled or something around those lines.
Err: errors.New("can't use managed transports because they are not registered"), | |
Err: errors.New("invalid state: GitManagedTransport must be used when feature is enabled"), |
// We return a stalling error if managed transports aren't registered and the related | ||
// feature is enabled, since the controller can't recover from this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// We return a stalling error if managed transports aren't registered and the related | |
// feature is enabled, since the controller can't recover from this. | |
// We return a stalling error if managed transports is not being used when the related | |
// feature is enabled, as this is an invalid state. |
|
||
err = managed.InitManagedTransport(logr.Discard()) | ||
if err != nil { | ||
panic(fmt.Sprintf("Failed to register managed transports; %v", err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
panic(fmt.Sprintf("Failed to register managed transports; %v", err)) | |
panic(fmt.Sprintf("failed to register managed transports: %v", err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is in alignment with the rest of the panic errors in this file.
I'd prefer to move all such code into a separate function so that we don't have to write so many panic calls, similar to what's done for setting up test OCI registry. But maybe separately.
var registered bool | ||
if enabled, _ := features.Enabled(features.GitManagedTransport); enabled { | ||
if err = managed.InitManagedTransport(ctrl.Log.WithName("managed-transport")); err != nil { | ||
setupLog.Error(err, "could not register managed transports") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setupLog.Error(err, "could not register managed transports") | |
setupLog.Error(err, "could not initialize managed transport") |
// Enabled defines whether the use of Managed Transport is enabled which | ||
// is only true if InitManagedTransport was called successfully at least | ||
// once. | ||
// | ||
// This is only affects git operations that uses libgit2 implementation. | ||
func Enabled() bool { | ||
return enabled | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given how this has evolved through the last few releases, I think this should be kept, but renamed to Initialized()
instead.
Without this we may not find a way to identify the corrupted state in which Managed Transport is enabled (feature gate) but not truly initialized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of removing this (and the related flag) was to avoid having multiple global states that refer to the same thing in general, since this can very easily cause hidden issues as the code grows. Making sure that there exists only one way to check if managed transports are enabled, i.e. the feature gate, avoids this. To get information about the initialisation of the transports, the reconciler has been updated to contain that info, which it can easily pass down to the methods it calls using CheckoutOptions
, which is more in line with the existing API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer @aryan9600's approach here because it makes it easier to test different scenarios without trying to find ways to change the global state. The ManagedTransportsRegistered
field in the gitrepo reconciler now allows easy testing of scenarios where managed transport initialization failed without affecting other tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initially, managed.Enabled()
was introduced to work as the feature flag to switch the Managed Transport ON/OFF. It then gained a different meaning when feature gates were introduced, and unfortunately it was not renamed during that refactoring exercise.
This now represents that the initialisation process was successfully executed once during the lifetime of the controller.
It is important to highlight that this has a complete different meaning to the ManagedTransportsRegistered
, as the latter shows the intention to use Managed Transport. However, if those two values ever mismatch, that is an indication that we are operating at one of two possible corrupted states:
a) Managed Transport is wanted but was not initilised (e.g. errored during Init)
b) Unmanaged Transport was wanted, but Managed Transport is being used.
This could be specially useful on tests to ensure we are testing what we believe/want to be testing.
Without keeping this here (and renaming for something more appropriate), we have no way (that I can think of) to detect such corrupt state.
Unless we decide to initialise the Managed Transport on func init()
, which makes sense during Unmanaged transport deprecation, but probably not as part of the scope of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if I'm reading it wrong, but ManagedTransportsRegistered
is
This now represents that the initialisation process was successfully executed once during the lifetime of the controller.
with this change.
And based on https://github.com/fluxcd/source-controller/pull/746/files#diff-09c39f8050c8ddc85087f6347aa826007dd6cb4da2b0417f1221cc7a44498cb3R723-R729, in addition to ManagedTransportsRegistered
being true, GitManagedTransport
feature must be enabled to actually use it. Else, the reconciler puts the object into a stalled state, which prevents from entering into a corrupted state. It doesn't seem to show the intention to use managed transport, that's still based on the value of GitManagedTransport
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation seems to lack the opposite check
b) Unmanaged Transport was wanted, but Managed Transport is being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pjbgf I think there might me a misunderstanding. ManagedTransportsRegistered
is a way for the reconciler to know whether main()
was successful in registering/initializing our managed transports. If the GitManagedTransport
feature gate is enabled, then the reconciler can use this information check if it should go ahead and reconcile the object (if MangedTransportsRegistered=true
) or put the object in a stalled state (if ManagedTransportsRegistered=false
). The intention to use managed transports is only derived via the feature gate, since that's a decision that resides on users.
@darkowlzz the opposite check isn't required since, we only attempt to register our managed transports when the feature gate was enabled, i.e. if Unmanaged Transport was wanted, it's impossible for our managed transports to be registered, and thus consequently ManagedTransportsRegistered
is always false
.
if mt { | ||
r.features[features.GitManagedTransport] = true | ||
} | ||
// OptimizedGitClones is only enabled when GitManagedTransport is enabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// OptimizedGitClones is only enabled when GitManagedTransport is enabled. | |
// OptimizedGitClones is only supported when GitManagedTransport is enabled. |
controllers/suite_test.go
Outdated
fg := feathelper.FeatureGates{} | ||
fg.SupportedFeatures(map[string]bool{ | ||
features.GitManagedTransport: true, | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this would be better if aligned with what we do at main.go
and simply rely on the existing default values, so as they change our tests would automatically capture/validate such changes.
fg := feathelper.FeatureGates{}
fg.SupportedFeatures(features.FeatureGates())
https://github.com/fluxcd/source-controller/blob/main/main.go#L153-L154
Signed-off-by: Sanskar Jaiswal <[email protected]>
Signed-off-by: Sanskar Jaiswal <[email protected]>
Adds a new field `Managed` to `CheckoutOpts` to let checkout impls figure out when to use managed transport. Updates the reconciler to return a stalling error when managed transports are not registered but the feature is enabled. Signed-off-by: Sanskar Jaiswal <[email protected]>
52b1e7b
to
02a89ad
Compare
Closing in favor of #779 |
This PR is a follow up to #727. It refactors feature usage in the reconciler alongside the checkout code, by getting rid of
managed.Enabled()
. Instead, a new fieldManaged
inCheckoutOpts
is added which is used to check for managed transport inCheckout()
along withfeatures.Enabled()
in the reconciler, which lets us avoid having multiple global states for the same thing. If we can't register our managed transports before starting the reconciler, the reconciler will return aStalling
error.It also addresses various shortcomings related to GitRepository (and libgit2) tests. It arranges the tests in different files carefully, to influence the order in which golang executes the tests. All the unmanaged tests go in
checkout_test.go
, which is the first file in it's package (alphabetically speaking), which means those tests always run prior to any other test in that package.Related to #745.
Signed-off-by: Sanskar Jaiswal [email protected]