✨ Refactor: Don't use infrav1.Instance internally #971

Merged
merged 1 commit into from
Aug 20, 2021

Contributor

@mdbooth mdbooth commented Aug 16, 2021

This is predominantly a non-functional change. It contains some minor
changes to log messages, and omits a check for instance name being empty
prior to delete in OpenStackMachine controller because this check is no
longer relevant when we delete by ID. Apart from that it contains no
(deliberate) functional changes.

Currently we use infrav1.Instance as our internal representation of an
OpenStack instance. There are several problems with this:

  • infrav1.Instance is part of the API, so is hard to change for internal
    use
  • infrav1.Instance is used to represent both the 'spec' and 'status' of
    an instance
  • When used as a spec for the Bastion host, it allows fields to be set
    which will be ignored
  • When used as the status of an instance fetched from OpenStack not all
    fields are populated, leading to potential accidental usage errors

This change creates several new types representing different types of
instance data purely for internal use. This has the following
advantages:

  • As they are not API types they can be updated easily
  • When used in a function signature they make it clear what data is
    being used
  • The type system will prevent accidental use of uninitialised data

The new types are:

  • InstanceSpec
  • InstanceIdentifier
  • InstanceStatus
  • InstanceNetworkStatus

They are defined and described in instance_types.go.

The primary driver for this work is to eventually fix the documented
errors in InstanceNetworkStatus in IP() and FloatingIP().
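
As a rough illustration of the split described above, here is a minimal sketch of separate spec, identifier and status types. Every field name, the `newInstanceStatus` constructor and the `deleteTarget` helper are assumptions made for this example; the real definitions are in instance_types.go and may look quite different. InstanceNetworkStatus is left out for brevity.

```go
package main

// InstanceSpec describes a server we want to create. It carries no ID
// because the server does not exist yet. Field names are hypothetical.
type InstanceSpec struct {
	Name   string
	Flavor string
	Image  string
}

// InstanceIdentifier is the minimum needed to refer to an existing server.
type InstanceIdentifier struct {
	ID   string
	Name string
}

// InstanceStatus wraps what was observed for an existing server. Its fields
// are unexported, so code outside the package cannot populate them by hand
// and has to go through a constructor (here the hypothetical
// newInstanceStatus).
type InstanceStatus struct {
	id    string
	name  string
	state string
}

func newInstanceStatus(id, name, state string) *InstanceStatus {
	return &InstanceStatus{id: id, name: name, state: state}
}

func (is *InstanceStatus) ID() string    { return is.id }
func (is *InstanceStatus) Name() string  { return is.name }
func (is *InstanceStatus) State() string { return is.state }

// Identifier narrows a status down to just the identifying fields.
func (is *InstanceStatus) Identifier() *InstanceIdentifier {
	return &InstanceIdentifier{ID: is.id, Name: is.name}
}

// deleteTarget accepts only an identifier, so a caller cannot hand it a
// half-populated spec by mistake. It returns the ID it would delete.
func deleteTarget(id *InstanceIdentifier) string {
	return id.ID
}
```

With types like these, passing an `*InstanceSpec` to `deleteTarget` is a compile error, which is the kind of accidental misuse the change aims to rule out.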

What this PR does / why we need it:

This is a refactor in support of a future change to fix #926. I would like to merge these in separate PRs to separate potential errors introduced by a refactor from the functional changes required to fix the issue.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):

Special notes for your reviewer:

I suggest the easiest way to read this change is:

  • Read the commit message
  • Read instance_types.go
  • Everything else is pretty much a mechanical consequence

/hold

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 16, 2021
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Aug 16, 2021
@@ -196,24 +196,34 @@ func deleteBastion(log logr.Logger, osProviderClient *gophercloud.ProviderClient
return err
}

-instance, err := computeService.GetInstanceByName(openStackCluster, fmt.Sprintf("%s-bastion", cluster.Name))
+instanceStatus, err := computeService.GetInstanceByName(openStackCluster, fmt.Sprintf("%s-bastion", cluster.Name))
Contributor

Not sure whether we need to rename this to GetInstanceStatusByName?

Contributor Author

I'm inclined to leave it for now because it's already getting a bit long, but it's easy to change if we decide it's confusing. I think it should be ok, though, because now everything returns InstanceStatus.

Member

+1 for renaming it to GetInstanceStatusByName

@@ -0,0 +1,171 @@
/*
Copyright 2018 The Kubernetes Authors.
Contributor

nit : 2021

}

// APIInstance returns an infrav1.Instance object for use by the API.
func (is *InstanceStatus) APIInstance() (*infrav1.Instance, error) {
Contributor

It looks like infrav1.Instance has more fields, such as Metadata and ConfigDrive.
Will they be initialised here or somewhere else?

Contributor Author

That's the intention, although I don't currently think we'll add any parameters to APIInstance() specifically. We will almost definitely add parameters to NetworkStatus(), though.

@hidekazuna
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 19, 2021
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 19, 2021
@mdbooth
Contributor Author

mdbooth commented Aug 19, 2021

I've rebased on to master and fixed the copyright nit. No other changes.

@jichenjc
Contributor

/approve

OK, I think we can make additional updates over time; so far it looks great.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jichenjc, mdbooth


@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 20, 2021
Member

@tobiasgiese tobiasgiese left a comment

Found a few nits. But all in all it looks good :)

Comment on lines 204 to 202
instanceNS, err := instanceStatus.NetworkStatus()
if err != nil {
return err
}
Member

You should move this block inside the instanceStatus != nil condition, as it's only used there

Contributor Author

This would also have been a bug!

Member

nil pointer incoming :)
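
The point of this thread can be sketched as follows: a lookup by name may return a nil status when no server exists, so anything derived from the status has to sit behind the nil check. The types, the `malformed` flag and the `floatingIPToDelete` helper below are hypothetical stand-ins written for this example, not the code under review.

```go
package main

import "errors"

// instanceNetworkStatus is a hypothetical stand-in for InstanceNetworkStatus.
type instanceNetworkStatus struct {
	floatingIP string
}

// FloatingIP returns no error, matching the final shape discussed in review.
func (ns *instanceNetworkStatus) FloatingIP() string { return ns.floatingIP }

// instanceStatus is a hypothetical stand-in for InstanceStatus.
type instanceStatus struct {
	floatingIP string
	malformed  bool // simulates address data that cannot be parsed
}

// NetworkStatus reads fields of the receiver, so calling it on a nil
// *instanceStatus panics with a nil pointer dereference.
func (is *instanceStatus) NetworkStatus() (*instanceNetworkStatus, error) {
	if is.malformed {
		return nil, errors.New("cannot parse server addresses")
	}
	return &instanceNetworkStatus{floatingIP: is.floatingIP}, nil
}

// floatingIPToDelete returns the floating IP to clean up, or "" when the
// lookup found no instance. NetworkStatus is only called once we know the
// status is non-nil.
func floatingIPToDelete(is *instanceStatus) (string, error) {
	if is == nil {
		// No server was found, so there is nothing to derive.
		return "", nil
	}
	ns, err := is.NetworkStatus()
	if err != nil {
		return "", err
	}
	return ns.FloatingIP(), nil
}
```

Calling `floatingIPToDelete(nil)` returns an empty string and no error, whereas calling `NetworkStatus()` directly on a nil status would panic.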

Comment on lines 210 to 206
floatingIP := instanceNS.FloatingIP()
if err != nil {
Member

You don't define err here, you can remove the err handling as no error will be returned

Contributor Author

Whoops. FloatingIP() returned an error in an earlier version and I missed this!

Comment on lines 238 to 241
ns, err := instance.NetworkStatus()
if err != nil {
handleUpdateMachineError(logger, openStackMachine, errors.Errorf("error getting network status for OpenStack instance %s with ID %s: %v", instance.Name(), instance.ID(), err))
return ctrl.Result{}, nil
Member

Do you need the current status here? If yes, you should call GetInstanceByName (or GetInstanceStatusByName) again.

Member

Also instance should be called instanceStatus :)

Contributor Author

I don't think we want to fetch again here. Were you thinking of some specific reason we might need to? We'd potentially race with the prior delete, so it might even be a bug to re-fetch.

Member

@tobiasgiese tobiasgiese Aug 20, 2021

Why do we delete the instance and then check the status of the instance? Shouldn't we first check the status (i.e., to get the fips) and if that fails, we don't want to delete the instance? Otherwise we could have orphaned fips, right?

Contributor Author

We're creating an InstanceNetworkStatus here so we can get the FloatingIP. This doesn't go back to OpenStack; it's just doing some extra work on the previously returned server object. The only reason it can fail is JSON parsing. All we need to know is which FloatingIP to delete.

The previous code also had this failure mode, but it was in the return of DeleteInstance()->GetInstance()->serverToInstance()->GetIPFromInstance(). Moving that call up to the 'top level' in the controller is the primary purpose of this refactor, because it means we can pass additional arguments to NetworkStatus() so we can have more context. I don't think it currently matters for FloatingIP(), but for IP() it means we can pass OpenStackCluster as an argument here and use the additional context to determine the correct 'primary' IP from those returned by OpenStack.
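
To make the idea concrete, here is a minimal sketch of a NetworkStatus() that parses already-fetched address data locally and takes extra context from the caller. It is not the planned fix: the `primaryNetwork` string stands in for whatever would be read from OpenStackCluster, the struct layout is invented for this example, and the JSON keys are assumed to follow the server `addresses` payload returned by the compute API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// address mirrors one entry in a server's address list. The JSON keys are
// an assumption based on the compute API's "addresses" payload.
type address struct {
	Addr string `json:"addr"`
	Type string `json:"OS-EXT-IPS:type"`
}

// instanceStatus is a hypothetical stand-in holding the raw address data
// that came back with the server object.
type instanceStatus struct {
	rawAddresses []byte
}

type instanceNetworkStatus struct {
	addresses      map[string][]address // network name -> addresses
	primaryNetwork string
}

// NetworkStatus makes no API call. It only parses data we already have, so
// JSON parsing is its sole failure mode. primaryNetwork is the additional
// context supplied by the caller.
func (is *instanceStatus) NetworkStatus(primaryNetwork string) (*instanceNetworkStatus, error) {
	var addrs map[string][]address
	if err := json.Unmarshal(is.rawAddresses, &addrs); err != nil {
		return nil, fmt.Errorf("parsing server addresses: %w", err)
	}
	return &instanceNetworkStatus{addresses: addrs, primaryNetwork: primaryNetwork}, nil
}

// IP returns the first fixed address on the primary network, or "" if none.
// Because the network is named explicitly, the result does not depend on
// map iteration order.
func (ns *instanceNetworkStatus) IP() string {
	for _, a := range ns.addresses[ns.primaryNetwork] {
		if a.Type == "fixed" {
			return a.Addr
		}
	}
	return ""
}

// FloatingIP returns a floating address from any network, or "" if none.
// With several floating IPs the choice would depend on map iteration order.
func (ns *instanceNetworkStatus) FloatingIP() string {
	for _, list := range ns.addresses {
		for _, a := range list {
			if a.Type == "floating" {
				return a.Addr
			}
		}
	}
	return ""
}
```

For a server attached to two networks, `NetworkStatus("net-b")` followed by `IP()` picks the fixed address on `net-b`, rather than whichever network happens to be iterated first.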

Contributor Author

Incidentally, for robustness we should probably reverse the order of these deletions. The reason is that if deletion of the server succeeds but the deletion of the floating ip fails, the next reconcile will short-cut because there's no server and we'll leak the floating ip.

This patch is just a refactor, though, so I don't want to fix that here.
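
The reordering suggested here could look roughly like the sketch below, which is deliberately not part of this patch. The `cloud` interface, `deleteInstance` and `fakeCloud` are hypothetical names invented for the example; they only model the ordering argument.

```go
package main

import "errors"

// cloud is a hypothetical stand-in for the compute and networking services.
type cloud interface {
	DeleteFloatingIP(ip string) error
	DeleteServer(id string) error
}

// deleteInstance removes the floating IP before the server. If the floating
// IP deletion fails, the server still exists, so the next reconcile finds it
// and retries instead of short-cutting and leaking the floating IP.
func deleteInstance(c cloud, serverID, floatingIP string) error {
	if floatingIP != "" {
		if err := c.DeleteFloatingIP(floatingIP); err != nil {
			return err
		}
	}
	return c.DeleteServer(serverID)
}

// fakeCloud records the order of calls and can simulate a failure.
type fakeCloud struct {
	failFloatingIP bool
	calls          []string
}

func (f *fakeCloud) DeleteFloatingIP(ip string) error {
	f.calls = append(f.calls, "fip:"+ip)
	if f.failFloatingIP {
		return errors.New("floating IP deletion failed")
	}
	return nil
}

func (f *fakeCloud) DeleteServer(id string) error {
	f.calls = append(f.calls, "server:"+id)
	return nil
}
```

With `failFloatingIP` set, `deleteInstance` returns the error after a single recorded call, and the server is left in place for the next attempt.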

Member

That's exactly the point I was worried about. We should first delete the floating IPs and the instance right after the successful deletion.
But okay for me to fix that after this refactor PR.

Further, we could maybe improve the error message for the failed JSON parsing. "error getting network status for OpenStack instance" is a bit misleading, tbh.

Member

@tobiasgiese tobiasgiese Aug 20, 2021

Also, starting error messages with "error" is redundant. But we'd have to fix that across the whole repo anyway.

Contributor Author

@mdbooth mdbooth left a comment

Thanks for a thorough review, @tobiasgiese! I'm not in a hurry to land this as I still haven't written the fix for the actual multi-network bug, so I'd prefer to address these before merging.

With 2 people thinking I should rename the GetInstance functions I'll do that, too.

@tobiasgiese
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 20, 2021
@tobiasgiese
Member

/lgtm cancel

waiting for the findings to be updated

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 20, 2021
@mdbooth
Contributor Author

mdbooth commented Aug 20, 2021

That change is intended to address all review comments. Please check!

@tobiasgiese
Member

/lgtm
👍🏻

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 20, 2021
@jichenjc
Contributor

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 20, 2021
@k8s-ci-robot k8s-ci-robot merged commit ed6893b into kubernetes-sigs:master Aug 20, 2021
@mdbooth mdbooth deleted the instance_status branch November 5, 2021 14:23
Successfully merging this pull request may close these issues.

Instance.IP is set randomly when a server has multiple networks
5 participants