-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: adds the LVMVolumeGroup CRD and processing logic #72
Conversation
return lvmvgName | ||
} | ||
|
||
func (c lvmVG) ensureCreated(r *LVMClusterReconciler, ctx context.Context, lvmCluster *lvmv1alpha1.LVMCluster) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Not related to this PR). Just realised the context
should usually be the first argument of the function.
cmd/vgmanager/main.go
Outdated
Scheme: mgr.GetScheme(), | ||
Client: mgr.GetClient(), | ||
Scheme: mgr.GetScheme(), | ||
Nodename: os.Getenv("NODE_NAME"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need Nodename
in the VGReconciler
object ? It s used only when we update the status
. So we can directly call os.Getenv("NODE_NAME")
over there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a problem with saving it. Similar to how we save the namespace elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. name should be NodeName
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -46,7 +46,7 @@ const ( | |||
func (r *VGReconciler) SetupWithManager(mgr ctrl.Manager) error { | |||
r.deviceAgeMap = newAgeMap(&wallTime{}) | |||
return ctrl.NewControllerManagedBy(mgr). | |||
For(&lvmv1alpha1.LVMCluster{}). | |||
For(&lvmv1alpha1.LVMVolumeGroup{}). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will creation of a new LVMVolumeGroup
trigger controller reconcile for all the other existing LVMVolumeGroups?
If yes, then I think we should avoid that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the request will have the name of the object it was triggered for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just minor things related with the use of Caps letters in log messages
cmd/vgmanager/main.go
Outdated
Scheme: mgr.GetScheme(), | ||
Client: mgr.GetClient(), | ||
Scheme: mgr.GetScheme(), | ||
Nodename: os.Getenv("NODE_NAME"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is set this env. variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here.
controllers/lvm_volumegroup.go
Outdated
r.Log.Info("LVMVG deleted", "Name", vgcr.Name) | ||
return nil | ||
} | ||
r.Log.Error(err, "failed to retrieve LVMVG resource", "Name", vgcr.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
":s/Name/name" in log lines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Name is the fieldname. shouldn't that be capitalized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its not consistent through out. Both name
and Name
are used in the logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack. Changing it to "name".
controllers/lvm_volumegroup.go
Outdated
if err != nil { | ||
// already deleted in previous reconcile | ||
if errors.IsNotFound(err) { | ||
r.Log.Info("LVMVG deleted", "Name", vgcr.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
":s/Name/name"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
controllers/lvm_volumegroup.go
Outdated
// if not deleted, initiate deletion | ||
if existingLvmVg.GetDeletionTimestamp().IsZero() { | ||
if err = r.Client.Delete(ctx, existingLvmVg); err != nil { | ||
r.Log.Error(err, "failed to delete LVMVG", "Name", existingLvmVg.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
controllers/lvm_volumegroup.go
Outdated
r.Log.Error(err, "failed to delete LVMVG", "Name", existingLvmVg.Name) | ||
return err | ||
} else { | ||
r.Log.Info("initiated LVMVG deletion", "Name", existingLvmVg.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
err := r.Client.Get(ctx, req.NamespacedName, lvmCluster) | ||
|
||
var status string = "available" | ||
var reason string = "Success" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
":/s/Success/success"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Status strings are usually capitalized.
var matchingDevices []internal.BlockDevice | ||
_, matchingDevices, err = filterMatchingDevices(remainingValidDevices, volumeGroup) | ||
if err != nil { | ||
r.Log.Error(err, "could not filter matching Devices", "VGName", volumeGroup.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
":S/Devices/devices"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
reason = "failed to create VG" | ||
} else { | ||
status = "degraded" | ||
reason = "Failed to extend VG" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
":S/Failed/failed"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Status messages are usually capitalized.
Eg. This is a condition in a deployment status:
- lastTransitionTime: "2022-01-14T13:48:03Z"
lastUpdateTime: "2022-01-14T13:48:03Z"
message: Deployment has minimum availability.
reason: MinimumReplicasAvailable
status: "True"
controllers/lvm_volumegroup.go
Outdated
return nil | ||
} | ||
|
||
func (c lvmVG) getLvmVolumeGroups(r *LVMClusterReconciler, instance *lvmv1alpha1.LVMCluster) ([]*lvmv1alpha1.LVMVolumeGroup, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can only pass the namespace
instead of entire LVMClusterReconciler
, if we only plan to use the namespace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to address that in a separate PR.
|
||
deviceClasses := instance.Spec.DeviceClasses | ||
for _, deviceClass := range deviceClasses { | ||
lvmVolumeGroup := &lvmv1alpha1.LVMVolumeGroup{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how do we decide which deviceClass
is default
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That will need to be changed in the CRD design. I will take that up in a separate PR.
controllers/lvm_volumegroup.go
Outdated
lvmVolumeGroups = append(lvmVolumeGroups, lvmVolumeGroup) | ||
} | ||
return lvmVolumeGroups, nil | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blank line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
controllers/lvm_volumegroup.go
Outdated
) | ||
|
||
const ( | ||
lvmvgName = "lvmvg-manager" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lvmvgName = "lvmvg-manager" | |
lvmVGName = "lvmvg-manager" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
controllers/lvm_volumegroup.go
Outdated
}) | ||
|
||
if err != nil { | ||
r.Log.Error(err, "LvmVolumeGroup reconcile failure", "name", volumeGroup.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency
r.Log.Error(err, "LvmVolumeGroup reconcile failure", "name", volumeGroup.Name) | |
r.Log.Error(err, "failed to reconcile LvmVolumeGroup", "name", volumeGroup.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
controllers/lvm_volumegroup.go
Outdated
|
||
for _, vgcr := range vgcrs { | ||
err := r.Client.Get(ctx, types.NamespacedName{Name: vgcr.Name, Namespace: vgcr.Namespace}, existingLvmVg) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blank line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
controllers/lvm_volumegroup.go
Outdated
// already deleted in previous reconcile | ||
if errors.IsNotFound(err) { | ||
r.Log.Info("LVMVG deleted", "Name", vgcr.Name) | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we continue
here rather than returning nil ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
controllers/lvm_volumegroup.go
Outdated
return nil | ||
} | ||
|
||
func (c lvmVG) getLvmVolumeGroups(r *LVMClusterReconciler, instance *lvmv1alpha1.LVMCluster) ([]*lvmv1alpha1.LVMVolumeGroup, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getLvmVolumeGroups
does not return any error. Should remove error
from the return types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
controllers/lvm_volumegroup.go
Outdated
if err != nil { | ||
// already deleted in previous reconcile | ||
if errors.IsNotFound(err) { | ||
r.Log.Info("LVMVG deleted", "Name", vgcr.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logs should be more consistent. ensureDeleted
uses LVMVG
in the log messages while ensureCreated
uses LvmVolumeGroup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
if err = r.Client.Delete(ctx, existingLvmVg); err != nil { | ||
r.Log.Error(err, "failed to delete LVMVG", "Name", existingLvmVg.Name) | ||
return err | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This else
block can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
found = true | ||
nodeStatus.Status = status | ||
nodeStatus.Reason = reason | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
break
for loop once NodeName
is found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
nodeStatus.Reason = reason | ||
} | ||
} | ||
if !found { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it will update the Status
when the NodeName
was found
but it has different Status
or Reason
. So we should add status if NodeName not found and update status if NodeName is found but its Status
or Reason
are different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't understand this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are only updating the instance.Status.NodeStatus
in line 307, inside the if !found
block. What if found
is true
but Status
and/or Reason
is different. In this case, the Status
and Reason
will never be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The status is being updated in the original array if the node is found. append should create a new slice which is why it is being assigned again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it.
VolumeGroup: deviceClass.Name, | ||
Default: true, | ||
}) | ||
status = "unavailable" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Status
should be used as constants (enums) so that its easier to know what all statuses are available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That can be taken up in a different PR.
The LVM operator will create individual LVMVolumeGroup objects for the deviceClasses defined in the LVMCluster resource. Signed-off-by: N Balachandran <[email protected]>
This commit adds the LVMVolumeGroup CRD and rbac permissions for the LVM controller. Signed-off-by: N Balachandran <[email protected]>
|
||
func (r *VGReconciler) updateStatus(ctx context.Context, status string, reason string, instance *lvmv1alpha1.LVMVolumeGroup) error { | ||
|
||
found := false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still have a situation where multiple controllers will be updating the same LVMVolumeGroup
status, right?
If that's the case, should we first get the latest LVMVolumeGroup
instance before verifying the NodeStatus
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this may still NOT* be enough to ensure that we are using the updated instance of LVMVolumeGroup
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is not the latest version, I expect the update to fail. I will confirm after testing on a multi node cluster.
Default: true, | ||
}) | ||
status = "unavailable" | ||
reason = "failed to create VG" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reason
for unavailable
starts with small f
, while that of degraded
starts with capital F
|
||
lvmVolumeGroups := c.getLvmVolumeGroups(r, lvmCluster) | ||
|
||
for _, volumeGroup := range lvmVolumeGroups { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating the LVMVolumeGroup CRs from the LVMCluster CR is still a design concern. The admin should create the LVMVolumeGroup CRs directly. Not a blocker for feature complete, but we really need to discuss this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree. The admin should create the LVMCluster CR and the controller should create the LVMVolumeGroup CRs. The intention is to have a single CR for the controller to create the internal pieces from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still haven't seen a design backing this decision, so my position remains to disagree with it so we can follow standard K8s patterns.
This commit adds the LVMVolumeGroupNodeStatus CRD and related files. It also removes the NodeStatus from the LVMVolumegroup.Status as that is now part of LVMVolumeGroupNodeStatus. Signed-off-by: N Balachandran <[email protected]>
This commit adds the rbac permissions required by the vgmanager for LVMVolumeGroup and LVMVolumeGroupNodeStatus. Signed-off-by: N Balachandran <[email protected]>
The vgmanager will watch and reconcile the LVMVolumeGroup instead of the LVMCluster CR. This commit also makes minor changes the vgmanager reconcile code. Signed-off-by: N Balachandran <[email protected]>
This commit adds tests to check the creation and deletion of LVMVolumeGroups by the LVM controller. Signed-off-by: N Balachandran <[email protected]>
return ctrl.Result{}, err | ||
} | ||
|
||
r.Log.Info("listing block devices") | ||
//TODO: actually check the node against the nodeSelector. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to match with the node selector anymore? Since the LVMVolumeGroup CR is running on the correct node, should that already handle this check for us?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CR is the object stored in etcd and does not run on a node. The vgmanager pod is expected to check the node it is running on against the nodeselector specified in the LVMVolumeGroup CR to see if it expected to act on it. This is currently not being done.
Q: When we add the functionality to read any new device add events on the nodes, how will we tell all the LVMVolumeGroup CRs associated with the node to reconcile? |
The VG manager will need to reread the CRs once it receives the event. No changes will be made to the CR to trigger a reconcile. |
|
||
for _, volumeGroup := range lvmVolumeGroups { | ||
result, err := cutil.CreateOrUpdate(ctx, r.Client, volumeGroup, func() error { | ||
// no need to mutate any field |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- might not always be true, since deviceSelector isn't implemented fully yet we can get away without not using mutate function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. For now this is fine. The comment can be removed when that is actually done.
remainingValidDevices, delayedDevices, err := r.filterAvailableDevices(blockDevices) | ||
if err != nil { | ||
_ = err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- either don't handle the err at all,
a, b, _ := func
or pls add a comment on why this is being not handled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was the earlier existing code which I have not changed.
} else { | ||
if found { | ||
// Update the status again just to be safe. | ||
if statuserr := r.updateStatus(ctx, status, volumeGroup); statuserr != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- just a question, this has a possibility of conflicts when a vgcr matches multiple nodes right?
- we didn't create the vg yet, why should we update the status as ready or the presence of name in config file is just enough for vg status to be ready?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, because the lvmd.yaml file is only updated once the VG is created. This block is mainly to handle cases when the status might not have been updated correctly. That function can also be modified to not update if there is no change for this entry.
|
||
status := &lvmv1alpha1.VGStatus{ | ||
Name: req.Name, | ||
Status: lvmv1alpha1.VGStatusReady, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- IMHO initial vgstatus should be unknown and we try to set it correctly by the end of reconcile 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be taken up in a separate PR?
if err != nil { | ||
return ctrl.Result{}, err | ||
if err == nil { | ||
err = os.WriteFile(controllers.LvmdConfigFile, out, 0600) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not move this line to 218 and remove this conditional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The checks work like this:
if unmarshall fails,-> return error
If unmarshall succeeds -> attempt to write
if write fails -> return error
Notice that there is no else check - so the err check will work for both scenarios.
if nodeStatus.CreationTimestamp.IsZero() { | ||
vgNodeStatus.DeepCopyInto(nodeStatus) | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- i think
setStatus
already handles this condition by creating a new status when one is not found?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is working on the existing CR in the system so we need to update that object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/lgtm |
@nbalacha: you cannot LGTM your own PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: jmolmo, leelavg, nbalacha, sp98 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
The LVMCluster reconciliation will now create individual LVMVolumeGroup objects for each deviceClass. The VGManager will now reconcile the LVMVolumeGroup instead of the LVMCluster CR.