-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update ControllerModifyVolume Status API #4298
Conversation
sunnylovestiramisu
commented
Oct 17, 2023
- One-line PR description:
- Google doc discussion of resizing status change
- Notes of discussion of modify volume
- Other comments:
406e8eb
to
dc6c59c
Compare
dc6c59c
to
eb58d9f
Compare
/assign @msau42 |
@@ -177,22 +177,27 @@ type PersistentVolumeClaimStatus struct { | |||
... | |||
// The VolumeAttributesClassName string cannot be empty | |||
VolumeAttributesClassName string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this field since it is now moved to actualclassname
... | ||
} | ||
|
||
type ModifyVolumeStatus struct { | ||
ActualClassName string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace Class with VolumeAttributesClass
// +enum | ||
type PersistentVolumeClaimModifyVolumeStatus string | ||
|
||
const ( | ||
// When modify volume is complete, the empty string is set by modify volume controller. | ||
PersistentVolumeClaimNoModifyVolumeInProgress PersistentVolumeClaimModifyVolumeStatus = "" | ||
PersistentVolumeClaimNoModifyVolumeInSucceeded PersistentVolumeClaimModifyVolumeStatus = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's either succeeded or not started. I don't know if we need a const for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference between a nil value and an empty value? the Status field of the ModifyVolumeStatus is pointer type.
... | ||
} | ||
|
||
type ModifyVolumeStatus struct { | ||
ActualClassName string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering whether we should actually make this field an object reference instead of just a name.
I am thinking of the scenario where someone may delete and recreate the VAC with the same name but different content. But actually, if we really want to keep track of the attributes are, we may need to keep track of it in the PV object. @jsafrane @gnufied wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is uuid enough to distinguish instead of doing comparison of the whole VAC object?
Usually the protection controller will prevent the VAC to be deleted but an admin can force delete.
/assign @gnufied |
eb58d9f
to
e75773c
Compare
// +enum | ||
type PersistentVolumeClaimModifyVolumeStatus string | ||
|
||
const ( | ||
// When modify volume is complete, the empty string is set by modify volume controller. | ||
PersistentVolumeClaimNoModifyVolumeInProgress PersistentVolumeClaimModifyVolumeStatus = "" | ||
PersistentVolumeClaimNoModifyVolumeInSucceeded PersistentVolumeClaimModifyVolumeStatus = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference between a nil value and an empty value? the Status field of the ModifyVolumeStatus is pointer type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, If succeed, pvc.ModifyVolumeStatus.ActualClassName should be updated to pvc.ModifyVolumeStatus.TargetClassName and pvc.ModifyVolumeStatus.Status should be changed to nil.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes both the status and actualClassName will be set as you said.
A user creates a pvc with a VolumeAttributesClassName set. if the parameters of the PVC are not accepted by the lower version csi-driver but are accepted by the higher version csi-driver, the status of the PVC may be stuck in infeasible state. If the admin upgrades the csi-driver to the higher version, the PVC how to recover the PVC to user expected state? To modify the PVC to a different VolumeAttributesClass which parameters is same as the previous one is not a good UX for the user. |
I am thinking that even in the Infeasible case, we will still have to periodically retry, but at a lower frequency than if it was InProgress |
The |
e75773c
to
6376b2f
Compare
6376b2f
to
c25c07c
Compare
const ( | ||
... | ||
// PersistentVolumeClaimModifyVolume - a user trigger modify volume of pvc has been started | ||
PersistentVolumeClaimModifyVolume PersistentVolumeClaimConditionType = "ModifyVolume" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having trouble understanding this name of the new condition type. Is this saying that all the moditication of the volume which includes the reszie operation or just for changing volume attributes? If it's the latter, I think we should rename it to something like ModifyVolumeAttributes
or ModifyVolumeAttributesClass
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not have a strong opinion of the naming, I was following the previous naming by using actions instead of a noun. This condition is used to document the last error the controller sees.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed in the sig-storage meeting, we will name it "VolumeAttriburesModifyError"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are not going to change status when we retry ModifyVolume
after failing (and status is set to Infeasible
), then we need to somehow communicate to the user that, we are again retrying ModifyVolume
(albeit slowly), so we will I think need a condition for that too. cc @msau42
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When do you propose we set a ModifyVolume condition to true and false? From api-conventions, it says:
For state transitions which take a long period of time (e.g. more than 1 minute), it is reasonable to treat the transition itself as an observed state. In these cases, the Condition (such as "Resizing") itself should not be transient, and should instead be signalled using the True/False/Unknown pattern. This allows other observers to determine the last update from the controller, whether successful or failed. In cases where the state transition is unable to complete and continued reconciliation is not feasible, the Reason and Message should be used to indicate that the transition failed.
I interpret the last sentence to mean that if we still intend to retry, the status should still be True.
So the main benefits I see of having the condition is:
- that the timestamp can be updated every time we retry an operation.
- we could potentially return false for some preconditions like VAC doesn't exist or is invalid, PVC not bound, etc.
// +enum | ||
// When a controller receives persistentvolume claim update with PersistentVolumeClaimModifyVolumeStatus for a resource | ||
// that it does not recognizes, then it should ignore that update and let other controllers | ||
// handle it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In which case the controller cannot recognize the status value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not happen but like if someone setting their own status in a fork etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment is for telling consumers that how to handle the scenario where we may add new statuses in the future. I think it's simpler to reword it like "New statuses can be added in the future. Consumers should check for unknown statuses and fail appropriately".
type PersistentVolumeClaimModifyVolumeStatus string | ||
|
||
const ( | ||
// When modify volume is complete, the empty string is set by modify volume controller. | ||
PersistentVolumeClaimNoModifyVolumeInProgress PersistentVolumeClaimModifyVolumeStatus = "" | ||
// State set when modify volume controller wants to modify the volume in control-plane |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the modify volume controller
? pv controller in kube-controller-manager or CSI driver controller?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is the resize controller actually, we are putting this modify volume controller logic in the external-resizer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't need to specify exactly which controller is handling what. That's more of an implementation detail. I think we can reword it to say something like "Pending indicates that the PersistentVolumeClaim cannot be modified due to requirements not being met, such as the PersistentVolumeClaim being in an invalid state or the specified VolumeAttributesClass is existing"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto below. I think we can remove references to specific controllers in the comments.
// State set when modify volume controller starts modifying the volume in control-plane | ||
// When the request has been rejected as invalid by the CSI driver. To resolve this error, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here what?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment should apply to the "Infeasible" one, not "InProgress"
// parameters it intends to modify. SPs MUST NOT modify volumes based | ||
// on the absence of keys, only keys that are specified should result | ||
// in modifications to the volume. | ||
map<string, string> mutable_parameters = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m curious why the required and optional keyword is not used in fields in the csi proto file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can add it, but that is the CSI spec update. The CSI Spec release has been cut already. We can revisit at next CSI meeting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually in the CSI spec, there are some extra paragraph I just did not copy pasted here because it is stand alone:
// Plugin specific creation-time parameters passed in as opaque
// key-value pairs. These mutable_parameteres MAY also be
// changed during the lifetime of the volume via a subsequent
// `ControllerModifyVolume` RPC. This field is OPTIONAL.
// The Plugin is responsible for parsing and validating these
// parameters. COs will treat these as opaque.
Please refer to the CSI spec for accuracy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And also in CSI spec:
message ControllerModifyVolumeRequest {
option (alpha_message) = true;
// Contains identity information for the existing volume.
// This field is REQUIRED.
string volume_id = 1;
// Secrets required by plugin to complete modify volume request.
// This field is OPTIONAL. Refer to the `Secrets Requirements`
// section on how to use this field.
map<string, string> secrets = 2 [(csi_secret) = true];
// Plugin specific volume attributes to mutate, passed in as
// opaque key-value pairs.
// This field is REQUIRED. The Plugin is responsible for
// parsing and validating these parameters. COs will treat these
// as opaque. The CO SHOULD specify the intended values of all mutable
// parameters it intends to modify. SPs MUST NOT modify volumes based
// on the absence of keys, only keys that are specified should result
// in modifications to the volume.
map<string, string> mutable_parameters = 3;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we defer this to CSI meeting in general? This PR is focusing on unblocking the k8s API change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in CreateVolume it is optional, but in ModifyVolume it is required. I think that makes sense.
From a k8s POV, if a user doesn't specify any vacName in the PVC, then that results in empty mutable_parameters.
A VAC having empty parameters doesn't make sense. If there is nothing to change then why would someone specify a vacName in their PVC?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am going to remove this session in the doc and please refer to the CSI spec.md
5b996bd
to
e244cc4
Compare
// Condition is used to document the last error controller sees | ||
const ( | ||
... | ||
// PersistentVolumeClaimVolumeAttriburesModifyError - a user trigger modify volume of pvc has been started |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment needs to be updated
// ModifyVolumeStatus represents the status object of ControllerModifyVolume operation | ||
type ModifyVolumeStatus struct { | ||
// ActualVolumeAttributesClassName is the actual name of the VolumeAttributesClass the PVC is using | ||
ActualVolumeAttributesClassName string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of "Actual", I would use "Current" to be consistent with how other resources specify current status (like StatefulSet.status.currentReplicas)
type ModifyVolumeStatus struct { | ||
// ActualVolumeAttributesClassName is the actual name of the VolumeAttributesClass the PVC is using | ||
ActualVolumeAttributesClassName string | ||
// TargetVolumeAttributesClassName is the target name of the VolumeAttributesClass the PVC wants to use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not quite what was requested in the PVC. It is what is currently being reconciled, which may not necessarily match. For example, if user sets PVC.spec.vacName = A, then B, then C, we may still being trying to reconcile B even though the user has set C.
// +enum | ||
// When a controller receives persistentvolume claim update with PersistentVolumeClaimModifyVolumeStatus for a resource | ||
// that it does not recognizes, then it should ignore that update and let other controllers | ||
// handle it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment is for telling consumers that how to handle the scenario where we may add new statuses in the future. I think it's simpler to reword it like "New statuses can be added in the future. Consumers should check for unknown statuses and fail appropriately".
type PersistentVolumeClaimModifyVolumeStatus string | ||
|
||
const ( | ||
// When modify volume is complete, the empty string is set by modify volume controller. | ||
PersistentVolumeClaimNoModifyVolumeInProgress PersistentVolumeClaimModifyVolumeStatus = "" | ||
// State set when modify volume controller wants to modify the volume in control-plane |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't need to specify exactly which controller is handling what. That's more of an implementation detail. I think we can reword it to say something like "Pending indicates that the PersistentVolumeClaim cannot be modified due to requirements not being met, such as the PersistentVolumeClaim being in an invalid state or the specified VolumeAttributesClass is existing"
// State set when modify volume controller starts modifying the volume in control-plane | ||
// When the request has been rejected as invalid by the CSI driver. To resolve this error, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment should apply to the "Infeasible" one, not "InProgress"
type PersistentVolumeClaimModifyVolumeStatus string | ||
|
||
const ( | ||
// When modify volume is complete, the empty string is set by modify volume controller. | ||
PersistentVolumeClaimNoModifyVolumeInProgress PersistentVolumeClaimModifyVolumeStatus = "" | ||
// State set when modify volume controller wants to modify the volume in control-plane |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto below. I think we can remove references to specific controllers in the comments.
@@ -409,12 +449,25 @@ The resource quota controller is the only component capable of monitoring and re | |||
### 3. Add new statuses in PVC API to indicate changes of VolumeAttributesClass and the status of the ModifyVolume operation. | |||
|
|||
``` | |||
type VolumeAttributesClassStatus string | |||
type ModifyVolumeStatus struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would save this for a future cleanup, but I'm not sure why we want to define the api definitions in multiple places in the doc. It means if we want to make changes like this, we need to remember to modify it everywhere.
PersistentVolumeClaimNoModifyVolumeInProgress PersistentVolumeClaimModifyVolumeStatus = "" | ||
// State set when modify volume controller wants to modify the volume in control-plane | ||
// but the VolumeAttributesClass does not exist. | ||
PersistentVolumeClaimControllerModifyVolumePending PersistentVolumeClaimModifyVolumeStatus = "ControllerModifyVolumePending" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if we need the "Controller" in the name since we're not going to support Node operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not going to but will we ever consider supporting node operations?
e244cc4
to
77912d8
Compare
77912d8
to
256bee5
Compare
Accroding to the discussion on the API PR, we also need to change the
On the other hand, we need change the parameters field of the VolumeAttributesClass to required from optional. Users must provide at least one key-value pair for the parameters field. |
// TargetVolumeAttributesClassName is the name of the VolumeAttributesClass the PVC currently being reconciled | ||
TargetVolumeAttributesClassName string | ||
// Status is the status of the ControllerModifyVolume operation | ||
Status *PersistentVolumeClaimModifyVolumeStatus |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use a pointer type? Do we want to distinguish a nil
value and a zero value? If so, why? If not, why not use a non-pointer type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
History of the VAC name being a pointer: https://github.com/kubernetes/enhancements/pull/3780/files#r1230300561
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm actually this should not be a pointer. It should be consistent of the AllocatedResourceStatus
fce58d1
to
5f0a6c5
Compare
5f0a6c5
to
23fa960
Compare
VolumeAttributesClassName string | ||
ModifyVolumeStatus *PersistentVolumeClaimModifyVolumeStatus | ||
// ModifyVolumeStatus represents the status object of ControllerModifyVolume operation | ||
ModifyVolumeStatus ModifyVolumeStatus |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const ( | ||
... | ||
// PersistentVolumeClaimVolumeAttriburesModifyError - a user trigger modify volume of pvc has error | ||
PersistentVolumeClaimVolumeAttriburesModifyError PersistentVolumeClaimConditionType = "VolumeAttriburesModifyError" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also want to add the "VolumeAttributesModifying" condition based on this discussion with @gnufied #4298 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need a modifying? We cannot just say "Err but we are retrying" in the messaging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does an operator that is possibly monitoring the pvc knows volume is being modified or not? By regexp. for message on the condition? Seems fragile.
... | ||
} | ||
|
||
// ModifyVolumeStatus represents the status object of ControllerModifyVolume operation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't want to distinct the ControllerModifyVolume and XXXModifyVolume, Is it possible to drop the Controller prefix in the ModifyVolumeStatus description?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is already dropped in the newest revision?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the ControllerModifyVolume
word still exists in L197 and L201
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: msau42, sunnylovestiramisu The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |