Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Node/Driver labeling mechanism #2

Closed
wants to merge 3 commits into from

Conversation

davidz627
Copy link
Owner

@davidz627 davidz627 commented Oct 25, 2018

What this PR does / why we need it:

This PR adds a field to the CSINodeInfo object to specify whether each specific driver/node combination has migration turned on. This in turn is used by the Attach/Detach controller to decide whether to attach a volume with migration or not.

This superscedes the mechanism described here: https://github.com/kubernetes/community/blame/master/contributors/design-proposals/storage/csi-migration.md#L236

Part of the implementation for feature: kubernetes/enhancements#625

/assign @jsafrane @ddebroy @leakingtapan

@davidz627 davidz627 force-pushed the feature/migrationNodeLabel branch from a2c11af to a521bf1 Compare October 26, 2018 18:27
@davidz627
Copy link
Owner Author

@ddebroy comments addressed!

@davidz627 davidz627 force-pushed the feature/migrationNodeLabel branch from a521bf1 to d3941ee Compare October 26, 2018 18:54
@davidz627 davidz627 force-pushed the feature/migrationNodeLabel branch from d3941ee to 29832e1 Compare October 26, 2018 18:56
@davidz627 davidz627 force-pushed the feature/migrationNodeLabel branch 2 times, most recently from 2512123 to ad8f72b Compare October 26, 2018 21:57
@davidz627 davidz627 force-pushed the feature/migrationNodeLabel branch from ad8f72b to b582653 Compare October 26, 2018 23:50
@davidz627
Copy link
Owner Author

@jsafrane PTAL, I believe this should solve the node/controller race/sync issues

// migration to CSI for this driver. It is used by storage controllers on
// the control plane to determine whether to use a migrated version of the
// plugin for this driver on this node.
IsDriverMigratableOnNode bool `json:"isDriverMigratableOnNode"`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. should you regenerate the clients?

  2. I'd use DriverMigrated (or VolumePluginMigrated to emphasize what's migrated)

  • OnNode it's implied by CSINodeInfo. The other field is no TopologyKeysOnNode, but TopologyKeys.
  • We don't use Is prefix for booleans, like AttachRequired or ReadOnly.
  1. Again, "Migratable" vs "Migrated". Does this field say that now CSI must be used (= driver is migrated, don't attach the old way) or may be used (=migratable, attach any way you want, kubelet will handle that)?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I think the naming has been a little inconsistent.

Going forwards I will refer to anything related to feature flags as "Migrated"
whereas the fact that translation library exists is "Migratable"

The difference being that feature flags is something the admin sets to choose to migrate so it is Migrated. Whereas there is another concept of - does the logic to do this exist (we have translation logic for this plugin) - is Migratable.

Please flag any incorrect usage in code reviews

@ddebroy @leakingtapan

return false, fmt.Errorf("failed to get CSI node info for node %v from informer: %v", string(nodeName), err)
}
} else {
glog.Warningf("CSINodeInfo informer not synced, please check that CSINodeInfo CRD is installed. If so, this warning should not appear again after ~1 minute")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still some crazy race there: CRD gets installed, informer is (slowly) syncing and it gets an old CSINodeInfo. User/kubelet changes CSINodeInfo, the informer is not synced yet and the volume plugin does API GET at this line. It gets new CSINodeInfo.

Now, the informer finishes sync and it still has old CSINodeInfo, with the new one enqueued (or waiting somewhere in API server). Second call to this function does informer.Get(), which returns the old value. It seems the plugin goes back in time.

In this particular case, the informer has old CSINodeInfo for very short time (depending on API server connectivity), so it perhaps does not really matter...

Can we just return error here and kubelet / A/D controller re-tries in a while? Frankly, I think that whole CSI volume plugin should return errors in all calls until the informers are synced.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little confused about the race you described. Are you saying that if informer is synced it may have stale API objects? I was under the impression that synced informer means that any GET to the informer will guarantee to be the freshest object, otherwise what's the point in having an inconsistent cache?

The only two options I see here in the current code are:

  1. Informer synced -> get from informer (freshest object)
  2. Informer not yet synced -> get from API Server (freshest object)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, informers can return old data. It does List + Watch and the watch can have delay on network. That's OK. Informer will let you know when it gets new data.

GET from API server can get old data too - at the time your client parses the data there may be already new on the server. This time, you don't get event when it changes.

In this particular case, GET from API server could be faster than initial List + Watch and return something newer than the informer. Under "normal" conditions the informer syncs very quickly, so this discussion is more or less academical.

return false, fmt.Errorf("plugin %v is migratable but driver %v is not installed. Please install the driver and retry", pluginName, driverName)
}

func isMigratable(og *operationGenerator, spec *volume.Spec, uniqueVolumeName v1.UniqueVolumeName, nodeName types.NodeName) (bool, error) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to isVolumeMigratable? Or isVolumePluginMigratable? It's confusing to see isMigratable and isNodeMigratable.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am doing large scale fix for all naming to be consistent with #2 (comment).

thanks for pointing this out.

@davidz627
Copy link
Owner Author

davidz627 commented Oct 29, 2018

@jsafrane @ddebroy @saad-ali

I've noticed a problem with using CSINodeInfo (this may be the same one Jan mentioned in the comments):
By using CSINodeInfo we have lost some fidelity in understanding the cases of node migration flags and driver installation. In order to solve races in upgrade/downgrade we need to distinguish these cases:

// if yes flag and yes driver -> node is migrated
// if yes flag and no driver -> throw an error about installing driver
// if no flag and yes driver -> node is not migrated
// if no flag and no driver -> node is not migrated 

However, by using the CSINodeInfo object we lost the ability to distinguish the two cases where there is no driver (since there will be no corresponding CSINodeInfo object for that driver) and we want different behavior for the two cases.

I can see this being solved by pre-creating CSIDriverInfo for each migrated driver, and adding a field DriverInstalled bool to the CSIDriverInfo object that is false on creation, and is flipped to true when the driver is installed. In this case we will be able to see whether the migration flags are set or not whether the driver is installed or not and we can distinguish the above cases.

We would have to change a lot of unrelated (to migration) logic to instead of parsing the CSIDriverInfo existence as a proxy for driver installation, to specifically checking the DriverInstalled field on the object.

@jsafrane
Copy link

I am slightly confused by CSIDriverInfo - do you mean global CSIDriver or node specific CSINodeInfo? I'm assuming CSIDriver, CSINodeInfo makes little sense.

I am not sure that a flag in CSIDriver would help. What if CSIDriver itself does not exist? It's still optional.

I think we should take a step back and re-think how a driver is installed and how each Kubernetes component knows about it.

@davidz627
Copy link
Owner Author

I am slightly confused by CSIDriverInfo - do you mean global CSIDriver or node specific CSINodeInfo? I'm assuming CSIDriver, CSINodeInfo makes little sense.

I am not sure that a flag in CSIDriver would help. What if CSIDriver itself does not exist? It's still optional.

I think we should take a step back and re-think how a driver is installed and how each Kubernetes component knows about it.

I mean the CSIDriverInfo object that is inside CSINodeInfo.

@davidz627
Copy link
Owner Author

This PR is on hold until CSINodeInfo changes go in

davidz627 added a commit that referenced this pull request Jan 17, 2019
@davidz627
Copy link
Owner Author

/close
replaced by kubernetes#74835

@davidz627 davidz627 closed this Mar 5, 2019
@davidz627 davidz627 deleted the feature/migrationNodeLabel branch March 5, 2019 19:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants