Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rollout spec validation #358

Closed
jessesuen opened this issue Jan 15, 2020 · 5 comments
Closed

Rollout spec validation #358

jessesuen opened this issue Jan 15, 2020 · 5 comments
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@jessesuen
Copy link
Member

jessesuen commented Jan 15, 2020

There are some static validation checks we can do to prevent updates of rollouts in an invalid way.

We need two things:

  1. an optional validating admission webhook to prevent these rollouts from entering the system.
  2. if the cluster does not support or is not configured to use admission webhooks, for the rollout to have an "InvalidSpec" condition, which is surfaced to the conditions, and subsequently to tools like Argo CD.

Some things to be aware of:

  1. We have to be cognizant that any referenced resources (e.g. Services and VirtualServices) could become invalid out-of-band. So for those class of errors, we do not to need to prevent the rollout admission from happening.
  2. Nothing should prevent the controller from making status updates. For example even if the spec is invalid, the controller should still be able to update status. The validating webhook should not prevent this.
  3. The order in which resources are applied, should not affect the ability to admit rollouts to the system. For example, if a rollout references a non-existant service, this might just be because the service object has not yet been kubectl applied to the system simply because it came lower in the YAML file.
@jessesuen
Copy link
Member Author

Just to elaborate. The checks performed by the validating admission webhook, need to actually be less strict than what is required when validating a rollout as part of reconciliation, which will be stricter and deeper.

For example, the validating admission webhook should not deny admission of a rollout when the rollout references a non-existent service (imagine the scenario when the service is deleted out-of-band). This could cause the controller to stop updating status of the rollout. Instead, during reconciliation, the rollout will have an error condition which identifies the problem of the non-existent service.

@dthomson25 dthomson25 added the enhancement New feature or request label Jan 16, 2020
@jessesuen
Copy link
Member Author

jessesuen commented Jan 21, 2020

Examples of things that would need to be in the validating webhook and static validation during reconciliation:

  • volumeMounts are referencing things that are listed in volumes.

I am assuming we will be able to leverage existing kubernetes static validation libraries for ReplicaSet and/or PodTemplate.

@dthomson25
Copy link
Member

Another example would be preventing a rollout that creates an experiment that never terminates

@jessesuen jessesuen assigned jessesuen and khhirani and unassigned jessesuen Jan 23, 2020
@khhirani khhirani removed their assignment Jan 23, 2020
@jessesuen
Copy link
Member Author

Experiment templates names should be lowercased since we will be having reserved words when referencing the podTemplateHash during experiment/analysis steps.

@jessesuen
Copy link
Member Author

Closing as fixed. Will punt on the validating admission webhook since we now have the tolerant informer preventing users from borking the controller with an invalid spec.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants