Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache [MKP] - Create ServiceAccount, Roles and RoleBindings before running the deployer #3459

Conversation

Ark-kun
Copy link
Contributor

@Ark-kun Ark-kun commented Apr 7, 2020

The MKP postsubmit tests are flaky. I suspect that the reson for the failure is that at the time the deployer job starts, the deployer service account is not bound to the roles yet, so it has no permissions and fails to create any resources.
This commit uses Heml hooks to solve the issue.
See https://helm.sh/docs/topics/charts_hooks/ helm/helm#4884 https://github.com/helm/charts/blob/edecc9842d1a63758c3bd0fc13a5d019e079f656/stable/prometheus-operator/templates/prometheus-operator/admission-webhooks/job-patch/clusterrolebinding.yaml

P.S. In the future we should switch the deployer to Job. Helm has explicit support for running pre-install hooks and will automatically wait for the job to complete and then delete the job and the deployer resources (job, service account, roles, rolebindings) https://github.com/helm/charts/blob/edecc9842d1a63758c3bd0fc13a5d019e079f656/stable/prometheus-operator/templates/prometheus-operator/admission-webhooks/job-patch/job-patchWebhook.yaml

…nning the deployer

The MKP postsubmit tests are flaky. I suspect that the reson for the failure is that at the time the deployer job starts, the deployer service account is not bound to the roles yet, so it has no permissions and fails to create any resources.
This commit uses Heml hooks to solve the issue.
See https://helm.sh/docs/topics/charts_hooks/ helm/helm#4884 https://github.com/helm/charts/blob/edecc9842d1a63758c3bd0fc13a5d019e079f656/stable/prometheus-operator/templates/prometheus-operator/admission-webhooks/job-patch/clusterrolebinding.yaml
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign rmgogogo
You can assign the PR to them by writing /assign @rmgogogo in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubeflow-bot
Copy link

This change is Reviewable

@rmgogogo
Copy link
Contributor

rmgogogo commented Apr 7, 2020

less knowledge on this useage. I'm wondering whether it can work. (MKP only use HELM as template engine). If this change can let the part of yaml be generated ahead, then it can help.

When you mean flaky, do you mean it sometimes pass but sometimes fail? After this PR, it always pass?

@rmgogogo
Copy link
Contributor

rmgogogo commented Apr 7, 2020

FYI, I'm adding the MKP test to presubmit:
https://github.com/kubernetes/test-infra/pull/17124/files#
#3438

@Bobgy
Copy link
Contributor

Bobgy commented Apr 7, 2020

Why doesn't the deployment retry/restart when it fails?

@Ark-kun
Copy link
Contributor Author

Ark-kun commented Apr 7, 2020

Why doesn't the deployment retry/restart when it fails?

Interesting question. It has restartPolicy: Always, so it should restart.

@rmgogogo
Copy link
Contributor

rmgogogo commented Apr 8, 2020

is this one still required after #3422

@rmgogogo
Copy link
Contributor

is this one still required after #3422

Kindly ping.

BTW, the Kustomize path also can't work.

@Ark-kun
Copy link
Contributor Author

Ark-kun commented Apr 21, 2020

is this one still required after #3422

Are you sure you've liked to the correct PR?

@Ark-kun
Copy link
Contributor Author

Ark-kun commented Apr 21, 2020

is this one still required

If the cache is deployed reliably (both MKP and standalone), then this is not needed.

@rmgogogo
Copy link
Contributor

is this one still required

If the cache is deployed reliably (both MKP and standalone), then this is not needed.

Kustomize side is due to I tested it via customized-namespace but ClusterRoleBinding in ClusterScopedResource has namespace value.

MKP side, I didn't test it but this PR may can't help (I'm not 100% sure) as internally it didn't use HELM. Here let's keep this PR here for more test to see if necessary.

@Ark-kun
Copy link
Contributor Author

Ark-kun commented Apr 24, 2020

Closing this experimental PR since the cache deployer seems to work fine as it is.
Also Marketplace might not use Helm and so might not understand these annotations.

@Ark-kun Ark-kun closed this Apr 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants