Workflow with PodDisruptionBudgetSpec fails #10649

Closed
2 of 3 tasks
shibataka000 opened this issue Mar 7, 2023 · 7 comments · Fixed by #10712
Labels: P1 (high priority), type/bug

Comments

@shibataka000
Contributor

shibataka000 commented Mar 7, 2023

Pre-requisites

  • I have double-checked my configuration
  • I can confirm the issue exists when I tested with :latest
  • I'd like to contribute the fix myself (see contributing guide)

What happened/what you expected to happen?

Workflow with PodDisruptionBudgetSpec fails in Kubernetes v1.25. No pod is created.

When I follow the same steps on Kubernetes v1.24, it works correctly.

The policy/v1beta1 API version of PodDisruptionBudget is no longer served as of v1.25, which might be causing this issue.
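
One way to check this on a given cluster is to look at which `policy` group versions the API server serves. The following is a minimal sketch, assuming the output of `kubectl api-versions` (one `group/version` per line) is piped in. The helper name `pick_pdb_api` is hypothetical and is not part of kubectl or Argo Workflows.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl or Argo Workflows).
# Reads served group/versions on stdin, one per line, and prints the
# PodDisruptionBudget API version to use: policy/v1 if it is served,
# otherwise policy/v1beta1, otherwise "none" with a non-zero exit status.
pick_pdb_api() {
  versions=$(cat)
  if printf '%s\n' "$versions" | grep -qx 'policy/v1'; then
    echo 'policy/v1'
  elif printf '%s\n' "$versions" | grep -qx 'policy/v1beta1'; then
    echo 'policy/v1beta1'
  else
    echo 'none'
    return 1
  fi
}

# Against a live cluster:
#   kubectl api-versions | pick_pdb_api
```

If `policy/v1beta1` is missing from the `kubectl api-versions` output, any client that still requests that version would get a "could not find the requested resource" style error, which would be consistent with the message reported here.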

Steps to reproduce issue

# 1. Create k8s cluster by kind.
$ kind create cluster --image kindest/node:v1.25.3

# 2. Install Argo Workflows.
$ kubectl create namespace argo
$ kubectl apply -n argo -f https://github.com/argoproj/argo-workflows/releases/download/v3.4.5/install.yaml

# 3. Submit workflow.
$ argo submit https://raw.githubusercontent.com/argoproj/argo-workflows/fcf4e9929a411a7c6083e67c1c37e9c798e4c7d9/examples/default-pdb-support.yaml
Name:                default-pdb-support-tdtsq
Namespace:           default
ServiceAccount:      default
Status:              Pending
Created:             Tue Mar 07 07:50:18 +0000 (now)
Progress:

# 4. It fails.
$ argo get default-pdb-support-tdtsq
Name:                default-pdb-support-tdtsq
Namespace:           default
ServiceAccount:      default
Status:              Failed
Message:             Unable to create PDB resource for workflow, default-pdb-support-tdtsq error: the server could not find the requested resource
Conditions:
 Completed           True
Created:             Tue Mar 07 07:50:18 +0000 (40 seconds ago)
Started:             Tue Mar 07 07:50:18 +0000 (40 seconds ago)
Finished:            Tue Mar 07 07:50:18 +0000 (40 seconds ago)
Duration:            0 seconds
Progress:            0/0

# 5. No pod is created.
$ kubectl get pod
No resources found in default namespace.

Version

v3.4.5 or latest

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

This is the same as default-pdb-support.yaml.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: default-pdb-support-
spec:
  entrypoint: pdbcreate
  serviceAccountName: default
  podDisruptionBudget:
    minAvailable: "9999"
  templates:
  - name: pdbcreate
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["sleep 10"]
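
To separate a cluster-side problem from a controller-side one, it may help to confirm that the cluster accepts a hand-written `policy/v1` PodDisruptionBudget. The sketch below only prints such a manifest. The helper name `render_pdb` is hypothetical, and the selector label key `workflows.argoproj.io/workflow` is an assumption about how workflow pods are labelled, not something stated in this issue.

```shell
#!/bin/sh
# Hypothetical helper: prints a policy/v1 PodDisruptionBudget manifest.
#   $1 = PDB name (also used as the workflow name in the selector)
#   $2 = minAvailable (an integer, or a percentage such as 50%)
# ASSUMPTION: the label key below is how workflow pods are selected.
render_pdb() {
  cat <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: $1
spec:
  minAvailable: $2
  selector:
    matchLabels:
      workflows.argoproj.io/workflow: $1
EOF
}

# Validate against the API server without creating anything:
#   render_pdb pdb-check 1 | kubectl apply --dry-run=server -f -
```

If the dry run succeeds on v1.25 while the workflow above still fails, that would point at the API version the controller requests rather than at the PDB spec itself.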

Logs from the workflow controller

time="2023-03-07T07:50:18.142Z" level=info msg="Processing workflow" namespace=default workflow=default-pdb-support-tdtsq
time="2023-03-07T07:50:18.147Z" level=info msg="Updated phase  -> Running" namespace=default workflow=default-pdb-support-tdtsq
time="2023-03-07T07:50:18.158Z" level=info msg="Updated phase Running -> Failed" namespace=default workflow=default-pdb-support-tdtsq
time="2023-03-07T07:50:18.158Z" level=info msg="Updated message  -> Unable to create PDB resource for workflow, default-pdb-support-tdtsq error: the server could not find the requested resource" namespace=default workflow=default-pdb-support-tdtsq
time="2023-03-07T07:50:18.158Z" level=info msg="Marking workflow completed" namespace=default workflow=default-pdb-support-tdtsq
time="2023-03-07T07:50:18.159Z" level=info msg="Deleted PDB resource for workflow." namespace=default workflow=default-pdb-support-tdtsq
time="2023-03-07T07:50:18.160Z" level=info msg="Checking daemoned children of " namespace=default workflow=default-pdb-support-tdtsq
time="2023-03-07T07:50:18.166Z" level=info msg="cleaning up pod" action=deletePod key=default/default-pdb-support-tdtsq-1340600742-agent/deletePod
time="2023-03-07T07:50:18.169Z" level=info msg="Workflow update successful" namespace=default phase=Failed resourceVersion=891 workflow=default-pdb-support-tdtsq
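
To pull just the PDB failures out of a longer controller log, a small filter along these lines may be enough. It is a sketch that assumes the log line layout shown above (a leading `time="..."` field and a trailing `workflow=...` field). The helper name `pdb_failures` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads workflow-controller log lines on stdin and
# prints "<workflow> <timestamp>" for each line that reports a failed
# PDB creation. Assumes the line ends with a workflow=<name> field.
pdb_failures() {
  grep 'Unable to create PDB resource' |
    sed -n 's/^time="\([^"]*\)".* workflow=\([^ ]*\)$/\2 \1/p'
}

# Against a live cluster (ASSUMPTION: the controller runs as a Deployment
# named workflow-controller in the argo namespace):
#   kubectl logs -n argo deployment/workflow-controller | pdb_failures
```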

Logs from your workflow's wait container

None. No pod was created.

@tico24
Member

tico24 commented Mar 14, 2023

For reference, this is one of the examples: https://github.com/argoproj/argo-workflows/blob/master/examples/default-pdb-support.yaml

To add to this, the UI throws the same error:
[image]

Setting minAvailable to an int (which is what I'd expect it to be) also gives the same result.

The k8s docs and the workflows docs imply that a selector might be mandatory, although an empty one should work. I tested this and got the same result.

So yes, I can confirm that this doesn't work in k8s 1.25+

@JPZ13 JPZ13 added the P1 High priority. All bugs with >=5 thumbs up that aren’t P0, plus: Any other bugs deemed high priority label Mar 16, 2023
@terrytangyuan
Member

This should be fixed in #10712

terrytangyuan added a commit that referenced this issue Mar 20, 2023
@terrytangyuan terrytangyuan self-assigned this Mar 20, 2023
@as42sl

as42sl commented Mar 20, 2023

Thanks for the fix! Can you please create a new release with this fix? That would be really great!

@terrytangyuan
Member

I am not sure if there's any plan yet, but for now it should be available in the latest tag: https://quay.io/repository/argoproj/workflow-controller?tab=history. cc @alexec @sarabala1979

@terrytangyuan
Member

BTW, minAvailable needs to be numeric. Fixed in the example #10715

@tico24
Member

tico24 commented Mar 20, 2023

> BTW, minAvailable needs to be numeric. Fixed in the example #10715

The docs say string or int is acceptable. String seems crazy to me, but that's what the docs say!
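
For context, `minAvailable` is an IntOrString field in the Kubernetes API, and as far as I understand the validation, the string form is meant for percentages such as "50%", so a quoted "9999" would be a string that is not a percentage. The sketch below encodes that reading. It is an assumption about the validation rules rather than a copy of them, and the helper name `valid_min_available` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if $1, written as it would appear in YAML,
# looks like an acceptable minAvailable value.
# ASSUMPTION (my reading of the Kubernetes validation, not a copy of it):
#   - an unquoted non-negative integer is accepted
#   - a string is accepted only as a percentage from 0% to 100%
valid_min_available() {
  v=$1
  case "$v" in
    \"*\")
      # Quoted YAML scalar: a string, so it must be a percentage.
      v=${v#\"}
      v=${v%\"}
      case "$v" in *%) ;; *) return 1 ;; esac
      ;;
  esac
  case "$v" in
    ''|*[!0-9%]*) return 1 ;;   # empty, or characters other than digits and %
    *%*)
      n=${v%\%}                 # strip one trailing percent sign
      case "$n" in ''|*%*) return 1 ;; esac
      [ "$n" -le 100 ]
      ;;
    *) return 0 ;;              # plain non-negative integer
  esac
}
```

Under these assumptions, `valid_min_available 9999` and `valid_min_available '50%'` succeed, while `valid_min_available '"9999"'` fails, which would match the change from the quoted to the numeric value in the example.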

terrytangyuan added a commit that referenced this issue Mar 30, 2023
JPZ13 pushed a commit to pipekit/argo-workflows that referenced this issue Jul 4, 2023
@souravsingh-tivo

Message: Unable to create PDB resource for workflow, pod-name error: the server could not find the requested resource
