timeout applied to taskrun from pipelinerun does not work #3038

Closed
VeereshAradhya opened this issue Jul 31, 2020 · 13 comments
Labels
kind/bug: Categorizes issue or PR as related to a bug.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@VeereshAradhya

Expected Behavior

The timeout given to a pipelinerun should be applied to its taskruns, and a taskrun should fail when it reaches the timeout.

Actual Behavior

The timeout given to the pipelinerun is applied to the taskrun, but the taskrun does not fail when it reaches the timeout.

Steps to Reproduce the Problem

  1. Create a task and a pipeline that uses the task
  2. Make sure that the task runs for 10 mins
  3. Start the pipeline with a timeout of 5 mins
  4. Observe the taskrun and pipelinerun status after 6 mins

Additional Info

Command logs:

$ tkn pr describe pipeline-test-run-5n9rx 
Name:              pipeline-test-run-5n9rx
Namespace:         veeresh-testing
Pipeline Ref:      pipeline-test
Service Account:   pipeline
Timeout:           5m0s
Labels:
 tekton.dev/pipeline=pipeline-test

🌡️  Status

STARTED         DURATION    STATUS
5 minutes ago   5 minutes   Failed(PipelineRunTimeout)

💌 Message

PipelineRun "pipeline-test-run-5n9rx" failed to finish within "5m0s"

📦 Resources

 No resources

⚓ Params

 No params

🗂  Taskruns

 NAME                                              TASK NAME         STARTED         DURATION   STATUS
 ∙ pipeline-test-run-5n9rx-run-script-1-f4bfd      run-script-1      5 minutes ago   ---        Running
 ∙ pipeline-test-run-5n9rx-run-script-fail-vn6z8   run-script-fail   5 minutes ago   ---        Running

$ tkn tr describe pipeline-test-run-5n9rx-run-script-1-f4bfd 
Name:              pipeline-test-run-5n9rx-run-script-1-f4bfd
Namespace:         veeresh-testing
Task Ref:          first-task
Service Account:   pipeline
Timeout:           5m0s
Labels:
 app.kubernetes.io/managed-by=tekton-pipelines
 tekton.dev/pipeline=pipeline-test
 tekton.dev/pipelineRun=pipeline-test-run-5n9rx
 tekton.dev/pipelineTask=run-script-1
 tekton.dev/task=first-task

🌡️  Status

STARTED       DURATION     STATUS
2 hours ago   20 minutes   Failed(TaskRunTimeout)

Message

TaskRun "pipeline-test-run-5n9rx-run-script-1-f4bfd" failed to finish within "5m0s"

📨 Input Resources

 No input resources

📡 Output Resources

 No output resources

⚓ Params

 No params

🦶 Steps

 NAME         STATUS
 ∙ step-one   Running

🚗 Sidecars

No sidecars

  • Kubernetes version:

    Output of kubectl version:

    Client Version: version.Info{Major:"1", Minor:"10+", GitVersion:"v1.10.0+d4cacc0", GitCommit:"d4cacc0", GitTreeState:"clean", BuildDate:"2020-01-29T21:26:39Z", GoVersion:"go1.14beta1", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.3+6025c28", GitCommit:"6025c28", GitTreeState:"clean", BuildDate:"2020-07-01T23:26:48Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}
    
  • Tekton Pipeline version:

    Output of tkn version or kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'

    tkn version
    Client version: 0.11.0
    Pipeline version: v0.14.3
    Triggers version: v0.6.1
    
VeereshAradhya added the kind/bug label on Jul 31, 2020
@danielhelfand
Member

danielhelfand commented Jul 31, 2020

I suspect this is an issue in Pipelines, since tkn is only responsible for setting the timeout value on the PipelineRun and Pipelines is responsible for enforcing timeouts. We can still take a look at how this occurs server side and whether tkn can help address it.

If you could please share all your resources (e.g. Tasks, Pipeline, etc.) and how you started the PipelineRun with tkn, that would be helpful.

Edit: Just noting this was originally opened in the cli repo, but it looks like someone transferred it over.

@VeereshAradhya
Author

task:

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: first-task
spec:
  description: This is cluster task
  steps:
    - name: step-one
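      image: python
      script: |
        #!/usr/bin/env python3
        import time
        # 60 iterations x 10 s sleep: the step runs for about 10 minutes
        for i in range(60):
          time.sleep(10)
          print("sleeping for 10 seconds", flush=True)
          print("this is first task", flush=True)
        # Fail the step if the loop is ever allowed to finish
        raise Exception()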

Pipeline:

apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: pipeline-test
spec:
  tasks:
    - name: run-script-1
      taskRef: 
        name: first-task
        kind: Task

Command used to start the pipeline: tkn p start pipeline-test --timeout 5m
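
As a sanity check on this reproduction, the step's nominal runtime can be compared against the --timeout value. The sketch below is illustrative only: parse_go_duration and nominal_runtime are hypothetical helpers, the parser handles just the integer h/m/s components seen in this thread (not the full Go duration syntax), and the runtime ignores image pull and pod startup time.

```python
import re
from datetime import timedelta

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_go_duration(text: str) -> timedelta:
    """Parse a simple Go-style duration such as "5m" or "12m0s".

    Hypothetical helper: supports only integer h/m/s components.
    """
    parts = re.findall(r"(\d+)([hms])", text)
    # Reject anything the simplified pattern did not fully consume.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(seconds=sum(int(n) * _UNIT_SECONDS[u] for n, u in parts))


def nominal_runtime(iterations: int, sleep_seconds: int) -> timedelta:
    """Lower bound on the script's runtime: iterations x sleep per iteration."""
    return timedelta(seconds=iterations * sleep_seconds)


timeout = parse_go_duration("5m")   # value passed to tkn p start --timeout
runtime = nominal_runtime(60, 10)   # range(60) with time.sleep(10)

# True means the step cannot finish before the timeout elapses.
print(runtime, timeout, runtime > timeout)
```

Under these assumptions the step needs at least 10 minutes, so a 5 minute timeout should stop it roughly halfway through, which is what step 2 and step 3 of the reproduction rely on.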

@bobcatfish
Collaborator

I'm looking into the timeout logic right now, so I can take a look at this!

/assign

@danielhelfand
Member

One thing I have noticed is that TaskRun pods seem to be deleted when a timeout occurs. I assume this affects how the TaskRun statuses are updated.

@danielhelfand
Member

FWIW though, I can't reproduce this with the task/pipeline/start command above.

@pritidesai
Member

I am not able to reproduce this on pipeline master. @VeereshAradhya, I notice you have an additional task, pipeline-test-run-5n9rx-run-script-fail-vn6z8, in your pipeline 🤔

$ tkn p start pipeline-test --timeout 5m
Pipelinerun started: pipeline-test-run-78s2n

$ tkn pipelinerun logs pipeline-test-run-78s2n -f -n default
[run-script-1 : step-one] sleeping for 10 seconds
[run-script-1 : step-one] this is first task
[run-script-1 : step-one] sleeping for 10 seconds
...
[run-script-1 : step-one] rpc error: code = Unknown desc = Error: No such container: 67695bba4458d9e3f1a5e10fab4d5846406cf819ddcd3d9afe7e574b9b2caebc

$ tkn pr describe pipeline-test-run-78s2n
Name:           pipeline-test-run-78s2n
Namespace:      default
Pipeline Ref:   pipeline-test
Timeout:        5m0s
Labels:
 tekton.dev/pipeline=pipeline-test

🌡️  Status

STARTED         DURATION    STATUS
5 minutes ago   5 minutes   Failed(PipelineRunTimeout)

💌 Message

PipelineRun "pipeline-test-run-78s2n" failed to finish within "5m0s" (TaskRun "pipeline-test-run-78s2n-run-script-1-27797" failed to finish within "5m0s")

🗂  Taskruns

 NAME                                           TASK NAME      STARTED         DURATION    STATUS
 ∙ pipeline-test-run-78s2n-run-script-1-27797   run-script-1   5 minutes ago   5 minutes   Failed(TaskRunTimeout)

$ tkn tr describe pipeline-test-run-78s2n-run-script-1-27797
Name:        pipeline-test-run-78s2n-run-script-1-27797
Namespace:   default
Task Ref:    first-task
Timeout:     5m0s
Labels:
 app.kubernetes.io/managed-by=tekton-pipelines
 tekton.dev/pipeline=pipeline-test
 tekton.dev/pipelineRun=pipeline-test-run-78s2n
 tekton.dev/pipelineTask=run-script-1
 tekton.dev/task=first-task

🌡️  Status

STARTED         DURATION    STATUS
6 minutes ago   5 minutes   Failed(TaskRunTimeout)

Message

TaskRun "pipeline-test-run-78s2n-run-script-1-27797" failed to finish within "5m0s"

@VeereshAradhya
Author

@pritidesai @danielhelfand When I filed the bug, I was able to reproduce the issue with the spec I provided. When I checked today with the same spec, I could not reproduce it, so the issue does not reproduce every time. After repeated trials today, I was able to reproduce it with the specs below.
Command logs:

$ tkn pr ls
NAME                      STARTED          DURATION     STATUS
pipeline-test-run-9dqtw   22 minutes ago   12 minutes   Failed(PipelineRunTimeout)
pipeline-test-run-bv8vs   46 minutes ago   12 minutes   Failed(PipelineRunTimeout)
$ 
$ 
$ tkn pr describe pipeline-test-run-9dqtw
Name:              pipeline-test-run-9dqtw
Namespace:         veeresh-testing
Pipeline Ref:      pipeline-test
Service Account:   pipeline
Timeout:           12m0s
Labels:
 tekton.dev/pipeline=pipeline-test

🌡️  Status

STARTED          DURATION     STATUS
23 minutes ago   12 minutes   Failed(PipelineRunTimeout)

💌 Message

PipelineRun "pipeline-test-run-9dqtw" failed to finish within "12m0s" (TaskRun "pipeline-test-run-9dqtw-run-script-1-fk7g4" failed to finish within "12m0s")

📦 Resources

 No resources

⚓ Params

 No params

🗂  Taskruns

 NAME                                           TASK NAME      STARTED          DURATION     STATUS
 ∙ pipeline-test-run-9dqtw-run-script-1-fk7g4   run-script-1   23 minutes ago   20 minutes   Failed(TaskRunTimeout)
 ∙ pipeline-test-run-9dqtw-run-script-2-fhrdv   run-script-2   23 minutes ago   12 minutes   Failed(TaskRunTimeout)
$ 
$ tkn tr describe pipeline-test-run-9dqtw-run-script-1-fk7g4
Name:              pipeline-test-run-9dqtw-run-script-1-fk7g4
Namespace:         veeresh-testing
Task Ref:          first-task
Service Account:   pipeline
Timeout:           12m0s
Labels:
 app.kubernetes.io/managed-by=tekton-pipelines
 tekton.dev/pipeline=pipeline-test
 tekton.dev/pipelineRun=pipeline-test-run-9dqtw
 tekton.dev/pipelineTask=run-script-1
 tekton.dev/task=first-task

🌡️  Status

STARTED          DURATION     STATUS
23 minutes ago   20 minutes   Failed(TaskRunTimeout)

Message

TaskRun "pipeline-test-run-9dqtw-run-script-1-fk7g4" failed to finish within "12m0s"

📨 Input Resources

 No input resources

📡 Output Resources

 No output resources

⚓ Params

 No params

📝 Results

 No results

🦶 Steps

 NAME         STATUS
 ∙ step-one   Running

🚗 Sidecars

No sidecars
$ 
$ 
$ 
$ tkn pr describe pipeline-test-run-9dqtw-run-script-2-fhrdv
Error: failed to find pipelinerun "pipeline-test-run-9dqtw-run-script-2-fhrdv"
$ 
$ 
$ 
$ 
$ tkn pr describe pipeline-test-run-bv8vs
Name:              pipeline-test-run-bv8vs
Namespace:         veeresh-testing
Pipeline Ref:      pipeline-test
Service Account:   pipeline
Timeout:           12m0s
Labels:
 tekton.dev/pipeline=pipeline-test

🌡️  Status

STARTED          DURATION     STATUS
47 minutes ago   12 minutes   Failed(PipelineRunTimeout)

💌 Message

PipelineRun "pipeline-test-run-bv8vs" failed to finish within "12m0s" (TaskRun "pipeline-test-run-bv8vs-run-script-1-vnp4f" failed to finish within "12m0s")

📦 Resources

 No resources

⚓ Params

 No params

🗂  Taskruns

 NAME                                           TASK NAME      STARTED          DURATION     STATUS
 ∙ pipeline-test-run-bv8vs-run-script-1-vnp4f   run-script-1   47 minutes ago   20 minutes   Failed(TaskRunTimeout)
 ∙ pipeline-test-run-bv8vs-run-script-2-xnwqq   run-script-2   47 minutes ago   12 minutes   Failed(TaskRunTimeout)
$ 
$ 
$ tkn tr describe pipeline-test-run-bv8vs-run-script-1-vnp4f
Name:              pipeline-test-run-bv8vs-run-script-1-vnp4f
Namespace:         veeresh-testing
Task Ref:          first-task
Service Account:   pipeline
Timeout:           12m0s
Labels:
 app.kubernetes.io/managed-by=tekton-pipelines
 tekton.dev/pipeline=pipeline-test
 tekton.dev/pipelineRun=pipeline-test-run-bv8vs
 tekton.dev/pipelineTask=run-script-1
 tekton.dev/task=first-task

🌡️  Status

STARTED          DURATION     STATUS
47 minutes ago   20 minutes   Failed(TaskRunTimeout)

Message

TaskRun "pipeline-test-run-bv8vs-run-script-1-vnp4f" failed to finish within "12m0s"

📨 Input Resources

 No input resources

📡 Output Resources

 No output resources

⚓ Params

 No params

📝 Results

 No results

🦶 Steps

 NAME         STATUS
 ∙ step-one   Running

🚗 Sidecars

No sidecars
$ 
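
To make the overrun in the logs above easier to spot, the Taskruns table can be checked mechanically. The following is a rough sketch, not part of tkn: parse_human_duration and overrunning_taskruns are hypothetical helpers, the column layout is assumed to match the output shown in this thread, and the DURATION values printed by tkn are rounded, so the comparison is approximate.

```python
import re
from datetime import timedelta
from typing import List

# Rows copied from the Taskruns table of the tkn pr describe output above.
TASKRUN_ROWS = """\
 ∙ pipeline-test-run-9dqtw-run-script-1-fk7g4   run-script-1   23 minutes ago   20 minutes   Failed(TaskRunTimeout)
 ∙ pipeline-test-run-9dqtw-run-script-2-fhrdv   run-script-2   23 minutes ago   12 minutes   Failed(TaskRunTimeout)
"""

_UNITS = {"second": 1, "minute": 60, "hour": 3600}


def parse_human_duration(text: str) -> timedelta:
    """Parse an approximate duration such as "20 minutes" (assumed format)."""
    match = re.fullmatch(r"(\d+) (second|minute|hour)s?", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(seconds=int(match.group(1)) * _UNITS[match.group(2)])


def overrunning_taskruns(rows: str, timeout: timedelta) -> List[str]:
    """Return names of TaskRuns whose DURATION column exceeds the timeout."""
    names = []
    for line in rows.splitlines():
        # Drop the leading bullet and padding, then split on runs of 2+ spaces.
        cols = re.split(r"\s{2,}", re.sub(r"^\W+", "", line).rstrip())
        if len(cols) != 5 or cols[3] == "---":
            continue  # skip blank lines and runs with no duration yet
        name, duration = cols[0], cols[3]
        if parse_human_duration(duration) > timeout:
            names.append(name)
    return names


# Timeout of 12m0s, as reported by tkn pr describe for this PipelineRun.
print(overrunning_taskruns(TASKRUN_ROWS, timedelta(minutes=12)))
```

With the rows above, only the run-script-1 TaskRun is flagged: it reports 20 minutes against a 12m0s timeout, while run-script-2 reports exactly 12 minutes.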

pipeline:

apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: pipeline-test
spec:
  tasks:
    - name: run-script-1
      taskRef: 
        name: first-task
        kind: Task
    - name: run-script-2
      taskRef:
        name: second-task
        kind: Task
  finally:
    - name: final-task
      taskRef:
        name: finally-task
        kind: Task

Task:
first-task, second-task, and finally-task use the same spec with different names

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: finally-task
spec:
  description: This is cluster task
  steps:
    - name: step-one
      image: python  
      script: |
        #!/usr/bin/env python3
        import time
        # 60 iterations x 20 s sleep: the step runs for about 20 minutes
        for i in range(60):
          time.sleep(20)
          print("sleeping for 20 seconds", flush=True)
          print("this is finally task", flush=True)

@jerop
Member

jerop commented Oct 23, 2020

/assign

bobcatfish removed their assignment on Nov 3, 2020
@jerop
Member

jerop commented Jan 19, 2021

@VeereshAradhya I tried but couldn't reproduce this issue. Is this still a problem for you?

@tekton-robot
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

tekton-robot added the lifecycle/stale label on Apr 19, 2021
@tekton-robot
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

tekton-robot added the lifecycle/rotten label and removed the lifecycle/stale label on May 19, 2021
@tekton-robot
Collaborator

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen with a justification.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.
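
The lifecycle thresholds quoted by the bot (90d to stale, then 30d to rotten, then 30d to close) can be checked against the dates in this thread. This is a small worked calculation, assuming the bot acts exactly when each threshold is reached and that the last human comment (Jan 19, 2021) counts as the last activity; the thread itself does not show the closing date.

```python
from datetime import date, timedelta

last_activity = date(2021, 1, 19)                # last human comment in the thread
stale_on = last_activity + timedelta(days=90)    # "Issues go stale after 90d"
rotten_on = stale_on + timedelta(days=30)        # "rot after an additional 30d"
close_on = rotten_on + timedelta(days=30)        # "close after an additional 30d"

print(stale_on, rotten_on, close_on)
```

The first two computed dates match the Apr 19, 2021 and May 19, 2021 label events recorded in the thread; the third is only the projected closing date under the assumptions above.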

@tekton-robot
Collaborator

@tekton-robot: Closing this issue.

