Conditionally retry a step on failure #1645

Open
jithine opened this issue May 22, 2019 · 7 comments

@jithine
Member

jithine commented May 22, 2019

What happened:

Sometimes a step fails because of an external dependency. There is currently no way to retry the command under different conditions without starting a new build or changing code.

What you expected to happen:

Provide an option to conditionally retry failed steps.

  1. The Screwdriver workflow should provide a step config that specifies a step must be retried.
  2. A retry should support setting different environment variables.
  3. Optionally, a condition should determine whether the retry happens.
  4. For Screwdriver-provided setup & teardown steps, cluster admins should be able to define the retry condition.
    1. Users can optionally specify a retry condition, without the ability to override the command.
  5. Provide a means to easily add retry configuration to multiple steps.

For example

steps:
  sd-setup-scm:
    command: git clone foo bar....
    retry: # object below or just `true`
      condition: $GIT_SHALLOW_CLONE == true # optional
      maxRetry: 3 # optional, default 1
      interval: 3 # optional, default 0 (seconds)
      environment: # optional
        GIT_SHALLOW_CLONE: false
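
To make the proposed semantics concrete, here is a minimal Python sketch of how a runner might interpret the retry keys above. It is not Screwdriver code: the function name `run_with_retry` is hypothetical, the condition is modeled as a Python callable over the environment (the issue does not say how `condition` would be evaluated), and applying `environment` only to retries is one possible reading of the proposal.

```python
import os
import subprocess
import time


def run_with_retry(command, max_retry=1, interval=0, condition=None, environment=None):
    """Run a command and retry it on failure (hypothetical helper, not Screwdriver code).

    command     -- argv list, or a string that is passed to the shell
    max_retry   -- maximum number of retries after the first attempt
    interval    -- seconds to wait before each retry
    condition   -- optional callable(env) -> bool; a retry happens only if it returns True
    environment -- optional variables applied to retries only (an assumption)
    Returns (exit_code, attempts).
    """
    env = dict(os.environ)
    use_shell = isinstance(command, str)
    code = subprocess.run(command, shell=use_shell, env=env).returncode
    attempts = 1
    while code != 0 and attempts <= max_retry:
        # The condition is checked against the environment of the failed attempt.
        if condition is not None and not condition(env):
            break
        env.update(environment or {})
        time.sleep(interval)
        code = subprocess.run(command, shell=use_shell, env=env).returncode
        attempts += 1
    return code, attempts


# Example call mirroring the config above (the command is the issue's placeholder):
# run_with_retry(
#     "git clone foo bar....",
#     max_retry=3,
#     interval=3,
#     condition=lambda env: env.get("GIT_SHALLOW_CLONE") == "true",
#     environment={"GIT_SHALLOW_CLONE": "false"},
# )
```

In this reading the overrides change the environment that the condition sees, so after one retry with `GIT_SHALLOW_CLONE` set to `false` the condition no longer holds and the loop stops, even if `maxRetry` has not been reached.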

How to reproduce it:

N/A

@jithine
Member Author

jithine commented May 22, 2019

This is also related to #1208. When making model changes, we should keep both features in mind.

@catto
Member

catto commented May 28, 2019

I really want this feature 👍 How about adding some useful keys and changing indentation?

steps:
  sd-setup-scm:
    command: git clone foo bar....
    retry: # object below or just `true`
      condition: $GIT_SHALLOW_CLONE == true # optional
      maxRetry: 3 # optional, default 1
      interval: 3 # optional, default 0 (seconds)
      environment: # optional
        GIT_SHALLOW_CLONE: false

@jithine
Member Author

jithine commented May 28, 2019

Adding retry options under retry object makes sense.

@tkyi
Member

tkyi commented Dec 18, 2019

Another ability a user asked for, related to this issue, is the option to specify restarting from a previous job.

@rm-you

rm-you commented Mar 31, 2020

Also -- it would be ideal if condition could be a regex matcher or something for the log output. For example, scanning the output for .*dial tcp: i/o timeout.* (and being able to set that restart config GLOBALLY in our template) would resolve more than 50% of our spurious failures.
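
As a rough illustration of the log-matching idea, the sketch below decides whether to retry by searching a step's captured output for known transient-failure patterns. The names `RETRY_PATTERNS` and `should_retry` are hypothetical and nothing here is an existing Screwdriver feature; the only pattern listed is the one quoted above.

```python
import re

# Patterns that mark a failure as transient. In the scenario described above
# this list would come from a shared template; here it is hard-coded.
RETRY_PATTERNS = [
    re.compile(r".*dial tcp: i/o timeout.*"),
]


def should_retry(log_output, patterns=RETRY_PATTERNS):
    """Return True if any retry pattern occurs anywhere in the step's log output."""
    return any(pattern.search(log_output) for pattern in patterns)
```

Because `search` scans the whole output, the leading and trailing `.*` are not strictly needed; they are kept so the pattern matches the one quoted in the comment.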

@jkusa

jkusa commented Nov 4, 2021

Any update on this feature?

@yakanechi
Contributor

Any progress on this one?

We agree with adding the retry setting to steps as proposed above, since users can then change the setting at will.

On the other hand, sd-setup-scm is a setup step and cannot be arbitrarily changed from screwdriver.yaml.

The main setup steps that can fail due to external dependencies are the image pull in sd-setup-init and the git clone in sd-setup-scm.

However, the sd-setup-scm step has no retry logic for a failed git clone, so even an administrator cannot make it retry automatically.
In addition, a retry has already been added to sd-setup-init, but the status is set to Failed when the image pull fails, so the pull is not retried.

Therefore, we are currently considering the following modifications.

  • Allow administrators to set a retry-enable flag and a retry count for sd-setup-scm via environment variables from custom-environment-variables.yaml
  • Add image pull errors to the failures that are retried in sd-setup-init
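
The first item could look roughly like the sketch below, which derives a retry count from administrator-set environment variables. The variable names `SD_SCM_RETRY_ENABLED` and `SD_SCM_RETRY_COUNT` and the function `scm_retry_count` are invented for illustration; the comment above does not name the variables, and the real step is not written in Python.

```python
def scm_retry_count(env):
    """Return how many times a failed git clone should be retried.

    env is a mapping of environment variables, for example os.environ.
    Both variable names are hypothetical placeholders.
    """
    enabled = env.get("SD_SCM_RETRY_ENABLED", "false").strip().lower() == "true"
    if not enabled:
        return 0
    try:
        count = int(env.get("SD_SCM_RETRY_COUNT", "1"))
    except ValueError:
        # An unparsable count disables retries rather than failing the build.
        return 0
    return max(count, 0)
```

Defaulting to zero retries when the flag is absent keeps the current behaviour for clusters that do not opt in.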
