[FEATURE]: Different exit codes for different failure modes in Garden commands #3297
Comments
Thanks for reporting this @anna-yn! We'll take a look soon.
This has been partially fixed in #3309. OOMs are now reported properly when running tests with artifacts. Also, non-zero exit codes are logged in the service sections. If an exit code is not available, Garden still returns the default one which is
Partially fixed in #3388. Reopening again, because it's still necessary to test and verify the error cases from the list above. @anna-yn could you check and verify the cases from the list in the description, please?
This should be fixed now
This fails with Garden's internal
Please let us know if there are any missing details. It would be very helpful to share the example error messages and the expected output for the cases below:
This issue has been automatically marked as stale because it hasn't had any activity in 90 days. It will be closed in 14 days if no further activity occurs (e.g. changing labels, comments, commits, etc.). Please feel free to tag a maintainer and ask them to remove the label if you think it doesn't apply. Thank you for submitting this issue and helping make Garden a better product!
We have a similar use case. We are using If a job fails because of a genuine test failure, then we just want to fail the pipeline as usual. However, if a job fails because of a garden deployment error (which could be caused by any sort of k8s problem), then we would like to retry the pipeline job, because there is a good chance that it will pass on retry. If
Feature Request
Background / Motivation
We're using Garden for CI and inner loop development, so we utilize the `garden test` and the `garden deploy` commands pretty heavily. There are a few reasons that these commands might fail:
We would like to be able to automatically retry some of these failures in CI. For example, if it's something unrelated to the test run (the helm chart couldn't be pulled, or the Garden namespace couldn't be created), we'd like to retry twice because we noticed that those are usually flakes.
An easy way to achieve this would be to set retry rules based on the exit code of the Garden commands. Right now the commands would produce an exit code `1` no matter what the failure is, so all those failure modes get lumped together. It would be a huge help for us if, say, K8s related errors exit at `103`, timeouts exit at `104` and user errors (say tests failed) exit at `1` or something like that.
What should the user be able to do?
The user should be able to know the type of error based on the exit code of the Garden command
Why do they want to do this? What problem does it solve?
We'd like to set different auto retry rules for different failure modes. If the Garden command produced different exit codes, auto retry would be very easy to set up for us. We don't want to just retry everything 3 times because the tests take like 20min to run and we don't want flaky tests to make it into the main branch, but if it's a k8s flake then we'd like to retry away and not have our engineers see those errors if possible.
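As an illustration of the retry rule described above, here is a minimal sketch of a CI wrapper that re-runs a command only when it exits with a code marked as retryable. It assumes the proposed exit code `103` for K8s related errors, which Garden does not emit today; `run_with_retries`, `run_command` and `RETRYABLE_CODES` are hypothetical names made up for this example.

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical exit code from the proposal in this issue (K8s related errors).
# Garden currently exits with 1 for every failure, so this is an assumption.
EXIT_K8S_ERROR = 103
RETRYABLE_CODES = {EXIT_K8S_ERROR}


def run_command(argv: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(argv)).returncode


def run_with_retries(
    argv: Sequence[str],
    max_retries: int = 2,
    runner: Callable[[Sequence[str]], int] = run_command,
) -> int:
    """Run argv, re-running it only while it exits with a retryable code.

    Any other exit code (success or a genuine test failure) is returned
    immediately, so flaky tests are never hidden by a retry.
    """
    code = runner(argv)
    attempts = 0
    while code in RETRYABLE_CODES and attempts < max_retries:
        attempts += 1
        code = runner(argv)
    return code
```

In a CI job this could be called as `sys.exit(run_with_retries(["garden", "test"]))`, so that a test failure (exit code `1`) fails the pipeline right away while an infrastructure flake is retried up to twice.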
Suggested Implementation(s)
When emitting an error from the Garden command, check what it is (whether it was the command it ran that failed or some k8s failure) and produce a different exit code accordingly. To start we'd be very happy if the commands just produced 2 different errors - command failure vs everything else. More granularity would help but we could get started with just 2.
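The two-category starting point suggested above could look roughly like the sketch below. It is not Garden's actual error handling: `CommandFailedError`, `ExitCode` and `exit_code_for` are hypothetical names, and `103` is only the example value floated in this issue for non-command failures.

```python
from enum import IntEnum
from typing import Optional


class ExitCode(IntEnum):
    SUCCESS = 0
    COMMAND_FAILURE = 1    # the command that was run (e.g. a test) failed
    OTHER_FAILURE = 103    # hypothetical code for everything else (k8s, etc.)


class CommandFailedError(Exception):
    """Hypothetical error raised when the user's own command or test fails."""


def exit_code_for(error: Optional[Exception]) -> ExitCode:
    """Map the error that ended a run to a process exit code."""
    if error is None:
        return ExitCode.SUCCESS
    if isinstance(error, CommandFailedError):
        return ExitCode.COMMAND_FAILURE
    # Anything that is not a command failure falls into the second bucket.
    return ExitCode.OTHER_FAILURE
```

More granular codes (for example a separate one for timeouts) could later be added as further enum members and `isinstance` branches without changing the meaning of exit code `1`.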
How important is this feature for you/your team?
🌵 Not having this feature makes using Garden painful