SUP-3258 - Implement .spec.job.activeDeadlineSeconds #497
Conversation
LGTM
Other than one comment below, ideally this would also have a test. The easiest way currently is probably an integration test that uses a pipeline YAML fixture with a small `activeDeadlineSeconds` and a command that sleeps longer than it. It may be tricky to account for image pull time, though, so the deadline and sleep should probably be at least 30 seconds to avoid flakiness.
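Something along these lines might work (a hypothetical fixture; the exact plugin key shape depends on the schema discussed below):

```yaml
# Hypothetical fixture: the command sleeps past the deadline, so the
# Kubernetes job should be terminated by activeDeadlineSeconds rather
# than run to completion.
steps:
  - label: "deadline test"
    command: "sleep 90"
    plugins:
      - kubernetes:
          job:
            activeDeadlineSeconds: 30
```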
```diff
@@ -75,6 +76,7 @@ func New(logger *zap.Logger, client kubernetes.Interface, cfg Config) *worker {
 }
 
 type KubernetesPlugin struct {
+	Job *batchv1.Job `json:"job,omitempty"`
```
Using `*batchv1.Job` here could get confusing - someone might try to write:

```yaml
plugins:
  - kubernetes:
      job:
        template:
          spec:
            containers:
              ...
```

which would not fail to parse, so the user might expect that defining containers that way would work; instead it would be silently ignored.

(Maybe one day the whole `KubernetesPlugin` should be changed to revolve around a `*batchv1.Job` instead of defining pieces of it. Needs thought.)

In the meantime I would suggest either just adding `JobActiveDeadlineSeconds` as a `KubernetesPlugin` field directly, or wrapping it in a new `Job` struct that can be grown over time to include other job parameters.
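For the wrapper option, a minimal sketch (type and field names here are assumptions, not a settled API):

```go
// Hypothetical sketch: a plugin-owned Job struct that exposes only the
// job parameters the controller supports, and can grow over time
// (backoffLimit, podFailurePolicy, ...) without exposing *batchv1.Job.
type Job struct {
	ActiveDeadlineSeconds *int64 `json:"activeDeadlineSeconds,omitempty"`
}

type KubernetesPlugin struct {
	// ...existing plugin fields elided...
	Job *Job `json:"job,omitempty"`
}
```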
This is step 1 of a long journey to fully transition to `*batchv1.Job`. I went back and forth on whether to implement it this way and possibly cause confusion for users who try to leverage the entire Job spec. We have additional tickets that will continue to build off of the Job spec (`backoffLimit`, `podFailurePolicy`, etc.) bit by bit. We could implement this (and the future additional fields) as fields in the plugin, and eventually bring them back into the full Job spec when the time comes to more precisely control the feature set.
Force-pushed from fd05068 to 7a13241.
LGTM ✅
Implements `.spec.job.activeDeadlineSeconds` as a means of setting a runtime boundary for Kubernetes jobs that are unable to complete within a "reasonable" amount of time. (When the deadline is exceeded, Kubernetes terminates the job's pods and marks the job as failed.) This defaults to `21600` seconds (6 hours) in the controller configuration. The value can be overridden on the command line (`--job-active-deadline-seconds 43200`), directly in the controller configuration (`job-active-deadline-seconds: 43200`), or on a per-Buildkite-job basis via the `kubernetes` plugin, sketched below.
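For example (a sketch; the exact key shape follows the plugin schema discussed in the review above):

```yaml
plugins:
  - kubernetes:
      job:
        activeDeadlineSeconds: 43200
```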
At this time `.spec.job.activeDeadlineSeconds` cannot be disabled, as we agree that a job runtime boundary should be defined at all times.

Fixes #480
Fixes #479
Fixes #463