-
Notifications
You must be signed in to change notification settings - Fork 346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable completion time-to-live to be set on all jobs #407
Conversation
Codeclimate issues are related to the autogenerated code. |
Codecov Report
@@ Coverage Diff @@
## master #407 +/- ##
==========================================
+ Coverage 91.59% 91.62% +0.02%
==========================================
Files 64 64
Lines 3142 3153 +11
==========================================
+ Hits 2878 2889 +11
Misses 184 184
Partials 80 80
Continue to review full report at Codecov.
|
@objectiser I do not have 1.12 Kubernetes, I have,
Test status:
CR: storage:
type: elasticsearch
esIndexCleaner:
enabled: true
schedule: "*/1 * * * *"
completedTTL: 180
dependencies:
enabled: true
schedule: "*/1 * * * *"
completedTTL: 300
elasticsearch:
nodeCount: 3
resources: |
@jkandasa Ok thanks. @jpkrohling @pavolloffay At present the 'completed time-to-live' for all of the jobs is set to a default of 10 minutes, but wondering whether each job type (e.g. rollover, index cleaner, etc) should have their own defaults which can more closely relate to the schedules (when used) - so if a schedule only runs a job once a day, then maybe the ttl should be two days so any failures are around for a while? |
Maybe we could make it always double to what is the schedule? |
@pavolloffay On further thought, the time the job should remain shouldn't really be related to schedule, as if a failure/retry occurs it can result in a large number of jobs being left lying around for a long time. On the other hand we need to give time for someone to detect the failure and capture any relevant information. So have updated default to 1 hour. If we want to try to come up with a more complex scheme, then we could do it in a separate PR? |
The jobs we have here run once per day at midnight. Not sure if one hour is a good default in this case. |
If there is a logging framework, then any failures would be captured centrally - so having the job hanging around would not be as relevant. |
The question is if those logs would get attention and if they would indicate a problem. |
@jpkrohling Any thoughts on this? |
If we run each job on a daily basis, it's safe to assume that they shouldn't take more than one day to complete. |
@jpkrohling Could you take a look at this? codeclimate errors not relevant, and travis job has finished, but not updated here for some reason. |
This is all green now. |
@jpkrohling Are the changes ok to merge? |
LGTM, but I would prefer |
@jpkrohling No problem, can change. |
Signed-off-by: Gary Brown <[email protected]>
Signed-off-by: Gary Brown <[email protected]>
…and change default TTL to 1 day Signed-off-by: Gary Brown <[email protected]>
Signed-off-by: Gary Brown <[email protected]>
@@ -43,6 +43,7 @@ github.com/bradfitz/go-smtpd v0.0.0-20170404230938-deb6d6237625/go.mod h1:HYsPBT | |||
github.com/census-instrumentation/opencensus-proto v0.2.0 h1:LzQXZOgg4CQfE6bFvXGM30YZL1WW/M337pXml+GrcZ4= | |||
github.com/census-instrumentation/opencensus-proto v0.2.0/go.mod h1:f6KPmirojxKA12rnyqOA5BBL4O983OfeGPqjHWSTneU= | |||
github.com/client9/misspell v0.3.4/go.mod h1:qj6jICC3Q7zFZvVWo7KLAzC3yx5G7kyvSDkc90ppPyw= | |||
github.com/codahale/hdrhistogram v0.0.0-20161010025455-3a0bb77429bd h1:qMd81Ts1T2OTKmB4acZcyKaMtRnY5Y44NuXGX2GFJ1w= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jpkrohling Would you expect these changes to the go.sum file? They seem similar except next line as /go.mod
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really, but the version seems strange. Perhaps it's pointing to master and they had a new commit recently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would have expected it to just update the existing line rather than add a new.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, good point. I need to understand go modules better in order to answer that. I have seen duplicated entries in different sections in the go.mod, but not sure I would expect the same for go.sum...
Fixes #406
@jkandasa Could you give this a test to make sure it fixes the issue. Note it requires kubernetes 1.12 or higher.
Signed-off-by: Gary Brown [email protected]