Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run rollover cronjob by default daily at midnight #812

Merged
merged 2 commits into from
Dec 9, 2019

Conversation

pavolloffay
Copy link
Member

https://issues.jboss.org/browse/TRACING-824

The currently the default rollover configuration creates:

NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
cronjob.batch/simple-prod-es-index-cleaner 55 23 * * * False 0 <none> 2m15s
cronjob.batch/simple-prod-es-lookback */30 * * * * False 0 <none> 2m15s
cronjob.batch/simple-prod-es-rollover */30 * * * * False 0 <none> 2m15s

The rollover condition: '{"max_age": "7d"}'

This means we run rollover cronjob every 30 minutes and the action takes place every week. Multiple runs a day make might make sense if the conditions are based on the size of the index - however this is supported only in ES6.

We should change the schedule to run once a day. I think it makes the most sense at the moment.

Regarding the default conditions - rollover every 7 days might be too much given the default (no rollover) strategy creates an index every day and some organizations keep the data only for 7 days. We should shrink it to 2 days by default - and raise discussion on the PR.

Signed-off-by: Pavol Loffay [email protected]

@pavolloffay pavolloffay added the Elasticsearch The issues related to Elasticsearch storage label Dec 4, 2019
@@ -183,7 +183,7 @@ func normalizeElasticsearch(spec *v1.ElasticsearchSpec) {
func normalizeRollover(spec *v1.JaegerEsRolloverSpec) {
spec.Image = util.ImageName(spec.Image, "jaeger-es-rollover-image")
if spec.Schedule == "" {
spec.Schedule = "*/30 * * * *"
spec.Schedule = "0 3 * * *"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason not to use @daily? It's typically better to use this kind of notation, as it allows the scheduler to spread jobs across the day, to avoid all jobs to run at the same time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to pick the lowest traffic time rather than run it at random time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean that Kubernetes will run the same job at random times when @daily is set? That's certainly not in line with other cron implementations...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well didn't you just say it?

as it allows the scheduler to spread jobs across the day, to avoid all jobs to run at the same time.

I don't know how it is implemented. But I think it's better to run the rollover in low traffic hours.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, what I said is that the scheduler can spread the jobs (not executions) across the day. This way, your job might run every day at 4:07am (give or take).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

low traffic hours.

3am for some might be prime time for others :-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see thanks for the explanation. Once the time set it will stick with that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As these are default values, I would recommend setting to midnight, as this is a natural day boundary. Then if users want something else, they will need to specify.

@pavolloffay pavolloffay changed the title Run rollover cronjob by default daily at 3am Run rollover cronjob by default daily at midnight Dec 5, 2019
Copy link
Contributor

@jpkrohling jpkrohling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, @daily is still my preference, but LGTM.

@jpkrohling
Copy link
Contributor

Should this be merged for 1.16, or post-1.16?

@jpkrohling jpkrohling merged commit 4ae96a0 into jaegertracing:master Dec 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Elasticsearch The issues related to Elasticsearch storage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants