Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ETL resolution configurations #802

Merged
merged 1 commit into from
Mar 5, 2021
Merged

Conversation

nikovacevic
Copy link
Contributor

Supports opencost/opencost#719 and https://github.com/kubecost/kubecost-cost-model/pull/347

Changes

  • kubecostModel.etlResolutionSeconds (i.e. ETL_RESOLUTION_SECONDS) determines how high of a resolution ETL should use. Defaults to 60 (i.e. 1m) but should be set to 300 (i.e. 5m) or even 600 (i.e. 10m) for big clients.
  • kubecostModel.etlMaxBatchHours (i.e. ETL_MAX_BATCH_HOURS) limits the window duration of expensive ETL queries, batching them out so that any one query will not time out. Defaults to 6 (i.e. 6h) but might need to be set to 3 (i.e. 3h) for big clients who want to keep high resolution.

Testing

  • Will do a clean install today with helm values set and report back

@@ -436,6 +436,10 @@ spec:
- name: ETL_STORE_DURATION_DAYS
value: {{ (quote .Values.kubecostModel.etlStoreDurationDays) | default (quote 120) }}
{{- end }}
- name: ETL_RESOLUTION_SECONDS
value: {{ (quote .Values.kubecostModel.etlResolutionSeconds) | default (quote 60) }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's our process for default in the template vs. default in values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I provide both to be "safe", but we should decide as a team. Do you have a preference?

Copy link
Contributor

@AjayTripathy AjayTripathy Mar 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is best to provide both for common levers so people can easily read values to see the default and users who don't often update their values get a sensible default, in case they are installing from an old values.yaml

If its something hairy like a scaling parameter, it can be left off of values though IMO to avoid clutter. This feels hairy enough that it can be hidden from the majority of users and just put in the template.

Any other thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed with that. Only thing left to discuss is what the default values should be. Here, I've opted for ones that prioritize accuracy and hedge against timeouts for medium- to medium-large clusters. But a large customer might still want to make resolution 5m, or even batch 3h, to hedge further. I'd just hate to default to something that, for instance, misses all the checks pods. But open to opposing arguments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed to discuss Monday with @dwbrown2 and @AjayTripathy

Copy link
Contributor

@michaelmdresser michaelmdresser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine.

@nikovacevic nikovacevic merged commit ea46f3c into develop Mar 5, 2021
@nikovacevic nikovacevic deleted the niko/etl-resolution branch March 5, 2021 22:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants