Auto-scale DynamoDB provision based on Prometheus metrics #841
Conversation
Force-pushed from b7990d1 to 8497e4f
At the weekly table changeover (which happens Wednesday night for me) it initially does a scale-down of the new table, because the error rate is zero. Having thought about it a bit, I think this is best addressed by setting the minimum write capacity to a level which is enough to handle the expected traffic at midnight. I like the idea that the core algorithm should operate without knowing anything specific about the individual tables.
Scaling up by a fixed fraction of current provision isn't very good. As a simple improvement I might floor the increment to a fraction of the max, e.g. if max is 10K then min increment is 1K.
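A minimal Go sketch of that rule; the function name is illustrative, the 20% step and 10%-of-max floor are taken from the discussion further down, and this is not the PR's actual code:

```go
// computeScaleUp is a hypothetical sketch: grow by 20% of the current
// provision, but floor the increment at 10% of the configured max
// (e.g. max 10K => minimum increment 1K), and never exceed max.
func computeScaleUp(current, max int64) int64 {
	step := current / 5 // 20% of current provision
	if floor := max / 10; step < floor {
		step = floor // floor the increment at 10% of max
	}
	if target := current + step; target < max {
		return target
	}
	return max
}
```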
func chunkTagsToDynamoDB(ts chunk.Tags) []*dynamodb.Tag {
	if ts == nil {
		return nil
	}
Is this redundant? Looks like `result` will be nil if len(ts) == 0 (or ts is nil).
yes; it's cut/pasted from what was there before
	err = d.disableAutoScaling(ctx, expected)
} else if current.WriteScale != expected.WriteScale {
	err = d.enableAutoScaling(ctx, expected)
	if expected.WriteScale.Enabled && d.metrics != nil {
Perhaps the check `expected.WriteScale.Enabled && d.metrics != nil` should be put in the constructor so we can fail early? Then we can just check `expected.WriteScale.Enabled` here.
Actually, in what situation is metrics nil?
`metrics` is nil when you want AWS auto-scaling.
I don't see how - must be missing something. Would you mind explaining a little further for me?
plus or minus a bug...
I fixed the bug.
@@ -319,7 +353,11 @@ func (d dynamoTableClient) UpdateTable(ctx context.Context, current, expected ch
			return err
		})
	}); err != nil {
		return err
		if awsErr, ok := err.(awserr.Error); ok && awsErr.Code() == "LimitExceededException" {
			level.Warn(util.Logger).Log("msg", "update limit exceeded", "err", err)
Perhaps export a counter for this too?
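A hedged sketch of such a counter using the client_golang library; the metric name and registration are illustrative only, not the PR's actual code:

```go
import "github.com/prometheus/client_golang/prometheus"

// Hypothetical counter, bumped next to the Warn log above whenever a
// table update is rejected with LimitExceededException.
var tableUpdateLimitExceeded = prometheus.NewCounter(prometheus.CounterOpts{
	Namespace: "cortex",
	Name:      "dynamo_table_update_limit_exceeded_total",
	Help:      "Total DynamoDB table updates rejected with LimitExceededException.",
})

func init() {
	prometheus.MustRegister(tableUpdateLimitExceeded)
}
```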
pkg/chunk/aws/metrics_autoscaling.go
if err != nil {
	return nil, err
}
promAPI = promV1.NewAPI(client)
If I read this right, this will return a non-functioning but non-nil `metricsData` if the addr is empty? That seems odd to me; I'd expect either an error or a fully-functioning `metricsData` - perhaps this logic should be pushed up to the caller?
pkg/chunk/aws/metrics_autoscaling.go
}

func extractRates(matrix model.Matrix) (map[string]float64, error) {
	ret := make(map[string]float64)
IIRC throughout the codebase we tend to prefer `map[string]float64{}`, and only use `make` when we know the size.
pkg/chunk/aws/metrics_autoscaling.go
}

// fetch write error rate per DynamoDB table
deMatrix, err := promQuery(ctx, m.promAPI, `sum(rate(cortex_dynamo_failures_total{error="ProvisionedThroughputExceededException",operation=~".*Write.*"}[1m])) by (table) > 0`, 0, time.Second)
I think we should be able to inject extra selectors into these queries; for instance, we run multiple Cortex clusters monitored by the same Prometheus.
I have made the whole PromQL expression configurable, on the basis that other things like rate periods might need tweaking too.
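A minimal sketch of that hook, reusing the error-rate query from the diff above; the flag name and config field are assumptions for illustration:

```go
// Hypothetical flag making the error-rate PromQL expression tunable,
// so selectors or rate periods can be adjusted per deployment.
f.StringVar(&cfg.ErrorRateQuery, argPrefix+".scale.error-rate-query",
	`sum(rate(cortex_dynamo_failures_total{error="ProvisionedThroughputExceededException",operation=~".*Write.*"}[1m])) by (table) > 0`,
	"PromQL expression giving the DynamoDB write-throttle rate per table.")
```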
return chunk.TableDesc{
	Name: name,
}, dynamodb.TableStatusActive, nil
}, true, nil
👍
f.Int64Var(&cfg.OutCooldown, argPrefix+".out-cooldown", 3000, "DynamoDB minimum time between each autoscaling event that increases provision capacity.")
f.Int64Var(&cfg.InCooldown, argPrefix+".in-cooldown", 3000, "DynamoDB minimum time between each autoscaling event that decreases provision capacity.")
f.Int64Var(&cfg.OutCooldown, argPrefix+".out-cooldown", 1800, "DynamoDB minimum seconds between each autoscale up.")
f.Int64Var(&cfg.InCooldown, argPrefix+".in-cooldown", 1800, "DynamoDB minimum seconds between each autoscale down.")
All usages of this are as a `time.Duration`, so perhaps we should make this a `DurationVar`?
Good idea, but not backwards-compatible. I guess we could add two new flags and deprecate the old.
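A hedged sketch of that deprecation path; the new flag names and config fields are assumptions, with the 30-minute default matching the production value mentioned below:

```go
// Hypothetical Duration-typed replacements; the old integer flags
// would remain registered but be marked deprecated.
f.DurationVar(&cfg.OutCooldownPeriod, argPrefix+".out-cooldown-period", 30*time.Minute,
	"Minimum time between autoscaling events that increase provisioned capacity.")
f.DurationVar(&cfg.InCooldownPeriod, argPrefix+".in-cooldown-period", 30*time.Minute,
	"Minimum time between autoscaling events that decrease provisioned capacity.")
```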
	})
}
return result
}
👍
pkg/chunk/aws/metrics_autoscaling.go
queueLengths []float64
errorRates   map[string]float64
usageRates   map[string]float64
}
There is quite a lot of coupling between `metricsData` and `dynamoTableClient`; I wonder if more work needs to be done to tease out a clean interface?
OK, I think much of this was simply the metrics methods were on the wrong struct, but I went ahead and created an abstract interface so the choice between old-style and new-style is in the constructor. Let me know what you think.
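A hedged sketch of what that constructor-time choice might look like; `newMetrics` appears in the diff below, while the config field and the old-style constructor name are assumptions:

```go
// Hypothetical wiring: pick the autoscale implementation once, in the
// constructor, based on whether a metrics URL was configured.
func newDynamoTableClient(cfg DynamoDBConfig) (*dynamoTableClient, error) {
	var scaler autoscale
	var err error
	if cfg.Metrics.URL != "" {
		scaler, err = newMetrics(cfg) // new-style: scale on Prometheus metrics
	} else {
		scaler, err = newAWSAutoscale(cfg) // old-style AWS autoscaling (name assumed)
	}
	if err != nil {
		return nil, err
	}
	return &dynamoTableClient{autoscale: scaler}, nil
}
```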
Looks good now!
A few minor nits; main concern is about coupling between the `metricsData` and `tableClient`. I wonder if a better interface could be teased out?
pkg/chunk/aws/metrics_autoscaling.go
	}
}

func newMetrics(cfg DynamoDBConfig) (*metricsData, error) {
This probably belongs at the top of the file now?
}, []string{"operation", "status_code"})
// Pluggable auto-scaler implementation
type autoscale interface {
	CreateTable(ctx context.Context, desc chunk.TableDesc) error
This is really `PostCreateTable`, right?
yeah, I was wrestling with how similar it is to the higher-level interface
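For reference, a hedged reconstruction of the interface with the suggested rename; only the create hook appears in the diff above, and the update hook is an assumption mirroring the table-client methods:

```go
// Pluggable auto-scaler implementation (hypothetical renamed form).
type autoscale interface {
	PostCreateTable(ctx context.Context, desc chunk.TableDesc) error
	UpdateTable(ctx context.Context, current chunk.TableDesc, expected *chunk.TableDesc) error
}
```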
If the current provision is …
and bump the failure counter too
so we can supply multiple sets of results
Don't want to scale down if all our metrics are returning zero, or all the ingesters have crashed, etc.
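A sketch of that guard, assuming the `queueLengths` samples held by the `metricsData` struct shown earlier; the method name is illustrative:

```go
// canScaleDown is a hypothetical safety check: refuse to scale down
// when every recent queue-length sample is zero, which more likely
// means dead ingesters or a broken metrics pipeline than low load.
func (m *metricsData) canScaleDown() bool {
	for _, l := range m.queueLengths {
		if l > 0 {
			return true // at least one live, non-zero sample
		}
	}
	return false // all-zero or empty readings: don't trust them
}
```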
50 minutes seems weird. Currently using 30 minutes in production.
Move old-style AWS autoscaling to a separate file; make metrics autoscaling implement the common interface.
If the current capacity is very low, adding 20% doesn't get us very far, so add minimum 10% of max capacity. Refactored to extract 'compute' functions for readability.
Force-pushed from 47eaed8 to 0dd6116
Force-pushed from 0dd6116 to f93ace9
LGTM!
Fixes #735
To use this feature, set `-dynamodb....scale.enabled` on in command-line args but do not supply an AWS autoscaler `-applicationautoscaling.url`. Also set `-metrics.url` to point at a Prometheus API-compatible server which will serve the metrics. (At the time of writing this can't be a multi-tenant server, because I didn't put in an option for the tenant.)
PR also includes much refactoring of existing tests because I couldn't stand so much repetition.
Outstanding things I should do:
- `queueLengthAcceptable` to be changed
- `out-cooldown` and `in-cooldown`
For extra credit: