Musings on AutoScaling #856
Comments
By the way, this is all predicated on the assumption that users would rather think in terms of the very first thing I showed: absolute metric values and the scaling behavior based on that; all the drilldown that happens to scaling policies to me feels just like implementation details and calculations that can and should be hidden. I don't have data for that, it's just a gut feeling. Am I sorely wrong on that?
Can I ask why you're not considering
I love the idea of target tracking, and I will definitely implement that as well. But do you think I should not implement step scaling at all? Just leave it out of the API and force all customers to use target tracking if they want to autoscale?
@allisaurus, by the way, do custom metrics work for target tracking? Because the CloudFormation docs seem to imply they don't.
It looks like custom metrics related to EC2 instance utilization are permissible with target tracking. Looks like there may be a gap in what's currently supported in the service vs. via CFN. To your second point: accommodating use cases that require significantly different scale in/out behavior, or those that work off metrics incompatible w/ target tracking, would be a good argument for implementing step scaling as well (though I'm ill equipped to speak to how prevalent they are).
I don't think we can afford to implement only part of the feature set. So that means we have to build an API for step scaling anyway, and I'd like it to be as good as possible.
The link is about Application Auto Scaling, not EC2 Auto Scaling.
@jungseoklee where did you get the impression this topic was about EC2 autoscaling? I don't think I've mentioned it anywhere, and in fact I did come at this by just looking at App AutoScaling so far (although I do believe the instance autoscaling API is very similar).
Here's another question, feel free to provide input: How are we going to model the API to represent thresholds and scaling actions? Let's say for a fictitious CPU usage/scaling example:

Option 1: fluent API

    scaling
      .at(0).scale(-2)
      .at(10).scale(-1)
      .at(20).scale(0)
      .at(80).scale(+1)
      .at(90).scale(+2)
      .end()

Option 2: allow omitting bounds of bordering intervals

    scaling.addTier({ upperBound: 10, adjustment: -2 });
    scaling.addTier({ upperBound: 20, adjustment: -1 });
    scaling.addTier({ lowerBound: 80, upperBound: 90, adjustment: +1 });
    scaling.addTier({ adjustment: +2 });

or:

    scaling.addTier({ upperBound: 10, adjustment: -2 });
    scaling.addTier({ upperBound: 20, adjustment: -1 });
    scaling.addTier({ lowerBound: 80, adjustment: +1 });
    scaling.addTier({ lowerBound: 90, adjustment: +2 });

Option 3: mixing thresholds and scales in a single array:

    scale([ 0, -2, 10, -1, 20, 0, 80, +1, 90, +2, 100 ])

Option 4: separate thresholds and scales

    .thresholds([0, 10, 20, 80, 90, 100])
    .scales([-2, -1, 0, 1, 2])
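To compare the options, it may help to see them all reduced to the same underlying data: a list of metric ranges, each with a capacity change. Below is a minimal sketch of that normalization for the Option 4 shape. The names `ScalingInterval` and `toIntervals` are hypothetical and not part of any existing library, and the half-open `[lower, upper)` convention is an assumption made for this example.

```typescript
// Hypothetical normalized form; none of these names exist in a real library.
interface ScalingInterval {
  lower: number;  // inclusive lower bound of the metric range (assumed convention)
  upper: number;  // exclusive upper bound of the metric range (assumed convention)
  change: number; // capacity adjustment applied while the metric is in the range
}

// Turn Option 4's parallel arrays into explicit intervals.
// N scales need N + 1 thresholds, and thresholds must be strictly increasing.
function toIntervals(thresholds: number[], scales: number[]): ScalingInterval[] {
  if (thresholds.length !== scales.length + 1) {
    throw new Error(
      `Expected ${scales.length + 1} thresholds for ${scales.length} scales, got ${thresholds.length}`
    );
  }
  const intervals: ScalingInterval[] = [];
  for (let i = 0; i < scales.length; i++) {
    if (thresholds[i] >= thresholds[i + 1]) {
      throw new Error("Thresholds must be strictly increasing");
    }
    intervals.push({ lower: thresholds[i], upper: thresholds[i + 1], change: scales[i] });
  }
  return intervals;
}

// The fictitious CPU example from the options above.
const intervals = toIntervals([0, 10, 20, 80, 90, 100], [-2, -1, 0, 1, 2]);
console.log(intervals);
```

The length check is one place where the array-based options can fail at runtime, whereas a fluent or per-tier API would make a mismatched threshold/adjustment pair harder to write in the first place.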
True. You only mentioned two terms, "CPU usage" and "instance is added". Nevertheless, I unconsciously combined those terms with 1) the link in the comment stream about custom metrics related to EC2 instance utilization, and 2) my understanding that step scaling policy is not applicable to DynamoDB [1], which is about Application Auto Scaling. That is how I got the impression. I would like to understand this topic correctly. There are no other intentions.
I would vote for option 1, which is a great abstraction and modeling to me.
Regarding option 2, users need to understand what
Regarding option 3, it seems error-prone compared to other options. For example, what if I switch
In case of option 4, this option looks better than options 2 and 3, but we probably need to check if
@rix0rrr 👍 agreed re: not implementing just part of the feature set (and wanting to make step scaling the best it can be). I only meant that since use cases exist which won't be well accommodated by target tracking, I think we should implement step scaling too vs. just target tracking. Thanks, btw, for creating this issue for discussion!
Adds a construct library for Application AutoScaling. The DynamoDB construct library has been updated to use the new AutoScaling mechanism, which allows more configuration and uses a Service Linked Role instead of a role per table. BREAKING CHANGE: instead of `addReadAutoScaling()`, call `autoScaleReadCapacity()`, and similar for write scaling. Fixes #856, #861, #640, #644.
__IMPORTANT NOTE__: when upgrading to this version of the CDK framework, you must also upgrade your installation of the CDK Toolkit to the matching version:

```shell
$ npm i -g aws-cdk
$ cdk --version
0.14.0 (build ...)
```

Bug Fixes
=========

* remove CloudFormation property renames ([#973](#973)) ([3f86603](3f86603)), closes [#852](#852)
* **aws-ec2:** fix retention of all egress traffic rule ([#998](#998)) ([b9d5b43](b9d5b43)), closes [#987](#987)
* **aws-s3-deployment:** avoid deletion during update using physical ids ([#1006](#1006)) ([bca99c6](bca99c6)), closes [#981](#981) [#981](#981)
* **cloudformation-diff:** ignore changes to DependsOn ([#1005](#1005)) ([3605f9c](3605f9c)), closes [#274](#274)
* **cloudformation-diff:** track replacements ([#1003](#1003)) ([a83ac5f](a83ac5f)), closes [#1001](#1001)
* **docs:** fix EC2 readme for "natgatway" configuration ([#994](#994)) ([0b1e7cc](0b1e7cc))
* **docs:** updates to contribution guide ([#997](#997)) ([b42e742](b42e742))
* **iam:** Merge multiple principals correctly ([#983](#983)) ([3fc5c8c](3fc5c8c)), closes [#924](#924) [#916](#916) [#958](#958)

Features
=========

* add construct library for Application AutoScaling ([#933](#933)) ([7861c6f](7861c6f)), closes [#856](#856) [#861](#861) [#640](#640) [#644](#644)
* add HostedZone context provider ([#823](#823)) ([1626c37](1626c37))
* **assert:** haveResource lists failing properties ([#1016](#1016)) ([7f6f3fd](7f6f3fd))
* **aws-cdk:** add CDK app version negotiation ([#988](#988)) ([db4e718](db4e718)), closes [#891](#891)
* **aws-codebuild:** Introduce a CodePipeline test Action. ([#873](#873)) ([770f9aa](770f9aa))
* **aws-sqs:** Add grantXxx() methods ([#1004](#1004)) ([8c90350](8c90350))
* **core:** Pre-concatenate Fn::Join ([#967](#967)) ([33c32a8](33c32a8)), closes [#916](#916) [#958](#958)

BREAKING CHANGES
=========

* DynamoDB AutoScaling: Instead of `addReadAutoScaling()`, call `autoScaleReadCapacity()`, and similar for write scaling.
* CloudFormation resource usage: If you use L1s, you may need to change some `XxxName` properties back into `Name`. These will match the CloudFormation property names.
* You must use the matching `aws-cdk` toolkit when upgrading to this version, or context providers will cease to work. All existing cached context values in `cdk.json` will be invalidated and refreshed.
I imagine when people want to set up autoscaling, they want to set up something like this:
Disregarding `TargetTracking` scaling for a moment, the way to set this up is with StepScaling. You'll make a StepScaling policy which is activated by a CloudWatch alarm. The StepScaling policy has different scaling tiers depending on the distance of the current metric value to its Alarm threshold. Configuration for a step scaling policy looks like this:

Normally, CloudWatch Alarm Actions are edge-triggered (that is, an Action occurs only when the alarm transitions from OK to ALARM or vice versa). However, if the Alarm Action is an AutoScaling policy, the Alarm keeps on triggering the AutoScaling policy periodically, so that if the alarm goes further out of spec, higher scaling tiers can be activated. (For example, the CPU usage goes to 75% and an instance is added. However, that doesn't make the load go down enough yet, and after an (undefined) while the policy is activated again and another instance is added.)
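To make the "distance to the alarm threshold" idea concrete, here is a rough sketch of how a tier could be picked on each evaluation. The field names are modeled on the CloudFormation `StepAdjustment` properties (`MetricIntervalLowerBound`, `MetricIntervalUpperBound`, `ScalingAdjustment`). The function `selectAdjustment`, the 70% threshold and the tier values are assumptions made up for this example, and the bound handling (inclusive lower, exclusive upper) is a simplification rather than a statement of how the service behaves.

```typescript
// Bounds are offsets from the alarm threshold, not absolute metric values.
interface StepAdjustment {
  metricIntervalLowerBound?: number; // omitted means unbounded below
  metricIntervalUpperBound?: number; // omitted means unbounded above
  scalingAdjustment: number;         // change in capacity for this tier
}

// Hypothetical helper: pick the capacity change for one evaluation of the policy.
function selectAdjustment(metric: number, threshold: number, steps: StepAdjustment[]): number {
  const delta = metric - threshold;
  for (const step of steps) {
    const lower = step.metricIntervalLowerBound ?? -Infinity;
    const upper = step.metricIntervalUpperBound ?? Infinity;
    if (delta >= lower && delta < upper) {
      return step.scalingAdjustment;
    }
  }
  return 0; // no tier matches: leave capacity unchanged
}

// Assumed scale-out policy attached to an alarm on CPU >= 70%.
const scaleOut: StepAdjustment[] = [
  { metricIntervalLowerBound: 0, metricIntervalUpperBound: 20, scalingAdjustment: 1 },
  { metricIntervalLowerBound: 20, scalingAdjustment: 2 },
];

const atSeventyFive = selectAdjustment(75, 70, scaleOut); // delta 5 falls in [0, 20)
const atNinetyFive = selectAdjustment(95, 70, scaleOut);  // delta 25 falls in [20, +inf)
console.log(atSeventyFive, atNinetyFive);
```

Because the policy is re-evaluated while the alarm stays in ALARM, calling `selectAdjustment` again with a still-elevated metric would add capacity again, which matches the 75% CPU example above.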
The question is, how to set up step scaling policies to achieve the scaling that the user wants?
Solution 1: Two alarms, two policies
This is what I see most on the internet--a separate alarm and a separate scaling policy for scaling out and scaling in. Seems inefficient, resource-wise.
Also, AutoScaling policies seem to imply they can both scale in and scale out in a single policy, but you'd never take advantage of that in this way.
Solution 2: One alarm, one policy
Ideally, it seems like I would want just a single alarm/metric/scalingpolicy configuration. But I don't know if this would even work: it would require that the scaling policy is activated on both sides of the CloudWatch alarm, and it might not do that.
Solution 3: Two alarms, one policy
Another potential way to go would be to have two alarms trigger the same scaling policy (one lower-than-threshold and one greater-than-threshold) and just describe the scaling behavior with respect to the alarm thresholds on either side.
We can save on a ScalingPolicy in this way (with respect to solution 1). It's not as obvious what's going on though, and would rely on the fact that 2 different alarms give the deltas to 2 different thresholds to the same scaling policy ONLY when they're in alarm.
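Whichever solution is chosen, tiers that the user states in absolute metric values have to be re-expressed as offsets from an alarm threshold. The sketch below shows one way that translation could look for a lower and an upper alarm. Every name in it (`AbsoluteTier`, `RelativeStep`, `planAlarms`) is hypothetical, and choosing the thresholds as the edges of the no-change band is an assumption, not something the service or the CDK prescribes.

```typescript
// Hypothetical types for illustration only.
interface AbsoluteTier {
  lower?: number; // absolute metric value; omitted means unbounded below
  upper?: number; // absolute metric value; omitted means unbounded above
  change: number; // capacity adjustment inside the range
}

interface RelativeStep {
  lowerOffset: number | undefined; // offset from the alarm threshold
  upperOffset: number | undefined;
  change: number;
}

interface AlarmPlan {
  threshold: number;
  steps: RelativeStep[];
}

// Shift absolute bounds so they become offsets from the given threshold.
function relativeTo(threshold: number, tiers: AbsoluteTier[]): RelativeStep[] {
  return tiers.map(t => ({
    lowerOffset: t.lower === undefined ? undefined : t.lower - threshold,
    upperOffset: t.upper === undefined ? undefined : t.upper - threshold,
    change: t.change,
  }));
}

// Split tiers into a scale-in side and a scale-out side, one alarm each.
function planAlarms(tiers: AbsoluteTier[]): { low: AlarmPlan; high: AlarmPlan } {
  const scaleIn = tiers.filter(t => t.change < 0);
  const scaleOut = tiers.filter(t => t.change > 0);
  const uppers = scaleIn.map(t => t.upper).filter((u): u is number => u !== undefined);
  const lowers = scaleOut.map(t => t.lower).filter((l): l is number => l !== undefined);
  if (uppers.length === 0 || uppers.length !== scaleIn.length) {
    throw new Error("Every scale-in tier needs an upper bound");
  }
  if (lowers.length === 0 || lowers.length !== scaleOut.length) {
    throw new Error("Every scale-out tier needs a lower bound");
  }
  const lowThreshold = Math.max(...uppers);  // lower alarm fires below this value
  const highThreshold = Math.min(...lowers); // upper alarm fires above this value
  return {
    low: { threshold: lowThreshold, steps: relativeTo(lowThreshold, scaleIn) },
    high: { threshold: highThreshold, steps: relativeTo(highThreshold, scaleOut) },
  };
}

const plan = planAlarms([
  { upper: 10, change: -2 },
  { lower: 10, upper: 20, change: -1 },
  { lower: 80, upper: 90, change: +1 },
  { lower: 90, change: +2 },
]);
console.log(JSON.stringify(plan, null, 2));
```

With the fictitious CPU tiers, this yields a lower alarm at 20 and an upper alarm at 80, with each side's steps expressed relative to its own threshold. Whether those two step lists can live in a single policy, as Solution 3 proposes, is exactly the open question.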
Questions
* Can an AutoScaling Policy be in the `OKActions` of a CloudWatch Alarm? Does that work? => yes
* If it does work, will it also continue evaluating and triggering the Scaling Policy periodically if it's on the `OKActions`, just like it does in `AlarmActions`? => yes
* `ChangeInCapacity = 0`?