Cheap and cheerful autoscaler #229

Merged: 62 commits, Feb 28, 2018
Commits
520284f
Bring back the queue.
josephburnett Feb 2, 2018
0ef9a10
Wire queue between nginx and app.
josephburnett Feb 5, 2018
6373b51
Autoscaler and queue that share a stat type.
josephburnett Feb 6, 2018
09ca18e
Initialize queue with autoscaler service before starting stat reporter.
josephburnett Feb 6, 2018
aef729a
Connect stat sink.
josephburnett Feb 14, 2018
ea33034
Add gorilla websocket to deps.
josephburnett Feb 14, 2018
847c907
Build the queue with bazel and pass digest into controller through com…
josephburnett Feb 14, 2018
cf94ccb
Setup env variables and service account for queue to find the autosca…
josephburnett Feb 15, 2018
5cc75a3
Create autoscaler service and deployment and connect queue.
josephburnett Feb 15, 2018
3025926
Reconnect to autoscaler and send pod name.
josephburnett Feb 16, 2018
8af0392
Calculate 6 and 60 second QPS and scaling action.
josephburnett Feb 16, 2018
0eca4d4
Replace 6 and 60 with parameters.
josephburnett Feb 16, 2018
aaf102c
Do actual scaling. Tune parameters.
josephburnett Feb 16, 2018
eea535b
Scale deployment in the background.
josephburnett Feb 20, 2018
fec0ce1
Request a full CPU for each ela pod.
josephburnett Feb 22, 2018
1861bc7
Calculate QPS with floats.
josephburnett Feb 22, 2018
f2ff89f
Scale on concurrent requests instead of QPS.
josephburnett Feb 22, 2018
2e736bb
Provide desired concurrency per process in revision spec.
josephburnett Feb 22, 2018
8f776d5
Add test for queue-proxy. Fails because of extra autoscaler deployment.
josephburnett Feb 22, 2018
81ed376
Fix unit tests by checking ela deployment separately from autoscaler …
josephburnett Feb 23, 2018
98beef3
Add autoscaler deployment env variable test.
josephburnett Feb 23, 2018
ef41614
Move core autoscaler logic into lib for unit testing.
josephburnett Feb 23, 2018
45861ae
Refactoring autoscaler for unit testing.
josephburnett Feb 23, 2018
b8eb904
Autoscaler unit tests.
josephburnett Feb 25, 2018
81091e8
Autoscaler comments.
josephburnett Feb 25, 2018
812585d
Only accept target concurrency of 1+.
josephburnett Feb 25, 2018
7a82f11
Limit scale up ratio to 10x.
josephburnett Feb 28, 2018
26b5893
Fix git rebase mistakes.
josephburnett Feb 26, 2018
76ecac5
Move autoscaler main to cmd.
josephburnett Feb 26, 2018
023abbb
Move queue sidecar to cmd.
josephburnett Feb 26, 2018
185848a
Remove TargetConcurrencyPerProcess revision parameter.
josephburnett Feb 26, 2018
6951abf
Replace log with glog.
josephburnett Feb 26, 2018
f2da749
Use defaults for websocket upgrader.
josephburnett Feb 26, 2018
c686fc5
Add service account and binding for autoscaler.
josephburnett Feb 26, 2018
0d1fa95
Update deps.
josephburnett Feb 26, 2018
eb6cabf
Fix incorrect usage of glog.
josephburnett Feb 26, 2018
94c5033
Const parameters.
josephburnett Feb 26, 2018
1cf32bc
Fix targetConcurrency typo.
josephburnett Feb 28, 2018
582ba29
Fix typo.
josephburnett Feb 27, 2018
fdfaf11
Inject autoscaler name to remove hardcoded value.
josephburnett Feb 27, 2018
dc4256a
Pull out queue parameters into constants.
josephburnett Feb 27, 2018
831490a
Add license headers to queue and autoscaler.
josephburnett Feb 27, 2018
59dd1d4
CPU requests in constants.
josephburnett Feb 27, 2018
3cfa284
Plumb autoscaler port through env from single constant.
josephburnett Feb 27, 2018
a4c01e1
Comment for autoscaler types.
josephburnett Feb 27, 2018
e72087b
Fix log statement formatting.
josephburnett Feb 28, 2018
0b75d97
Add back ela-revision service account.
josephburnett Feb 28, 2018
5adb66d
Report time from pod with concurrency stat.
josephburnett Feb 28, 2018
9ecea8d
Send only one scale request at a time with a 5 second timeout.
josephburnett Feb 28, 2018
13dbf7c
Move autoscaler docs to package documentation.
josephburnett Feb 28, 2018
829bfe8
Include pod name in stat key.
josephburnett Feb 28, 2018
b1c8f1a
Comment about waiting for autoscaler IP.
josephburnett Feb 28, 2018
3567398
Add queue->autoscaler connect sleep comment.
josephburnett Feb 28, 2018
51eeea1
Parse localhost url once.
josephburnett Feb 28, 2018
28f1f1b
Use singleton proxy in queue.
josephburnett Feb 28, 2018
e01bf7e
Fold autoscaler/types package into autoscaler.
josephburnett Feb 28, 2018
a5f8b7d
Add Record and Scale function comments.
josephburnett Feb 28, 2018
dd47fa0
Move environment variable access into init.
josephburnett Feb 28, 2018
0912ee4
Merge branch 'master' into caca
josephburnett Feb 28, 2018
65d1c4f
Remove enableQueue nginx template parameter.
josephburnett Feb 28, 2018
aa9de12
Comments and copyright headers.
josephburnett Feb 28, 2018
0b22485
Copyright header.
josephburnett Feb 28, 2018
2 changes: 2 additions & 0 deletions BUILD
@@ -13,6 +13,8 @@ k8s_object(
name = "controller",
images = {
"ela-controller:latest": "//cmd/ela-controller:image",
"ela-queue:latest": "//cmd/ela-queue:image",
"ela-autoscaler:latest": "//cmd/ela-autoscaler:image",
},
template = "controller.yaml",
)
8 changes: 7 additions & 1 deletion Gopkg.lock

Some generated files are not rendered by default.

26 changes: 26 additions & 0 deletions clusterrolebinding.yaml
@@ -24,3 +24,29 @@ roleRef:
kind: ClusterRole
name: cluster-admin
apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: ela-autoscaler-write
subjects:
- kind: ServiceAccount
name: ela-autoscaler
namespace: default
roleRef:
kind: ClusterRole
name: cluster-admin
apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: ela-revision-read
subjects:
- kind: ServiceAccount
name: ela-revision
namespace: default
roleRef:
kind: ClusterRole
name: cluster-admin # TODO(josephburnett): reduce this role to read-only
Member:

How deeply do we understand the capabilities the autoscaler needs right now? Can we just do this TODO?

Contributor Author:

In the medium term, we want to collect metrics from Prometheus, in which case we can do away with this queue->autoscaler websocket pipeline and associated permissions. In the short term, we should turn the client-server relationship around and have the autoscaler scrape the pods (@evankanderson's and @vaikas-google's suggestion), which would also do away with the pod permission requirement. This is just something to play around with and I plan to get rid of it. Will update the comment accordingly.

Contributor:

> In the medium term, we want to collect metrics from Prometheus

If the autoscaler collects metrics from Prometheus instead of pods directly, its reaction time is coupled to Prometheus' sampling interval. It would be more flexible to have the autoscaler scrape pods directly (using their Prometheus endpoints). Then it can decide its own sampling interval. Here are some examples of situations when the autoscaler might want to vary sampling frequency:

  • Watch pod creation events and sample new revisions more frequently
  • Sample highly scaled revisions slower on the assumption that they're less likely to need fast reactions and are more expensive to sample
  • Increase sampling frequency of revisions that were recently scaled

If the autoscaler does its own scraping, it still needs a ClusterRoleBinding with read permissions so it can enumerate the list of pods to target.

It's possible that Prometheus has an API the autoscaler can use to increase or decrease the sampling frequency of a particular tagged metric. That might be sufficient and we could avoid writing a bunch of scraping code.

Contributor Author:

> its reaction time is coupled to Prometheus' sampling interval.

Agree. Maybe we will stick with scraping the pods if we can't get the Envoy->Mixer->Prometheus pipeline latency low enough.

Yes, the autoscaler will still need a role to find the pods, and to modify the deployment. The queue is also using this role binding and that should go away.

apiGroup: rbac.authorization.k8s.io
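
For reference, here is a minimal sketch of the pull-based alternative discussed in this thread: the autoscaler lists the revision's pods (still requiring the read permission above) and scrapes each pod's metrics endpoint on a schedule it controls, so it can vary the sampling interval per revision. The label selector, the `:9090/metrics` endpoint, and the `parseConcurrency` helper are assumptions made for the sketch; none of them are part of this PR.

```go
// Sketch only: a pull-based collector where the autoscaler scrapes pod
// metrics endpoints itself and therefore owns its own sampling interval.
package scrape

import (
	"fmt"
	"io/ioutil"
	"net/http"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// parseConcurrency stands in for real Prometheus text-format parsing.
func parseConcurrency(body []byte) float64 { return 0 }

func scrapePods(kc *kubernetes.Clientset, namespace, selector string, interval time.Duration) {
	for range time.Tick(interval) {
		// Enumerate the pods to target; this is the read permission discussed above.
		pods, err := kc.CoreV1().Pods(namespace).List(metav1.ListOptions{LabelSelector: selector})
		if err != nil {
			continue
		}
		for _, p := range pods.Items {
			// Assumed metrics endpoint; the port is illustrative only.
			resp, err := http.Get(fmt.Sprintf("http://%s:9090/metrics", p.Status.PodIP))
			if err != nil {
				continue
			}
			body, _ := ioutil.ReadAll(resp.Body)
			resp.Body.Close()
			_ = parseConcurrency(body) // would feed the autoscaler's Record()
		}
	}
}
```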
32 changes: 32 additions & 0 deletions cmd/ela-autoscaler/BUILD.bazel
@@ -0,0 +1,32 @@
load("@io_bazel_rules_go//go:def.bzl", "go_binary", "go_library")

go_library(
name = "go_default_library",
srcs = ["main.go"],
importpath = "github.com/google/elafros/cmd/ela-autoscaler",
visibility = ["//visibility:private"],
deps = [
"//pkg/autoscaler:go_default_library",
"//vendor/github.com/golang/glog:go_default_library",
"//vendor/github.com/gorilla/websocket:go_default_library",
"//vendor/k8s.io/apimachinery/pkg/apis/meta/v1:go_default_library",
"//vendor/k8s.io/client-go/kubernetes:go_default_library",
"//vendor/k8s.io/client-go/rest:go_default_library",
],
)

go_binary(
name = "ela-autoscaler",
embed = [":go_default_library"],
importpath = "github.com/google/elafros/cmd/ela-autoscaler",
pure = "on",
visibility = ["//visibility:public"],
)

load("@io_bazel_rules_docker//go:image.bzl", "go_image")

go_image(
name = "image",
binary = ":ela-autoscaler",
visibility = ["//visibility:public"],
)
190 changes: 190 additions & 0 deletions cmd/ela-autoscaler/main.go
@@ -0,0 +1,190 @@
/*
Copyright 2018 Google Inc. All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package main

import (
"bytes"
"encoding/gob"
"net/http"
"os"
"time"

ela_autoscaler "github.com/google/elafros/pkg/autoscaler"

"github.com/golang/glog"
"github.com/gorilla/websocket"

metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/client-go/kubernetes"
"k8s.io/client-go/rest"
)

const (
// The desired number of concurrent requests for each pod. This
// is the primary knob for the fast autoscaler, which will try to
// achieve a 60-second average concurrency per pod of
// targetConcurrency. Another process may tune targetConcurrency
// to best handle the resource requirements of the revision.
targetConcurrency = float64(1.0)

// A big enough buffer to handle 1000 pods sending stats every 1
// second while we do the autoscaling computation (a few hundred
// milliseconds).
statBufferSize = 1000

// Enough buffer to store scale requests generated every 2
// seconds while an http request is taking the full timeout of 5
// seconds.
scaleBufferSize = 10
)

var (
upgrader = websocket.Upgrader{}
kubeClient *kubernetes.Clientset
statChan = make(chan ela_autoscaler.Stat, statBufferSize)
scaleChan = make(chan int32, scaleBufferSize)
elaNamespace string
elaDeployment string
elaAutoscalerPort string
)

func init() {
elaNamespace = os.Getenv("ELA_NAMESPACE")
if elaNamespace == "" {
glog.Fatal("No ELA_NAMESPACE provided.")
}
glog.Infof("ELA_NAMESPACE=%v", elaNamespace)

elaDeployment = os.Getenv("ELA_DEPLOYMENT")
if elaDeployment == "" {
glog.Fatal("No ELA_DEPLOYMENT provided.")
}
glog.Infof("ELA_DEPLOYMENT=%v", elaDeployment)

elaAutoscalerPort = os.Getenv("ELA_AUTOSCALER_PORT")
if elaAutoscalerPort == "" {
glog.Fatal("No ELA_AUTOSCALER_PORT provided.")
}
glog.Infof("ELA_AUTOSCALER_PORT=%v", elaAutoscalerPort)
}

func autoscaler() {
glog.Infof("Target concurrency: %0.2f.", targetConcurrency)

a := ela_autoscaler.NewAutoscaler(targetConcurrency)
ticker := time.NewTicker(2 * time.Second)

for {
select {
Contributor:

It seems these two cases are independent. If both are ready, we should do both instead of choosing one.

Contributor Author:

Autoscaler is not safe for concurrent access, so we only do one of these at a time. Not a big deal since the Scale computation is pretty fast, even with 1000's of pods. And the stat channel is buffered.

case <-ticker.C:
scale, ok := a.Scale(time.Now())
if ok {
scaleChan <- scale
}
case s := <-statChan:
a.Record(s)
}
}
}

func scaleSerializer() {
for {
select {
case desiredPodCount := <-scaleChan:
FastForward:
// Fast forward to the most recent desired pod
// count since the http timeout (5 sec) is more
// than the autoscaling rate (2 sec) and there
// could be multiple pending scale requests.
for {
select {
case p := <-scaleChan:
glog.Warning("Scaling is not keeping up with autoscaling requests.")
desiredPodCount = p
default:
break FastForward
}
}
scaleTo(desiredPodCount)
}
}
}

func scaleTo(podCount int32) {
glog.Infof("Target scale is %v", podCount)
dc := kubeClient.ExtensionsV1beta1().Deployments(elaNamespace)
deployment, err := dc.Get(elaDeployment, metav1.GetOptions{})
if err != nil {
glog.Error("Error getting Deployment %q: %s", elaDeployment, err)
return
}
if *deployment.Spec.Replicas == podCount {
glog.Info("Already at scale.")
return
}
deployment.Spec.Replicas = &podCount
_, err = dc.Update(deployment)
if err != nil {
glog.Errorf("Error updating Deployment %q: %s", elaDeployment, err)
}
glog.Info("Successfully scaled.")
}

func handler(w http.ResponseWriter, r *http.Request) {
conn, err := upgrader.Upgrade(w, r, nil)
if err != nil {
glog.Error(err)
return
}
glog.Info("New metrics source online.")
for {
messageType, msg, err := conn.ReadMessage()
if err != nil {
glog.Info("Metrics source dropping off.")
return
}
if messageType != websocket.BinaryMessage {
glog.Error("Dropping non-binary message.")
continue
}
dec := gob.NewDecoder(bytes.NewBuffer(msg))
var stat ela_autoscaler.Stat
err = dec.Decode(&stat)
if err != nil {
glog.Error(err)
continue
}
statChan <- stat
}
}

func main() {
glog.Info("Autoscaler up")
config, err := rest.InClusterConfig()
if err != nil {
glog.Fatal(err)
}
config.Timeout = time.Duration(5 * time.Second)
kc, err := kubernetes.NewForConfig(config)
if err != nil {
glog.Fatal(err)
}
kubeClient = kc
go autoscaler()
go scaleSerializer()
http.HandleFunc("/", handler)
http.ListenAndServe(":"+elaAutoscalerPort, nil)
}
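
For orientation, here is a minimal sketch of the sending side of this websocket/gob pipeline, roughly what the queue sidecar does once per second. The podStat struct stands in for the real ela_autoscaler.Stat (its actual field names live in pkg/autoscaler), and the autoscaler URL and concurrent() callback are placeholders for this sketch.

```go
// Sketch only: the queue-side reporter feeding the handler above.
package queue

import (
	"bytes"
	"encoding/gob"
	"time"

	"github.com/golang/glog"
	"github.com/gorilla/websocket"
)

// podStat stands in for ela_autoscaler.Stat; the real field names may differ.
type podStat struct {
	PodName            string
	ConcurrentRequests int32
	Time               time.Time
}

func reportStats(autoscalerURL, podName string, concurrent func() int32) {
	conn, _, err := websocket.DefaultDialer.Dial(autoscalerURL, nil)
	if err != nil {
		glog.Error(err)
		return
	}
	defer conn.Close()
	for now := range time.Tick(time.Second) {
		var buf bytes.Buffer
		stat := podStat{PodName: podName, ConcurrentRequests: concurrent(), Time: now}
		if err := gob.NewEncoder(&buf).Encode(stat); err != nil {
			glog.Error(err)
			continue
		}
		// The handler above only accepts BinaryMessage frames.
		if err := conn.WriteMessage(websocket.BinaryMessage, buf.Bytes()); err != nil {
			glog.Error(err)
			return
		}
	}
}
```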
33 changes: 33 additions & 0 deletions cmd/ela-queue/BUILD.bazel
@@ -0,0 +1,33 @@
load("@io_bazel_rules_go//go:def.bzl", "go_binary", "go_library")

go_library(
name = "go_default_library",
srcs = ["main.go"],
importpath = "github.com/google/elafros/cmd/ela-queue",
visibility = ["//visibility:private"],
deps = [
"//pkg/autoscaler:go_default_library",
"//vendor/github.com/golang/glog:go_default_library",
"//vendor/github.com/gorilla/websocket:go_default_library",
"//vendor/k8s.io/api/core/v1:go_default_library",
"//vendor/k8s.io/apimachinery/pkg/apis/meta/v1:go_default_library",
"//vendor/k8s.io/client-go/kubernetes:go_default_library",
"//vendor/k8s.io/client-go/rest:go_default_library",
],
)

go_binary(
name = "ela-queue",
Contributor:

The purpose of the queue is changed to forwarding stats?

Contributor Author:

Um ... yes. It's a hack. The queue's original purpose was to support enforced request serialization, and it's still in the right place to do that. But I am also using it to count how many requests are in that queue, without enforcing serialization (see the sketch after this BUILD file).

embed = [":go_default_library"],
importpath = "github.com/google/elafros/cmd/ela-queue",
pure = "on",
visibility = ["//visibility:public"],
)

load("@io_bazel_rules_docker//go:image.bzl", "go_image")

go_image(
name = "image",
binary = ":ela-queue",
visibility = ["//visibility:public"],
)
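
To illustrate the request-counting hack described in the thread above, here is a minimal sketch of a queue-proxy handler that counts in-flight requests with an atomic counter while passing traffic straight through, i.e. without enforcing serialization. The function and variable names are placeholders and the upstream URL plumbing is omitted; the actual proxy lives in cmd/ela-queue.

```go
// Sketch only: counting concurrent requests in a pass-through reverse proxy.
package queue

import (
	"net/http"
	"net/http/httputil"
	"net/url"
	"sync/atomic"
)

// inFlight is sampled periodically and reported to the autoscaler as a stat.
var inFlight int32

func countingProxy(upstream *url.URL) http.Handler {
	proxy := httputil.NewSingleHostReverseProxy(upstream)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		atomic.AddInt32(&inFlight, 1)
		defer atomic.AddInt32(&inFlight, -1)
		proxy.ServeHTTP(w, r) // pass through; no serialization is enforced
	})
}
```

A reporter goroutine can then snapshot the counter with atomic.LoadInt32(&inFlight) each second and send it over the websocket connection sketched earlier.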