How to Install SCF
- Table Of Contents
- Requirements for Kubernetes
- Verifying Kubernetes
- Kube DNS
- Storage Classes
- Cloud Foundry Console UI (Stratos UI)
- Helm installation
- SCF Installation
- Removal and Cleanup via helm
- CF documentation
Note these instructions are applicable to SCF 2.14.5. For previous release install instructions see the SCF install pages in the sidebar.
The various machines (api, kube, and node) of the kubernetes cluster must be configured in a particular way to support the execution of SCF. These requirements are, in general:
- Kubernetes API versions 1.8+ (tested on 1.9)
- Kernel parameters swapaccount=1
- docker info must not show aufs as the storage driver.
- kube-dns must be running and be fully ready. See section Kube DNS.
- Either ntp or systemd-timesyncd must be installed and active.
- The kubernetes cluster must have a storage class SCF can refer to. See section Storage Classes.
- Docker must be configured to allow privileged containers.
- Privileged containers must be enabled in kube-apiserver. See https://kubernetes.io/docs/admin/kube-apiserver
- Privileged must be enabled in kubelet.
- The TasksMax property of the containerd service definition must be set to infinity.
- Helm's Tiller has to be installed and active.
An easy way of setting up a small single-machine kubernetes cluster with all the necessary properties is to use the Vagrant definition in the SCF repository. The details of this approach are explained in https://github.com/SUSE/scf/blob/develop/README.md#deploying-scf-on-vagrant
For ease of verification of the above requirements a script (kube-ready-state-check.sh) is made available which contains the necessary checks.
The important part to know is this script must be run on different machines. A machine "category" is defined in the script to tell it which pieces of the script to run. The categories are described below:
Category | Explanation |
---|---|
api | Run this on the machine running the apiserver container |
node | Run this on the machine that's running the kubelet container |
kube | Run this on the machine that has access to the Kubernetes cluster via kubectl |
Note: In CaaSP run api on the kube-master, node on the kube-workers, and kube on your desktop/laptop that has kubectl installed and connected to CaaSP. However, on EKS/AKS/GKE the readiness script can only be run on the worker nodes as EKS, AKS or GKE do not expose the master node.
An example invocation on the machine running the apiserver container might be:
./kube-ready-state-check.sh api
The script will run the tests applicable to the named category.
Positive results are prefixed with Verified:, whereas failed requirements are prefixed with Configuration problem detected:.
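Because the two prefixes are fixed strings, the results can be filtered mechanically. The following is a minimal sketch, assuming the script writes its results to standard output; the helper name count_problems is made up for this example and is not part of the SCF distribution.

```shell
#!/bin/sh
# Hypothetical helper: count the lines that report a failed requirement.
# Reads the output of kube-ready-state-check.sh on stdin and prints a number.
count_problems() {
    # grep -c exits non-zero when nothing matches; "|| true" keeps the
    # helper usable under "set -e" while still printing 0.
    grep -c 'Configuration problem detected:' || true
}

# Example usage (assumes the script prints its results to stdout):
#   ./kube-ready-state-check.sh api | count_problems
```

A result of 0 suggests that every check for the named category passed.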
The cluster must have an active kube-dns.
If you are running CaaSP you can simply use the following command to install it:
kubectl apply \
-f https://raw.githubusercontent.com/SUSE/caasp-services/b0cf20ca424c41fa8eaef6d84bc5b5147e6f8b70/contrib/addons/kubedns/dns.yaml
The kubernetes cluster must have a storage class SCF can refer to so that its database components have a place for their persistent data.
This class may have any name; the vagrant setup uses persistent.
Important information on storage classes and how to create and configure them can be found here:
Note: while the distribution comes with an example storage class persistent of type hostpath, for use with the vagrant box, this is a toy option and should not be used with anything but the vagrant box. It is actually quite likely that whatever kube setup is used will not even support the type hostpath for storage classes, automatically preventing its use.
To enable hostpath support for testing, the kube-controller-manager must be run with the --enable-hostpath-provisioner command line option.
See https://github.com/SUSE/stratos-ui/releases for distributions of Stratos UI - the Cloud Foundry Console UI. It is also deployed using Helm. Please follow the steps below to see when to install it.
SCF uses Helm charts to deploy on kubernetes clusters. To install Helm see
#### Mixed case DOMAINs in values.yaml
Using something like this in the scf-config-values.yaml will result in an error:
UAA_HOST: uaa.751LSjkQ.mydomain.com
Error (when visiting https://uaa.751lsjkq.mydomain.com:2793/login):
The subdomain does not map to a valid identity zone.
The reason is the way UAA matches hostnames internally, which should probably be considered a bug in UAA (https://github.com/cloudfoundry/uaa/issues/797). This issue has been resolved and merged upstream.
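On versions without the upstream fix, one way to avoid the problem is to check the hostname before writing it into scf-config-values.yaml. This is a minimal sketch; the helper names lowercase_host and has_uppercase are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Hypothetical helper: print a hostname converted to lower case.
lowercase_host() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

# Hypothetical helper: succeed if the hostname contains an upper-case letter.
has_uppercase() {
    case "$1" in
        *[[:upper:]]*) return 0 ;;
        *)             return 1 ;;
    esac
}

# Example usage:
#   if has_uppercase "uaa.751LSjkQ.mydomain.com"; then
#       lowercase_host "uaa.751LSjkQ.mydomain.com"
#   fi
```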
If an installation or upgrade results in a failed installation with StatefulSet roles not coming online, subsequent upgrades must be followed by manually restarting the pods in the offline StatefulSet roles.
This happens because Kubernetes won't replace a previous generation pod of a StatefulSet unless it's alive and ready. To recover, you must manually delete the pods of the failing StatefulSets:
# Look and see which StatefulSets are not ok (*desired* count is more than *current* count)
kubectl get sts --namespace NAMESPACE
# Delete the offending pods
kubectl delete pods -l skiff-role-name=STATEFULSET_NAME --namespace NAMESPACE
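The comparison of desired and current counts can also be scripted. The sketch below assumes that the cluster reports .spec.replicas and .status.readyReplicas for each StatefulSet; the helper name lagging_statefulsets is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "NAME DESIRED READY" lines on stdin and print the
# names whose ready count is below the desired count. A missing READY field
# is treated as 0.
lagging_statefulsets() {
    awk '($3 + 0) < ($2 + 0) { print $1 }'
}

# Example usage (NAMESPACE is a placeholder, as in the commands above):
#   kubectl get sts --namespace NAMESPACE \
#     -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.replicas}{" "}{.status.readyReplicas}{"\n"}{end}' \
#     | lagging_statefulsets
```

Each name printed is a candidate for the kubectl delete pods command shown above.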
Get the distribution archive from https://github.com/SUSE/scf/releases (the first link under Assets, not the Source code). Create a directory and extract the archive into it.
wget https://github.com/SUSE/scf/releases/download/scf-X.Y.Z.linux-amd64.zip # example url
mkdir deploy
unzip scf-X.Y.Z.linux-amd64.zip -d deploy # example zipfile
cd deploy
> ls
helm/
kube/
kube-ready-state-check.sh*
scripts/
We now have the helm charts for SCF and UAA in a subdirectory helm.
Additional configuration files are found under kube.
The scripts directory contains helpers for cert generation.
Choose the name of the kube storage class to use, and create the class if it doesn't exist.
See section Storage Classes for important notes. To see if you have a storage class you can use for scf run the command: kubectl get storageclasses.
Note: The persistent class created below is of type hostpath, which is only meant for toy examples and is not to be used in production deployments (its use is disabled in Kubernetes by default).
Here we use the hostpath storage class for simplicity of setup. Note that the storageclass apiVersion used in the manifest should be either storage.k8s.io/v1beta1 (for kubernetes 1.5.x) or storage.k8s.io/v1 (for kubernetes 1.6.x and above).
Use kubectl to check your kubernetes server version:
kubectl version --short | grep "Server Version"
For kubernetes 1.5.x:
echo '{"kind":"StorageClass","apiVersion":"storage.k8s.io/v1beta1","metadata":{"name":"persistent"},"provisioner":"kubernetes.io/host-path"}' | kubectl create -f -
For kubernetes 1.6.x and above:
echo '{"kind":"StorageClass","apiVersion":"storage.k8s.io/v1","metadata":{"name":"persistent"},"provisioner":"kubernetes.io/host-path"}' | kubectl create -f -
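The choice between the two commands can be derived from the server version. This is a minimal sketch, assuming the version string has the form v1.MINOR.PATCH; the helper name storage_api_version is made up for this example, and anything it cannot parse falls back to storage.k8s.io/v1.

```shell
#!/bin/sh
# Hypothetical helper: map a server version such as "v1.9.6" to the
# StorageClass apiVersion described above.
storage_api_version() {
    # Extract the minor version number from "v1.<minor>...".
    minor=$(printf '%s\n' "$1" | sed -n 's/^v\{0,1\}1\.\([0-9][0-9]*\).*/\1/p')
    if [ -n "$minor" ] && [ "$minor" -lt 6 ]; then
        echo "storage.k8s.io/v1beta1"
    else
        # Assumption: unparsable versions are treated like 1.6.x and above.
        echo "storage.k8s.io/v1"
    fi
}

# Example usage:
#   version=$(kubectl version --short | grep "Server Version" | awk '{print $3}')
#   storage_api_version "$version"
```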
Next create a values.yaml file (the rest of the docs assume filename: scf-config-values.yaml) with the settings required for the install. Copy the below as a template for this file and modify the values to suit your installation.
env:
    # Domain for SCF. DNS for *.DOMAIN must point to a kube node's (not master)
    # external ip address.
    DOMAIN: cf-dev.io
    # UAA host/port that SCF will talk to. If you have a custom UAA
    # provide its host and port here. If you are using the UAA that comes
    # with the SCF distribution, simply use the two values below and
    # substitute the cf-dev.io for your DOMAIN used above.
    # UAA_HOST: uaa.cf-dev.io
    # UAA_PORT: 2793
kube:
    # The IP address assigned to the kube node pointed to by the domain.
    #### the external_ip setting changed to accept a list of IPs, and was
    #### renamed to external_ips
    external_ips:
    - 192.168.77.77
    storage_class:
        # Make sure to change the value in here to whatever storage class you use
        persistent: "persistent"
        shared: "shared"
    auth: rbac
secrets:
    # Password for user 'admin' in the cluster
    CLUSTER_ADMIN_PASSWORD: changeme
    # Password for SCF to authenticate with UAA
    UAA_ADMIN_CLIENT_SECRET: uaa-admin-client-secret
The previous section gave a reference to the Helm documentation explaining how to install Helm itself. Remember also that in the Vagrant-based setup helm is already installed and ready.
- Deploy UAA

    helm install helm/uaa \
        --namespace uaa \
        --values scf-config-values.yaml \
        --name uaa

- With UAA deployed and running, get the internal-ca-cert for talking to the UAA

    SECRET=$(kubectl get pods --namespace uaa -o jsonpath='{.items[?(.metadata.name=="uaa-0")].spec.containers[?(.name=="uaa")].env[?(.name=="INTERNAL_CA_CERT")].valueFrom.secretKeyRef.name}')
    CA_CERT="$(kubectl get secret $SECRET --namespace uaa -o jsonpath="{.data['internal-ca-cert']}" | base64 --decode -)"

  Note that secrets are versioned and the numerical suffix on the secret name will change if you upgrade the chart; please check helm list or kubectl get secrets --namespace uaa for the correct number.

- With UAA deployed, use Helm to deploy SCF. This step uses the cert determined by the previous step.

    helm install helm/cf \
        --namespace scf \
        --name scf \
        --values scf-config-values.yaml \
        --set "secrets.UAA_CA_CERT=${CA_CERT}"

- Wait for everything to be ready:

    watch -c 'kubectl get pods --all-namespaces'

  Stop watching when all pods show state Running and Ready is n/n (instead of k/n, k < n).
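The readiness condition described in the last step can also be checked by a script instead of by eye. This is a minimal sketch, assuming the usual column layout of kubectl get pods --all-namespaces --no-headers (NAMESPACE, NAME, READY, STATUS, ...). The helper name pods_not_ready is hypothetical, and skipping pods in state Completed is an assumption made here for finished one-off jobs.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods --all-namespaces --no-headers`
# output on stdin and print every pod that is not yet Running with READY n/n.
# Pods in state Completed are skipped (assumed to be finished one-off jobs).
pods_not_ready() {
    awk '$4 == "Completed" { next }
         { split($3, r, "/") }
         $4 != "Running" || r[1] != r[2] { print $1 "/" $2 }'
}

# Example usage: poll until nothing is reported.
#   while [ -n "$(kubectl get pods --all-namespaces --no-headers | pods_not_ready)" ]; do
#       sleep 10
#   done
```

In an HA deployment some pods stay passive and never report ready, so a loop like this is only suitable for a basic deployment.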
Stratos UI is also deployed using Helm.
Add the Stratos UI Helm Repository with the command:
helm repo add stratos-ui https://cloudfoundry-incubator.github.io/stratos
Deploy Stratos UI: (do this from the folder where you created the scf-config-values.yaml configuration file)
helm install stratos-ui/console \
--namespace stratos \
--values scf-config-values.yaml
This will install Stratos UI using the configuration that you created in the scf-config-values.yaml previously.
Please see here - Accessing the Console - for details on how to determine the URL of your Stratos Console UI.
When deploying with the SCF config values, you should be able to log in with your Cloud Foundry credentials. If you see an upgrade message, please wait up to a minute for the installation to complete.
If you do not wish to use the SCF configuration values, then more information is available on deploying the UI in Kubernetes here - https://github.com/SUSE/stratos-ui/tree/master/deploy/kubernetes.
Note: If you deploy without the SCF configuration you will need to use the Setup UI to provide the UAA configuration. Typical values are:
- UAA URL: This is composed of https://NAMESPACE.uaa.DOMAIN:2793 (i.e. https://scf.uaa.10.10.10.10.nip.io:2793)
- Client ID: cf
- Client Secret: EMPTY (do not fill in this box)
- Admin Username: User provided value
- Admin Password: User provided value
These example instructions deploy a MySQL server and a corresponding sidecar as Cloud Foundry docker apps and expose the service via USB.
CF_DOMAIN=cf-dev.io # Set to match the DOMAIN value of your config
CF_MYSQL_DOMAIN="mysql.${CF_DOMAIN}"
SERVER_APP=mysql
MYSQL_USER=root
MYSQL_PASS=testpass
SIDECAR_API_KEY=secret-key
SIDECAR_APP=msc
# Create a shared domain
cf create-shared-domain "${CF_MYSQL_DOMAIN}" --router-group default-tcp
cf update-quota default --reserved-route-ports -1
# Create a security group
echo > "internal-services.json" '[{ "destination": "0.0.0.0/0", "protocol": "all" }]'
cf create-security-group internal-services-workaround internal-services.json
cf bind-running-security-group internal-services-workaround
cf bind-staging-security-group internal-services-workaround
# Enable docker support in diego
cf enable-feature-flag diego_docker
# Deploy mysql server
cf push --no-start --no-route --health-check-type none "${SERVER_APP}" -o mysql/mysql-server
cf map-route "${SERVER_APP}" "${CF_MYSQL_DOMAIN}" --random-port
cf set-env "${SERVER_APP}" MYSQL_ROOT_PASSWORD "${MYSQL_PASS}"
cf set-env "${SERVER_APP}" MYSQL_ROOT_HOST '%'
cf start "${SERVER_APP}"
MYSQL_PORT=$(cf routes | grep "${CF_MYSQL_DOMAIN}" | awk '{print $3}')
# Wait for MySQL to be ready
function wait_on_port
{
    endpoint="${CF_MYSQL_DOMAIN}:${1}"
    for (( i = 0; i < 12 ; i++ )) ; do
        if curl --fail -s -o /dev/null "${endpoint}" ; then
            break
        fi
        sleep 5
    done
    # Last try, any error will abort the test
    curl -s "${endpoint}" > /dev/null
}
wait_on_port "${MYSQL_PORT}"
# Push the sidecar app
cf push "${SIDECAR_APP}" --no-start -o splatform/cf-usb-sidecar-dev-mysql
cf set-env "${SIDECAR_APP}" SIDECAR_API_KEY "${SIDECAR_API_KEY}"
cf set-env "${SIDECAR_APP}" SERVICE_MYSQL_HOST "${CF_MYSQL_DOMAIN}"
cf set-env "${SIDECAR_APP}" SERVICE_MYSQL_PORT "${MYSQL_PORT}"
cf set-env "${SIDECAR_APP}" SERVICE_MYSQL_USER "${MYSQL_USER}"
cf set-env "${SIDECAR_APP}" SERVICE_MYSQL_PASS "${MYSQL_PASS}"
cf start "${SIDECAR_APP}"
# Install cf-usb-plugin from https://github.com/SUSE/cf-usb-plugin/releases
# Download the zip archive you need, unpack it, then
cf install-plugin ./cf-plugin-usb
# Verify that USB is OK
cf usb-info
# Create a driver endpoint to the mysql sidecar
# Note that the -c ":" is required as a workaround to a known issue
cf usb-create-driver-endpoint my-service "https://${SIDECAR_APP}.${CF_DOMAIN}" "${SIDECAR_API_KEY}" -c ":"
# Check the service is available in the marketplace and use it
cf marketplace
cf create-service my-service default mydb
cf services
To deploy an HA version of SCF, amend the values.yaml file you're using with helm install with the following - note that some of the role names have changed from the previous release.
sizing:
    api_group:
        count: 2
    cc_clock:
        count: 2
    cc_uploader:
        count: 2
    cc_worker:
        count: 2
    cf_usb:
        count: 2
    diego_api:
        count: 3
    diego_brain:
        count: 2
    diego_cell:
        count: 3
    diego_ssh:
        count: 2
    doppler:
        count: 2
    log_api:
        count: 2
    mysql:
        count: 2
    nats:
        count: 2
    nfs_broker:
        count: 2
    router:
        count: 2
    routing_api:
        count: 2
    syslog_scheduler:
        count: 2
    tcp_router:
        count: 2
The HA pods of the roles below will enter a passive state and won't show a ready state:
- diego-api
- diego-brain
- routing-api
You can confirm this by looking at the logs inside the container. The logs will state .consul-lock.acquiring-lock.
You can also optionally enable the Application Autoscaler and Credhub features which are turned off by default. To do so amend the values in the values.yaml to include the following:
sizing:
    ...
    autoscaler_api:
        count: 2
    autoscaler_metrics:
        count: 2
    autoscaler_postgres:
        count: 1
    credhub_user:
        count: 1
    ...
Note that credhub is considered an experimental feature on Azure AKS.
- roles that cannot be scaled:
  - tcp-router (no strategy for exposing ports correctly)
  - blobstore (needs shared volume support and an active/passive configuration)
- some roles follow an active/passive scaling model, meaning all pods except one (the active) will be shown as NOT READY by kubernetes; this is appropriate and expected behavior
- the resources required to run an HA deployment are considerably higher; for example, running HA in the vagrant box requires at least 24GB memory, 8 VCPUs and fast storage
- when moving from a basic deployment to an HA one, the platform will be unavailable while the upgrade is happening
- upgrading from a basic deployment to an HA one is not currently possible, because secrets get rotated even though reuse-values is specified when doing helm upgrade (jandubois: secrets should not get rotated when doing a helm upgrade ever since we switched to using the scf-secrets-generator mechanism)
- Basic operation of the deployed SCF can be verified by running the CF smoke tests.

  To invoke the tests, you must first modify the kube/cf/bosh-task/smoke-tests.yaml's DOMAIN parameter to match your config. Then run the command

    kubectl create \
        --namespace=scf \
        --filename="kube/cf/bosh-task/smoke-tests.yaml"
    # Wait for completion
    kubectl logs --follow --namespace=scf smoke-tests

- If the deployed SCF is not intended as a production system then its operation can be verified further by running the CF acceptance tests.

  CAUTION: tests are only meant for acceptance environments, and while they attempt to clean up after themselves, no guarantees are made that they won't change the state of the system in an undesirable way. -- https://github.com/cloudfoundry/cf-acceptance-tests/

  To invoke the tests, you must first modify the kube/cf/bosh-task/acceptance-tests.yaml's DOMAIN parameter to match your config. Then run the command

    kubectl create \
        --namespace=scf \
        --filename="kube/cf/bosh-task/acceptance-tests.yaml"
    # Wait for completion
    kubectl logs --follow --namespace=scf acceptance-tests
There are some slight changes when running SCF on CaaSP. The main differences in the configuration are the domain, IP address, and storage class. Related to that, there are additional commands to generate and feed Ceph secrets into the kube, for use by the storage class:
cat > scf-config-values.yaml <<END
env:
    # Domain for SCF. DNS for *.DOMAIN must point to the kube node's
    # external ip. This must match the value passed to the
    # cert-generator.sh script.
    DOMAIN: 10.0.0.154.nip.io
kube:
    # The IP address assigned to the kube node. The example value here
    # is what the vagrant setup assigns
    external_ips:
    - 10.0.0.154
    storage_class:
        persistent: persistent
secrets:
    # Password for the cluster
    CLUSTER_ADMIN_PASSWORD: changeme
    # Password for SCF to authenticate with UAA
    UAA_ADMIN_CLIENT_SECRET: uaa-admin-client-secret
END
kubectl create namespace uaa
# Use Ceph admin secret for now, until we determine how to grant appropriate permissions for non-admin client.
kubectl get secret ceph-secret-admin -o json --namespace default | jq ".metadata.namespace = \"uaa\"" | kubectl create -f -
helm install helm/uaa \
    --namespace uaa \
    --values scf-config-values.yaml
kubectl create namespace scf
# Copy the Ceph admin secret into the scf namespace as well
kubectl get secret ceph-secret-admin -o json --namespace default | sed 's/"namespace": "default"/"namespace": "scf"/' | kubectl create -f -
CA_CERT="$(kubectl get secret secret --namespace uaa -o jsonpath="{.data['internal-ca-cert']}" | base64 --decode -)"
helm install helm/cf \
    --namespace scf \
    --values scf-config-values.yaml \
    --set "secrets.UAA_CA_CERT=${CA_CERT}"
If the error message:
""
appears when attempting to run helm install, then the RBAC permissions on your Kubernetes installation are too restrictive.
Run
kubectl create clusterrolebinding permissive-binding \
--clusterrole=cluster-admin \
--user=admin \
--user=kubelet \
--group=system:serviceaccounts
This makes the underlying Kubernetes less restrictive so that installation can continue. Ref: https://kubernetes.io/docs/admin/authorization/rbac/#permissive-rbac-permissions
First delete the running system at the kube level
kubectl delete namespace uaa
kubectl delete namespace scf
In particular, this removes all the associated volumes as well.
After that use helm list to locate the releases for the SCF and UAA charts and helm delete to remove them at helm's level as well.