diff --git a/docs/reference/README.md b/docs/reference/README.md index c59fdfadd..b5f53adef 100644 --- a/docs/reference/README.md +++ b/docs/reference/README.md @@ -2,6 +2,6 @@ In this object reference, we introduce all objects that are specific for this provider integration. The naming of objects, servers, machines, etc. can be confusing. Without claiming to be consistent throughout these docs, we would like to give an overview of how we name things here. -First, there are some important counterparts of our objects and CAPI objects. ```HetznerCluster``` has CAPI's ```Cluster``` object. CAPI's ```Machine``` object is the counterpart of both ```HCloudMachine``` and ```HetznerBareMetalMachine```. These two are objects of the provider integration that are reconciled by the ```HCloudMachineController``` and the ```HetznerBareMetalMachineController``` respectively. The ```HCloudMachineController``` checks whether there is a server in the HCloud API already and if not, buys/creates one that corresponds to a ```HCloudMachine``` object. The ```HetznerBareMetalMachineController``` does not buy new bare metal machines, but instead consumes a host of the inventory of ```HetznerBareMetalHosts```, which have a one-to-one relationship to Hetzner dedicated/root/bare metal servers that have been bought manually by the user. +First, there are some important counterparts of our objects and CAPI objects. `HetznerCluster` has CAPI's `Cluster` object. CAPI's `Machine` object is the counterpart of both `HCloudMachine` and `HetznerBareMetalMachine`. These two are objects of the provider integration that are reconciled by the `HCloudMachineController` and the `HetznerBareMetalMachineController` respectively. The `HCloudMachineController` checks whether there is a server in the HCloud API already and if not, buys/creates one that corresponds to a `HCloudMachine` object. 
The `HetznerBareMetalMachineController` does not buy new bare metal machines, but instead consumes a host of the inventory of `HetznerBareMetalHosts`, which have a one-to-one relationship to Hetzner dedicated/root/bare metal servers that have been bought manually by the user. -Therefore, there is an important difference between the ```HCloudMachine``` object and a server in the HCloud API. For bare metal, we have even three terms: the ```HetznerBareMetalMachine``` object, the ```HetznerBareMetalHost``` object, and the actual bare metal server that can be accessed through Hetzner's robot API. \ No newline at end of file +Therefore, there is an important difference between the `HCloudMachine` object and a server in the HCloud API. For bare metal, we have even three terms: the `HetznerBareMetalMachine` object, the `HetznerBareMetalHost` object, and the actual bare metal server that can be accessed through Hetzner's robot API. diff --git a/docs/topics/hetzner-baremetal.md b/docs/topics/hetzner-baremetal.md new file mode 100644 index 000000000..c8a0e8bf5 --- /dev/null +++ b/docs/topics/hetzner-baremetal.md @@ -0,0 +1,384 @@ + +### Hetzner Baremetal +Hetzner have two offerings primarily: +1. Hetzner Cloud/ Hcloud -> for virtualized servers +2. Hetzner Dedicated/ Robot -> for bare metal servers + +In this guide, we will focus on creating a cluster from baremetal servers. + +### Flavors of Hetzner Baremetal +Now, there are different ways you can use baremetal servers, you can use them as controlplanes or as worker nodes or both. Based on that we have created some templates and those templates are released as flavors in GitHub releases. 
+ +These flavors can be consumed using [clusterctl](https://main.cluster-api.sigs.k8s.io/user/quick-start.html#install-clusterctl) tool: + + To use bare metal servers for your deployment, you can choose one of the following flavors: + +| Flavor | What it does | +| -------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- | +| hetzner-baremetal-control-planes-remediation | Uses bare metal servers for the control plane nodes - with custom remediation (try to reboot machines first) | +| hetzner-baremetal-control-planes | Uses bare metal servers for the control plane nodes - with normal remediation (unprovision/recreate machines) | +| hetzner-hcloud-control-planes | Uses the hcloud servers for the control plane nodes and the bare metal servers for the worker nodes | + + +NOTE: These flavors are only for demonstration purposes and should not be used in production. + +### Purchasing Bare Metal Servers + +If you want to create a cluster with bare metal servers, you will also need to set up the robot credentials. For setting robot credentials, as described in the [reference](/docs/reference/hetzner-bare-metal-machine-template.md), you need to purchase bare metal servers beforehand manually. + +### Creating a bootstrap cluster +In this guide, we will focus on creating a bootstrap cluster which is basically a local management cluster created using [kind](https://kind.sigs.k8s.io). + +To create a bootstrap cluster, you can use the following command: + +```bash +kind create cluster +``` + +```bash +Creating cluster "kind" ... 
+ ✓ Ensuring node image (kindest/node:v1.29.2) 🖼 + ✓ Preparing nodes 📦 + ✓ Writing configuration 📜 + ✓ Starting control-plane 🕹️ + ✓ Installing CNI 🔌 + ✓ Installing StorageClass 💾 +Set kubectl context to "kind-kind" +You can now use your cluster with: + +kubectl cluster-info --context kind-kind + +Have a question, bug, or feature request? Let us know! https://kind.sigs.k8s.io/#community 🙂 +``` + +After creating the bootstrap cluster, you also need to export some variables. To list the variables that need to be exported, run the following command: + +```bash +$ clusterctl generate cluster my-cluster --list-variables --flavor hetzner-hcloud-control-planes +Required Variables: + - HCLOUD_CONTROL_PLANE_MACHINE_TYPE + - HCLOUD_REGION + - HCLOUD_SSH_KEY + - HCLOUD_WORKER_MACHINE_TYPE + +Optional Variables: + - CLUSTER_NAME (defaults to my-cluster) + - CONTROL_PLANE_MACHINE_COUNT (defaults to 3) + - KUBERNETES_VERSION (defaults to v1.27.7) + - WORKER_MACHINE_COUNT (defaults to 3) +``` + +These variables are used during the deployment of the Hetzner infrastructure provider in the cluster. + +Installing the Hetzner provider can be done using the following command: +```bash +clusterctl init --infrastructure hetzner +``` + +```bash +Fetching providers +Installing cert-manager Version="v1.14.2" +Waiting for cert-manager to be available... +Installing Provider="cluster-api" Version="v1.6.3" TargetNamespace="capi-system" +Installing Provider="bootstrap-kubeadm" Version="v1.6.3" TargetNamespace="capi-kubeadm-bootstrap-system" +Installing Provider="control-plane-kubeadm" Version="v1.6.3" TargetNamespace="capi-kubeadm-control-plane-system" +Installing Provider="infrastructure-hetzner" Version="v1.0.0-beta.33" TargetNamespace="caph-system" + +Your management cluster has been initialized successfully! 
+ +You can now create your first workload cluster by running the following: + + clusterctl generate cluster [name] --kubernetes-version [version] | kubectl apply -f - +``` + +### Generating Workload Cluster Manifest + +Once the infrastructure provider is ready, we can create a workload cluster manifest using `clusterctl generate`: + +```bash +clusterctl generate cluster my-cluster --flavor hetzner-hcloud-control-planes > my-cluster.yaml +``` + +The cluster manifest now lives in the `my-cluster.yaml` file. We will apply it at a later stage, after preparing the secrets and SSH keys. + +### Preparing Hetzner Robot + +1. Create a new web service user. [Here](https://robot.your-server.de/preferences/index), you can define a password and copy your user name. +2. Generate an SSH key. You can either upload it via Hetzner Robot UI or just rely on the controller to upload a key that it does not find in the robot API. You have to store the public and private key together with the SSH key's name in a secret that the controller reads. + +For this tutorial, we will let the controller upload keys to Hetzner Robot. + +#### Creating a new user in Robot +To create a new user in Robot, click on the `Create User` button in the Hetzner Robot console. Once you create the new user, a user ID will be provided to you via email from Hetzner Robot. The password will be the same one that you used while creating the user. + +![robot user](../pics/robot_user.png) + +This is required for the next step. + +### Creating and verifying the SSH key in HCloud +First, create an SSH key locally. You can use the `ssh-keygen` command for this. +```bash +ssh-keygen -t ed25519 -f ~/.ssh/caph +``` +The above command creates a public and a private key in your `~/.ssh` directory. + +You can use the public key `~/.ssh/caph.pub` and upload it to your hcloud project. 
Go to your project and under `Security` -> `SSH Keys` click on `Add SSH key`, add your public key there, and enter `test` in the `Name` field. + +NOTE: There is also a helper CLI called [hcloud](https://github.com/hetznercloud/cli) that can be used for uploading the SSH key. + +In the above step, the name of the SSH key that is recognized by hcloud is `test`. This is important because we will reference the name of the SSH key later. + +This is an important step because the same SSH key is used to access the servers. Make sure you are using the correct SSH key name. + +`test` is the name of the SSH key that we created above. We use this name because the generated manifest references `test` as the SSH key name. +```yaml + sshKeys: + hcloud: + - name: test +``` + +NOTE: If you want to use some other name then you can modify it accordingly. + +### Create Secrets In Management Cluster (Hcloud + Robot) + +In order for the Hetzner provider integration to communicate with the Hetzner API ([HCloud API](https://docs.hetzner.cloud/) + [Robot API](https://robot.your-server.de/doc/webservice/en.html#preface)), we need to create secrets with the access data. The secrets must be in the same namespace as the other CRs. + +We create two secrets: `hetzner` for Hetzner Cloud and Robot API access, and `robot-ssh` for provisioning bare metal servers via SSH. +The `hetzner` secret contains the HCloud API token. It also contains the username and password that are used to interact with the Robot API. The `robot-ssh` secret contains the public key, the private key, and the name of the SSH key used for bare metal servers. + +```shell +export HCLOUD_TOKEN="" +export HETZNER_ROBOT_USER="" +export HETZNER_ROBOT_PASSWORD="" +export HETZNER_SSH_PUB_PATH="" +export HETZNER_SSH_PRIV_PATH="" +``` + +- HCLOUD_TOKEN: The API token of the HCloud project where your cluster will be placed. You have to generate this token in your HCloud project. 
+- HETZNER_ROBOT_USER: The User you have defined in Robot under settings/web. +- HETZNER_ROBOT_PASSWORD: The Robot Password you have set in Robot under settings/web. +- HETZNER_SSH_PUB_PATH: The Path to your generated Public SSH Key. +- HETZNER_SSH_PRIV_PATH: The Path to your generated Private SSH Key. This is needed because CAPH uses this key to provision the node in Hetzner Dedicated. + +```shell +kubectl create secret generic hetzner --from-literal=hcloud="$HCLOUD_TOKEN" --from-literal=robot-user="$HETZNER_ROBOT_USER" --from-literal=robot-password="$HETZNER_ROBOT_PASSWORD" + +kubectl create secret generic robot-ssh --from-literal=sshkey-name=test --from-file=ssh-privatekey="$HETZNER_SSH_PRIV_PATH" --from-file=ssh-publickey="$HETZNER_SSH_PUB_PATH" +``` + +> NOTE: sshkey-name must match the name of the key that is present in Hetzner; otherwise the controller will not know how to reach the machine. + +Patch the created secrets so that they get automatically moved to the target cluster later. The following commands do that: + +```shell +kubectl patch secret hetzner -p '{"metadata":{"labels":{"clusterctl.cluster.x-k8s.io/move":""}}}' +kubectl patch secret robot-ssh -p '{"metadata":{"labels":{"clusterctl.cluster.x-k8s.io/move":""}}}' +``` + +The secret name and the tokens can also be customized in the cluster template. + + +### Creating Host Object In Management Cluster + + +To use bare metal servers as nodes, you need to create a `HetznerBareMetalHost` object for each bare metal server that you bought and specify its server ID in the specs. Below is a sample manifest for a `HetznerBareMetalHost` object. 
+ +```yaml +apiVersion: infrastructure.cluster.x-k8s.io/v1beta1 +kind: HetznerBareMetalHost +metadata: + name: "caph-baremetal-server" + namespace: default +spec: + description: CAPH BareMetal Server + serverID: # please check robot console + rootDeviceHints: + wwn: + maintenanceMode: false +``` + +If you already know the WWN of the storage device you want to choose for booting, specify it in the `rootDeviceHints` of the object. If not, you can proceed without it. During the provisioning process, the controller will fetch information about all available storage devices and store it in the status of the object. + +For example, let's consider a `HetznerBareMetalHost` object that does not specify a WWN. + +```yaml +apiVersion: infrastructure.cluster.x-k8s.io/v1beta1 +kind: HetznerBareMetalHost +metadata: + name: "caph-baremetal-server" + namespace: default +spec: + description: CAPH BareMetal Server + serverID: # please check robot console + maintenanceMode: false +``` +In the manifest above, we have not specified a WWN, and we have applied the object to the cluster. + +After a while, you will see that provisioning of the `HetznerBareMetalHost` object that you just applied fails. The error looks like the following: +```bash +$ kubectl get hetznerbaremetalhost -A +default my-cluster-md-1-tgvl5 my-cluster default/test-bm-gpu my-cluster-md-1-t9znj-694hs Provisioning 23m ValidationFailed no root device hints specified +``` + +After you see the error, get the YAML output of the `HetznerBareMetalHost` object. You will find the list of storage devices and their `wwn` in the status of the `HetznerBareMetalHost` resource. 
+ +```yaml +storage: +- hctl: "2:0:0:0" + model: Micron_1100_MTFDDAK512TBN + name: sda + serialNumber: 18081BB48B25 + sizeBytes: 512110190592 + sizeGB: 512 + vendor: 'ATA ' + wwn: "0x500a07511bb48b25" +- hctl: "1:0:0:0" + model: Micron_1100_MTFDDAK512TBN + name: sdb + serialNumber: 18081BB48992 + sizeBytes: 512110190592 + sizeGB: 512 + vendor: 'ATA ' + wwn: "0x500a07511bb48992" +``` + +In the output above, we can see that this bare metal server has two disks with their respective `wwn`. We can also verify this by making an SSH connection to the rescue system and executing the following command: +```bash +# lsblk --nodeps --output name,type,wwn +NAME TYPE WWN +sda disk 0x500a07511bb48992 +sdb disk 0x500a07511bb48b25 +``` + +Since we have now confirmed the WWNs of the two disks, we can use either of them. We will use `kubectl edit` and update the following information in the `HetznerBareMetalHost` object. + +NOTE: Defining `rootDeviceHints` on your bare metal server is important; otherwise the bare metal server will not be able to join the cluster. + +```yaml +rootDeviceHints: + wwn: "0x500a07511bb48992" +``` + +NOTE: If you have more than one disk, it is recommended to use the smaller disk for the OS installation so that the data on the other disks is retained between provisionings of the machine. + +After we apply this change in the cluster, the provisioning of the machine will succeed. + +To summarize, if you don't know the WWN of your server then there are two ways to find it out: +1. Create the `HetznerBareMetalHost` without a WWN and wait for the controller to fetch all information about the available storage devices. Afterwards, look at the status of the `HetznerBareMetalHost` by running `kubectl get hetznerbaremetalhost -o yaml` in your management cluster. There you will find `hardwareDetails` of all of your bare metal hosts, in which you can see a list of all the relevant storage devices as well as their properties. 
You can copy+paste the WWN of your desired storage device into the `rootDeviceHints` of your `HetznerBareMetalHost` objects. +2. SSH into the rescue system of the server and use `lsblk --nodeps --output name,type,wwn` + +NOTE: There might be cases where you've more than one disk. +```bash +lsblk -d -o name,type,wwn,size +NAME TYPE WWN SIZE +sda disk 238.5G +sdb disk 238.5G +sdc disk 1.8T +sdd disk 1.8T +``` + +In the above case, you can use any of the four disks available to you on a baremetal server. + + +### Creating Workload Cluster + +NOTE: Secrets as of now are hardcoded given we are using a flavor which is essentially a template. If you want to use your own naming convention for secrets then you'll have to update the templates. Please make sure that you pay attention to the sshkey name. + +Since we have already created secret in hetzner robot, hcloud and ssh-keys as secret in management cluster, we can now apply the cluster. +```bash +kubectl apply -f my-cluster.yaml +``` + +```bash +$ kubectl apply -f my-cluster.yaml +kubeadmconfigtemplate.bootstrap.cluster.x-k8s.io/my-cluster-md-0 created +kubeadmconfigtemplate.bootstrap.cluster.x-k8s.io/my-cluster-md-1 created +cluster.cluster.x-k8s.io/my-cluster created +machinedeployment.cluster.x-k8s.io/my-cluster-md-0 created +machinedeployment.cluster.x-k8s.io/my-cluster-md-1 created +machinehealthcheck.cluster.x-k8s.io/my-cluster-control-plane-unhealthy-5m created +machinehealthcheck.cluster.x-k8s.io/my-cluster-md-0-unhealthy-5m created +machinehealthcheck.cluster.x-k8s.io/my-cluster-md-1-unhealthy-5m created +kubeadmcontrolplane.controlplane.cluster.x-k8s.io/my-cluster-control-plane created +hcloudmachinetemplate.infrastructure.cluster.x-k8s.io/my-cluster-control-plane created +hcloudmachinetemplate.infrastructure.cluster.x-k8s.io/my-cluster-md-0 created +hcloudremediationtemplate.infrastructure.cluster.x-k8s.io/control-plane-remediation-request created 
+hcloudremediationtemplate.infrastructure.cluster.x-k8s.io/worker-remediation-request created +hetznerbaremetalmachinetemplate.infrastructure.cluster.x-k8s.io/my-cluster-md-1 created +hetznercluster.infrastructure.cluster.x-k8s.io/my-cluster created +``` + +### Getting the kubeconfig of workload cluster +After a while, our first controlplane should be up and running. You can verify it using the output of `kubectl get kcp` followed by `kubectl get machines` + +Once it's up and running, you can get the kubeconfig of the workload cluster using the following command: + +```bash +clusterctl get kubeconfig my-cluster > workload-kubeconfig +chmod go-r workload-kubeconfig # required to avoid helm warning +``` + +### Deploy Cluster Addons + +NOTE: This is important for the functioning of the cluster otherwise the cluster won't work. + +#### Deploying the Hetzner Cloud Controller Manager + +> This requires a secret containing access credentials to both Hetzner Robot and HCloud. + +If you have configured your secret correctly in the previous step then you already have the secret in your cluster. +Let's deploy the hetzner CCM helm chart. + +```shell +helm repo add syself https://charts.syself.com +helm repo update syself + +$ helm upgrade --install ccm syself/ccm-hetzner --version 1.1.10 \ + --namespace kube-system \ + --set privateNetwork.enabled=false \ + --kubeconfig workload-kubeconfig +Release "ccm" does not exist. Installing it now. +NAME: ccm +LAST DEPLOYED: Thu Apr 4 21:09:25 2024 +NAMESPACE: kube-system +STATUS: deployed +REVISION: 1 +TEST SUITE: None +``` + +#### Installing CNI +For CNI, let's deploy cilium in the workload cluster that will facilitate the networking in the cluster. +```bash +$ helm install cilium cilium/cilium --version 1.15.3 --kubeconfig workload-kubeconfig +NAME: cilium +LAST DEPLOYED: Thu Apr 4 21:11:13 2024 +NAMESPACE: default +STATUS: deployed +REVISION: 1 +TEST SUITE: None +NOTES: +You have successfully installed Cilium with Hubble. 
+ +Your release version is 1.15.3. + +For any further help, visit https://docs.cilium.io/en/v1.15/gettinghelp +``` + +### verifying the cluster +Now, the cluster should be up and you can verify it by running the following commands: +```bash +$ kubectl get clusters -A +NAMESPACE NAME CLUSTERCLASS PHASE AGE VERSION +default my-cluster Provisioned 10h +$ kubectl get machines -A +NAMESPACE NAME CLUSTER NODENAME PROVIDERID PHASE AGE VERSION +default my-cluster-control-plane-6m6zf my-cluster my-cluster-control-plane-84hsn hcloud://45443706 Running 10h v1.27.7 +default my-cluster-control-plane-m6frm my-cluster my-cluster-control-plane-hvl5d hcloud://45443651 Running 10h v1.27.7 +default my-cluster-control-plane-qwsq6 my-cluster my-cluster-control-plane-ss9kc hcloud://45443746 Running 10h v1.27.7 +default my-cluster-md-0-2xgj5-c5bhc my-cluster my-cluster-md-0-6xttr hcloud://45443694 Running 10h v1.27.7 +default my-cluster-md-0-2xgj5-rbnbw my-cluster my-cluster-md-0-fdq9l hcloud://45443693 Running 10h v1.27.7 +default my-cluster-md-0-2xgj5-tl2jr my-cluster my-cluster-md-0-59cgw hcloud://45443692 Running 10h v1.27.7 +default my-cluster-md-1-cp2fd-7nld7 my-cluster bm-my-cluster-md-1-d7526 hcloud://bm-2317525 Running 9h v1.27.7 +default my-cluster-md-1-cp2fd-n74sm my-cluster bm-my-cluster-md-1-l5dnr hcloud://bm-2105469 Running 10h v1.27.7 +``` +Please note that hcloud servers are prefixed with `hcloud://` and baremetal servers are prefixed with `hcloud://bm-`. diff --git a/docs/topics/managing-ssh-keys.md b/docs/topics/managing-ssh-keys.md index 315de8014..1458ad74f 100644 --- a/docs/topics/managing-ssh-keys.md +++ b/docs/topics/managing-ssh-keys.md @@ -1,13 +1,39 @@ ## Managing SSH keys +This section provides details about SSH keys and its importance with regards to CAPH. + +### What are SSH keys? + +SSH keys are a crucial component of secured network communication. 
They provide a secure and convenient method for authenticating to and communicating with remote servers over unsecured networks. They are used as an access credential in the SSH (Secure Shell) protocol, which is used for logging in remotely from one system to another. SSH keys come in pairs with a public and a private key and its strong encryption is used for executing remote commands and remotely managing vital system components. + +### SSH keys in CAPH + +In CAPH, SSH keys help in establishing secure communication remotely with Kubernetes nodes running on Hetzner cloud. They help you get complete access to the underlying Kubernetes nodes that are machines provisioned in Hetzner cloud and retrieve required information related to the system. With the help of these keys, you can SSH into the nodes in case of troubleshooting. + ### In Hetzner Cloud -In pure HCloud clusters, without bare metal servers, there is no need for SSH keys. All keys that exist in HCloud API and are specified in ```HetznerCluster``` properties are included when provisioning machines. Therefore, they can be used to access those machines via SSH. Note that you have to upload those keys via Hetzner UI or API beforehand. +NOTE: You are responsible for uploading your public ssh key to hetzner cloud. This can be done using `hcloud` CLI or hetznercloud console. +All keys that exist in Hetzner Cloud and are specified in `HetznerCluster` spec are included when provisioning machines. Therefore, they can be used to access those machines via SSH. + +```bash +hcloud ssh-key create --name caph --public-key-from-file ~/.ssh/hetzner-cluster.pub +``` +Once this is done, you'll have to reference it while creating your cluster. + +For example, if you've specified four keys in your hetzner cloud project and you reference all of them while creating your cluster in `HetznerCluster.spec.sshKeys.hcloud` then you can access the machines with all the four keys. 
+```yaml + sshKeys: + hcloud: + - name: testing + - name: test + - name: hello + - name: another +``` -The SSH keys can be either specified cluster-wide in the specs of the ```HetznerCluster``` object or scoped to one machine in the specs of ```HCloudMachine```. +The SSH keys can be either specified cluster-wide in the `HetznerCluster.spec.sshKeys` or scoped to one machine via `HCloudMachine.spec.sshKeys`. The HCloudMachine sshkey overrides the cluster-wide sshkey. If one SSH key is changed in the specs of the cluster, then keep in mind that the SSH key is still valid to access all servers that have been created with it. If it is a potential security vulnerability, then all of these servers should be removed and re-created with the new SSH keys. ### In Hetzner Robot -For bare metal servers, two SSH keys are required. One of them is used for the rescue system, and the other for the actual system. The two can, under the hood, of course, be the same. These SSH keys do not have to be uploaded into Robot API but have to be stored in two secrets (again, the same secret is also possible if the same reference is given twice). Not only the name of the SSH key but also the public and private key. The private key is necessary for provisioning the server with SSH. The SSH key for the actual system is specified in ```HetznerBareMetalMachineTemplate``` - there are no cluster-wide alternatives. The SSH key for the rescue system is defined in a cluster-wide manner in the specs of ```HetznerCluster```. +For bare metal servers, two SSH keys are required. One of them is used for the rescue system, and the other for the actual system. The two can, under the hood, of course, be the same. These SSH keys do not have to be uploaded into Robot API but have to be stored in two secrets (again, the same secret is also possible if the same reference is given twice). Not only the name of the SSH key but also the public and private key. 
The private key is necessary for provisioning the server with SSH. The SSH key for the actual system is specified in `HetznerBareMetalMachineTemplate` - there are no cluster-wide alternatives. The SSH key for the rescue system is defined in a cluster-wide manner in the specs of `HetznerCluster`. -The secret reference to an SSH key cannot be changed - the secret data, i.e., the SSH key, can. The host that is consumed by the ```HetznerBareMetalMachine``` object reacts in different ways to the change of the secret data of the secret referenced in its specs, depending on its provisioning state. If the host is already provisioned, it will emit an event warning that provisioned hosts can't change SSH keys. The corresponding machine object should instead be deleted and recreated. When the host is provisioning, it restarts this process again if a change of the SSH key makes it necessary. This depends on whether it is the SSH key for the rescue or the actual system and the exact provisioning state. +The secret reference to an SSH key cannot be changed - the secret data, i.e., the SSH key, can. The host that is consumed by the `HetznerBareMetalMachine` object reacts in different ways to the change of the secret data of the secret referenced in its specs, depending on its provisioning state. If the host is already provisioned, it will emit an event warning that provisioned hosts can't change SSH keys. The corresponding machine object should instead be deleted and recreated. When the host is provisioning, it restarts this process again if a change of the SSH key makes it necessary. This depends on whether it is the SSH key for the rescue or the actual system and the exact provisioning state. diff --git a/docs/topics/node-image.md b/docs/topics/node-image.md index aa3e9f805..f8a9ebe36 100644 --- a/docs/topics/node-image.md +++ b/docs/topics/node-image.md @@ -1,8 +1,24 @@ # Node Images +## What are node-images? 
+ +Node-images are pre-configured operating system (OS) images for setting up nodes within a Kubernetes cluster. A Kubernetes cluster consists of multiple nodes that are physical or virtual machines. To run the necessary Kubernetes components for a fully functional cluster, each node runs an operating system that hosts these components. These OS images should be compatible with Kubernetes and ensure that the node has the required environment to join and perform in the cluster. The images often come with the necessary components pre-installed or easily installable, which facilitates a smooth setup process for Kubernetes nodes. + +## Node-images in CAPH + +A node-image is necessary for using CAPH in production. As a user, you may have specific requirements for what needs to be present in the Linux instance. The popular Linux distributions might not contain all of those specifics. In such cases, the user needs to build a node-image. These images can be stored in Hetzner Cloud as snapshots, and the user can then use these node-images for cluster creation. + +## Creating a Node Image + +For using cluster-API with the bootstrap provider kubeadm, we need a server with all the necessary components for running Kubernetes. +There are several ways to achieve this. In the quick-start guide, we use `pre-kubeadm` commands in the `KubeadmControlPlane` and `KubeadmConfigTemplate` objects. These are propagated from the bootstrap-provider-kubeadm and the control-plane-provider-kubeadm to the node as cloud-init commands. This approach also works universally with other infrastructure providers. + +For Hcloud, there is an alternative way of doing this using Packer. It creates a snapshot to boot from. This makes it easier to version the images, and creating new nodes using this image is faster. The same is possible for Hetzner BareMetal, as we could use installimage and a prepared tarball, which then gets installed as the OS for your nodes. 
+ To use CAPH in production, it needs a node image. In Hetzner Cloud, it is not possible to upload your own images directly. However, a server can be created, configured, and then snapshotted. For this, Packer could be used, which already has support for Hetzner Cloud. In this repository, there is also an example `Packer node-image`. To use it, do the following: + ```shell export HCLOUD_TOKEN= diff --git a/docs/topics/preparation.md b/docs/topics/preparation.md index 81aec38cf..f25fa91cd 100644 --- a/docs/topics/preparation.md +++ b/docs/topics/preparation.md @@ -140,11 +140,4 @@ kubectl patch secret robot-ssh -p '{"metadata":{"labels":{"clusterctl.cluster.x- The secret name and the tokens can also be customized in the cluster template. - -### Creating a viable Node Image - -For using cluster-API with the bootstrap provider kubeadm, we need a server with all the necessary binaries and settings for running Kubernetes. -There are several ways to achieve this. In the quick-start guide, we use `pre-kubeadm` commands in the KubeadmControlPlane and KubeadmConfigTemplate objects. These are propagated from the bootstrap provider kubeadm and the control plane provider kubeadm to the node as cloud-init commands. This way is usable universally also in other infrastructure providers. -For Hcloud, there is an alternative way of doing this using Packer. It creates a snapshot to boot from. This makes it easier to version the images, and creating new nodes using this image is faster. The same is possible for Hetzner Bare Metal, as we could use installimage and a prepared tarball, which then gets installed. - See [node-image](./node-image.md) for more information. 
diff --git a/docs/topics/quickstart.md b/docs/topics/quickstart.md index 79a39c372..938795d6a 100644 --- a/docs/topics/quickstart.md +++ b/docs/topics/quickstart.md @@ -1,10 +1,26 @@ # Quickstart Guide -This guide goes through all the necessary steps to create a cluster on Hetzner infrastructure (on HCloud & Hetzner Dedicated). +This guide goes through all the necessary steps to create a cluster on Hetzner infrastructure (on HCloud). -## Preparing Hetzner +>Note: The cluster templates used in the repository and in this guide for creating clusters are for development purposes only. These templates are not advised to be used in the production environment. However, the software is production-ready and users use it in their production environment. Make your clusters production-ready with the help of Syself Autopilot. For more information, contact . -You have two options: either create a pure HCloud cluster or a hybrid cluster with Hetzner dedicated (bare metal) servers. For a full list of flavors, please check out the [release page](https://github.com/syself/cluster-api-provider-hetzner/releases). +## Prerequisites + +There are certain prerequisites that you need to comply with before getting started with this guide. + +### Installing Helm + +Helm is a package manager that facilitates the installation and management of applications in a Kubernetes cluster. Refer to the [official docs](https://helm.sh/docs/intro/install/) for installation. + +### Understanding Cluster API and clusterctl + +Cluster API Provider Hetzner uses Cluster API to create a cluster in provider Hetzner. So, it is essential to understand Cluster API before getting started with the cluster creation on Hetzner infrastructure. It is a subproject of Kubernetes focused on providing declarative APIs and tooling to simplify provisioning, upgrading, and operating multiple Kubernetes clusters. Know more about Cluster API from its [official documentation](https://cluster-api.sigs.k8s.io/introduction). 
+ +`clusterctl` is the command-line tool used for managing the lifecycle of a Cluster API management cluster. Learn more about `clusterctl` and its commands from the official documentation of Cluster API [here](https://cluster-api.sigs.k8s.io/clusterctl/overview). + +## Preparation + +You have two options: either create a pure HCloud cluster or a hybrid cluster with Hetzner dedicated (bare metal) servers. For a full list of flavors, please check out the [release page](https://github.com/syself/cluster-api-provider-hetzner/releases). In the quickstart guide, we will go with the cluster creation on a pure Hetzner Cloud server. To create a workload cluster, we need to do some preparation: @@ -13,43 +29,134 @@ To create a workload cluster, we need to do some preparation: - Export variables needed for cluster-template. - Create a secret with the credentials. -For more information about this step, please see [here](./preparation.md) +### Preparation of the Hetzner Project and Credentials -## Generate your cluster.yaml -> Please note that ready-to-use Kubernetes configurations, production-ready node images, kubeadm configuration, cluster add-ons like CNI, and similar services need to be separately prepared or acquired to ensure a comprehensive and secure Kubernetes deployment. This is where **Syself Autopilot** comes into play, taking on these challenges to offer you a seamless, worry-free Kubernetes experience. Feel free to contact us via e-mail: info@syself.com. +There are several tasks that have to be completed before a workload cluster can be created. -The clusterctl generate cluster command returns a YAML template for creating a workload cluster. -It generates a YAML file named `my-cluster.yaml` with a predefined list of Cluster API objects (`Cluster`, `Machines`, `MachineDeployments`, etc.) to be deployed in the current namespace. +#### Preparing Hetzner Cloud + +1. Create a new [HCloud project](https://console.hetzner.cloud/projects). +1. 
Generate an API token with read and write access. You'll find this if you click on the project and go to "security". +1. If you want to use it, generate an SSH key, upload the public key to HCloud (also via "security"), and give it a name. Read more about [Managing SSH Keys](managing-ssh-keys.md). + +### Bootstrap or Management Cluster Installation + +#### Common Prerequisites + +- Install and setup kubectl in your local environment +- Install Kind and Docker + +#### Install and configure a Kubernetes cluster + +Cluster API requires an existing Kubernetes cluster accessible via kubectl. During the installation process, the Kubernetes cluster will be transformed into a management cluster by installing the Cluster API provider components, so it is recommended to keep it separated from any application workload. + +It is a common practice to create a temporary, local bootstrap cluster, which is then used to provision a target management cluster on the selected infrastructure provider. + +### Choose one of the options below: + +#### 1. Existing Management Cluster. + +For production use, a “real” Kubernetes cluster should be used with appropriate backup and DR policies and procedures in place. The Kubernetes cluster must be at least a [supported version](../../README.md#fire-compatibility-with-cluster-api-and-kubernetes-versions). + +#### 2. Kind. + +[kind](https://kind.sigs.k8s.io/) can be used for creating a local Kubernetes cluster for development environments or for the creation of a temporary bootstrap cluster used to provision a target management cluster on the selected infrastructure provider. + +--- +### Install Clusterctl and initialize Management Cluster + +#### Install Clusterctl +To install Clusterctl, refer to the instructions available in the official ClusterAPI documentation [here](https://cluster-api.sigs.k8s.io/user/quick-start.html#install-clusterctl). +Alternatively, use the `make install-clusterctl` command to do the same.
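The prerequisites named in this guide (kubectl, Kind, Docker, clusterctl, Helm) can be checked up front before continuing. The following is a minimal sketch only: `check_tools` is a hypothetical helper that is not part of this repository, and it merely tests whether each name resolves on `PATH`, not whether the installed version is supported.

```shell
# Hypothetical helper (an assumption for illustration, not shipped with CAPH):
# report which of the given command-line tools are missing from PATH.
check_tools() {
  missing=0
  for tool in "$@"; do
    # `command -v` is the POSIX way to test whether a command is available.
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  # Returns 0 when every tool was found, 1 otherwise.
  return "$missing"
}

# Example call for the tools this guide relies on:
#   check_tools kubectl kind docker clusterctl helm
```

The function only defines a check and changes nothing on the system, so it is safe to paste into an interactive shell before starting the installation steps.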
+ +#### Initialize the management cluster +Now that we’ve got clusterctl installed and all the prerequisites are in place, we can transform the Kubernetes cluster into a management cluster by using the `clusterctl init` command. More information about clusterctl can be found [here](https://cluster-api.sigs.k8s.io/clusterctl/commands/commands.html). + +For the latest version: ```shell -clusterctl generate cluster my-cluster --kubernetes-version v1.28.4 --control-plane-machine-count=3 --worker-machine-count=3 > my-cluster.yaml +clusterctl init --core cluster-api --bootstrap kubeadm --control-plane kubeadm --infrastructure hetzner ``` ->Note: With the `--target-namespace` flag, you can specify a different target namespace. -Run the `clusterctl generate cluster --help` command for more information. -You can also use different flavors, e.g., to create a cluster with the private network: +>Note: For a specific version, use the `--infrastructure hetzner:vX.X.X` flag with the above command. + +--- +### Variable Preparation to generate a cluster-template ```shell -clusterctl generate cluster my-cluster --kubernetes-version v1.28.4 --control-plane-machine-count=3 --worker-machine-count=3 --flavor hcloud-network > my-cluster.yaml +export HCLOUD_SSH_KEY="" \ +export CLUSTER_NAME="my-cluster" \ +export HCLOUD_REGION="fsn1" \ +export CONTROL_PLANE_MACHINE_COUNT=3 \ +export WORKER_MACHINE_COUNT=3 \ +export KUBERNETES_VERSION=1.28.4 \ +export HCLOUD_CONTROL_PLANE_MACHINE_TYPE=cpx31 \ +export HCLOUD_WORKER_MACHINE_TYPE=cpx31 ``` -All pre-configured flavors can be found on the [release page](https://github.com/syself/cluster-api-provider-hetzner/releases). The cluster-templates start with `cluster-template-`. The flavor name is the suffix. +* **HCLOUD_SSH_KEY**: The SSH Key name you loaded in HCloud. +* **HCLOUD_REGION**: The region of the Hcloud cluster. Find the full list of regions [here](https://docs.hetzner.com/cloud/general/locations/).
+* **HCLOUD_IMAGE_NAME**: The Image name of the operating system. +* **HCLOUD_X_MACHINE_TYPE**: The type of the Hetzner cloud server. Find more information [here](https://www.hetzner.com/cloud#pricing). + +For a list of all variables needed for generating a cluster manifest (from the cluster-template.yaml), use the following command: -## Hetzner Dedicated / Bare Metal Server +````shell +clusterctl generate cluster my-cluster --list-variables +```` +Running the above command will give you an output in the following manner: -If you want to create a cluster with bare metal servers, you will also need to set up the robot credentials in the preparation step. As described in the [reference](/docs/reference/hetzner-bare-metal-machine-template.md), you need to buy bare metal servers beforehand manually. To use bare metal servers for your deployment, you should choose one of the following flavors: +```shell +Required Variables: + - HCLOUD_CONTROL_PLANE_MACHINE_TYPE + - HCLOUD_REGION + - HCLOUD_SSH_KEY + - HCLOUD_WORKER_MACHINE_TYPE + +Optional Variables: + - CLUSTER_NAME (defaults to my-cluster) + - CONTROL_PLANE_MACHINE_COUNT (defaults to 1) + - WORKER_MACHINE_COUNT (defaults to 0) +``` -| Flavor | What it does | -| -------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- | -| hetzner-baremetal-control-planes-remediation | Uses bare metal servers for the control plane nodes - with custom remediation (try to reboot machines first) | -| hetzner-baremetal-control-planes | Uses bare metal servers for the control plane nodes - with normal remediation (unprovision/recreate machines) | -| hetzner-hcloud-control-planes | Uses the hcloud servers for the control plane nodes and the bare metal servers for the worker nodes | +### Create a secret for hcloud only -Next, you need to create a `HetznerBareMetalHost` object for each bare metal server that you 
bought and specify its server ID in the specs. Refer to an example [here](/docs/reference/hetzner-bare-metal-host.md). Add the created objects to your `my-cluster.yaml` file. If you already know the WWN of the storage device you want to choose for booting, specify it in the `rootDeviceHints` of the object. If not, you can apply the workload cluster, start the provisioning without specifying the WWN, and then wait for the bare metal hosts to show an error. +In order for the provider integration hetzner to communicate with the Hetzner API ([HCloud API](https://docs.hetzner.cloud/)), we need to create a secret with the access data. The secret must be in the same namespace as the other CRs. -After that, look at the status of `HetznerBareMetalHost` by running `kubectl describe hetznerbaremetalhost` in your management cluster. There you will find `hardwareDetails` of all of your bare metal hosts, in which you can see a list of all the relevant storage devices as well as their properties. You can copy+paste the WWN:s of your desired storage device into the `rootDeviceHints` of your `HetznerBareMetalHost` objects. +`export HCLOUD_TOKEN="" ` -## Apply the workload cluster +- HCLOUD_TOKEN: The project where your cluster will be placed. You have to get a token from your HCloud Project. + +Use the below command to create the required secret with the access data: + +```shell +kubectl create secret generic hetzner --from-literal=hcloud=$HCLOUD_TOKEN +``` + +Patch the created secret so that it can be automatically moved to the target cluster later. The following command helps you do that: + +```shell +kubectl patch secret hetzner -p '{"metadata":{"labels":{"clusterctl.cluster.x-k8s.io/move":""}}}' +``` + +The secret name and the tokens can also be customized in the cluster template. + +## Generating the cluster.yaml + +The `clusterctl generate cluster` command returns a YAML template for creating a workload cluster. 
+It generates a YAML file named `my-cluster.yaml` with a predefined list of Cluster API objects (`Cluster`, `Machines`, `MachineDeployments`, etc.) to be deployed in the current namespace. + +```shell +clusterctl generate cluster my-cluster --kubernetes-version v1.28.4 --control-plane-machine-count=3 --worker-machine-count=3 > my-cluster.yaml +``` +>Note: With the `--target-namespace` flag, you can specify a different target namespace. +Run the `clusterctl generate cluster --help` command for more information. + +>**Note**: Please note that ready-to-use Kubernetes configurations, production-ready node images, kubeadm configuration, cluster add-ons like CNI, and similar services need to be separately prepared or acquired to ensure a comprehensive and secure Kubernetes deployment. This is where **Syself Autopilot** comes into play, taking on these challenges to offer you a seamless, worry-free Kubernetes experience. Feel free to contact us via e-mail: . + +## Applying the workload cluster + +The following command applies the configuration of the workload cluster: ```shell kubectl apply -f my-cluster.yaml @@ -69,22 +176,24 @@ You can also view the cluster and its resources at a glance by running: clusterctl describe cluster my-cluster ``` -To verify the first control plane is up, use this command: +To verify the first control plane is up, use the following command: ```shell kubectl get kubeadmcontrolplane ``` -> The control plane won’t be `ready` until we install a CNI in the next step. +>Note: The control plane won’t be `ready` until we install a CNI in the next step.
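Because the first control plane node takes some time to come up, commands that depend on it may fail on the first attempts. One way to handle this, sketched here under assumptions, is a small retry wrapper. `retry` is a hypothetical helper that is not part of this repository or of `clusterctl`, and the attempt count and delay in the example are arbitrary values, not recommendations from the project.

```shell
# Hypothetical helper (an assumption for illustration): run a command up to
# <attempts> times, sleeping <delay> seconds between failed attempts.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while :; do
    # Stop as soon as the command succeeds.
    if "$@"; then
      return 0
    fi
    # Give up once the attempt budget is used.
    if [ "$i" -ge "$attempts" ]; then
      echo "giving up after $attempts attempts: $*" >&2
      return 1
    fi
    i=$((i + 1))
    sleep "$delay"
  done
}

# Example (values are arbitrary): poll until the kubeconfig can be fetched.
#   retry 30 10 clusterctl get kubeconfig my-cluster >/dev/null
```

The wrapper only repeats the command it is given. It does not inspect cluster state, so a command that succeeds while the control plane is still not `ready` ends the loop immediately.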
-After the first control plane node is up and running, we can retrieve the kubeconfig of the workload cluster: +After the first control plane node is up and running, we can retrieve the kubeconfig of the workload cluster with: ```shell export CAPH_WORKER_CLUSTER_KUBECONFIG=/tmp/workload-kubeconfig clusterctl get kubeconfig my-cluster > $CAPH_WORKER_CLUSTER_KUBECONFIG ``` -## Deploy a CNI solution +## Deploying the CNI solution + +Cilium is used as a CNI solution in this guide. The following command deploys it to your cluster: ```shell helm repo add cilium https://helm.cilium.io/ @@ -96,18 +205,20 @@ KUBECONFIG=$CAPH_WORKER_CLUSTER_KUBECONFIG helm upgrade --install cilium cilium/ You can, of course, also install an alternative CNI, e.g., calico. -> There is a bug in Ubuntu that requires the older version of Cilium for this quickstart guide. +>Note: There is a bug in Ubuntu that requires the older version of Cilium for this quickstart guide. ## Deploy the CCM ### Deploy HCloud Cloud Controller Manager - _hcloud only_ -This `make` command will install the CCM in your workload cluster. 
+The following `make` command will install the CCM in your workload cluster: `make install-ccm-in-wl-cluster PRIVATE_NETWORK=false` + +For a cluster without a private network, use the following command: + ```shell -# For a cluster without a private network: helm repo add syself https://charts.syself.com helm repo update syself @@ -118,22 +229,7 @@ KUBECONFIG=$CAPH_WORKER_CLUSTER_KUBECONFIG helm upgrade --install ccm syself/ccm --set privateNetwork.enabled=false ``` -### Deploy Hetzner Cloud Controller Manager - -> This requires a secret containing access credentials to both Hetzner Robot and HCloud - -`make install-manifests-ccm-hetzner PRIVATE_NETWORK=false` - -```shell -helm repo add syself https://charts.syself.com -helm repo update syself - -KUBECONFIG=$CAPH_WORKER_CLUSTER_KUBECONFIG helm upgrade --install ccm syself/ccm-hetzner --version 1.1.10 \ ---namespace kube-system \ ---set privateNetwork.enabled=false -``` - -## Deploy the CSI (optional) +## Deploying the CSI (optional) ```shell cat << EOF > csi-values.yaml @@ -149,15 +245,15 @@ KUBECONFIG=$CAPH_WORKER_CLUSTER_KUBECONFIG helm upgrade --install csi syself/csi ## Clean Up -Delete workload cluster. +Delete the workload cluster and remove all of the components by using: ```shell kubectl delete cluster my-cluster ``` -> **IMPORTANT**: In order to ensure a proper clean-up of your infrastructure, you must always delete the cluster object. Deleting the entire cluster template with kubectl delete -f capi-quickstart.yaml might lead to pending resources that have to be cleaned up manually. +> **IMPORTANT**: In order to ensure a proper clean-up of your infrastructure, you must always delete the cluster object. Deleting the entire cluster template with the `kubectl delete -f capi-quickstart.yaml` command might lead to pending resources that have to be cleaned up manually. 
-Delete management cluster with +Delete management cluster with the following command: ```shell kind delete cluster @@ -165,7 +261,9 @@ kind delete cluster ## Next Steps -### Switch to the workload cluster +### Switching to the workload cluster + +As a next step, you need to switch to the workload cluster and the below command will do it: ```shell export KUBECONFIG=/tmp/workload-kubeconfig @@ -173,22 +271,21 @@ export KUBECONFIG=/tmp/workload-kubeconfig ### Moving components -To move the Cluster API objects from your bootstrap cluster to the new management cluster, you need first to install the Cluster API controllers. To install the components with the latest version, please run: +To move the Cluster API objects from your bootstrap cluster to the new management cluster, firstly you need to install the Cluster API controllers. To install the components with the latest version, run the below command: ```shell clusterctl init --core cluster-api --bootstrap kubeadm --control-plane kubeadm --infrastructure hetzner - ``` -If you want a specific version, use the flag `--infrastructure hetzner:vX.X.X`. +>Note: For a specific version, use the flag `--infrastructure hetzner:vX.X.X` with the above command. -Now you can switch back to the management cluster, for example, with +You can switch back to the management cluster with the following command: ```shell export KUBECONFIG=~/.kube/config ``` -You can now move the objects into the new cluster by using: +Move the objects into the new cluster by using: ```shell clusterctl move --to-kubeconfig $CAPH_WORKER_CLUSTER_KUBECONFIG