feat: add ipxe-script-url support for Equinix Metal
Signed-off-by: Marques Johansson <[email protected]>
displague committed Feb 26, 2024
1 parent 0c62f54 commit 59f5a8f
Showing 3 changed files with 40 additions and 20 deletions.
18 changes: 13 additions & 5 deletions cluster-autoscaler/cloudprovider/equinixmetal/README.md
@@ -26,13 +26,15 @@ In the above file you can modify the following fields:
| cluster-autoscaler-equinixmetal | authtoken | Your Equinix Metal API token. It must be base64 encoded. |
| cluster-autoscaler-cloud-config | Global/project-id | Your Equinix Metal project id |
| cluster-autoscaler-cloud-config | Global/api-server | The ip:port for your cluster's k8s API (e.g. K8S_MASTER_PUBLIC_IP:6443) |
| cluster-autoscaler-cloud-config | Global/facility | The Equinix Metal facility for the devices in your nodepool (eg: sv15) |
| cluster-autoscaler-cloud-config | Global/metro | The Equinix Metal metro for the devices in your nodepool (eg: sv) |
| cluster-autoscaler-cloud-config | Global/plan | The Equinix Metal plan (aka size/flavor) for new nodes in the nodepool (eg: c3.small.x86) |
| cluster-autoscaler-cloud-config | Global/billing | The billing interval for new nodes (default: hourly) |
| cluster-autoscaler-cloud-config | Global/os | The OS image to use for new nodes (default: ubuntu_18_04). If you change this also update cloudinit. |
| cluster-autoscaler-cloud-config | Global/cloudinit | The base64 encoded [user data](https://metal.equinix.com/developers/docs/servers/user-data/) submitted when provisioning devices. In the example file, the default value has been tested with Ubuntu 18.04 to install Docker & kubelet and then to bootstrap the node into the cluster using kubeadm. The kubeadm, kubelet, kubectl are pinned to version 1.17.4. For a different base OS or bootstrap method, this needs to be customized accordingly|
| cluster-autoscaler-cloud-config | Global/reservation | The values "require" or "prefer" will request the next available hardware reservation for new devices in selected facility & plan. If no hardware reservations match, "require" will trigger a failure, while "prefer" will launch on-demand devices instead (default: none) |
| cluster-autoscaler-cloud-config | Global/hostname-pattern | The pattern for the names of new Equinix Metal devices (default: "k8s-{{.ClusterName}}-{{.NodeGroup}}-{{.RandString8}}" ) |
+| cluster-autoscaler-cloud-config | Global/ipxe-script-url | The URL of an iPXE script to use when provisioning devices. |
+| cluster-autoscaler-cloud-config | Global/always-pxe | Whether iPXE should be used on every boot; by default it is used only on the first boot. |

You can always update the secret with more nodepool definitions (with different plans etc.) as shown in the example, but you should always provide a default nodepool configuration.
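Pulling the table together, the decoded value of the `cluster-autoscaler-cloud-config` secret might then look like the gcfg-style sketch below. This fragment is illustrative, not taken from the commit: the key names follow the `gcfg` tags in `ConfigNodepool`, the section header is shown as `[global]` for illustration (follow the example file in the repo for the exact section names), and the project id and URL are placeholders.

```ini
[global]
project-id = "YOUR_EQUINIX_METAL_PROJECT_ID"
api-server = "K8S_MASTER_PUBLIC_IP:6443"
metro = "sv"
plan = "c3.small.x86"
os = "ubuntu_18_04"
cloudinit = "BASE64_ENCODED_USER_DATA"
hostname-pattern = "k8s-{{.ClusterName}}-{{.NodeGroup}}-{{.RandString8}}"
; New in this commit: boot devices from a custom iPXE script.
ipxe-script-url = "https://example.com/boot.ipxe"
always-pxe = false
```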

@@ -57,7 +59,8 @@ to match your cluster.
If you want to target one or more specific nodepools, e.g. for a deployment, you can add a `nodeAffinity` with the key `pool` and the name of the nodepool you want to target as the value. This functionality is not backwards compatible: nodes provisioned with older cluster-autoscaler images won't have the `pool` key, but you can overcome this limitation by manually adding the correct labels. Here are some examples:

Target a nodepool with a specific name:
-```
+
+```yaml
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
@@ -68,8 +71,10 @@ affinity:
values:
- pool3
```
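The middle of this example is collapsed in the diff view above. A complete sketch of such an affinity block, assuming the `pool` label key shown in the visible lines, would be:

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: pool
              operator: In
              values:
                - pool3
```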
Target a nodepool with a specific Equinix Metal instance:
-```
+```yaml
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
@@ -84,14 +89,17 @@ affinity:
## CCM and Controller node labels
### CCM
By default, the autoscaler assumes that you have the older, deprecated `packet-ccm` installed in your
cluster. If that is not the case and you have migrated to the new `cloud-provider-equinix-metal` CCM,
you must tell the autoscaler. This can be done by setting an environment variable in the deployment:
-```
+
+```yaml
env:
- name: INSTALLED_CCM
value: cloud-provider-equinix-metal
```

**NOTE**: As a prerequisite, ensure that all worker nodes in your cluster have the prefix `equinixmetal://` in
the Node spec `.spec.providerID`. If any existing worker nodes have the prefix `packet://`, drain and
remove each such node, then restart the kubelet on that worker node to re-register it in the cluster,
@@ -104,7 +112,7 @@ Autoscaler assumes that control plane nodes in your cluster are identified by the
`node-role.kubernetes.io/master` label. If for some reason this assumption does not hold in your case, set the
environment variable in the deployment:

-```
+```yaml
env:
- name: METAL_CONTROLLER_NODE_IDENTIFIER_LABEL
value: <label>
13 changes: 6 additions & 7 deletions cluster-autoscaler/cloudprovider/equinixmetal/cloud_provider.go
@@ -36,7 +36,7 @@ import (

const (
// GPULabel is the label added to nodes with GPU resource.
-GPULabel = "cloud.google.com/gke-accelerator"
+GPULabel = "accelerator"
// DefaultControllerNodeLabelKey is the label added to Master/Controller to identify as
// master/controller node.
DefaultControllerNodeLabelKey = "node-role.kubernetes.io/master"
@@ -48,11 +48,9 @@ const (
ControllerNodeIdentifierMetalEnv = "METAL_CONTROLLER_NODE_IDENTIFIER_LABEL"
)

-var (
-availableGPUTypes = map[string]struct{}{
-"nvidia-tesla-v100": {},
-}
-)
+var availableGPUTypes = map[string]struct{}{
+"nvidia-tesla-v100": {},
+}

// equinixMetalCloudProvider implements CloudProvider interface from cluster-autoscaler/cloudprovider module.
type equinixMetalCloudProvider struct {
@@ -152,7 +150,8 @@ func (pcp *equinixMetalCloudProvider) GetAvailableMachineTypes() ([]string, error)

// NewNodeGroup is not implemented.
func (pcp *equinixMetalCloudProvider) NewNodeGroup(machineType string, labels map[string]string, systemLabels map[string]string,
-taints []apiv1.Taint, extraResources map[string]resource.Quantity) (cloudprovider.NodeGroup, error) {
+taints []apiv1.Taint, extraResources map[string]resource.Quantity,
+) (cloudprovider.NodeGroup, error) {
return nil, cloudprovider.ErrNotImplemented
}

29 changes: 21 additions & 8 deletions cluster-autoscaler/cloudprovider/equinixmetal/manager_rest.go
@@ -158,14 +158,16 @@ type equinixMetalManagerNodePool struct {
cloudinit string
reservation string
hostnamePattern string
+ipxeScriptURL string
+alwaysPXE bool
}

type equinixMetalManagerRest struct {
authToken string
equinixMetalManagerNodePools map[string]*equinixMetalManagerNodePool
}

-// ConfigNodepool options only include the project-id for now
+// ConfigNodepool options for an Equinix Metal Nodepool
type ConfigNodepool struct {
ClusterName string `gcfg:"cluster-name"`
ProjectID string `gcfg:"project-id"`
@@ -177,6 +179,8 @@ type ConfigNodepool struct {
CloudInit string `gcfg:"cloudinit"`
Reservation string `gcfg:"reservation"`
HostnamePattern string `gcfg:"hostname-pattern"`
+IPXEScriptURL string `gcfg:"ipxe-script-url"`
+AlwaysPXE bool `gcfg:"always-pxe"`
}

// ConfigFile is used to read and store information from the cloud configuration file
@@ -218,6 +222,8 @@ type DeviceCreateRequest struct {
Storage string `json:"storage,omitempty"`
Tags []string `json:"tags"`
CustomData string `json:"customdata,omitempty"`
+IPXEScriptURL string `json:"ipxe_script_url,omitempty"`
+AlwaysPXE bool `json:"always_pxe,omitempty"`
IPAddresses []IPAddressCreateRequest `json:"ip_addresses,omitempty"`
HardwareReservationID string `json:"hardware_reservation_id,omitempty"`
}
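With the `json` tags above, a device create request carrying the new fields might serialize to something like the hypothetical payload below. The hostname, plan, metro, OS, and URL values are illustrative, and the fields outside the visible struct excerpt are assumptions based on the Equinix Metal device API, not taken from the commit:

```json
{
  "hostname": "k8s-mycluster-pool1-a1b2c3d4",
  "plan": "c3.small.x86",
  "metro": "sv",
  "operating_system": "ubuntu_18_04",
  "tags": ["k8s-cluster-mycluster", "k8s-nodepool-pool1"],
  "ipxe_script_url": "https://example.com/boot.ipxe",
  "always_pxe": true
}
```

Note that because both new fields use `omitempty`, they are dropped from the payload entirely when unset.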
@@ -330,6 +336,8 @@ func createEquinixMetalManagerRest(configReader io.Reader, discoverOpts cloudpro
cloudinit: cfg.Nodegroupdef[nodepool].CloudInit,
reservation: cfg.Nodegroupdef[nodepool].Reservation,
hostnamePattern: cfg.Nodegroupdef[nodepool].HostnamePattern,
+ipxeScriptURL: cfg.Nodegroupdef[nodepool].IPXEScriptURL,
+alwaysPXE: cfg.Nodegroupdef[nodepool].AlwaysPXE,
}
}

@@ -535,6 +543,8 @@ func (mgr *equinixMetalManagerRest) createDevice(ctx context.Context, hostname,
UserData: userData,
Tags: []string{"k8s-cluster-" + mgr.getNodePoolDefinition(nodegroup).clusterName, "k8s-nodepool-" + nodegroup},
HardwareReservationID: reservation,
+IPXEScriptURL: mgr.getNodePoolDefinition(nodegroup).ipxeScriptURL,
+AlwaysPXE: mgr.getNodePoolDefinition(nodegroup).alwaysPXE,
}

if err := mgr.createDeviceRequest(ctx, cr, nodegroup); err != nil {
@@ -624,7 +634,6 @@ func (mgr *equinixMetalManagerRest) deleteDevice(ctx context.Context, nodegroup,

klog.Infof("Deleted device %s: %v", id, result)
return nil

}

// deleteNodes deletes nodes by passing a comma separated list of names or IPs
@@ -675,13 +684,15 @@ func (mgr *equinixMetalManagerRest) deleteNodes(nodegroup string, nodes []NodeRe
}

// BuildGenericLabels builds basic labels for equinix metal nodes
-func BuildGenericLabels(nodegroup string, instanceType string) map[string]string {
+func BuildGenericLabels(nodegroup string, instanceType, instanceMetro string) map[string]string {
result := make(map[string]string)

-result[apiv1.LabelInstanceType] = instanceType
-//result[apiv1.LabelZoneRegion] = ""
-//result[apiv1.LabelZoneFailureDomain] = "0"
-//result[apiv1.LabelHostname] = ""
+result[apiv1.LabelInstanceTypeStable] = instanceType
+result[apiv1.LabelZoneRegion] = instanceMetro
+
+// result[apiv1.LabelZoneRegion] = ""
+// result[apiv1.LabelZoneFailureDomain] = "0"
result["pool"] = nodegroup

return result
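The updated label-building logic can be sketched as a standalone program. The well-known label keys are inlined as local constants here; their values are assumptions matching `k8s.io/api/core/v1`, whereas the real code references `apiv1.LabelInstanceTypeStable` and `apiv1.LabelZoneRegion` directly:

```go
package main

import "fmt"

// Assumed values of the upstream apiv1 label constants (k8s.io/api/core/v1).
const (
	labelInstanceTypeStable = "node.kubernetes.io/instance-type"
	labelZoneRegion         = "failure-domain.beta.kubernetes.io/region"
)

// buildGenericLabels mirrors the updated BuildGenericLabels: it now takes
// the metro as well and publishes it as the node's region label.
func buildGenericLabels(nodegroup, instanceType, instanceMetro string) map[string]string {
	result := make(map[string]string)
	result[labelInstanceTypeStable] = instanceType
	result[labelZoneRegion] = instanceMetro
	result["pool"] = nodegroup
	return result
}

func main() {
	labels := buildGenericLabels("pool3", "c3.small.x86", "sv")
	fmt.Println(labels[labelInstanceTypeStable]) // c3.small.x86
	fmt.Println(labels[labelZoneRegion])         // sv
	fmt.Println(labels["pool"])                  // pool3
}
```

Since the metro now flows into the region label, workloads can use standard topology selectors rather than the custom `pool` key to target a metro.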
@@ -695,7 +706,9 @@ func (mgr *equinixMetalManagerRest) templateNodeInfo(nodegroup string) (*schedul
node.ObjectMeta = metav1.ObjectMeta{
Name: nodeName,
SelfLink: fmt.Sprintf("/api/v1/nodes/%s", nodeName),
-Labels: map[string]string{},
+Labels: map[string]string{
+// apiv1.LabelHostname: nodeName,
+},
}
node.Status = apiv1.NodeStatus{
Capacity: apiv1.ResourceList{},
Expand All @@ -714,7 +727,7 @@ func (mgr *equinixMetalManagerRest) templateNodeInfo(nodegroup string) (*schedul
node.Status.Conditions = cloudprovider.BuildReadyConditions()

// GenericLabels
-node.Labels = cloudprovider.JoinStringMaps(node.Labels, BuildGenericLabels(nodegroup, mgr.getNodePoolDefinition(nodegroup).plan))
+node.Labels = cloudprovider.JoinStringMaps(node.Labels, BuildGenericLabels(nodegroup, mgr.getNodePoolDefinition(nodegroup).plan, mgr.getNodePoolDefinition(nodegroup).metro))

nodeInfo := schedulerframework.NewNodeInfo(cloudprovider.BuildKubeProxy(nodegroup))
nodeInfo.SetNode(&node)
