ceph: added priority classes to components
Adds priority class support to Ceph components
  to influence scheduler's pod preemption

Signed-off-by: d-luu <[email protected]>
d-luu authored and binoue committed Apr 10, 2020
1 parent c325862 commit 0ee420d
Showing 53 changed files with 377 additions and 63 deletions.
17 changes: 16 additions & 1 deletion Documentation/ceph-cluster-crd.md
@@ -122,6 +122,7 @@ For more details on the mons and when to choose a number other than `3`, see the
* `annotations`: [annotations configuration settings](#annotations-configuration-settings)
* `placement`: [placement configuration settings](#placement-configuration-settings)
* `resources`: [resources configuration settings](#cluster-wide-resources-configuration-settings)
* `priorityClassNames`: [priority class names configuration settings](#priority-class-names-configuration-settings)
* `storage`: Storage selection and configuration that will be used across the cluster. Note that these settings can be overridden for specific nodes.
* `useAllNodes`: `true` or `false`, indicating if all nodes in the cluster should be used for storage according to the cluster level storage selection and configuration values.
If individual nodes are specified under the `nodes` field, then `useAllNodes` must be set to `false`.
@@ -264,7 +265,7 @@ The following storage selection settings are specific to Ceph and do not apply t

Annotations can be specified so that the Rook components will have those annotations added to them.

You can set annotations for Rook components through the a list of key value pairs:
You can set annotations for Rook components through a list of key-value pairs:

* `all`: Set annotations for all components
* `mgr`: Set annotations for MGRs
@@ -332,6 +333,20 @@ For more information on resource requests/limits see the official Kubernetes doc
* `cpu`: Limit for CPU (example: one CPU core `1`, 50% of one CPU core `500m`).
* `memory`: Limit for Memory (example: one gigabyte of memory `1Gi`, half a gigabyte of memory `512Mi`).

### Priority Class Names Configuration Settings
Priority class names can be specified so that the pods of the Rook components are created with those priority classes, which influences the scheduler's pod preemption.

You can set priority class names for Rook components through a list of key-value pairs:

- `all`: Set priority class names for MGRs, Mons, OSDs, and RBD Mirrors.
- `mgr`: Set priority class names for MGRs.
- `mon`: Set priority class names for Mons.
- `osd`: Set priority class names for OSDs.
- `rbdmirror`: Set priority class names for RBD Mirrors.

The specific component keys will act as overrides to `all`.
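
The override rule can be sketched in Go. This is a minimal illustration rather than Rook's own code: `priorityClassNames` and `resolve` are hypothetical stand-ins for the map type and the per-component getters added in this commit, and the class names are taken from the commented example in `cluster.yaml`.

```go
package main

import "fmt"

// priorityClassNames is a hypothetical stand-in for the CRD's
// priorityClassNames map (component key -> PriorityClass name).
type priorityClassNames map[string]string

// resolve returns the class for a component, falling back to the
// "all" entry when the component key is absent from the map.
func resolve(p priorityClassNames, component string) string {
	if name, ok := p[component]; ok {
		return name
	}
	return p["all"] // empty string when "all" is not set either
}

func main() {
	p := priorityClassNames{
		"all": "rook-ceph-default-priority-class",
		"mon": "rook-ceph-mon-priority-class",
	}
	fmt.Println(resolve(p, "mon")) // mon has its own entry, so it overrides "all"
	fmt.Println(resolve(p, "osd")) // osd has no entry, so "all" applies
}
```

With no `all` entry and no component entry, the result is the empty string, which leaves the pod without a priority class name.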


## Samples

Here are several samples for configuring Ceph clusters. Each of the samples must also include the namespace and corresponding access granted for management by the Ceph operator. See the [common cluster resources](#common-cluster-resources) below.
1 change: 1 addition & 0 deletions Documentation/ceph-filesystem-crd.md
@@ -125,3 +125,4 @@ The metadata server settings correspond to the MDS daemon settings.
* `annotations`: Key value pair list of annotations to add.
* `placement`: The mds pods can be given standard Kubernetes placement restrictions with `nodeAffinity`, `tolerations`, `podAffinity`, and `podAntiAffinity` similar to placement defined for daemons configured by the [cluster CRD](https://github.com/rook/rook/blob/{{ branchName }}/cluster/examples/kubernetes/ceph/cluster.yaml).
* `resources`: Set resource requests/limits for the Filesystem MDS Pod(s), see [Resource Requirements/Limits](ceph-cluster-crd.md#resource-requirementslimits).
* `priorityClassName`: Set priority class name for the Filesystem MDS Pod(s)
2 changes: 2 additions & 0 deletions Documentation/ceph-nfs-crd.md
@@ -58,6 +58,8 @@ spec:
# requests:
# cpu: "500m"
# memory: "1024Mi"
# the priority class to set to influence the scheduler's pod preemption
priorityClassName:
```
## NFS Settings
1 change: 1 addition & 0 deletions Documentation/ceph-object-store-crd.md
@@ -94,6 +94,7 @@ The gateway settings correspond to the RGW daemon settings.
* `annotations`: Key value pair list of annotations to add.
* `placement`: The Kubernetes placement settings to determine where the RGW pods should be started in the cluster.
* `resources`: Set resource requests/limits for the Gateway Pod(s), see [Resource Requirements/Limits](ceph-cluster-crd.md#resource-requirementslimits).
* `priorityClassName`: Set priority class name for the Gateway Pod(s)

## Runtime settings

2 changes: 2 additions & 0 deletions Documentation/helm-operator.md
@@ -111,6 +111,7 @@ The following tables lists the configurable parameters of the rook-operator char
| `hostpathRequiresPrivileged` | Runs Ceph Pods as privileged to be able to write to `hostPath`s in OpenShift with SELinux restrictions. | `false` |
| `mon.healthCheckInterval` | The frequency for the operator to check the mon health | `45s` |
| `mon.monOutTimeout` | The time to wait before failing over an unhealthy mon | `600s` |
| `discover.priorityClassName` | The priority class name to add to the discover pods | <none> |
| `discover.toleration` | Toleration for the discover pods | <none> |
| `discover.tolerationKey` | The specific key of the taint to tolerate | <none> |
| `discover.tolerations` | Array of tolerations in YAML format which will be added to discover deployment | <none> |
@@ -136,6 +137,7 @@ The following tables lists the configurable parameters of the rook-operator char
| `agent.libModulesDirPath` | Path where the Rook agent should look for kernel modules (*) | `/lib/modules` |
| `agent.mounts` | Additional paths to be mounted in the agent container (**) | <none> |
| `agent.mountSecurityMode` | Mount Security Mode for the agent. | `Any` |
| `agent.priorityClassName` | The priority class name to add to the agent pods | <none> |
| `agent.toleration` | Toleration for the agent pods | <none> |
| `agent.tolerationKey` | The specific key of the taint to tolerate | <none> |
| `agent.tolerations` | Array of tolerations in YAML format which will be added to agent deployment | <none> |
6 changes: 5 additions & 1 deletion PendingReleaseNotes.md
@@ -17,7 +17,11 @@
fix this to be `<dataDirHostPath>/log/<namespace>`, the same as other daemons.
- Use the mon configuration database for directory-based OSDs, and do not generate a config
- A new ceph-crashcollector controller has been added, that new pod will run on any node where a Ceph pod is running. Read more about this in the [doc](Documentation/ceph-cluster-crd.html#cluster-wide-resources-configuration-settings)

- PriorityClassNames can now be added to the Rook/Ceph components to influence the scheduler's pod preemption.
- mgr/mon/osd/rbdmirror: [priority class names configuration settings](Documentation/ceph-cluster-crd.md#priority-class-names-configuration-settings)
- filesystem: [metadata server settings](Documentation/ceph-filesystem-crd.md#metadata-server-settings)
- rgw: [gateway settings](Documentation/ceph-object-store-crd.md#gateway-settings)
- nfs: [samples](Documentation/ceph-nfs-crd.md#samples)

### EdgeFS

8 changes: 8 additions & 0 deletions cluster/charts/rook-ceph/templates/deployment.yaml
@@ -50,6 +50,10 @@ spec:
- name: AGENT_NODE_AFFINITY
value: {{ .Values.agent.nodeAffinity }}
{{- end }}
{{- if .Values.agent.priorityClassName }}
- name: AGENT_PRIORITY_CLASS_NAME
value: {{ .Values.agent.priorityClassName }}
{{- end }}
{{- if .Values.agent.mountSecurityMode }}
- name: AGENT_MOUNT_SECURITY_MODE
value: {{ .Values.agent.mountSecurityMode }}
@@ -80,6 +84,10 @@ spec:
- name: DISCOVER_TOLERATIONS
value: {{ toYaml .Values.discover.tolerations | quote }}
{{- end }}
{{- if .Values.discover.priorityClassName }}
- name: DISCOVER_PRIORITY_CLASS_NAME
value: {{ .Values.discover.priorityClassName }}
{{- end }}
{{- if .Values.discover.nodeAffinity }}
- name: DISCOVER_AGENT_NODE_AFFINITY
value: {{ .Values.discover.nodeAffinity }}
5 changes: 5 additions & 0 deletions cluster/examples/kubernetes/ceph/cluster.yaml
@@ -118,6 +118,11 @@ spec:
# crashcollector:
# The option to automatically remove OSDs that are out and are safe to destroy.
removeOSDsIfOutAndSafeToRemove: false
# priorityClassNames:
# all: rook-ceph-default-priority-class
# mon: rook-ceph-mon-priority-class
# osd: rook-ceph-osd-priority-class
# mgr: rook-ceph-mgr-priority-class
storage: # cluster level storage configuration and selection
useAllNodes: true
useAllDevices: true
1 change: 1 addition & 0 deletions cluster/examples/kubernetes/ceph/filesystem-ec.yaml
@@ -66,3 +66,4 @@ spec:
# requests:
# cpu: "500m"
# memory: "1024Mi"
# priorityClassName: my-priority-class
1 change: 1 addition & 0 deletions cluster/examples/kubernetes/ceph/filesystem.yaml
@@ -64,3 +64,4 @@ spec:
# requests:
# cpu: "500m"
# memory: "1024Mi"
# priorityClassName: my-priority-class
2 changes: 2 additions & 0 deletions cluster/examples/kubernetes/ceph/nfs.yaml
@@ -41,3 +41,5 @@ spec:
# requests:
# cpu: "500m"
# memory: "1024Mi"
# the priority class to set to influence the scheduler's pod preemption
priorityClassName:
1 change: 1 addition & 0 deletions cluster/examples/kubernetes/ceph/object-ec.yaml
@@ -61,3 +61,4 @@ spec:
# requests:
# cpu: "500m"
# memory: "1024Mi"
# priorityClassName: my-priority-class
1 change: 1 addition & 0 deletions cluster/examples/kubernetes/ceph/object-openshift.yaml
@@ -60,3 +60,4 @@ spec:
# requests:
# cpu: "500m"
# memory: "1024Mi"
# priorityClassName: my-priority-class
1 change: 1 addition & 0 deletions cluster/examples/kubernetes/ceph/object.yaml
@@ -60,3 +60,4 @@ spec:
# requests:
# cpu: "500m"
# memory: "1024Mi"
# priorityClassName: my-priority-class
6 changes: 6 additions & 0 deletions cluster/examples/kubernetes/ceph/operator-openshift.yaml
@@ -131,6 +131,9 @@ spec:
# (Optional) Rook Agent toleration key. Set this to the key of the taint you want to tolerate
# - name: AGENT_TOLERATION_KEY
# value: "<KeyOfTheTaintToTolerate>"
# (Optional) Rook Agent priority class name to set on the pod(s)
# - name: AGENT_PRIORITY_CLASS_NAME
# value: "<PriorityClassName>"
# (Optional) Rook Agent NodeAffinity.
# - name: AGENT_NODE_AFFINITY
# value: "role=storage-node; storage=rook,ceph"
@@ -157,6 +160,9 @@ spec:
# (Optional) Rook Discover toleration key. Set this to the key of the taint you want to tolerate
# - name: DISCOVER_TOLERATION_KEY
# value: "<KeyOfTheTaintToTolerate>"
# (Optional) Rook Discover priority class name to set on the pod(s)
# - name: DISCOVER_PRIORITY_CLASS_NAME
# value: "<PriorityClassName>"
# (Optional) Discover Agent NodeAffinity.
# - name: DISCOVER_AGENT_NODE_AFFINITY
# value: "role=storage-node; storage=rook, ceph"
6 changes: 6 additions & 0 deletions cluster/examples/kubernetes/ceph/operator.yaml
@@ -62,6 +62,9 @@ spec:
# - effect: NoExecute
# key: node-role.kubernetes.io/etcd
# operator: Exists
# (Optional) Rook Agent priority class name to set on the pod(s)
# - name: AGENT_PRIORITY_CLASS_NAME
# value: "<PriorityClassName>"
# (Optional) Rook Agent NodeAffinity.
# - name: AGENT_NODE_AFFINITY
# value: "role=storage-node; storage=rook,ceph"
@@ -97,6 +100,9 @@ spec:
# - effect: NoExecute
# key: node-role.kubernetes.io/etcd
# operator: Exists
# (Optional) Rook Discover priority class name to set on the pod(s)
# - name: DISCOVER_PRIORITY_CLASS_NAME
# value: "<PriorityClassName>"
# (Optional) Discover Agent NodeAffinity.
# - name: DISCOVER_AGENT_NODE_AFFINITY
# value: "role=storage-node; storage=rook, ceph"
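
The operator-side code that consumes `AGENT_PRIORITY_CLASS_NAME` and `DISCOVER_PRIORITY_CLASS_NAME` is not shown in this excerpt. As a rough sketch of the idea only, assuming the operator reads the variable and copies it onto the pod spec: `podSpec` and `applyPriorityClassFromEnv` below are hypothetical names, and `podSpec` models just the `PriorityClassName` field of the Kubernetes pod spec.

```go
package main

import (
	"fmt"
	"os"
)

// podSpec is a hypothetical stand-in for the Kubernetes PodSpec;
// only the field relevant to this change is modeled.
type podSpec struct {
	PriorityClassName string
}

// applyPriorityClassFromEnv copies the value of envVar onto the pod spec.
// The lookup function is passed in (os.Getenv in production) so the
// behavior can be exercised without touching the process environment.
// An unset or empty variable leaves the spec unchanged.
func applyPriorityClassFromEnv(spec *podSpec, envVar string, getenv func(string) string) {
	if name := getenv(envVar); name != "" {
		spec.PriorityClassName = name
	}
}

func main() {
	spec := podSpec{}
	applyPriorityClassFromEnv(&spec, "DISCOVER_PRIORITY_CLASS_NAME", os.Getenv)
	fmt.Printf("priorityClassName=%q\n", spec.PriorityClassName)
}
```

Treating an empty value as "not set" matches the Helm template in this commit, which only emits the environment variable when a value is configured.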
2 changes: 2 additions & 0 deletions design/ceph/ceph-nfs-ganesha.md
@@ -114,6 +114,8 @@ spec:
# requests:
# cpu: "500m"
# memory: "1024Mi"
# the priority class to set to influence the scheduler's pod preemption
priorityClassName:
```
When the nfs-ganesha.yaml is created the following will happen:
53 changes: 53 additions & 0 deletions pkg/apis/ceph.rook.io/v1/priorityclasses.go
@@ -0,0 +1,53 @@
/*
Copyright 2019 The Rook Authors. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package v1

import (
rook "github.com/rook/rook/pkg/apis/rook.io/v1alpha2"
)

// GetMgrPriorityClassName returns the priority class name for the MGR service
func GetMgrPriorityClassName(p rook.PriorityClassNamesSpec) string {
if _, ok := p[KeyMgr]; !ok {
return p.All()
}
return p[KeyMgr]
}

// GetMonPriorityClassName returns the priority class name for the monitors
func GetMonPriorityClassName(p rook.PriorityClassNamesSpec) string {
if _, ok := p[KeyMon]; !ok {
return p.All()
}
return p[KeyMon]
}

// GetOSDPriorityClassName returns the priority class name for the OSDs
func GetOSDPriorityClassName(p rook.PriorityClassNamesSpec) string {
if _, ok := p[KeyOSD]; !ok {
return p.All()
}
return p[KeyOSD]
}

// GetRBDMirrorPriorityClassName returns the priority class name for the RBD Mirrors
func GetRBDMirrorPriorityClassName(p rook.PriorityClassNamesSpec) string {
if _, ok := p[KeyRBDMirror]; !ok {
return p.All()
}
return p[KeyRBDMirror]
}
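
One detail of the getters above is that they test for the presence of the component key, not for a non-empty value. The sketch below illustrates the consequence with local stand-ins: `PriorityClassNamesSpec` is redeclared here as a plain string map, the key constants are assumed to match the documented CRD keys, and `classFor` is a hypothetical helper with the same shape as the four getters.

```go
package main

import "fmt"

// PriorityClassNamesSpec is a local stand-in for rook.PriorityClassNamesSpec.
type PriorityClassNamesSpec map[string]string

// Key values are assumptions based on the documented CRD keys.
const (
	keyAll = "all"
	keyMgr = "mgr"
	keyOSD = "osd"
)

// All mirrors the method added in pkg/apis/rook.io/v1alpha2:
// it yields the "all" entry, or "" when that entry is missing.
func (p PriorityClassNamesSpec) All() string {
	return p[keyAll]
}

// classFor is a hypothetical helper that uses the same presence check
// as GetMgrPriorityClassName and its siblings.
func classFor(p PriorityClassNamesSpec, key string) string {
	if _, ok := p[key]; !ok {
		return p.All()
	}
	return p[key]
}

func main() {
	p := PriorityClassNamesSpec{keyAll: "all-class", keyMgr: ""}
	fmt.Printf("mgr=%q\n", classFor(p, keyMgr)) // key present but empty: no fallback to "all"
	fmt.Printf("osd=%q\n", classFor(p, keyOSD)) // key absent: falls back to "all"
}
```

Under these assumptions, an explicitly empty component entry opts that component out of the `all` default, while omitting the entry inherits it.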
12 changes: 12 additions & 0 deletions pkg/apis/ceph.rook.io/v1/types.go
@@ -68,6 +68,9 @@ type ClusterSpec struct {
// Resources set resource requests and limits
Resources rook.ResourceSpec `json:"resources,omitempty"`

// PriorityClassNames sets priority classes on components
PriorityClassNames rook.PriorityClassNamesSpec `json:"priorityClassNames,omitempty"`

// The path on the host where config and data can be persisted.
DataDirHostPath string `json:"dataDirHostPath,omitempty"`

@@ -293,6 +296,9 @@ type MetadataServerSpec struct {

// The resource requirements for the mds pods
Resources v1.ResourceRequirements `json:"resources"`

// PriorityClassName sets the priority class on the mds pods
PriorityClassName string `json:"priorityClassName,omitempty"`
}

// +genclient
@@ -378,6 +384,9 @@ type GatewaySpec struct {

// The resource requirements for the rgw pods
Resources v1.ResourceRequirements `json:"resources"`

// PriorityClassName sets priority classes on the rgw pods
PriorityClassName string `json:"priorityClassName,omitempty"`
}

// +genclient
@@ -425,6 +434,9 @@ type GaneshaServerSpec struct {

// Resources set resource requests and limits
Resources v1.ResourceRequirements `json:"resources,omitempty"`

// PriorityClassName sets the priority class on the pods
PriorityClassName string `json:"priorityClassName,omitempty"`
}

// NetworkSpec for Ceph includes backward compatibility code
7 changes: 7 additions & 0 deletions pkg/apis/ceph.rook.io/v1/zz_generated.deepcopy.go


25 changes: 25 additions & 0 deletions pkg/apis/rook.io/v1alpha2/priorityclasses.go
@@ -0,0 +1,25 @@
/*
Copyright 2019 The Rook Authors. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package v1alpha2

// All returns the priority class name defined for 'all' daemons in the Ceph cluster CRD.
func (p PriorityClassNamesSpec) All() string {
if val, ok := p[KeyAll]; ok {
return val
}
return ""
}
61 changes: 61 additions & 0 deletions pkg/apis/rook.io/v1alpha2/priorityclasses_test.go
@@ -0,0 +1,61 @@
/*
Copyright 2019 The Rook Authors. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package v1alpha2

import (
"encoding/json"
"testing"

"github.com/ghodss/yaml"
"github.com/stretchr/testify/assert"
)

func TestPriorityClassNamesSpec(t *testing.T) {
specYaml := []byte(`
all: all-class
mgr: mgr-class
mon: mon-class
osd: osd-class
`)

// convert the raw spec yaml into JSON
rawJSON, err := yaml.YAMLToJSON(specYaml)
assert.Nil(t, err)

// unmarshal the JSON into a strongly typed priority class names spec object
var priorityClassNames PriorityClassNamesSpec
err = json.Unmarshal(rawJSON, &priorityClassNames)
assert.Nil(t, err)

// the unmarshalled priority class names spec should equal the expected spec below
expected := PriorityClassNamesSpec{
"all": "all-class",
"mgr": "mgr-class",
"mon": "mon-class",
"osd": "osd-class",
}
assert.Equal(t, expected, priorityClassNames)
}

func TestPriorityClassNamesDefaultToAll(t *testing.T) {
priorityClassNames := PriorityClassNamesSpec{
"all": "all-class",
"mon": "mon-class",
}

assert.Equal(t, "all-class", priorityClassNames.All())
}