Commit introduced a third CPU pool type, called "default". (nokia#8)
The idea behind the default pool is that CPUs belonging to this pool need not be advertised as devices.
Workloads not explicitly asking for either exclusive or shared pool resources will be automatically restricted to the "default" pool's CPUset.
For the sake of uniform handling, every misconfigured pool (neither exclusive nor shared) is considered "default" type from now on, and is not advertised to the Device Manager.

Documentation enhanced to reflect the functional change.
References to the configurable resourceBaseName parameter were also cleaned up.
Levovar authored and TimoLindqvist committed Feb 7, 2019
1 parent b222d26 commit 2ad6c03
Showing 5 changed files with 46 additions and 35 deletions.
13 changes: 7 additions & 6 deletions README.md
@@ -22,8 +22,6 @@ kind: ConfigMap
metadata:
name: cpu-pooler-configmap
data:
cpu-pooler.yaml: |
resourceBaseName: "<name>"
poolconfig-<name>.yaml: |
pools:
<poolname1>:
@@ -33,19 +31,21 @@ data:
nodeSelector:
<key> : <value>
```
The cpu-pooler.yaml file must exist in the data section. It defines the resourceBaseName field which is the advertised resource name without the resource - i.e only the `vendor-domain`.
The cpu-pooler.yaml file must exist in the data section.
The cpu pools are defined in poolconfig-<name>.yaml files. There must be at least one poolconfig-<name>.yaml file in the data section.
Pool name from the config will be the resource in the fully qualified resource name (`<resurceBaseName>/<pool name>`). The pool name must have pool type prefix - 'exclusive' for exclusive cpu pool or 'shared' for shared cpu pool. The nodeSelector is used to tell in which node this pool configuration is used. CPU pooler reads the node labels and selects the config that matches the nodeSelector.
Pool name from the config will be the resource in the fully qualified resource name (`nokia.k8s.io/<pool name>`). The pool name must have pool type prefix - 'exclusive' for exclusive cpu pool or 'shared' for shared cpu pool.
A CPU pool not having either of these special prefixes is considered as the cluster-wide 'default' CPU pool, and as such, CPU cores belonging to this pool will not be advertised to the Device Manager as schedulable resources.
The nodeSelector is used to tell in which node this pool configuration is used. CPU pooler reads the node labels and selects the config that matches the nodeSelector.

In the deployment directory there is a sample pool config with two exclusive pools (both have two cpus) and one shared pool (one cpu). Nodes for the pool configurations are selected by `nodeType` label.
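The sample in the deployment directory is not reproduced here; as an illustration only, a minimal pool configuration following the schema above might look like this (the pool names, cpu lists, and the `nodeType` label value are hypothetical, not taken from the actual sample):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cpu-pooler-configmap
data:
  poolconfig-worker.yaml: |
    pools:
      # "exclusive" prefix -> advertised as nokia.k8s.io/exclusive-pool
      exclusive-pool:
        cpus: "2,3"
      # "shared" prefix -> advertised as nokia.k8s.io/shared-pool
      shared-pool:
        cpus: "4"
      # no special prefix -> default pool, not advertised
      default:
        cpus: "0,1"
    nodeSelector:
      nodeType: worker
```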

### Pod spec

The cpu-device-plugin advertises the resources as name: `<resoruceBaseName>/<poolname>`. The poolname is pool name configured in cpu-pooler-configmap. The cpus are requested in the resources section of container in the pod spec.
The cpu-device-plugin advertises the resources of exclusive and shared CPU pools under the name `nokia.k8s.io/<poolname>`. The poolname is the pool name configured in cpu-pooler-configmap. The cpus are requested in the resources section of the container in the pod spec.
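For instance, assuming a pool named `exclusive-pool` is defined in the ConfigMap (a hypothetical name used here for illustration), a container could request two of its cpus like this:

```yaml
spec:
  containers:
  - name: app
    image: myapp:latest        # hypothetical image
    resources:
      requests:
        nokia.k8s.io/exclusive-pool: 2
      limits:
        nokia.k8s.io/exclusive-pool: 2
```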

### Annotation:

Annotation schema is following and the name for the annotation is `<resourceBaseName>/cpus`. Resource being the the advertised resource name.
The annotation schema is the following, and the name of the annotation is `nokia.k8s.io/cpus`, the prefix being the advertised resource name.
```
{
"$schema": "http://json-schema.org/draft-07/schema#",
@@ -109,6 +109,7 @@ Following restrictions apply when allocating cpu from pools and configuring pool

* There can be only one shared pool in the node
* Container can ask cpus from one type of pool only (shared or exclusive)
* Resources belonging to the default pool are not advertised. The default pool definition is only used by the CPUSetter component.

## Build

34 changes: 12 additions & 22 deletions cmd/cpu-device-plugin/cpu-device-plugin.go
@@ -20,7 +20,10 @@ import (
"time"
)

var resourceBaseName = "nokia.k8s.io"
var (
resourceBaseName = "nokia.k8s.io"
cdms []*cpuDeviceManager
)

type cpuDeviceManager struct {
pool types.Pool
@@ -55,7 +58,6 @@ func (cdm *cpuDeviceManager) Start() error {
return net.DialTimeout("unix", addr, timeout)
}),
)

if err != nil {
glog.Errorf("Error. Could not establish connection with gRPC server: %v", err)
return err
@@ -70,7 +72,6 @@ func (cdm *cpuDeviceManager) cleanup() error {
if err := os.Remove(pluginEndpoint); err != nil && !os.IsNotExist(err) {
return err
}

return nil
}

@@ -79,13 +80,9 @@ func (cdm *cpuDeviceManager) Stop() error {
if cdm.grpcServer == nil {
return nil
}

cdm.grpcServer.Stop()
cdm.grpcServer = nil

return cdm.cleanup()
}

func (cdm *cpuDeviceManager) ListAndWatch(e *pluginapi.Empty, stream pluginapi.DevicePlugin_ListAndWatchServer) error {
@@ -108,7 +105,6 @@ func (cdm *cpuDeviceManager) ListAndWatch(e *pluginapi.Empty, stream pluginapi.D
glog.Errorf("Error. Cannot update device states: %v\n", err)
return err
}

updateNeeded = false
}
//TODO: When is update needed ?
@@ -171,21 +167,12 @@ func (cdm *cpuDeviceManager) Register(kubeletEndpoint, resourceName string) erro
}

func newCPUDeviceManager(poolName string, pool types.Pool, sharedCPUs string) *cpuDeviceManager {

var poolType string
glog.Infof("Starting plugin for pool: %s", poolName)

if strings.HasPrefix(poolName, "shared") {
poolType = "shared"
} else {
poolType = "exclusive"
}

return &cpuDeviceManager{
pool: pool,
socketFile: fmt.Sprintf("cpudp_%s.sock", poolName),
sharedPoolCPUs: sharedCPUs,
poolType: poolType,
poolType: types.DeterminePoolType(poolName),
}
}

@@ -199,14 +186,19 @@ func createPluginsForPools() error {
glog.Fatal(err)
}
}
poolConf, _, err := types.DeterminePoolConfig()
if err != nil {
glog.Fatal(err)
}
glog.Infof("Pool configuration %v", poolConf)
var sharedCPUs string
for poolName, pool := range poolConf.Pools {
if strings.HasPrefix(poolName, "shared") {
poolType := types.DeterminePoolType(poolName)
//Default or unrecognizable pools need not be made available to Device Manager as schedulable devices
if poolType == types.DefaultPoolID {
continue
}
if poolType == types.SharedPoolID {
if sharedCPUs != "" {
err = fmt.Errorf("Only one shared pool allowed")
glog.Errorf("Pool config : %v", poolConf)
Expand Down Expand Up @@ -238,8 +230,6 @@ func createPluginsForPools() error {
return err
}
func main() {
flag.Parse()
watcher, _ := fsnotify.NewWatcher()
2 changes: 0 additions & 2 deletions deployment/cpu-dp-config.yaml
@@ -3,8 +3,6 @@ kind: ConfigMap
metadata:
name: cpu-pooler-configmap
data:
cpu-pooler.yaml: |
resourceBaseName: "nokia.k8s.io"
poolconfig-controller.yaml: |
pools:
exclusive-pool:
6 changes: 3 additions & 3 deletions pkg/sethandler/controller.go
@@ -117,13 +117,13 @@ func (setHandler *SetHandler) adjustContainerSets(pod v1.Pod) {
func (setHandler *SetHandler) determineCorrectCpuset(pod v1.Pod, container v1.Container) (cpuset.CPUSet, error) {
for resourceName := range container.Resources.Requests {
resNameAsString := string(resourceName)
if strings.Contains(resNameAsString, resourceBaseName) && strings.Contains(resNameAsString, "shared") {
if strings.Contains(resNameAsString, resourceBaseName) && strings.Contains(resNameAsString, types.SharedPoolID) {
return cpuset.Parse(setHandler.poolConfig.SelectPool(resNameAsString).CPUs)
} else if strings.Contains(resNameAsString, resourceBaseName) && strings.Contains(resNameAsString, "exclusive") {
} else if strings.Contains(resNameAsString, resourceBaseName) && strings.Contains(resNameAsString, types.ExclusivePoolID) {
return setHandler.getListOfAllocatedExclusiveCpus(resNameAsString, pod, container)
}
}
return cpuset.Parse(setHandler.poolConfig.SelectPool(resourceBaseName + "/default").CPUs)
return cpuset.Parse(setHandler.poolConfig.SelectPool(resourceBaseName + "/" + types.DefaultPoolID).CPUs)
}

func (setHandler *SetHandler) getListOfAllocatedExclusiveCpus(exclusivePoolName string, pod v1.Pod, container v1.Container) (cpuset.CPUSet, error) {
26 changes: 24 additions & 2 deletions pkg/types/pool.go
@@ -10,6 +10,19 @@ import (
"path/filepath"
)

const (
//SharedPoolID is the constant prefix in the name of the CPU pool. It is used to signal that a CPU pool is of shared type
SharedPoolID = "shared"
//ExclusivePoolID is the constant prefix in the name of the CPU pool. It is used to signal that a CPU pool is of exclusive type
ExclusivePoolID = "exclusive"
//DefaultPoolID is the constant prefix in the name of the CPU pool. It is used to signal that a CPU pool is of default type
DefaultPoolID = "default"
)

var (
//PoolConfigDir defines the pool configuration file location
PoolConfigDir = "/etc/cpu-pooler"
)
// Pool defines cpupool
type Pool struct {
CPUs string `yaml:"cpus"`
@@ -21,8 +34,17 @@ type PoolConfig struct {
NodeSelector map[string]string `yaml:"nodeSelector"`
}

//DeterminePoolType takes the name of CPU pool as defined in the CPU-Pooler ConfigMap, and returns the type of CPU pool it represents.
//Type of the pool is determined based on the constant prefixes used in the name of the pool.
//A type can be shared, exclusive, or default.
func DeterminePoolType(poolName string) string {
if strings.HasPrefix(poolName, SharedPoolID) {
return SharedPoolID
} else if strings.HasPrefix(poolName, ExclusivePoolID) {
return ExclusivePoolID
}
return DefaultPoolID
}
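The classification above can be exercised on its own; the following standalone sketch reproduces `DeterminePoolType` exactly as shown in the diff (the pool names in `main` are hypothetical examples, not from the repository):

```go
package main

import (
	"fmt"
	"strings"
)

const (
	// Constant prefixes used in pool names, mirroring pkg/types/pool.go
	SharedPoolID    = "shared"
	ExclusivePoolID = "exclusive"
	DefaultPoolID   = "default"
)

// DeterminePoolType derives the pool type purely from the name prefix;
// anything without a recognized prefix falls back to the default pool.
func DeterminePoolType(poolName string) string {
	if strings.HasPrefix(poolName, SharedPoolID) {
		return SharedPoolID
	} else if strings.HasPrefix(poolName, ExclusivePoolID) {
		return ExclusivePoolID
	}
	return DefaultPoolID
}

func main() {
	for _, name := range []string{"shared-pool", "exclusive-caas", "misconfigured-pool"} {
		fmt.Printf("%s => %s\n", name, DeterminePoolType(name))
	}
}
```

Note that this is why every misconfigured pool is silently treated as "default": there is no error branch, only the prefix check and the fallback.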

//DeterminePoolConfig first interrogates the label set of the Node this process runs on.
//It uses this information to select the specific PoolConfig file corresponding to the Node.
