-
Notifications
You must be signed in to change notification settings - Fork 14.5k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
blog: node: device manager graduation to GA
Signed-off-by: Swati Sehgal <[email protected]>
- Loading branch information
1 parent
1efa1e6
commit 6221639
Showing
1 changed file
with
95 additions
and
0 deletions.
There are no files selected for viewing
95 changes: 95 additions & 0 deletions
95
content/en/blog/_posts/2022-11-21-devicemanager-ga.md/index.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,95 @@ | ||
--- | ||
layout: blog | ||
title: 'Kubernetes 1.26: Graduation of Device Manager to GA!' | ||
date: 2021-11-21 | ||
slug: graduation-of-devicemanager-to-GA | ||
--- | ||
|
||
**Author:** Swati Sehgal (Red Hat) | ||
|
||
## Quick Intro to the Device Plugin framework | ||
|
||
### Introduction | ||
Device Plugin framework was introduced in Kubernetes v1.8 release as a vendor | ||
independent framework to enable discovery, advertisement and allocation of external | ||
devices without modifying core Kubernetes. The feature graduated to Beta in v1.10. | ||
Due to its widespread use and adoption, this feature is being graduated to GA in | ||
v1.26. | ||
|
||
### Device Manager in Kubelet and Device Plugins | ||
Device Manager was introduced as a component in Kubelet to facilitate communication | ||
with device plugins over gRPC through Unix sockets. Device Manager and Device plugins | ||
act as both gRPC servers and clients by serving and connecting to the exposed gRPC | ||
services respectively. Device plugins serve a gRPC service that Kubelet connects to | ||
for device discovery, advertisement (as extended resources) and allocation. Device | ||
Manager connects to the `Registration` gRPC service served by Kubelet to register | ||
itself to Kubelet. | ||
|
||
### Pod requesting a device | ||
Please refer to the documentation for an [example](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#example-pod) on how a pod can request a device exposed to the cluster by a device plugin. | ||
|
||
### Example Device Plugin implmentations | ||
Here are some device plugins for example implmentations: | ||
- [NVIDIA device plugin for Kubernetes](https://github.com/NVIDIA/k8s-device-plugin) | ||
- [Collection of Intel device plugins for Kubernetes](https://github.com/intel/intel-device-plugins-for-kubernetes) | ||
- [SRIOV network device plugin for Kubernetes](https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin) | ||
- [AMD GPU device plugin](https://github.com/RadeonOpenCompute/k8s-device-plugin) | ||
|
||
## Noteworthy developments since Device Plugin framework introduction | ||
|
||
### Kubelet APIs moved to kubelet staging repo | ||
External facing `deviceplugin` API packages moved from `k8s.io/kubernetes/pkg/kubelet/apis/` | ||
to `k8s.io/kubelet/pkg/apis/` in v1.17. Refer to [Move external facing kubelet apis to staging](https://github.com/kubernetes/kubernetes/pull/83551) for more details on the rationale behind this change. | ||
|
||
### Device Plugin API updates | ||
Additional gRPC endpoints introduced: | ||
1. `GetDevicePluginOptions` is used by device plugins to communicate | ||
options to the `DeviceManager`. | ||
1. `GetPreferredAllocation` allows a device plugin to forward allocation | ||
preferrence to the `DeviceManager` so it can incorporate this information | ||
into its allocation decisions. The `DeviceManager` will call out to a | ||
plugin at pod admission time asking for a preferred device allocation | ||
of a given size from a list of available devices to make a more informed | ||
decision. | ||
1. `PreStartContainer` is called before each container start if indicated by | ||
device plugins during registeration phase. It allows Device Plugins to run device | ||
specific operations on the Devices requested. | ||
|
||
Pull Requests that introduced these changes are here: | ||
1. [Invoke preStart RPC call before container start, if desired by plugin](https://github.com/kubernetes/kubernetes/pull/58282) | ||
1. [Add GetPreferredAllocation() call to the v1beta1 device plugin API](https://github.com/kubernetes/kubernetes/pull/92665) | ||
|
||
With introduction of the above endpoints the interaction between Device Manager in | ||
Kubelet and Device Manager can be shown as below: | ||
|
||
<TODO: Add a diagram> | ||
|
||
### Change in semantics for the device plugin registration process | ||
Device plugin code was refactored to separate 'plugin' package under the `devicemanager` | ||
package to lay the groundwork for introducing a `v1beta2` device plugin API. This would | ||
allow adding support in `devicemanager` to service multiple device plugin APIs at the | ||
same time. | ||
|
||
With this refactoring work, it is now mandatory for a device plugin to start serving its gRPC | ||
service before registering itself to Kubelet. Previously, these two operations were asynchornous | ||
and device plugin could register itself before starting its gRPC server which is no longer the | ||
ase. For more details, refer to [PR #109016](https://github.com/kubernetes/kubernetes/pull/109016) and [Issue #112395](https://github.com/kubernetes/kubernetes/issues/112395). | ||
|
||
### Dynamic resource allocation | ||
In Kubernetes 1.26, inspired by how [Persistent Volumes](/docs/concepts/storage/persistent-volumes) are handled in Kubernetes, [Dynamic Resource Allocation](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/3063-dynamic-resource-allocation) has been introduced to cater to devices that have more sophisticated resource requirements | ||
like: | ||
1. Decouple device initialization and allocation from the pod lifecycle. | ||
1. Facilitate dynamic sharing of devices between containers and pods. | ||
1. Support custom resource-specific parameters | ||
1. Enable resource-specific setup and cleanup actions | ||
1. Enable support for Network-attached resources, not just node-local resources | ||
|
||
## Is the Device Plugin API stable now? | ||
No, the Device Plugin API is still not stable, the latest Device Plugin API version | ||
available is `v1beta1`. There are plans in the community to introduce `v1beta2` API | ||
to service multiple plugin APIs at once. A per-API call with request/response types would allow adding support for | ||
newer API versions without explicitly bumping the API. | ||
|
||
In addition to that, there are existing proposals in the community to introduce additional endpoints [KEP-3162: Add Deallocate and PostStopContainer to Device Manager API](https://github.com/kubernetes/kubernetes/pull/109016). | ||
|
||
For more details, refer to the slack thread [here](https://kubernetes.slack.com/archives/C0BP8PW9G/p1667481109075849). |