Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
libcontainer: add support for Intel RDT/CAT in runc
This PR fixes issue opencontainers#433 opencontainers#433 About Intel RDT/CAT feature: Intel platforms with new Xeon CPU support Resource Director Technology (RDT). Intel Cache Allocation Technology (CAT) is a sub-feature of RDT. Currently L3 Cache is the only resource that is supported in RDT. This feature provides a way for the software to restrict cache allocation to a defined 'subset' of L3 cache which may be overlapping with other 'subsets'. The different subsets are identified by class of service (CLOS) and each CLOS has a capacity bitmask (CBM). For more information about Intel RDT/CAT can be found in the section 17.17 of Intel Software Developer Manual and the kernel document: https://lkml.org/lkml/2016/7/12/747 About Intel RDT/CAT kernel interface: Intel Cache Allocation Technology (CAT) is a sub-feature of Resource Director Technology (RDT), which currently supports L3 cache resource allocation. In Linux kernel, it is exposed via "resource control" filesystem, which is a "cgroup-like" interface. Intel RDT "resource control" filesystem hierarchy: /sys/fs/rscctrl |-- cpus |-- info | |-- info | |-- l3 | |-- domain_to_cache_id | |-- max_cbm_len | |-- max_closid |-- schemas |-- tasks |-- <container_id> |-- cpus |-- schemas |-- tasks The file `tasks` has all task ids belonging to the partition "container_id". The task ids in the file will be added or removed among partitions. A task id only stays in one directory at the same time. The file `schemas` has allocation bitmasks/values for L3 cache on each socket, which contains L3 cache id and capacity bitmask (CBM). Format: "L3:<cache_id0>=<cbm0>;<cache_id1>=<cbm1>;..." For example, on a two-socket machine, L3's schema line could be `L3:0=ff;1=c0` which means L3 cache id 0's CBM is 0xff, and L3 cache id 1's CBM is 0xc0. The valid L3 cache CBM is a *contiguous bits set* and number of bits that can be set is less than the max bit. The max bits in the CBM is varied among supported Intel Xeon platforms. In Intel RDT "resource control" filesystem layout, the CBM in a "partition" should be a subset of the CBM in root. Kernel will check if it is valid when writing. e.g., 0xfffff in root indicates the max bits of CBM is 20 bits, which mapping to entire L3 cache capacity. Some valid CBM values to set in a "partition": 0xf, 0xf0, 0x3ff, 0x1f00 and etc. The file `cpus` has a cpu bitmask that specifies the CPUs that are bound to the schemas. Any tasks scheduled on the cpus will use the schemas. Comparing with cgroups, intelRdt has similar process management lifecycle and interfaces in a container. But unlike cgroups' hierarchy, it has single level filesystem layout. When intelRdt is joined, the statistics can be collected from a container. For more information about Intel RDT/CAT kernel interface: https://lkml.org/lkml/2016/7/12/764 An example for runc: There are two L3 caches in the two-socket machine, the default CBM is 0xfffff and the max CBM length is 20 bits. This configuration assigns 4/5 of L3 cache id 0 and the whole L3 cache id 1 for the container: "linux": { "resources": { "intelRdt": { "l3CacheSchema": "L3:0=ffff0;1=fffff", "L3CacheCpus": "00000000,00000000,00000000,00000000,00000000,00000000" } } } Signed-off-by: Xiaochen Shen <[email protected]>
- Loading branch information