-
Notifications
You must be signed in to change notification settings - Fork 1
vmaffione/bpfhv
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
=== Structure of this repository === The driver/ directory contains the bpfhv guest driver for Linux kernels (>= 4.18): - bpfhv.c: driver source code The proxy/ directory contains the implementation of an external backend process associated to the QEMU bpfhv-proxy network backend (-netdev bpfhv-proxy). The external backend provides the queue processing functionalities for a single bpfhv device that belongs to a QEMU VM. In other words, QEMU only implements the control functionalities of the device, while RX and TX queues are processed by the external process. Similarly to vhost-user, QEMU and the external backend process use a dedicated control channel to exchange some information needed for the packet processing tasks (e.g. guest memory map, addresses of each TX and RX queue, file descriptors for notifications, etc.). Files: - backend.c: main file, implements the control protocol and two packet processing loops (the first one is a poll() event loop, whereas the second one uses busy-wait; - sring.[ch]: hv implementation of a device which uses a minimal descriptor format, with no support for offloads (and reduced per-packet overhead); - sring_progs.c: eBPF programs for the sring device; - sring_gso.[ch]: hv implementation of a device which uses an extended descriptor format, supporting checksum offloads and TCP/UDP segmentation offloads; - sring_gso_progs.c: eBPF programs for the sring_gso device - vring_packed.[ch]: hv implementation of the packed virtqueue in the VirtIO 1.1 specification; - vring_packed_progs.c: eBPF programs for the vring_packed device; - start-qemu.sh: an example script to start a QEMU VM with a bpfhv device peered with a bpfhv-proxy network backend; - start-proxy.sh: an example script to start the external backend process and configure the backend network device (e.g. a TAP interface or a netmap port); === Some advantages of bpfhv === - Have doorbells on separate pages (configurable stride) - Provider can evolve the medatata header (e.g., virtio-net) to balance between the needs of FreeBSD and Linux. (virtio-net is good for Linux, but not for FreeBSD). - Virtio 1.1 vs 1.0 (while 0.95 is still around). This is a sign that there is a need for evolution and compatibility problems. - You can define a metadata format (e.g. virtio-net header) that fits the specific hardware NIC features used by the cloud provider. - Let the provider inject code to encrypt/decrypt the payload, together with the hardcoded key. The encrypt/decrypt routines can be helper functions that take as argument the OS packet pointer and the key. - Simplification of device paravirtualization. Fixed datapath ABI means that you need to be backward compatible. Look at virtio implementation in Linux 4.20: it needs to support both split and packet ring --> complex, error prone, less efficient. - Change virtual switch and backend under the hood (tap, netmap, other). - Adapt to changing workloads. === TODOs (driver) === - Let BPFHV_MAX_TX_BUFS and BPFHV_MAX_RX_BUFS be variable. This requires of course reshaping the layout of the context data structures. - Try to replace dma_map_single() with dma_map_page() on the RX datapath ? Not sure this is relevant. - What if the eBPF program needs to modify the SG layout, e.g., for encapsulation or encryption? This would require changing the paddr/vaddr/len in the buffer descriptors, and DMA mapping and unmapping... So maybe we should ask the eBPF program to DMA map/unmap so, that it can do that after encapsulation or encryption (i.e. once the SG layout is stable). === TODOs (qemu) === - Replace cpu_physical_memory_[un]map() with dma_memory_[un]map() and the MemoryRegionCache library. This should be only necessary if the guest platform has an IOMMU. Code in virtqueue_pop() and virtqueue_push(). - let backend.ops.init fail (vring packed < 2^15) - move vring_packed generic code at the top of the files
About
Adaptive network paravirtualization for continuous Cloud Provider evolution
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published