1_Runtime_Environment_Build_and_Automation
Our baseline runtime environment is a Debian Bookworm (12.1) image customized using Packer.
This baseline is then extended by configuring the following runtime dependencies.
Fundamental runtime dependencies:
Governance | Product | From Source | Software License | Source instructions | Done |
---|---|---|---|---|---|
Docker | buildx | Docker Inc. | Apache 2.0 | None | ✅ |
NVIDIA | CUDA | NVIDIA CUDA | Proprietary | None | ✅ |
Intel | Math Kernel Libraries | Intel Libraries | Intel Simplified | Maybe, or don't build | ? |
| | FAISS | Facebook FAISS | MIT | Built! | ✅ |
Linux Foundation | Pytorch | Pytorch from source | 3-clause BSD | Built! | ✅ / 🔰 |
Symas | LMDB | An extremely well regarded key/value store | Open LDAP Public 2.8 | unknown | 📛 |
Linux Foundation | Cloud-Hypervisor | Cloud Hypervisor built with musl | Dual: 3-clause BSD, Apache 2.0 | use cargo deb | 📛 |
Red Hat, Inc. | virtiofsd | build and configure virtiofsd | Dual: 3-clause BSD, Apache 2.0 | use cargo deb | 📛 |
Linux Foundation | Istio Service Mesh | - | Apache 2.0 | do we need to? | 📛 |
Future interest?
https://github.com/tunib-ai/parallelformers
**N/P indicates impossible because source code is not available.**
- CUDA 12.2
- Intel Math Kernel Libraries
- Facebook FAISS
https://github.com/artificialwisdomai/origin/pull/109#pullrequestreview-1662554300
These kernel options have been extensively tested, are correct on both AMD and Intel, and are focused on virtualization.
option | value | what it does | why we set it |
---|---|---|---|
modprobe.blacklist | nouveau | prevents nouveau from being loaded | interferes with the proprietary NVIDIA module nvidia |
pci | realloc=on | forces reallocation of PCI resources | not correctly autodetected? |
pcie_ports | native | force native access to PCIe services | ? |
usbcore.nousb | - | disables USB | ? |
vsyscall | none | disables vsyscalls (vDSO still works though) | hardening |
intel_iommu | on,sm_on | try loading Intel IOMMU driver in scalable mode | ? |
amd_iommu | on | try loading AMD IOMMU driver | ? |
amd_iommu_intr | vapic | use virtual APIC routing when possible | accelerates virtualization |
iommu | pt | set the default IOMMU domain to passthrough (identity mapping) for host devices | ? |
iommu.strict | 1 | invalidate TLBs synchronously when DMA regions are unmapped, trading performance for isolation | hardening |
iommu.forcedac | 1 | force dual-address cycle for PCI resources; allocate PCI ranges in 64-bit address space when possible | ? |
kvm-amd.avic | 1 | force-enable AVIC for AMD KVM | required for amd_iommu_intr=vapic |
kvm-amd.nested | 0 | disable nested virtualization for AMD KVM | hardening? |
kvm-amd.npt | 1 | force-enable nested page tables | cargo cult; should already be enabled by default unless hardware doesn't support it |
transparent_hugepage | never | disable automatic transparent hugepages backing anonymous mmap | ? |
nmi_watchdog | 0 | disable NMI watchdog for hardlocks | ? |
default_hugepagesz | 2M | allocate hugepages with 2MiB size | ? |
hugepagesz | 2M | allocate hugepages of 2MiB size at boot | ? |
hugepages | 400000 | allocate 400k hugepages at boot | ? |
video | efifb:off | disable UEFI-compatible framebuffer | cargo cult |
iommu.passthrough | 1 | don't route DMA through IOMMU | performance |
rd.driver.pre | vfio-pci | prefer Virtual Function I/O for PCI resource access | ? |
pcie_port_pm | off | disable power management of all PCIe ports | performance? |
pcie_bus_perf | - | try to configure PCIe busses for best performance | performance |
pcie_aspm | off | disable PCIe Active State Power Management (ASPM) | performance? |
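
The table above maps onto a kernel command line one token per row. The following is a minimal sketch, assuming a row with value `-` is a bare flag and every other row is written as `option=value`. The helper names (`kernel_token`, `build_cmdline`, `hugepage_gib`, `missing_tokens`) are hypothetical and not part of the repository. It also works out the boot-time hugepage reservation: 400000 pages of 2 MiB come to 800000 MiB, roughly 781 GiB.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative only.

# Format one table row: "-" means a bare flag without a value.
kernel_token() {
    if [ "$2" = "-" ]; then
        printf '%s' "$1"
    else
        printf '%s=%s' "$1" "$2"
    fi
}

# Join alternating option/value arguments into one space-separated string.
build_cmdline() {
    cmdline=""
    while [ "$#" -ge 2 ]; do
        token=$(kernel_token "$1" "$2")
        cmdline="${cmdline:+$cmdline }$token"
        shift 2
    done
    printf '%s\n' "$cmdline"
}

# Memory reserved by boot-time hugepages in whole GiB (integer division).
# $1 = number of pages, $2 = page size in MiB.
hugepage_gib() {
    echo $(( $1 * $2 / 1024 ))
}

# Print every expected token that is absent from a command line string.
# $1 = command line (for example the contents of /proc/cmdline),
# remaining arguments = expected tokens.
missing_tokens() {
    have=" $1 "
    shift
    for want in "$@"; do
        case "$have" in
            *" $want "*) ;;
            *) printf '%s\n' "$want" ;;
        esac
    done
}

# A few rows from the table, not the full set.
build_cmdline modprobe.blacklist nouveau iommu pt usbcore.nousb - hugepages 400000
hugepage_gib 400000 2   # prints 781
```

On a booted host, something like `missing_tokens "$(cat /proc/cmdline)" iommu=pt vsyscall=none` would list any of the expected options that did not make it onto the running kernel's command line.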
Redistribution of the resulting image is limited due to proprietary licensing terms.
To satisfy Intel's and nVidia's licenses, the resulting image must not only expose their library functionality, but must also include at least one of our workloads.
To satisfy nVidia's license, the following notice shall be included in modifications and derivative works of sample source code distributed: "This software contains source code provided by NVIDIA Corporation."
- When a PR is submitted, a build is executed.
- When a PR is merged, all build artifacts are committed to Oracle artifact storage.
- A build is defined by a baseline runtime environment image retrieved from the Oracle artifact storage.
- A build consists of multiple components.
- Each component is built using `docker buildx`.
- The output of each component build is stored in `/opt/computelify/component_manufacturer/component_name`:
  - Python implementation packaged as a Python wheel with a `.whl` suffix.
  - Binary implementation archived and compressed in the format `manufacturer_component.tar.zstd`.
  - Text file in the format `ld-so-conf-manufacturer_component.conf`, to be renamed as a drop-in file to `/etc/ld.so.conf.d/component_manufacturer/component_name.conf`.
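
The naming convention in the list above can be sketched as a few shell helpers. This is only an illustration: the function names, the `acme`/`widget` placeholders, and the build context argument are assumptions, and the printed `docker buildx build --output type=local,dest=...` line is one plausible way to land build output in the component directory, not necessarily how the project's automation invokes buildx.

```shell
#!/bin/sh
# Hypothetical helpers deriving artifact names from manufacturer ($1)
# and component ($2), following the convention described in the list.
component_output_dir() { printf '/opt/computelify/%s/%s\n' "$1" "$2"; }
component_tarball()    { printf '%s_%s.tar.zstd\n' "$1" "$2"; }
component_ld_conf()    { printf 'ld-so-conf-%s_%s.conf\n' "$1" "$2"; }
component_dropin()     { printf '/etc/ld.so.conf.d/%s/%s.conf\n' "$1" "$2"; }

# Print (do not run) a buildx invocation that exports the build result to
# the component output directory. $3 = build context directory (assumed).
print_build_command() {
    printf 'docker buildx build --output type=local,dest=%s %s\n' \
        "$(component_output_dir "$1" "$2")" "$3"
}

# "acme" and "widget" are placeholder names, not real components.
component_output_dir acme widget
component_tarball acme widget
component_ld_conf acme widget
component_dropin acme widget
print_build_command acme widget ./widget
```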
- Install Docker
- Install CUDA
- Install `faiss` C library and headers
- Install `faiss` Python wheel
- Install `torch` Python wheel
- `git clone https://github.com/artificialwisdomai/origin`
- `sudo mv /home/wise/repos/origin/platform/packaging/build/faiss/target/ld-so-conf-intel.conf /etc/ld.so.conf.d/intel-oneapi.conf`
- `sudo ldconfig`
- `pushd / ; sudo tar -xf /home/wise/repos/origin/platform/packaging/build/faiss/target/intel-mkl-2023.2.0.tar.gz ; popd`
- `sudo mkdir -p /opt/facebook`
- `pushd /opt/facebook ; sudo tar -xf /home/wise/repos/origin/platform/packaging/build/faiss/target/faiss.tar.gz ; popd`
- `sudo mv /home/wise/repos/origin/platform/packaging/build/faiss/target/ld-so-conf-faiss.conf /etc/ld.so.conf.d/facebook-faiss.conf`
- `sudo ldconfig`
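
The manual steps above could be collected into one script. The sketch below is an assumption about how that might look, not the project's automation: the `run`, `install_mkl`, and `install_faiss` names and the `DRY_RUN` and `TARGET` variables are hypothetical. It deviates from the listed commands in two ways: it uses `cp` instead of `mv` so it can be re-run, and `tar -C` instead of `pushd`/`popd`. It defaults to printing the commands rather than executing them.

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"
# Build output directory, taken from the paths used in the steps above.
TARGET="${TARGET:-/home/wise/repos/origin/platform/packaging/build/faiss/target}"

# Print or execute a command depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Install the Intel MKL archive and its ld.so.conf drop-in.
install_mkl() {
    run sudo cp "$TARGET/ld-so-conf-intel.conf" /etc/ld.so.conf.d/intel-oneapi.conf
    run sudo tar -C / -xf "$TARGET/intel-mkl-2023.2.0.tar.gz"
}

# Install the FAISS archive under /opt/facebook and its ld.so.conf drop-in.
install_faiss() {
    run sudo mkdir -p /opt/facebook
    run sudo tar -C /opt/facebook -xf "$TARGET/faiss.tar.gz"
    run sudo cp "$TARGET/ld-so-conf-faiss.conf" /etc/ld.so.conf.d/facebook-faiss.conf
}

install_mkl
install_faiss
# Refresh the dynamic linker cache once, after all libraries are in place.
run sudo ldconfig
```

Running `ldconfig` a single time at the end is a design choice of this sketch: the cache only needs to be rebuilt after every library and drop-in file is installed.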
Please note, you will not see a link to `/opt/intel/...` because the current faiss build automation does not replicate the [build from source instructions](https://github.com/artificialwisdomai/origin/wiki/Build-FAISS-from-source).
Example list of dependencies which are INCORRECT:
wise@wise-a40x1-1:/opt/facebook/lib$ ldd libfaiss_avx2.so
linux-vdso.so.1 (0x00007ffe597f0000)
libcudart.so.12 => not found
libcublas.so.12 => not found
libgomp.so.1 => not found
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f6298c00000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f629d376000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f629d354000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f6298e1f000)
/lib64/ld-linux-x86-64.so.2 (0x00007f629d45b000)
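
A quick way to spot this situation is to filter the `ldd` output for unresolved entries. The sketch below assumes the usual `name => not found` line format shown above; the `unresolved_libs` function name is hypothetical.

```shell
#!/bin/sh
# Read ldd output on stdin and print the name of every library
# that the dynamic linker could not resolve.
unresolved_libs() {
    awk '/=> not found/ { print $1 }'
}

# Demonstration on a shortened copy of the output listed above.
unresolved_libs <<'EOF'
    linux-vdso.so.1 (0x00007ffe597f0000)
    libcudart.so.12 => not found
    libcublas.so.12 => not found
    libgomp.so.1 => not found
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f629d376000)
EOF
```

On a real system this would be used as `ldd /opt/facebook/lib/libfaiss_avx2.so | unresolved_libs`; an empty result means every dependency was found.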
The file `ld-so-conf-facebook-faiss.conf` needs to be created with the contents:
/opt/facebook/lib
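
Creating that file can be sketched as follows. The `write_ld_conf` helper and the scratch directory are assumptions for illustration; the example writes to a temporary location rather than directly under `/etc`.

```shell
#!/bin/sh
# Write an ld.so.conf drop-in with one library directory per line.
# $1 = destination file, remaining arguments = library directories.
write_ld_conf() {
    dest=$1
    shift
    printf '%s\n' "$@" > "$dest"
}

# Generate the file in a scratch directory and show its contents.
workdir=$(mktemp -d)
write_ld_conf "$workdir/ld-so-conf-facebook-faiss.conf" /opt/facebook/lib
cat "$workdir/ld-so-conf-facebook-faiss.conf"
```

Once the file is installed under `/etc/ld.so.conf.d/` and `sudo ldconfig` has been run, `ldconfig -p | grep faiss` should list the FAISS libraries if the linker cache picked them up.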