Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'main' into ol-fix-caddy
Showing 7 changed files with 280 additions and 3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,127 @@ | ||
# GPU Instance Creation and Configuration | ||
|
||
This section outlines the process of starting a GPU instance on the Aleph Network and configuring the GPU on it. | ||
|
||
The [aleph-client](https://github.com/aleph-im/aleph-client/) command-line tool is required.<br> | ||
See [CLI Reference](../../tools/aleph-client/usage.md) or use `--help` for a quick overview of a specific command. | ||
|
||
## Setup | ||
|
||
### Create an instance with GPU | ||
|
||
The CLI provides a streamlined command to create a GPU instance. You will be prompted to choose a specific GPU available on a compatible CRN (Compute Resource Node), where your instance will be deployed. Alternatively, you can create your GPU instance on [Twentysix Cloud](https://console.twentysix.cloud/). | ||
|
||
```shell | ||
aleph instance gpu | ||
``` | ||
|
||
 | ||
|
||
Your VM is now ready to use. | ||
|
||
### Retrieve VM Logs | ||
|
||
Monitor your VM's activity: | ||
|
||
```shell | ||
aleph instance logs <vm-hash> | ||
``` | ||
|
||
### Access Your VM via SSH | ||
|
||
#### 1. **Find the Instance Details** | ||
|
||
- **Via CLI**: | ||
|
||
```shell | ||
aleph instance list | ||
``` | ||
|
||
- **Via API**: Access the compute node's API at `https://<node-url>/about/executions/list`. | ||
|
||
#### 2. **Connect via SSH** | ||
|
||
Use the retrieved IP address to SSH into your VM: | ||
|
||
```shell | ||
ssh <user>@<ip> [-i <path-to-ssh-key>] | ||
``` | ||
|
||
- **Default Users**: | ||
- Debian: `root` | ||
- Ubuntu: `ubuntu` | ||
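As a minimal sketch of the connection step, the helpers below map a distribution name to the default user listed above and print the matching `ssh` command. The names `default_user` and `ssh_command` are illustrative and not part of aleph-client, and the address in the usage note is a documentation placeholder.

```shell
#!/bin/sh
# Hypothetical helpers (not part of aleph-client).

# default_user <distro>: prints the default login user for the distribution.
default_user() {
    case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
        debian) echo "root" ;;
        ubuntu) echo "ubuntu" ;;
        *) echo "unknown distribution: $1" >&2; return 1 ;;
    esac
}

# ssh_command <distro> <ip> [path-to-ssh-key]: prints the ssh invocation to run.
ssh_command() {
    user=$(default_user "$1") || return 1
    if [ -n "${3:-}" ]; then
        echo "ssh ${user}@$2 -i $3"
    else
        echo "ssh ${user}@$2"
    fi
}
```

For example, `ssh_command ubuntu 2001:db8::1` prints `ssh ubuntu@2001:db8::1`, which you can then run yourself.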
|
||
## Using the GPU inside your VM | ||
|
||
### Install NVIDIA Linux Drivers | ||
|
||
#### 1. **Installation** | ||
|
||
##### **Debian** | ||
|
||
```shell | ||
echo "deb http://deb.debian.org/debian/ bookworm main contrib non-free non-free-firmware" >> /etc/apt/sources.list | ||
apt update -y && apt upgrade -y && apt autoremove -y | ||
``` | ||
|
||
> ℹ️ If prompted, press Enter to select `Keep the local version currently installed`. | ||
|
||
Exit your VM, then reboot it with: | ||
|
||
```shell | ||
aleph instance reboot <vm-hash> | ||
``` | ||
|
||
Re-connect to your VM via SSH and run the following: | ||
|
||
```shell | ||
apt install linux-headers-$(uname -r) nvidia-driver software-properties-common -y | ||
``` | ||
|
||
##### **Ubuntu** | ||
|
||
```shell | ||
apt update -y && apt upgrade -y && apt autoremove -y | ||
apt install ubuntu-drivers-common --fix-missing -y | ||
ubuntu-drivers --gpgpu install | ||
``` | ||
|
||
> ℹ️ If prompted, just press Enter. | ||
|
||
#### 2. **Verify the Installation** | ||
|
||
To check that the installation succeeded, run: | ||
|
||
```shell | ||
nvidia-smi | ||
``` | ||
|
||
Alternatively, you can try to run [ollama](https://ollama.com/) using the GPU: | ||
|
||
```shell | ||
curl -fsSL https://ollama.com/install.sh | sh | ||
ollama run deepseek-r1:1.5b "Why use a decentralized cloud?" --verbose | ||
``` | ||
|
||
### Install NVIDIA CUDA Toolkit | ||
|
||
Ensure you already have the NVIDIA drivers installed. | ||
|
||
#### 1. **Installation** | ||
|
||
The following commands are usable for both Debian and Ubuntu. | ||
|
||
```shell | ||
distrib=$(echo "$(lsb_release -si | tr '[:upper:]' '[:lower:]')$(lsb_release -sr | tr -d '.')") | ||
wget https://developer.download.nvidia.com/compute/cuda/repos/$distrib/x86_64/cuda-keyring_1.1-1_all.deb && dpkg -i cuda-keyring_1.1-1_all.deb && rm cuda-keyring_1.1-1_all.deb | ||
apt update -y && apt install cuda-toolkit -y | ||
echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc && source ~/.bashrc | ||
``` | ||
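To make the `distrib` one-liner easier to follow, here is a small sketch that builds the same repository path from a distributor ID and a release number. `cuda_repo_url` is an illustrative name, not an NVIDIA or Aleph tool; on the VM the two arguments would come from `lsb_release -si` and `lsb_release -sr`.

```shell
#!/bin/sh
# cuda_repo_url <distributor-id> <release>
# Lowercases the ID, strips the dot from the release number, and appends the
# result to the repository base URL used in the installation commands.
cuda_repo_url() {
    id=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    release=$(printf '%s' "$2" | tr -d '.')
    echo "https://developer.download.nvidia.com/compute/cuda/repos/${id}${release}/x86_64"
}
```

With these inputs, `cuda_repo_url Ubuntu 22.04` ends in `ubuntu2204/x86_64` and `cuda_repo_url Debian 12` ends in `debian12/x86_64`. Whether a repository exists for a given distribution and release still has to be checked on NVIDIA's download server.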
|
||
#### 2. **Verify the Installation** | ||
|
||
To check that the installation succeeded, run: | ||
|
||
```shell | ||
nvcc --version | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,125 @@ | ||
# Enabling GPU support | ||
|
||
This guide outlines how to enable GPU support on your CRN so that users can deploy instances with GPUs. | ||
|
||
Enabling GPU support for virtualization requires changing which kernel module (driver) handles the GPU card, because the special `vfio` module is needed for the card to work with QEMU. Note that this makes the GPU unavailable for other purposes. | ||
|
||
To activate this feature, you must also enable PAYG support and IPv6 on your CRN, as PAYG is the required payment method for GPU instances. Follow the steps at [Enable PAYG](enable-payg.md). | ||
|
||
## Devices compatible with the feature | ||
|
||
At the time of writing, the GPUs listed below are compatible with the feature. More will be added as they are tested and validated. | ||
|
||
### Standard GPUs | ||
|
||
| GPU Model | vCPU | RAM | vRAM | Approx. price | | ||
|--------------|------|-------|-------|-----------------| | ||
| L40S | 12 | 72 GB | 48 GB | 3.33 ALEPH/hour | | ||
| RTX 5090 | 8 | 48 GB | 32 GB | 2.24 ALEPH/hour | | ||
| RTX 4090 | 6 | 36 GB | 24 GB | 1.68 ALEPH/hour | | ||
| RTX 3090 | 4 | 24 GB | 24 GB | 1.12 ALEPH/hour | | ||
| RTX 4000 ADA | 3 | 18 GB | 20 GB | 0.84 ALEPH/hour | | ||
|
||
Each GPU must be connected via a PCIe 4.0 x16 link. | ||
|
||
### Premium GPUs | ||
|
||
|
||
Datacenter-grade GPUs. | ||
|
||
| GPU Model | vCPU | RAM | vRAM | Approx. price | | ||
|-----------|------|--------|-------|------------------| | ||
| H100 | 24 | 144 GB | 80 GB | 13.33 ALEPH/hour | | ||
| A100 | 16 | 96 GB | 80 GB | 8.89 ALEPH/hour | | ||
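Since prices are listed per hour, a short worked calculation can help when estimating what a longer run costs. The sketch below multiplies an hourly price by a number of hours; `cost_for_hours` is an illustrative helper, and the prices in the tables are approximate, so treat the result as an estimate.

```shell
#!/bin/sh
# cost_for_hours <price-per-hour> <hours>
# Multiplies an hourly price (in ALEPH) by a duration in hours and prints the
# result rounded to two decimals.
cost_for_hours() {
    awk -v price="$1" -v hours="$2" 'BEGIN { printf "%.2f\n", price * hours }'
}
```

For example, `cost_for_hours 1.68 24` prints `40.32`, the approximate cost in ALEPH of running an RTX 4090 instance for one day at the listed rate.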
|
||
## Switch Kernel modules for GPU | ||
It is possible to enable multiple GPUs on one CRN. | ||
|
||
||
Execute these commands as root; use `sudo` on Ubuntu systems. | ||
|
||
1. Edit the initramfs modules file to attach the GPU to vfio: | ||
open it with `vi /etc/initramfs-tools/modules` and set the content to: | ||
``` | ||
attach : vfio vfio_iommu_type1 vfio_virqfd vfio_pci ids=10de:27b0,10de:22bc | ||
``` | ||
|
||
Replace the IDs `10de:27b0,10de:22bc` with the IDs of your card. There should be two: the video device and the GPU audio device. | ||
You can get them by running `lspci -nvv` as root. | ||
|
||
2. Edit `/etc/modules` so that the GPU is attached to vfio: | ||
run `vi /etc/modules` and add: | ||
``` | ||
attach: vfio vfio_iommu_type1 vfio_pci ids=10de:27b0,10de:22bc | ||
``` | ||
|
||
3. Configure the NVIDIA drivers to load after vfio: | ||
`vi /etc/modprobe.d/nvidia.conf` | ||
|
||
||
and set the content to: | ||
``` | ||
softdep nouveau pre: vfio-pci | ||
softdep nvidia pre: vfio-pci | ||
softdep nvidia* pre: vfio-pci | ||
``` | ||
|
||
4. Get the known alias of the PCI device: | ||
`cat /sys/bus/pci/devices/0000:01:00.0/modalias` | ||
Replace the PCI address `0000:01:00.0` with the one for your GPU; you can get it using `lspci`. | ||
|
||
5. Configure the vfio module to use that device: | ||
`vi /etc/modprobe.d/vfio.conf` | ||
|
||
``` | ||
blacklist nouveau | ||
blacklist snd_hda_intel | ||
alias pci:v000010DEd000027B0sv000010DEsd000016FAbc03sc00i00 vfio-pci | ||
options vfio-pci ids=10de:27b0,10de:22bc | ||
``` | ||
|
||
6. Load the `vfio-pci` module: | ||
``` | ||
modprobe vfio-pci | ||
``` | ||
|
||
7. Update the initramfs image with your changes: | ||
```shell | ||
update-initramfs -u -k all | ||
``` | ||
|
||
8. Finally, reboot the server: | ||
```shell | ||
reboot | ||
``` | ||
|
||
9. Confirm that the `vfio-pci` kernel module is used for the card: | ||
```shell | ||
lspci -k | ||
``` | ||
|
||
The output should look similar to: | ||
``` | ||
[...] | ||
01:00.0 VGA compatible controller: NVIDIA Corporation AD104GL [RTX 4000 SFF Ada Generation] (rev a1) | ||
Subsystem: NVIDIA Corporation AD104GL [RTX 4000 SFF Ada Generation] | ||
Kernel driver in use: vfio-pci | ||
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia | ||
01:00.1 Audio device: NVIDIA Corporation Device 22bc (rev a1) | ||
Subsystem: NVIDIA Corporation Device 16fa | ||
Kernel driver in use: vfio-pci | ||
Kernel modules: snd_hda_intel | ||
[...] | ||
``` | ||
|
||
10. Confirm that the GPUs are listed and supported on the CRN index page. | ||
No additional modification is needed in the aleph-vm configuration, apart from [enabling PAYG](./enable-payg.md). | ||
|
||
||
Start your aleph-vm supervisor and open the index page; it should list your GPU: | ||
``` | ||
GPUs | ||
• NVIDIA | AD104GL [RTX 4000 SFF Ada Generation] is compatible ✅ | ||
``` | ||
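The ID lookup and the `vfio.conf` step can be sketched as two small shell functions. This is only an illustration under stated assumptions: `nvidia_pci_ids` and `render_vfio_conf` are hypothetical names, the sketch reads `lspci -nn` output (which prints numeric IDs in square brackets) rather than the `lspci -nvv` output mentioned in step 1, and nothing is written under `/etc` so that you can review the result first.

```shell
#!/bin/sh
# Hypothetical helpers illustrating steps 1, 4 and 5.

# nvidia_pci_ids: reads `lspci -nn` output on stdin and prints the
# vendor:device IDs of all NVIDIA functions as a comma-separated list.
nvidia_pci_ids() {
    grep -i 'nvidia' \
        | grep -oE '\[[0-9a-fA-F]{4}:[0-9a-fA-F]{4}\]' \
        | tr -d '[]' \
        | awk '!seen[$0]++' \
        | paste -sd, -
}

# render_vfio_conf <ids> <modalias>: prints a vfio.conf body laid out like
# the one in step 5, using the given IDs and device alias.
render_vfio_conf() {
    printf 'blacklist nouveau\n'
    printf 'blacklist snd_hda_intel\n'
    printf 'alias %s vfio-pci\n' "$2"
    printf 'options vfio-pci ids=%s\n' "$1"
}

# Example (on the CRN, as root; replace the PCI address with your GPU's):
#   ids=$(lspci -nn | nvidia_pci_ids)
#   render_vfio_conf "$ids" "$(cat /sys/bus/pci/devices/0000:01:00.0/modalias)"
```

Printing the file body instead of writing it keeps the sketch side-effect free; once the output looks right, it can be pasted into `/etc/modprobe.d/vfio.conf`.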
|
||
|
||
## Disabling support | ||
To disable GPU support, revert the modifications made to attach the `vfio` kernel module to the GPU.
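After a reboot, one way to check which driver is bound to the card is to parse the output of `lspci -k`. The sketch below assumes the usual `lspci -k` layout, where each device line starts with its PCI address and is followed by a `Kernel driver in use:` line; `driver_in_use` is an illustrative name, not an existing tool.

```shell
#!/bin/sh
# driver_in_use <pci-address>
# Reads `lspci -k` output on stdin and prints the driver bound to the given
# device (for example 01:00.0). Prints nothing if no driver is bound.
driver_in_use() {
    awk -v addr="$1" '
        # A line starting with a PCI address opens a new device block.
        /^[0-9a-f:]+\.[0-9a-f] / { current = ($1 == addr) }
        current && /Kernel driver in use:/ { print $NF }
    '
}

# Example (replace the address with the one for your GPU):
#   lspci -k | driver_in_use 01:00.0
```

While GPU support is enabled the function should report `vfio-pci` for the card; after the changes are reverted and the server is rebooted, it should report a different driver or nothing.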
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
## Enable PAYG (Pay-As-You-Go) | ||
|
||
Pay-As-You-Go allows users to pay for resources as they use them on the Aleph.im network, eliminating the need to hold or | ||
stake large amounts of $ALEPH. | ||
|
||
The feature is currently available on BASE and the Avalanche C-Chain. | ||
|
||
This guide outlines how to enable PAYG on a CRN. | ||
|
||
#### Configure the stream reward address | ||
|
||
1. Create an Avalanche (AVAX) or BASE wallet. | ||
2. Open the information of your CRN on the [aleph.im account page](https://account.aleph.im/) and enter the address in | ||
the section named STREAM REWARD ADDRESS. | ||
|
||
3. Add the reward address inside the CRN configuration `/etc/aleph-vm/supervisor.env` in the form of: | ||
``` | ||
ALEPH_VM_PAYMENT_RECEIVER_ADDRESS="0x0000000000000000000000000000000000000000" | ||
``` | ||
Where `0x0000000000000000000000000000000000000000` is the address of your wallet. | ||
4. Restart the node with `systemctl restart aleph-vm-supervisor.service` | ||
5. Confirm that the address appears at the `/status/config` path of the CRN's URL.
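
Before editing `/etc/aleph-vm/supervisor.env`, it can help to check that the wallet address is well formed. The following is a minimal sketch, assuming the address is a standard `0x`-prefixed 40-hex-digit address as used on the Avalanche C-Chain and BASE; `is_evm_address` and `payment_env_line` are illustrative helpers, and no checksum validation is attempted.

```shell
#!/bin/sh
# is_evm_address <address>
# Succeeds if the argument is "0x" followed by exactly 40 hexadecimal digits.
is_evm_address() {
    printf '%s\n' "$1" | grep -Eq '^0x[0-9a-fA-F]{40}$'
}

# payment_env_line <address>
# Prints the supervisor.env line from step 3, or fails on a malformed address.
payment_env_line() {
    if ! is_evm_address "$1"; then
        echo "not a valid 0x address: $1" >&2
        return 1
    fi
    printf 'ALEPH_VM_PAYMENT_RECEIVER_ADDRESS="%s"\n' "$1"
}

# Example (review the printed line, then add it to supervisor.env yourself):
#   payment_env_line 0x0000000000000000000000000000000000000000
# After restarting the supervisor, the configuration can be inspected with:
#   curl -s https://<node-url>/status/config
```

The helper only prints the line rather than modifying the configuration file, so an existing `supervisor.env` is never overwritten by accident.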