feat: initial support for GPU on linux (TRACKING, replaced by separate PRs) #2180

Status: Open. Wants to merge 1 commit into main.
1 change: 1 addition & 0 deletions packages/backend/src/managers/GPUManager.ts
@@ -53,6 +53,7 @@ export class GPUManager extends Publisher<IGPUInfo[]> implements Disposable {
       case 'Intel Corporation':
         return GPUVendor.INTEL;
       case 'NVIDIA':
+      case 'NVIDIA Corporation':
         return GPUVendor.NVIDIA;
       case 'Apple':
         return GPUVendor.APPLE;
40 changes: 36 additions & 4 deletions packages/backend/src/workers/provider/LlamaCppPython.ts
@@ -134,6 +134,24 @@ export class LlamaCppPython extends InferenceProvider {
           PathInContainer: '/dev/dri',
           CgroupPermissions: '',
         });
         break;
+      case VMType.UNKNOWN:
+        // Only supports NVIDIA
+        if (gpu.vendor !== GPUVendor.NVIDIA) break;
+
+        supported = true;
+        devices.push({
+          PathOnHost: 'nvidia.com/gpu=all',
axel7083 (Contributor) commented on Dec 2, 2024:
Does this need the NVIDIA Container Device Interface (CDI) installed?

This is a bit problematic, as today we cannot detect the device installed on the podman machine without some hacky stuff.

mhdawson (Contributor, Author) replied:
@axel7083 do you mean detection on the local machine instead of the podman machine? It looks to me like on Linux there is no podman machine, since podman runs natively on the local machine.

axel7083 (Contributor) replied:
Yes, but you still need to check that CDI is installed on the system?

mhdawson (Contributor, Author) commented on Dec 4, 2024:
We might be able to do that by looking for the files that you need to generate for CDI. From

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/cdi-support.html#procedure

on my Fedora system that is /etc/cdi/nvidia.yaml.

Since the docs say the location depends on the container engine you use, we could possibly just check /etc/cdi/nvidia.yaml on the assumption that it is the one used for podman. We might add additional places to check later if needed.

If a check for the existence of that file would be enough, I can look at adding it.
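A minimal sketch of what such a file-existence check could look like (a hypothetical helper, not part of this PR; /etc/cdi/nvidia.yaml is the path from the NVIDIA docs above, while /var/run/cdi is the other standard CDI spec directory and is included here as an assumption):

import { promises as fs } from 'node:fs';

// Candidate CDI spec locations. /etc/cdi/nvidia.yaml is the path the NVIDIA
// docs describe (and what is seen on Fedora above); /var/run/cdi/nvidia.yaml
// is the other default CDI spec directory (an assumption, not from this thread).
const CDI_SPEC_PATHS = ['/etc/cdi/nvidia.yaml', '/var/run/cdi/nvidia.yaml'];

async function hasNvidiaCDI(): Promise<boolean> {
  for (const specPath of CDI_SPEC_PATHS) {
    try {
      await fs.access(specPath); // throws if the file is missing
      return true;
    } catch {
      // not present at this location, try the next one
    }
  }
  return false;
}

If a spec file is present, the nvidia.com/gpu=all device string used by this PR should be resolvable through CDI; the same string can also be passed to the podman CLI via --device nvidia.com/gpu=all, which makes for an easy manual check.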

+          PathInContainer: '',
+          CgroupPermissions: '',
+        });
+
+        user = '0';
+
+        entrypoint = '/usr/bin/sh';
+
+        cmd = ['-c', 'chmod 755 ./run.sh && ./run.sh'];
+
+        break;
     }

@@ -197,9 +215,21 @@ export class LlamaCppPython extends InferenceProvider {
     if (this.configurationRegistry.getExtensionConfiguration().experimentalGPU) {
       const gpus: IGPUInfo[] = await this.gpuManager.collectGPUs();
       if (gpus.length === 0) throw new Error('no gpu was found.');
-      if (gpus.length > 1)
-        console.warn(`found ${gpus.length} gpus: using multiple GPUs is not supported. Using ${gpus[0].model}.`);
-      gpu = gpus[0];
+      let selectedGPU = 0;
+      if (gpus.length > 1) {
+        // Look for a GPU that is of a known type, use the first one found.
+        // Fall back to the first one if no GPUs are of known type.
+        for (let i = 0; i < gpus.length; i++) {
+          if (gpus[i].vendor !== GPUVendor.UNKNOWN) {
+            selectedGPU = i;
+            break;
+          }
+        }
+        console.warn(
+          `found ${gpus.length} gpus: using multiple GPUs is not supported. Using ${gpus[selectedGPU].model}.`,
+        );
+      }
+      gpu = gpus[selectedGPU];
     }

     let connection: ContainerProviderConnection | undefined = undefined;
@@ -224,7 +254,7 @@
     const containerCreateOptions: ContainerCreateOptions = await this.getContainerCreateOptions(
       config,
       imageInfo,
-      connection.vmType as VMType,
+      vmType,
       gpu,
     );

@@ -254,6 +284,8 @@ export class LlamaCppPython extends InferenceProvider {
       case VMType.LIBKRUN_LABEL:
         return gpu ? llamacpp.vulkan : llamacpp.default;
       // no GPU support
+      case VMType.UNKNOWN:
+        return gpu?.vendor === GPUVendor.NVIDIA ? llamacpp.cuda : llamacpp.default;
       default:
         return llamacpp.default;
     }