
Llama Unreal

An Unreal-focused API wrapper for llama.cpp, so you can embed LLMs locally in your games. Forked from upstream to focus on an improved API with wider build support (CPU, CUDA, Android, Mac).

These are early releases; the API is still fairly unstable, so your mileage may vary.

NB: issue #7 is currently open and may require you to do your own static llama.cpp build until it is resolved.

Discord Server

Install & Setup

  1. Download the latest release. Be sure to use the Llama-Unreal-UEx.x-vx.x.x.7z link, which contains compiled binaries, not the Source Code (zip) link.
  2. Create a new Unreal project or choose an existing one.
  3. Browse to your project folder (project root).
  4. Copy the Plugins folder from the .7z release into your project root (see the layout sketch after this list).
  5. The plugin should now be ready to use. NB: You may need to manually copy ggml.dll and llama.dll into your project's Binaries folder for it to run correctly (a v0.5.0 issue).
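
For orientation, the resulting project layout should look roughly like the sketch below. The plugin folder name under Plugins is illustrative; keep whatever the release archive contains.

MyProject/
├── MyProject.uproject
├── Binaries/
│   └── Win64/        <- ggml.dll / llama.dll may need to be copied here (v0.5.0 issue)
└── Plugins/
    └── LlamaCore/    <- copied from the release .7z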

How to use - Basics

NB: The API is in its early days and unstable.

Everything is wrapped inside a ULlamaComponent, which interfaces internally with FLlama.

  1. Set up your ModelParams of type FLLMModelParams.

  2. Call InsertPromptTemplated (or InsertPrompt if you're doing raw-input style without formatting). NB: ChatML is currently the only template specified for templated input.

  3. You should receive replies via the OnNewTokenGenerated callback.

Explore LlamaComponent.h for the detailed API.
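
To make the flow concrete, here is a minimal sketch of an actor driving the component. This is illustrative only: the FLLMModelParams field name (PathToModel), the OnNewTokenGenerated delegate signature, and the InsertPromptTemplated argument list are assumptions; verify the real declarations in LlamaComponent.h.

// MyNPC.h - minimal sketch, not taken from the plugin itself.
// Assumptions (check LlamaComponent.h): FLLMModelParams has a PathToModel
// member; OnNewTokenGenerated is a dynamic multicast delegate passing the
// token as an FString; InsertPromptTemplated takes the prompt as an FString.

#pragma once

#include "CoreMinimal.h"
#include "GameFramework/Actor.h"
#include "LlamaComponent.h"
#include "MyNPC.generated.h"

UCLASS()
class AMyNPC : public AActor
{
    GENERATED_BODY()

public:
    AMyNPC()
    {
        // Attach the wrapper component that drives llama.cpp internally.
        Llama = CreateDefaultSubobject<ULlamaComponent>(TEXT("Llama"));
    }

protected:
    virtual void BeginPlay() override
    {
        Super::BeginPlay();

        // 1. Configure ModelParams (local GGUF model path is an example).
        Llama->ModelParams.PathToModel = TEXT("Models/model.gguf");

        // 3. Replies stream back token by token through this callback.
        Llama->OnNewTokenGenerated.AddDynamic(this, &AMyNPC::HandleToken);

        // 2. Send a templated prompt (only ChatML templating is specified so far).
        Llama->InsertPromptTemplated(TEXT("Why is the sky blue?"));
    }

    UFUNCTION()
    void HandleToken(FString NewToken)
    {
        UE_LOG(LogTemp, Log, TEXT("LLM token: %s"), *NewToken);
    }

    UPROPERTY(VisibleAnywhere)
    ULlamaComponent* Llama = nullptr;
};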

Llama.cpp Build Instructions

Use the instructions below if you want to make builds for your own use case or replace the llama.cpp backend. Note that these build commands should be run from the cloned llama.cpp root directory, not the plugin root.

The forked plugin's llama.cpp was built from git hash 2f3c1466ff46a2413b0e363a5005c46538186ee6.

Windows build

Build commands for Windows (CPU build only; CUDA ignored here, see the CUDA section below for the GPU variant):

CPU Only

mkdir build
cd build/
cmake ..
cmake --build . --config Release -j --verbose

CUDA

Currently built against the CUDA 12.2 runtime.

  • Use the cuda branch if you want CUDA enabled.
  • We build statically due to a DLL runtime-load bug, so you need to copy cudart.lib, cublas.lib, and cuda.lib into your libraries/win64 path. These files are git-ignored at the moment.
  • Ensure bTryToUseCuda = true; is set in LlamaCore.build.cs to add the CUDA libs to the build.
  • NB, help wanted: ideally this needs a variant built with -DBUILD_SHARED_LIBS=ON.

mkdir build
cd build
cmake .. -DGGML_CUDA=ON
cmake --build . --config Release -j --verbose

Mac build

mkdir build
cd build/
cmake .. -DBUILD_SHARED_LIBS=ON
cmake --build . --config Release -j --verbose

Android build

For the Android build see: https://github.com/ggerganov/llama.cpp/blob/master/docs/android.md#cross-compile-using-android-ndk

mkdir build-android
cd build-android
export NDK=<your_ndk_directory>
cmake -DCMAKE_TOOLCHAIN_FILE=$NDK/build/cmake/android.toolchain.cmake -DANDROID_ABI=arm64-v8a -DANDROID_PLATFORM=android-23 -DCMAKE_C_FLAGS=-march=armv8.4a+dotprod ..
make

Then copy the resulting .so or .lib file into the appropriate platform directory, e.g. ThirdParty/LlamaCpp/Win64/cpu, and copy all the .h files into the Includes directory.
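
Assuming the layout implied above (only Win64/cpu and Includes are confirmed; other platform subfolders will vary), the plugin's third-party tree ends up roughly like:

ThirdParty/LlamaCpp/
├── Includes/        <- llama.cpp headers (.h files)
└── Win64/
    └── cpu/         <- static libs from the Windows CPU build

For CUDA builds, the CUDA section above additionally expects cudart.lib, cublas.lib, and cuda.lib in the win64 libraries path.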
