chore: remove llama.cpp submodule
* update docs
Avram Tudor committed Oct 8, 2024
1 parent 6b5a4c7 commit 91e18ba
Showing 4 changed files with 8 additions and 18 deletions.
3 changes: 0 additions & 3 deletions .gitmodules
```diff
@@ -1,3 +0,0 @@
-[submodule "llama.cpp"]
-	path = llama.cpp
-	url = https://github.com/ggerganov/llama.cpp
```
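The workflow that produces a `.gitmodules` deletion like the one above can be sketched as follows. This is an illustrative reconstruction, not the commit's recorded commands; the scratch repositories stand in for skynet and llama.cpp so the snippet is self-contained, and all paths and messages are made up.

```shell
# Sketch of a submodule-removal workflow. Scratch repos below stand in
# for the real skynet and llama.cpp repositories.
set -e
tmp=$(mktemp -d)

# Stand-in for the upstream llama.cpp repository.
git init -q "$tmp/sub"
git -C "$tmp/sub" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "init"

# Stand-in for the main repository, with the submodule still present.
git init -q "$tmp/main"
git -C "$tmp/main" -c protocol.file.allow=always \
    submodule --quiet add "$tmp/sub" llama.cpp
git -C "$tmp/main" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "add llama.cpp submodule"

# The removal steps a commit like this one corresponds to:
git -C "$tmp/main" submodule deinit -f llama.cpp  # unregister from .git/config
git -C "$tmp/main" rm -qf llama.cpp               # drop gitlink + .gitmodules entry
rm -rf "$tmp/main/.git/modules/llama.cpp"         # remove the cached clone
```

`git rm` also rewrites `.gitmodules` and stages the change, which is why the diff above shows the whole entry disappearing in one hunk.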
14 changes: 4 additions & 10 deletions README.md
````diff
@@ -4,7 +4,7 @@ Skynet is an API server for AI services wrapping several apps and models.
 
 It is comprised of specialized modules which can be enabled or disabled as needed.
 
-- **Summary and Action Items** with llama.cpp (enabled by default)
+- **Summary and Action Items** with vllm (or llama.cpp)
 - **Live Transcriptions** with Faster Whisper via websockets
 - 🚧 _More to follow_
 
@@ -16,16 +16,10 @@ It is comprised of specialized modules which can be enabled or disabled as neede
 ## Summaries Quickstart
 
 ```bash
-# Init and update submodules if you haven't already. This will add llama.cpp which provides the OpenAI api server
-git submodule update --init
-
 # Download the preferred GGUF llama model
 mkdir "$HOME/models"
 
 wget -q --show-progress "https://huggingface.co/jitsi/Llama-3.1-8B-GGUF/blob/main/Llama-3.1-8B-Instruct-Q8_0.gguf?download=true" -O "$HOME/models/Llama-3.1-8B-Instruct-Q8_0.gguf"
 
-export OPENAI_API_SERVER_PATH="$HOME/skynet/llama.cpp/llama-server"
+# if VLLM cannot be used, use llama.cpp server with a gguf model, otherwise, simply point LLAMA_PATH to your raw model folder
+export LLAMA_CPP_SERVER_PATH="$HOME/llama.cpp/llama-server"
 export LLAMA_PATH="$HOME/models/Llama-3.1-8B-Instruct-Q8_0.gguf"
 
 # disable authorization (for testing)
 export BYPASS_AUTHORIZATION=1
````
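With the submodule gone, the quickstart reduces to pointing Skynet at an externally built `llama-server` binary and a local GGUF file. A minimal sketch of the resulting environment, assuming llama.cpp was cloned and built separately and the model download from the README already ran (the paths are examples, not requirements):

```shell
# Example environment for the llama.cpp-backed quickstart; adjust the
# paths to wherever llama-server and the downloaded model actually live.
export LLAMA_CPP_SERVER_PATH="$HOME/llama.cpp/llama-server"
export LLAMA_PATH="$HOME/models/Llama-3.1-8B-Instruct-Q8_0.gguf"

# For local testing only: skip JWT checks.
export BYPASS_AUTHORIZATION=1

# Optional sanity check that the binary can serve the model, e.g.:
#   "$LLAMA_CPP_SERVER_PATH" -m "$LLAMA_PATH" --port 8080
```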
8 changes: 4 additions & 4 deletions docs/summaries_module.md
````diff
@@ -1,8 +1,8 @@
 # Skynet Summaries Module
 
-Extracts summaries and action items from a given text. The API wraps the wonderful [ggerganov/llama.cpp](https://github.com/ggerganov/llama.cpp). It is split into two sub-modules: `summaries:dispatcher` and `summaries:executor`.
+Extracts summaries and action items from a given text. The service can be deployed to use either vllm or llama.cpp. It is split into two sub-modules: `summaries:dispatcher` and `summaries:executor`.
 
-`summaries:dispatcher` will push jobs and retrieve job results from a Redis queue while `summaries:executor` performs the actual inference. They can both be enabled at the same time or deployed separately.
+`summaries:dispatcher` will do CRUD for jobs with a Redis installation, while `summaries:executor` performs the actual inference. They can both be enabled at the same time or deployed separately.
 
 > All requests to this service will require a standard HTTP Authorization header with a Bearer JWT. Check the [**Authorization page**](auth.md) for detailed information on how to generate JWTs or disable authorization.
@@ -19,15 +19,15 @@ Extracts summaries and action items from a given text. The API wraps the wonderf
 
 All of the configuration is done via env vars. Check the [Skynet Environment Variables](env_vars.md) page for a list of values.
 
-## Running
+## Running with Llama.cpp
 
 ```bash
 # Download the preferred GGUF llama model
 mkdir "$HOME/models"
 
 wget -q --show-progress "https://huggingface.co/jitsi/Llama-3.1-8B-GGUF/blob/main/Llama-3.1-8B-Instruct-Q8_0.gguf?download=true" -O "$HOME/models/Llama-3.1-8B-Instruct-Q8_0.gguf"
 
-export OPENAI_API_SERVER_PATH="$HOME/skynet/llama.cpp/llama-server"
+export LLAMA_CPP_SERVER_PATH="$HOME/skynet/llama.cpp/llama-server"
 export LLAMA_PATH="$HOME/models/Llama-3.1-8B-Instruct-Q8_0.gguf"
 # disable authorization (for testing)
 export BYPASS_AUTHORIZATION=1
````
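Since the dispatcher and executor can run together or apart, a split deployment might be configured along these lines. The variable names below are placeholders invented for illustration; the authoritative list lives on the env_vars.md page the diff points to.

```shell
# Hypothetical two-machine layout; variable names are placeholders,
# see docs/env_vars.md for the real ones.
# Machine A: accept jobs and talk to Redis only.
export ENABLED_MODULES="summaries:dispatcher"
# Machine B would instead set: ENABLED_MODULES="summaries:executor"

# Both sides point at the same Redis instance.
export REDIS_HOST="redis.internal"
export REDIS_PORT="6379"
```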
1 change: 0 additions & 1 deletion llama.cpp
Submodule llama.cpp deleted from 6026da
