Self-hosted LLM

Deploy a self-hosted Large Language Model (LLM) using Ollama and Open WebUI.

Currently using the Llama 3.3 70B model.
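
The exact model tag is an assumption here (llama3.3 resolves to the 70B build in the Ollama library); pulling it into the Ollama container would look roughly like:

# Fetch the Llama 3.3 70B weights (tag assumed; the container name
# "ollama" is also an assumption about the Compose service name)
docker exec ollama ollama pull llama3.3:70b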

Architecture

The LLM is deployed on an AWS EC2 instance with a pre-configured Amazon Machine Image (AMI) that includes Docker, the NVIDIA driver, and the NVIDIA Container Toolkit. The EC2 instance is equipped with a single NVIDIA L40S Tensor Core GPU.
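
Before starting the containers, GPU access can be sanity-checked on the host and from inside a container; the CUDA image tag below is illustrative:

# Confirm the host driver sees the L40S GPU
nvidia-smi

# Confirm containers reach the GPU via the NVIDIA Container Toolkit
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi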

Docker Compose is used to manage two containers:

  • Ollama: serves the model via an HTTP API on port 11434.
  • Open WebUI: serves the user interface on port 8080 and calls Ollama's API to generate responses.
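
Once both containers are running, each service can be smoke-tested from the instance. The /api/tags endpoint is part of Ollama's REST API; the ports assume the defaults above:

# List the models Ollama is serving
curl http://localhost:11434/api/tags

# Confirm Open WebUI responds on its exposed port
curl -I http://localhost:8080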

An EBS volume is mounted to the EC2 instance to persist chat history, downloaded models, and other Docker data.
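
The device path and mount point are not spelled out here; a typical first-boot sequence for a fresh volume might look like the sketch below, where /dev/nvme1n1 and /data are assumptions:

# Format the attached EBS volume on first use only (device path assumed)
sudo mkfs -t xfs /dev/nvme1n1

# Mount it where the containers' data is assumed to live
sudo mkdir -p /data
sudo mount /dev/nvme1n1 /data

# Keep the mount across reboots (nofail avoids blocking boot if detached)
echo '/dev/nvme1n1 /data xfs defaults,nofail 0 2' | sudo tee -a /etc/fstab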

Deploy

aws sso login
./deploy.sh
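
The contents of deploy.sh are not shown here; given the Terraform teardown below, it is assumed to wrap the standard Terraform workflow, roughly:

# Assumed shape of deploy.sh: initialize providers, then apply the plan
terraform init
terraform apply --auto-approve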

Teardown

terraform destroy --auto-approve
