Deploy a self-hosted Large Language Model (LLM) using Ollama and Open WebUI.
The deployment currently serves the Llama 3.3 70B model.
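Models are pulled through the Ollama container. A minimal sketch, assuming the container is named `ollama` (the name is an assumption of this example, not something the repo guarantees):

```sh
# Pull Llama 3.3 70B inside the running Ollama container.
# "ollama" as the container name is an assumption of this sketch.
docker exec -it ollama ollama pull llama3.3:70b
```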
The LLM runs on an AWS EC2 instance launched from a pre-configured Amazon Machine Image (AMI) that includes Docker, the NVIDIA driver, and the NVIDIA Container Toolkit. The instance is equipped with a single NVIDIA L40S Tensor Core GPU.
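As a quick sanity check that the AMI's GPU stack is wired up end to end, the driver can be queried from inside a container; the CUDA image tag below is illustrative, not one this repo pins:

```sh
# Confirm the NVIDIA driver and Container Toolkit work together:
# nvidia-smi should list the L40S from inside the container.
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```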
Docker Compose is used to manage two containers (a sketch of the Compose file follows this list):
- Ollama: serves the model via an HTTP API on port 11434.
- Open WebUI: serves the web user interface on port 8080 and calls Ollama to generate responses.
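A minimal `docker-compose.yml` sketch of this setup; the image tags, volume paths, and the `/mnt/ebs` mount point are assumptions of this example, not the repo's exact file:

```yaml
services:
  ollama:
    image: ollama/ollama
    ports:
      - "11434:11434"
    volumes:
      - /mnt/ebs/ollama:/root/.ollama          # persist pulled models on the EBS volume
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]              # expose the L40S to the container

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "8080:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434    # reach Ollama over the Compose network
    volumes:
      - /mnt/ebs/open-webui:/app/backend/data  # persist chat history
    depends_on:
      - ollama
```

Binding both data directories to the EBS-backed mount point keeps models and chat history off the root volume, so they survive instance replacement.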
An EBS volume is attached and mounted on the EC2 instance so that chat history, pulled models, and Docker data persist across instance replacement.
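For illustration, first-boot preparation of such a volume might look like the following; the device name and mount point are assumptions, and the actual setup may be baked into the AMI or deploy script:

```sh
# Format once, then mount the data volume (device name varies by instance type).
sudo mkfs -t ext4 /dev/nvme1n1   # first boot only: this destroys existing data
sudo mkdir -p /mnt/ebs
sudo mount /dev/nvme1n1 /mnt/ebs
```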
To deploy, authenticate with AWS IAM Identity Center and run the deploy script:

```sh
aws sso login
./deploy.sh
```

To tear down the Terraform-managed infrastructure:

```sh
terraform destroy -auto-approve
```