Recent advancements in artificial intelligence, particularly in language models and retrieval-augmented generation (RAG), have opened new avenues for building sophisticated applications. This project leverages these technologies to create a question-answering system grounded in the world of the Percy Jackson books. The system uses LangChain for orchestration, a vector database (Chroma) for embeddings and retrieval, and the Ollama API with Llama3 for language modeling. At the end, I've built a demo app with Gradio for interacting with the RAG.
- Create a RAG + LLM system that can answer questions based on the content of the Percy Jackson books.
- Implement and integrate LangChain, a vector DB, and Ollama (using Llama3).
First, we need to install Ollama. Ollama is an open-source project that serves as a powerful, user-friendly platform for running LLMs on your local machine. Download and install the application via the link here. Once Ollama is installed (including the ollama CLI, which is part of the installation steps), run the following commands in the terminal.
ollama pull llama3
ollama pull nomic-embed-text
ollama serve
This set of commands downloads the models we need for our RAG and starts the Ollama server.
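If you want to confirm the server is running before moving on, you can query Ollama's local REST API, which listens on http://localhost:11434 by default. Here is a minimal check (using the requests package) that lists the models the server has pulled:

import requests

# Ollama's REST API listens on port 11434 by default; the /api/tags
# endpoint lists the models that have been pulled locally.
tags = requests.get("http://localhost:11434/api/tags").json()
print([model["name"] for model in tags["models"]])

You should see both llama3 and nomic-embed-text in the output.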
Second, in a separate terminal, we create a conda environment and install the required dependencies.
conda create --name percyjacksonrag python=3.8
conda activate percyjacksonrag
pip install -r requirements.txt
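The repo's requirements.txt pins the exact dependencies; as a rough guide based on the stack described in this post (this list is an assumption, not the repo's actual file), it will contain packages along these lines:

langchain
langchain-community
chromadb
gradio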
We then run populate_db.py. This script downloads the Percy Jackson books and populates our Chroma vector DB with embeddings derived from nomic-embed-text. This will take a while.
python populate_db.py --reset
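For context, here is a minimal sketch of what a script like populate_db.py does under the hood. It is illustrative rather than the repo's actual code: the download step is omitted, and the PDF loader, "data/" folder, chunk sizes, and "chroma" persist directory are all assumptions.

# Illustrative sketch, not the repo's actual populate_db.py. Assumes the
# downloaded books sit as PDFs in a local "data/" folder.
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

# Load the book text from disk.
documents = PyPDFDirectoryLoader("data").load()

# Split into overlapping chunks so each embedding covers a coherent passage.
chunks = RecursiveCharacterTextSplitter(
    chunk_size=800, chunk_overlap=80
).split_documents(documents)

# Embed each chunk with nomic-embed-text (served by the local Ollama server)
# and persist the vectors to a Chroma database on disk.
Chroma.from_documents(
    documents=chunks,
    embedding=OllamaEmbeddings(model="nomic-embed-text"),
    persist_directory="chroma",
)

Chunking with overlap is the standard trade-off here: chunks small enough to embed precisely, with enough overlap that a passage split across a boundary is still retrievable.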
Once this is done, everything is set up to run our RAG and ask our LLM questions about the world of Percy Jackson through a demo app built with Gradio! All you have to do is run:
python app.py
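For context, here is a minimal sketch of what the app does: reopen the persisted Chroma DB, retrieve the passages most similar to the question, and stuff them into a prompt for Llama3. The prompt wording, retrieval depth (k=5), and persist directory are assumptions for illustration, not necessarily what app.py uses.

# Illustrative sketch of the Gradio demo app, not the repo's actual app.py.
import gradio as gr
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.llms import Ollama
from langchain_community.vectorstores import Chroma

# Reopen the persisted vector store with the same embedding model
# that was used to build it.
db = Chroma(
    persist_directory="chroma",
    embedding_function=OllamaEmbeddings(model="nomic-embed-text"),
)
llm = Ollama(model="llama3")

def answer(question):
    # Retrieve the most relevant passages and stuff them into the prompt.
    docs = db.similarity_search(question, k=5)
    context = "\n\n".join(doc.page_content for doc in docs)
    prompt = (
        f"Answer the question using only this context:\n{context}\n\n"
        f"Question: {question}"
    )
    return llm.invoke(prompt)

gr.Interface(fn=answer, inputs="text", outputs="text").launch()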
Below is a demo video!