
OllamaPDFInsight

OllamaPDFInsight is a small RAG (Retrieval-Augmented Generation) system that leverages Ollama to create embeddings from a PDF file, stores these embeddings in a Weaviate vector store, and uses Ollama to answer questions regarding the PDF content.

Steps

  • Use a LangChain document loader to turn the PDF into a set of documents.
  • Create a collection for these documents in the Weaviate vector store.
  • Use Ollama to generate embeddings from the documents and store them in the collection.
  • Query and retrieve information from the PDF using Ollama.
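The steps above boil down to chunk, embed, and retrieve by similarity. The sketch below illustrates that loop in plain Python; the `chunk` and `embed` functions are toy stand-ins (a bag-of-words counter instead of Ollama's embedding model, character windows instead of a LangChain splitter), and the cosine ranking stands in for what the Weaviate collection does:

```python
from collections import Counter
from math import sqrt

def chunk(text, size=40, overlap=10):
    # Split text into overlapping character windows
    # (stand-in for a LangChain text splitter).
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(text):
    # Toy bag-of-words "embedding" -- a placeholder for an
    # Ollama embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=1):
    # Return the k chunks most similar to the query --
    # conceptually what the vector store's search does.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

doc = "Weaviate is a vector database. Ollama runs local language models."
chunks = chunk(doc, size=35, overlap=5)
print(retrieve("local language models", chunks, k=1))
```

In the real project these roles are played by LangChain, Ollama, and Weaviate respectively; the sketch only shows how the pieces fit together.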

Run & Modify

  1. Clone the repository:
git clone https://github.com/yourusername/OllamaPDFInsight.git
cd OllamaPDFInsight
  2. Create a virtual environment:
pyenv install 3.8.10
pyenv virtualenv 3.8.10 ollama-pdf-insight-env
pyenv activate ollama-pdf-insight-env
  3. Once the virtual environment is activated, install the dependencies from the requirements.txt file:
pip install -r requirements.txt
  4. Prepare the data and the Weaviate vector store:
python load_data.py
  5. Prepare the template and LLM, then run the prompt:
python retrieve_context.py
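The final step packs the retrieved chunks into a prompt before the LLM answers. A minimal sketch of that template step (the template wording and the `build_prompt` helper are illustrative, not the repository's actual code):

```python
# An illustrative RAG prompt template -- the repository's real
# template in retrieve_context.py may differ.
TEMPLATE = """Answer the question using only the context below.

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(question, retrieved_chunks):
    # Join the retrieved chunks and fill the template; the
    # resulting string is what gets sent to the Ollama LLM.
    context = "\n---\n".join(retrieved_chunks)
    return TEMPLATE.format(context=context, question=question)

prompt = build_prompt(
    "What does Ollama run?",
    ["Ollama runs local language models.", "Weaviate stores embeddings."],
)
print(prompt)
```

Grounding the model on retrieved context this way is what lets it answer questions about the PDF rather than from its general training data.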

References

I essentially followed the steps in this article and added my own touch to it.
