- About RAGPost
- Key Features
- Technology Stack
- Getting Started
- Usage
- Project Structure
- Contributing
- License
- Acknowledgements
RAGPost is a blog post generator that uses Retrieval-Augmented Generation (RAG) to turn PDF documents into blog content. It extracts the text of an uploaded PDF, retrieves the passages relevant to a topic you specify, and uses a language model to draft a structured blog post from them.
RAGPost is aimed at content creators, researchers, and knowledge workers who want to publish material that currently sits in PDF files.
- 📄 PDF Text Extraction: Efficiently extracts text from PDF files, preserving the original document's structure and content.
- 🧠 Advanced Retrieval System: Utilizes LlamaIndex to create a powerful retrieval system for accessing relevant information.
- ✍️ AI-Powered Blog Generation: Leverages state-of-the-art language models to generate coherent and engaging blog posts.
- 🌐 User-Friendly Web Interface: Built with Flask, providing an intuitive and responsive user experience.
- 🔍 Customizable Content Generation: Allows users to specify topics and tailor the generated content to their needs.
- 📊 Scalable Architecture: Designed to handle multiple users and large PDF documents efficiently.
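The extraction, retrieval, and generation features listed above form a pipeline. The block below is a minimal, standard-library sketch of the retrieval and prompt-assembly steps only, not RAGPost's actual implementation. It swaps the LlamaIndex retriever for a plain bag-of-words cosine similarity, and the function names (`chunk_text`, `retrieve`, `build_prompt`) are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=50):
    """Split extracted PDF text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def _vectorize(text):
    """Lowercased bag-of-words term counts (a stand-in for real embeddings)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(chunks, topic, top_k=2):
    """Return the top_k chunks most similar to the user's topic."""
    query = _vectorize(topic)
    ranked = sorted(chunks, key=lambda c: _cosine(_vectorize(c), query), reverse=True)
    return ranked[:top_k]


def build_prompt(topic, context_chunks):
    """Assemble the text that would be sent to the language model (hypothetical wording)."""
    context = "\n\n".join(context_chunks)
    return (
        f"Write a blog post about: {topic}\n\n"
        f"Use only the following source material:\n\n{context}"
    )
```

In a real deployment the similarity function would typically be replaced by an embedding-based index, and the prompt would be passed to the language model rather than returned as a string.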
RAGPost is built on the following technology stack:
- Python 🐍: The core programming language used for backend development.
- Flask 🌶️: A lightweight WSGI web application framework.
- LlamaIndex 🦙: An advanced data framework for building LLM applications.
- OpenAI API 🤖: Provides access to powerful language models for text generation.
- PyPDF2 📚: A library for reading and manipulating PDF files.
- NLTK 🗣️: Natural Language Toolkit for text processing and analysis.
- SQLAlchemy 🗃️: SQL toolkit and Object-Relational Mapping (ORM) library.
Before you begin, ensure you have the following installed:
- Python 3.8 or higher
- pip (Python package installer)
- Git
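The prerequisites above can be checked from a short script before installing. This is a sketch rather than part of the project: `check_prerequisites` is a hypothetical helper, and it assumes `pip` and `git` are exposed on `PATH` under those exact names (some systems only provide `pip3`).

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the prerequisites


def check_prerequisites(tools=("pip", "git")):
    """Return a list of problems found; an empty list means everything is present."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python 3.8 or higher is required, found {found}")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} was not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    print("\n".join(issues) if issues else "All prerequisites found.")
```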
- Clone the repository:

      git clone https://github.com/yourusername/RAGPost.git
      cd RAGPost

- Set up a virtual environment:

      python -m venv venv
      source venv/bin/activate  # On Windows, use venv\Scripts\activate

- Install the required dependencies:

      pip install -r requirements.txt

- Set up environment variables:

      cp .env.example .env

  Edit the `.env` file and add your OpenAI API key.

- Initialize the database:

      flask db upgrade
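A missing API key is an easy setup mistake to make, so it can help to verify the `.env` file before starting the server. The sketch below assumes the file uses simple `KEY=VALUE` lines and that the key is named `OPENAI_API_KEY` (the name the OpenAI Python SDK reads by default); the project's `.env.example` may use a different name. Both helper functions are hypothetical, and the project itself may rely on a library such as python-dotenv instead.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def require_api_key(env, name="OPENAI_API_KEY"):
    """Return the key from the parsed file or the process environment, else raise."""
    key = env.get(name) or os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return key
```

Calling `require_api_key(load_env_file())` from the repository root raises a `RuntimeError` with a readable message if the key was never filled in.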
- Start the Flask development server:

      flask run

- Open your web browser and navigate to http://localhost:5000.

- Upload a PDF file using the web interface.

- Specify the topic or focus for your blog post.

- Click "Generate" and wait for RAGPost to create your blog content.

- Review, edit, and publish your generated blog post!
RAGPost/
│
├── src/
│   └── RAGPost/
│       ├── components/
│       ├── utils/
│       ├── config/
│       ├── pipeline/
│       ├── entity/
│       └── constants/
│
├── config/
├── notebooks/
├── templates/
├── static/
├── tests/
│
├── .github/
├── .gitignore
├── LICENSE
├── README.md
├── requirements.txt
├── setup.py
└── app.py
For a detailed explanation of each directory and file, please refer to our Project Structure Guide.
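After cloning, a quick way to confirm the checkout matches the layout shown above is to test for the expected paths. This is an illustrative sketch: `missing_paths` is a hypothetical helper, and the path list is copied from the tree in this README, not from any manifest in the repository.

```python
from pathlib import Path

# Paths taken from the project tree shown in this README.
EXPECTED_PATHS = [
    "src/RAGPost/components",
    "src/RAGPost/utils",
    "src/RAGPost/config",
    "src/RAGPost/pipeline",
    "src/RAGPost/entity",
    "src/RAGPost/constants",
    "config",
    "notebooks",
    "templates",
    "static",
    "tests",
    "requirements.txt",
    "setup.py",
    "app.py",
]


def missing_paths(root=".", expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under root."""
    base = Path(root)
    return [rel for rel in expected if not (base / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    print("Missing: " + ", ".join(missing) if missing else "Layout looks complete.")
```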
We welcome contributions to RAGPost! Please see our Contributing Guidelines for more information on how to get started.
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI for their powerful language models
- LlamaIndex for the excellent retrieval framework
- All the open-source libraries that made this project possible
📬 For any questions or feedback, please open an issue or contact the maintainers. We'd love to hear from you!
🌟 If you find RAGPost helpful, please consider giving it a star on GitHub!