This project monitors Reddit data in real time. It streams data from specified subreddits through Kafka, optionally preprocesses it, and stores it in a PostgreSQL database. Docker and Docker Compose handle deployment and management.
- Project Overview
- Directory Structure
- Prerequisites
- Setup Instructions
- Usage
- Environment Variables
- Makefile Commands
- Contributing
- License
This project consists of several components:
- reddit_stream.py: Streams Reddit data to Kafka.
- preprocess.py: (Optional) Preprocesses the data.
- store_data.py: Consumes Kafka messages and stores them in PostgreSQL.
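The hand-off between these components can be sketched without Kafka or PostgreSQL. The example below is only an illustration, assuming each Reddit post travels as a JSON-encoded message. The field names (`id`, `subreddit`, `title`, `created_utc`), the helper functions, and the in-memory queue standing in for a Kafka topic are all hypothetical and are not taken from the project's scripts.

```python
import json
from collections import deque

# Hypothetical stand-in for a Kafka topic: an in-memory queue of bytes.
topic = deque()


def preprocess(post: dict) -> dict:
    """Optional step (assumed): trim the title and lowercase the subreddit name."""
    cleaned = dict(post)
    cleaned["title"] = post["title"].strip()
    cleaned["subreddit"] = post["subreddit"].lower()
    return cleaned


def encode_post(post: dict) -> bytes:
    """Producer side: serialize a post as UTF-8 JSON before sending it."""
    return json.dumps(post, sort_keys=True).encode("utf-8")


def to_row(message: bytes) -> tuple:
    """Consumer side: decode a message into a tuple for a parameterized INSERT."""
    post = json.loads(message.decode("utf-8"))
    return (post["id"], post["subreddit"], post["title"], post["created_utc"])


# Assumed example post; real Reddit payloads carry many more fields.
raw = {
    "id": "abc123",
    "subreddit": "Python",
    "title": "  Hello world ",
    "created_utc": 1700000000,
}

topic.append(encode_post(preprocess(raw)))  # stream (and optionally preprocess)
row = to_row(topic.popleft())               # consume and prepare for storage
print(row)
```

Keeping the message format as plain JSON means the streaming and storing scripts only have to agree on field names, which is one common way to decouple a producer from a consumer.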
Before you begin, ensure you have the following installed:
- Docker
- Docker Compose
- Clone the repository:
    git clone https://github.com/yourusername/real-time-reddit-monitoring.git
    cd real-time-reddit-monitoring
- Set up environment variables: Create a .env file in the project root directory and add your Reddit and PostgreSQL credentials:
    # Reddit API credentials
    REDDIT_CLIENT_ID=your_reddit_client_id
    REDDIT_CLIENT_SECRET=your_reddit_client_secret
    REDDIT_USERNAME=your_reddit_username
    REDDIT_PASSWORD=your_reddit_password
    REDDIT_USER_AGENT=your_user_agent
    # PostgreSQL credentials
    POSTGRES_USER=user
    POSTGRES_PASSWORD=password
    POSTGRES_DB=reddit_db
    POSTGRES_HOST=db
    POSTGRES_PORT=5432
- Build and run the Docker containers:
    make up
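Before starting the containers, it can help to confirm that the .env file from the setup steps is complete. The following is a minimal sketch, assuming a plain `KEY=value` file with `#` comments. The `parse_env` and `missing_keys` helpers are hypothetical and not part of the project, and the parser deliberately ignores quoting and other features that dedicated dotenv tools support.

```python
# Variable names as listed in this README.
REQUIRED = [
    "REDDIT_CLIENT_ID",
    "REDDIT_CLIENT_SECRET",
    "REDDIT_USERNAME",
    "REDDIT_PASSWORD",
    "REDDIT_USER_AGENT",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_DB",
    "POSTGRES_HOST",
    "POSTGRES_PORT",
]


def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without an '=' sign
            values[key.strip()] = value.strip()
    return values


def missing_keys(values: dict) -> list:
    """Return required variables that are absent or empty, in README order."""
    return [name for name in REQUIRED if not values.get(name)]


# Sample content with one empty value and one variable left out on purpose.
sample = """\
# Reddit API credentials
REDDIT_CLIENT_ID=your_reddit_client_id
REDDIT_CLIENT_SECRET=your_reddit_client_secret
REDDIT_USERNAME=your_reddit_username
REDDIT_PASSWORD=
REDDIT_USER_AGENT=your_user_agent
# PostgreSQL credentials
POSTGRES_USER=user
POSTGRES_PASSWORD=password
POSTGRES_DB=reddit_db
POSTGRES_HOST=db
"""

print(missing_keys(parse_env(sample)))
```

To check a real file, read it first, for example with `parse_env(open(".env").read())`.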
- Start the application:
    make up
- Stop the application:
    make down
- View logs:
    make logs
Ensure the following environment variables are set in your .env file:
- REDDIT_CLIENT_ID: Your Reddit client ID
- REDDIT_CLIENT_SECRET: Your Reddit client secret
- REDDIT_USERNAME: Your Reddit username
- REDDIT_PASSWORD: Your Reddit password
- REDDIT_USER_AGENT: Your Reddit user agent
- POSTGRES_USER: PostgreSQL username
- POSTGRES_PASSWORD: PostgreSQL password
- POSTGRES_DB: PostgreSQL database name
- POSTGRES_HOST: PostgreSQL host (usually the service name in Docker Compose)
- POSTGRES_PORT: PostgreSQL port (default is 5432)
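As an illustration of how the PostgreSQL variables fit together, the sketch below assembles them into a connection URL. The `postgres_dsn` helper is hypothetical and not taken from the project. It assumes the port falls back to 5432 when unset, as noted above, and that the host falls back to `db`, the example value used in this README, which may not match your Compose service name.

```python
import os
from urllib.parse import quote


def postgres_dsn(env=os.environ) -> str:
    """Build a postgresql:// URL from the POSTGRES_* variables.

    User, password, and database name are percent-encoded so that characters
    such as '@' or spaces do not break the URL.
    """
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    database = quote(env["POSTGRES_DB"], safe="")
    host = env.get("POSTGRES_HOST", "db")      # assumed fallback, see lead-in
    port = env.get("POSTGRES_PORT", "5432")    # default port per this README
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


# Example with explicit values instead of the real process environment.
example_env = {
    "POSTGRES_USER": "user",
    "POSTGRES_PASSWORD": "p@ss word",
    "POSTGRES_DB": "reddit_db",
}
print(postgres_dsn(example_env))
```

Passing a dictionary instead of `os.environ` keeps the helper easy to test. Calling `postgres_dsn()` with no argument reads the live environment and raises `KeyError` if a required variable is missing.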
- make up: Build and start the Docker containers.
- make down: Stop and remove the Docker containers.
- make logs: View logs of the running containers.
- make build: Build the Docker images.
Contributions are welcome! Please fork the repository and submit a pull request.