# Website Intent Analyzer

A web application that analyzes website content to understand visitor intent through AI-powered question generation. The tool takes a URL as input, scrapes the website's content, and generates relevant multiple-choice questions to classify visitor interests.
## Features

- 🔍 Real-time website content analysis
- 🤖 AI-powered question generation using Groq API
- 💾 Content caching with Redis
- 🗄️ Persistent storage with PostgreSQL
- ⚡ Fast and responsive React frontend
## Tech Stack

### Frontend

- React
- Redux for state management
- Tailwind CSS for styling
- Lucide React for icons
- Shadcn UI components
### Backend

- Python 3.x
- Flask web framework
- BeautifulSoup4 for web scraping
- Groq API for AI-powered analysis
- Redis for content caching
- PostgreSQL for data persistence
- SQLAlchemy ORM
- Flask-Migrate for database migrations
## Prerequisites

- Python 3.x
- Node.js and npm
- Redis server
- PostgreSQL database
- Groq API key
## Installation

- Clone the repository:

  ```bash
  git clone https://github.com/samiamjidkhan/website-intent-analyzer.git
  cd website-intent-analyzer
  ```
- Set up the backend environment:

  ```bash
  # Create and activate virtual environment
  python -m venv venv
  source venv/bin/activate  # On Windows: .\venv\Scripts\activate

  # Install dependencies
  pip install -r requirements.txt

  # Create .env file
  cp .env.example .env
  ```
- Set up the frontend environment:

  ```bash
  cd frontend
  npm install
  ```
- Configure environment variables in `.env`:

  ```env
  DATABASE_URL=postgresql://username:password@localhost:5432/dbname
  GROQ_API_KEY=your_groq_api_key
  REDIS_URL=redis://localhost:6379
  PORT=5001
  ```
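The backend reads these values at startup; a minimal sketch using only the standard library (the variable names match the `.env` above, while the fallback defaults here are illustrative assumptions, not the repository's actual code):

```python
import os

# Read configuration from the environment, falling back to local defaults.
# Variable names mirror the .env file; the fallback values are assumptions.
DATABASE_URL = os.environ.get("DATABASE_URL", "postgresql://localhost:5432/dbname")
REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379")
GROQ_API_KEY = os.environ.get("GROQ_API_KEY")  # no safe default for a secret
PORT = int(os.environ.get("PORT", "5001"))
```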
## Running the Application

- Start the Redis server:

  ```bash
  redis-server
  ```
- Start the backend server:

  ```bash
  python app.py  # Or: python3 app.py
  ```
- Start the frontend development server:

  ```bash
  cd frontend
  npm run dev
  ```
## Project Structure

```
user-classifier/
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   │   └── URLAnalyzer.jsx
│   │   ├── store/
│   │   │   ├── urlAnalyzerSlice.js
│   │   │   └── store.js
│   │   └── App.jsx
│   └── package.json
├── backend/
│   ├── services/
│   │   ├── intent_analyzer.py
│   │   └── scraper.py
│   ├── models/
│   │   └── website.py
│   ├── extensions.py
│   └── app.py
├── requirements.txt
└── README.md
```
## How It Works

1. **URL Submission**: The user submits a website URL through the frontend interface.

2. **Content Scraping**: The backend scrapes the website content using BeautifulSoup4:
   - Checks the Redis cache for previously scraped content
   - If not cached, scrapes the website and stores the content in Redis for 1 hour

3. **Intent Analysis**:
   - The scraped content is processed by the Groq API
   - A contextual multiple-choice question is generated
   - Four relevant options are created based on the content

4. **Result Storage**:
   - Analysis results are stored in PostgreSQL
   - Results are cached for 7 days to prevent redundant processing

5. **Response Display**:
   - The question and options are displayed to the user
   - The interface automatically resets after selection
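The 1-hour scrape cache described above can be sketched with a standard-library stand-in. The real backend uses Redis and BeautifulSoup4; the in-memory dictionary, the `cache_key`/`get_content` names, and the injected `fetch_page` callable here are illustrative assumptions:

```python
import time
import hashlib

SCRAPE_TTL_SECONDS = 3600  # content is cached for 1 hour, matching the Redis setup
_cache: dict[str, tuple[float, str]] = {}  # stand-in for Redis: key -> (expiry, content)

def cache_key(url: str) -> str:
    """Derive a stable cache key from the URL."""
    return "scrape:" + hashlib.sha256(url.encode()).hexdigest()

def get_content(url: str, fetch_page) -> str:
    """Return cached page text, scraping via fetch_page() on a cache miss."""
    key = cache_key(url)
    entry = _cache.get(key)
    if entry and entry[0] > time.time():
        return entry[1]  # cache hit: reuse previously scraped content
    content = fetch_page(url)  # in the real app: HTTP fetch + BeautifulSoup4 parsing
    _cache[key] = (time.time() + SCRAPE_TTL_SECONDS, content)
    return content
```

With Redis itself, the same behavior falls out of storing the scraped text with a one-hour expiry instead of using the dictionary.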
## API Endpoints

Analyzes a website and generates an intent classification question.

Request body:

```json
{
  "url": "https://example.com"
}
```
Example response:

```json
{
  "url": "https://example.com",
  "question": "Which product category are you interested in?",
  "options": ["Smartphones", "Laptops", "Smart Home", "Wearables"]
}
```
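On the backend, the model's reply has to be turned into this response shape; a minimal sketch of that validation step using only the standard library (the `parse_analysis` name and the strictness of the checks are assumptions, not the repository's actual code):

```python
import json

def parse_analysis(url: str, model_output: str) -> dict:
    """Validate the model's JSON reply and build the documented response shape."""
    data = json.loads(model_output)
    question = data["question"]
    options = data["options"]
    if not isinstance(question, str) or not question:
        raise ValueError("model reply is missing a question")
    if not isinstance(options, list) or len(options) != 4:
        raise ValueError("expected exactly 4 options")  # per the analysis step above
    return {"url": url, "question": question, "options": [str(o) for o in options]}
```

Rejecting malformed replies early keeps bad model output out of PostgreSQL and the 7-day result cache.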
Health check endpoint for monitoring service status.
## Contributing

Many improvements can be made to the scraping and question-generation logic. Feel free to jump in!