The Customer Satisfaction Project addresses the question of how to maintain continuous competitive monitoring of customer experience in the banking sector. Working from web data, it identifies opportunities for improvement that strengthen competitiveness on customer satisfaction. The project covers the full pipeline: data extraction, transformation, loading, analysis, and visualization, producing actionable insights.
Data comes from Trustpilot and covers companies in the banking sector. We collect:
- General company information (e.g., overall rating, location, phone number).
- Customer reviews (e.g., comment, date, rating, company response).
Data extraction is performed using Python scripts with Beautiful Soup, storing the data in a JSON file.
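For illustration, a minimal extraction sketch is shown below; the URL, CSS selectors, and field names are assumptions, not the project's actual scraper, and Trustpilot's markup may change over time.

```python
# Minimal scraping sketch (illustrative only): the page URL, the
# "article" selector, and the output fields are assumptions, not the
# project's actual configuration.
import json
import requests
from bs4 import BeautifulSoup

URL = "https://www.trustpilot.com/review/example-bank.com"  # hypothetical page

headers = {"User-Agent": "Mozilla/5.0"}  # some sites block default clients
response = requests.get(URL, headers=headers, timeout=30)
soup = BeautifulSoup(response.text, "html.parser")

reviews = []
for card in soup.select("article"):  # assumed container for one review
    reviews.append({
        "comment": card.get_text(strip=True),
        # date, rating, and company response would be parsed from
        # dedicated sub-elements in the real script
    })

with open("reviews.json", "w", encoding="utf-8") as f:
    json.dump(reviews, f, ensure_ascii=False, indent=2)
```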
The extracted data undergoes processing to:
- Convert formats.
- Replace special characters to standardize the data.
This step is done using a Python script.
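A minimal cleaning sketch follows; the field names and the exact source formats are assumptions made for illustration.

```python
# Minimal cleaning sketch: the field names ("date", "comment") and the
# assumption that dates arrive as ISO timestamps are illustrative.
import json
from datetime import datetime

def clean_review(review: dict) -> dict:
    # Convert the scraped timestamp to a plain date string.
    review["date"] = datetime.fromisoformat(review["date"]).strftime("%Y-%m-%d")
    # Replace special characters so downstream tools receive uniform text.
    review["comment"] = (
        review["comment"]
        .replace("\u2019", "'")   # curly apostrophe -> straight quote
        .replace("\u00a0", " ")   # non-breaking space -> regular space
    )
    return review

with open("reviews.json", encoding="utf-8") as f:
    reviews = [clean_review(r) for r in json.load(f)]
```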
The processed data is stored in:
- Elasticsearch: For data quality checks via Kibana dashboards and aggregating customer ratings.
- Postgres Database: Temporary tables from the previous day's scrape are merged into the historical tables through an upsert operation, updating existing comments or adding new ones as needed (see the sketch after this list).
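The merge can be expressed as a single statement using Postgres's ON CONFLICT clause. The sketch below uses psycopg2 with hypothetical table and column names (comments, comments_staging, review_id); the real schema may differ.

```python
# Upsert sketch: merge the staging table from the latest scrape into the
# historical comments table. Table/column names and the connection
# string are hypothetical placeholders.
import psycopg2

UPSERT_SQL = """
    INSERT INTO comments (review_id, company, comment, rating, review_date)
    SELECT review_id, company, comment, rating, review_date
    FROM comments_staging
    ON CONFLICT (review_id) DO UPDATE
    SET comment = EXCLUDED.comment,
        rating  = EXCLUDED.rating;
"""

with psycopg2.connect("dbname=satisfaction user=postgres") as conn:
    with conn.cursor() as cur:
        cur.execute(UPSERT_SQL)  # commits on leaving the connection block
```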
A Python script, integrating Langchain and OpenAI's ChatGPT, analyzes comments to categorize feedback (e.g., customer relations, pricing) and assess sentiment (positive or negative). Comments not yet analyzed are queried, analyzed, added to a feedback table, and marked as "analyzed" in the comments table.
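A minimal sketch of this step, assuming the langchain-openai package, is shown below; the prompt wording, model choice, and category list are illustrative assumptions, not the project's actual configuration.

```python
# Feedback-analysis sketch: classify one comment into a category and a
# sentiment. Prompt, model name, and categories are assumptions.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

PROMPT = (
    "Classify this bank review into one category "
    "(customer relations, pricing, digital services, other) "
    "and one sentiment (positive, negative). "
    "Answer as 'category; sentiment'.\n\nReview: {comment}"
)

def analyze(comment: str) -> tuple[str, str]:
    reply = llm.invoke(PROMPT.format(comment=comment)).content
    category, sentiment = (part.strip() for part in reply.split(";", 1))
    return category, sentiment
```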
An interactive Power BI report connected to the Postgres database provides:
- A main tab for general data.
- A detail tab for company-specific information.
- An analysis tab for feedback insights.
The project is orchestrated with Airflow, running daily, and containerized using Docker.
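A sketch of what the daily DAG might look like follows; the DAG id, task names, and the no-op callables standing in for the pipeline steps are hypothetical placeholders.

```python
# Orchestration sketch: a daily Airflow DAG chaining the pipeline steps.
# Task ids and callables are placeholders, not the project's actual DAG.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

with DAG(
    dag_id="customer_satisfaction",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    scrape = PythonOperator(task_id="scrape_trustpilot", python_callable=lambda: None)
    clean = PythonOperator(task_id="clean_data", python_callable=lambda: None)
    load = PythonOperator(task_id="load_es_and_postgres", python_callable=lambda: None)
    analyze = PythonOperator(task_id="analyze_feedback", python_callable=lambda: None)

    scrape >> clean >> load >> analyze
```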
Navigate to the Docker folder (Docker\docker_Airflow_ES_Kibana) and initiate the setup by running:
```bash
./setup.sh
```
This creates the folders required by Postgres.
Start the project components with Docker Compose:
```bash
docker-compose up
```
This builds the required images and launches the associated containers.
- Connect to the Postgres Database: Use the provided credentials in pgAdmin at http://localhost:5050/.
- Activate the Airflow DAG: Open Airflow at http://localhost:8080/ and enable the DAG so it runs daily.
- Connect to the Kibana dashboard: Open Kibana at http://localhost:5601/ to check the data.
To launch all the containers, Docker must have around 11 GB of memory available.