This project was completed in week 7 of the Data Science Bootcamp at Spiced Academy in Berlin.
The goal of this project is to build a database of tweets that use the hashtag #OnThisDay, along with their sentiment scores, and to post a tweet in a Slack channel to inform members about historical events that happened on that day.
The Docker-Compose pipeline includes five steps (containers):
- Collect tweets using the Twitter API and tweepy (`tweet_collector`)
- Store the tweets in a MongoDB
- Apply ETL job (`etl_job`)
  - Extract tweets from MongoDB.
  - Clean the text and apply sentiment analysis with VADER/TextBlob.
  - Load the cleaned tweet texts and their sentiment scores in a Postgres database.
- Create a Slackbot that posts a randomly selected tweet from the Postgres database into a Slack channel (`slackbot`)
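
The transform part of the ETL job (clean the text, then score it) could look roughly like the sketch below. It is not the project's actual code: the function names `clean_tweet` and `transform`, the cleaning rules, and the `"text"` field name on the MongoDB documents are all assumptions made for illustration. The sentiment scorer is passed in as a function, so the sketch runs without VADER or TextBlob installed.

```python
import re
from typing import Callable, Dict, Iterable, List

# Illustrative cleaning rules (assumptions, not taken from the project).
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Remove URLs and @mentions, drop '#' but keep the tag word, collapse whitespace."""
    text = URL_RE.sub("", text)
    text = MENTION_RE.sub("", text)
    text = text.replace("#", "")
    return WHITESPACE_RE.sub(" ", text).strip()


def transform(tweets: Iterable[Dict], score: Callable[[str], float]) -> List[Dict]:
    """Return one row per tweet holding the cleaned text and its sentiment score.

    Each input document is assumed to carry the raw tweet under the key "text".
    """
    rows = []
    for tweet in tweets:
        cleaned = clean_tweet(tweet["text"])
        rows.append({"text": cleaned, "sentiment": score(cleaned)})
    return rows


if __name__ == "__main__":
    sample = [{"text": "#OnThisDay in 1969 @NASA landed on the Moon https://t.co/abc"}]
    # Placeholder scorer; a real run would plug in VADER or TextBlob here.
    print(transform(sample, score=lambda text: 0.0))
```

With VADER, the `score` argument could be something like `lambda text: analyzer.polarity_scores(text)["compound"]`, where `analyzer` is a `SentimentIntensityAnalyzer` instance from the `vaderSentiment` package. The resulting rows are what the load step would then write to Postgres.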