Docker pipeline for streaming tweets and their sentiment score to a Slack channel

lorenanda/tweets-docker-pipeline

Tweet streaming pipeline for a Slackbot

This project was completed in week 7 of the Data Science Bootcamp at Spiced Academy in Berlin.

Description

The goal of this project is to build a database of tweets that use the hashtag #OnThisDay, along with their sentiment scores, and to post a tweet from it to a Slack channel to inform members about historical events that happened on that day.

The Docker-Compose pipeline includes five steps (containers):

  1. Collect tweets using the Twitter API and tweepy (tweet_collector)
  2. Store the tweets in a MongoDB
  3. Apply ETL job (etl_job)
     • Extract tweets from MongoDB.
     • Clean the text and apply sentiment analysis with VADER/TextBlob.
  4. Load the cleaned tweet texts and their sentiment scores into a Postgres database.
  5. Create a Slackbot that posts a randomly selected tweet from the Postgres database into a Slack channel (slackbot)
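
The transform part of the ETL job and the Slackbot's message selection could look roughly like the sketch below. It is not the code from this repository: the function names (`clean_tweet`, `transform`, `slack_message`), the cleaning rules, and the `text`/`sentiment` field names are assumptions made for illustration. The sentiment scorer is passed in as a plain callable so the sketch runs without VADER or TextBlob installed, and the database reads and writes are left out.

```python
import random
import re

# Hypothetical cleaning rules; the actual etl_job may clean tweets differently.
RETWEET_PREFIX_RE = re.compile(r"^\s*RT\s+@\w+:?\s*")
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text):
    """Drop the retweet prefix, URLs and @mentions, then collapse whitespace."""
    text = RETWEET_PREFIX_RE.sub("", text)
    text = URL_RE.sub("", text)
    text = MENTION_RE.sub("", text)
    return WHITESPACE_RE.sub(" ", text).strip()


def transform(raw_tweets, score):
    """Turn raw tweet documents into rows ready to load into Postgres.

    `raw_tweets` is an iterable of dicts with a "text" key (assumed shape of
    the MongoDB documents). `score` is any callable mapping text to a number,
    for example VADER's compound score.
    """
    rows = []
    for tweet in raw_tweets:
        text = clean_tweet(tweet["text"])
        if text:  # skip tweets that are empty after cleaning
            rows.append({"text": text, "sentiment": score(text)})
    return rows


def slack_message(rows, rng=random):
    """Pick one row at random and format it as a simple Slack text payload."""
    row = rng.choice(rows)
    return {"text": f"{row['text']}\nSentiment score: {row['sentiment']:+.2f}"}


if __name__ == "__main__":
    # Placeholder scorer so the example runs without extra packages.
    def neutral_score(text):
        return 0.0

    raw = [
        {"text": "RT @history: #OnThisDay in 1969  Apollo 11 landed https://t.co/abc"},
        {"text": "https://t.co/only-a-link"},
    ]
    print(slack_message(transform(raw, neutral_score)))
```

Keeping the scorer as a parameter makes it easy to swap between VADER and TextBlob, and to test the cleaning logic on its own.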