Prepared and presented by: Leah Pope
Presentation: here
Blog: On Medium
The goal of this project is to build a Natural Language Processing model to analyze sentiment about Apple and Google products. I'll classify a Tweet as negative, positive or neutral based on its content.
The Stakeholders for my project are marketing professionals in either company who are interested in learning consumer sentiment. It appears these tweets were gathered during the South by Southwest (SXSW) film, culture, music, and technology conference, so this consumer sentiment would also be of interest to the conference's marketing professionals and vendor organizers.
The dataset for the project comes from CrowdFlower via data.world. Human raters rated the sentiment in over 9,000 Tweets as positive, negative, or neither.
- Best Binary Classifier: RandomForest with RandomOverSampling, Weighted F1 Score of 0.87
None of the binary classifiers did a good job classifying Negative tweets, even with RandomOverSampling. RandomForest and RandomForest with RandomOverSampling had the highest weighted F1 scores of all the models I trained. I'm calling RandomForest with RandomOverSampling the winner, as it has a slightly better True Positive rate for correctly identifying Negative Tweets (0.37 vs. 0.30). This is still crummy.
Here's the breakdown of all Binary Classifier Models and scores:
- RandomForest - 0.87
- RandomForest ROS - 0.87
- Multinomial NB - 0.82
- Multinomial NB ROS - 0.85
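As a sketch of how the winning setup fits together, the snippet below TF-IDF-vectorizes a toy corpus, random-oversamples the minority (Negative) class with scikit-learn's `resample`, and fits a RandomForest. The tweets, labels, and the manual oversampling step are illustrative stand-ins, not the project's actual pipeline (which may use imblearn's `RandomOverSampler` instead).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.utils import resample

# Toy stand-in corpus; labels: 1 = Negative (minority), 0 = not Negative.
tweets = [
    "love the new ipad", "great apple store", "google party was fun",
    "awesome iphone launch", "best conference app ever", "enjoying the ipad demo",
    "iphone battery died again", "ipad design headaches",
]
y = np.array([0, 0, 0, 0, 0, 0, 1, 1])

# Vectorize the text (dense here only because the toy corpus is tiny).
X = TfidfVectorizer().fit_transform(tweets).toarray()

# Random oversampling: resample the minority class with replacement
# until it matches the majority class size.
X_maj, X_min = X[y == 0], X[y == 1]
X_min_up, y_min_up = resample(X_min, np.ones(len(X_min)),
                              replace=True, n_samples=len(X_maj),
                              random_state=42)
X_bal = np.vstack([X_maj, X_min_up])
y_bal = np.concatenate([np.zeros(len(X_maj)), y_min_up])

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_bal, y_bal)
print(f1_score(y_bal, clf.predict(X_bal), average="weighted"))
```

In the real pipeline, oversampling should be applied only to the training split so duplicated minority rows never leak into the evaluation set.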
- Best Multiclass Classifier: RandomForest with RandomOverSampling, Weighted F1 Score of 0.68
RandomForest (and RandomForest with RandomOverSampling) had the highest weighted F1 scores of all the models I trained. I'm calling RandomForest with RandomOverSampling the winner, as it has a slightly better True Positive rate for correctly identifying Negative Tweets (0.29 vs. 0.25), and I imagine Negative tweets are the most interesting to stakeholders. It also has a better True Positive rate for correctly identifying Positive Tweets (0.56 vs. 0.48). These numbers are still pretty poor.
Here's the breakdown of all Multiclass Classifier Models and scores:
- RandomForest - 0.68
- RandomForest ROS - 0.68
- Multinomial NB - 0.60
- Multinomial NB ROS - 0.62
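For reference, the "Weighted F1 Score" reported above averages per-class F1 scores weighted by each class's support. A toy three-class illustration with scikit-learn (made-up labels, not the project's predictions):

```python
from sklearn.metrics import f1_score

# Ten made-up tweets across three classes (5 neutral, 3 positive, 2 negative).
y_true = ["neg", "neg", "pos", "pos", "pos", "neu", "neu", "neu", "neu", "neu"]
y_pred = ["neg", "neu", "pos", "pos", "neu", "neu", "neu", "neu", "neu", "pos"]

# Per-class F1 is computed first, then averaged with each class's
# support (count in y_true) as the weight.
print(f1_score(y_true, y_pred, average="weighted"))
```

When one class dominates the corpus, it also dominates the weighted average, which is how a model can post a decent weighted F1 while still misclassifying most Negative tweets.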
After analyzing the most common terms/bi-grams in Positive and Negative tweets, I can make the following Stakeholder Recommendations:
From the Negative Tweets, we see "iPad design", "design headaches", and "iPhone battery" as common terms.
- Recommend checking iPhone battery performance
- Re-evaluate the iPad design in light of this Negative sentiment
From the Positive Tweets, we see "Apple store", "opening temporary", and "Google party" as common terms.
- Recommend repeating those two well-received events
Further analysis into the following areas could yield additional insights.
- Check if punctuation count could be a good feature for Tweet classification.
- Use spaCy in place of NLTK
- Try using SMOTE to address class imbalance and see whether it causes the same increase in overfitting as RandomOverSampling
- Hyper-parameter tuning for the Random Forest classifiers.
- Better understanding of using LIME to explain model behaviour.
- Try Transfer Learning using the GloVe pre-trained word embeddings for Twitter
- Get more Tweets for the corpus!
- Research sources of pre-labeled Tweets
- Perform Sentiment Analysis using the VADER and TextBlob tools and compare their results to the human-annotated Tweets in this corpus. If the combined tool sentiment is similar enough to the human-labeled sentiment, I could gather more Google/Apple product Tweets and label them with the tools to create a larger corpus. :)
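That last idea can be sketched as an agreement check between tool labels and human labels. The keyword scorer below is a toy stand-in for VADER's `polarity_scores` or TextBlob's `sentiment.polarity`; the lexicon and tweets are made up for illustration.

```python
# Toy lexicon-based scorer standing in for a real sentiment tool.
def toy_score(text):
    pos = {"love", "great", "awesome"}
    neg = {"hate", "terrible", "broken"}
    s = sum((w in pos) - (w in neg) for w in text.lower().split())
    return "positive" if s > 0 else "negative" if s < 0 else "neutral"

# Made-up human-annotated tweets mapped to their labels.
human_labels = {
    "love this ipad": "positive",
    "terrible iphone battery": "negative",
    "heading to sxsw": "neutral",
}

# Fraction of tweets where the tool agrees with the human annotator.
agreement = sum(toy_score(t) == lbl
                for t, lbl in human_labels.items()) / len(human_labels)
print(agreement)
```

If the agreement rate on the human-annotated corpus came out high enough, the same scoring function could label freshly collected Tweets to grow the corpus.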
--notebooks
----data_cleaning_and_eda.ipynb
----modeling.ipynb (binary classifiers are here)
----modeling2.ipynb (multiclass classifiers are here)
--data
----cleaned_tweets_all.csv
----cleaned_tweets_positive.csv
----cleaned_tweets_negative.csv
----cleaned_tweets_neutral.csv
----crowdflower-brands-and-product-emotions/ (dir for data downloaded from challenge website)
--images (dir for images)