Binary Multinomial Naive Bayes applied from scratch for sentiment analysis. This is the original Datalore notebook where I built the project; I exported it as an .ipynb.
This is a Bayesian classifier which makes a simplifying (naive) assumption about how the features interact.
We represent each sentence as a bag of words: an unordered collection of words in which position does not matter, but the frequency of each word is kept.
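For instance, a bag of words can be built in Python with `collections.Counter` (the sentence below is just a made-up illustration):

```python
from collections import Counter

# Word order is discarded; only each word's frequency is kept.
sentence = "the movie was good the acting was good"
bag_of_words = Counter(sentence.split())
print(bag_of_words["good"])   # 2
print(bag_of_words["movie"])  # 1
```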
Now our main task is to find the class of a given document, where the class represents the sentiment:
C_NB = argmax_{c ∈ C} P(c|d)
Here we are using Bayes' theorem to predict the class. According to Bayes' theorem:
P(x|y) = P(y|x) P(x) / P(y)
Applying this to the Naive Bayes equation:
C_NB = argmax_{c ∈ C} P(d|c) P(c) / P(d)
Since P(d) remains constant across all classes, we can drop it from the equation, which then becomes:
C_NB = argmax_{c ∈ C} P(d|c) P(c)
Here P(c) is called the prior probability, while P(d|c) is called the likelihood.
Now, to analyse the document, we concentrate on its words w_i.
We make two important assumptions:
- The bag-of-words assumption: the position of a word in the sentence does not matter.
- Conditional independence: the probabilities of the features (here, words) are independent of each other given the class, so P(w_1, w_2, …, w_n | c) = P(w_1|c) · P(w_2|c) · … · P(w_n|c)
So finally we get:
C_NB = argmax_{c ∈ C} P(c) ∏_{i ∈ positions} P(w_i|c)
where P(w_i|c) = (count(w_i, c) + α) / (x + V·α), α being the Laplace smoothing constant, x the total number of words in class c, and V the size of the vocabulary (the total length of the bag of words).
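As a minimal from-scratch sketch of these formulas (the function names and the whitespace tokeniser are illustrative, not the notebook's actual code), working in log space so the product over many words does not underflow:

```python
import math
from collections import Counter

def train_nb(docs, labels, alpha=1.0):
    """Estimate log priors P(c) and Laplace-smoothed log likelihoods P(w|c)."""
    classes = set(labels)
    log_priors = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    word_counts = {c: Counter() for c in classes}
    for doc, c in zip(docs, labels):
        word_counts[c].update(doc.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    log_likelihoods = {}
    for c in classes:
        total = sum(word_counts[c].values())  # x: total words in class c
        denom = total + alpha * len(vocab)    # x + V * alpha
        log_likelihoods[c] = {w: math.log((word_counts[c][w] + alpha) / denom)
                              for w in vocab}
    return log_priors, log_likelihoods, vocab

def predict_nb(doc, log_priors, log_likelihoods, vocab):
    """Return argmax_c [ log P(c) + sum_i log P(w_i|c) ]; unseen words are skipped."""
    scores = {c: log_priors[c] + sum(log_likelihoods[c][w]
                                     for w in doc.split() if w in vocab)
              for c in log_priors}
    return max(scores, key=scores.get)
```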
Here's an example:
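Running the sketch above on a made-up four-document corpus:

```python
docs = ["good great movie", "great acting", "boring bad movie", "bad plot"]
labels = ["pos", "pos", "neg", "neg"]

log_priors, log_likelihoods, vocab = train_nb(docs, labels)
# Class "pos" has 5 word tokens and the vocabulary has 7 words, so
# P(good|pos) = (1 + 1) / (5 + 7*1) = 1/6.
print(predict_nb("good movie", log_priors, log_likelihoods, vocab))  # pos
```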
This was Multinomial Naive Bayes....
We are going to implement Binary Multinomial Naive Bayes for sentiment analysis, which differs from Multinomial NB in that repetitions of a word within a document are not counted: word counts are clipped at 1 per document, because the presence of specific words matters more than their frequency.
For example:
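A minimal sketch of that change, reusing the helpers above; clipping the test document's counts as well follows the usual textbook presentation and is an assumption here:

```python
def binarize(doc):
    """Keep only the unique words of a document, clipping each count at 1."""
    return " ".join(set(doc.split()))

# "good good movie" now contributes 'good' only once.
binary_docs = [binarize(d) for d in docs]
log_priors, log_likelihoods, vocab = train_nb(binary_docs, labels)
print(predict_nb(binarize("good good movie"),
                 log_priors, log_likelihoods, vocab))  # pos
```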