OpinionSpam

Research code for opinion spam detection

Dataset:

400 truthful positive reviews from TripAdvisor (described in [1])
400 truthful negative reviews from Expedia, Hotels.com, Orbitz, Priceline, TripAdvisor, and Yelp (described in [2])

Linguistic Approaches:

word_embedding.py : detects opinion spam using word embeddings
tf_idf.py : detects opinion spam using TF-IDF features (PCA dimensionality reduction is optional)
unigram.py : detects opinion spam using unigram features (PCA dimensionality reduction is optional)
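
To make the unigram and TF-IDF feature ideas concrete, here is a minimal standard-library sketch. It is not the code in unigram.py or tf_idf.py, which may use different libraries, tokenization, and weighting. The function names, the whitespace tokenizer, the smoothed IDF formula, and the toy reviews are all assumptions made for illustration, and the optional PCA step is left out.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(".,!?;:\"'()") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def unigram_counts(docs):
    """Return one Counter of raw unigram counts per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def tf_idf(docs):
    """Return one sparse {term: weight} dict per document.

    Assumed weighting: tf = count / document length,
    idf = ln((1 + N) / (1 + df)) + 1 (a smoothed variant).
    """
    counts = unigram_counts(docs)
    n_docs = len(docs)

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for doc_counts in counts:
        doc_freq.update(doc_counts.keys())

    idf = {
        term: math.log((1 + n_docs) / (1 + df)) + 1.0
        for term, df in doc_freq.items()
    }

    vectors = []
    for doc_counts in counts:
        total = sum(doc_counts.values())
        if total == 0:
            vectors.append({})
            continue
        vectors.append(
            {term: (n / total) * idf[term] for term, n in doc_counts.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy reviews, invented for this example (not taken from the dataset).
reviews = [
    "The room was clean and the staff was friendly.",
    "My stay was amazing, truly amazing luxury!",
    "The room was dirty and the staff was rude.",
]
vectors = tf_idf(reviews)
print(cosine(vectors[0], vectors[2]), cosine(vectors[0], vectors[1]))
```

The resulting sparse vectors could then be fed to any classifier. Which classifier the scripts in this repository use is not stated here, so none is assumed.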

Behavior-based Approaches:

To be implemented...

## References

[1] M. Ott, Y. Choi, C. Cardie, and J.T. Hancock. 2011. Finding Deceptive Opinion Spam by Any Stretch of the Imagination. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies.

[2] M. Ott, C. Cardie, and J.T. Hancock. 2013. Negative Deceptive Opinion Spam. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

