Predicts the trustworthiness of people in your Twitter network based on who their connections are. The tool downloads timelines and connections and saves each user's data to a separate JSON file. It assigns a score to each user based on vocabulary, retweets, and favorited tweets, then predicts that score from the user's 'friends_ids', with 75 percent accuracy on my own network.
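As a rough illustration of the "one JSON file per user" layout described above, here is a minimal sketch. The function names, the directory argument, and the record fields other than 'friends_ids' are assumptions made for this example and may not match what download_data.py actually writes.

```python
import json
import os


def save_user(data_dir, user_id, timeline, friends_ids):
    """Write one user's data to <data_dir>/<user_id>.json (hypothetical layout)."""
    os.makedirs(data_dir, exist_ok=True)
    path = os.path.join(data_dir, f"{user_id}.json")
    # Only 'friends_ids' is named in this README; the other keys are assumed.
    record = {"id": user_id, "timeline": timeline, "friends_ids": friends_ids}
    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f)
    return path


def load_user(data_dir, user_id):
    """Read a user's record back from its JSON file."""
    path = os.path.join(data_dir, f"{user_id}.json")
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Keeping each user in a separate file means an interrupted download loses at most the user being fetched at that moment.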
- If you plan to grab data for more than just a few users, make sure to prevent your machine from sleeping in your energy settings, or find another way to keep the network connection alive.
- Make sure you have tweepy installed (for example, with pip install tweepy).
- Before you run this tool, create a file named 'secret' in the root directory and fill it with your Twitter authentication keys, each on a separate line:
  - CONSUMER_KEY
  - CONSUMER_SECRET
  - ACCESS_KEY
  - ACCESS_SECRET
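The sketch below shows one way the 'secret' file could be read and handed to tweepy. It assumes the four keys appear in the order listed above; `read_secret` and `make_api` are hypothetical helper names, not functions from this repository. `tweepy.OAuthHandler`, `set_access_token`, and `tweepy.API` are existing tweepy calls, though newer tweepy releases prefer the name `OAuth1UserHandler`.

```python
KEY_NAMES = ["CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_KEY", "ACCESS_SECRET"]


def read_secret(path="secret"):
    """Return the four keys from a file that holds one key per line."""
    with open(path, encoding="utf-8") as f:
        values = [line.strip() for line in f if line.strip()]
    if len(values) != len(KEY_NAMES):
        raise ValueError(
            f"expected {len(KEY_NAMES)} non-empty lines in {path!r}, got {len(values)}"
        )
    # Assumption: the lines are in the same order as KEY_NAMES.
    return dict(zip(KEY_NAMES, values))


def make_api(keys):
    """Build an authenticated tweepy API object from the parsed keys."""
    import tweepy  # imported lazily so read_secret works without tweepy

    auth = tweepy.OAuthHandler(keys["CONSUMER_KEY"], keys["CONSUMER_SECRET"])
    auth.set_access_token(keys["ACCESS_KEY"], keys["ACCESS_SECRET"])
    # wait_on_rate_limit makes tweepy sleep instead of failing on rate limits.
    return tweepy.API(auth, wait_on_rate_limit=True)
```

Typical use would be `api = make_api(read_secret())`, run from the root directory.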
- python download_data.py (leave this running until you have downloaded enough data)
- python generate_user_scores_dict (check out the histogram of user scores)
- python predict_scores.py (check out how the predicted scores compare to the actual)
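To make the last step more concrete, here is a simple baseline for predicting a user's score from 'friends_ids' and for comparing predictions with actual scores. This is only a sketch under stated assumptions: it averages the known scores of a user's friends, which is not necessarily the method predict_scores.py uses, and the function names and the tolerance-based comparison are invented for illustration.

```python
def predict_score(friends_ids, known_scores, default=0.0):
    """Predict a user's score as the mean score of friends with known scores."""
    scores = [known_scores[f] for f in friends_ids if f in known_scores]
    if not scores:
        return default  # no scored friends, fall back to a default
    return sum(scores) / len(scores)


def fraction_within(predicted, actual, tolerance):
    """Fraction of users whose predicted score is within tolerance of the actual."""
    hits = sum(1 for user in actual if abs(predicted[user] - actual[user]) <= tolerance)
    return hits / len(actual)


# Toy data: user ids mapped to scores (made up for this example).
known = {1: 2.0, 2: 4.0}
print(predict_score([1, 2, 3], known))  # friend 3 has no score and is ignored
```

A comparison such as `fraction_within` is one possible reading of an accuracy figure for continuous scores; the README does not say how its accuracy was measured.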
- http://positivewordsresearch.com/list-of-positive-words/
- http://boostblogtraffic.com/power-words/
- Bing Liu, Minqing Hu and Junsheng Cheng. "Opinion Observer: Analyzing and Comparing Opinions on the Web."
##### negative words: