-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tokenize and analyze 2016 presidential candidate rhetoric for comparison with extremist communities #52
Comments
Some examples of what's possible are in my personal GitHub repo. FWIW, this should be a beginner-friendly project, but also open to more advanced algorithmic analysis. |
I'm interested in learning R and think this is an interesting project. |
@justinstimatze Excellent! I was able to scrape all of the GOP speeches, press releases, and campaign statements from January 2015 on and assemble into a single CSV, if that helps you explore: https://github.com/kshaffer/presidencyproject/blob/master/data/gop_2016_candidate_docs.csv And if you're using this project to learn R, I highly recommend Tidy Text Mining. It's a free ebook explaining tools that might be helpful for this analysis. |
This seems like an interesting project. Is it possible for me to join in on this project? |
Hi Kshaffer, I would like to join this project. I will be working on Pyhton. Is it possible for me to join this project? |
@princeatul Thanks for your interest. All the things @kshaffer mentioned should be doable in python as well. If you're interested I would suggest grabbing the data linked above and try tackle one task from the list in the original post. Ex If you need any help just visit us in #assemble channel on slack or post back here with any questions. |
I don't see it discussed above, so I'll mention that FiveThirtyEight had a very interesting article on using latent semantic analysis for topic modeling reddit groups. Certainly an interesting starting point for those interested in seeing what might be done here. |
Anyone interested in doing some basic word/n-gram analysis, topic models, etc. on presidential candidate speeches and press releases? Would be really interesting to see which candidates were/weren't plugged in to the extremist communities and when/where certain extremist language creeps into more mainstream campaign discourse.
An R notebook with instructions and code for obtaining this data from The American Presidency Project will be in the exploratory_notebooks folder soon (just submitted a pull request).
The text was updated successfully, but these errors were encountered: