tokenize and analyze 2016 presidential candidate rhetoric for comparison with extremist communities #52

Open
kshaffer opened this issue Mar 16, 2017 · 7 comments

Comments

@kshaffer
Contributor

Anyone interested in doing some basic word/n-gram analysis, topic models, etc. on presidential candidate speeches and press releases? Would be really interesting to see which candidates were/weren't plugged in to the extremist communities and when/where certain extremist language creeps into more mainstream campaign discourse.

An R notebook with instructions and code for obtaining this data from The American Presidency Project will be in the exploratory_notebooks folder soon (just submitted a pull request).
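For anyone who wants to see what the basic word/n-gram counting step looks like, here is a minimal sketch in Python using only the standard library. It is not the code from the R notebook; the `tokenize` and `ngram_counts` helpers and the sample sentence are made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return runs of letters/apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Made-up sample text, not taken from any candidate's speeches.
sample = "We will rebuild the economy and we will create jobs."
tokens = tokenize(sample)
bigrams = ngram_counts(tokens, 2)

# Show the most frequent bigrams first.
for gram, count in bigrams.most_common(3):
    print(" ".join(gram), count)
```

Running the same counts over each candidate's documents and over a corpus from an extremist community would give two frequency tables that can be compared term by term.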

@kshaffer
Contributor Author

Some examples of what's possible are in my personal GitHub repo.

FWIW, this should be a beginner-friendly project, but also open to more advanced algorithmic analysis.

@justinstimatze

I'm interested in learning R and think this is an interesting project.

@kshaffer
Contributor Author

@justinstimatze Excellent! I was able to scrape all of the GOP speeches, press releases, and campaign statements from January 2015 on and assemble them into a single CSV, if that helps you explore: https://github.com/kshaffer/presidencyproject/blob/master/data/gop_2016_candidate_docs.csv

And if you're using this project to learn R, I highly recommend Tidy Text Mining. It's a free ebook explaining tools that might be helpful for this analysis.
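For those working in Python rather than R, a rough sketch of grouping word counts by candidate from a CSV like the one linked above might look like this. The column names `candidate` and `text` are assumptions (check the real file's header first), and the inline sample data is invented so the snippet runs on its own.

```python
import csv
import io
from collections import Counter, defaultdict

# Invented stand-in data; the real CSV's columns may be named differently.
SAMPLE_CSV = """candidate,text
Candidate A,"Jobs and trade. Jobs first."
Candidate B,"Security and trade."
Candidate A,"Trade deals."
"""


def word_counts_by_candidate(csv_file, name_col="candidate", text_col="text"):
    """Return {candidate: Counter of lowercase words} for an open CSV file."""
    counts = defaultdict(Counter)
    for row in csv.DictReader(csv_file):
        # Split on whitespace, then trim surrounding punctuation from each word.
        words = (w.strip(".,;:!?\"'") for w in row[text_col].lower().split())
        counts[row[name_col]].update(w for w in words if w)
    return counts


counts = word_counts_by_candidate(io.StringIO(SAMPLE_CSV))
for name, counter in counts.items():
    print(name, counter.most_common(2))
```

To use it on a downloaded copy of the data, pass `open(path, newline="", encoding="utf-8")` instead of the `io.StringIO` object and adjust the column names.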

@ghost

ghost commented Mar 21, 2017

This seems like an interesting project. Is it possible for me to join in on this project?

@princeatul

Hi Kshaffer,

I would like to join this project. I will be working in Python. Is it possible for me to join this project?

@bstarling
Contributor

@princeatul Thanks for your interest. All the things @kshaffer mentioned should be doable in Python as well. If you're interested, I would suggest grabbing the data linked above and trying to tackle one task from the list in the original post, for example topic modeling. Once you have a preliminary Jupyter notebook, open a PR to add it to the exploratory_notebooks section of this repo. I'm not an expert in this area, but I do have this tutorial in my backlog that may help you get started.

If you need any help, just visit us in the #assemble channel on Slack or post back here with any questions.

@mw0

mw0 commented Apr 30, 2017

I don't see it discussed above, so I'll mention that FiveThirtyEight had a very interesting article on using latent semantic analysis for topic modeling of reddit groups. Certainly an interesting starting point for those interested in seeing what might be done here.
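As a rough illustration of the latent semantic analysis idea (not the method from the FiveThirtyEight article), the sketch below builds a term-document count matrix, takes a truncated SVD with NumPy, and compares documents by cosine similarity in the reduced space. The four toy documents and the choice of `k = 2` are assumptions picked only to make the example small.

```python
import numpy as np

# Toy documents, invented for illustration.
docs = [
    "tax cuts jobs economy",
    "jobs economy growth tax",
    "border wall immigration",
    "immigration border security",
]

# Term-document matrix: one row per vocabulary word, one column per document.
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}
X = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        X[index[w], j] += 1

# Truncated SVD: keep the k largest singular values as latent "topics".
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
doc_vecs = (s[:k, None] * Vt[:k]).T  # shape (n_docs, k)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


print(cosine(doc_vecs[0], doc_vecs[1]))  # two economy-themed documents
print(cosine(doc_vecs[0], doc_vecs[2]))  # economy vs. immigration
```

On real speeches the matrix would usually be TF-IDF weighted and `k` would be much larger, but the steps are the same.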
