This is a sample code for a big data project using R

XiaotongCui/SampleCode_R

SampleCode_R

This is sample code for a big data project using R. The research studies the impact of social networks on COVID-19 vaccine hesitancy. Using a sample of 10 million individual-level Twitter data points, we: I) apply machine learning techniques to predict the gender, race, and age of each Twitter user; II) run econometric analyses (logistic regression, IV regression) to draw causal inferences about the true impact.

All of the work was originally run on a private R server. For privacy reasons, only the code is uploaded here, without the raw data. Sorry if this causes confusion.

The code is divided into two parts: data cleaning and data analysis. The data cleaning part mostly turns the massive raw datasets into the data structures needed for analysis. The data analysis part runs regression analyses under several different specifications.
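As a rough illustration of the data analysis part, here is a minimal base-R sketch of a logistic regression and a two-stage least squares (IV) estimate. It is not the repository's code: the data are simulated, and every variable name (`z`, `exposure`, `hesitant`, `hesitant_score`) is hypothetical, since the raw data are not available here.

```r
# Illustrative sketch only: synthetic data, hypothetical variable names.
set.seed(1)
n <- 5000

z <- rnorm(n)                        # instrument (assumed valid by construction)
u <- rnorm(n)                        # unobserved confounder
exposure <- 0.8 * z + u + rnorm(n)   # endogenous regressor

# Continuous outcome with a true effect of 0.5 from exposure
hesitant_score <- 0.5 * exposure - u + rnorm(n)
# Binary version of the outcome for the logistic regression
hesitant <- as.integer(hesitant_score > 0)

# Logistic regression of the binary outcome on exposure
logit_fit <- glm(hesitant ~ exposure, family = binomial())

# Naive OLS, biased here because u drives both exposure and the outcome
ols_est <- coef(lm(hesitant_score ~ exposure))[["exposure"]]

# Manual two-stage least squares:
# stage 1 projects exposure on the instrument,
# stage 2 regresses the outcome on the fitted values.
first_stage  <- lm(exposure ~ z)
exposure_hat <- fitted(first_stage)
second_stage <- lm(hesitant_score ~ exposure_hat)
iv_est <- coef(second_stage)[["exposure_hat"]]

print(coef(logit_fit))
print(c(ols = ols_est, iv = iv_est))
```

The manual second stage gives the 2SLS point estimate, but its reported standard errors are not valid. For inference, a dedicated routine such as `ivreg` from the AER package is the usual choice.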

A further introduction to the project and its results can be found in the slides: "Social Connection and Vaccine Hesitancy Slides"
